Panic Mode On (77) Server Problems?

Richard Haselgrove Project Donor
Volunteer tester

Joined: 4 Jul 99
Posts: 14645
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1296371 - Posted: 17 Oct 2012, 22:39:03 UTC

Eric was dealing with the problem that led to new WU numbering scheme

It was a compiler bug, messing up the splitter code.
ID: 1296371 · Report as offensive
Cosmic_Ocean
Joined: 23 Dec 00
Posts: 3027
Credit: 13,516,867
RAC: 13
United States
Message 1296423 - Posted: 18 Oct 2012, 0:48:54 UTC

Not complaining or panicking, but my RAC has dropped by half due to no APs. Might be a good thing, though. When I stole my PSU from another machine, I knew the intermittent squealing/chirping coming from the capacitors meant it wouldn't have terribly long to live. It is still alive, but my board has been beeping some error alert for 30 seconds every 6-8 hours. I'm not sure why, but it probably has something to do with some voltage being below a threshold.
Linux laptop:
record uptime: 1511d 20h 19m (ended due to the power brick giving-up)
ID: 1296423 · Report as offensive
Cherokee150

Joined: 11 Nov 99
Posts: 192
Credit: 58,513,758
RAC: 74
United States
Message 1296448 - Posted: 18 Oct 2012, 3:15:07 UTC - in response to Message 1296371.  

Excerpts from another thread:

...I noticed a number of new units I am receiving have unusually long names, as in:
21ap11af.25749.2934.140733193388044.10.40_2
versus a more typical:
09au12ag.28990.14407.13.10.216_0
Alternatively, does anyone know what that new 15 digit number in the fourth position means?

...I hope we get an official explanation soon, and I hope these "longnames" aren't wreaking havoc with the SETI database.

...I was also thinking along the line of, if these names are in error, as was suggested in the thread referenced by Wiggo, then it might cause serious problems later when the SETI staff attempt to identify and work with the returned results. If so, then I hope it turns out to be a non-issue, or at least a minor "ho-hum...". :)


Now Eric is "dealing with the problem that led to new WU numbering scheme" that came from a "compiler bug, messing up the splitter code".

Based on all the above, I wonder if it would help Eric to send a BOINC Notice asking those of us who feel comfortable with manual manipulation of our caches to suspend until further notice all cached WUs with the apparently erroneous "longnames"?
ID: 1296448 · Report as offensive
Richard Haselgrove Project Donor
Volunteer tester

Joined: 4 Jul 99
Posts: 14645
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1296471 - Posted: 18 Oct 2012, 6:29:39 UTC - in response to Message 1296448.  

Now Eric is "dealing with the problem that led to new WU numbering scheme" that came from a "compiler bug, messing up the splitter code".

Based on all the above, I wonder if it would help Eric to send a BOINC Notice asking those of us who feel comfortable with manual manipulation of our caches to suspend until further notice all cached WUs with the apparently erroneous "longnames"?

No, no need to do anything like that. Eric dealt with the long names in passing, while working on testing for the new v7 application at SETI Beta. Here's what he said:

That huge number is supposed to be the ID of the receiver we used, which is obtained from a library routine. I'm guessing that at some point the library was compiled with a version of g++ that doesn't have the same storage format for the data type used.

Fortunately the correct value is used in the workunit header, so we don't have to do all this work over.

(my emphasis)

The technically minded can read the gory details from beta message 44104 onwards.
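
As a rough illustration of the storage-format guess above, the sketch below splits the two workunit names quoted earlier in the thread and looks at the fourth field in hexadecimal. The function name `inspect_receiver_field` is made up for this example, and reading the low 32 bits as the intended receiver ID is an assumption, not something Eric confirmed.

```python
def inspect_receiver_field(wu_name):
    """Return the 4th dot-separated field of a workunit name in a few views.

    Assumption: the 4th field is the receiver ID, as described in the thread.
    """
    raw = int(wu_name.split(".")[3])
    return {
        "raw": raw,
        "hex": hex(raw),
        "low32": raw & 0xFFFFFFFF,  # value if only the low 32 bits were meant
        "high32": raw >> 32,        # whatever sits above the low 32 bits
    }


# The two names quoted earlier in this thread.
long_name = inspect_receiver_field("21ap11af.25749.2934.140733193388044.10.40_2")
short_name = inspect_receiver_field("09au12ag.28990.14407.13.10.216_0")

print(long_name)
print(short_name)
```

For the long name the field comes out as 0x7fff0000000c: the low 32 bits are 12 and the high bits are 0x7fff. That would be consistent with a small integer being read through a wider type, though the thread does not say that is what happened.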
ID: 1296471 · Report as offensive
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 13715
Credit: 208,696,464
RAC: 304
Australia
Message 1296480 - Posted: 18 Oct 2012, 7:13:12 UTC - in response to Message 1296471.  


Once again the splitters are unable to keep up with work demand.
Grant
Darwin NT
ID: 1296480 · Report as offensive
David S
Volunteer tester
Joined: 4 Oct 99
Posts: 18352
Credit: 27,761,924
RAC: 12
United States
Message 1296526 - Posted: 18 Oct 2012, 14:28:18 UTC

Why is it there are 6 AP splitters and only 5 MB splitters? AP gets split faster than MB anyway, so it seems to me it would make more sense to have more MB splitters than APs. Could lando run a couple more MBs, especially since he's not running his APs right now?

Okay, now I'm wondering... isn't the SSP supposed to update every 10 minutes? It's now 1427 UTC and the page is still showing 1410.

David
Sitting on my butt while others boldly go,
Waiting for a message from a small furry creature from Alpha Centauri.

ID: 1296526 · Report as offensive
kittyman · Crowdfunding Project Donor · Special Project $75 donor · Special Project $250 donor
Volunteer tester
Joined: 9 Jul 00
Posts: 51468
Credit: 1,018,363,574
RAC: 1,004
United States
Message 1296543 - Posted: 18 Oct 2012, 15:14:44 UTC - in response to Message 1296526.  

Why is it there are 6 AP splitters and only 5 MB splitters? AP gets split faster than MB anyway, so it seems to me it would make more sense to have more MB splitters than APs. Could lando run a couple more MBs, especially since he's not running his APs right now?

Okay, now I'm wondering... isn't the SSP supposed to update every 10 minutes? It's now 1427 UTC and the page is still showing 1410.

Sometimes it's every 10 minutes and sometimes every 20...
Last update 15:00.
"Freedom is just Chaos, with better lighting." Alan Dean Foster

ID: 1296543 · Report as offensive
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 13715
Credit: 208,696,464
RAC: 304
Australia
Message 1296997 - Posted: 19 Oct 2012, 21:25:53 UTC


Bit of a backlog of MB validations building up.
Grant
Darwin NT
ID: 1296997 · Report as offensive
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 13715
Credit: 208,696,464
RAC: 304
Australia
Message 1297343 - Posted: 20 Oct 2012, 21:56:39 UTC - in response to Message 1296997.  


MB validators appear to be clearing their backlog, now the assimilators are falling behind.
Grant
Darwin NT
ID: 1297343 · Report as offensive
.clair.

Joined: 4 Nov 04
Posts: 1300
Credit: 55,390,408
RAC: 69
United Kingdom
Message 1298419 - Posted: 24 Oct 2012, 21:17:21 UTC
Last modified: 24 Oct 2012, 21:20:01 UTC

Hay man what happend.
Sumfing is broke
The cricket is dead, gon all white :(
ID: 1298419 · Report as offensive
Sutaru Tsureku
Volunteer tester

Joined: 6 Apr 07
Posts: 7105
Credit: 147,663,825
RAC: 5
Germany
Message 1298424 - Posted: 24 Oct 2012, 22:30:59 UTC

Project is temporarily shut down for maintenance


.. and then ..

Scheduler request failed: Couldn't connect to server	
Scheduler request failed: HTTP internal server error	
Scheduler request failed: Timeout was reached



* Best regards! :-) * Sutaru Tsureku, team seti.international founder. * Optimize your PC for higher RAC. * SETI@home needs your help. *
ID: 1298424 · Report as offensive
Wiggo
Joined: 24 Jan 00
Posts: 34744
Credit: 261,360,520
RAC: 489
Australia
Message 1298435 - Posted: 24 Oct 2012, 22:56:24 UTC - in response to Message 1298424.  

Things have been working well here since the servers came back online (approx. 90 mins ago).

Cheers.
ID: 1298435 · Report as offensive
Blurf
Volunteer tester

Joined: 2 Sep 06
Posts: 8962
Credit: 12,678,685
RAC: 0
United States
Message 1298445 - Posted: 24 Oct 2012, 23:17:54 UTC

ID: 1298445 · Report as offensive
Mad Fritz
Joined: 20 Jul 01
Posts: 87
Credit: 11,334,904
RAC: 0
Switzerland
Message 1298460 - Posted: 25 Oct 2012, 0:05:30 UTC - in response to Message 1298445.  
Last modified: 25 Oct 2012, 0:45:39 UTC

I knew it wouldn't last long... no, not the outage ;-)
The amazingly super-fast DL we had over the last few days, I mean :-(

Edit:
It seems to be better now... not as fast as before, but acceptable.
ID: 1298460 · Report as offensive
Link
Joined: 18 Sep 03
Posts: 834
Credit: 1,807,369
RAC: 0
Germany
Message 1298634 - Posted: 25 Oct 2012, 15:33:52 UTC - in response to Message 1298460.  

I knew it wouldn't last long... no, not the outage ;-)
The amazingly super-fast DL we had over the last few days, I mean :-(

Edit:
It seems to be better now... not as fast as before, but acceptable.

Looks like a "shorty storm": I got nothing but about 40 of those today on my dual-core machine.
ID: 1298634 · Report as offensive
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 13715
Credit: 208,696,464
RAC: 304
Australia
Message 1299124 - Posted: 26 Oct 2012, 21:35:18 UTC - in response to Message 1298634.  


Getting lots of "Scheduler request failed: Timeout was reached" messages again. Around 4 in 5 Scheduler requests over the last 12 hours.
Grant
Darwin NT
ID: 1299124 · Report as offensive
musicplayer

Joined: 17 May 10
Posts: 2430
Credit: 926,046
RAC: 0
Message 1299130 - Posted: 26 Oct 2012, 21:48:27 UTC
Last modified: 26 Oct 2012, 21:55:47 UTC

I am only receiving CUDA tasks right now.

If so many of the tasks being run return inconclusive results (not necessarily errors), what is the point of having these types of tasks downloaded right now?

How high a percentage is the server willing to accept from users' clients in total, relative to the number of returned tasks?

Edit: Perhaps I should report the tasks I have finished up first.
ID: 1299130 · Report as offensive
chromespringer
Joined: 3 Dec 05
Posts: 296
Credit: 55,183,482
RAC: 0
United States
Message 1299154 - Posted: 26 Oct 2012, 23:10:04 UTC - in response to Message 1299124.  


Getting lots of "Scheduler request failed: Timeout was reached" messages again. Around 4 in 5 Scheduler requests over the last 12 hours.


10/26/2012 5:00:43 PM | SETI@home | update requested by user
10/26/2012 5:02:26 PM | SETI@home | Scheduler request failed: Timeout was reached
10/26/2012 5:02:30 PM | | Project communication failed: attempting access to reference site
10/26/2012 5:02:31 PM | | Internet access OK - project servers may be temporarily down.

me too :( have 0 tasks for cpu now
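
For anyone wanting to put a number on how many requests are failing, here is a small sketch that tallies scheduler outcomes from event log lines like the ones pasted above. The function name `tally_scheduler_results` is hypothetical, and the parsing assumes the pipe-separated "time | project | message" layout seen in this thread; other BOINC client versions may format lines differently.

```python
SAMPLE_LOG = """\
10/26/2012 5:00:43 PM | SETI@home | update requested by user
10/26/2012 5:02:26 PM | SETI@home | Scheduler request failed: Timeout was reached
10/26/2012 5:02:30 PM | | Project communication failed: attempting access to reference site
10/26/2012 5:02:31 PM | | Internet access OK - project servers may be temporarily down.
"""


def tally_scheduler_results(log_text):
    """Count scheduler successes and failures (grouped by reason) in log text."""
    succeeded = 0
    failed = {}
    for line in log_text.splitlines():
        # Assumed layout: time | project | message (spaces around '|' optional).
        parts = [p.strip() for p in line.split("|", 2)]
        if len(parts) < 3:
            continue
        message = parts[2]
        if message.startswith("Scheduler request failed:"):
            reason = message.split(":", 1)[1].strip()
            failed[reason] = failed.get(reason, 0) + 1
        elif message.startswith("Scheduler request succeeded"):
            succeeded += 1
    return succeeded, failed


ok, failed = tally_scheduler_results(SAMPLE_LOG)
print("succeeded:", ok)
print("failed:", failed)
```

Fed a longer stretch of the event log, the same counts give the kind of "4 in 5" ratio mentioned above without tallying by hand.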
ID: 1299154 · Report as offensive
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 13715
Credit: 208,696,464
RAC: 304
Australia
Message 1299164 - Posted: 26 Oct 2012, 23:48:39 UTC - in response to Message 1299124.  
Last modified: 26 Oct 2012, 23:59:38 UTC


Getting lots of "Scheduler request failed: Timeout was reached" messages again. Around 4 in 5 Scheduler requests over the last 12 hours.


And now every request for more work in the last hour has timed out.

EDIT- 6th request finally got some work. Naturally the next request timed out.
Grant
Darwin NT
ID: 1299164 · Report as offensive
Cosmic_Ocean
Joined: 23 Dec 00
Posts: 3027
Credit: 13,516,867
RAC: 13
United States
Message 1299177 - Posted: 27 Oct 2012, 0:45:02 UTC

This is all I've gotten for like two weeks (totally not complaining):
2012-10-26 19:29:50|SETI@home|Sending scheduler request: To fetch work. Requesting 2592003 seconds of work, reporting 0 completed tasks
2012-10-26 19:29:55|SETI@home|Scheduler request succeeded: got 0 new tasks
2012-10-26 19:29:55|SETI@home|Message from server: No tasks sent
2012-10-26 19:29:55|SETI@home|Message from server: No tasks are available for AstroPulse v6
2012-10-26 19:29:55|SETI@home|Message from server: Your app_info.xml file doesn't have a usable version of SETI@home Enhanced.


AP will live once again.. eventually.
Linux laptop:
record uptime: 1511d 20h 19m (ended due to the power brick giving-up)
ID: 1299177 · Report as offensive


 
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.