Panic Mode On (14) Server problems

Andy Williams
Volunteer tester
Avatar

Send message
Joined: 11 May 01
Posts: 187
Credit: 112,464,820
RAC: 0
United States
Message 884105 - Posted: 10 Apr 2009, 23:57:24 UTC - in response to Message 884098.  

The AP splitters have not been running for several hours. I suspect something other than disk space issues.
--
Classic 82353 WU / 400979 h
ID: 884105 · Report as offensive
Grant (SSSF)
Volunteer tester

Send message
Joined: 19 Aug 99
Posts: 13904
Credit: 208,696,464
RAC: 304
Australia
Message 884111 - Posted: 11 Apr 2009, 0:08:15 UTC - in response to Message 884105.  

The AP splitters have not been running for several hours. I suspect something other than disk space issues.

Yep.
That would be the threshold setting for producing more work. Until the Ready to Send buffer gets below 4,500 I wouldn't expect the splitters to start up again. If it gets below 2,000 and they still don't start up, then you can start panicking.
Grant
Darwin NT
ID: 884111 · Report as offensive
Josef W. Segur
Volunteer developer
Volunteer tester

Send message
Joined: 30 Oct 99
Posts: 4504
Credit: 1,414,761
RAC: 0
United States
Message 884185 - Posted: 11 Apr 2009, 4:33:33 UTC - in response to Message 884111.  

The AP splitters have not been running for several hours. I suspect something other than disk space issues.

Yep.
That would be the threshold setting for producing more work. Until the Ready to Send buffer gets below 4,500 I wouldn't expect the splitters to start up again. If it gets below 2,000 and they still don't start up, then you can start panicking.

I believe you're partly right, but the AP splitters had been set to stop producing work when the "Ready to send" queue got to about 2500; see the 30-day Scarecrow graphs. Earlier today "Ready to send" reached about 20000, indicating the automatic limiting was not working. I think someone had to just shut them down remotely, or maybe there's a last-resort safety script which killed them.

When automatic limiting is working, the status of the ap_splitter processes doesn't go to "Not running". I think it takes human intervention to start them again; if the queue is below 1200 or so tomorrow, perhaps someone will do so.
                                                                Joe
ID: 884185 · Report as offensive
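
The threshold behaviour discussed in the posts above (stop splitting when "Ready to send" gets too high, start again once it has drained) can be sketched as a simple high/low-water-mark limiter. This is only an illustration: the class name, the method, and the choice of 2500 and 1200 as thresholds are assumptions borrowed from the figures quoted in the thread, not the project's actual splitter code or configuration. Joe also suggests that a restart after a full shutdown may need human intervention, which this sketch does not model.

```python
from dataclasses import dataclass


@dataclass
class SplitterLimiter:
    """Toy high/low-water-mark limiter for a work-producing process.

    The thresholds are illustrative values taken from numbers quoted in
    this thread, not from the project's real configuration.
    """
    stop_above: int = 2500   # pause splitting once the queue reaches this
    start_below: int = 1200  # resume splitting once the queue falls under this
    producing: bool = True

    def update(self, ready_to_send: int) -> bool:
        """Return whether the splitter should produce work for this reading."""
        if self.producing and ready_to_send >= self.stop_above:
            self.producing = False
        elif not self.producing and ready_to_send < self.start_below:
            self.producing = True
        return self.producing


if __name__ == "__main__":
    limiter = SplitterLimiter()
    for queue in (800, 2000, 2600, 2000, 1500, 1100, 1800):
        state = "splitting" if limiter.update(queue) else "paused"
        print(f"ready_to_send={queue:5d} -> {state}")
```

Using two separate thresholds rather than one keeps the process from flapping on and off when the queue hovers around a single limit.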
HarryM
Volunteer tester

Send message
Joined: 24 Jul 08
Posts: 68
Credit: 3,812,695
RAC: 0
United States
Message 884228 - Posted: 11 Apr 2009, 11:45:51 UTC

They're back up and running now.
ID: 884228 · Report as offensive
Speedy
Volunteer tester
Avatar

Send message
Joined: 26 Jun 04
Posts: 1646
Credit: 12,921,799
RAC: 89
New Zealand
Message 884447 - Posted: 12 Apr 2009, 2:06:03 UTC - in response to Message 881366.  




Where can I see the above graph in real time?
ID: 884447 · Report as offensive
Josef W. Segur
Volunteer developer
Volunteer tester

Send message
Joined: 30 Oct 99
Posts: 4504
Credit: 1,414,761
RAC: 0
United States
Message 884464 - Posted: 12 Apr 2009, 3:18:14 UTC - in response to Message 884447.  

Where can I see the above graph in real time?

Copy the link out of the message you quoted, or use the standard page which contains the graph: http://fragment1.berkeley.edu/newcricket/grapher.cgi?target=%2Frouter-interfaces%2Finr-250%2Fgigabitethernet2_3;view=UcastPackets;ranges=d.
                                                                Joe
ID: 884464 · Report as offensive
Cosmic_Ocean
Avatar

Send message
Joined: 23 Dec 00
Posts: 3027
Credit: 13,516,867
RAC: 13
United States
Message 884466 - Posted: 12 Apr 2009, 3:26:20 UTC - in response to Message 884464.  

Where can I see the above graph in real time?

Copy the link out of the message you quoted, or use the standard page which contains the graph: http://fragment1.berkeley.edu/newcricket/grapher.cgi?target=%2Frouter-interfaces%2Finr-250%2Fgigabitethernet2_3;view=UcastPackets;ranges=d.
                                                                Joe

Yeah, either unicast packets/sec (above) or octets/sec can be used. Different people have different preferences. I find octets to be a better measure of bandwidth consumption, since packets can vary in size while an octet (byte) is always the same size.
Linux laptop:
record uptime: 1511d 20h 19m (ended due to the power brick giving up)
ID: 884466 · Report as offensive
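
To make the packets-versus-octets point concrete, here is a small sketch. The function names and the sample packet sizes are hypothetical; the only fact relied on is that one octet is 8 bits. It shows that an octets/sec reading converts directly to bandwidth, while a packets/sec reading only does so once you assume an average packet size.

```python
def octets_to_megabits(octets_per_sec: float) -> float:
    """Convert an octets/sec reading to megabits/sec (1 octet = 8 bits)."""
    return octets_per_sec * 8 / 1_000_000


def bandwidth_from_packets(packets_per_sec: float, avg_packet_octets: float) -> float:
    """Estimate megabits/sec from a packet rate and an assumed average size."""
    return octets_to_megabits(packets_per_sec * avg_packet_octets)


if __name__ == "__main__":
    # Same packet rate, very different bandwidth depending on packet size.
    for size in (64, 576, 1500):  # hypothetical average packet sizes, in octets
        mbps = bandwidth_from_packets(10_000, size)
        print(f"10000 packets/sec at {size} octets each = {mbps:.2f} Mbit/s")
    # An octets/sec reading needs no such assumption.
    print(f"12500000 octets/sec = {octets_to_megabits(12_500_000):.2f} Mbit/s")
```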
Speedy
Volunteer tester
Avatar

Send message
Joined: 26 Jun 04
Posts: 1646
Credit: 12,921,799
RAC: 89
New Zealand
Message 884513 - Posted: 12 Apr 2009, 9:34:47 UTC - in response to Message 884464.  
Last modified: 12 Apr 2009, 9:36:27 UTC

Where can I see the above graph in real time?

Copy the link out of the message you quoted, or use the standard page which contains the graph: http://fragment1.berkeley.edu/newcricket/grapher.cgi?target=%2Frouter-interfaces%2Finr-250%2Fgigabitethernet2_3;view=UcastPackets;ranges=d.

Thanks for that Joe & Cosmic_Ocean
ID: 884513 · Report as offensive
Profile arkayn
Volunteer tester
Avatar

Send message
Joined: 14 May 99
Posts: 4438
Credit: 55,006,323
RAC: 0
United States
Message 884587 - Posted: 12 Apr 2009, 15:27:22 UTC

I have the octet page bookmarked on my machines.

ID: 884587 · Report as offensive
Profile White Mountain Wes
Avatar

Send message
Joined: 24 Jul 08
Posts: 259
Credit: 6,607,678
RAC: 4
United States
Message 884726 - Posted: 12 Apr 2009, 21:41:55 UTC - in response to Message 884723.  

Is it my imagination, or is the server taking time to refresh and to update?


Yes... it's... running... very... slowly.
ID: 884726 · Report as offensive
Speedy
Volunteer tester
Avatar

Send message
Joined: 26 Jun 04
Posts: 1646
Credit: 12,921,799
RAC: 89
New Zealand
Message 884727 - Posted: 12 Apr 2009, 21:42:45 UTC

It could be your imagination; I'm not sure. Here is the traffic graph:
http://fragment1.berkeley.edu/newcricket/grapher.cgi?target=%2Frouter-interfaces%2Finr-250%2Fgigabitethernet2_3;view=Octets;ranges=d
Work is flowing.
ID: 884727 · Report as offensive
Grant (SSSF)
Volunteer tester

Send message
Joined: 19 Aug 99
Posts: 13904
Credit: 208,696,464
RAC: 304
Australia
Message 884728 - Posted: 12 Apr 2009, 21:47:24 UTC - in response to Message 884723.  

Is it my imagination, or is the server taking a lot of time to refresh and to update?
Something needs to get out and push, as output and input seem to have dropped to almost nothing.

Possibly. There was a similar dive in the traffic at the same time yesterday. And at the moment the forums are slower than a month of wet Sundays.
Grant
Darwin NT
ID: 884728 · Report as offensive
FiveHamlet
Avatar

Send message
Joined: 5 Oct 99
Posts: 783
Credit: 32,638,578
RAC: 0
United Kingdom
Message 884736 - Posted: 12 Apr 2009, 23:38:53 UTC
Last modified: 12 Apr 2009, 23:50:47 UTC

Posting is now somewhat tedious, so it looks like time for bed.
Hope it is resolved by tomorrow.

Oops, today.
ID: 884736 · Report as offensive
Grant (SSSF)
Volunteer tester

Send message
Joined: 19 Aug 99
Posts: 13904
Credit: 208,696,464
RAC: 304
Australia
Message 884816 - Posted: 13 Apr 2009, 3:06:52 UTC - in response to Message 884728.  


Ah, we're back. Something certainly broke for a while there. No uploads, downloads, or even the Seti site.
Grant
Darwin NT
ID: 884816 · Report as offensive
Profile arkayn
Volunteer tester
Avatar

Send message
Joined: 14 May 99
Posts: 4438
Credit: 55,006,323
RAC: 0
United States
Message 884838 - Posted: 13 Apr 2009, 4:24:38 UTC

Of course now the servers are in full tilt download and uploads are not getting through.

ID: 884838 · Report as offensive
W-K 666 Project Donor
Volunteer tester

Send message
Joined: 18 May 99
Posts: 19545
Credit: 40,757,560
RAC: 67
United Kingdom
Message 884842 - Posted: 13 Apr 2009, 4:30:21 UTC - in response to Message 884838.  
Last modified: 13 Apr 2009, 4:31:55 UTC

Of course now the servers are in full tilt download and uploads are not getting through.

Not had any problems here, uploaded, reported and downloaded in just a few seconds.

13/04/2009 05:18:48|SETI@home|[file_xfer] Started upload of file ap_26fe09aa_B4_P1_00283_20090409_31426.wu_0_0
13/04/2009 05:18:49|Einstein@Home|Sending scheduler request: To report completed tasks
13/04/2009 05:18:49|Einstein@Home|Requesting 57839 seconds of new work, and reporting 4 completed tasks
13/04/2009 05:18:52|SETI@home|[file_xfer] Finished upload of file ap_26fe09aa_B4_P1_00283_20090409_31426.wu_0_0
13/04/2009 05:18:52|SETI@home|[file_xfer] Throughput 13500 bytes/sec
13/04/2009 05:18:53|Einstein@Home|Scheduler RPC succeeded
13/04/2009 05:18:53|Einstein@Home|Message from server: Project is temporarily shut down for maintenance
13/04/2009 05:18:53|Einstein@Home|Deferring communication for 1 hr 0 min 0 sec
13/04/2009 05:18:53|Einstein@Home|Reason: project is down
13/04/2009 05:18:53|Einstein@Home|Deferring communication for 2 hr 33 min 0 sec
13/04/2009 05:18:53|Einstein@Home|Reason: project is down
13/04/2009 05:18:59|SETI@home|Sending scheduler request: To report completed tasks
13/04/2009 05:18:59|SETI@home|Requesting 11202 seconds of new work, and reporting 1 completed tasks
13/04/2009 05:19:04|SETI@home|Scheduler RPC succeeded [server version 607]
13/04/2009 05:19:04|SETI@home|Deferring communication for 11 sec
13/04/2009 05:19:04|SETI@home|Reason: requested by project
13/04/2009 05:19:06|SETI@home|[file_xfer] Started download of file ap_21fe09ad_B6_P1_00102_20090412_31520.wu
13/04/2009 05:19:40|SETI@home|[file_xfer] Finished download of file ap_21fe09ad_B6_P1_00102_20090412_31520.wu
13/04/2009 05:19:40|SETI@home|[file_xfer] Throughput 253008 bytes/sec
ID: 884842 · Report as offensive
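
For anyone wanting to pull transfer times out of a pasted log like the one above, here is a rough sketch. It assumes the `timestamp|project|message` layout and the day/month/year timestamps seen in that paste; the function names are made up for illustration, and other BOINC versions or locales format their messages differently (see the later pastes in this thread).

```python
from datetime import datetime

# Three lines copied from the log posted above.
SAMPLE = """\
13/04/2009 05:19:06|SETI@home|[file_xfer] Started download of file ap_21fe09ad_B6_P1_00102_20090412_31520.wu
13/04/2009 05:19:40|SETI@home|[file_xfer] Finished download of file ap_21fe09ad_B6_P1_00102_20090412_31520.wu
13/04/2009 05:19:40|SETI@home|[file_xfer] Throughput 253008 bytes/sec
"""


def parse_line(line: str):
    """Split one 'timestamp|project|message' line into its three parts."""
    stamp, project, message = line.split("|", 2)
    return datetime.strptime(stamp, "%d/%m/%Y %H:%M:%S"), project, message


def transfer_durations(log_text: str) -> dict:
    """Map each transferred file name to its elapsed seconds."""
    started, durations = {}, {}
    for line in log_text.splitlines():
        if line.count("|") < 2:
            continue  # skip blank or unrecognized lines
        when, _project, message = parse_line(line)
        words = message.split()
        # Expected shape: "[file_xfer] Started|Finished upload|download of file NAME"
        if len(words) == 6 and words[0] == "[file_xfer]" and words[3:5] == ["of", "file"]:
            name = words[5]
            if words[1] == "Started":
                started[name] = when
            elif words[1] == "Finished" and name in started:
                durations[name] = (when - started[name]).total_seconds()
    return durations


if __name__ == "__main__":
    for name, seconds in transfer_durations(SAMPLE).items():
        print(f"{name}: {seconds:.0f} s")
```

For the download in the sample, the timestamps are 34 seconds apart; multiplied by the reported 253008 bytes/sec that suggests a file of very roughly 8.6 MB, bearing in mind the timestamps only have one-second resolution.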
Profile Jord
Volunteer tester
Avatar

Send message
Joined: 9 Jun 99
Posts: 15184
Credit: 4,362,181
RAC: 3
Netherlands
Message 884929 - Posted: 13 Apr 2009, 14:13:00 UTC

Might be me. Availability is enough to give me at least one. But not according to my messages:

13-Apr-09 16:06:56 SETI@home [sched_op_debug] CPU work request: 14688.59 seconds; 2 idle CPUs
13-Apr-09 16:07:01 SETI@home Scheduler request completed: got 0 new tasks
13-Apr-09 16:07:01 SETI@home [sched_op_debug] Server version 607
13-Apr-09 16:07:01 SETI@home Message from server: No work sent
13-Apr-09 16:07:01 SETI@home Message from server: No work is available for SETI@home Enhanced
13-Apr-09 16:07:01 SETI@home Message from server: No work available for the applications you have selected. Please check your settings on the web site.
13-Apr-09 16:07:01 SETI@home Project requested delay of 11 seconds
13-Apr-09 16:07:01 SETI@home [sched_op_debug] Deferring communication for 11 sec
13-Apr-09 16:07:01 SETI@home [sched_op_debug] Reason: requested by project

Only asking for a bit of MB. ;-)
ID: 884929 · Report as offensive
Profile Bill Walker
Avatar

Send message
Joined: 4 Sep 99
Posts: 3868
Credit: 2,697,267
RAC: 0
Canada
Message 884943 - Posted: 13 Apr 2009, 15:20:33 UTC
Last modified: 13 Apr 2009, 15:21:56 UTC

Maybe this is just a coincidence, but since switching to BOINC 6.6.20 I've gone from usually having one or two APs waiting, as well as one running, to now having one AP 84% complete, zero waiting, and getting this when I update SETI:

13/04/2009 11:18:15 AM SETI@home Sending scheduler request: Requested by user.
13/04/2009 11:18:15 AM SETI@home Not reporting or requesting tasks
13/04/2009 11:18:21 AM SETI@home Scheduler request completed: got 0 new tasks

Is this the SETI servers, or BOINC, or what? I'm running Rosetta and WCG on the same computer, with no big change in their amount of work.

ID: 884943 · Report as offensive
Profile Bill Walker
Avatar

Send message
Joined: 4 Sep 99
Posts: 3868
Credit: 2,697,267
RAC: 0
Canada
Message 885063 - Posted: 13 Apr 2009, 19:47:49 UTC - in response to Message 884943.  

Maybe this is just a coincidence, but since switching to BOINC 6.6.20 I've gone from usually having one or two APs waiting, as well as one running, to now having one AP 84% complete, zero waiting, and getting this when I update SETI:

13/04/2009 11:18:15 AM SETI@home Sending scheduler request: Requested by user.
13/04/2009 11:18:15 AM SETI@home Not reporting or requesting tasks
13/04/2009 11:18:21 AM SETI@home Scheduler request completed: got 0 new tasks

Is this the SETI servers, or BOINC, or what? I'm running Rosetta and WCG on the same computer, with no big change in their amount of work.


Cancel the panic, BOINC decided to download 3 APs after my last finished:

13/04/2009 3:40:19 PM SETI@home Started upload of ap_26fe09ab_B0_P0_00137_20090409_17510.wu_2_0
13/04/2009 3:40:24 PM SETI@home Finished upload of ap_26fe09ab_B0_P0_00137_20090409_17510.wu_2_0
13/04/2009 3:40:34 PM SETI@home Sending scheduler request: To fetch work.
13/04/2009 3:40:34 PM SETI@home Reporting 1 completed tasks, requesting new tasks



ID: 885063 · Report as offensive
Profile KW2E
Avatar

Send message
Joined: 18 May 99
Posts: 346
Credit: 104,396,190
RAC: 34
United States
Message 886081 - Posted: 17 Apr 2009, 15:45:17 UTC

Holy database crash Batman!

Even the cricket reports are offline.

Rob
ID: 886081 · Report as offensive

 