Panic Mode On (22) Server problems

cliff west
Joined: 7 May 01
Posts: 211
Credit: 16,180,728
RAC: 15
United States
Message 925465 - Posted: 12 Aug 2009, 2:00:46 UTC - in response to Message 925315.  

And now WUs are not uploading. Has anyone downloaded any today?
ID: 925465

Pablo_ARG
Joined: 13 Aug 03
Posts: 12
Credit: 2,041,544
RAC: 0
Argentina
Message 925467 - Posted: 12 Aug 2009, 2:13:56 UTC

Same here. More than 30 WUs waiting in the upload queue.
ID: 925467

Byron S Goodgame
Volunteer tester
Joined: 16 Jan 06
Posts: 1145
Credit: 3,936,993
RAC: 0
United States
Message 925477 - Posted: 12 Aug 2009, 3:12:16 UTC - in response to Message 925465.  
Last modified: 12 Aug 2009, 3:17:11 UTC

Can't report either. If they do manage to upload, will I be able to get new downloads without the others reporting? I did get some downloads earlier, but they haven't all downloaded yet.
ID: 925477

Josef W. Segur
Volunteer developer
Volunteer tester
Joined: 30 Oct 99
Posts: 4504
Credit: 1,414,761
RAC: 0
United States
Message 925480 - Posted: 12 Aug 2009, 3:30:19 UTC - in response to Message 925477.  

Can't report either. If they do manage to upload, will I be able to get new downloads without the others reporting? I did get some downloads earlier, but they haven't all downloaded yet.

Reporting and work fetch are done in the same sched_request, so usually either both succeed or neither does.
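To illustrate the point about reporting and work fetch sharing one request, here is a minimal sketch. It is only a toy model: `SchedRequest`, `SchedOutcome`, and `contact_scheduler` are hypothetical names invented for this example and are not the real BOINC request format or client API.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class SchedRequest:
    """Hypothetical stand-in for one scheduler request (not the real BOINC format)."""
    results_to_report: List[str] = field(default_factory=list)
    work_seconds_wanted: float = 0.0


@dataclass
class SchedOutcome:
    reported: List[str]
    work_request_delivered: bool


def contact_scheduler(request: SchedRequest, scheduler_reachable: bool) -> SchedOutcome:
    """Both halves travel in the same request, so they share its fate."""
    if not scheduler_reachable:
        # The request never got through: nothing reported, no work asked for.
        return SchedOutcome(reported=[], work_request_delivered=False)
    # The request arrived: results are reported and the work request is seen.
    # Whether the server actually has work to hand out is a separate question.
    return SchedOutcome(
        reported=list(request.results_to_report),
        work_request_delivered=request.work_seconds_wanted > 0,
    )
```

In this model a failed contact leaves both the reports and the work request pending, which matches the "both or neither" behaviour described above.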

During the outage they found at least one more file to be split for AP, and adding that to the nearly full pipe caused by MB 'shorties' has saturated things for now. I'll guess about 5 hours before the APs are all out, or if some longer MB work turns up the difficulties could end sooner.
                                                               Joe
ID: 925480

Byron S Goodgame
Volunteer tester
Joined: 16 Jan 06
Posts: 1145
Credit: 3,936,993
RAC: 0
United States
Message 925497 - Posted: 12 Aug 2009, 4:30:09 UTC - in response to Message 925480.  

OK, thanks. Patiently waiting, hoping for the best.
ID: 925497

[B^S] madmac
Volunteer tester
Joined: 9 Feb 04
Posts: 1175
Credit: 4,754,897
RAC: 0
United Kingdom
Message 925510 - Posted: 12 Aug 2009, 6:34:56 UTC

OK, thanks for this info. Switched on this morning and one WU could not upload. That's usual after an outage, so I too will wait it out.
ID: 925510

Richard Haselgrove Project Donor
Volunteer tester
Joined: 4 Jul 99
Posts: 14650
Credit: 200,643,578
RAC: 874
United Kingdom
Message 925520 - Posted: 12 Aug 2009, 8:31:33 UTC - in response to Message 925480.  

During the outage they found at least one more file to be split for AP, and adding that to the nearly full pipe caused by MB 'shorties' has saturated things for now. I'll guess about 5 hours before the APs are all out, or if some longer MB work turns up the difficulties could end sooner.
                                                               Joe

All the AP tapes have gone from the server status page - presumably split at maximum possible speed by all four splitters working at once, with the biggest possible impact on bandwidth - yup, Scarecrow's daemon history page confirms that. And they finished just as the new daily quota kicked in. Be a while before the mammoths move on from the watering hole, and the rest of us can take a sip.
ID: 925520

Byron S Goodgame
Volunteer tester
Joined: 16 Jan 06
Posts: 1145
Credit: 3,936,993
RAC: 0
United States
Message 925528 - Posted: 12 Aug 2009, 9:58:31 UTC - in response to Message 925520.  

Finally got the earlier DL's to go thru, and after waiting some more, got the uploads to go thru.
Now
SETI@home Message from server: (Project has no jobs available)
Seems to be the theme of my messages :P

Getting a few shorties though so I'm still plugging along.
ID: 925528

W-K 666 Project Donor
Volunteer tester
Joined: 18 May 99
Posts: 19048
Credit: 40,757,560
RAC: 67
United Kingdom
Message 925535 - Posted: 12 Aug 2009, 11:31:05 UTC - in response to Message 925520.  

During the outage they found at least one more file to be split for AP, and adding that to the nearly full pipe caused by MB 'shorties' has saturated things for now. I'll guess about 5 hours before the APs are all out, or if some longer MB work turns up the difficulties could end sooner.
                                                               Joe

All the AP tapes have gone from the server status page - presumably split at maximum possible speed by all four splitters working at once, with the biggest possible impact on bandwidth - yup, Scarecrow's daemon history page confirms that. And they finished just as the new daily quota kicked in. Be a while before the mammoths move on from the watering hole, and the rest of us can take a sip.

That's the West coast mammoths probably. For us Europeans that do not run 24/7, we probably switch off before Matt, et al, finish work, and then the AP tasks are all taken before we switch back on the following morning.
ID: 925535

Richard Haselgrove Project Donor
Volunteer tester
Joined: 4 Jul 99
Posts: 14650
Credit: 200,643,578
RAC: 874
United Kingdom
Message 925538 - Posted: 12 Aug 2009, 11:43:35 UTC - in response to Message 925535.  

During the outage they found at least one more file to be split for AP, and adding that to the nearly full pipe caused by MB 'shorties' has saturated things for now. I'll guess about 5 hours before the APs are all out, or if some longer MB work turns up the difficulties could end sooner.
                                                               Joe

All the AP tapes have gone from the server status page - presumably split at maximum possible speed by all four splitters working at once, with the biggest possible impact on bandwidth - yup, Scarecrow's daemon history page confirms that. And they finished just as the new daily quota kicked in. Be a while before the mammoths move on from the watering hole, and the rest of us can take a sip.

That's the West coast mammoths probably. For us Europeans that do not run 24/7, we probably switch off before Matt, et al, finish work, and then the AP tasks are all taken before we switch back on the following morning.

Do mammoths ever sleep?
ID: 925538

James Sotherden
Joined: 16 May 99
Posts: 10436
Credit: 110,373,059
RAC: 54
United States
Message 925545 - Posted: 12 Aug 2009, 12:24:45 UTC

Servers seem slow, but I did get some work for this new i7, and was able to return some work from the other two crunchers.

Old James
ID: 925545

Fred J. Verster
Volunteer tester
Joined: 21 Apr 04
Posts: 3252
Credit: 31,903,643
RAC: 0
Netherlands
Message 925561 - Posted: 12 Aug 2009, 14:22:41 UTC - in response to Message 925545.  

Hi, I received a lot of work on 3 quads, but saw some errors too, all MB WUs ending after 1 second.
I'll have a look into it.

ID: 925561

Byron S Goodgame
Volunteer tester
Joined: 16 Jan 06
Posts: 1145
Credit: 3,936,993
RAC: 0
United States
Message 925563 - Posted: 12 Aug 2009, 14:34:08 UTC - in response to Message 925561.  
Last modified: 12 Aug 2009, 14:35:02 UTC

Hi, I received a lot of work on 3 quads, but saw some errors too, all MB WUs ending after 1 second.
I'll have a look into it.

VLAR WU (AR: 0.010580 )detected... autokill initialised
SETI@home error -6 Bad workunit header

Looks like VLAR kill so far with the ones returned recently.
ID: 925563

Richard Haselgrove Project Donor
Volunteer tester
Joined: 4 Jul 99
Posts: 14650
Credit: 200,643,578
RAC: 874
United Kingdom
Message 925611 - Posted: 12 Aug 2009, 17:40:48 UTC

Four new tapes loaded, four AP splitters running, downloads maxxed out, uploads stalled - we're back to the good old days, before the doubled sensitivity (and the new data disks don't seem to have arrived from Arecibo yet - it's all old data).
ID: 925611

Grant (SSSF)
Volunteer tester
Joined: 19 Aug 99
Posts: 13732
Credit: 208,696,464
RAC: 304
Australia
Message 925614 - Posted: 12 Aug 2009, 17:54:41 UTC - in response to Message 925611.  


And the Assimilators haven't assimilated anything for the last few days.
Log jam approaching.
Grant
Darwin NT
ID: 925614

Cosmic_Ocean
Joined: 23 Dec 00
Posts: 3027
Credit: 13,516,867
RAC: 13
United States
Message 925677 - Posted: 12 Aug 2009, 21:21:19 UTC - in response to Message 925611.  

Four new tapes loaded, four AP splitters running, downloads maxxed out, uploads stalled - we're back to the good old days, before the doubled sensitivity (and the new data disks don't seem to have arrived from Arecibo yet - it's all old data).

One thing I just thought of..and maybe this has been mentioned before..

Is there any reason why we have to have four splitters running for AP? I know they work really well when they work on the same tape, whereas MB works better on separate tapes.. but I noticed that with the 97:3 ratio for the feeder cache, AP is still burning through tapes much faster than MB. So if they were to cut back to just one splitter..would that help anything?
Linux laptop:
record uptime: 1511d 20h 19m (ended due to the power brick giving-up)
ID: 925677

arkayn
Volunteer tester
Joined: 14 May 99
Posts: 4438
Credit: 55,006,323
RAC: 0
United States
Message 925685 - Posted: 12 Aug 2009, 21:50:51 UTC

Looks like Anakin is having problems currently as I cannot even report finished uploads right now.

ID: 925685

Josef W. Segur
Volunteer developer
Volunteer tester
Joined: 30 Oct 99
Posts: 4504
Credit: 1,414,761
RAC: 0
United States
Message 925731 - Posted: 13 Aug 2009, 1:27:23 UTC - in response to Message 925677.  

Four new tapes loaded, four AP splitters running, downloads maxxed out, uploads stalled - we're back to the good old days, before the doubled sensitivity (and the new data disks don't seem to have arrived from Arecibo yet - it's all old data).

One thing I just thought of..and maybe this has been mentioned before..

Is there any reason why we have to have four splitters running for AP? I know they work really well when they work on the same tape, whereas MB works better on separate tapes.. but I noticed that with the 97:3 ratio for the feeder cache, AP is still burning through tapes much faster than MB. So if they were to cut back to just one splitter..would that help anything?

Sure it would. The four ap_splitter processes together manage to produce about 0.6 tasks per second; even dropping to two would halve that and reduce the average download bandwidth for AP from ~42 Mbit/s to ~21 Mbit/s. Reducing the number of mb_splitter processes to three then might be enough to let the "Results ready to send" queues empty out. After that, work delivery would be just what the splitters could produce between each Feeder run. To the extent the download saturation is directly responsible for the difficulty in uploading and contacting the Scheduler, that should help a lot. We'd get a lot of "(Project has no work available)" responses to work requests, but that's better than no response at all.

The mb_splitter processes each work on a different file in an attempt to avoid getting all 'shorty' work, which doesn't always succeed. Both kinds of splitters use the same routine to get data from a tape channel, so it probably doesn't make a lot of difference whether those channels are from the same file or different files, though having fewer files in play at one time would obviously mean at least slightly less system load.
                                                              Joe
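
The arithmetic in the post above can be checked with a short sketch. It uses only the figures quoted there (four ap_splitter processes, about 0.6 tasks per second, about 42 Mbit/s) and assumes, as the post does, that output scales linearly with the number of splitters. The function names are made up for this example.

```python
from typing import Tuple

# Figures quoted in the post: four ap_splitter processes, ~0.6 tasks/s, ~42 Mbit/s.
QUOTED_SPLITTERS = 4
QUOTED_TASKS_PER_SEC = 0.6
QUOTED_MBITS_PER_SEC = 42.0


def ap_download_load(splitters: int) -> Tuple[float, float]:
    """Return (tasks per second, Mbit/s), assuming linear scaling with splitter count."""
    scale = splitters / QUOTED_SPLITTERS
    return QUOTED_TASKS_PER_SEC * scale, QUOTED_MBITS_PER_SEC * scale


def implied_task_size_mbits() -> float:
    """Average download size per AP task implied by the two quoted figures."""
    return QUOTED_MBITS_PER_SEC / QUOTED_TASKS_PER_SEC


for n in (4, 2, 1):
    tasks, mbits = ap_download_load(n)
    print(f"{n} splitter(s): ~{tasks:.2f} tasks/s, ~{mbits:.1f} Mbit/s")
```

Under the same linear assumption, a single splitter would come out near 10.5 Mbit/s, and the two quoted figures together imply roughly 70 Mbit of download per AP task.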
ID: 925731

1mp0£173
Volunteer tester
Joined: 3 Apr 99
Posts: 8423
Credit: 356,897
RAC: 0
United States
Message 925752 - Posted: 13 Aug 2009, 2:43:33 UTC - in response to Message 925731.  
Last modified: 13 Aug 2009, 2:43:59 UTC

Four new tapes loaded, four AP splitters running, downloads maxxed out, uploads stalled - we're back to the good old days, before the doubled sensitivity (and the new data disks don't seem to have arrived from Arecibo yet - it's all old data).

One thing I just thought of..and maybe this has been mentioned before..

Is there any reason why we have to have four splitters running for AP? I know they work really well when they work on the same tape, whereas MB works better on separate tapes.. but I noticed that with the 97:3 ratio for the feeder cache, AP is still burning through tapes much faster than MB. So if they were to cut back to just one splitter..would that help anything?

Sure it would. The four ap_splitter processes together manage to produce about 0.6 tasks per second; even dropping to two would halve that and reduce the average download bandwidth for AP from ~42 Mbit/s to ~21 Mbit/s. Reducing the number of mb_splitter processes to three then might be enough to let the "Results ready to send" queues empty out. After that, work delivery would be just what the splitters could produce between each Feeder run. To the extent the download saturation is directly responsible for the difficulty in uploading and contacting the Scheduler, that should help a lot. We'd get a lot of "(Project has no work available)" responses to work requests, but that's better than no response at all.

The mb_splitter processes each work on a different file in an attempt to avoid getting all 'shorty' work, which doesn't always succeed. Both kinds of splitters use the same routine to get data from a tape channel, so it probably doesn't make a lot of difference whether those channels are from the same file or different files, though having fewer files in play at one time would obviously mean at least slightly less system load.
                                                              Joe

I was thinking of some sort of cron job that looked at the progress splitting AP and the progress splitting MB, and maybe stopped three of the AP splitters if they got well ahead of MB.

That could be done without modifying BOINC.
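
The decision such a periodic job would make might look something like the sketch below. Everything here is an assumption for illustration: the function name, the idea of expressing progress as a fraction of the current data already split, and the 10% lead threshold are all hypothetical, and nothing here reflects how the project's servers are actually managed.

```python
def ap_splitters_to_run(ap_progress: float, mb_progress: float,
                        lead_threshold: float = 0.10,
                        max_splitters: int = 4,
                        min_splitters: int = 1) -> int:
    """Decide how many AP splitters to leave running.

    ap_progress and mb_progress are fractions (0.0 to 1.0) of the current
    data split so far; how to measure them is deliberately left open.
    """
    if not (0.0 <= ap_progress <= 1.0 and 0.0 <= mb_progress <= 1.0):
        raise ValueError("progress values must be between 0.0 and 1.0")
    if ap_progress - mb_progress > lead_threshold:
        # AP is well ahead of MB: stop three of the four AP splitters.
        return min_splitters
    # Otherwise let all AP splitters run.
    return max_splitters
```

A cron entry could run a small wrapper around this every few minutes. How the wrapper reads splitter progress and how it stops or restarts splitter processes depends on the server setup and is not shown.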
ID: 925752

Grant (SSSF)
Volunteer tester
Joined: 19 Aug 99
Posts: 13732
Credit: 208,696,464
RAC: 304
Australia
Message 925776 - Posted: 13 Aug 2009, 6:43:56 UTC - in response to Message 925752.  


Still no assimilating by the Assimilators.
Grant
Darwin NT
ID: 925776

 
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.