Panic Mode On (113) Server Problems?

Grant (SSSF)
Volunteer tester
Joined: 19 Aug 99
Posts: 13399
Credit: 208,696,464
RAC: 304
Australia
Message 1962537 - Posted: 30 Oct 2018, 6:34:30 UTC

Well, things did get interesting overnight.
Looks like the return rate just missed out on a sustained 300k/s; at least it's now dropped to "only" 190k, and the splitters managed to get going again & are mostly keeping up with the present demand, occasionally getting ahead & rebuilding the Ready-to-send buffer, then dropping behind so the buffer drains again.
And while the deleters managed to catch up & get back on top of the workload, the Assimilators appear to have given up: the present backlog is just short of 1 million.

Interesting that after all the VLAR Arecibo work we had been getting, now that we've got lots of quick GBT jobs we're also getting lots of shorter-running Arecibo work. The perversity of nature.
Grant
Darwin NT
ID: 1962537
Stephen "Heretic" (Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor)
Volunteer tester
Joined: 20 Sep 12
Posts: 5557
Credit: 192,787,363
RAC: 628
Australia
Message 1962540 - Posted: 30 Oct 2018, 7:55:25 UTC - in response to Message 1962502.  

. . the Blc22 tasks are from the same date/time and are just as noisy, they are all noise bombs. We have to hope there is a good supply of Arecibo tapes to keep the Arecibo VLARs coming to slow things down and let the RTS refill. Hopefully the remaining blc22/blc23 tapes will split and clear before the outage so that maybe the Blc01 tapes can start to come out and are less noise prone.
Stephen

I'm not finding many BLC22 noise bombs at all. I'd say 90% are good.


. . Then I suggest you take the opportunity to buy a lottery ticket :). I am having the opposite results here. On all 4 machines.

Stephen

:(
ID: 1962540
Stephen "Heretic" (Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor)
Volunteer tester
Joined: 20 Sep 12
Posts: 5557
Credit: 192,787,363
RAC: 628
Australia
Message 1962541 - Posted: 30 Oct 2018, 7:57:57 UTC - in response to Message 1962506.  

Well, Green Bank should not have any noise. IMO the noise is probably a hardware issue. We will never know.


. . Even the best locations can experience RFI at some time. The earlier tapes in the current two series (which cover the same observing period) were pretty good and noise free, but at the tail end of the series both series seem to have experienced a rise in RFI.

Stephen

:(
ID: 1962541
Ian&Steve C.
Joined: 28 Sep 99
Posts: 4267
Credit: 1,282,604,591
RAC: 6,640
United States
Message 1962595 - Posted: 30 Oct 2018, 15:20:21 UTC
Last modified: 30 Oct 2018, 15:21:23 UTC

Still up? I've already crunched through 1000 of my 3600 WU cache from 9am lol. Lots of noise.
Seti@Home classic workunits: 29,492 CPU time: 134,419 hours

ID: 1962595
Stephen "Heretic" (Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor)
Volunteer tester
Joined: 20 Sep 12
Posts: 5557
Credit: 192,787,363
RAC: 628
Australia
Message 1962596 - Posted: 31 Oct 2018, 5:26:05 UTC

. . Well, the message boards are back up even if the servers ain't....

Stephen

:)
ID: 1962596
Unixchick (Project Donor)
Joined: 5 Mar 12
Posts: 815
Credit: 2,361,516
RAC: 22
United States
Message 1962599 - Posted: 31 Oct 2018, 5:36:21 UTC

looks like the splitters are back up and running, but an OUTRAGE this long will take a while to recover from, especially if we still have noise bombs in the queue.
ID: 1962599
Stephen "Heretic" (Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor)
Volunteer tester
Joined: 20 Sep 12
Posts: 5557
Credit: 192,787,363
RAC: 628
Australia
Message 1962602 - Posted: 31 Oct 2018, 5:42:37 UTC - in response to Message 1962599.  

looks like the splitters are back up and running, but an OUTRAGE this long will take a while to recover from, especially if we still have noise bombs in the queue.


. . With empty caches I will take whatever they have to send :)

Stephen

<fingers crossed>
ID: 1962602
Jimbocous (Project Donor)
Volunteer tester
Joined: 1 Apr 13
Posts: 1849
Credit: 268,616,081
RAC: 1,349
United States
Message 1962606 - Posted: 31 Oct 2018, 5:47:58 UTC

I just got a dozen or so, 80% were noisy, 14 sec runtime.
ID: 1962606
Grant (SSSF)
Volunteer tester
Joined: 19 Aug 99
Posts: 13399
Credit: 208,696,464
RAC: 304
Australia
Message 1962610 - Posted: 31 Oct 2018, 6:00:50 UTC
Last modified: 31 Oct 2018, 6:03:47 UTC

Reporting work OK, but not able to get any so far.
Scheduler requests are presently taking 20-30 secs instead of the usual 3 secs.

And the forums are a bit sluggish at the moment.
Grant
Darwin NT
ID: 1962610
Stephen "Heretic" (Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor)
Volunteer tester
Joined: 20 Sep 12
Posts: 5557
Credit: 192,787,363
RAC: 628
Australia
Message 1962613 - Posted: 31 Oct 2018, 6:06:21 UTC - in response to Message 1962606.  

I just got a dozen or so, 80% were noisy, 14 sec runtime.


. . I'm getting those HTTP errors and "Internet OK but servers may be down" messages. NO work for me :(

Stephen

:(
ID: 1962613
zoom3+1=4
Volunteer tester
Joined: 30 Nov 03
Posts: 64989
Credit: 55,293,173
RAC: 49
United States
Message 1962614 - Posted: 31 Oct 2018, 6:08:09 UTC

All I'm getting is "No new tasks are available".
The T1 Trust, PRR T1 Class 4-4-4-4 #5550, 1 of America's First HST's
ID: 1962614
Grant (SSSF)
Volunteer tester
Joined: 19 Aug 99
Posts: 13399
Credit: 208,696,464
RAC: 304
Australia
Message 1962615 - Posted: 31 Oct 2018, 6:08:27 UTC - in response to Message 1962613.  

. . I'm getting those HTTP errors and "Internet OK but servers may be down" messages. NO work for me :(

Just got a couple of Scheduler errors, then able to contact it OK again on the next request.
Still no work though.
Grant
Darwin NT
ID: 1962615
Grant (SSSF)
Volunteer tester
Joined: 19 Aug 99
Posts: 13399
Credit: 208,696,464
RAC: 304
Australia
Message 1962620 - Posted: 31 Oct 2018, 6:17:10 UTC

A lot of the server process info hasn't started updating either.
Grant
Darwin NT
ID: 1962620
Grant (SSSF)
Volunteer tester
Joined: 19 Aug 99
Posts: 13399
Credit: 208,696,464
RAC: 304
Australia
Message 1962623 - Posted: 31 Oct 2018, 6:34:23 UTC - in response to Message 1962620.  

A lot of the server process info hasn't started updating either.

Looks like they're updating again, and for all the WUs supposedly ready-to-send, the Scheduler isn't giving any of them away.
Grant
Darwin NT
ID: 1962623
Grant (SSSF)
Volunteer tester
Joined: 19 Aug 99
Posts: 13399
Credit: 208,696,464
RAC: 304
Australia
Message 1962625 - Posted: 31 Oct 2018, 6:48:20 UTC
Last modified: 31 Oct 2018, 6:55:46 UTC

Well, the system that still has some work managed to pick up 50 new WUs.
The one that is out of work managed to pick up 1 single solitary WU, which took 10 seconds to process.
Grant
Darwin NT
ID: 1962625
Tom M
Volunteer tester
Joined: 28 Nov 02
Posts: 5124
Credit: 276,046,078
RAC: 462
Message 1962626 - Posted: 31 Oct 2018, 6:57:06 UTC - in response to Message 1962625.  

Well, the system that still has some work managed to pick up 50 new WUs.
The one that is out of work managed to pick up 1 single solitary WU, which took 10 seconds to process.


I haven't managed to pick up anything on one system :(

Tom
A proud member of the OFA (Old Farts Association).
ID: 1962626
Stephen "Heretic" (Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor)
Volunteer tester
Joined: 20 Sep 12
Posts: 5557
Credit: 192,787,363
RAC: 628
Australia
Message 1962627 - Posted: 31 Oct 2018, 7:05:41 UTC - in response to Message 1962615.  

. . I'm getting those HTTP errors and "Internet OK but servers may be down" messages. NO work for me :(

Just got a couple of Scheduler errors, then able to contact it OK again on the next request.
Still no work though.


. . I'm now getting some small amounts of work for the GPUs, but not seeing any for the CPUs.

Stephen

<shrug>
ID: 1962627
Stephen "Heretic" (Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor)
Volunteer tester
Joined: 20 Sep 12
Posts: 5557
Credit: 192,787,363
RAC: 628
Australia
Message 1962628 - Posted: 31 Oct 2018, 7:07:28 UTC - in response to Message 1962625.  

Well, the system that still has some work managed to pick up 50 new WUs.
The one that is out of work managed to pick up 1 single solitary WU, which took 10 seconds to process.


. . Saw that same thing on my most powerful unit, except I was lucky and it was NOT a noise bomb; it took 3 minutes to finish :)

Stephen
ID: 1962628
Grant (SSSF)
Volunteer tester
Joined: 19 Aug 99
Posts: 13399
Credit: 208,696,464
RAC: 304
Australia
Message 1962629 - Posted: 31 Oct 2018, 7:14:04 UTC - in response to Message 1962627.  
Last modified: 31 Oct 2018, 7:51:43 UTC

. . I'm now getting some small amounts of work for the GPUs, but not seeing any for the CPUs.

Don't expect anything for the CPU till the GPUs have reached (or are very close to) the server side limits; although surprisingly I've managed to get some for my CPU in this most recent batch of work.

Just managed to pick up some work on the dry system- and it took over a minute for the downloads to finally start happening (and 4 of them so far have been noise bombs).
This is going to be a very messy recovery.
Grant
Darwin NT
ID: 1962629
Grant (SSSF)
Volunteer tester
Joined: 19 Aug 99
Posts: 13399
Credit: 208,696,464
RAC: 304
Australia
Message 1962632 - Posted: 31 Oct 2018, 7:51:25 UTC
Last modified: 31 Oct 2018, 8:30:04 UTC

I hope the splitters can get their act together soon, otherwise there won't be much work left when the Scheduler finally starts handing out work regularly.
Grant
Darwin NT
ID: 1962632


 