Panic Mode On (113) Server Problems?

Sirius B Project Donor
Volunteer tester
Joined: 26 Dec 00
Posts: 24922
Credit: 3,081,182
RAC: 7
Ireland
Message 1962113 - Posted: 27 Oct 2018, 12:11:37 UTC
Last modified: 27 Oct 2018, 12:17:05 UTC

Haven't touched settings for a considerable time.
Not running other projects for some time.
Got several WUs about 8am & now only getting one at a time, every time a WU is completed & reported. What's happened to the 100 WU limit (only have 9 atm)? How can the project report the cache is full when it is empty?
ID: 1962113 · Report as offensive
Richard Haselgrove Project Donor
Volunteer tester

Joined: 4 Jul 99
Posts: 14687
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1962115 - Posted: 27 Oct 2018, 12:33:35 UTC - in response to Message 1962113.  

Set <sched_op_debug> in Event Log options (Ctrl+Shift+F) and read the numbers. If that doesn't explain it, set <work_fetch_debug>. If you need help in analysing the log, please start a separate thread - it sounds like a problem at your end, not a server problem.
ID: 1962115 · Report as offensive
Sirius B Project Donor
Volunteer tester
Joined: 26 Dec 00
Posts: 24922
Credit: 3,081,182
RAC: 7
Ireland
Message 1962121 - Posted: 27 Oct 2018, 13:11:44 UTC - in response to Message 1962115.  

Done.
ID: 1962121 · Report as offensive
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 13886
Credit: 208,696,464
RAC: 304
Australia
Message 1962170 - Posted: 27 Oct 2018, 21:26:04 UTC - in response to Message 1962113.  
Last modified: 27 Oct 2018, 21:34:01 UTC

Haven't touched settings for a considerable time.
Not running other projects for some time.

Your system shows it's got plenty of Beta AP work.

It takes over 24 hours for you to process 2 AP WUs with the SSE application, and well over 2 days for a single AP WU with the v7.00. You've got 6 SSE & 5 v7.00 WUs in progress, so that makes almost 17 days worth of AP work on your system at present. And it looks like almost 2 days worth of MB work. That's almost 20 days worth of work, which is the maximum possible cache setting.
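
As a rough check on that arithmetic, here is a minimal Python sketch. It assumes the tasks run one after another (no multi-core overlap) and uses only the lower-bound runtimes quoted above. The name `days_of_work` and the constants are illustrative, not anything from BOINC itself.

```python
def days_of_work(tasks_in_progress: int, days_per_task: float) -> float:
    """Days needed to finish the queued tasks if they run one at a time."""
    return tasks_in_progress * days_per_task


# Lower-bound runtimes taken from the post (assumption: real values are higher).
SSE_DAYS_PER_TASK = 0.5   # "over 24 hours" for 2 AP WUs -> at least 0.5 day each
V7_DAYS_PER_TASK = 2.0    # "well over 2 days" for a single AP WU
MB_DAYS = 2.0             # "almost 2 days worth" of MB work

ap_days_min = days_of_work(6, SSE_DAYS_PER_TASK) + days_of_work(5, V7_DAYS_PER_TASK)
total_days_min = ap_days_min + MB_DAYS

print(f"AP lower bound: {ap_days_min} days")
print(f"AP + MB lower bound: {total_days_min} days")
```

The lower bound works out to 13 days of AP work. The "almost 17 days" figure follows if the real runtimes are roughly 30% longer than those minimums, and about 2 days of MB work on top of that lands just under 20 days.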

Edit- noticed different application runtimes & therefore cache size.
Grant
Darwin NT
ID: 1962170 · Report as offensive
Stephen "Heretic" · Crowdfunding Project Donor · Special Project $75 donor · Special Project $250 donor
Volunteer tester
Joined: 20 Sep 12
Posts: 5557
Credit: 192,787,363
RAC: 628
Australia
Message 1962194 - Posted: 28 Oct 2018, 0:07:22 UTC - in response to Message 1962170.  

Haven't touched settings for a considerable time.
Not running other projects for some time.

Your system shows it's got plenty of Beta AP work.

It takes over 24 hours for you to process 2 AP WUs with the SSE application, and well over 2 days for a single AP WU with the v7.00. You've got 6 SSE & 5 v7.00 WUs in progress, so that makes almost 17 days worth of AP work on your system at present. And it looks like almost 2 days worth of MB work. That's almost 20 days worth of work, which is the maximum possible cache setting.

Edit- noticed different application runtimes & therefore cache size.


. . Well spotted Grant ...

Stephen

:)
ID: 1962194 · Report as offensive
Brent Norman · Crowdfunding Project Donor · Special Project $75 donor · Special Project $250 donor
Volunteer tester

Joined: 1 Dec 99
Posts: 2786
Credit: 685,657,289
RAC: 835
Canada
Message 1962370 - Posted: 29 Oct 2018, 7:31:26 UTC

There is a bunch of noisy files now.
Received last hour is up to 164k and climbing.
I hope this doesn't last long.
ID: 1962370 · Report as offensive
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 13886
Credit: 208,696,464
RAC: 304
Australia
Message 1962374 - Posted: 29 Oct 2018, 7:45:06 UTC - in response to Message 1962370.  

There is a bunch of noisy files now.
Received last hour is up to 164k and climbing.
I hope this doesn't last long.

Now 169k.
This is really going to test the servers.
Grant
Darwin NT
ID: 1962374 · Report as offensive
Speedy
Volunteer tester
Joined: 26 Jun 04
Posts: 1643
Credit: 12,921,799
RAC: 89
New Zealand
Message 1962375 - Posted: 29 Oct 2018, 7:45:07 UTC - in response to Message 1962370.  

There is a bunch of noisy files now.
Received last hour is up to 164k and climbing.
I hope this doesn't last long.

It certainly has been a while since we have seen a high return rate
ID: 1962375 · Report as offensive
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 13886
Credit: 208,696,464
RAC: 304
Australia
Message 1962378 - Posted: 29 Oct 2018, 8:46:14 UTC - in response to Message 1962374.  

Now 169k.

180k, but it seems to have levelled off there.
Grant
Darwin NT
ID: 1962378 · Report as offensive
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 13886
Credit: 208,696,464
RAC: 304
Australia
Message 1962382 - Posted: 29 Oct 2018, 9:26:56 UTC - in response to Message 1962378.  

Now 169k.

180k, but it seems to have levelled off there.

Spoke too soon- now over 186k, splitter output has fallen off & Ready-to-send buffer is starting to empty.
Grant
Darwin NT
ID: 1962382 · Report as offensive
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 13886
Credit: 208,696,464
RAC: 304
Australia
Message 1962386 - Posted: 29 Oct 2018, 9:52:23 UTC - in response to Message 1962383.  

198K now.

More than double our present usual load.
At least the splitter output has recovered, mostly. But demand is still way above supply.
It's server thrashing time.
Grant
Darwin NT
ID: 1962386 · Report as offensive
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 13886
Credit: 208,696,464
RAC: 304
Australia
Message 1962388 - Posted: 29 Oct 2018, 10:07:29 UTC
Last modified: 29 Oct 2018, 10:08:57 UTC

Wow.
Just under 210k.

And it looks like the Assimilators are starting to have problems keeping up with that sustained load.
Grant
Darwin NT
ID: 1962388 · Report as offensive
Kevin Olley

Joined: 3 Aug 99
Posts: 906
Credit: 261,085,289
RAC: 572
United Kingdom
Message 1962389 - Posted: 29 Oct 2018, 10:25:32 UTC

221k now.

Looking at my transfers the upload server is struggling at times as well.
Kevin


ID: 1962389 · Report as offensive
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 13886
Credit: 208,696,464
RAC: 304
Australia
Message 1962392 - Posted: 29 Oct 2018, 10:48:20 UTC - in response to Message 1962389.  
Last modified: 29 Oct 2018, 10:51:02 UTC

221k now.

Make that 233k.
And the splitter output is diving, along with the Ready-to-send buffer.
If the load doesn't taper off soon, it's all going to come to a grinding halt till the returned work load drops off significantly and the backlogs clear.
A big batch of Arecibo VLARs & AP work would be helpful right about now.

Looking at my transfers the upload server is struggling at times as well.

I've noticed a few downloads taking a while to get going.
Grant
Darwin NT
ID: 1962392 · Report as offensive
Cruncher-American · Crowdfunding Project Donor · Special Project $75 donor · Special Project $250 donor

Joined: 25 Mar 02
Posts: 1513
Credit: 370,893,186
RAC: 340
United States
Message 1962402 - Posted: 29 Oct 2018, 12:19:22 UTC

Hey - I have noticed many WUs recently that process in < 10 secs. Like dozens or hundreds/day, both on GPU and CPU. Are these the "noisy" WUs referred to above? Are they the variety that has 30 spikes detected and then dropped by the app(s)?
ID: 1962402 · Report as offensive
Unixchick Project Donor
Joined: 5 Mar 12
Posts: 815
Credit: 2,361,516
RAC: 22
United States
Message 1962414 - Posted: 29 Oct 2018, 14:03:59 UTC

Wowie Zowie !!!

Results Returned in Last Hour has hit 237,593 and the RTS is practically empty (30K). The system is still up though. It is 7am in California, so hopefully soon someone will notice and throw us an Arecibo dataset, if they have one.
ID: 1962414 · Report as offensive
Sirius B Project Donor
Volunteer tester
Joined: 26 Dec 00
Posts: 24922
Credit: 3,081,182
RAC: 7
Ireland
Message 1962417 - Posted: 29 Oct 2018, 14:52:03 UTC - in response to Message 1962170.  

I realised that when Grant mentioned Beta earlier. Been too used to seeing 50 Beta & 100 Main. It seems that AP flood upset the applecart. All the MB will be completed by this evening & the AP by the weekend.
ID: 1962417 · Report as offensive
Cruncher-American · Crowdfunding Project Donor · Special Project $75 donor · Special Project $250 donor

Joined: 25 Mar 02
Posts: 1513
Credit: 370,893,186
RAC: 340
United States
Message 1962419 - Posted: 29 Oct 2018, 15:14:01 UTC - in response to Message 1962414.  

Results Returned in Last Hour has hit 237,593 and the RTS is practically empty (30K). The system is still up though. It is 7am in California, so hopefully soon someone will notice and throw us an Arecibo dataset, if they have one.


Yeah. This is consistent with my personal surge of 100s of 10 sec WUs (apparently too many spikes) that I have been getting on both GPU and CPU on 2 crunchers (out of 2). So maybe NOT just me...

My pendings and validated have gone way high in the last couple of days. More than doubled.

Watch for many more WUs being sent to users possibly causing congestion on the outgoing data path from SETI????
ID: 1962419 · Report as offensive
betreger Project Donor
Joined: 29 Jun 99
Posts: 11447
Credit: 29,581,041
RAC: 66
United States
Message 1962422 - Posted: 29 Oct 2018, 15:36:26 UTC

Results ready to send 0 0 205 0m
This is a problem
ID: 1962422 · Report as offensive
Unixchick Project Donor
Joined: 5 Mar 12
Posts: 815
Credit: 2,361,516
RAC: 22
United States
Message 1962427 - Posted: 29 Oct 2018, 15:52:04 UTC - in response to Message 1962422.  

Results ready to send 0 0 205 0m
This is a problem


Definitely a problem. On the bright side, the splitters are still splitting, but just aren't keeping up with demand. The big question is how many of the files to split are garbage. Once we work through the junk and get some good WUs then the system can recover and we can all refill our caches. So far the Results out in the Field is falling slowly.
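
To make the supply-versus-demand point concrete, here is a minimal Python sketch of how quickly a ready-to-send buffer drains when the splitters fall behind. It assumes constant split and demand rates, and that each returned result triggers a request for one new task. The function name and the example rates are hypothetical; only the 30K buffer size comes from the thread.

```python
from typing import Optional


def hours_until_empty(buffer: int, split_per_hour: int, demand_per_hour: int) -> Optional[float]:
    """Hours until the ready-to-send buffer reaches zero.

    Returns None when the splitters keep up with demand (buffer never drains).
    """
    net_drain = demand_per_hour - split_per_hour
    if net_drain <= 0:
        return None
    return buffer / net_drain


# Illustrative numbers only: a 30K buffer, with demand outpacing the
# splitters by 30K results per hour (both rates are made up).
print(hours_until_empty(30_000, 200_000, 230_000))

# If the splitters match or exceed demand, the buffer holds.
print(hours_until_empty(30_000, 230_000, 230_000))
```

With these made-up rates the buffer lasts one hour. The second call shows the recovery case: once the noisy files are worked through and demand drops to what the splitters can supply, the buffer stops draining.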
ID: 1962427 · Report as offensive
 
©2025 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.