Panic Mode On (108) Server Problems?

Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 9977
Credit: 130,704,239
RAC: 82,718
Australia
Message 1899734 - Posted: 7 Nov 2017, 23:45:23 UTC
Last modified: 8 Nov 2017, 0:16:56 UTC

And we're live again...



Hmm.
Might take a while to get some GPU work- it looks like the ready-to-send buffer is pretty much all Arecibo VLARs.
Grant
Darwin NT
ID: 1899734 · Report as offensive
Profile Wiggo "Socialist"
Joined: 24 Jan 00
Posts: 14448
Credit: 187,318,979
RAC: 72,479
Australia
Message 1899783 - Posted: 8 Nov 2017, 4:10:14 UTC
Last modified: 8 Nov 2017, 4:11:16 UTC

Even though I was in the process of bottling beer this morning I did notice that my 1060's just scraped through the outage (having AP's onboard likely helped there) and caches were quickly recovered. :-)

Cheers.
ID: 1899783 · Report as offensive
Stephen "Heretic" Special Project $250 donor
Volunteer tester
Joined: 20 Sep 12
Posts: 3408
Credit: 71,958,401
RAC: 84,420
Australia
Message 1900100 - Posted: 9 Nov 2017, 23:34:20 UTC

. . Alf they're at it again!

. . Since I woke up nothing but "project has no tasks". Kicked the servers heaps but no reaction, cannot even get a "request too soon" message, just no tasks. 600K tasks in the hopper but nothing is firing ...

Stephen

:(
ID: 1900100 · Report as offensive
Profile Keith Myers Special Project $250 donor
Volunteer tester
Joined: 29 Apr 01
Posts: 4611
Credit: 295,483,026
RAC: 614,825
United States
Message 1900104 - Posted: 9 Nov 2017, 23:48:30 UTC

Is it the problem of nobody getting work when requested or is it nothing but Arecibo tasks in the RTS buffer. I haven't seen hide nor hair of any BLC tasks today. What I have been getting is small dribs and drabs of Arecibo shorties. I wonder if everyone is fighting over them for our Nvidia cards?
Seti@Home classic workunits:20,676 CPU time:74,226 hours
ID: 1900104 · Report as offensive
Stephen "Heretic" Special Project $250 donor
Volunteer tester
Joined: 20 Sep 12
Posts: 3408
Credit: 71,958,401
RAC: 84,420
Australia
Message 1900106 - Posted: 9 Nov 2017, 23:57:42 UTC - in response to Message 1900104.  

Is it the problem of nobody getting work when requested or is it nothing but Arecibo tasks in the RTS buffer. I haven't seen hide nor hair of any BLC tasks today. What I have been getting is small dribs and drabs of Arecibo shorties. I wonder if everyone is fighting over them for our Nvidia cards?


. . That may be the case ...

. . Oh well, another rest for the guys coming up ...

Stephen

:(
ID: 1900106 · Report as offensive
Richard Haselgrove Project Donor
Volunteer tester

Joined: 4 Jul 99
Posts: 12035
Credit: 119,773,618
RAC: 44,060
United Kingdom
Message 1900108 - Posted: 10 Nov 2017, 0:11:27 UTC - in response to Message 1900104.  

Is it the problem of nobody getting work when requested or is it nothing but Arecibo tasks in the RTS buffer. I haven't seen hide nor hair of any BLC tasks today. What I have been getting is small dribs and drabs of Arecibo shorties. I wonder if everyone is fighting over them for our Nvidia cards?
I've been requesting NVidia tasks and getting nothing. On another machine, I requested CPU tasks and got (all except one) guppies. I suspect somebody is fiddling with something, but will have it sorted out by the time I get up in the morning. G'night all.
ID: 1900108 · Report as offensive
Stephen "Heretic" Special Project $250 donor
Volunteer tester
Joined: 20 Sep 12
Posts: 3408
Credit: 71,958,401
RAC: 84,420
Australia
Message 1900111 - Posted: 10 Nov 2017, 0:27:49 UTC

. . Infamy! Infamy! The servers have got it in for me!

. . I have changed venue to open up CPU downloads and still I keep getting "project has no tasks". Them there servers don't like this rig very much ...

Stephen

:(
ID: 1900111 · Report as offensive
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 9977
Credit: 130,704,239
RAC: 82,718
Australia
Message 1900114 - Posted: 10 Nov 2017, 1:29:01 UTC
Last modified: 10 Nov 2017, 2:02:33 UTC

Something's borked. Even Tbar's triple update doesn't help.
And the Work-in-progress has been falling for almost 6 hours now.

EDIT- I only barely ran out of work during the usual weekly outage. Looks like it's decided to make up for that. A few more hours & I'll be out of GPU work. 1.5hrs down and not a skerrick.
Grant
Darwin NT
ID: 1900114 · Report as offensive
Profile Jimbocous Project Donor
Volunteer tester
Joined: 1 Apr 13
Posts: 1090
Credit: 107,994,288
RAC: 23,347
United States
Message 1900125 - Posted: 10 Nov 2017, 2:18:20 UTC - in response to Message 1900106.  
Last modified: 10 Nov 2017, 2:18:55 UTC

Is it the problem of nobody getting work when requested or is it nothing but Arecibo tasks in the RTS buffer. I haven't seen hide nor hair of any BLC tasks today. What I have been getting is small dribs and drabs of Arecibo shorties. I wonder if everyone is fighting over them for our Nvidia cards?


. . That may be the case ...

. . Oh well, another rest for the guys coming up ...

Stephen

:(

If the queues are such that Guppis can sit unsent because there are Arecibos ahead of them in the queue that cannot be sent to GPU, seems to me there's a real problem with how work queueing is being handled.
Goes back to the same old question. Now that SoG is the preferred app for MB, is there any reason to inhibit VLARs from being run on GPUs? The Guppi VLARs seem to do just fine on GPU now.
Sure seems like this should be looked at.
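
The concern above can be illustrated with a toy model. This is only a sketch under stated assumptions, not the project's actual scheduler or feeder code: the fixed-size slot array, the task kinds, and the names `fill_slots` and `request_work` are all hypothetical. It shows how a GPU request could come back empty when every slot holds a task the GPU is not allowed to run, even though a suitable task is waiting further back in the queue.

```python
from collections import deque


def fill_slots(queue, num_slots):
    """Take up to num_slots tasks from the front of the queue, in order."""
    slots = []
    while queue and len(slots) < num_slots:
        slots.append(queue.popleft())
    return slots


def request_work(slots, allowed_kinds):
    """Return (and remove) the first slotted task the host may run, else None."""
    for index, task in enumerate(slots):
        if task["kind"] in allowed_kinds:
            return slots.pop(index)
    return None


# Hypothetical queue: three Arecibo VLARs ahead of one Guppi task.
queue = deque(
    [{"name": f"arecibo_{i}", "kind": "arecibo_vlar"} for i in range(3)]
    + [{"name": "guppi_0", "kind": "guppi_vlar"}]
)
slots = fill_slots(queue, num_slots=3)

# Assumed policy from the post: GPUs may take Guppi VLARs but not Arecibo VLARs.
gpu_task = request_work(slots, allowed_kinds={"guppi_vlar"})
cpu_task = request_work(slots, allowed_kinds={"arecibo_vlar", "guppi_vlar"})
```

In this toy run the GPU request finds nothing it can use, while the CPU request is served from the same slots and the Guppi task is still sitting in the queue.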
ID: 1900125 · Report as offensive
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 9977
Credit: 130,704,239
RAC: 82,718
Australia
Message 1900130 - Posted: 10 Nov 2017, 2:36:39 UTC - in response to Message 1900125.  

Now that SoG is the preferred app for MB, is there any reason to inhibit VLARs from being run on GPUs? The Guppi VLARs seem to do just fine on GPU now.
Sure seems like this should be looked at.

A lot of systems are still using CUDA.
Windows/x86 8.00 (cuda23) 800 GigaFLOPS
Windows/x86 8.00 (cuda32) 6,950 GigaFLOPS
Windows/x86 8.00 (cuda42) 24,992 GigaFLOPS
Windows/x86 8.00 (cuda50) 28,292 GigaFLOPS


And whatever's going on is a server problem, not something to do with what work is or isn't in the feeder at the time of a request.
Grant
Darwin NT
ID: 1900130 · Report as offensive
Profile Jeff Buck Special Project $250 donor
Volunteer tester

Joined: 11 Feb 00
Posts: 1441
Credit: 148,764,870
RAC: 0
United States
Message 1900131 - Posted: 10 Nov 2017, 2:37:45 UTC

I don't like the looks of WU 2739307008. One host successfully downloaded and ran it. Everybody else is getting D/L errors on it, 5 hosts so far. My machine's Event Log shows:

11/9/2017 7:17:34 AM | SETI@home | Started download of blc24_2bit_guppi_57895_43958_HIP91357_0024.29905.818.23.46.0.vlar
11/9/2017 7:17:38 AM | SETI@home | Finished download of blc24_2bit_guppi_57895_43958_HIP91357_0024.29905.818.23.46.0.vlar
11/9/2017 7:17:38 AM | SETI@home | [error] MD5 check failed for blc24_2bit_guppi_57895_43958_HIP91357_0024.29905.818.23.46.0.vlar
11/9/2017 7:17:38 AM | SETI@home | [error] expected 4bb0fee3928609f2b1df21e44ac13b4e, got 450a32005c6700d7ab95284edc959572
11/9/2017 7:17:38 AM | SETI@home | [error] Checksum or signature error for blc24_2bit_guppi_57895_43958_HIP91357_0024.29905.818.23.46.0.vlar
I just downloaded the WU manually and didn't seem to have any errors. The file doesn't appear to be truncated, either, ending with "</workunit>" as the last line.

Anybody else run into one (or more) of these today?
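
For anyone wanting to repeat the manual check described above, here is a minimal sketch using Python's standard `hashlib`. The helper names `md5_of_file` and `parse_md5_error` are made up for illustration, and the regular expression only assumes the "expected ..., got ..." line format shown in the Event Log excerpt; it is not taken from the BOINC client itself.

```python
import hashlib
import re


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of the file at `path`, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# Assumed format: "... expected <32 hex chars>, got <32 hex chars>"
_MD5_LINE = re.compile(r"expected ([0-9a-f]{32}), got ([0-9a-f]{32})")


def parse_md5_error(log_line):
    """Return (expected, got) from an MD5 error line, or None if it does not match."""
    match = _MD5_LINE.search(log_line)
    return match.groups() if match else None
```

Comparing `md5_of_file(...)` on the manually downloaded file against the "expected" value from the log would show whether the copy on the download server matches what the scheduler advertises, or whether only the client's download was damaged.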
ID: 1900131 · Report as offensive
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 9977
Credit: 130,704,239
RAC: 82,718
Australia
Message 1900133 - Posted: 10 Nov 2017, 2:47:57 UTC - in response to Message 1900131.  

Anybody else run into one (or more) of these today?

At the moment I can't get any work.
Been getting "Project has no tasks available" for hours now. And the Graphs show it's not just me, Work-in-progress just keeps on falling.

Grant
Darwin NT
ID: 1900133 · Report as offensive
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 9977
Credit: 130,704,239
RAC: 82,718
Australia
Message 1900137 - Posted: 10 Nov 2017, 3:26:36 UTC - in response to Message 1900114.  

EDIT- I only barely ran out of work during the usual weekly outage. Looks like it's decided to make up for that. A few more hours & I'll be out of GPU work. 1.5hrs down and not a skerrick.

Just managed to pick up some work, then back to nothing.
Should get another 40min now before the GPU work runs out.
Grant
Darwin NT
ID: 1900137 · Report as offensive
juan BFP Special Project $75 donor
Volunteer tester
Joined: 16 Mar 07
Posts: 6895
Credit: 390,285,169
RAC: 148,567
Panama
Message 1900138 - Posted: 10 Nov 2017, 3:33:11 UTC

Current result creation rate **: 0/sec (SETI@home v7), 0.9798/sec (Astropulse), 1.2241/sec (SETI@home v8), as of 5m

Something is wrong.
ID: 1900138 · Report as offensive
Profile Jeff Buck Special Project $250 donor
Volunteer tester

Joined: 11 Feb 00
Posts: 1441
Credit: 148,764,870
RAC: 0
United States
Message 1900140 - Posted: 10 Nov 2017, 3:46:33 UTC

Actually having more than 650 unsent Astropulse tasks in the RTS buffer is, I believe, one of the signs of the Apocalypse.
ID: 1900140 · Report as offensive
juan BFP Special Project $75 donor
Volunteer tester
Joined: 16 Mar 07
Posts: 6895
Credit: 390,285,169
RAC: 148,567
Panama
Message 1900141 - Posted: 10 Nov 2017, 3:49:34 UTC
Last modified: 10 Nov 2017, 3:50:09 UTC

Data Distribution State          SETI@home v7   Astropulse   SETI@home v8   As of*
Results ready to send            0              654          629,960        14m
Current result creation rate **  0/sec          0.3544/sec   0.8368/sec     5m

A little more.... LOL
ID: 1900141 · Report as offensive
Profile Keith Myers Special Project $250 donor
Volunteer tester
Joined: 29 Apr 01
Posts: 4611
Credit: 295,483,026
RAC: 614,825
United States
Message 1900147 - Posted: 10 Nov 2017, 3:57:03 UTC - in response to Message 1900140.  

It's up to 920 now. So something has definitely gummed up the works. Normally, the AP tasks are gobbled up as fast as they are created.
Seti@Home classic workunits:20,676 CPU time:74,226 hours
ID: 1900147 · Report as offensive
Profile Jeff Buck Special Project $250 donor
Volunteer tester

Joined: 11 Feb 00
Posts: 1441
Credit: 148,764,870
RAC: 0
United States
Message 1900150 - Posted: 10 Nov 2017, 4:05:17 UTC - in response to Message 1900147.  

Seems like a couple of years, at least, since I've seen more than 2 APs in the RTS, what with the very limited supply and high demand.
ID: 1900150 · Report as offensive
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 9977
Credit: 130,704,239
RAC: 82,718
Australia
Message 1900152 - Posted: 10 Nov 2017, 4:19:12 UTC - in response to Message 1900147.  

It's up to 920 now. So something has definitely gummed up the works. Normally, the AP tasks are gobbled up as fast as they are created.

Scheduler or feeder is my WAG.

There's plenty of work there, and there's plenty of people asking for it that aren't anywhere near their cache or the server side limits, but work just isn't being allocated.
Grant
Darwin NT
ID: 1900152 · Report as offensive
Profile Keith Myers Special Project $250 donor
Volunteer tester
Joined: 29 Apr 01
Posts: 4611
Credit: 295,483,026
RAC: 614,825
United States
Message 1900153 - Posted: 10 Nov 2017, 4:22:07 UTC

Now is the time a good sysadmin would be handy to look at the server logs and figure out why no work is being sent when there is plenty available.
Seti@Home classic workunits:20,676 CPU time:74,226 hours
ID: 1900153 · Report as offensive


 
©2018 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.