Panic Mode On (109) Server Problems?

Message boards : Number crunching : Panic Mode On (109) Server Problems?


Stephen "Heretic" Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 20 Sep 12
Posts: 5557
Credit: 192,787,363
RAC: 628
Australia
Message 1908425 - Posted: 22 Dec 2017, 11:49:44 UTC - in response to Message 1908415.  
Last modified: 22 Dec 2017, 12:31:42 UTC

I still have full caches here and I get 1 for 1 at every request so far today.

Yes, as Stephen mentioned, I do use 6:10:60 and the only time I suffer is if there's a real problem. ;-)

Cheers.


. . Glad to know it is still working for you, but I should point out that it was Keith who noticed your BOINC version; credit where it is due :)

[edit] . . For about the last hour I have been intermittently getting "unable to contact server" and "project might be down" messages, plus problems and slowness accessing the message forums.

Stephen

:(
ID: 1908425
Profile Keith Myers Special Project $250 donor
Volunteer tester
Joined: 29 Apr 01
Posts: 13164
Credit: 1,160,866,277
RAC: 1,873
United States
Message 1908819 - Posted: 24 Dec 2017, 19:45:26 UTC

Having a really hard time getting work for the Linux cruncher. It's down almost 100 tasks, with "No work is available" messages. The Triple Update isn't working. Might have to resort to the server kick.
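
(For anyone who hasn't seen the trick: the "Triple Update" is just three scheduler contacts in quick succession. A minimal sketch of what that amounts to, assuming boinccmd is on the PATH and authorized to talk to the local client; the pause length is a guess, not a documented value.)

# Minimal sketch of the "Triple Update": ask the client to contact the
# SETI@home scheduler three times in quick succession via boinccmd.
# Assumes boinccmd is on the PATH and can reach the local client; the
# 10-second pause is a guess, not a documented value.
import subprocess
import time

PROJECT_URL = "http://setiathome.berkeley.edu/"

for _ in range(3):
    subprocess.run(["boinccmd", "--project", PROJECT_URL, "update"],
                   check=True)
    time.sleep(10)  # give each request/reply a moment to complete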
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 1908819
Stephen "Heretic" Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 20 Sep 12
Posts: 5557
Credit: 192,787,363
RAC: 628
Australia
Message 1908822 - Posted: 24 Dec 2017, 20:14:52 UTC - in response to Message 1908819.  

Having a really hard time getting work for the Linux cruncher. It's down almost 100 tasks, with "No work is available" messages. The Triple Update isn't working. Might have to resort to the server kick.


. . I have had to play kick-the-server on all 3 rigs this morning.

Stephen

:(
ID: 1908822
Profile David@home
Volunteer tester
Joined: 16 Jan 03
Posts: 755
Credit: 5,040,916
RAC: 28
United Kingdom
Message 1908840 - Posted: 24 Dec 2017, 23:16:30 UTC

Let's hope that the servers behave themselves over the festive season and the team in the SETI@home labs can have a well-earned and peaceful break over Christmas.
ID: 1908840
Stephen "Heretic" Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 20 Sep 12
Posts: 5557
Credit: 192,787,363
RAC: 628
Australia
Message 1908849 - Posted: 25 Dec 2017, 0:12:08 UTC - in response to Message 1908840.  

Let's hope that the servers behave themselves over the festive season and the team in the SETI@home labs can have a well-earned and peaceful break over Christmas.


. . Absolutely ....

Stephen

:)
ID: 1908849
JLDun
Volunteer tester
Joined: 21 Apr 06
Posts: 573
Credit: 196,101
RAC: 0
United States
Message 1908860 - Posted: 25 Dec 2017, 3:28:03 UTC

As long as I don't get visited by too many ghosts of Christmas Past.
ID: 1908860
TBar
Volunteer tester

Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1908941 - Posted: 25 Dec 2017, 20:35:08 UTC
Last modified: 25 Dec 2017, 20:42:41 UTC

Back to being starved, even on Christmas Day.
Earlier, two of the Linux machines were being starved; they recovered after going almost 100 tasks down.
Now the server has decided to starve all the machines, even the Mac. The Mac is set for SETI@home v8 only, so no reason is being offered; the others are set for both and are getting "No tasks are available for AstroPulse v7". They are being tossed a bone every so often, which doesn't even begin to make up for the tasks that aren't sent. The current result creation rate is down to 0.7559/sec, so my guess is no one is being sent much work.
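
(To put that rate in perspective, converting it to hourly and daily figures is plain arithmetic on the number quoted above:)

# Plain arithmetic on the quoted figure: what a creation rate of
# 0.7559 results/sec means per hour and per day, project-wide.
rate_per_sec = 0.7559
per_hour = rate_per_sec * 3600   # about 2,721 results/hour
per_day = per_hour * 24          # about 65,300 results/day
print(f"{per_hour:,.0f} results/hour, {per_day:,.0f} results/day")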

Mon Dec 25 15:14:20 2017 | SETI@home | Sending scheduler request: To report completed tasks.
Mon Dec 25 15:14:20 2017 | SETI@home | Reporting 3 completed tasks
Mon Dec 25 15:14:20 2017 | SETI@home | Requesting new tasks for NVIDIA GPU
Mon Dec 25 15:14:20 2017 | SETI@home | [sched_op] CPU work request: 0.00 seconds; 0.00 devices
Mon Dec 25 15:14:20 2017 | SETI@home | [sched_op] NVIDIA GPU work request: 234054.58 seconds; 0.00 devices
Mon Dec 25 15:14:22 2017 | SETI@home | Scheduler request completed: got 0 new tasks
Mon Dec 25 15:14:22 2017 | SETI@home | No tasks sent
Mon Dec 25 15:19:34 2017 | SETI@home | Sending scheduler request: To report completed tasks.
Mon Dec 25 15:19:34 2017 | SETI@home | Reporting 5 completed tasks
Mon Dec 25 15:19:34 2017 | SETI@home | Requesting new tasks for NVIDIA GPU
Mon Dec 25 15:19:34 2017 | SETI@home | [sched_op] CPU work request: 0.00 seconds; 0.00 devices
Mon Dec 25 15:19:34 2017 | SETI@home | [sched_op] NVIDIA GPU work request: 235287.77 seconds; 0.00 devices
Mon Dec 25 15:19:37 2017 | SETI@home | Scheduler request completed: got 1 new tasks
Mon Dec 25 15:24:49 2017 | SETI@home | Sending scheduler request: To report completed tasks.
Mon Dec 25 15:24:49 2017 | SETI@home | Reporting 5 completed tasks
Mon Dec 25 15:24:49 2017 | SETI@home | Requesting new tasks for NVIDIA GPU
Mon Dec 25 15:24:49 2017 | SETI@home | [sched_op] CPU work request: 0.00 seconds; 0.00 devices
Mon Dec 25 15:24:49 2017 | SETI@home | [sched_op] NVIDIA GPU work request: 236244.87 seconds; 0.00 devices
Mon Dec 25 15:24:50 2017 | SETI@home | Scheduler request completed: got 0 new tasks
Mon Dec 25 15:24:50 2017 | SETI@home | No tasks sent
Mon Dec 25 15:30:03 2017 | SETI@home | Sending scheduler request: To report completed tasks.
Mon Dec 25 15:30:03 2017 | SETI@home | Reporting 5 completed tasks
Mon Dec 25 15:30:03 2017 | SETI@home | Requesting new tasks for NVIDIA GPU
Mon Dec 25 15:30:03 2017 | SETI@home | [sched_op] CPU work request: 0.00 seconds; 0.00 devices
Mon Dec 25 15:30:03 2017 | SETI@home | [sched_op] NVIDIA GPU work request: 237455.11 seconds; 0.00 devices
Mon Dec 25 15:30:04 2017 | SETI@home | Scheduler request completed: got 2 new tasks

...a few minutes later, a couple of machines got their caches filled.

Mon Dec 25 15:35:17 2017 | SETI@home | Sending scheduler request: To report completed tasks.
Mon Dec 25 15:35:17 2017 | SETI@home | Reporting 4 completed tasks
Mon Dec 25 15:35:17 2017 | SETI@home | Requesting new tasks for NVIDIA GPU
Mon Dec 25 15:35:17 2017 | SETI@home | [sched_op] CPU work request: 0.00 seconds; 0.00 devices
Mon Dec 25 15:35:17 2017 | SETI@home | [sched_op] NVIDIA GPU work request: 238279.78 seconds; 0.00 devices
Mon Dec 25 15:35:19 2017 | SETI@home | Scheduler request completed: got 60 new tasks

Oh, and Merry Christmas to you too.
ID: 1908941
Profile Wiggo
Joined: 24 Jan 00
Posts: 34744
Credit: 261,360,520
RAC: 489
Australia
Message 1908944 - Posted: 25 Dec 2017, 20:41:12 UTC

No problems here getting work, and that creation rate should pick up again when the ready-to-send number drops to around 580K. ;-)

Cheers.
ID: 1908944
Profile Keith Myers Special Project $250 donor
Volunteer tester
Joined: 29 Apr 01
Posts: 13164
Credit: 1,160,866,277
RAC: 1,873
United States
Message 1908945 - Posted: 25 Dec 2017, 20:43:25 UTC - in response to Message 1908941.  

Only the Linux machine has been struggling to get work all morning. I think it is because we are having a VLAR storm and little or no BLC work. I've noticed every machine is getting a majority of Arecibo shorties when requesting work. Since the Linux machine asks for more work at each request than the others, it is getting shortchanged and its cache is constantly falling.
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 1908945
TBar
Volunteer tester

Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1908946 - Posted: 25 Dec 2017, 20:48:29 UTC - in response to Message 1908944.  

You can see the log; it's about the same on 3 different machines. The fourth machine is still down about 50 tasks.
ID: 1908946
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 13731
Credit: 208,696,464
RAC: 304
Australia
Message 1908949 - Posted: 25 Dec 2017, 21:37:52 UTC
Last modified: 25 Dec 2017, 22:00:06 UTC

It's just another system weirdness.
After the outages, the majority of the work is GBT. By the time the next outage comes around, the majority of the work is Arecibo. And since it appears to be all VLARs coming out of the splitters at the moment (with no GBT worth mentioning), Nvidia GPUs are going without.
Current hardware & SoG can handle Arecibo VLARs, but the older hardware and applications would still just about bring a system to its knees.


EDIT- the triple update managed to shake free some GBT work for both systems.
Grant
Darwin NT
ID: 1908949
Stephen "Heretic" Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 20 Sep 12
Posts: 5557
Credit: 192,787,363
RAC: 628
Australia
Message 1908952 - Posted: 25 Dec 2017, 22:39:46 UTC - in response to Message 1908949.  

It's just another system weirdness.
After the outages, the majority of the work is GBT. By the time the next outage comes around, the majority of the work is Arecibo. And since it appears to be all VLARs coming out of the splitters at the moment (with no GBT worth mentioning), Nvidia GPUs are going without.
Current hardware & SoG can handle Arecibo VLARs, but the older hardware and applications would still just about bring a system to its knees.


EDIT- the triple update managed to shake free some GBT work for both systems.


. . After kicking the server, the Linux machine filled up with 100% GBT work.

. . Tried on the Windows machine but getting nothing, nada, zip :(

Stephen

:(
ID: 1908952
Profile Keith Myers Special Project $250 donor
Volunteer tester
Joined: 29 Apr 01
Posts: 13164
Credit: 1,160,866,277
RAC: 1,873
United States
Message 1908953 - Posted: 25 Dec 2017, 22:45:11 UTC - in response to Message 1908952.  
Last modified: 25 Dec 2017, 22:58:34 UTC

I've tried the Triple Update all morning. Finally I just tried the server kick with the ghost recovery protocol. Reported 34 tasks ...... got 4. Down about 180 tasks now on the Linux machine. The other machines can drop 10-20 below full but bounce back to full after a couple of requests. The Linux machine apparently is not seen the same way by the servers.
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 1908953
Stephen "Heretic" Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 20 Sep 12
Posts: 5557
Credit: 192,787,363
RAC: 628
Australia
Message 1908956 - Posted: 25 Dec 2017, 22:56:06 UTC - in response to Message 1908953.  

I've tried the Triple Update all morning. Finally I just tried the server kick with the ghost recovery protocol. Reported 34 tasks ...... got 4. Down about 135 tasks now on the Linux machine. The other machines can drop 10-20 below full but bounce back to full after a couple of requests. The Linux machine apparently is not seen the same way by the servers.


. . After the success getting loads of GBT work earlier, all 3 machines are now getting "this machine has reached its limit" and no new work on any of them :(

Stephen

:(
ID: 1908956
Profile Keith Myers Special Project $250 donor
Volunteer tester
Joined: 29 Apr 01
Posts: 13164
Credit: 1,160,866,277
RAC: 1,873
United States
Message 1908957 - Posted: 25 Dec 2017, 23:00:52 UTC - in response to Message 1908956.  

I've tried the Triple Update all morning. Finally I just tried the server kick with the ghost recovery protocol. Reported 34 tasks ...... got 4. Down about 135 tasks now on the Linux machine. The other machines can drop 10-20 below full but bounce back to full after a couple of requests. The Linux machine apparently is not seen the same way by the servers.


. . After the success getting loads of GBT work earlier, all 3 machines are now getting "this machine has reached its limit" and no new work on any of them :(

Stephen

:(

I'm getting the same thing now on all machines: "This computer has reached a limit on tasks in progress".
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 1908957
Stephen "Heretic" Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 20 Sep 12
Posts: 5557
Credit: 192,787,363
RAC: 628
Australia
Message 1908964 - Posted: 26 Dec 2017, 0:17:07 UTC - in response to Message 1908957.  


. . After the success getting loads of GBT work earlier, all 3 machines are now getting "this machine has reached its limit" and no new work on any of them :(
Stephen
:(

I'm getting the same thing now on all machines: "This computer has reached a limit on tasks in progress".


. . It could be a very, very long outage with no work going in .... :(

Stephen

:(
ID: 1908964
Profile Keith Myers Special Project $250 donor
Volunteer tester
Joined: 29 Apr 01
Posts: 13164
Credit: 1,160,866,277
RAC: 1,873
United States
Message 1908974 - Posted: 26 Dec 2017, 1:28:20 UTC - in response to Message 1908964.  

Yes, looks like the project is observing the holiday. The Haveland graphs show a steady decline in the number of tasks in progress. Nothing is going out to replace those coming in.
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 1908974
Profile Jeff Buck Crowdfunding Project Donor · Special Project $75 donor · Special Project $250 donor
Volunteer tester

Joined: 11 Feb 00
Posts: 1441
Credit: 148,764,870
RAC: 0
United States
Message 1908977 - Posted: 26 Dec 2017, 1:56:49 UTC

Well, there's actually plenty of work available, IF you don't mind running Arecibo VLARs on your GPUs. I normally stockpile for the outage by rescheduling tasks from the GPU queue to the CPU, then letting the GPU queue refill. It usually only takes a couple of rescheduling runs to stockpile what I need; then, once the outage is underway, I move the excess tasks back to the GPU queue.

Right now, though, tasks other than Arecibo VLARs seem to be extremely scarce, so I just tried an experiment on one of my machines to stockpile in the other direction, moving tasks from the CPU queue to the GPU and then letting the CPU refill. It's more tedious that way, but the CPU queue refilled every time. Seven rescheduling runs snagged 600 consecutive Arecibo VLARs. For now, I've moved those all back to the CPU queue in the hope that the run will come to an end and the GPU queue can maintain a normal level until the outage. If not, though, I'll have to try snagging some more Arecibo VLARs, and do the same on my other two Linux boxes.

Of course, one of the potential drawbacks to downloading Arecibo VLARs, or any other type of task, to the CPU queue and then running them on the GPUs is that the APR for those tasks will eventually climb to the point where the "Elapsed time exceeded" error starts to show up, so that has to be monitored. A real PITA but, if the choice is either to run Arecibo VLARs on my GPUs or run out of work on the GPUs altogether, I'll take the first option. :^)
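
(For anyone curious how that rescheduling is done in practice: the community tools boil down to stopping the BOINC client and re-pointing queued tasks at a different app version in client_state.xml. Below is a rough illustrative sketch of the idea only, not one of the actual rescheduler scripts; the file path, the use of a <plan_class> tag inside each <result>, and the plan-class strings are all assumptions from memory. Never edit client_state.xml while the client is running.)

# Rough sketch of the rescheduling idea described above: with the BOINC
# client STOPPED, rewrite which app version queued tasks point at in
# client_state.xml. The path, tag names, and plan-class strings below
# are assumptions for illustration; real community rescheduler scripts
# are more careful. Always keep a backup.
import re
import shutil

STATE = "/var/lib/boinc-client/client_state.xml"  # path varies by install
GPU_PLAN_CLASS = "cuda60"  # hypothetical GPU plan-class name
CPU_PLAN_CLASS = ""        # empty plan class assumed to mean the CPU app

shutil.copy(STATE, STATE + ".bak")  # back up before touching anything

with open(STATE) as f:
    xml = f.read()

def vlar_to_cpu(match: re.Match) -> str:
    # Re-point Arecibo VLAR results (".vlar" in the name, by convention)
    # from the GPU plan class to the CPU one.
    block = match.group(0)
    if ".vlar" in block:
        block = block.replace(
            f"<plan_class>{GPU_PLAN_CLASS}</plan_class>",
            f"<plan_class>{CPU_PLAN_CLASS}</plan_class>",
        )
    return block

xml = re.sub(r"<result>.*?</result>", vlar_to_cpu, xml, flags=re.DOTALL)

with open(STATE, "w") as f:
    f.write(xml)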
ID: 1908977
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 13731
Credit: 208,696,464
RAC: 304
Australia
Message 1908979 - Posted: 26 Dec 2017, 2:04:25 UTC - in response to Message 1908976.  

Maybe it's not the end of the world if your RAC drops a bit, eh?

Blasphemy!
Grant
Darwin NT
ID: 1908979
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 13731
Credit: 208,696,464
RAC: 304
Australia
Message 1908981 - Posted: 26 Dec 2017, 2:06:06 UTC - in response to Message 1908977.  

Seven rescheduling runs snagged 600 consecutive Arecibo VLARs.

Yep, a huge number of Arecibo VLARs about at the moment. Every so often I'll get a bit of GPU work, but not enough to keep the cache full.
Grant
Darwin NT
ID: 1908981