Panic Mode On (105) Server Problems?

Profile Keith Myers Special Project $250 donor
Volunteer tester
Joined: 29 Apr 01
Posts: 13164
Credit: 1,160,866,277
RAC: 1,873
United States
Message 1863523 - Posted: 24 Apr 2017, 21:03:00 UTC
Last modified: 24 Apr 2017, 21:06:53 UTC

Drat, all I'm getting across all 3 machines is "no tasks are available". The toggle in Preferences didn't do anything.
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 1863523 · Report as offensive
Profile Keith Myers Special Project $250 donor
Volunteer tester
Joined: 29 Apr 01
Posts: 13164
Credit: 1,160,866,277
RAC: 1,873
United States
Message 1863525 - Posted: 24 Apr 2017, 21:26:54 UTC

OK, it took a few cycles of requests to start work flowing again after my Preferences toggle. I'm getting about 80/20 BLC to Arecibo work on the last successful request. I've put tasks per GPU back to 2 also.
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 1863525 · Report as offensive
Stephen "Heretic" Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 20 Sep 12
Posts: 5557
Credit: 192,787,363
RAC: 628
Australia
Message 1863560 - Posted: 25 Apr 2017, 0:27:41 UTC - in response to Message 1863509.  

No panic here...

I welcome BLC work units.....


. . There's one in every crowd! :)

Stephen

:)
ID: 1863560 · Report as offensive
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 13736
Credit: 208,696,464
RAC: 304
Australia
Message 1863608 - Posted: 25 Apr 2017, 7:09:14 UTC - in response to Message 1863607.  

Shedloads of Guppis and no AP's :-(

There are a couple of new Arecibo files loaded, and they're splitting some AP from them.
And with plenty of Guppies, the system will be busy for longer during the weekly outage. Might even make it all the way through... (nah, but it's a nice thought).
Grant
Darwin NT
ID: 1863608 · Report as offensive
Profile Keith Myers Special Project $250 donor
Volunteer tester
Joined: 29 Apr 01
Posts: 13164
Credit: 1,160,866,277
RAC: 1,873
United States
Message 1863636 - Posted: 26 Apr 2017, 0:58:21 UTC

Servers are back online but can't be reached; I'm getting a comms error.
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 1863636 · Report as offensive
Stephen "Heretic" Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 20 Sep 12
Posts: 5557
Credit: 192,787,363
RAC: 628
Australia
Message 1863638 - Posted: 26 Apr 2017, 1:10:14 UTC - in response to Message 1863636.  

Servers are back online but can't be reached; I'm getting a comms error.


. . Yep, back online, have reported all completed tasks but while not getting any comms error I am getting "Project has no tasks available".

. . I guess Einstein will get a little more work out of me today ....

Stephen

:(
ID: 1863638 · Report as offensive
Profile Keith Myers Special Project $250 donor
Volunteer tester
Joined: 29 Apr 01
Posts: 13164
Credit: 1,160,866,277
RAC: 1,873
United States
Message 1863641 - Posted: 26 Apr 2017, 1:17:14 UTC - in response to Message 1863638.  

Yeah, the "servers can't be reached" HTTP errors only lasted about 5 minutes. Now it's just that no work is available.
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 1863641 · Report as offensive
Profile Keith Myers Special Project $250 donor
Volunteer tester
Joined: 29 Apr 01
Posts: 13164
Credit: 1,160,866,277
RAC: 1,873
United States
Message 1863644 - Posted: 26 Apr 2017, 1:20:17 UTC

The task creation rate spiked to 50+ per second right after coming back online. But it has now crashed and burned down to 2/sec.
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 1863644 · Report as offensive
Cruncher-American Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor

Joined: 25 Mar 02
Posts: 1513
Credit: 370,893,186
RAC: 340
United States
Message 1863654 - Posted: 26 Apr 2017, 2:27:37 UTC

Looks like all 7 GBT splitters are running on the same file now. That is a recipe for disaster. Or at least curtailed WU output as mentioned just above.
ID: 1863654 · Report as offensive
Profile tazzduke
Volunteer tester

Joined: 15 Sep 07
Posts: 190
Credit: 28,269,068
RAC: 5
Australia
Message 1864479 - Posted: 29 Apr 2017, 6:52:11 UTC

Greetings All

Found a strange one, and there might be a topic for it, but it could be 5 pages or more away.

So the following task has been marked "Completed, validation inconclusive" on both the 1st and 2nd recipient, and is now waiting for the 3rd recipient.

https://setiathome.berkeley.edu/workunit.php?wuid=2463641979

Trouble is, the 3rd received it on the 11th March and it is now nearly May. He/she hasn't returned any tasks at all (52), so I think it may be a ghost/dead machine.

Or maybe on a positive note, has crunched them, but is not in a position to upload them.

Oh well, at least on a positive note, the supply of workunits is good lol.

Happy Crunching
Mark
ID: 1864479 · Report as offensive
Stephen "Heretic" Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 20 Sep 12
Posts: 5557
Credit: 192,787,363
RAC: 628
Australia
Message 1864485 - Posted: 29 Apr 2017, 7:42:14 UTC - in response to Message 1864479.  
Last modified: 29 Apr 2017, 7:46:44 UTC

Greetings All

Found a strange one, and there might be a topic for it, but it could be 5 pages or more away.

So the following task has been marked "Completed, validation inconclusive" on both the 1st and 2nd recipient, and is now waiting for the 3rd recipient.

https://setiathome.berkeley.edu/workunit.php?wuid=2463641979

Trouble is, the 3rd received it on the 11th March and it is now nearly May. He/she hasn't returned any tasks at all (52), so I think it may be a ghost/dead machine.

Or maybe on a positive note, has crunched them, but is not in a position to upload them.

Oh well, at least on a positive note, the supply of workunits is good lol.

Happy Crunching
Mark


. . Hi there Mark,

. . The sad reality of crunching for Seti is that there are many derelict machines out there. Some are newbies who tried it and were unimpressed, leaving again with a cache full of uncompleted tasks to sit in limbo for a couple of months. Others are older contributors who probably should know better but for whatever reason have left the project doing the same thing. There's no excuse for them; it takes only a couple of minutes to set "No new work", mark all uncompleted tasks as aborted and upload that to the servers before shutting down. I wish more people would pick up on that.

. . The good news is that all his "ghosted" tasks will expire before the end of the week and hopefully bounce to another host that is NOT a derelict machine.

Stephen

:(
ID: 1864485 · Report as offensive
rob smith Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer moderator
Volunteer tester

Joined: 7 Mar 03
Posts: 22199
Credit: 416,307,556
RAC: 380
United Kingdom
Message 1864488 - Posted: 29 Apr 2017, 8:21:13 UTC

...and one class of computer you miss - those that have suffered a major hardware failure which prevents the computer from being used - "failure" might include the data disk failing, a house fire, the computer being stolen, all of which have happened and will happen again. Thankfully these are relatively rare events, but they do contribute to the list of computers "missing in action" :-(

(Said with some feeling as I sit here looking at a pair of 2Gb disks that failed when a PSU decided to send 230v up the 12V line to them...)
Bob Smith
Member of Seti PIPPS (Pluto is a Planet Protest Society)
Somewhere in the (un)known Universe?
ID: 1864488 · Report as offensive
Stephen "Heretic" Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 20 Sep 12
Posts: 5557
Credit: 192,787,363
RAC: 628
Australia
Message 1864492 - Posted: 29 Apr 2017, 8:46:03 UTC - in response to Message 1864488.  

...and one class of computer you miss - those that have suffered a major hardware failure which prevents the computer from being used - "failure" might include the data disk failing, a house fire, the computer being stolen, all of which have happened and will happen again. Thankfully these are relatively rare events, but they do contribute to the list of computers "missing in action" :-(

(Said with some feeling as I sit here looking at a pair of 2Gb disks that failed when a PSU decided to send 230v up the 12V line to them...)


. . Yes I did overlook that group, my apologies to those thus afflicted and as for your HDDs, ... OUCH!

. . I should have thought to include them as Wiggo had a power outage take his machine down only to find no working HDD when it fired back up. Thankfully he had a spare unit and reconfigured it and is back crunching ... but there are now x number of ghosties on his rig too. I have offered him a copy of my tried and proven Ghost Recovery process but so far no answer :)

. . I will qualify that by adding the process was provided by someone else but I have added my own refinements to make it pretty much bullet proof. The only drawback is that it is a little fiddly and can only recover 20 ghosts at a time (that is all the system will send you) so if you have a lot of ghosts it can take a while, but better than leaving them, and your wingmen, in limbo for two months :)

Stephen

:)
ID: 1864492 · Report as offensive
Profile Jimbocous Project Donor
Volunteer tester
Joined: 1 Apr 13
Posts: 1853
Credit: 268,616,081
RAC: 1,349
United States
Message 1864703 - Posted: 30 Apr 2017, 6:06:00 UTC - in response to Message 1864492.  

...and one class of computer you miss - those that have suffered a major hardware failure which prevents the computer from being used - "failure" might include the data disk failing, a house fire, the computer being stolen, all of which have happened and will happen again. Thankfully these are relatively rare events, but they do contribute to the list of computers "missing in action" :-(

(Said with some feeling as I sit here looking at a pair of 2Gb disks that failed when a PSU decided to send 230v up the 12V line to them...)


. . Yes I did overlook that group, my apologies to those thus afflicted and as for your HDDs, ... OUCH!

. . I should have thought to include them as Wiggo had a power outage take his machine down only to find no working HDD when it fired back up. Thankfully he had a spare unit and reconfigured it and is back crunching ... but there are now x number of ghosties on his rig too. I have offered him a copy of my tried and proven Ghost Recovery process but so far no answer :)

. . I will qualify that by adding the process was provided by someone else but I have added my own refinements to make it pretty much bullet proof. The only drawback is that it is a little fiddly and can only recover 20 ghosts at a time (that is all the system will send you) so if you have a lot of ghosts it can take a while, but better than leaving them, and your wingmen, in limbo for two months :)

Stephen

:)

Yeah, going through that right now with one. 8 month old 250g SSD died, no backup, of course.
ID: 1864703 · Report as offensive
Speedy
Volunteer tester
Joined: 26 Jun 04
Posts: 1643
Credit: 12,921,799
RAC: 89
New Zealand
Message 1864727 - Posted: 30 Apr 2017, 8:13:35 UTC - in response to Message 1864703.  


Yeah, going through that right now with one. 8 month old 250g SSD died, no backup, of course.

On the bright side it should still be under warranty. Best of luck with your data recovery
ID: 1864727 · Report as offensive
Profile Brent Norman Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester

Joined: 1 Dec 99
Posts: 2786
Credit: 685,657,289
RAC: 835
Canada
Message 1864731 - Posted: 30 Apr 2017, 9:27:58 UTC

This WU certainly was around the block to validate.
http://setiathome.berkeley.edu/workunit.php?wuid=2518067783
I kept it in case someone wants to look at it.
ID: 1864731 · Report as offensive
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 13736
Credit: 208,696,464
RAC: 304
Australia
Message 1864732 - Posted: 30 Apr 2017, 9:33:24 UTC - in response to Message 1864731.  
Last modified: 30 Apr 2017, 9:40:11 UTC

This WU certainly was around the block to validate.
http://setiathome.berkeley.edu/workunit.php?wuid=2518067783
I kept it in case someone wants to look at it.

One of the systems is pumping out heaps of Invalids & Errors.
EDIT-
Another is pumping out almost nothing but Invalids & Inconclusives.
Another has 2 Valids, and everything else is an Error.
I'm guessing the 2nd computer that scored the Valid result was initially inconclusive against the first computer to return Valid work, looks like yours confirmed the result.
Grant
Darwin NT
ID: 1864732 · Report as offensive
Stephen "Heretic" Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 20 Sep 12
Posts: 5557
Credit: 192,787,363
RAC: 628
Australia
Message 1864735 - Posted: 30 Apr 2017, 10:25:05 UTC - in response to Message 1864703.  

...and one class of computer you miss - those that have suffered a major hardware failure which prevents the computer from being used - "failure" might include the data disk failing, a house fire, the computer being stolen, all of which have happened and will happen again. Thankfully these are relatively rare events, but they do contribute to the list of computers "missing in action" :-(

(Said with some feeling as I sit here looking at a pair of 2Gb disks that failed when a PSU decided to send 230v up the 12V line to them...)


. . Yes I did overlook that group, my apologies to those thus afflicted and as for your HDDs, ... OUCH!

Yeah, going through that right now with one. 8 month old 250g SSD died, no backup, of course.


. . You have reminded me of something I should probably take care of right away :)

. . Time to make a backup of all seti drives ....

Stephen

:)
ID: 1864735 · Report as offensive
Stephen "Heretic" Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 20 Sep 12
Posts: 5557
Credit: 192,787,363
RAC: 628
Australia
Message 1864736 - Posted: 30 Apr 2017, 10:36:25 UTC

. . @ Keith

. . I was reminded today of our conversation about running Arecibo VLAR tasks on GPUs using SoG.

. . On my new rig, Bertie, I made a goof with the scheduling and ran out of GPU tasks. So to keep the GPUs working while I sorted it out (uploaded a swathe of result files and downloaded some fresh stuff), I moved some of the Arecibo VLAR tasks queued behind the CPU into the GPU Q. As I am doing four at a time (2 per GPU) I moved 4 across and they were all done in 24.5 mins on my GTX 970s (SoG r3557). Normal Arecibo tasks under the same conditions take 13 to 14 mins. Not quite twice as long but at least they were being productive. So anytime you feel the need to do the same I am pretty sure you will be fine.

. . Of course I have not tried that with r3584.

Stephen

.
ID: 1864736 · Report as offensive
Profile Keith Myers Special Project $250 donor
Volunteer tester
Joined: 29 Apr 01
Posts: 13164
Credit: 1,160,866,277
RAC: 1,873
United States
Message 1864841 - Posted: 30 Apr 2017, 19:07:42 UTC - in response to Message 1864736.  

Hi Stephen, I should try and find the answer myself with a forum search, but since I have your attention, what is the method you use to move an Arecibo VLAR onto an Nvidia GPU? The scheduler won't do it because of the way the move rules are written. I too have been in trouble, especially on outage days, where I run out quickly of GPU tasks and it would be nice to move some CPU tasks to GPU. At least for the slow FX systems. I've learned that is counterproductive on the Ryzen system, since it would run out just as fast for both task types on that system. Best to let it run un-optimized without rescheduling on outage days.
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 1864841 · Report as offensive


 
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.