Panic Mode On (114) Server Problems?

Message boards : Number crunching : Panic Mode On (114) Server Problems?

arkayn
Volunteer tester
Joined: 14 May 99
Posts: 4438
Credit: 55,006,323
RAC: 0
United States
Message 1966337 - Posted: 21 Nov 2018, 3:35:09 UTC

If anyone tried to reach my site today and got only a blank page, that problem is now taken care of: the error log showed that a cache file was screwing it up.

Time to start a new thread, because we are approaching 800 posts on #113.

ID: 1966337
Grant (SSSF)
Volunteer tester
Joined: 19 Aug 99
Posts: 13720
Credit: 208,696,464
RAC: 304
Australia
Message 1966341 - Posted: 21 Nov 2018, 3:47:06 UTC

And to christen the new thread: now getting some "Couldn't connect to server" errors on Scheduler requests.
Grant
Darwin NT
ID: 1966341
Keith Myers Special Project $250 donor
Volunteer tester
Joined: 29 Apr 01
Posts: 13161
Credit: 1,160,866,277
RAC: 1,873
United States
Message 1966345 - Posted: 21 Nov 2018, 3:56:51 UTC - in response to Message 1966341.  

Same here.
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 1966345
Unixchick Project Donor
Joined: 5 Mar 12
Posts: 815
Credit: 2,361,516
RAC: 22
United States
Message 1966349 - Posted: 21 Nov 2018, 4:10:32 UTC

Looks like we are going to need this new thread... sigh. I hope it is only a "normal" part of a recovery from a long OUTRAGE! and that sometime tonight all will be well. We still have a stuck file in the splitter and a backlog of validating, so it isn't a good place to start a recovery from.
ID: 1966349
Grant (SSSF)
Volunteer tester
Joined: 19 Aug 99
Posts: 13720
Credit: 208,696,464
RAC: 304
Australia
Message 1966351 - Posted: 21 Nov 2018, 4:18:27 UTC
Last modified: 21 Nov 2018, 4:18:50 UTC

Looking at the graphs, the servers have been back up for over an hour, but no work has been sent out in that time: the Ready-to-send buffer remains unchanged from its initial post-outage update.
Grant
Darwin NT
ID: 1966351
Grant (SSSF)
Volunteer tester
Joined: 19 Aug 99
Posts: 13720
Credit: 208,696,464
RAC: 304
Australia
Message 1966360 - Posted: 21 Nov 2018, 4:49:52 UTC

Scheduler is now MIA. Haven't had a response from it for ages now.
Grant
Darwin NT
ID: 1966360
Keith Myers Special Project $250 donor
Volunteer tester
Joined: 29 Apr 01
Posts: 13161
Credit: 1,160,866,277
RAC: 1,873
United States
Message 1966362 - Posted: 21 Nov 2018, 4:55:09 UTC

The fun is being spread around today. MW, Einstein and Seti have all had problems today and over the past week.
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 1966362
Brent Norman Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 1 Dec 99
Posts: 2786
Credit: 685,657,289
RAC: 835
Canada
Message 1966365 - Posted: 21 Nov 2018, 5:06:01 UTC - in response to Message 1966362.  

Hmm, some sort of BOINC system update?
ID: 1966365
Unixchick Project Donor
Joined: 5 Mar 12
Posts: 815
Credit: 2,361,516
RAC: 22
United States
Message 1966366 - Posted: 21 Nov 2018, 5:08:31 UTC
Last modified: 21 Nov 2018, 5:30:16 UTC

It accepted my completed WUs 20 minutes ago, but they haven't been validated yet, even though 2 of us have finished. I'm not even bothering to ask for new WUs until the system looks good again.

Edit: the splitters have been shut down. Not that they were doing any splitting, but it means someone is still working on the system.
ID: 1966366
zoom3+1=4
Volunteer tester
Joined: 30 Nov 03
Posts: 65709
Credit: 55,293,173
RAC: 49
United States
Message 1966375 - Posted: 21 Nov 2018, 5:43:53 UTC

ID: 1966375
rob smith Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer moderator
Volunteer tester
Joined: 7 Mar 03
Posts: 22158
Credit: 416,307,556
RAC: 380
United Kingdom
Message 1966377 - Posted: 21 Nov 2018, 6:21:03 UTC

It's going to be a cool day: my crunchers are all but out of work, since I forgot to set any "zero" tasks before the outrage and my task count is very low. Oh well, it saves electricity.
Bob Smith
Member of Seti PIPPS (Pluto is a Planet Protest Society)
Somewhere in the (un)known Universe?
ID: 1966377
Keith Myers Special Project $250 donor
Volunteer tester
Joined: 29 Apr 01
Posts: 13161
Credit: 1,160,866,277
RAC: 1,873
United States
Message 1966378 - Posted: 21 Nov 2018, 6:21:21 UTC

One of my hosts just got 4 tasks. So the work is starting to flow again. They weren't resends.
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 1966378
Grant (SSSF)
Volunteer tester
Joined: 19 Aug 99
Posts: 13720
Credit: 208,696,464
RAC: 304
Australia
Message 1966380 - Posted: 21 Nov 2018, 6:25:48 UTC - in response to Message 1966378.  

One of my hosts just got 4 tasks. So the work is starting to flow again. They weren't resends.

Lucky you.
At least I'm getting responses from the Scheduler again, even if it is nothing but "Project has no tasks available". Much better than the previous repeating "Couldn't connect to server" messages.
Server status hasn't been updated for 3 hours for most values.

This is one ugly post-outage recovery.
Grant
Darwin NT
ID: 1966380
Unixchick Project Donor
Joined: 5 Mar 12
Posts: 815
Credit: 2,361,516
RAC: 22
United States
Message 1966381 - Posted: 21 Nov 2018, 6:45:40 UTC

Status just updated. Not sure if it will continue to update, as I am sure it is getting pounded with requests for WUs. It looks like AP splitting is happening. It is hard to tell whether the gbt files are being split or not.
ID: 1966381
Grant (SSSF)
Volunteer tester
Joined: 19 Aug 99
Posts: 13720
Credit: 208,696,464
RAC: 304
Australia
Message 1966382 - Posted: 21 Nov 2018, 6:46:12 UTC

I think they need to shut down & restart things all over again.
AP work is now being split, but it's all just going into the Ready-to-send buffer; none of it is being sent out to be processed.
Grant
Darwin NT
ID: 1966382
Brent Norman Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 1 Dec 99
Posts: 2786
Credit: 685,657,289
RAC: 835
Canada
Message 1966383 - Posted: 21 Nov 2018, 6:52:04 UTC

I just got a couple of good hits.
ID: 1966383
Unixchick Project Donor
Joined: 5 Mar 12
Posts: 815
Credit: 2,361,516
RAC: 22
United States
Message 1966386 - Posted: 21 Nov 2018, 7:09:54 UTC

Validation and purging are happening, but it will take a long time to work through all this. Looks like splitting is happening too. Very positive signs. Thanks to the person who stayed at work to "kick" the machine (or did it remotely).
ID: 1966386
Grant (SSSF)
Volunteer tester
Joined: 19 Aug 99
Posts: 13720
Credit: 208,696,464
RAC: 304
Australia
Message 1966387 - Posted: 21 Nov 2018, 7:12:55 UTC - in response to Message 1966383.  
Last modified: 21 Nov 2018, 7:17:54 UTC

I just got a couple of good hits.

It's only taken 4 hours since the end of the outage.
And the splitters haven't fired up to replace the work that has been sent out.

Still getting "Project has no tasks available" for me.

Edit- and then 2 minutes later- work gets allocated.
Now if only they would download- says Download active, but nothing is actually happening.
Grant
Darwin NT
ID: 1966387
Tilmitt
Joined: 31 Oct 02
Posts: 9
Credit: 9,769,912
RAC: 0
Japan
Message 1966388 - Posted: 21 Nov 2018, 7:18:47 UTC - in response to Message 1966387.  

Just got about 60 tasks. Btw, is the maximum number of tasks that can be cached 200?
ID: 1966388
rob smith Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer moderator
Volunteer tester
Joined: 7 Mar 03
Posts: 22158
Credit: 416,307,556
RAC: 380
United Kingdom
Message 1966389 - Posted: 21 Nov 2018, 7:36:08 UTC
Last modified: 21 Nov 2018, 7:42:22 UTC

The size of your cache depends on the number of processors installed and "available for use by SETI".
A CPU will get you 100 tasks, regardless of the number of cores/threads it has.
Each GPU will get you another 100 tasks.

So your computer with a CPU and two GPUs will get 300 tasks.

Edit - This ignores any "penalty" for having an excessively high error rate, which reduces the allocation if you have more than a certain number of such errors in a 24hr period.
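
The per-device rule described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic in this post, not the project's actual scheduler code: the function name `max_cached_tasks` and its parameters are hypothetical, the 100-task figure is the one quoted above, and the error-rate "penalty" is deliberately left out because no numbers are given for it.

```python
def max_cached_tasks(has_cpu: bool, gpu_count: int, per_device_limit: int = 100) -> int:
    """Estimate the task cache limit from the rule of thumb in this post.

    Assumptions (hypothetical names, not a real BOINC API):
    - the CPU counts as one device no matter how many cores/threads it has;
    - each GPU counts as one more device;
    - every device is allowed `per_device_limit` tasks (100 per the post);
    - the error-rate penalty is ignored.
    """
    if gpu_count < 0:
        raise ValueError("gpu_count must not be negative")
    cpu_devices = 1 if has_cpu else 0
    return (cpu_devices + gpu_count) * per_device_limit


if __name__ == "__main__":
    # A CPU plus two GPUs, as in the example in the post.
    print(max_cached_tasks(has_cpu=True, gpu_count=2))
    # A CPU plus one GPU.
    print(max_cached_tasks(has_cpu=True, gpu_count=1))
```

Under these assumptions a CPU with two GPUs comes out at 300, and a CPU with a single GPU comes out at 200, which is the figure asked about.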
Bob Smith
Member of Seti PIPPS (Pluto is a Planet Protest Society)
Somewhere in the (un)known Universe?
ID: 1966389


 
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.