The Server Issues / Outages Thread - Panic Mode On! (118)

Grant (SSSF)
Volunteer tester
Joined: 19 Aug 99
Posts: 13855
Credit: 208,696,464
RAC: 304
Australia
Message 2025293 - Posted: 28 Dec 2019, 5:54:58 UTC
Last modified: 28 Dec 2019, 5:55:30 UTC

And with the Workunits-waiting-for-validation backlog finally clearing (a peak of almost 11 million, now down to 8.3 million), and the Workunits-waiting-for-assimilation starting to catch up as well (from 2.5 million down to 1 million; the normal value when things are working is 0), that's leading to a backlog in Workunit-files-waiting-for-deletion (from 0 to over 1 million and climbing; again, normally 0).
And they can't be purged until they are deleted.

And since the db purge & some of the file deleters run on Bruno, which is the upload server, it would be nice if we could get the new hardware running here on main to help things along.
Something for the new year's wish list.
Grant
Darwin NT
ID: 2025293
Grant (SSSF)
Volunteer tester
Joined: 19 Aug 99
Posts: 13855
Credit: 208,696,464
RAC: 304
Australia
Message 2025295 - Posted: 28 Dec 2019, 6:21:54 UTC - in response to Message 2025294.  

But it seems as if the new SSD upload server that was tested on Beta failed (file system error) last Friday, according to Eric's post.
It didn't last long even on the low production Beta project. So any hope of that one coming to Main in the near future isn't high.
https://setiathome.berkeley.edu/forum_thread.php?id=84968&postid=2023883#2023883
Fingers crossed it was just a software issue, not hardware.


Speaking of issues, we're back to getting nothing but "Project has no tasks available" messages, again.
Grant
Darwin NT
ID: 2025295
wujj123456
Joined: 5 Sep 04
Posts: 40
Credit: 20,877,975
RAC: 219
China
Message 2025297 - Posted: 28 Dec 2019, 6:29:46 UTC - in response to Message 2025296.  

No such problems here, but then I do not ask for many tasks, and not very often either.

It just started happening an hour or two ago. You probably would see it next time it tries to fetch work. :-(
ID: 2025297
Grant (SSSF)
Volunteer tester
Joined: 19 Aug 99
Posts: 13855
Credit: 208,696,464
RAC: 304
Australia
Message 2025298 - Posted: 28 Dec 2019, 6:35:22 UTC

And now picking up work again. Posting about a problem often sorts it out.
Till the next time.
Grant
Darwin NT
ID: 2025298
Stephen "Heretic" Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 20 Sep 12
Posts: 5557
Credit: 192,787,363
RAC: 628
Australia
Message 2025300 - Posted: 28 Dec 2019, 6:54:20 UTC - in response to Message 2025294.  

But it seems as if the new SSD upload server that was tested on Beta failed (file system error) last Friday, according to Eric's post.
It didn't last long even on the low production Beta project. So any hope of that one coming to Main in the near future isn't high.
https://setiathome.berkeley.edu/forum_thread.php?id=84968&postid=2023883#2023883


. . You make a good point there ...

Stephen

:(
ID: 2025300
Jimbocous Project Donor
Volunteer tester
Joined: 1 Apr 13
Posts: 1856
Credit: 268,616,081
RAC: 1,349
United States
Message 2025302 - Posted: 28 Dec 2019, 7:31:04 UTC
Last modified: 28 Dec 2019, 7:42:08 UTC

HP-Z400

55148	SETI@home	12/28/19 12:24:25 AM	Sending scheduler request: To report completed tasks.	
55150	SETI@home	12/28/19 12:24:25 AM	Requesting new tasks for CPU and NVIDIA GPU	
55151	SETI@home	12/28/19 12:24:28 AM	Scheduler request completed: got 0 new tasks	
55152	SETI@home	12/28/19 12:24:28 AM	Project has no tasks available
And now back ok?
ID: 2025302
Keith Myers Special Project $250 donor
Volunteer tester
Joined: 29 Apr 01
Posts: 13164
Credit: 1,160,866,277
RAC: 1,873
United States
Message 2025303 - Posted: 28 Dec 2019, 7:59:33 UTC

Downloads are stalling out again.
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 2025303
wujj123456
Joined: 5 Sep 04
Posts: 40
Credit: 20,877,975
RAC: 219
China
Message 2025304 - Posted: 28 Dec 2019, 8:44:27 UTC
Last modified: 28 Dec 2019, 8:46:05 UTC

I did get a bunch of WUs soon after my previous post. These are the recent requests I had (in PST).

27-Dec-2019 22:27:40 [SETI@home] Scheduler request completed: got 0 new tasks
27-Dec-2019 22:39:07 [SETI@home] Scheduler request completed: got 0 new tasks
27-Dec-2019 23:04:30 [SETI@home] Scheduler request completed: got 0 new tasks
27-Dec-2019 23:04:30 [SETI@home] Scheduler request completed: got 0 new tasks
27-Dec-2019 23:47:20 [SETI@home] Scheduler request completed: got 174 new tasks
27-Dec-2019 23:52:28 [SETI@home] Scheduler request completed: got 113 new tasks
28-Dec-2019 00:06:36 [SETI@home] Scheduler request completed: got 177 new tasks
28-Dec-2019 00:32:52 [SETI@home] Scheduler request completed: got 147 new tasks

I guess these are just transient failures, not what we had a few days before. :-)
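
For anyone who wants to tally this from their own event log rather than by eye, here is a minimal Python sketch. It assumes the log lines look exactly like the ones quoted above; the `summarize` function and the `LOG` sample are illustrative names of my own, not anything provided by BOINC.

```python
import re

# Sample copied from the scheduler replies quoted in this post.
LOG = """\
27-Dec-2019 22:27:40 [SETI@home] Scheduler request completed: got 0 new tasks
27-Dec-2019 22:39:07 [SETI@home] Scheduler request completed: got 0 new tasks
27-Dec-2019 23:04:30 [SETI@home] Scheduler request completed: got 0 new tasks
27-Dec-2019 23:04:30 [SETI@home] Scheduler request completed: got 0 new tasks
27-Dec-2019 23:47:20 [SETI@home] Scheduler request completed: got 174 new tasks
27-Dec-2019 23:52:28 [SETI@home] Scheduler request completed: got 113 new tasks
28-Dec-2019 00:06:36 [SETI@home] Scheduler request completed: got 177 new tasks
28-Dec-2019 00:32:52 [SETI@home] Scheduler request completed: got 147 new tasks
"""

# Assumed message format: "Scheduler request completed: got N new tasks".
PATTERN = re.compile(r"Scheduler request completed: got (\d+) new tasks")


def summarize(log_text):
    """Count scheduler replies, empty replies, and total tasks received."""
    counts = [int(m.group(1)) for m in PATTERN.finditer(log_text)]
    return {
        "requests": len(counts),
        "empty": sum(1 for c in counts if c == 0),
        "tasks": sum(counts),
    }


if __name__ == "__main__":
    print(summarize(LOG))
```

For the eight lines above it reports four empty replies and 611 tasks in total, which fits a short transient gap rather than a prolonged outage.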
ID: 2025304
Tom M
Volunteer tester
Joined: 28 Nov 02
Posts: 5126
Credit: 276,046,078
RAC: 462
Message 2025312 - Posted: 28 Dec 2019, 12:04:45 UTC

After all the discussion before the previous outage I have my parameters set for "1 day of work" but only check in once every 6 hours or so (0.25). I used to be set up so it would contact the server every 5 minutes or so to keep my cache topped off.
Since that last disaster I said "Well, be that way" and set it to the 0.25 (per day) setting.

It seems like my cache is staying sufficiently filled and my logs are a bit shorter :)

Tom
A proud member of the OFA (Old Farts Association).
ID: 2025312
Unixchick Project Donor
Joined: 5 Mar 12
Posts: 815
Credit: 2,361,516
RAC: 22
United States
Message 2025342 - Posted: 28 Dec 2019, 18:25:12 UTC

pre panic.

The RTS is 90K and falling. The return rate is around 150k, which is a bit high, and the system seems to be more interested in assimilation than splitting. We should have full caches, but I thought I'd post why we will get 0 WUs available messages. I think all we can do is wait it out.

I've sent myself to NNT as I have a slow machine and can wait until the RTS is flush again.

(you happy now Tom :P)
ID: 2025342
Oz
Joined: 6 Jun 99
Posts: 233
Credit: 200,655,462
RAC: 212
United States
Message 2025363 - Posted: 28 Dec 2019, 21:27:13 UTC - in response to Message 2024771.  

Christmas Eve I posted:

Hi All

I haven't been very active the last couple of years or so, I thought I might fire up a box or two now that SAH has had the time to fix the software/server/WU availability issues that had been plaguing them...


The response was a thunderous silence - did I somehow offend someone?
Member of the 20 Year Club



ID: 2025363
Jimbocous Project Donor
Volunteer tester
Joined: 1 Apr 13
Posts: 1856
Credit: 268,616,081
RAC: 1,349
United States
Message 2025365 - Posted: 28 Dec 2019, 21:33:22 UTC - in response to Message 2025363.  

Christmas Eve I posted:

Hi All

I haven't been very active the last couple of years or so, I thought I might fire up a box or two now that SAH has had the time to fix the software/server/WU availability issues that had been plaguing them...


The response was a thunderous silence - did I somehow offend someone?

Possibly everyone was incapacitated laughing about the assertion that the "software/server/WU availability issues" have been fixed?
lol
It's been a pretty rough patch here with server issues lately . . . but welcome back !!
ID: 2025365
Zalster Special Project $250 donor
Volunteer tester
Joined: 27 May 99
Posts: 5517
Credit: 528,817,460
RAC: 242
United States
Message 2025366 - Posted: 28 Dec 2019, 21:33:52 UTC - in response to Message 2025363.  

No. You just posted at an unfortunate time. There were issues with the server code right then and it took several days for the issue to be resolved. The server is back up and running with occasional hiccups. What kind of information do you need?
ID: 2025366
Wiggo
Joined: 24 Jan 00
Posts: 36864
Credit: 261,360,520
RAC: 489
Australia
Message 2025367 - Posted: 28 Dec 2019, 21:34:39 UTC - in response to Message 2025363.  
Last modified: 28 Dec 2019, 21:40:41 UTC

Christmas Eve I posted:
Hi All

I haven't been very active the last couple of years or so, I thought I might fire up a box or two now that SAH has had the time to fix the software/server/WU availability issues that had been plaguing them...
The response was a thunderous silence - did I somehow offend someone?
Not everyone uses the "quote" or "reply to" buttons, but you were replied to. ;-)

Cheers.
ID: 2025367
Grant (SSSF)
Volunteer tester
Joined: 19 Aug 99
Posts: 13855
Credit: 208,696,464
RAC: 304
Australia
Message 2025368 - Posted: 28 Dec 2019, 21:38:12 UTC

It's taken a couple of weeks, but the Validation, Assimilation and Deletion backlogs have cleared. Now we just need the Purgers to catch up, and then maybe the splitters will be able to really get going and finally refill the Ready-to-send buffer.
Apart from a couple of periods where the servers stopped sending out work again, over the last day and a half work demand has exceeded supply, and the Ready-to-send buffer has been slowly but steadily emptying until the last few hours, when the above backlogs were finally cleared out.
Grant
Darwin NT
ID: 2025368
Oz
Joined: 6 Jun 99
Posts: 233
Credit: 200,655,462
RAC: 212
United States
Message 2025371 - Posted: 28 Dec 2019, 21:47:43 UTC - in response to Message 2025366.  

I had some communication issues with a host and lots of software questions but I am sure I can work the remainder out alone as well.
Thanx.
Member of the 20 Year Club



ID: 2025371
Ian&Steve C.
Joined: 28 Sep 99
Posts: 4267
Credit: 1,282,604,591
RAC: 6,640
United States
Message 2025373 - Posted: 28 Dec 2019, 21:54:45 UTC - in response to Message 2025363.  

Your post also didn’t contain any kind of question. Just a statement that didn’t seem complete, as you ended on an ellipsis (...). Not sure what people were supposed to say.

But welcome back. If you have an actual question feel free to post it in the appropriate forum.
Seti@Home classic workunits: 29,492 CPU time: 134,419 hours

ID: 2025373
Wiggo
Joined: 24 Jan 00
Posts: 36864
Credit: 261,360,520
RAC: 489
Australia
Message 2025374 - Posted: 28 Dec 2019, 21:55:33 UTC - in response to Message 2025371.  

I had some communication issues with a host and lots of software questions but I am sure I can work the remainder out alone as well.
Thanx.
If you have issues then it's best to start your own thread and that way you'll get notified for all replies. ;-)

Cheers.
ID: 2025374
Grant (SSSF)
Volunteer tester
Joined: 19 Aug 99
Posts: 13855
Credit: 208,696,464
RAC: 304
Australia
Message 2025377 - Posted: 28 Dec 2019, 22:00:14 UTC - in response to Message 2025363.  

The response was a thunderous silence - did I somehow offend someone?
No, but if you had posted in one of the general discussion threads, such as "Don't know where it should go? Stick it here", or a similar thread in the CAFE Seti forum, it probably would have been noticed and got a response there. This thread is for server and other issues worth panicking over, and people were (as others have pointed out) in the middle of a panic at the time.
Grant
Darwin NT
ID: 2025377
Grant (SSSF)
Volunteer tester
Joined: 19 Aug 99
Posts: 13855
Credit: 208,696,464
RAC: 304
Australia
Message 2025379 - Posted: 28 Dec 2019, 22:02:39 UTC - in response to Message 2025374.  

I had some communication issues with a host and lots of software questions but I am sure I can work the remainder out alone as well.
Thanx.
If you have issues then it's best to start your own thread and that way you'll get notified for all replies. ;-)
Or do a quick check to see if there is already a thread that concerns the issues you are having, and make use of that one if the answer to your question hasn't yet been provided.
Grant
Darwin NT
ID: 2025379
 
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.