The Server Issues / Outages Thread - Panic Mode On! (118)

Message boards : Number crunching : The Server Issues / Outages Thread - Panic Mode On! (118)
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 14008
Credit: 208,696,464
RAC: 304
Australia
Message 2027379 - Posted: 11 Jan 2020, 22:53:09 UTC - in response to Message 2027376.  

I fear there is something going on at your end, Ville.
The Special Application does have issues with some noise bombs when it comes to triplets & pulses & what goes where.
Grant
Darwin NT
ID: 2027379
Richard Haselgrove Project Donor
Volunteer tester
Joined: 4 Jul 99
Posts: 14690
Credit: 200,643,578
RAC: 874
United Kingdom
Message 2027381 - Posted: 11 Jan 2020, 23:00:27 UTC - in response to Message 2027377.  

Something's broken- Received-last-hour for both MB & AP have plummeted, as has the splitter output. All at the same time, around 21:20 UTC.
Although it looks like things might just be coming back to life again; splitter output is no longer 0, Returned-last-hour is climbing again.
I think that's usually the sign of everything being switched off and back on again, either to clear a blockage (some server getting stuck) or to install a new version of something.
ID: 2027381
juan BFP Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 16 Mar 07
Posts: 9786
Credit: 572,710,851
RAC: 3,799
Panama
Message 2027385 - Posted: 11 Jan 2020, 23:40:51 UTC

Nobody hears me, but each time the total of WUs reaches 23 MM something weird happens. Could it be just coincidence?
ID: 2027385
Grant (SSSF)
Volunteer tester
Joined: 19 Aug 99
Posts: 14008
Credit: 208,696,464
RAC: 304
Australia
Message 2027386 - Posted: 11 Jan 2020, 23:56:55 UTC - in response to Message 2027385.  
Last modified: 11 Jan 2020, 23:57:56 UTC

Nobody hears me, but each time the total of WUs reaches 23 MM something weird happens. Could it be just coincidence?
Maybe. What are you totalling to get the 23M number?
The fact is, issues are occurring all the time, particularly with the Scheduler. And the backlog of Validation, Assimilation & Deletion and then Purging work doesn't help either.

You can see the Scheduler issues by looking at the In-progress numbers- people returning & reporting work, but not getting any replacement WUs.



Tues 22:30 UTC is the weekly outage, but after that there's blips at Thurs 02:00 & Thurs 18:00, then a bunch of big ones (several hours) Fri 01:00, another Fri 20:00 and the most recent Sat 10:00.
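To put numbers on how irregular those outages are, here is a minimal sketch that works out the gaps between the times listed above. It assumes all six times fall within the same week and counts hours from Monday 00:00 UTC; the helper names `hours_into_week` and `gaps_between` are made up for illustration.

```python
# Outage start times as listed in the post above: (weekday, hour, minute), UTC.
# Assumption: all six fall within the same week.
DAY_INDEX = {"Mon": 0, "Tue": 1, "Wed": 2, "Thu": 3, "Fri": 4, "Sat": 5, "Sun": 6}

OUTAGES = [
    ("Tue", 22, 30),  # the weekly scheduled outage
    ("Thu", 2, 0),
    ("Thu", 18, 0),
    ("Fri", 1, 0),
    ("Fri", 20, 0),
    ("Sat", 10, 0),
]


def hours_into_week(day, hour, minute):
    """Hours elapsed since Monday 00:00 (hypothetical helper)."""
    return DAY_INDEX[day] * 24 + hour + minute / 60


def gaps_between(outages):
    """Hours between each outage start and the next one."""
    starts = [hours_into_week(*o) for o in outages]
    return [later - earlier for earlier, later in zip(starts, starts[1:])]


gaps = gaps_between(OUTAGES)
print(gaps)  # [27.5, 16.0, 7.0, 19.0, 14.0]
```

The gaps run from 7 to 27.5 hours, so no fixed interval stands out from these six points alone.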
Grant
Darwin NT
ID: 2027386
Grant (SSSF)
Volunteer tester
Joined: 19 Aug 99
Posts: 14008
Credit: 208,696,464
RAC: 304
Australia
Message 2027387 - Posted: 12 Jan 2020, 0:03:10 UTC - in response to Message 2027386.  

Nobody hears me, but each time the total of WUs reaches 23 MM something weird happens. Could it be just coincidence?
Maybe. What are you totalling to get the 23M number?
Everything? In-progress, WU & Results awaiting validation, assimilation, deletion & purging?
Could be that's the point where everything jams up, and then it takes a while for things to get going again once some of those numbers have reduced.
Grant
Darwin NT
ID: 2027387
juan BFP Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 16 Mar 07
Posts: 9786
Credit: 572,710,851
RAC: 3,799
Panama
Message 2027401 - Posted: 12 Jan 2020, 1:00:55 UTC - in response to Message 2027387.  
Last modified: 12 Jan 2020, 1:03:16 UTC

Nobody hears me, but each time the total of WUs reaches 23 MM something weird happens. Could it be just coincidence?
Maybe. What are you totalling to get the 23M number?
Everything? In-progress, WU & Results awaiting validation, assimilation, deletion & purging?
Could be that's the point where everything jams up, and then it takes a while for things to get going again once some of those numbers have reduced.

Yes, everything; just sum all the lines and see. It could be just a coincidence, but each time the total reaches 23 MM weird things start happening. When the total is down to 18-20 MM the system works normally. Please note I'm not saying that is a fact; it was just an observation. As I say, it could be just coincidence. We need more inside info to be sure.
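The check described above can be sketched in a few lines. This is only an illustration: the category names and every count in `snapshot` are made-up placeholders, not real server status figures, and the two cut-offs are simply the 18-20 MM and 23 MM levels mentioned in this post.

```python
# Levels taken from the observation in this post (MM = million).
NORMAL_MAX = 20_000_000    # at or below this, things seemed to work normally
TROUBLE_MIN = 23_000_000   # at or above this, weird things seemed to happen

# Hypothetical snapshot: names and values are placeholders for illustration,
# NOT numbers read from the real server status page.
snapshot = {
    "results_in_progress": 6_500_000,
    "results_awaiting_validation": 9_000_000,
    "workunits_awaiting_validation": 3_000_000,
    "workunits_awaiting_assimilation": 2_500_000,
    "workunits_awaiting_deletion": 500_000,
    "results_awaiting_deletion": 1_000_000,
    "awaiting_purge": 750_000,
}


def total_backlog(counts):
    """Sum every line of the snapshot."""
    return sum(counts.values())


def classify(total):
    """Label a total against the two observed levels."""
    if total >= TROUBLE_MIN:
        return "trouble"
    if total <= NORMAL_MAX:
        return "normal"
    return "unclear"  # between the two levels; the post makes no claim here


total = total_backlog(snapshot)
print(total, classify(total))  # 23250000 trouble
```

Logging the real total alongside the time of each Scheduler hiccup for a few weeks would show whether the 23 MM level is more than coincidence.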
ID: 2027401
Oddbjornik Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 15 May 99
Posts: 220
Credit: 349,610,548
RAC: 1,728
Norway
Message 2027402 - Posted: 12 Jan 2020, 1:02:41 UTC - in response to Message 2027401.  

Nobody hears me, but each time the total of WUs reaches 23 MM something weird happens. Could it be just coincidence?
Maybe. What are you totalling to get the 23M number?
Everything? In-progress, WU & Results awaiting validation, assimilation, deletion & purging?
Could be that's the point where everything jams up, and then it takes a while for things to get going again once some of those numbers have reduced.

Yes, everything; just sum all the lines and see. It could be just a coincidence, but each time the total reaches 22-23 MM weird things happen. When the total is down to 18-20 MM the system works normally. Please note I'm not saying that is a fact; it was just an observation.
My guess is that it's just the statistics dump that slows everything down while it runs.
ID: 2027402
Grant (SSSF)
Volunteer tester
Joined: 19 Aug 99
Posts: 14008
Credit: 208,696,464
RAC: 304
Australia
Message 2027403 - Posted: 12 Jan 2020, 1:03:55 UTC - in response to Message 2027401.  

Yes, everything; just sum all the lines and see. It could be just a coincidence, but each time the total reaches 22-23 MM weird things happen. When the total is down to 18-20 MM the system works normally. Please note I'm not saying that is a fact; it was just an observation.
It would be good if it is the case. Sort out the Validation/ Assimilation/ Deletion/ Purge backlogs, and then they can increase the server side limits further & not have system issues (we can at least dream).
Grant
Darwin NT
ID: 2027403
Grant (SSSF)
Volunteer tester
Joined: 19 Aug 99
Posts: 14008
Credit: 208,696,464
RAC: 304
Australia
Message 2027405 - Posted: 12 Jan 2020, 1:05:22 UTC - in response to Message 2027402.  

My guess is that it's just the statistics dump that slows everything down while it runs.
Isn't that just once every 24 hrs? The Scheduler issues can occur several times a day, not always at the same times, and last anywhere from 30 min to over 5 hours.
Grant
Darwin NT
ID: 2027405
Oddbjornik Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 15 May 99
Posts: 220
Credit: 349,610,548
RAC: 1,728
Norway
Message 2027412 - Posted: 12 Jan 2020, 1:53:41 UTC - in response to Message 2027405.  

My guess is that it's just the statistics dump that slows everything down while it runs.
Isn't that just once every 24 hrs? The Scheduler issues can occur several times a day, not always at the same times, and last anywhere from 30 min to over 5 hours.
They've been running twice a day for some months. Approximately twelve hour intervals. Dump files go here, so it's easy to keep track.
ID: 2027412
Grant (SSSF)
Volunteer tester
Joined: 19 Aug 99
Posts: 14008
Credit: 208,696,464
RAC: 304
Australia
Message 2027420 - Posted: 12 Jan 2020, 4:56:46 UTC
Last modified: 12 Jan 2020, 5:00:38 UTC

Well, they're better than noise bombs (as long as they aren't noise bombs themselves), but it looks like there's a bunch of shorties going to hit the servers soon- 29ap11ah looks like it's pretty much all shorties.
Grant
Darwin NT
ID: 2027420
Grant (SSSF)
Volunteer tester
Joined: 19 Aug 99
Posts: 14008
Credit: 208,696,464
RAC: 304
Australia
Message 2027423 - Posted: 12 Jan 2020, 6:05:51 UTC
Last modified: 12 Jan 2020, 6:07:03 UTC

And the Scheduler's having another time out.
"Project has no tasks available" is the most common response, with a few responses that hand out 1 or 2 WUs mixed in.

And website & forums getting laggy again.
Grant
Darwin NT
ID: 2027423
Grant (SSSF)
Volunteer tester
Joined: 19 Aug 99
Posts: 14008
Credit: 208,696,464
RAC: 304
Australia
Message 2027425 - Posted: 12 Jan 2020, 8:25:34 UTC

Well, the Scheduler is awake again, now it's the download servers having issues. Taking quite a while for the downloads to start, even after several retries.
Grant
Darwin NT
ID: 2027425
Grant (SSSF)
Volunteer tester
Joined: 19 Aug 99
Posts: 14008
Credit: 208,696,464
RAC: 304
Australia
Message 2027431 - Posted: 12 Jan 2020, 9:45:01 UTC

I hope the splitters get their act together soon, otherwise we'll be out of work in less than 15min.
Grant
Darwin NT
ID: 2027431
Profile Jord
Volunteer tester
Joined: 9 Jun 99
Posts: 15184
Credit: 4,362,181
RAC: 3
Netherlands
Message 2027442 - Posted: 12 Jan 2020, 13:54:31 UTC - in response to Message 2027425.  

Uploads are slow and troublesome as well.
ID: 2027442
Grant (SSSF)
Volunteer tester
Joined: 19 Aug 99
Posts: 14008
Credit: 208,696,464
RAC: 304
Australia
Message 2027479 - Posted: 12 Jan 2020, 17:54:39 UTC - in response to Message 2027442.  

Uploads are slow and troublesome as well.

Been that way for ages, although it seems to be worse after each server issue when things are recovering, and even worse than usual after the last couple of server problems.
Mind you the return rate lately has been at record levels.
Grant
Darwin NT
ID: 2027479
TBar
Volunteer tester
Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 2027543 - Posted: 13 Jan 2020, 1:07:35 UTC

It seems the Scheduler has checked out again. Can't get any new tasks, web pages are slow, and the time of the last 'Results ready to send' update was 39 minutes ago: https://setiathome.berkeley.edu/show_server_status.php
Hopefully it will come back before my machines run outta work......AGAIN.
ID: 2027543
TBar
Volunteer tester
Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 2027551 - Posted: 13 Jan 2020, 4:26:18 UTC
Last modified: 13 Jan 2020, 4:28:01 UTC

Things are looking better, but most machines are Still Not receiving tasks. This one is down to about a Dozen tasks left out of a Thousand, https://setiathome.berkeley.edu/results.php?hostid=6796479
It's about to run out of work for the third time in 24 hours.
Correction....it's now out of work....AGAIN.
ID: 2027551
Grant (SSSF)
Volunteer tester
Joined: 19 Aug 99
Posts: 14008
Credit: 208,696,464
RAC: 304
Australia
Message 2027554 - Posted: 13 Jan 2020, 6:49:41 UTC

I see we're presently recovering from yet another unscheduled Scheduler rest break.
Grant
Darwin NT
ID: 2027554
Profile betreger Project Donor
Joined: 29 Jun 99
Posts: 11451
Credit: 29,581,041
RAC: 66
United States
Message 2027580 - Posted: 13 Jan 2020, 17:32:01 UTC

RTS = 39,968; this does not bode well for the recovery from the upcoming Tuesday outage.
ID: 2027580


 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.