The Server Issues / Outages Thread - Panic Mode On! (119)

Kissagogo27 · Special Project $75 donor
Joined: 6 Nov 99
Posts: 716
Credit: 8,032,827
RAC: 62
France
Message 2042814 - Posted: 3 Apr 2020, 20:22:48 UTC

03-Apr-2020 21:54:49 [SETI@home] Scheduler request completed: got 7 new tasks
03-Apr-2020 21:54:51 [SETI@home] Started download of 27mr20aa.28143.10701.8.35.253
03-Apr-2020 21:54:51 [SETI@home] Started download of 27mr20aa.8571.10701.9.36.22
03-Apr-2020 21:54:51 [SETI@home] Started download of 26mr20ae.17732.11110.9.36.252
03-Apr-2020 21:54:51 [SETI@home] Started download of 26mr20ae.21537.9883.10.37.251
03-Apr-2020 21:54:51 [SETI@home] Started download of 27mr20ab.20599.119335.6.33.0.vlar
03-Apr-2020 21:54:51 [SETI@home] Started download of 27mr20ab.20599.119335.6.33.42.vlar
03-Apr-2020 21:54:51 [SETI@home] Started download of 27mr20ab.20599.119335.6.33.198.vlar
03-Apr-2020 21:54:55 [SETI@home] Finished download of 26mr20ae.17732.11110.9.36.252
03-Apr-2020 21:54:56 [SETI@home] Finished download of 27mr20aa.28143.10701.8.35.253
03-Apr-2020 21:54:56 [SETI@home] Finished download of 27mr20aa.8571.10701.9.36.22
03-Apr-2020 21:54:56 [SETI@home] Finished download of 27mr20ab.20599.119335.6.33.42.vlar
03-Apr-2020 21:54:57 [SETI@home] Finished download of 26mr20ae.21537.9883.10.37.251
03-Apr-2020 21:54:57 [SETI@home] Finished download of 27mr20ab.20599.119335.6.33.0.vlar
03-Apr-2020 21:54:58 [SETI@home] Finished download of 27mr20ab.20599.119335.6.33.198.vlar


UTC+2 here; all _2 resends indeed.
ID: 2042814
juan BFP · Crowdfunding Project Donor · Special Project $75 donor · Special Project $250 donor
Volunteer tester
Joined: 16 Mar 07
Posts: 9786
Credit: 572,710,851
RAC: 3,799
Panama
Message 2042817 - Posted: 3 Apr 2020, 20:48:26 UTC

This is weird.

The scheduler is back at work and data is flowing, but some of the requests end with:

vie 03 abr 2020 15:46:07 EST | SETI@home | Project requested delay of 606 seconds


Instead of the regular 303 seconds.

Is it only here, or was something changed? I don't remember seeing that before.
ID: 2042817
Stephen "Heretic" Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 20 Sep 12
Posts: 5557
Credit: 192,787,363
RAC: 628
Australia
Message 2042818 - Posted: 3 Apr 2020, 20:52:39 UTC - in response to Message 2042817.  
Last modified: 3 Apr 2020, 20:54:25 UTC

This is weird.
The scheduler is back at work and data is flowing, but some of the requests end with:
vie 03 abr 2020 15:46:07 EST | SETI@home | Project requested delay of 606 seconds

Instead of the regular 303 seconds.
Is it only here, or was something changed? I don't remember seeing that before.


. . Clearly, with almost no work to send out and far fewer hosts still active, they decided to reduce the server load so they can play catch-up. IMHO. Maybe that will slow the slide into ludicrous backoff times, which increase exponentially with each failed attempt to get work.
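
. . For illustration only, a minimal sketch of how a doubling backoff runs away. The real BOINC client uses a randomised exponential backoff with its own constants, so treat the starting value and the doubling here as assumptions:

backoff=600                      # assume a 10 min starting backoff
for attempt in 1 2 3 4 5 6; do
    echo "failure $attempt: wait $((backoff / 60)) min"
    backoff=$((backoff * 2))     # assume the wait doubles after each failed request
done

. . Six failures in a row and you are already waiting over five hours between requests.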

Stephen

:(
ID: 2042818
juan BFP · Crowdfunding Project Donor · Special Project $75 donor · Special Project $250 donor
Volunteer tester
Joined: 16 Mar 07
Posts: 9786
Credit: 572,710,851
RAC: 3,799
Panama
Message 2042819 - Posted: 3 Apr 2020, 20:55:08 UTC - in response to Message 2042818.  

This is weird.
The scheduler is back at work and data is flowing, but some of the requests end with:
vie 03 abr 2020 15:46:07 EST | SETI@home | Project requested delay of 606 seconds

Instead of the regular 303 seconds.
Is it only here, or was something changed? I don't remember seeing that before.


. . Clearly, with almost no work to send out and far fewer hosts still active, they decided to reduce the server load so they can play catch-up. IMHO.

Stephen

:(

That was my first thought too, but some requests end with 303 and some with 606.
Can you check yours?
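
If it helps, one way to look is to grep the client's event log (assuming it is written to stdoutdae.txt in the BOINC data directory; the file name and location vary by platform):

grep "requested delay" stdoutdae.txt | tail -n 20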
ID: 2042819
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 13833
Credit: 208,696,464
RAC: 304
Australia
Message 2042830 - Posted: 3 Apr 2020, 21:59:10 UTC - in response to Message 2042819.  

That was my first thought too, but some requests end with 303 and some with 606.
Can you check yours?
10 min backoffs on all requests since the Scheduler came back to life. It looks like you can make 8 unsuccessful requests before it starts increasing the backoff time.

If they would now just set Resend deadlines to 2 days, we could get this cleared out in 3.5 months.
Grant
Darwin NT
ID: 2042830
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 13833
Credit: 208,696,464
RAC: 304
Australia
Message 2042832 - Posted: 3 Apr 2020, 22:05:21 UTC

Not a good sign: the forums have just become sluggish again. So how long till the Scheduler falls over again?
Grant
Darwin NT
ID: 2042832
Speedy
Volunteer tester
Joined: 26 Jun 04
Posts: 1643
Credit: 12,921,799
RAC: 89
New Zealand
Message 2042836 - Posted: 3 Apr 2020, 22:18:05 UTC - in response to Message 2042830.  

That was my first thought too, but some requests end with 303 and some with 606.
Can you check yours?
10 min backoffs on all requests since the Scheduler came back to life. It looks like you can make 8 unsuccessful requests before it starts increasing the backoff time.

If they would now just set Resend deadlines to 2 days, we could get this cleared out in 3.5 months.

So far today I have been successful in receiving:
4/04/2020 9:20:16 AM | SETI@home | Scheduler request completed: got 23 new tasks
All the work has been returned; it took just over an hour to process. I hope it made a difference; they were all _2s and _3s. Yes, changing the deadlines would help, but we still have to wait for the initial deadlines of the tasks to expire. What will be will be.
ID: 2042836
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 13833
Credit: 208,696,464
RAC: 304
Australia
Message 2042838 - Posted: 3 Apr 2020, 22:39:06 UTC - in response to Message 2042836.  
Last modified: 3 Apr 2020, 22:57:58 UTC

So far today I have been successful in receiving:
4/04/2020 9:20:16 AM | SETI@home | Scheduler request completed: got 23 new tasks
All the work has been returned; it took just over an hour to process. I hope it made a difference; they were all _2s and _3s. Yes, changing the deadlines would help, but we still have to wait for the initial deadlines of the tasks to expire. What will be will be.
You were incredibly lucky.
I've only seen a couple of WUs since they stopped splitting new work.

And the Scheduler is borked again, so it's not too likely I'll see any more in the near future.


Edit: and it's back again.
I see it's going to be one of those days.


And it looks like the Replica is trying for a new record.
Grant
Darwin NT
ID: 2042838
Stephen "Heretic" Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 20 Sep 12
Posts: 5557
Credit: 192,787,363
RAC: 628
Australia
Message 2042861 - Posted: 4 Apr 2020, 0:45:57 UTC - in response to Message 2042819.  

The scheduler is back at work and data is flowing, but some of the requests end with:
vie 03 abr 2020 15:46:07 EST | SETI@home | Project requested delay of 606 seconds

Instead of the regular 303 seconds.
Is it only here, or was something changed? I don't remember seeing that before.

. . Clearly, with almost no work to send out and far fewer hosts still active, they decided to reduce the server load so they can play catch-up. IMHO.
Stephen

That was my first thought too, but some requests end with 303 and some with 606.
Can you check yours?


. . I am only getting "no tasks" and have no work to report, so my logs are not recording the backoff time being set. But when I do a manual update, the backoff is 606.

Stephen

. .
ID: 2042861
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 13833
Credit: 208,696,464
RAC: 304
Australia
Message 2042864 - Posted: 4 Apr 2020, 1:03:35 UTC

OK, this is what is happening at the moment with Scheduler requests that result in no work, and the backoffs they produce.
4/04/2020 8:09:49 | SETI@home | Scheduler request completed: got 0 new tasks   1
4/04/2020 8:20:58 | SETI@home | Scheduler request completed: got 0 new tasks   2
4/04/2020 8:31:06 | SETI@home | Scheduler request completed: got 0 new tasks   3
4/04/2020 8:41:15 | SETI@home | Scheduler request completed: got 0 new tasks   4
4/04/2020 8:51:23 | SETI@home | Scheduler request completed: got 0 new tasks   5
4/04/2020 9:18:34 | SETI@home | Scheduler request completed: got 0 new tasks   6
4/04/2020 9:28:44 | SETI@home | Scheduler request completed: got 0 new tasks   7
It looks like we get 4 requests that each result in a backoff of 10 min.
The 5th request results in a 30 min backoff.
The 6th request results in a 10 min backoff.
The 7th request has resulted in a 1 hr 20 min backoff.

I'm dreading what the next couple of failures to get work will result in.
Grant
Darwin NT
ID: 2042864
Ian&Steve C.
Joined: 28 Sep 99
Posts: 4267
Credit: 1,282,604,591
RAC: 6,640
United States
Message 2042868 - Posted: 4 Apr 2020, 1:21:07 UTC - in response to Message 2042864.  

That's standard BOINC behavior: extending the backoffs after failed requests.

This kind of thing is exactly why I run:

watch -n 930 ./boinccmd --project http://setiathome.berkeley.edu update
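# (forces a scheduler contact every 930 s, a little over the 606 s requested delay, so a backoff never gets a chance to build)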


in a terminal window. No more extended backoffs, ever.
Seti@Home classic workunits: 29,492 CPU time: 134,419 hours

ID: 2042868
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 13833
Credit: 208,696,464
RAC: 304
Australia
Message 2042869 - Posted: 4 Apr 2020, 1:24:03 UTC

And the Scheduler has fallen over, yet again, so that should reset everything.
Grant
Darwin NT
ID: 2042869
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 13833
Credit: 208,696,464
RAC: 304
Australia
Message 2042881 - Posted: 4 Apr 2020, 3:24:27 UTC

Well, the Scheduler is back.


Results-in-progress is now almost a million Tasks lower than it was before the server-side limits were raised, and almost 2.5 million below its peak. The database is still jammed.
8.5 million Workunits waiting for assimilation, 22.4 million Results returned and awaiting validation.
Grant
Darwin NT
ID: 2042881
Speedy
Volunteer tester
Joined: 26 Jun 04
Posts: 1643
Credit: 12,921,799
RAC: 89
New Zealand
Message 2042884 - Posted: 4 Apr 2020, 3:49:16 UTC - in response to Message 2042881.  

Well, the Scheduler is back.


Results-in-progress is now almost a million Tasks lower than it was before the server-side limits were raised, and almost 2.5 million below its peak. The database is still jammed.
8.5 million Workunits waiting for assimilation, 22.4 million Results returned and awaiting validation.

When things start moving in the database, I expect they will move quickly. I am thinking there may be quite a few resends to come, because the number of tasks out in the field is over triple the amount of work awaiting validation. It will be interesting to see what happens when things move.
ID: 2042884
Ville Saari
Joined: 30 Nov 00
Posts: 1158
Credit: 49,177,052
RAC: 82,530
Finland
Message 2042893 - Posted: 4 Apr 2020, 5:06:47 UTC
Last modified: 4 Apr 2020, 5:07:17 UTC

I see a repeating pattern in my failing scheduler requests.

Everything works fine for a few hours, and then I get a bunch of failures, usually three consecutive ones. Those failure periods repeat almost exactly once every three hours.

Yesterday I had a long streak of successes, and then between 19:11 and 19:27 UTC just failures. After that, successes again until the next failure bunch happened between 22:21 and 22:39. Then the next failures came between 01:17 and 01:24, and the next from 4:11 to 4:31.

If this pattern is not just a random coincidence, we could expect the next failure period to happen at about 7:15 UTC.
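
The arithmetic behind that guess, as a quick shell sketch (assumes GNU date; each window is taken to start about three hours after the previous one):

for t in 19:11 22:21 01:17 04:11; do   # start of each observed failure window
    date -u -d "$t + 3 hours" +%H:%M   # project it three hours forward
done
# prints 22:11, 01:21, 04:17 and 07:11, each within about ten minutes of the next observed window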
ID: 2042893
Ville Saari
Joined: 30 Nov 00
Posts: 1158
Credit: 49,177,052
RAC: 82,530
Finland
Message 2042909 - Posted: 4 Apr 2020, 8:29:42 UTC

And just as I predicted, the next failed scheduler request happened at 7:25 UTC.
ID: 2042909
Kissagogo27 · Special Project $75 donor
Joined: 6 Nov 99
Posts: 716
Credit: 8,032,827
RAC: 62
France
Message 2042910 - Posted: 4 Apr 2020, 8:41:00 UTC
Last modified: 4 Apr 2020, 8:44:42 UTC


04-Apr-2020 02:50:08 [SETI@home] Scheduler request completed: got 1 new tasks
04-Apr-2020 04:09:21 [SETI@home] Scheduler request completed: got 1 new tasks
04-Apr-2020 07:03:18 [SETI@home] Scheduler request completed: got 2 new tasks
04-Apr-2020 07:13:31 [SETI@home] Scheduler request completed: got 1 new tasks
04-Apr-2020 08:43:30 [SETI@home] Scheduler request completed: got 4 new tasks
04-Apr-2020 09:56:53 [SETI@home] Scheduler request completed: got 2 new tasks

ID: 2042910
Ville Saari
Joined: 30 Nov 00
Posts: 1158
Credit: 49,177,052
RAC: 82,530
Finland
Message 2042919 - Posted: 4 Apr 2020, 10:56:39 UTC

The SSP (Server Status Page) is showing sane data! Assimilation has finally started to catch up - slowly. The replica db is falling behind - fast.
ID: 2042919
Cherokee150

Joined: 11 Nov 99
Posts: 192
Credit: 58,513,758
RAC: 74
United States
Message 2042924 - Posted: 4 Apr 2020, 12:29:13 UTC

I see that some of you are still getting quite a few tasks which, I suspect, must be resends. My fastest computer has just run out of the massive cache I had. That computer hasn't received a task since two were sent on April 2nd, at 12:11:11 UTC. If there is a way to get a few more tasks, please tell me.

By the way, there is an interesting story behind why SETI is, and has been, one of my life's goals since 1962.
I think it is time I shared it with this special community. I will, time permitting, write it down and post it for you sometime within the next few days.

Also, I wish to thank all of you for the most wonderful assistance and friendship that you have given everyone over the last two decades. I truly feel very lucky to be a part of this special family.
ID: 2042924
juan BFP · Crowdfunding Project Donor · Special Project $75 donor · Special Project $250 donor
Volunteer tester
Joined: 16 Mar 07
Posts: 9786
Credit: 572,710,851
RAC: 3,799
Panama
Message 2042932 - Posted: 4 Apr 2020, 14:09:52 UTC - in response to Message 2042909.  

And just as I predicted, the next failed scheduler request happened at 7:25 UTC.

Did you try NNT? When I do that, the problem disappears.
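
NNT is "No New Tasks". It can be set in the BOINC Manager, or from the command line with boinccmd (the same tool Ian&Steve C. uses above), for example:

./boinccmd --project http://setiathome.berkeley.edu nomorework     # set No New Tasks
./boinccmd --project http://setiathome.berkeley.edu allowmorework  # resume fetching work later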
ID: 2042932