The Server Issues / Outages Thread - Panic Mode On! (119)
Author | Message |
---|---|
Kissagogo27 Joined: 6 Nov 99 Posts: 716 Credit: 8,032,827 RAC: 62 |
03-Apr-2020 21:54:49 [SETI@home] Scheduler request completed: got 7 new tasks (timestamp is UTC+2). All are _2 indeed. |
juan BFP Joined: 16 Mar 07 Posts: 9786 Credit: 572,710,851 RAC: 3,799 |
This is weird. The scheduler is back at work and data is flowing, but some of the requests end with: vie 03 abr 2020 15:46:07 EST | SETI@home | Project requested delay of 606 seconds, instead of the regular 303 sec. Is it only here, or was something changed? I don't remember seeing that before. |
Stephen "Heretic" Joined: 20 Sep 12 Posts: 5557 Credit: 192,787,363 RAC: 628 |
"This is weird." Clearly, with almost no work to send out and far fewer hosts still active, they decided to reduce the server load so they can play catch-up, IMHO. Maybe that will slow the progression into ludicrous backoff times, which increase exponentially with each failed attempt to get work. Stephen :( |
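For readers unfamiliar with the term, exponentially growing backoff can be sketched as follows. This is a generic illustration only, not the BOINC client's actual algorithm: the function name `backoff_delay`, the 600-second base (chosen to be close to the 606-second delay reported in this thread) and the 24-hour cap are all assumptions made for the example.

```python
def backoff_delay(failures: int, base: float = 600.0, cap: float = 86400.0) -> float:
    """Return a delay in seconds that doubles with each consecutive failure.

    Generic exponential backoff, NOT BOINC's real implementation.
    `base` and `cap` are illustrative assumptions.
    """
    if failures < 1:
        return 0.0
    # 1st failure -> base, 2nd -> 2*base, 3rd -> 4*base, ... limited by cap.
    return min(cap, base * 2 ** (failures - 1))


if __name__ == "__main__":
    for n in range(1, 10):
        print(f"failure {n}: wait {backoff_delay(n):.0f} s")
```

With these assumed values the delay grows from 10 minutes to the 24-hour cap after nine consecutive failures, which is why a longer base delay alone does not prevent very long backoffs.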
juan BFP Joined: 16 Mar 07 Posts: 9786 Credit: 572,710,851 RAC: 3,799 |
"This is weird." That was my first thought too, but some requests end with 303 and some with 606. Can you check yours? |
Grant (SSSF) Joined: 19 Aug 99 Posts: 13864 Credit: 208,696,464 RAC: 304 |
"That was my first thought too, but some requests end with 303 and some with 606." 10min backoff on all requests since the Scheduler came back to life. Looks like it is able to do 8 unsuccessful requests before it starts increasing the backoff time. If they would now just set Resend deadlines to 2 days, we could get this cleared out in 3.5 months. Grant Darwin NT |
Grant (SSSF) Joined: 19 Aug 99 Posts: 13864 Credit: 208,696,464 RAC: 304 |
Not a good sign, forums have just become sluggish again. So how long til the Scheduler falls over again? Grant Darwin NT |
Speedy Joined: 26 Jun 04 Posts: 1643 Credit: 12,921,799 RAC: 89 |
"That was my first thought too, but some requests end with 303 and some with 606." "10min backoff on all requests since the Scheduler came back to life. Looks like it is able to do 8 unsuccessful requests before it starts increasing the backoff time." So far today I have been successful in receiving: 4/04/2020 9:20:16 AM | SETI@home | Scheduler request completed: got 23 new tasks. All work has been returned; it took just over an hour to process. I hope it made a difference; they were all 2 and 3. Yes, changing the deadlines would help, but we still have to wait for the initial period of the tasks to expire. What will be will be. |
Grant (SSSF) Joined: 19 Aug 99 Posts: 13864 Credit: 208,696,464 RAC: 304 |
"So far today I have been successful in receiving:" You were incredibly lucky; I've only seen a couple of WUs since they stopped splitting new work. And the Scheduler is borked again, so it's not too likely I'll see any more in the near future. Edit: and it's back again. I see it's going to be one of those days. And it looks like the Replica is trying for a new record. Grant Darwin NT |
Stephen "Heretic" Joined: 20 Sep 12 Posts: 5557 Credit: 192,787,363 RAC: 628 |
"The scheduler is back at work and data is flowing, but some of the requests end with:" I am only getting "no tasks" and have no work to report, so my logs are not recording the backoff time being set. But when I do a manual update, the backoff is 606. Stephen |
Grant (SSSF) Joined: 19 Aug 99 Posts: 13864 Credit: 208,696,464 RAC: 304 |
Ok, this is what is happening with Scheduler requests resulting in no work & backoffs at the moment. Request 1: 4/04/2020 8:09:49 | SETI@home | Scheduler request completed: got 0 new tasks. Request 2: 4/04/2020 8:20:58 | SETI@home | Scheduler request completed: got 0 new tasks. Request 3: 4/04/2020 8:31:06 | SETI@home | Scheduler request completed: got 0 new tasks. Request 4: 4/04/2020 8:41:15 | SETI@home | Scheduler request completed: got 0 new tasks. Request 5: 4/04/2020 8:51:23 | SETI@home | Scheduler request completed: got 0 new tasks. Request 6: 4/04/2020 9:18:34 | SETI@home | Scheduler request completed: got 0 new tasks. Request 7: 4/04/2020 9:28:44 | SETI@home | Scheduler request completed: got 0 new tasks. It looks like we get 4 requests that result in a backoff of 10min. The 5th request results in a 30min backoff. The 6th request results in a 10min backoff. The 7th request has resulted in a 1hr 20min backoff. I'm dreading what the next couple of failures to get work will result in. Grant Darwin NT |
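The gaps between the seven requests in the log above can be checked with a short script. This is a minimal sketch: the timestamps are copied from the post, the day/month/year date format is an assumption, and the helper name `gaps_in_seconds` is made up for the example. The gap between two requests is only an approximation of the backoff that was set, and the backoff after the 7th request cannot be derived from the log at all.

```python
from datetime import datetime

# Timestamps copied from the seven log lines in the post above.
REQUEST_TIMES = [
    "4/04/2020 8:09:49",
    "4/04/2020 8:20:58",
    "4/04/2020 8:31:06",
    "4/04/2020 8:41:15",
    "4/04/2020 8:51:23",
    "4/04/2020 9:18:34",
    "4/04/2020 9:28:44",
]


def gaps_in_seconds(stamps, fmt="%d/%m/%Y %H:%M:%S"):
    """Return the time between consecutive requests, in whole seconds.

    The day/month/year format is an assumption about the client's locale.
    """
    times = [datetime.strptime(s, fmt) for s in stamps]
    return [int((b - a).total_seconds()) for a, b in zip(times, times[1:])]


if __name__ == "__main__":
    for i, gap in enumerate(gaps_in_seconds(REQUEST_TIMES), start=1):
        print(f"after request {i}: {gap // 60} min {gap % 60} s")
```

The computed gaps are about 10 to 11 minutes after requests 1 to 4 and 6, and about 27 minutes after request 5, which matches the pattern described in the post.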
Ian&Steve C. Joined: 28 Sep 99 Posts: 4267 Credit: 1,282,604,591 RAC: 6,640 |
that's standard BOINC behavior to extend the backoffs. this kind of thing is exactly why i run: watch -n 930 ./boinccmd --project http://setiathome.berkeley.edu update in a terminal window. no more extended backoffs ever. Seti@Home classic workunits: 29,492 CPU time: 134,419 hours |
Grant (SSSF) Joined: 19 Aug 99 Posts: 13864 Credit: 208,696,464 RAC: 304 |
And the Scheduler has fallen over, yet again, so that should reset everything. Grant Darwin NT |
Grant (SSSF) Joined: 19 Aug 99 Posts: 13864 Credit: 208,696,464 RAC: 304 |
Well, the Scheduler is back. Results-in-progress is now almost a million Tasks lower than it was before the Server side limits were raised, almost 2.5 million below its peak. Database is still jammed. 8.5 million Workunits waiting for assimilation, 22.4 million Results returned and awaiting validation. Grant Darwin NT |
Speedy Joined: 26 Jun 04 Posts: 1643 Credit: 12,921,799 RAC: 89 |
"Well, the Scheduler is back." When things move in the database, I expect they will move quickly. I am thinking there may be quite a few resends to come, because the number of tasks out in the field is over triple the amount of work awaiting validation. It will be interesting to see what happens when things move. |
Ville Saari Joined: 30 Nov 00 Posts: 1158 Credit: 49,177,052 RAC: 82,530 |
I see a repeating pattern in my failing scheduler requests. Everything works fine for a few hours and then I get a bunch of failures, usually three consecutive ones. Those failure periods repeat almost exactly once every three hours. Yesterday I had a long streak of successes and then, between 19:11 and 19:27 UTC, just failures. After that, successes again until the next bunch of failures happened between 22:21 and 22:39. Then the next failures between 01:17 and 01:24, and the next from 04:11 to 04:31. If this pattern is not just a random coincidence, we could expect the next failure period to happen at about 07:15 UTC. |
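The extrapolation in the post above can be reproduced with a simple mean-interval estimate. This is a minimal sketch, not the poster's own method: the start times are copied from the post, the function names are made up for the example, and the modulo arithmetic assumes that consecutive failure periods are less than 24 hours apart.

```python
def to_minutes(hhmm: str) -> int:
    """Convert an 'HH:MM' string to minutes since midnight."""
    hours, minutes = hhmm.split(":")
    return int(hours) * 60 + int(minutes)


def predict_next(starts):
    """Return (gaps in minutes, predicted next start as 'HH:MM').

    Uses the mean gap between consecutive start times; the modulo
    handles the wrap over midnight (assumes gaps shorter than 24 h).
    """
    mins = [to_minutes(s) for s in starts]
    gaps = [(b - a) % 1440 for a, b in zip(mins, mins[1:])]
    mean_gap = sum(gaps) / len(gaps)
    nxt = round(mins[-1] + mean_gap) % 1440
    return gaps, f"{nxt // 60:02d}:{nxt % 60:02d}"


# Start times (UTC) of the four failure periods listed in the post.
FAILURE_STARTS = ["19:11", "22:21", "01:17", "04:11"]

if __name__ == "__main__":
    gaps, nxt = predict_next(FAILURE_STARTS)
    print("gaps (min):", gaps)
    print("predicted next failure period (UTC):", nxt)
```

The gaps come out at 190, 176 and 174 minutes, an average of exactly three hours, so this estimate lands a few minutes before the "about 07:15 UTC" given in the post.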
Ville Saari Joined: 30 Nov 00 Posts: 1158 Credit: 49,177,052 RAC: 82,530 |
And just as I predicted, the next failed scheduler request happened at 7:25 UTC. |
Kissagogo27 Joined: 6 Nov 99 Posts: 716 Credit: 8,032,827 RAC: 62 |
|
Ville Saari Joined: 30 Nov 00 Posts: 1158 Credit: 49,177,052 RAC: 82,530 |
SSP is showing sane data! Assimilation has finally started to catch up - slowly. Replica db is falling behind - fast. |
Cherokee150 Joined: 11 Nov 99 Posts: 192 Credit: 58,513,758 RAC: 74 |
I see that some of you are still getting quite a few tasks which, I suspect, must be resends. My fastest computer has just run out of the massive cache I had. That computer hasn't received a task since two were sent on April 2nd, at 12:11:11 UTC. If there is a way to get a few more tasks, please tell me. By the way, there is an interesting story about why the search for SETI is, and has been, one of my life's goals since 1962. I think it is time I should share it with this special community. I will, time permitting, write it down and post it for you sometime within the next few days. Also, I wish to thank all of you for the most wonderful assistance, and friendship, that all of you have given to everyone over the last two decades. I truly feel I am very lucky to be a part of this special family. |
juan BFP Joined: 16 Mar 07 Posts: 9786 Credit: 572,710,851 RAC: 3,799 |
"And just as I predicted, the next failed scheduler request happened at 7:25 UTC." Did you try setting NNT (No New Tasks)? When I do that, the problem disappears. |