The Server Issues / Outages Thread - Panic Mode On! (119)
Author | Message |
---|---|
Kissagogo27 Send message Joined: 6 Nov 99 Posts: 716 Credit: 8,032,827 RAC: 62 |
04-Apr-2020 18:43:29 [SETI@home] Scheduler request completed: got 1 new tasks another one ;) |
Kissagogo27 Send message Joined: 6 Nov 99 Posts: 716 Credit: 8,032,827 RAC: 62 |
great ! |
Grant (SSSF) Send message Joined: 19 Aug 99 Posts: 13835 Credit: 208,696,464 RAC: 304 |
Even when you have no Tasks to report & just need new work? And just as I predicted, the next failed scheduler request happened at 7:25 UTC. Grant Darwin NT |
Ville Saari Send message Joined: 30 Nov 00 Posts: 1158 Credit: 49,177,052 RAC: 82,530 |
And just as I predicted, the next failed scheduler request happened at 7:25 UTC.
I have NNT. I still get errors regularly every 3 hours. Yesterday the errors happened in several requests in a row, but today it seems to be just one failed request every three hours.
04-Apr-2020 15:34:29 [SETI@home] Scheduler request completed
04-Apr-2020 15:44:54 [SETI@home] Scheduler request completed
04-Apr-2020 15:55:13 [SETI@home] Scheduler request completed
04-Apr-2020 16:06:11 [SETI@home] Scheduler request completed
04-Apr-2020 16:18:39 [SETI@home] Scheduler request failed: HTTP service unavailable
04-Apr-2020 16:22:19 [SETI@home] Scheduler request completed
04-Apr-2020 16:34:11 [SETI@home] Scheduler request completed
04-Apr-2020 16:44:26 [SETI@home] Scheduler request completed
04-Apr-2020 16:54:51 [SETI@home] Scheduler request completed
04-Apr-2020 17:05:14 [SETI@home] Scheduler request completed
04-Apr-2020 17:15:36 [SETI@home] Scheduler request completed
04-Apr-2020 17:26:15 [SETI@home] Scheduler request completed
04-Apr-2020 17:36:41 [SETI@home] Scheduler request completed
04-Apr-2020 17:47:04 [SETI@home] Scheduler request completed
04-Apr-2020 17:57:27 [SETI@home] Scheduler request completed
04-Apr-2020 18:07:51 [SETI@home] Scheduler request completed
04-Apr-2020 18:18:24 [SETI@home] Scheduler request completed
04-Apr-2020 18:28:45 [SETI@home] Scheduler request completed
04-Apr-2020 18:39:04 [SETI@home] Scheduler request completed
04-Apr-2020 18:49:21 [SETI@home] Scheduler request completed
04-Apr-2020 18:59:47 [SETI@home] Scheduler request completed
04-Apr-2020 19:11:05 [SETI@home] Scheduler request completed
04-Apr-2020 19:22:42 [SETI@home] Scheduler request failed: Failure when receiving data from the peer
04-Apr-2020 19:26:33 [SETI@home] Scheduler request completed
04-Apr-2020 19:37:06 [SETI@home] Scheduler request completed
04-Apr-2020 19:47:28 [SETI@home] Scheduler request completed
04-Apr-2020 19:57:44 [SETI@home] Scheduler request completed
04-Apr-2020 20:08:06 [SETI@home] Scheduler request completed
04-Apr-2020 20:18:35 [SETI@home] Scheduler request completed
04-Apr-2020 20:28:56 [SETI@home] Scheduler request completed
04-Apr-2020 20:39:22 [SETI@home] Scheduler request completed
04-Apr-2020 20:49:40 [SETI@home] Scheduler request completed
04-Apr-2020 20:59:59 [SETI@home] Scheduler request completed
04-Apr-2020 21:10:24 [SETI@home] Scheduler request completed
04-Apr-2020 21:20:49 [SETI@home] Scheduler request completed
04-Apr-2020 21:31:08 [SETI@home] Scheduler request completed
04-Apr-2020 21:41:28 [SETI@home] Scheduler request completed
04-Apr-2020 21:51:52 [SETI@home] Scheduler request completed
04-Apr-2020 22:02:49 [SETI@home] Scheduler request completed
04-Apr-2020 22:14:53 [SETI@home] Scheduler request completed
04-Apr-2020 22:27:28 [SETI@home] Scheduler request failed: HTTP service unavailable
04-Apr-2020 22:30:56 [SETI@home] Scheduler request completed
04-Apr-2020 22:41:19 [SETI@home] Scheduler request completed
04-Apr-2020 22:51:48 [SETI@home] Scheduler request completed
04-Apr-2020 23:02:08 [SETI@home] Scheduler request completed
(My local time is 3 hours ahead of UTC) |
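The "every 3 hours" pattern can be checked directly from the log lines. The following is a minimal sketch, not part of the original post: it assumes the line layout shown above (`DD-Mon-YYYY HH:MM:SS [SETI@home] Scheduler request failed: ...`), and the names `failure_times` and `gaps_in_minutes` are made up for illustration. The sample holds only the lines around the three failures quoted in the post.

```python
import re
from datetime import datetime

# Excerpt of the log quoted above (timestamps are local time, UTC+3).
LOG = """\
04-Apr-2020 16:06:11 [SETI@home] Scheduler request completed
04-Apr-2020 16:18:39 [SETI@home] Scheduler request failed: HTTP service unavailable
04-Apr-2020 16:22:19 [SETI@home] Scheduler request completed
04-Apr-2020 19:22:42 [SETI@home] Scheduler request failed: Failure when receiving data from the peer
04-Apr-2020 19:26:33 [SETI@home] Scheduler request completed
04-Apr-2020 22:27:28 [SETI@home] Scheduler request failed: HTTP service unavailable
04-Apr-2020 22:30:56 [SETI@home] Scheduler request completed
"""

# Matches only failed scheduler requests; group 1 is the timestamp.
FAILED_RE = re.compile(
    r"^(\d{2}-[A-Za-z]{3}-\d{4} \d{2}:\d{2}:\d{2}) \[SETI@home\] "
    r"Scheduler request failed: .+$"
)


def failure_times(log_text):
    """Return the timestamps of all failed scheduler requests."""
    times = []
    for line in log_text.splitlines():
        match = FAILED_RE.match(line.strip())
        if match:
            # %b expects English month abbreviations (default C locale).
            times.append(datetime.strptime(match.group(1), "%d-%b-%Y %H:%M:%S"))
    return times


def gaps_in_minutes(times):
    """Return the gap between consecutive timestamps, rounded to minutes."""
    return [round((b - a).total_seconds() / 60) for a, b in zip(times, times[1:])]


if __name__ == "__main__":
    print(gaps_in_minutes(failure_times(LOG)))
```

For the three failures in this excerpt the gaps come out at roughly 184 and 185 minutes, which fits the description of one failed request about every three hours.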
Ville Saari Send message Joined: 30 Nov 00 Posts: 1158 Credit: 49,177,052 RAC: 82,530 |
It is much worse for my other computer that has no NNT:
04-Apr-2020 15:38:22 [SETI@home] Scheduler request completed: got 0 new tasks
04-Apr-2020 15:48:42 [SETI@home] Scheduler request completed: got 0 new tasks
04-Apr-2020 15:59:00 [SETI@home] Scheduler request completed: got 0 new tasks
04-Apr-2020 16:11:12 [SETI@home] Scheduler request failed: HTTP service unavailable
04-Apr-2020 16:14:21 [SETI@home] Scheduler request failed: Timeout was reached
04-Apr-2020 16:19:48 [SETI@home] Scheduler request failed: Timeout was reached
04-Apr-2020 16:26:42 [SETI@home] Scheduler request failed: Timeout was reached
04-Apr-2020 16:42:06 [SETI@home] Scheduler request completed: got 0 new tasks
04-Apr-2020 16:52:26 [SETI@home] Scheduler request completed: got 0 new tasks
04-Apr-2020 17:02:43 [SETI@home] Scheduler request completed: got 0 new tasks
04-Apr-2020 17:13:06 [SETI@home] Scheduler request completed: got 0 new tasks
04-Apr-2020 17:23:27 [SETI@home] Scheduler request completed: got 0 new tasks
04-Apr-2020 17:33:56 [SETI@home] Scheduler request completed: got 0 new tasks
04-Apr-2020 17:44:15 [SETI@home] Scheduler request completed: got 0 new tasks
04-Apr-2020 17:54:37 [SETI@home] Scheduler request completed: got 0 new tasks
04-Apr-2020 18:04:54 [SETI@home] Scheduler request completed: got 0 new tasks
04-Apr-2020 18:15:12 [SETI@home] Scheduler request completed: got 0 new tasks
04-Apr-2020 18:25:34 [SETI@home] Scheduler request completed: got 0 new tasks
04-Apr-2020 18:35:59 [SETI@home] Scheduler request completed: got 0 new tasks
04-Apr-2020 18:46:21 [SETI@home] Scheduler request completed: got 0 new tasks
04-Apr-2020 18:56:39 [SETI@home] Scheduler request completed: got 0 new tasks
04-Apr-2020 19:08:18 [SETI@home] Scheduler request failed: HTTP internal server error
04-Apr-2020 19:12:00 [SETI@home] Scheduler request failed: HTTP service unavailable
04-Apr-2020 19:17:42 [SETI@home] Scheduler request failed: HTTP internal server error
04-Apr-2020 19:25:31 [SETI@home] Scheduler request failed: HTTP internal server error
04-Apr-2020 19:35:17 [SETI@home] Scheduler request failed: HTTP internal server error
04-Apr-2020 20:02:34 [SETI@home] Scheduler request completed: got 0 new tasks
04-Apr-2020 20:12:55 [SETI@home] Scheduler request completed: got 0 new tasks
04-Apr-2020 20:23:19 [SETI@home] Scheduler request completed: got 0 new tasks
04-Apr-2020 20:33:40 [SETI@home] Scheduler request completed: got 0 new tasks
04-Apr-2020 20:44:01 [SETI@home] Scheduler request completed: got 0 new tasks
04-Apr-2020 20:54:21 [SETI@home] Scheduler request completed: got 0 new tasks
04-Apr-2020 21:04:43 [SETI@home] Scheduler request completed: got 0 new tasks
04-Apr-2020 21:15:06 [SETI@home] Scheduler request completed: got 0 new tasks
04-Apr-2020 21:25:25 [SETI@home] Scheduler request completed: got 0 new tasks
04-Apr-2020 21:35:47 [SETI@home] Scheduler request completed: got 0 new tasks
04-Apr-2020 21:46:16 [SETI@home] Scheduler request completed: got 0 new tasks
04-Apr-2020 21:56:33 [SETI@home] Scheduler request completed: got 0 new tasks
04-Apr-2020 22:08:45 [SETI@home] Scheduler request failed: HTTP internal server error
04-Apr-2020 22:12:15 [SETI@home] Scheduler request failed: HTTP internal server error
04-Apr-2020 22:17:46 [SETI@home] Scheduler request failed: HTTP internal server error
04-Apr-2020 22:24:39 [SETI@home] Scheduler request failed: Failure when receiving data from the peer
04-Apr-2020 22:38:17 [SETI@home] Scheduler request failed: HTTP internal server error
04-Apr-2020 22:59:40 [SETI@home] Scheduler request completed: got 0 new tasks |
juan BFP Send message Joined: 16 Mar 07 Posts: 9786 Credit: 572,710,851 RAC: 3,799 |
Even when you have no Tasks to report & just need new work? And just as I predicted, the next failed scheduler request happened at 7:25 UTC.
No, this works for those of us who still have work to report and, due to the server problems, can't. If you have work to report and your host can't report it, it enters a long delay, adding more time on each call, and stops asking for new work. Setting NNT allows the host to report the crunched WUs and so clear the cache; then, when you set it back to allow new work, the requests go back to asking for work and eventually pick up some resends, as your host is doing.
<edit> From Ville's post: I use this procedure to avoid the internal server error that is happening every 3 hours, as he posted (I have no idea why this error is happening, BTW):
04-Apr-2020 19:08:18 [SETI@home] Scheduler request failed: HTTP internal server error
04-Apr-2020 19:12:00 [SETI@home] Scheduler request failed: HTTP service unavailable
04-Apr-2020 19:17:42 [SETI@home] Scheduler request failed: HTTP internal server error
04-Apr-2020 19:25:31 [SETI@home] Scheduler request failed: HTTP internal server error
04-Apr-2020 19:35:17 [SETI@home] Scheduler request failed: HTTP internal server error |
Grant (SSSF) Send message Joined: 19 Aug 99 Posts: 13835 Credit: 208,696,464 RAC: 304 |
set it back to allow new work, the requests go back to asking for work and eventually pick up some resends, as your host is doing.
That's the problem. The Scheduler has been spending so much time down that even when I do ask for work, there's no chance of getting any because the Scheduler has gone AWOL. I've probably picked up fewer than 12 Tasks since the splitters stopped producing new work. Of course, at the present rate it will probably be a week (or more) before I can see the actual numbers; it looks like the Replica has all but given up trying to get data from Main and just gets further & further behind. I think Eric needs to give his "Tasks that have been out for over 4 weeks" script another try to get some Resends out, so the size of the database comes down and things can actually function again. Grant Darwin NT |
juan BFP Send message Joined: 16 Mar 07 Posts: 9786 Credit: 572,710,851 RAC: 3,799 |
set it back to allow new work, the requests go back to asking for work and eventually pick up some resends, as your host is doing. That's the problem. The Scheduler has been spending so much time down that even when I do ask for work, there's no chance of getting any because the Scheduler has gone AWOL.
Agreed, that is why I am leaving NNT on. At least the crunched work can be reported. Anyway, in my particular case I still have a lot of work to crunch before my host runs out, so trying to get a few more WUs at the expense of no longer reporting the crunched work makes no sense for me. I hope that in a week, with fewer users around, this can be sorted out. What still bugs me is: if the queries/second = 450, the Results received in last hour = 13,001, and the rest of the numbers are low, why do we continue to have problems with the servers? They would be expected to start running smoother after several days without new work and with fewer requests from the running hosts. Another point that caught my attention is why, if there are no more tapes to split, there are still a few pfb splitters running on the servers. |
Richard Haselgrove Send message Joined: 4 Jul 99 Posts: 14673 Credit: 200,643,578 RAC: 874 |
Whenever I've seen a fully-populated SSP today, the 'results returned per hour' has been around ten times the rate of creation of resends (15K returns/hour, closer to 1500 created/hour). So things are moving, oh-so-slowly, in the right direction. But it would still take 10 days or so to gather in the overdue results, and we will still have all the assimilation to do. |
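The "10 days or so" figure can be roughly reproduced with a back-of-the-envelope calculation. This is only a sketch under stated assumptions: it takes the outstanding pool to be the 3,527,077 "Results out in the field" quoted later in this thread, and it assumes the net drain is simply returns minus newly created resends, held constant. Whether that is how the estimate above was actually reached is not stated in the post; the function name `days_to_drain` is hypothetical.

```python
def days_to_drain(outstanding, returned_per_hour, resends_created_per_hour):
    """Days until `outstanding` results are gathered in at a constant net rate."""
    net_per_hour = returned_per_hour - resends_created_per_hour
    if net_per_hour <= 0:
        raise ValueError("backlog is not shrinking at these rates")
    return outstanding / net_per_hour / 24


if __name__ == "__main__":
    # Assumed inputs: 15K returns/hour and ~1500 resends/hour from the post
    # above; 3,527,077 "Results out in the field" from a later post.
    print(round(days_to_drain(3_527_077, 15_000, 1_500), 1))
```

Under those assumptions the result is just under 11 days, in line with the estimate above. It says nothing about validation or assimilation, which the post notes still have to happen afterwards.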
Kissagogo27 Send message Joined: 6 Nov 99 Posts: 716 Credit: 8,032,827 RAC: 62 |
last errors i had
i made UTC time for easy comparison ^^ |
Grant (SSSF) Send message Joined: 19 Aug 99 Posts: 13835 Credit: 208,696,464 RAC: 304 |
What still bugs me is: if the queries/second = 450, the Results received in last hour = 13,001, and the rest of the numbers are low, why do we continue to have problems with the servers? They would be expected to start running smoother after several days without new work and with fewer requests from the running hosts.
Because although some people said the number of "Results out in the field" was a significant part of the problem (they've now dropped to the lowest levels ever, with no impact whatsoever), the numbers of "Results returned and awaiting validation" and "Workunits waiting for assimilation" are huge, and we have to wait for Resends in order to get those 2 to clear out. And with many WUs with over 2 month deadlines, and a surprising number with over 3 month deadlines, that's how long it's going to take to put a big dent in the present backlogged numbers. Once we get those down to below that magic 20 million total number, then Assimilation should crank up again, and everything should function normally again, at long last. Of course we'll have to wait another 2-3+ months for those resends that have been sent out to hosts that don't return anything before they will be resent (yet again). Unless the change to the deadlines for resends is implemented, or Eric gets his script working, it's going to take 9+ months to clear all presently outstanding results.
Another point that caught my attention is why, if there are no more tapes to split, there are still a few pfb splitters running on the servers.
And only 3 GBT splitters amongst them. Given the problems with Assimilation, I'd have thought there would be more than 2 v8 Assimilators running. Maybe they shut the others down to try & reduce I/O contention? Grant Darwin NT |
juan BFP Send message Joined: 16 Mar 07 Posts: 9786 Credit: 572,710,851 RAC: 3,799 |
Given the problems with Assimilation, I'd have thought there would be more than 2 v8 Assimilators running. Maybe they shut the others down to try & reduce I/O contention?
I asked that before but did not really understand the answer. I still can't imagine how the 3,527,077 Results out in the field are going to validate the 21,488,654 Results returned and awaiting validation. If you look at the AP part, the numbers look as expected: 36,636 to validate 38,675. But as you said, we will probably need to wait a couple of months for the answer. |
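The mismatch described above is easier to see as a ratio of results still out in the field to results already returned and waiting. The sketch below uses only the four numbers quoted in the post; treating "one outstanding result per waiting result" as the expected case is a simplifying assumption made here for illustration, and the names `field_to_pending_ratio`, `MB_RATIO` and `AP_RATIO` are hypothetical.

```python
def field_to_pending_ratio(out_in_field, awaiting_validation):
    """Outstanding results per result that is waiting for validation."""
    return out_in_field / awaiting_validation


# Figures quoted in the post above.
MB_RATIO = field_to_pending_ratio(3_527_077, 21_488_654)  # main (setiathome_v8) numbers
AP_RATIO = field_to_pending_ratio(36_636, 38_675)         # AstroPulse numbers

if __name__ == "__main__":
    print(f"main: {MB_RATIO:.2f}  AP: {AP_RATIO:.2f}")
```

The AP ratio is close to 1, while the main ratio is far below it, which is the oddity the post points at.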
Stephen "Heretic" Send message Joined: 20 Sep 12 Posts: 5557 Credit: 192,787,363 RAC: 628 |
04-Apr-2020 18:43:29 [SETI@home] Scheduler request completed: got 1 new tasks . . Braggart :) Stephen :( |
Ulrich Metzner Send message Joined: 3 Jul 02 Posts: 1256 Credit: 13,565,513 RAC: 13 |
...
05/04/2020 00:37:38 | SETI@home | [cpu_sched] Starting task 27mr20aa.2497.25697.7.34.182_0 using setiathome_v8 version 805 in slot 1
05/04/2020 00:37:40 | SETI@home | Started upload of 27mr20aa.9940.11928.16.43.10_1_r826992552_0
05/04/2020 00:37:44 | SETI@home | Finished upload of 27mr20aa.9940.11928.16.43.10_1_r826992552_0
05/04/2020 00:39:33 | SETI@home | Sending scheduler request: To report completed tasks.
05/04/2020 00:39:33 | SETI@home | Reporting 2 completed tasks
05/04/2020 00:39:33 | SETI@home | Requesting new tasks for CPU and NVIDIA GPU
05/04/2020 00:39:36 | SETI@home | Scheduler request completed: got 75 new tasks <-- !!!!!!
05/04/2020 00:39:38 | SETI@home | Started download of 14mr09ag.17677.3344.10.37.252
05/04/2020 00:39:38 | SETI@home | Started download of 14mr09ag.17389.4162.4.31.251
05/04/2020 00:39:45 | SETI@home | Finished download of 14mr09ag.17389.4162.4.31.251
05/04/2020 00:39:45 | SETI@home | Started download of 14mr09ag.17677.3344.10.37.246
05/04/2020 00:39:46 | SETI@home | Finished download of 14mr09ag.17677.3344.10.37.252
05/04/2020 00:39:46 | SETI@home | Started download of 14mr09ag.17389.4162.4.31.255
05/04/2020 00:39:51 | SETI@home | Finished download of 14mr09ag.17677.3344.10.37.246
05/04/2020 00:39:51 | SETI@home | Started download of 14mr09ag.17389.4162.4.31.243
...
:O Aloha, Uli |
juan BFP Send message Joined: 16 Mar 07 Posts: 9786 Credit: 572,710,851 RAC: 3,799 |
@Stephen Could you do me a favor? Can you look at and post (or PM) the size of your setiathome.berkeley.edu.xml file? It's located in the BOINC directory (the one where the boinc.exe program is placed). And could you also check how many seconds of new work your host is asking for in the scheduler request? That will appear in the history file if you have the sched_op flag activated. I ask because after the change from 303 to 606 my host has had serious trouble reporting work, and the only possible causes I could think of come down to these 2 possibilities. So I need the info from a regular client that is getting work to be sure I'm on the right path. Thanks in advance. |
Grant (SSSF) Send message Joined: 19 Aug 99 Posts: 13835 Credit: 208,696,464 RAC: 304 |
Can you look at and post the size of your setiathome.berkeley.edu.xml file? It's located in the BOINC directory (the one where the boinc.exe program is placed).
For me, nothing in progress & nothing to report. 4kB. Grant Darwin NT |
Jimbocous Send message Joined: 1 Apr 13 Posts: 1856 Credit: 268,616,081 RAC: 1,349 |
04-Apr-2020 18:43:29 [SETI@home] Scheduler request completed: got 1 new tasks if you're looking for account_setiathome.berkeley.edu.xml, mine's 6kb. As I recall, only the first three lines, <master_url>, <authenticator>, <project_name>, are needed to connect. The remainder gets filled in from the website from your preferences as defined there. |
juan BFP Send message Joined: 16 Mar 07 Posts: 9786 Credit: 572,710,851 RAC: 3,799 |
I'm looking for this file: sched_request_setiathome.berkeley.edu. It contains the data the host sends to the server when you ask for new work. Mine is 1.8 MB, and I believe that is the problem. |
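For anyone wanting to compare their own numbers, a small script can print the size of the request file in the same units used in this thread. This is a minimal sketch: the file location differs between installations, so the path is passed on the command line rather than guessed, and `human_size` and `report_size` are names invented for this example.

```python
import sys
from pathlib import Path


def human_size(num_bytes):
    """Format a byte count using 1024-based units, e.g. '15.0 kB'."""
    size = float(num_bytes)
    for unit in ("B", "kB", "MB", "GB"):
        # Stop at the first unit that fits, or at the largest unit we know.
        if size < 1024 or unit == "GB":
            return f"{size:.1f} {unit}"
        size /= 1024


def report_size(path):
    """Return '<file name>: <size>' for an existing file."""
    file_path = Path(path)
    return f"{file_path.name}: {human_size(file_path.stat().st_size)}"


if __name__ == "__main__":
    # Usage (the path is an assumption; point it at your own BOINC folder):
    #   python size_check.py /path/to/sched_request_setiathome.berkeley.edu
    for argument in sys.argv[1:]:
        print(report_size(argument))
```

Running it against the scheduler request file on two hosts gives a quick side-by-side comparison like the 1.8 MB versus 15 kB figures reported here.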
Jimbocous Send message Joined: 1 Apr 13 Posts: 1856 Credit: 268,616,081 RAC: 1,349 |
Mine's 15kb. |
juan BFP Send message Joined: 16 Mar 07 Posts: 9786 Credit: 572,710,851 RAC: 3,799 |
Mine's 15kb.
Thanks. That solves the puzzle. Something was changed and the servers no longer handle my large file. Back to the drawing board. I will keep running with NNT until I think of something. Thanks again for the help. |
©2024 University of California
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.