Panic Mode On (108) Server Problems?

Jeff Buck Volunteer tester Send message Joined: 11 Feb 00 Posts: 1441 Credit: 148,764,870 RAC: 0	Message 1898755 - Posted: 2 Nov 2017, 20:45:14 UTC - in response to Message 1898751. There's a thread over on Q&A, completed tasks, with a similar issue. Jord forwarded that info to Eric, who's apparently trying to look into it. Perhaps you can piggyback on that one to press the issue. ID: 1898755 ·

Richard Haselgrove Volunteer tester Send message Joined: 4 Jul 99 Posts: 14650 Credit: 200,643,578 RAC: 874	Message 1898756 - Posted: 2 Nov 2017, 20:51:54 UTC - in response to Message 1898754. I'd try one cycle with 'no new tasks' selected, to get rid of those completed tasks: then try requesting work again (waiting 303 seconds first, of course), but with a smaller cache setting. I don't think you're ever going to need 13.85 CPU-days of work in one go, when there's a limit of 100 CPU tasks at a time. What's the Host ID of Darksider? OK, out now - leave some tasks for me, please ;) ID: 1898756 ·

Keith Myers Volunteer tester Send message Joined: 29 Apr 01 Posts: 13164 Credit: 1,160,866,277 RAC: 1,873	Message 1898762 - Posted: 2 Nov 2017, 21:09:56 UTC - in response to Message 1898756. Last modified: 2 Nov 2017, 21:12:21 UTC Already tried that. Part of the 'kick the servers' process is setting NNT. Then interrupt the network communication. Wait out 305 seconds. Shut down BOINC, wait 1 minute and restart BOINC. Set tasks back to receive and then restart network communications. That process is what usually gets the servers to wake up and send you work. My work cache settings are global and set for 2.0 days + 0.1 days additional. As you said, there is no point in asking for more work when the server limits you to 100 tasks per CPU and 100 tasks per GPU at any time. I should have 400 tasks on board at any time if the servers are working correctly and there is work. I request work often enough every day that 2 days is more than sufficient. I crunch through my 300 GPU task allotment every 2 1/2 hours. I have no clue why the servers are calculating that I am requesting that much work. It should be 2 days worth. I have 92 CPU tasks on board now. Zero GPU tasks. The Host ID is 8306366. Seti@Home classic workunits:20,676 CPU time:74,226 hours A proud member of the OFA (Old Farts Association) ID: 1898762 ·
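
For anyone who wants to script the 'kick the servers' routine described in the post above, here is a rough Python sketch. It only builds the list of steps and runs nothing. It assumes the stock `boinccmd` command-line tool is available; the project URL is a parameter you must supply yourself, and the function name `kick_plan` and the step labels are made up for illustration. Restarting the BOINC client is left as a manual step because it depends on how the client is installed.

```python
from typing import List, Tuple, Union

# A step is ("run", argv), ("wait", seconds) or ("manual", note).
Step = Tuple[str, Union[List[str], float, str]]


def kick_plan(project_url: str,
              backoff_s: float = 305,
              restart_wait_s: float = 60) -> List[Step]:
    """Return the 'kick the servers' steps in order, without executing them.

    project_url is a placeholder supplied by the caller (an assumption, not
    taken from the thread). The wait times follow the post: 305 seconds,
    then 1 minute.
    """
    return [
        ("run", ["boinccmd", "--project", project_url, "nomorework"]),     # set NNT
        ("run", ["boinccmd", "--set_network_mode", "never"]),              # cut network
        ("wait", backoff_s),                                               # wait out the backoff
        ("run", ["boinccmd", "--quit"]),                                   # shut down BOINC
        ("wait", restart_wait_s),                                          # wait 1 minute
        ("manual", "restart the BOINC client (method depends on the install)"),
        ("run", ["boinccmd", "--project", project_url, "allowmorework"]),  # allow new tasks
        ("run", ["boinccmd", "--set_network_mode", "auto"]),               # restore network
    ]


if __name__ == "__main__":
    for kind, detail in kick_plan("PROJECT_URL_GOES_HERE"):
        print(kind, detail)
```

Keeping the plan as data makes it easy to read through before deciding whether to run any of it by hand.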

Brent Norman Volunteer tester Send message Joined: 1 Dec 99 Posts: 2786 Credit: 685,657,289 RAC: 835	Message 1898763 - Posted: 2 Nov 2017, 21:19:40 UTC - in response to Message 1898762. You are chasing goblins, it is a server problem affecting everyone, look at the SSP and haveland. ID: 1898763 ·

Keith Myers Volunteer tester Send message Joined: 29 Apr 01 Posts: 13164 Credit: 1,160,866,277 RAC: 1,873	Message 1898765 - Posted: 2 Nov 2017, 21:24:14 UTC - in response to Message 1898763. Thanks for the comment and clue Brent. I hadn't looked there yet since all the replies to me this morning was that everything was working fine for everyone else but me. Seti@Home classic workunits:20,676 CPU time:74,226 hours A proud member of the OFA (Old Farts Association) ID: 1898765 ·

Stephen "Heretic" Volunteer tester Send message Joined: 20 Sep 12 Posts: 5557 Credit: 192,787,363 RAC: 628	Message 1898767 - Posted: 2 Nov 2017, 21:36:24 UTC - in response to Message 1898765. Last modified: 2 Nov 2017, 21:40:11 UTC Thanks for the comment and clue Brent. I hadn't looked there yet since all the replies to me this morning was that everything was working fine for everyone else but me. . . I think that returns of only 99,000 in last hour seems low ... maybe not, but the creation rate of 0.5 tasks per sec definitely does. . . And as usual 610K tasks in the hopper and none being sent out ??? . . And I am getting nothing as well :( (Down to 60 tasks on the big rig and dropping, no 300 cache there) Stephen :( ID: 1898767 ·

Richard Haselgrove Volunteer tester Send message Joined: 4 Jul 99 Posts: 14650 Credit: 200,643,578 RAC: 874	Message 1898773 - Posted: 2 Nov 2017, 22:33:10 UTC - in response to Message 1898762. I have no clue why the servers are calculating that I am requesting that much work. It should be 2 days worth. I have 92 CPU tasks on board now. Zero GPU tasks. The Host ID is 8306366 It's not the servers that calculate that - it's your own client doing the requesting. Asking for two days of work for each of 8 CPUs - that would be 16 days. You must have had some left. ID: 1898773 ·
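
The arithmetic in the post above can be checked with a few lines of Python. This is a deliberately simplified model (cache days times CPU count, minus work already on hand), not the BOINC client's actual work-fetch logic, which weighs more factors. The function names are invented for the example; the 2 days, 8 CPUs and 13.85 CPU-days figures come from the thread.

```python
def cpu_days_requested(buffer_days: float, ncpus: int, cpu_days_on_hand: float) -> float:
    """Simplified request size: fill the cache for every CPU, minus work held."""
    return max(0.0, buffer_days * ncpus - cpu_days_on_hand)


def implied_on_hand(buffer_days: float, ncpus: int, requested_cpu_days: float) -> float:
    """Work that must already be on board for a given request size."""
    return buffer_days * ncpus - requested_cpu_days


if __name__ == "__main__":
    # An empty host asking for 2 days on each of 8 CPUs.
    print(cpu_days_requested(2.0, 8, 0.0))
    # The 13.85 CPU-day request implies roughly this much work was left.
    print(round(implied_on_hand(2.0, 8, 13.85), 2))
```

Under this model a 13.85 CPU-day request is consistent with a bit over 2 CPU-days of work still on the host, which matches the "you must have had some left" remark.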

Stephen "Heretic" Volunteer tester Send message Joined: 20 Sep 12 Posts: 5557 Credit: 192,787,363 RAC: 628	Message 1898774 - Posted: 2 Nov 2017, 22:51:04 UTC - in response to Message 1898767. Last modified: 2 Nov 2017, 22:53:41 UTC Thanks for the comment and clue Brent. I hadn't looked there yet since all the replies to me this morning was that everything was working fine for everyone else but me. . . I think that returns of only 99,000 in last hour seems low ... maybe not, but the creation rate of 0.5 tasks per sec definitely does. . . And as usual 610K tasks in the hopper and none being sent out ??? . . And I am getting nothing as well :( (Down to 60 tasks on the big rig and dropping, no 300 cache there) Stephen :( . . Update: . . Down to 4 tasks, no work coming in, shutting down for the interim. Wake me when the work starts flowing again <joke> ID: 1898774 ·

HAL9000 Volunteer tester Send message Joined: 11 Sep 99 Posts: 6534 Credit: 196,805,888 RAC: 57	Message 1898775 - Posted: 2 Nov 2017, 22:52:55 UTC From my end the servers seem to be in good order.
Project details for: SETI@home including all dates
Scheduler Requests: 4411
Scheduler Success: 99 %, Count: 4404
Scheduler Failure: 0 %, Count: 7 (Total)
Scheduler Failure: 0 % of total, Count: 0 (Couldn't connect to server)
Scheduler Failure: 0 % of total, Count: 4 (HTTP service unavailable)
Scheduler Failure: 0 % of total, Count: 0 (HTTP internal server error)
Scheduler Failure: 0 % of total, Count: 3 (Couldn't resolve host name)
Scheduler Failure: 0 % of total, Count: 0 (Failure when receiving data from the peer)
Scheduler Failure: 0 % of total, Count: 0 (Timeout was reached)
Scheduler Timeout: 0 % of failures
Project details for: SETI@home on 02-Nov-2017
Scheduler Requests: 114
Scheduler Success: 100 %, Count: 114
Project details for: SETI@home on 01-Nov-2017
Scheduler Requests: 173
Scheduler Success: 100 %, Count: 173
Project details for: SETI@home on 31-Oct-2017
Scheduler Requests: 65
Scheduler Success: 100 %, Count: 65
Project details for: SETI@home including all dates
Total number of work requests: 4404
Number of requests gaining work: 1306
Number of requests not gaining work: 3098
Number of requests not gaining work: 2662 (project task limit)
Number of requests not gaining work: 42 (project down for maintenance)
Number of requests not gaining work: 0 (request too recent)
Number of requests not gaining work: 52 (Project has no tasks available)
Number of times no work was requested: 0
Number of tasks gained: 1590
Project details for: SETI@home on 02-Nov-2017
Total number of work requests: 114
Number of requests gaining work: 30
Number of requests not gaining work: 84
Number of requests not gaining work: 84 (project task limit)
Number of tasks gained: 33
Project details for: SETI@home on 01-Nov-2017
Total number of work requests: 173
Number of requests gaining work: 63
Number of requests not gaining work: 110
Number of requests not gaining work: 107 (project task limit)
Number of requests not gaining work: 3 (Project has no tasks available)
Number of tasks gained: 67
Project details for: SETI@home on 31-Oct-2017
Total number of work requests: 65
Number of requests gaining work: 17
Number of requests not gaining work: 48
Number of requests not gaining work: 39 (project task limit)
Number of requests not gaining work: 9 (project down for maintenance)
Number of tasks gained: 46
SETI@home classic workunits: 93,865 CPU time: 863,447 hours Join the [url=http://tinyurl.com/8y46zvu]BP6/VP6 User Group[ ID: 1898775 ·
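
To put the all-dates totals in the post above into proportion, a short Python sketch can turn the counts into percentages. The numbers are copied from the post; the helper name `share` and the dictionary layout are just illustrative choices. The listed reasons do not add up to the "not gaining work" total, so the sketch reports the remainder rather than guessing at its cause.

```python
def share(part: int, whole: int) -> float:
    """Percentage of whole represented by part (0.0 if whole is zero)."""
    return 100.0 * part / whole if whole else 0.0


# All-dates figures as posted.
total_requests = 4404
gained_work = 1306
not_gained = 3098
reasons = {
    "project task limit": 2662,
    "project down for maintenance": 42,
    "request too recent": 0,
    "Project has no tasks available": 52,
}

# Requests that gained no work but have no listed reason.
unlisted = not_gained - sum(reasons.values())

if __name__ == "__main__":
    print(f"requests gaining work: {share(gained_work, total_requests):.1f} %")
    for reason, count in reasons.items():
        print(f"{reason}: {share(count, not_gained):.1f} % of requests without work")
    print(f"no reason listed: {unlisted} requests")
```

On these figures most of the empty replies come from the project task limit, meaning the host was already at its cap, rather than from the server having nothing to send.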

Stephen "Heretic" Volunteer tester Send message Joined: 20 Sep 12 Posts: 5557 Credit: 192,787,363 RAC: 628	Message 1898777 - Posted: 2 Nov 2017, 22:55:18 UTC - in response to Message 1898775. From my end the servers seem to be in good order. . . Aaahh! teacher's pet! :) Stephen :) ID: 1898777 ·

Richard Haselgrove Volunteer tester Send message Joined: 4 Jul 99 Posts: 14650 Credit: 200,643,578 RAC: 874	Message 1898778 - Posted: 2 Nov 2017, 22:57:53 UTC - in response to Message 1898775. That's why I suspect there's something odd about the way Keith's Linux client is doing the requesting, which causes the server to fall over with an error. And of course if the server daemon falls over, it has to restart and re-cache whatever it held in memory - that'll slow things down. Keith's log contained: 690 11/2/2017 13:31:02 [http] [ID#0] Sent header to server: Ã¿ 702 SETI@home 11/2/2017 13:31:02 [http] [ID#1] Sent header to server: t (x86_64-pc-linux-gnu 7.8.3) 704 SETI@home 11/2/2017 13:31:02 [http] [ID#1] Sent header to server: Ac ID: 1898778 ·

HAL9000 Volunteer tester Send message Joined: 11 Sep 99 Posts: 6534 Credit: 196,805,888 RAC: 57	Message 1898782 - Posted: 2 Nov 2017, 23:54:48 UTC - in response to Message 1898778. That's why I suspect there's something odd about the way Keith's Linux client is doing the requesting, which causes the server to fall over with an error. And of course if the server daemon falls over, it has to restart and re-cache whatever it held in memory - that'll slow things down. Keith's log contained: 690 11/2/2017 13:31:02 [http] [ID#0] Sent header to server: Ã¿ 702 SETI@home 11/2/2017 13:31:02 [http] [ID#1] Sent header to server: t (x86_64-pc-linux-gnu 7.8.3) 704 SETI@home 11/2/2017 13:31:02 [http] [ID#1] Sent header to server: Ac I wonder if while the daemon is recovering the feeder queue will report being empty and is related to their high rate of "Project has no tasks available" responses when requesting work. SETI@home classic workunits: 93,865 CPU time: 863,447 hours Join the [url=http://tinyurl.com/8y46zvu]BP6/VP6 User Group[ ID: 1898782 ·

Keith Myers Volunteer tester Send message Joined: 29 Apr 01 Posts: 13164 Credit: 1,160,866,277 RAC: 1,873	Message 1898784 - Posted: 3 Nov 2017, 0:11:32 UTC - in response to Message 1898778. That was caused by the copy/paste from the remote BT server. It wasn't showing those characters in the machine log itself. Seti@Home classic workunits:20,676 CPU time:74,226 hours A proud member of the OFA (Old Farts Association) ID: 1898784 ·

Keith Myers Volunteer tester Send message Joined: 29 Apr 01 Posts: 13164 Credit: 1,160,866,277 RAC: 1,873	Message 1898785 - Posted: 3 Nov 2017, 0:12:18 UTC I have shut down the machine and restarted it a couple times now. It hasn't changed the symptom at all. Seti@Home classic workunits:20,676 CPU time:74,226 hours A proud member of the OFA (Old Farts Association) ID: 1898785 ·

Keith Myers Volunteer tester Send message Joined: 29 Apr 01 Posts: 13164 Credit: 1,160,866,277 RAC: 1,873	Message 1898787 - Posted: 3 Nov 2017, 0:17:03 UTC - in response to Message 1898782. That's why I suspect there's something odd about the way Keith's Linux client is doing the requesting, which causes the server to fall over with an error. And of course if the server daemon falls over, it has to restart and re-cache whatever it held in memory - that'll slow things down. Keith's log contained: 690 11/2/2017 13:31:02 [http] [ID#0] Sent header to server: Ã¿ 702 SETI@home 11/2/2017 13:31:02 [http] [ID#1] Sent header to server: t (x86_64-pc-linux-gnu 7.8.3) 704 SETI@home 11/2/2017 13:31:02 [http] [ID#1] Sent header to server: Ac I wonder if while the daemon is recovering the feeder queue will report being empty and is related to their high rate of "Project has no tasks available" responses when requesting work. The other machines haven't received the internal server error message today. They are Windows machines. I have received the error message on all machines in the past month, the Win10 machine more so; it is a high-production machine too that processes a lot of work fast each day. Not as fast as the Linux machine, of course. All machines are down on work, with everyone getting the "no work is available" response from the servers. Seti@Home classic workunits:20,676 CPU time:74,226 hours A proud member of the OFA (Old Farts Association) ID: 1898787 ·

TBar Volunteer tester Send message Joined: 22 May 99 Posts: 5204 Credit: 840,779,836 RAC: 2,768	Message 1898791 - Posted: 3 Nov 2017, 0:39:03 UTC - in response to Message 1898787. I suppose when our machines run out of work in a short while we can all sit around and pretend there's nothing wrong with the Server. All my machines are Low with 2 getting Very Low. Increasing the cache didn't work this time. You can see it on the SSP as well. Both the Results out in the field & Results received in last hour have dropped well below the recent normal levels. All I'm getting is:
Thu Nov 2 20:36:48 2017 | SETI@home | Requesting new tasks for CPU and NVIDIA GPU
Thu Nov 2 20:36:48 2017 | SETI@home | [sched_op] CPU work request: 201653.26 seconds; 0.00 devices
Thu Nov 2 20:36:48 2017 | SETI@home | [sched_op] NVIDIA GPU work request: 516992.53 seconds; 0.00 devices
Thu Nov 2 20:36:51 2017 | SETI@home | Scheduler request completed: got 0 new tasks
Thu Nov 2 20:36:51 2017 | SETI@home | [sched_op] Server version 707
Thu Nov 2 20:36:51 2017 | SETI@home | Project has no tasks available
Thu Nov 2 20:36:51 2017 | SETI@home | Project requested delay of 303 seconds
Thu Nov 2 20:36:51 2017 | SETI@home | [sched_op] Deferring communication for 00:05:03
Thu Nov 2 20:36:51 2017 | SETI@home | [sched_op] Reason: requested by project
Over and over again... ID: 1898791 ·
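
The `[sched_op]` work-request figures in the log above are in seconds, which makes them hard to compare with a cache setting in days. A small Python sketch that pulls them out and converts them follows. The regular expression is written against the two request lines quoted in the post and is an assumption about the format, not a general parser for BOINC event logs; the function name is invented for the example.

```python
import re
from typing import Dict

SECONDS_PER_DAY = 86_400

# Matches e.g. "[sched_op] NVIDIA GPU work request: 516992.53 seconds"
REQUEST_RE = re.compile(r"\[sched_op\]\s+(.+?) work request: ([0-9.]+) seconds")


def work_requests_in_days(log_text: str) -> Dict[str, float]:
    """Map each resource named in a work-request line to the request in days."""
    return {
        resource: float(seconds) / SECONDS_PER_DAY
        for resource, seconds in REQUEST_RE.findall(log_text)
    }


SAMPLE = """\
Thu Nov 2 20:36:48 2017 | SETI@home | Requesting new tasks for CPU and NVIDIA GPU
Thu Nov 2 20:36:48 2017 | SETI@home | [sched_op] CPU work request: 201653.26 seconds; 0.00 devices
Thu Nov 2 20:36:48 2017 | SETI@home | [sched_op] NVIDIA GPU work request: 516992.53 seconds; 0.00 devices
Thu Nov 2 20:36:51 2017 | SETI@home | Scheduler request completed: got 0 new tasks
"""

if __name__ == "__main__":
    for resource, days in work_requests_in_days(SAMPLE).items():
        print(f"{resource}: {days:.2f} days")
```

For the quoted lines this works out to a little over 2.3 days of CPU work and just under 6 days of GPU work being requested, summed across devices.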

Keith Myers Volunteer tester Send message Joined: 29 Apr 01 Posts: 13164 Credit: 1,160,866,277 RAC: 1,873	Message 1898792 - Posted: 3 Nov 2017, 0:39:45 UTC I just tried an update on the Linux machine to override the backoff caused by the server error message. Looks like they might have straightened out the servers a bit. I am getting a proper response now. Just the normal "no work is available" message that everyone's been getting today when requesting work. Darksider 2680 SETI@home 11/2/2017 17:35:49 Sending scheduler request: To fetch work. 2681 SETI@home 11/2/2017 17:35:49 Requesting new tasks for CPU and NVIDIA GPU 2682 SETI@home 11/2/2017 17:35:51 Scheduler request completed: got 0 new tasks 2683 SETI@home 11/2/2017 17:35:51 Project has no tasks available Seti@Home classic workunits:20,676 CPU time:74,226 hours A proud member of the OFA (Old Farts Association) ID: 1898792 ·

Keith Myers Volunteer tester Send message Joined: 29 Apr 01 Posts: 13164 Credit: 1,160,866,277 RAC: 1,873	Message 1898795 - Posted: 3 Nov 2017, 0:50:25 UTC Sheesh! The RTS buffer is up over 800K tasks! And nobody is getting any of them. The splitters have run amok. You would think they have a process that tells the splitters to back off and stop once you reach a prescribed buffer threshold. Seti@Home classic workunits:20,676 CPU time:74,226 hours A proud member of the OFA (Old Farts Association) ID: 1898795 ·

RueiKe Volunteer tester Send message Joined: 14 Feb 16 Posts: 492 Credit: 378,512,430 RAC: 785	Message 1898796 - Posted: 3 Nov 2017, 1:00:33 UTC - in response to Message 1898785. I have shut down the machine and restarted it a couple times now. It hasn't changed the symptom at all. My Linux machine is currently without work also. https://setiathome.berkeley.edu/results.php?hostid=8365846 The 437 in progress are ghosts from when I was fumbling around to get the machine up. My other systems are also low on work. ID: 1898796 ·

Brent Norman Volunteer tester Send message Joined: 1 Dec 99 Posts: 2786 Credit: 685,657,289 RAC: 835	Message 1898798 - Posted: 3 Nov 2017, 1:04:02 UTC I managed to hold on for the last 4 hours with a few dribbles coming in, but my 1080s ran dry. Moved 53 CPU tasks over; that should hold for 30 minutes, LOL. ID: 1898798 ·
