Panic Mode On (115) Server Problems?
Author | Message |
---|---|
Wiggo Send message Joined: 24 Jan 00 Posts: 36385 Credit: 261,360,520 RAC: 489 |
All back to business as usual here now. Cheers. |
Cactus Bob Send message Joined: 19 May 99 Posts: 209 Credit: 10,924,287 RAC: 29 |
Just checked the SSP and the RTS is at 69. Creation rate is 53/s. Not good. Lucky for me I have a low RAC, so I will probably not run out before the servers catch up and stabilize. Bob Sometimes I wonder, what happened to all the people I gave directions to? ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ SETI@home classic workunits 4,321 SETI@home classic CPU time 22,169 hours |
Kissagogo27 Send message Joined: 6 Nov 99 Posts: 716 Credit: 8,032,827 RAC: 62 |
Some download errors overnight (UTC+1 here), but uploads were fine.
|
Bill Send message Joined: 30 Nov 05 Posts: 282 Credit: 6,916,194 RAC: 60 |
So I had a problem in the old Panic Mode On thread (message located here) where, after the maintenance shutdown, I would have a full slug of CPU tasks but would only get one GPU task at a time. Sometimes no GPU tasks come through at all, and other projects with a resource share of 0 start crunching GPU tasks. At the time I had several AP7 GPU tasks that were stopping due to computation errors, but I haven't had that problem for a few weeks now. I don't see how the server could be restricting me because of that. Any thoughts? I did see a line in the event log that made me curious. I searched for the line on the forums, but the only hit was from nine years ago, and I'm not sure it is even valid anymore. I'm not in front of that computer, but when I am I'll post what I discovered. Seti@home classic: 1,456 results, 1.613 years CPU time |
Brent Norman Send message Joined: 1 Dec 99 Posts: 2786 Credit: 685,657,289 RAC: 835 |
Which event log entry is of concern? Please post startup. Which computer? For one, I would try suspending all other projects for a while so you are only requesting from SETI and see if that changes anything. |
Bill Send message Joined: 30 Nov 05 Posts: 282 Credit: 6,916,194 RAC: 60 |
Let me pull up the event log this evening when I have a moment. I'll post information then. Computer this is happening to is here. Seti@home classic: 1,456 results, 1.613 years CPU time |
rob smith Send message Joined: 7 Mar 03 Posts: 22455 Credit: 416,307,556 RAC: 380 |
You appear to be using an unreleased version of BOINC on that computer - is that the one you were working with Richard to trace the crazy peak_flops problem on? Bob Smith Member of Seti PIPPS (Pluto is a Planet Protest Society) Somewhere in the (un)known Universe? |
Keith Myers Send message Joined: 29 Apr 01 Posts: 13164 Credit: 1,160,866,277 RAC: 1,873 |
Likely is. I think he is running the beta client from appveyor that Richard had him test. If so, it is likely it is from the current 7.15.0 master branch and that branch has a serious problem with work fetch caused by a commit that was developed to resolve a bug I filed. Fixed my bug but broke work fetch badly. Anything is possible now with the current master branch with inability to get work for various situations. Seti@Home classic workunits:20,676 CPU time:74,226 hours A proud member of the OFA (Old Farts Association) |
Bill Send message Joined: 30 Nov 05 Posts: 282 Credit: 6,916,194 RAC: 60 |
[quote]Likely is. I think he is running the beta client from appveyor that Richard had him test. If so, it is likely it is from the current 7.15.0 master branch and that branch has a serious problem with work fetch caused by a commit that was developed to resolve a bug I filed. Fixed my bug but broke work fetch badly. Anything is possible now with the current master branch with inability to get work for various situations.[/quote] Yeah, good point, I'm running the unreleased version of BOINC. I suppose troubleshooting this might be wasting time. I'm still crunching, I just don't have the backlog that one would normally like. Seti@home classic: 1,456 results, 1.613 years CPU time |
Keith Myers Send message Joined: 29 Apr 01 Posts: 13164 Credit: 1,160,866,277 RAC: 1,873 |
If you have any sort of max_concurrent statement on any project, that is what is likely interfering with work fetch. Or if you have a large REC debt to other projects. DA wrote the code to not request work if any project had an abundance of work onboard to complete. So you will likely not be able to maintain any sort of normal cache. The way the client now works is more of a just in time system. Seti@Home classic workunits:20,676 CPU time:74,226 hours A proud member of the OFA (Old Farts Association) |
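For anyone who wants to rule out the first point, here is a minimal Python sketch that scans the text of an app_config.xml for max_concurrent or project_max_concurrent entries. It assumes the file follows the usual app_config.xml layout. The function name and the sample app name are hypothetical, made up purely for illustration.

```python
import xml.etree.ElementTree as ET

# Element names that cap how many tasks run at once in app_config.xml.
LIMIT_TAGS = ("max_concurrent", "project_max_concurrent")


def find_max_concurrent(app_config_xml):
    """Return (tag, value) pairs for every concurrency limit in the XML text.

    Hypothetical helper: it only reads the text it is given and never
    touches the BOINC client or its data directory.
    """
    root = ET.fromstring(app_config_xml)
    found = []
    for elem in root.iter():
        if elem.tag in LIMIT_TAGS:
            found.append((elem.tag, (elem.text or "").strip()))
    return found


if __name__ == "__main__":
    # "example_app" is a made-up app name, not a real SETI@home application.
    sample = (
        "<app_config><app><name>example_app</name>"
        "<max_concurrent>2</max_concurrent></app></app_config>"
    )
    print(find_max_concurrent(sample))
```

An empty list means no such statement is present in that file, which matches what Bill reports for his own setup.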
Bill Send message Joined: 30 Nov 05 Posts: 282 Credit: 6,916,194 RAC: 60 |
[quote]If you have any sort of max_concurrent statement on any project, that is what is likely interfering with work fetch. Or if you have a large REC debt to other projects. DA wrote the code to not request work if any project had an abundance of work onboard to complete. So you will likely not be able to maintain any sort of normal cache. The way the client now works is more of a just in time system.[/quote] I do not have any max_concurrent, so we're good there. I did just realize I have resource share for seti beta set to 100, but I have that project suspended since I'm done testing for now. I have set the resource share to 0 just in case...but this problem was happening before I used the beta. As for my event log, this is the typical verbiage I get when attempting to get more tasks:
2/21/2019 4:33:21 PM | SETI@home | Requesting new tasks for CPU
2/21/2019 4:33:22 PM | SETI@home | Scheduler request completed: got 0 new tasks
2/21/2019 4:33:22 PM | SETI@home | No tasks sent
2/21/2019 4:33:22 PM | SETI@home | No tasks are available for AstroPulse v7
2/21/2019 4:33:22 PM | SETI@home | No tasks are available for SETI@home v8
2/21/2019 4:33:22 PM | SETI@home | Tasks for Intel GPU are available, but your preferences are set to not accept them
2/21/2019 4:33:22 PM | SETI@home | This computer has reached a limit on tasks in progress
That is when it is looking for more CPU work. When a GPU task completes, this is all I get:
2/21/2019 4:28:14 PM | SETI@home | Reporting 1 completed tasks
2/21/2019 4:28:14 PM | SETI@home | Requesting new tasks for CPU and AMD/ATI GPU
2/21/2019 4:28:16 PM | SETI@home | Scheduler request completed: got 1 new tasks
Seti@home classic: 1,456 results, 1.613 years CPU time |
Richard Haselgrove Send message Joined: 4 Jul 99 Posts: 14674 Credit: 200,643,578 RAC: 874 |
I suggest you add "sched_op_debug" to your Event Log options. It's relatively quiet, but it shows much more clearly what is being requested (or not) at each scheduler contact. |
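The flag can be ticked in the Event Log options dialog, but it can also be set in cc_config.xml. Below is a minimal Python sketch, assuming the standard cc_config layout with a log_flags section, that turns the flag on in the text of such a file. The function name is hypothetical, and the sketch works on a string rather than on the real file, so locating cc_config.xml in the BOINC data directory, writing it back and getting the client to re-read it are left out.

```python
import xml.etree.ElementTree as ET


def enable_log_flag(cc_config_xml, flag="sched_op_debug"):
    """Return the cc_config XML text with <flag>1</flag> set under <log_flags>.

    Hypothetical helper for illustration: it edits a string only and does
    not contact the BOINC client.
    """
    root = ET.fromstring(cc_config_xml)
    if root.tag != "cc_config":
        raise ValueError("expected a <cc_config> root element")

    # Create the <log_flags> section if the file does not have one yet.
    log_flags = root.find("log_flags")
    if log_flags is None:
        log_flags = ET.SubElement(root, "log_flags")

    # Create the flag element if needed, then switch it on.
    elem = log_flags.find(flag)
    if elem is None:
        elem = ET.SubElement(log_flags, flag)
    elem.text = "1"

    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    existing = "<cc_config><log_flags><task>1</task></log_flags></cc_config>"
    print(enable_log_flag(existing))
```

Existing flags in the file are left untouched; only the requested one is added or overwritten.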
Keith Myers Send message Joined: 29 Apr 01 Posts: 13164 Credit: 1,160,866,277 RAC: 1,873 |
[quote]I suggest you add "sched_op_debug" to your Event Log options. It's relatively quiet, but it shows much more clearly what is being requested (or not) at each scheduler contact.[/quote] +1 Always run that option myself for the log. Seti@Home classic workunits:20,676 CPU time:74,226 hours A proud member of the OFA (Old Farts Association) |
Bill Send message Joined: 30 Nov 05 Posts: 282 Credit: 6,916,194 RAC: 60 |
[quote]I suggest you add "sched_op_debug" to your Event Log options. It's relatively quiet, but it shows much more clearly what is being requested (or not) at each scheduler contact.[/quote] So before I retire for the evening, this is what has popped up so far:
2/21/2019 10:00:37 PM | SETI@home | [sched_op] CPU work request: 1615101.02 seconds; 0.00 devices
2/21/2019 10:00:37 PM | SETI@home | [sched_op] AMD/ATI GPU work request: 0.00 seconds; 0.00 devices
2/21/2019 10:00:38 PM | SETI@home | [sched_op] Server version 709
I will post more tomorrow. Seti@home classic: 1,456 results, 1.613 years CPU time |
Keith Myers Send message Joined: 29 Apr 01 Posts: 13164 Credit: 1,160,866,277 RAC: 1,873 |
Yep, you aren't asking for gpu work. I never saw a notice about Eric updating the Seti servers with the updated software for the schedulers you tested at Beta. But you now have an established APR for gpu work here at Main. That should prevent you from getting any more -197 errors because of an incorrectly calculated time to completion. So you could revert back to the stock 7.14.2 client and see if you can process gpu work now. You might want to run down your cache to only a dozen or so tasks for the experiment. That way you won't throw away a ton of tasks if the error is still present on the stock client now that you have a reasonable APR. Seti@Home classic workunits:20,676 CPU time:74,226 hours A proud member of the OFA (Old Farts Association) |
Unixchick Send message Joined: 5 Mar 12 Posts: 815 Credit: 2,361,516 RAC: 22 |
The status page isn't updating, so something might be going wrong. I hope by posting about it, it will magically fix itself. Maybe it is just a small hiccup. Is there a schedule of when hiccups happen? |
Richard Haselgrove Send message Joined: 4 Jul 99 Posts: 14674 Credit: 200,643,578 RAC: 874 |
[quote]Yep, you aren't asking for gpu work. I never saw a notice about Eric updating the Seti servers with the updated software for the schedulers you tested at Beta.[/quote] I haven't heard about any software update either, although we're working to get the patch into the official server release code (#3027). Bill only has an established APR for MB work here - he's had that all along, before all this started. It was AP work that started the trouble, and that still needs the patch. |
Tom M Send message Joined: 28 Nov 02 Posts: 5126 Credit: 276,046,078 RAC: 462 |
[quote]Is there a schedule of when hiccups happen?[/quote] Of course there is :) I believe they input a random number table from a textbook and have been invoking it systematically, at random, and throwing those results as a seed into a random generator. This produces that highly reliable schedule of random events :) Sorry, just fooling around. It's Friday!!! Tom A proud member of the OFA (Old Farts Association). |
Bill Send message Joined: 30 Nov 05 Posts: 282 Credit: 6,916,194 RAC: 60 |
So I think most of today S@H was not requesting GPU work. I seem to keep getting info such as this:
2/22/2019 9:12:03 PM | SETI@home | [sched_op] CPU work request: 1619990.50 seconds; 0.00 devices
2/22/2019 9:12:03 PM | SETI@home | [sched_op] AMD/ATI GPU work request: 0.00 seconds; 0.00 devices
However, I did have E@H enabled as a 0 work load project, and it kept processing GPU tasks for that. I suspended E@H, and then got this:
2/22/2019 9:22:17 PM | SETI@home | [sched_op] CPU work request: 1621367.64 seconds; 0.00 devices
2/22/2019 9:22:17 PM | SETI@home | [sched_op] AMD/ATI GPU work request: 1.00 seconds; 1.00 devices
2/22/2019 9:22:19 PM | SETI@home | Scheduler request completed: got 1 new tasks
2/22/2019 9:22:19 PM | SETI@home | [sched_op] Server version 709
2/22/2019 9:22:19 PM | SETI@home | Project requested delay of 303 seconds
2/22/2019 9:22:19 PM | SETI@home | [sched_op] estimated total CPU task duration: 0 seconds
2/22/2019 9:22:19 PM | SETI@home | [sched_op] estimated total AMD/ATI GPU task duration: 1447 seconds
And I finally was able to download a single GPU task (amazing!). I'll let this run overnight like this and see what happens. Seti@home classic: 1,456 results, 1.613 years CPU time |
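The [sched_op] lines Bill posted have a regular shape, so a short script can pull out what was requested at each scheduler contact. This is only a sketch in Python, with the line format inferred from the excerpts in this thread; other client versions may word these messages differently, and the function names are hypothetical.

```python
import re

# Pattern inferred from the "[sched_op] ... work request" lines quoted above.
WORK_REQUEST = re.compile(
    r"\[sched_op\] (?P<resource>.+?) work request: "
    r"(?P<seconds>[\d.]+) seconds; (?P<devices>[\d.]+) devices"
)


def parse_work_requests(log_text):
    """Return (resource, seconds, devices) for every work request line."""
    requests = []
    for line in log_text.splitlines():
        match = WORK_REQUEST.search(line)
        if match:
            requests.append(
                (
                    match["resource"],
                    float(match["seconds"]),
                    float(match["devices"]),
                )
            )
    return requests


def resources_not_requested(log_text):
    """Names of resources for which zero seconds and zero devices were asked."""
    return sorted(
        {
            resource
            for resource, seconds, devices in parse_work_requests(log_text)
            if seconds == 0 and devices == 0
        }
    )


if __name__ == "__main__":
    # Two lines copied from the event log excerpt above.
    sample = (
        "2/22/2019 9:12:03 PM | SETI@home | [sched_op] CPU work request: "
        "1619990.50 seconds; 0.00 devices\n"
        "2/22/2019 9:12:03 PM | SETI@home | [sched_op] AMD/ATI GPU work request: "
        "0.00 seconds; 0.00 devices\n"
    )
    print(resources_not_requested(sample))
```

Run over a longer stretch of the log, this makes it easy to see at which contacts the client asked for no GPU work at all.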
Stephen "Heretic" Send message Joined: 20 Sep 12 Posts: 5557 Credit: 192,787,363 RAC: 628 |
. . I have a small issue someone might be able to help me with. During the kerfuffle when we lost the storage device, along with a great number of failed downloads and later failed validations due to missing files, I had one other casualty. One task which had completed went through the upload process but failed to complete, yet apparently went far enough that the result files were removed from my drive and the transfer from my file transfer screen, but it remains in the Manager list as "uploading". I presume this simply means that the client didn't update the client_state.xml to remove the listing by marking it as reported. I could not abort the transfer because it had been removed. I cannot abort the task because its state is "being uploaded". I tried the ghost recovery protocol but this achieved nothing (possibly because there were no actual ghosted tasks) and even tried the benchmarking option in the hope it would download the master file, but none of these methods worked. I gave up and waited for it to be removed from the system, hoping that might somehow trigger an accounting for the task, but it is still there. . . So does anyone have any ideas on how to get Boinc Manager to 'realise' that this task is no longer here? I am sick of seeing it sitting there taunting me ... :( Stephen :) |