Panic Mode On (111) Server Problems?

Author	Message
TBar Volunteer tester Send message Joined: 22 May 99 Posts: 5204 Credit: 840,779,836 RAC: 2,768	Message 1928156 - Posted: 5 Apr 2018, 19:46:20 UTC - in response to Message 1928139. Last modified: 5 Apr 2018, 20:06:33 UTC Probably someone collected those Tasks for CPU are available, but your preferences are set to not accept them Tasks for AMD/ATI GPU are available, but your preferences are set to not accept them Tasks for Intel GPU are available, but your preferences are set to not accept them that were cluttering up the feeder cache. Perhaps 'Tasks for CPU' were available, but because you weren't asking for any, and your preferences (may?) allow them, there was no point in sending that message. Hmmm, are you saying those messages are bouncing around inside the server creating problems? I've never received any of those messages, and since I run Anonymous platform I really don't have any need to change those Preferences as I just list the Apps I'm using in the app_info. I'm running 2 CPU tasks and 3 nVidia GPUs on that machine, it asks for GPU tasks much more often than CPU tasks. It's also back to not being sent much work in the last hour or so. It has around 50 CPU tasks onboard, meaning right now it's down around 70 GPU tasks. It does receive some tasks every so often, but not enough to replace the completed tasks: Thu Apr 5 14:37:54 2018 \| SETI@home \| [sched_op] Starting scheduler request Thu Apr 5 14:37:59 2018 \| SETI@home \| Sending scheduler request: To report completed tasks. 
Thu Apr 5 14:37:59 2018 \| SETI@home \| Reporting 5 completed tasks Thu Apr 5 14:37:59 2018 \| SETI@home \| Requesting new tasks for NVIDIA GPU Thu Apr 5 14:37:59 2018 \| SETI@home \| [sched_op] CPU work request: 0.00 seconds; 0.00 devices Thu Apr 5 14:37:59 2018 \| SETI@home \| [sched_op] NVIDIA GPU work request: 220997.64 seconds; 0.00 devices Thu Apr 5 14:38:00 2018 \| SETI@home \| Scheduler request completed: got 0 new tasks Thu Apr 5 14:38:00 2018 \| SETI@home \| No tasks sent Thu Apr 5 14:43:08 2018 \| SETI@home \| [sched_op] Starting scheduler request Thu Apr 5 14:43:13 2018 \| SETI@home \| Sending scheduler request: To report completed tasks. Thu Apr 5 14:43:13 2018 \| SETI@home \| Reporting 7 completed tasks Thu Apr 5 14:43:13 2018 \| SETI@home \| Requesting new tasks for NVIDIA GPU Thu Apr 5 14:43:13 2018 \| SETI@home \| [sched_op] CPU work request: 0.00 seconds; 0.00 devices Thu Apr 5 14:43:13 2018 \| SETI@home \| [sched_op] NVIDIA GPU work request: 222353.24 seconds; 0.00 devices Thu Apr 5 14:43:14 2018 \| SETI@home \| Scheduler request completed: got 0 new tasks Thu Apr 5 14:43:14 2018 \| SETI@home \| No tasks sent Thu Apr 5 14:48:18 2018 \| SETI@home \| [sched_op] Starting scheduler request Thu Apr 5 14:48:23 2018 \| SETI@home \| Sending scheduler request: To report completed tasks. Thu Apr 5 14:48:23 2018 \| SETI@home \| Reporting 5 completed tasks Thu Apr 5 14:48:23 2018 \| SETI@home \| Requesting new tasks for NVIDIA GPU Thu Apr 5 14:48:23 2018 \| SETI@home \| [sched_op] CPU work request: 0.00 seconds; 0.00 devices Thu Apr 5 14:48:23 2018 \| SETI@home \| [sched_op] NVIDIA GPU work request: 223388.41 seconds; 0.00 devices Thu Apr 5 14:48:24 2018 \| SETI@home \| Scheduler request completed: got 0 new tasks Thu Apr 5 14:48:24 2018 \| SETI@home \| No tasks sent Thu Apr 5 14:53:27 2018 \| SETI@home \| [sched_op] Starting scheduler request Thu Apr 5 14:53:32 2018 \| SETI@home \| Sending scheduler request: To report completed tasks. 
Thu Apr 5 14:53:32 2018 \| SETI@home \| Reporting 7 completed tasks Thu Apr 5 14:53:32 2018 \| SETI@home \| Requesting new tasks for NVIDIA GPU Thu Apr 5 14:53:32 2018 \| SETI@home \| [sched_op] CPU work request: 0.00 seconds; 0.00 devices Thu Apr 5 14:53:32 2018 \| SETI@home \| [sched_op] NVIDIA GPU work request: 224525.77 seconds; 0.00 devices Thu Apr 5 14:53:33 2018 \| SETI@home \| Scheduler request completed: got 0 new tasks Thu Apr 5 14:53:33 2018 \| SETI@home \| No tasks sent Thu Apr 5 14:58:36 2018 \| SETI@home \| [sched_op] Starting scheduler request Thu Apr 5 14:58:41 2018 \| SETI@home \| Sending scheduler request: To report completed tasks. Thu Apr 5 14:58:41 2018 \| SETI@home \| Reporting 10 completed tasks Thu Apr 5 14:58:41 2018 \| SETI@home \| Requesting new tasks for NVIDIA GPU Thu Apr 5 14:58:41 2018 \| SETI@home \| [sched_op] CPU work request: 0.00 seconds; 0.00 devices Thu Apr 5 14:58:41 2018 \| SETI@home \| [sched_op] NVIDIA GPU work request: 225665.87 seconds; 0.00 devices Thu Apr 5 14:58:42 2018 \| SETI@home \| Scheduler request completed: got 0 new tasks Thu Apr 5 14:58:42 2018 \| SETI@home \| No tasks sent Thu Apr 5 15:03:46 2018 \| SETI@home \| [sched_op] Starting scheduler request Thu Apr 5 15:03:51 2018 \| SETI@home \| Sending scheduler request: To report completed tasks. Thu Apr 5 15:03:51 2018 \| SETI@home \| Reporting 5 completed tasks Thu Apr 5 15:03:51 2018 \| SETI@home \| Requesting new tasks for NVIDIA GPU Thu Apr 5 15:03:51 2018 \| SETI@home \| [sched_op] CPU work request: 0.00 seconds; 0.00 devices Thu Apr 5 15:03:51 2018 \| SETI@home \| [sched_op] NVIDIA GPU work request: 226879.65 seconds; 0.00 devices Thu Apr 5 15:03:52 2018 \| SETI@home \| Scheduler request completed: got 0 new tasks Thu Apr 5 15:03:52 2018 \| SETI@home \| No tasks sent Thu Apr 5 15:09:00 2018 \| SETI@home \| [sched_op] Starting scheduler request Thu Apr 5 15:09:05 2018 \| SETI@home \| Sending scheduler request: To report completed tasks. 
Thu Apr 5 15:09:05 2018 \| SETI@home \| Reporting 5 completed tasks Thu Apr 5 15:09:05 2018 \| SETI@home \| Requesting new tasks for NVIDIA GPU Thu Apr 5 15:09:05 2018 \| SETI@home \| [sched_op] CPU work request: 0.00 seconds; 0.00 devices Thu Apr 5 15:09:05 2018 \| SETI@home \| [sched_op] NVIDIA GPU work request: 228181.81 seconds; 0.00 devices Thu Apr 5 15:09:06 2018 \| SETI@home \| Scheduler request completed: got 0 new tasks Thu Apr 5 15:09:06 2018 \| SETI@home \| No tasks sent Thu Apr 5 15:14:19 2018 \| SETI@home \| [sched_op] Starting scheduler request Thu Apr 5 15:14:24 2018 \| SETI@home \| Sending scheduler request: To report completed tasks. Thu Apr 5 15:14:24 2018 \| SETI@home \| Reporting 6 completed tasks Thu Apr 5 15:14:24 2018 \| SETI@home \| Requesting new tasks for NVIDIA GPU Thu Apr 5 15:14:24 2018 \| SETI@home \| [sched_op] CPU work request: 0.00 seconds; 0.00 devices Thu Apr 5 15:14:24 2018 \| SETI@home \| [sched_op] NVIDIA GPU work request: 229321.23 seconds; 0.00 devices Thu Apr 5 15:14:25 2018 \| SETI@home \| Scheduler request completed: got 0 new tasks Thu Apr 5 15:14:25 2018 \| SETI@home \| No tasks sent Thu Apr 5 15:19:29 2018 \| SETI@home \| [sched_op] Starting scheduler request Thu Apr 5 15:19:34 2018 \| SETI@home \| Sending scheduler request: To report completed tasks. Thu Apr 5 15:19:34 2018 \| SETI@home \| Reporting 6 completed tasks Thu Apr 5 15:19:34 2018 \| SETI@home \| Requesting new tasks for NVIDIA GPU Thu Apr 5 15:19:34 2018 \| SETI@home \| [sched_op] CPU work request: 0.00 seconds; 0.00 devices Thu Apr 5 15:19:34 2018 \| SETI@home \| [sched_op] NVIDIA GPU work request: 230393.37 seconds; 0.00 devices Thu Apr 5 15:19:35 2018 \| SETI@home \| Scheduler request completed: got 0 new tasks Thu Apr 5 15:19:35 2018 \| SETI@home \| No tasks sent Thu Apr 5 15:24:38 2018 \| SETI@home \| [sched_op] Starting scheduler request Thu Apr 5 15:24:43 2018 \| SETI@home \| Sending scheduler request: To report completed tasks. 
Thu Apr 5 15:24:43 2018 \| SETI@home \| Reporting 8 completed tasks Thu Apr 5 15:24:43 2018 \| SETI@home \| Requesting new tasks for NVIDIA GPU Thu Apr 5 15:24:43 2018 \| SETI@home \| [sched_op] CPU work request: 0.00 seconds; 0.00 devices Thu Apr 5 15:24:43 2018 \| SETI@home \| [sched_op] NVIDIA GPU work request: 231670.33 seconds; 0.00 devices Thu Apr 5 15:24:45 2018 \| SETI@home \| Scheduler request completed: got 4 new tasks Thu Apr 5 15:24:45 2018 \| SETI@home \| [sched_op] estimated total CPU task duration: 0 seconds Thu Apr 5 15:24:45 2018 \| SETI@home \| [sched_op] estimated total NVIDIA GPU task duration: 425 seconds Thu Apr 5 15:29:52 2018 \| SETI@home \| [sched_op] Starting scheduler request Thu Apr 5 15:29:57 2018 \| SETI@home \| Sending scheduler request: To report completed tasks. Thu Apr 5 15:29:57 2018 \| SETI@home \| Reporting 4 completed tasks Thu Apr 5 15:29:57 2018 \| SETI@home \| Requesting new tasks for NVIDIA GPU Thu Apr 5 15:29:57 2018 \| SETI@home \| [sched_op] CPU work request: 0.00 seconds; 0.00 devices Thu Apr 5 15:29:57 2018 \| SETI@home \| [sched_op] NVIDIA GPU work request: 232270.98 seconds; 0.00 devices Thu Apr 5 15:29:58 2018 \| SETI@home \| Scheduler request completed: got 0 new tasks Thu Apr 5 15:29:58 2018 \| SETI@home \| No tasks sent Thu Apr 5 15:35:06 2018 \| SETI@home \| [sched_op] Starting scheduler request Thu Apr 5 15:35:11 2018 \| SETI@home \| Sending scheduler request: To report completed tasks. 
Thu Apr 5 15:35:11 2018 \| SETI@home \| Reporting 6 completed tasks Thu Apr 5 15:35:11 2018 \| SETI@home \| Requesting new tasks for NVIDIA GPU Thu Apr 5 15:35:11 2018 \| SETI@home \| [sched_op] CPU work request: 0.00 seconds; 0.00 devices Thu Apr 5 15:35:11 2018 \| SETI@home \| [sched_op] NVIDIA GPU work request: 233431.68 seconds; 0.00 devices Thu Apr 5 15:35:12 2018 \| SETI@home \| Scheduler request completed: got 0 new tasks Thu Apr 5 15:35:12 2018 \| SETI@home \| No tasks sent Thu Apr 5 15:40:20 2018 \| SETI@home \| [sched_op] Starting scheduler request Thu Apr 5 15:40:25 2018 \| SETI@home \| Sending scheduler request: To report completed tasks. Thu Apr 5 15:40:25 2018 \| SETI@home \| Reporting 7 completed tasks Thu Apr 5 15:40:25 2018 \| SETI@home \| Requesting new tasks for NVIDIA GPU Thu Apr 5 15:40:25 2018 \| SETI@home \| [sched_op] CPU work request: 0.00 seconds; 0.00 devices Thu Apr 5 15:40:25 2018 \| SETI@home \| [sched_op] NVIDIA GPU work request: 234552.87 seconds; 0.00 devices Thu Apr 5 15:40:26 2018 \| SETI@home \| Scheduler request completed: got 0 new tasks Thu Apr 5 15:40:26 2018 \| SETI@home \| No tasks sent Thu Apr 5 15:45:34 2018 \| SETI@home \| [sched_op] Starting scheduler request Thu Apr 5 15:45:39 2018 \| SETI@home \| Sending scheduler request: To report completed tasks. Thu Apr 5 15:45:39 2018 \| SETI@home \| Reporting 6 completed tasks Thu Apr 5 15:45:39 2018 \| SETI@home \| Requesting new tasks for NVIDIA GPU Thu Apr 5 15:45:39 2018 \| SETI@home \| [sched_op] CPU work request: 0.00 seconds; 0.00 devices Thu Apr 5 15:45:39 2018 \| SETI@home \| [sched_op] NVIDIA GPU work request: 235729.31 seconds; 0.00 devices Thu Apr 5 15:45:40 2018 \| SETI@home \| Scheduler request completed: got 0 new tasks Thu Apr 5 15:45:40 2018 \| SETI@home \| No tasks sent 89 tasks only lasts for so long. 
Oh look, the server woke again; Thu Apr 5 16:01:11 2018 \| SETI@home \| [sched_op] Starting scheduler request Thu Apr 5 16:01:16 2018 \| SETI@home \| Sending scheduler request: To report completed tasks. Thu Apr 5 16:01:16 2018 \| SETI@home \| Reporting 5 completed tasks Thu Apr 5 16:01:16 2018 \| SETI@home \| Requesting new tasks for NVIDIA GPU Thu Apr 5 16:01:16 2018 \| SETI@home \| [sched_op] CPU work request: 0.00 seconds; 0.00 devices Thu Apr 5 16:01:16 2018 \| SETI@home \| [sched_op] NVIDIA GPU work request: 234417.73 seconds; 0.00 devices Thu Apr 5 16:01:20 2018 \| SETI@home \| Scheduler request completed: got 76 new tasks Thu Apr 5 16:01:20 2018 \| SETI@home \| [sched_op] estimated total CPU task duration: 0 seconds Thu Apr 5 16:01:20 2018 \| SETI@home \| [sched_op] estimated total NVIDIA GPU task duration: 19947 seconds So....How does it figure 76 tasks will last 332 minutes? ID: 1928156 ·

Keith Myers Volunteer tester Send message Joined: 29 Apr 01 Posts: 13164 Credit: 1,160,866,277 RAC: 1,873	Message 1928158 - Posted: 5 Apr 2018, 20:01:24 UTC Last modified: 5 Apr 2018, 20:10:31 UTC OK, had much better luck this time. This has something to do with defining g_wreq->max_jobs_exceeded()

    if (config.max_wus_to_send) {
        g_wreq->max_jobs_per_rpc = mult * config.max_wus_to_send;
    } else {
        g_wreq->max_jobs_per_rpc = 999999;
        g_reply->set_delay(DELAY_NO_WORK_CACHE);
    }
    if (g_wreq->max_jobs_exceeded()) {
        sprintf(buf, "This computer has reached a limit on tasks in progress");

Last indexed on Jul 30, 2017 whatever indexed means

and

    bool max_jobs_exceeded() {
        if (max_jobs_on_host_exceeded) return true;
        for (int i=0; i<NPROC_TYPES; i++) {
    extern WORK_REQ* g_wreq;
    extern double capped_host_fpops();
    static inline void add_no_work_message(const char* m) {
        g_wreq->add_no_work_message(m);

Last indexed on Feb 7

max_jobs_per_rpc can only be as high as 999999 per request. So now I have to figure out what config.max_wus_to_send is defined as. And what does mult * function do to that variable? And extern double capped_host_fpops() looks interesting too. Good ol' fpops comes into play again.

[Edit] OK, it has to do with determining how many tasks you get based on how many gpus are on the host.

    if (n > MAX_GPUS) n = MAX_GPUS;
    ninstances[proc_type] = n;
    effective_ngpus += n;
    }
    int mult = effective_ncpus + config.gpu_multiplier * effective_ngpus;
    if (config.max_wus_to_send) {
        g_wreq->max_jobs_per_rpc = mult * config.max_wus_to_send;
    } else {
        g_wreq->max_jobs_per_rpc = 999999;

It would seem that the number of tasks per cpu is defined somewhere else. Seti@Home classic workunits:20,676 CPU time:74,226 hours A proud member of the OFA (Old Farts Association) ID: 1928158 ·
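
To make the arithmetic in the quoted snippet easier to follow, here is a minimal standalone sketch of it. Only the names `max_wus_to_send`, `gpu_multiplier`, `effective_ncpus`, `effective_ngpus`, `max_jobs_per_rpc` and the constant 999999 come from the post; the `SchedConfig` struct and the free functions are hypothetical stand-ins, not the scheduler's actual layout.

```cpp
// Hypothetical stand-in for the scheduler configuration. Only the two
// field names are taken from the snippet quoted in the thread.
struct SchedConfig {
    int max_wus_to_send;  // 0 is treated as "no per-request limit configured"
    int gpu_multiplier;   // weight applied to each GPU instance
};

// Mirrors the quoted line:
//   mult = effective_ncpus + config.gpu_multiplier * effective_ngpus
int job_multiplier(const SchedConfig& cfg, int effective_ncpus, int effective_ngpus) {
    return effective_ncpus + cfg.gpu_multiplier * effective_ngpus;
}

// Mirrors the quoted if/else: scale the per-request limit by the
// multiplier, or fall back to 999999 when no limit is configured.
int max_jobs_per_rpc(const SchedConfig& cfg, int effective_ncpus, int effective_ngpus) {
    if (cfg.max_wus_to_send) {
        return job_multiplier(cfg, effective_ncpus, effective_ngpus) * cfg.max_wus_to_send;
    }
    return 999999;
}
```

With made-up values (the thread does not say what the project actually uses), `max_wus_to_send = 10` and `gpu_multiplier = 2` on a host counted as 2 CPUs and 3 GPUs would give a multiplier of 8 and a per-request cap of 80. Note that this caps jobs per request; it says nothing about estimated seconds of work.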

Keith Myers Volunteer tester Send message Joined: 29 Apr 01 Posts: 13164 Credit: 1,160,866,277 RAC: 1,873	Message 1928172 - Posted: 5 Apr 2018, 21:07:29 UTC So....How does it figure 76 tasks will last 332 minutes? That's a VERY GOOD question. I have always thought the fpops_est was screwed up and didn't calculate the true computing power of gpus. Even less so for the special app. The APR for the gpu tasks done on a special app host doesn't seem to be that wrong. So how does the scheduler mess up the estimated gpu task completion time so badly? Seti@Home classic workunits:20,676 CPU time:74,226 hours A proud member of the OFA (Old Farts Association) ID: 1928172 ·

Richard Haselgrove Volunteer tester Send message Joined: 4 Jul 99 Posts: 14650 Credit: 200,643,578 RAC: 874	Message 1928173 - Posted: 5 Apr 2018, 21:08:06 UTC - in response to Message 1928156. estimated total NVIDIA GPU task duration: 19947 seconds So....How does it figure 76 tasks will last 332 minutes? Look on the tasks tab in BOINC Manager. Each task has a "Remaining (estimated)" runtime. I'm guessing most of them are around 00:04:22. That's usually a pretty good estimate, if all your cards run at the same speed. The server keeps track of your performance, and tweaks the figures so the estimate is realistic. If you run stock apps, the server monitors and adjusts speed (APR). If you run Anonymous Platform, the server takes your word for the speed, and tweaks the size of the task instead. Both routes end up in the same place. ID: 1928173 ·

Keith Myers Volunteer tester Send message Joined: 29 Apr 01 Posts: 13164 Credit: 1,160,866,277 RAC: 1,873	Message 1928183 - Posted: 5 Apr 2018, 21:23:56 UTC - in response to Message 1928173. estimated total NVIDIA GPU task duration: 19947 seconds So....How does it figure 76 tasks will last 332 minutes? Look on the tasks tab in BOINC Manager. Each task has a "Remaining (estimated)" runtime. I'm guessing most of them are around 00:04:22. That's usually a pretty good estimate, if all your cards run at the same speed. The server keeps track of your performance, and tweaks the figures so the estimate is realistic. If you run stock apps, the server monitors and adjusts speed (APR). If you run Anonymous Platform, the server takes your word for the speed, and tweaks the size of the task instead. Both routes end up in the same place. I don't understand, Richard. What do you mean the server "takes your word for the speed"? I don't know how we alter or affect the calculated APR other than what the server calculates for us. I don't think any of us are messing with the fpops_est value in the client_state. Seti@Home classic workunits:20,676 CPU time:74,226 hours A proud member of the OFA (Old Farts Association) ID: 1928183 ·

Richard Haselgrove Volunteer tester Send message Joined: 4 Jul 99 Posts: 14650 Credit: 200,643,578 RAC: 874	Message 1928190 - Posted: 5 Apr 2018, 21:38:13 UTC - in response to Message 1928183. Bedtime approaches, and the board is slow - it may take me until tomorrow to re-locate that code. But:

Speed
- in the stock case - APR in GigaFlop/sec
- in the anonymous platform case - CPU benchmark for CPU tasks, Peak flops * fiddle factor for GPUs. Fiddle factor might be 1/20th.

Task size
- in the stock case - workunit <rsc_fpops_est>, raw, from splitters
- in the anonymous platform case - <rsc_fpops_est>, tweaked by the inverse of the ratio of speed (as above - you're following me?) to APR.

ID: 1928190 ·

TBar Volunteer tester Send message Joined: 22 May 99 Posts: 5204 Credit: 840,779,836 RAC: 2,768	Message 1928191 - Posted: 5 Apr 2018, 21:38:25 UTC - in response to Message 1928173. If you look at the posted logs you can see it's reporting 5 to 9 completed tasks every 5 minutes. 5 x 12 = 60 tasks in an hour. I just received a load of shorties estimated to take 85 seconds apiece. 85 seconds. 76 won't last long. The longest estimate on the tasks page is 3:47, the shortest is 1:25, that's how you complete well over 1000 tasks a day. Then there are all those that finish in about 5 seconds. We need more tasks. ID: 1928191 ·

Richard Haselgrove Volunteer tester Send message Joined: 4 Jul 99 Posts: 14650 Credit: 200,643,578 RAC: 874	Message 1928193 - Posted: 5 Apr 2018, 21:40:32 UTC - in response to Message 1928191. That depends whether the project exists to provide kibble to you, or whether you exist to do science for the project. ID: 1928193 ·

TBar Volunteer tester Send message Joined: 22 May 99 Posts: 5204 Credit: 840,779,836 RAC: 2,768	Message 1928197 - Posted: 5 Apr 2018, 21:48:26 UTC - in response to Message 1928193. I'm running Low end hardware. I'll let what you posted sink in to the people running High end hardware. ID: 1928197 ·

TBar Volunteer tester Send message Joined: 22 May 99 Posts: 5204 Credit: 840,779,836 RAC: 2,768	Message 1928203 - Posted: 5 Apr 2018, 22:02:42 UTC - in response to Message 1928172. So how does the scheduler mess up the estimated gpu task completion time so badly? Easy. Just take the longest estimate and consider ONE device. That would be a little closer to reality. ID: 1928203 ·

Keith Myers Volunteer tester Send message Joined: 29 Apr 01 Posts: 13164 Credit: 1,160,866,277 RAC: 1,873	Message 1928220 - Posted: 5 Apr 2018, 23:55:29 UTC - in response to Message 1928191. Last modified: 5 Apr 2018, 23:57:59 UTC I just looked at the estimated time for completion for all my tasks on the Intel machine. 52 minutes for cpu tasks. 43 seconds for shorties. 1 minute 24 seconds - 1 minute 54 seconds for VLAR's. The Ryzen machines do cpu tasks in 28-45 minutes. Seti@Home classic workunits:20,676 CPU time:74,226 hours A proud member of the OFA (Old Farts Association) ID: 1928220 ·

Stephen "Heretic" Volunteer tester Send message Joined: 20 Sep 12 Posts: 5557 Credit: 192,787,363 RAC: 628	Message 1928225 - Posted: 6 Apr 2018, 0:12:18 UTC - in response to Message 1928118. Last modified: 6 Apr 2018, 0:24:21 UTC Now both Linux crunchers are back to being down 100 tasks from full again like last night. Only 1 in 5 task requests get any work and then only 1 or 2 tasks. The rest of the time I get the "you've reached the limit of tasks in progress" message. . . And those pesky Blc01 tapes seem to still be stuck in the splitters ... Stephen ?? ID: 1928225 ·

Stephen "Heretic" Volunteer tester Send message Joined: 20 Sep 12 Posts: 5557 Credit: 192,787,363 RAC: 628	Message 1928229 - Posted: 6 Apr 2018, 0:26:37 UTC - in response to Message 1928122. Seems the messages have changed. Along with more of the Reached a Limit, I'm now just being told Nothing was sent; . . Getting that here as well. Stephen :( ID: 1928229 ·

Stephen "Heretic" Volunteer tester Send message Joined: 20 Sep 12 Posts: 5557 Credit: 192,787,363 RAC: 628	Message 1928230 - Posted: 6 Apr 2018, 0:29:38 UTC - in response to Message 1928125. All of these look new... I wonder with the recent long outage to fix database issues that this has been introduced to reduce work-in-progress entries, specifically by countering "bunkering" (edit: not saying that anyone is, of course! Pretty sure that the 100 CPU/GPU limit was introduced specifically to limit work-in-progress entries so it is an issue that has been addressed before so it's possible.) . . Hey there Mr Kevvy, . . Long time no hear ... . . That sounds very plausible to me ... Stephen :( ID: 1928230 ·

Stephen "Heretic" Volunteer tester Send message Joined: 20 Sep 12 Posts: 5557 Credit: 192,787,363 RAC: 628	Message 1928233 - Posted: 6 Apr 2018, 0:37:44 UTC - in response to Message 1928126. I'm reading, but I don't have an explanation. Is it possible that on a machine which has CPU crunching enabled, you might have 100 CPU tasks onboard, and the scheduler might count them and say "enough, already", and bail out without enumerating GPU tasks? The message you're discussing does say "This computer has reached A limit on tasks in progress" (direct quote from my log at 18:15, except for the emphasis). It doesn't say which limit. . . I have observed just that sort of behaviour a lot lately. When getting new work the CPU queue will be completely refilled but the GPU Q will be shortchanged. Even the other way around on a rare occasion. So maybe it is contextual and whichever Q gets work allocated first triggers the "enough" signal despite the status of the second Q. That to me would be an error in the procedure, OR a deliberate change to the code to prevent bunkering as was suggested earlier. Stephen :( ID: 1928233 ·

TBar Volunteer tester Send message Joined: 22 May 99 Posts: 5204 Credit: 840,779,836 RAC: 2,768	Message 1928241 - Posted: 6 Apr 2018, 1:26:00 UTC - in response to Message 1928220. I just looked at the estimated time for completion for all my tasks on the Intel machine. 52 minutes for cpu tasks. 43 seconds for shorties. 1 minute 24 seconds - 1 minute 54 seconds for VLAR's. The Ryzen machines do cpu tasks in 28-45 minutes. The question was; Thu Apr 5 16:01:20 2018 \| SETI@home \| [sched_op] estimated total NVIDIA GPU task duration: 19947 seconds So....How does it figure 76 tasks will last 332 minutes? The answer is it took the longest estimate of around 4 minutes, multiplied that by 76, and came up with a time close to 304 minutes. BUT, that's for just ONE GPU...the machine has 3 GPUs. Therefore, the estimate is immediately off by a factor of 3, and then there is the problem of tasks taking much less than the High estimate. By the time all is corrected, the 76 tasks will probably take about 76 minutes using 3 GPUs with some tasks taking only 80 seconds to complete. Apparently the estimate completely ignores the number of devices the machine has. ID: 1928241 ·

Keith Myers Volunteer tester Send message Joined: 29 Apr 01 Posts: 13164 Credit: 1,160,866,277 RAC: 1,873	Message 1928242 - Posted: 6 Apr 2018, 1:41:26 UTC - in response to Message 1928241. That snippet of code I posted is supposed to calculate the number of seconds of work based on the number of gpus in the host. From your calculation, that part of the code is evidently not working, as I agree the estimate is only for ONE device, not for three gpus. Seti@Home classic workunits:20,676 CPU time:74,226 hours A proud member of the OFA (Old Farts Association) ID: 1928242 ·

Grant (SSSF) Volunteer tester Send message Joined: 19 Aug 99 Posts: 13736 Credit: 208,696,464 RAC: 304	Message 1928263 - Posted: 6 Apr 2018, 5:35:34 UTC So, apart from the random work allocation rearing its ugly head again, the database re-organisation seems to be helping. Things were a bit messy after the outage, but the splitters (in spite of some slowdowns after a very good start) have been able to fill the Ready-to-send buffer, and keep it filled for over half a day. Been a while since that has been the case. The Results & WUs Awaiting-purge have both reached & generally settled around their more normal levels. WUs Awaiting-deletion, while not back to (effectively) zero like they used to be, are at least close enough to it, and not heading for yet another record high. Now we just need another bunch of short WUs so we can hammer the servers with 145,000/hour again to see just how well they can hold up. If we can get the Scheduler to reliably allocate work when a host hasn't reached its cache or server-side limits, we might actually be ready to cope with more crunchers. Grant Darwin NT ID: 1928263 ·

Grant (SSSF) Volunteer tester Send message Joined: 19 Aug 99 Posts: 13736 Credit: 208,696,464 RAC: 304	Message 1928264 - Posted: 6 Apr 2018, 5:38:25 UTC - in response to Message 1928242. That snippet of code I posted is supposed to calculate the number of seconds of work based on the number of gpus in the host. From your calculation, that part of the code is not working evidently as I agree with your estimate is only for ONE device, not for three gpus. Or it's working the way it's meant to; wasn't it a glitch in the code that allows each GPU to get 100 WU, instead of like the CPU where it's a limit of 100 regardless of the number of cores/threads? Grant Darwin NT ID: 1928264 ·

rob smith Volunteer moderator Volunteer tester Send message Joined: 7 Mar 03 Posts: 22199 Credit: 416,307,556 RAC: 380	Message 1928266 - Posted: 6 Apr 2018, 6:32:34 UTC Since the routine is working "correctly" on two of my four crunchers, and "incorrectly" on the other two I would suggest there is something amiss in the communication between the cruncher and the calculation. It is worth noting that the two that are "incorrect" are my top two.... Bob Smith Member of Seti PIPPS (Pluto is a Planet Protest Society) Somewhere in the (un)known Universe? ID: 1928266 ·
