SETI@Home will now cache 150 CPU work units and 150 per installed GPU
Author | Message |
---|---|
Jimbocous Send message Joined: 1 Apr 13 Posts: 1856 Credit: 268,616,081 RAC: 1,349 |
Have we just crashed? Definitely strangeness going on. What I've never seen before is that on my Win box the stalled downloads are in excess of my full cache: with 2 GPUs I'm limited to 200 GPU tasks, I have 200 tasks in progress, and it's stalled downloading another 51. Something threw a wobbly .... [edit] A quick peek also shows this: All tasks for computer 8699207 State: All (872) · In progress (350) ... With 2 GPUs and CPU, max should indeed be 300 in progress. |
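To make the arithmetic in the post above explicit, here is a minimal sketch. It assumes the per-host cap is a flat CPU allowance plus a fixed allowance per GPU (100 + 100 per GPU, as the posts in this thread describe). The function names `expected_limit` and `excess` are made up for illustration and are not part of BOINC or the SETI@home scheduler.

```python
def expected_limit(num_gpus, per_gpu=100, cpu=100):
    """Expected cap on in-progress tasks for one host.

    Assumption: cap = CPU allowance + per-GPU allowance * GPU count.
    The defaults (100/100) are the figures quoted in this thread.
    """
    return cpu + per_gpu * num_gpus


def excess(in_progress, num_gpus, per_gpu=100, cpu=100):
    """How many in-progress tasks sit above the expected cap (never negative)."""
    return max(0, in_progress - expected_limit(num_gpus, per_gpu, cpu))


# Figures from the post: a 2-GPU host showing 350 tasks in progress.
print(expected_limit(2))   # cap under the 100 + 100-per-GPU assumption
print(excess(350, 2))      # tasks above that cap
```

The `per_gpu` and `cpu` parameters are there so the same check can be rerun with other figures, such as the 150/150 in the thread title.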
Jimbocous Send message Joined: 1 Apr 13 Posts: 1856 Credit: 268,616,081 RAC: 1,349 |
Have we just crashed? All tasks for computer 8699207 State: All (1042) · In progress (518) Just keeps getting weirder ... |
Wiggo Send message Joined: 24 Jan 00 Posts: 36319 Credit: 261,360,520 RAC: 489 |
Has Xmas come early? Both of my rigs have 490 In progress tasks each. :-O Cheers. |
Jimbocous Send message Joined: 1 Apr 13 Posts: 1856 Credit: 268,616,081 RAC: 1,349 |
I've set myself to no new task for a while, until the situation gets better. I hope everyone who needs WUs can get them. Set myself to APs only for a while, given that the Win box which is entitled to 200 GPU tasks now has nearly 500, and a Linux box has over 150 CPU tasks. Presumably they'll figure out what happened to the scheduler check on max tasks per client device. |
Wiggo Send message Joined: 24 Jan 00 Posts: 36319 Credit: 261,360,520 RAC: 489 |
I've set myself to no new task for a while, until the situation gets better. I hope everyone who needs WUs can get them. I did that with my 2500K rig when it hit 750 in progress. Cheers. |
Grant (SSSF) Send message Joined: 19 Aug 99 Posts: 13834 Credit: 208,696,464 RAC: 304 |
Well, there's the usual & not so usual after-outage weirdness going on. My Linux system is chewing through work as fast as it can get it, with most Scheduler requests resulting in "Project has no tasks available". The Windows system, on the other hand, is getting work on almost every other request, and has gone on a rampage. Its usual limit is 300 WUs, yet it's already up to 469. I'm thinking that whatever broke might have taken out the server-side limits. It might be worth someone letting those in charge know. As much as I would like a larger cache to get through outages such as this most recent one, I don't want the system falling over due to the amount of work in the system bringing the database to a grinding halt. Grant Darwin NT |
Cruncher-American Send message Joined: 25 Mar 02 Posts: 1513 Credit: 370,893,186 RAC: 340 |
Wow - this reminds me of before the CPU/GPU limits were imposed, and one of my machines had ~3500 WUs on board at one point...Yay! Right now, both are well over their limits, but one has 0 CPU WUs waiting (with ~700 GPU), and the other has 750 GPU and also 170+ CPU onboard. Both have 2 GPUs, so should max out at 200 (+100 CPU) |
Stephen "Heretic" Send message Joined: 20 Sep 12 Posts: 5557 Credit: 192,787,363 RAC: 628 |
. . At this point it is safe to say something is broken. Hopefully it will only require a reset of some task/s. Stephen <crossing fingers> |
Keith Myers Send message Joined: 29 Apr 01 Posts: 13164 Credit: 1,160,866,277 RAC: 1,873 |
Other than not getting work on almost all requests, the only thing out of the ordinary is that I have 111 CPU tasks on the Xeon host, which should only ever get 100. The Ryzen 3900X is almost out of CPU tasks, since my hosts always ask for GPU work first to fill up their cache before ever asking for CPU work. All hosts are down on GPU work because of the scheduler sending 0 tasks upon request. Seti@Home classic workunits:20,676 CPU time:74,226 hours A proud member of the OFA (Old Farts Association) |
Keith Myers Send message Joined: 29 Apr 01 Posts: 13164 Credit: 1,160,866,277 RAC: 1,873 |
Now I have another host with more than 100 cpu tasks. But the host that has been empty for hours of cpu work never gets any if requested. And then sets a 1400 second backoff timer. Strange.
Sat 07 Dec 2019 12:57:32 AM PST | SETI@home | choose_project: scanning
Sat 07 Dec 2019 12:57:32 AM PST | SETI@home | can fetch CPU
Sat 07 Dec 2019 12:57:32 AM PST | SETI@home | CPU needs work - buffer low
Sat 07 Dec 2019 12:57:32 AM PST | SETI@home | checking CPU
Sat 07 Dec 2019 12:57:32 AM PST | SETI@home | [work_fetch] set_request() for CPU: ninst 15 nused_total 4191.00 nidle_now 0.00 fetch share 1.00 req_inst 0.00 req_secs 505762.90
Sat 07 Dec 2019 12:57:32 AM PST | SETI@home | CPU set_request: 505762.895780
Sat 07 Dec 2019 12:57:32 AM PST | SETI@home | checking NVIDIA GPU
Sat 07 Dec 2019 12:57:32 AM PST | SETI@home | [work_fetch] set_request() for NVIDIA GPU: ninst 48 nused_total 4191.00 nidle_now 0.00 fetch share 1.00 req_inst 0.00 req_secs 1637296.40
Sat 07 Dec 2019 12:57:32 AM PST | SETI@home | NVIDIA GPU set_request: 1637296.401672
Sat 07 Dec 2019 12:57:32 AM PST | SETI@home | [sched_op] Starting scheduler request
Sat 07 Dec 2019 12:57:32 AM PST | SETI@home | [work_fetch] request: CPU (505762.90 sec, 0.00 inst) NVIDIA GPU (1637296.40 sec, 0.00 inst)
Sat 07 Dec 2019 12:57:32 AM PST | SETI@home | Sending scheduler request: To fetch work.
Sat 07 Dec 2019 12:57:32 AM PST | SETI@home | Reporting 17 completed tasks
Sat 07 Dec 2019 12:57:32 AM PST | SETI@home | Requesting new tasks for CPU and NVIDIA GPU
Sat 07 Dec 2019 12:57:32 AM PST | SETI@home | [sched_op] CPU work request: 505762.90 seconds; 0.00 devices
Sat 07 Dec 2019 12:57:32 AM PST | SETI@home | [sched_op] NVIDIA GPU work request: 1637296.40 seconds; 0.00 devices
Sat 07 Dec 2019 12:57:37 AM PST | | [work_fetch] Request work fetch: application exited
Sat 07 Dec 2019 12:57:37 AM PST | SETI@home | Computation for task 05mr11ad.28881.3339.3.30.90_0 finished
Sat 07 Dec 2019 12:57:37 AM PST | SETI@home | Starting task 05mr11ad.11989.2112.6.33.83_0
Sat 07 Dec 2019 12:57:39 AM PST | SETI@home | Started upload of 05mr11ad.28881.3339.3.30.90_0_r114699300_0
Sat 07 Dec 2019 12:57:41 AM PST | SETI@home | Finished upload of 05mr11ad.28881.3339.3.30.90_0_r114699300_0
Sat 07 Dec 2019 12:57:41 AM PST | | [work_fetch] Request work fetch: project finished uploading
Sat 07 Dec 2019 12:57:51 AM PST | SETI@home | Scheduler request completed: got 41 new tasks
Sat 07 Dec 2019 12:57:51 AM PST | SETI@home | [sched_op] Server version 709
Sat 07 Dec 2019 12:57:51 AM PST | SETI@home | Project requested delay of 303 seconds
Sat 07 Dec 2019 12:57:51 AM PST | SETI@home | [sched_op] estimated total CPU task duration: 0 seconds
Sat 07 Dec 2019 12:57:51 AM PST | SETI@home | [sched_op] estimated total NVIDIA GPU task duration: 1743 seconds
~~
Sat 07 Dec 2019 12:57:51 AM PST | SETI@home | [work_fetch] backing off CPU 1400 sec
Sat 07 Dec 2019 12:57:51 AM PST | SETI@home | [sched_op] Deferring communication for 00:05:03
And still no cpu tasks. Seti@Home classic workunits:20,676 CPU time:74,226 hours A proud member of the OFA (Old Farts Association) |
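For anyone who wants to pull the key numbers out of an event log like the one above, here is a rough sketch. It assumes lines shaped as `timestamp | project | message` and matches only the message wordings that appear in this excerpt. The `summarize` function and the regex names are made up for illustration, and other client versions may word these messages differently.

```python
import re

# Assumed line shape, taken from the excerpt: "<timestamp> | <project> | <message>"
LINE_RE = re.compile(r"^(?P<ts>[^|]+?)\s*\|\s*(?P<project>[^|]*?)\s*\|\s*(?P<msg>.*)$")
REQ_RE = re.compile(r"\[sched_op\] (?P<res>.+?) work request: (?P<secs>[\d.]+) seconds")
GOT_RE = re.compile(r"Scheduler request completed: got (?P<n>\d+) new tasks")
EST_RE = re.compile(r"\[sched_op\] estimated total (?P<res>.+?) task duration: (?P<secs>\d+) seconds")
BACKOFF_RE = re.compile(r"\[work_fetch\] backing off (?P<res>.+?) (?P<secs>\d+) sec")


def summarize(log_text):
    """Collect requested seconds, granted task count, estimated durations, and backoffs."""
    summary = {"requested": {}, "granted_tasks": None, "estimated": {}, "backoff": {}}
    for raw in log_text.splitlines():
        line = LINE_RE.match(raw.strip())
        if line is None:
            continue  # skip anything that is not a three-field log line
        msg = line.group("msg")
        m = REQ_RE.search(msg)
        if m:
            summary["requested"][m.group("res")] = float(m.group("secs"))
        m = GOT_RE.search(msg)
        if m:
            summary["granted_tasks"] = int(m.group("n"))
        m = EST_RE.search(msg)
        if m:
            summary["estimated"][m.group("res")] = int(m.group("secs"))
        m = BACKOFF_RE.search(msg)
        if m:
            summary["backoff"][m.group("res")] = int(m.group("secs"))
    return summary


# A few lines copied from the log excerpt above.
SAMPLE = """\
Sat 07 Dec 2019 12:57:32 AM PST | SETI@home | [sched_op] CPU work request: 505762.90 seconds; 0.00 devices
Sat 07 Dec 2019 12:57:32 AM PST | SETI@home | [sched_op] NVIDIA GPU work request: 1637296.40 seconds; 0.00 devices
Sat 07 Dec 2019 12:57:37 AM PST | | [work_fetch] Request work fetch: application exited
Sat 07 Dec 2019 12:57:51 AM PST | SETI@home | Scheduler request completed: got 41 new tasks
Sat 07 Dec 2019 12:57:51 AM PST | SETI@home | [sched_op] estimated total CPU task duration: 0 seconds
Sat 07 Dec 2019 12:57:51 AM PST | SETI@home | [sched_op] estimated total NVIDIA GPU task duration: 1743 seconds
Sat 07 Dec 2019 12:57:51 AM PST | SETI@home | [work_fetch] backing off CPU 1400 sec
"""

print(summarize(SAMPLE))
```

On the sample lines, the summary pairs a non-zero CPU request with an estimated CPU duration of 0 seconds and a CPU backoff, which is the "still no cpu tasks" pattern described in the post.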
Ghia Send message Joined: 7 Feb 17 Posts: 238 Credit: 28,911,438 RAC: 50 |
I have 553 WUs in progress, 192 are for the CPU, the rest are for the GPU. My allotment is 100+100, so I have work galore and uploads are working fine (so far.....) I've set NNT and will just crunch away. Could have been worse! Humans may rule the world...but bacteria run it... |
Jimbocous Send message Joined: 1 Apr 13 Posts: 1856 Credit: 268,616,081 RAC: 1,349 |
Just tried backing the work cache down to .5 day with .5 additional, as suggested. We'll see if it has any effect. At present, my Win box has 500 GPU tasks for 2x750tis, and the little Linux box has 190 cpu and 900ish gpu. Guess we'll see if limits throttle it. Meanwhile, the Big Boy is on a lean diet and can't get fed. |
Grant (SSSF) Send message Joined: 19 Aug 99 Posts: 13834 Credit: 208,696,464 RAC: 304 |
Just tried backing the work cache down to .5 day with .5 additional, as suggested. We'll see if it has any effect. Better yet, try 0.02 for the additional setting (the larger the additional number, the less frequently it asks for work). Grant Darwin NT |
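The remark about the additional setting can be illustrated with a toy model. This is only a sketch under stated assumptions, not the BOINC client's actual work-fetch code: the host burns one minute of buffered work per wall-clock minute, asks for work once the buffer drops below the minimum, and the server always refills it to minimum plus additional. The function `count_requests` and its parameters are hypothetical.

```python
def count_requests(min_days, additional_days, horizon_days=10, step_minutes=5):
    """Count work requests over a time horizon in a simplified low/high-water model.

    Assumptions (not taken from BOINC source): work is consumed at a steady
    one minute of work per minute, and every request is granted in full.
    """
    minutes_per_day = 24 * 60
    low = min_days * minutes_per_day                        # request below this level
    high = (min_days + additional_days) * minutes_per_day   # refill target
    buffered = high
    requests = 0
    elapsed = 0
    while elapsed < horizon_days * minutes_per_day:
        buffered -= step_minutes   # work crunched during this step
        elapsed += step_minutes
        if buffered < low:
            buffered = high        # assume the whole request is granted
            requests += 1
    return requests


# Compare the two settings discussed in the thread.
print(count_requests(0.5, 0.5))    # larger additional: fewer, bigger requests
print(count_requests(0.5, 0.02))   # smaller additional: many small requests
```

In this model a larger additional value widens the gap between the refill target and the request threshold, so the host goes longer between requests, which matches the behaviour described in the post.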
Jimbocous Send message Joined: 1 Apr 13 Posts: 1856 Credit: 268,616,081 RAC: 1,349 |
Just tried backing the work cache down to .5 day with .5 additional, as suggested. We'll see if it has any effect. Better yet, try 0.02 for the additional setting (the larger the additional number, the less frequently it asks for work). Just waiting for a log file entry telling me I've reached a limit on tasks. Until then, all else seems irrelevant ... appears that the server isn't telling the client to quit asking for work as I think it normally does. |
Bernie Vine Send message Joined: 26 May 99 Posts: 9956 Credit: 103,452,613 RAC: 328 |
My three machines seem to have the amount of work actually requested, i.e. my Linux machine has around 800 tasks and is set for 1 day, and 800 is about what it gets through. One Windows machine has 139 GPU and 109 CPU, and the other has 77 GPU and 62 CPU, again about correct for the cache I have set, so for me there is no real problem. |
Jimbocous Send message Joined: 1 Apr 13 Posts: 1856 Credit: 268,616,081 RAC: 1,349 |
My three machines actually seem to have the amount of work actually requested. i.e my Linux machine has around 800 tasks and is set for 1 day, and 800 is about what it gets through. Yes, it is looking like the limits are being respected. Will look a bit weird when I get up, I suspect. Near as I can tell, my 980s tend to do about 1000 MBs per day each, so I could end up with 5000 and 7000 GPU tasks respectively for the Linux boxes. Fortunately each machine has about 100gb free disk space. The Win box is no longer asking for GPU work, and has filled its CPU to a reasonable level. |
Siran d'Vel'nahr Send message Joined: 23 May 99 Posts: 7379 Credit: 44,181,323 RAC: 238 |
Greetings, I just set my older Linux PC to No New Tasks. This host should only get 200 WUs and it's got 600 in progress. Looks like something barfed all over the place. ;) Have a great day! :) Siran CAPT Siran d'Vel'nahr - L L & P _\\// Winders 11 OS? "What a piece of junk!" - L. Skywalker "Logic is the cement of our civilization with which we ascend from chaos using reason as our guide." - T'Plana-hath |
Richard Haselgrove Send message Joined: 4 Jul 99 Posts: 14672 Credit: 200,643,578 RAC: 874 |
So, is everybody saying that "We've got what we wished for" - cache sizes controlled only by our own request preferences, not by a rigid server cap? Use it wisely. I hope we don't come to regret this. |
Siran d'Vel'nahr Send message Joined: 23 May 99 Posts: 7379 Credit: 44,181,323 RAC: 238 |
So, is everybody saying that "We've got what we wished for" - cache sizes controlled only by our own request preferences, not by a rigid server cap? Hi Richard, Are you saying that the powers-that-be undid the 100 max WUs? It kinda doesn't make sense. My newer, faster Linux PC should have WAY more WUs than my older one then. It's got less than half what the older one has. I also noticed that the RTS is down under 100 WUs. I don't think the splitters are keeping up. ;) Have a great day! :) Siran CAPT Siran d'Vel'nahr - L L & P _\\// Winders 11 OS? "What a piece of junk!" - L. Skywalker "Logic is the cement of our civilization with which we ascend from chaos using reason as our guide." - T'Plana-hath |
Grant (SSSF) Send message Joined: 19 Aug 99 Posts: 13834 Credit: 208,696,464 RAC: 304 |
I hope we don't come to regret this. That's the million dollar question. Have we been given what we wished for, or is it just because something's broken? If we have been given it, I would have thought it would be a case of gradually increasing the limits to make sure the system doesn't fall over in a screaming heap. To just remove them seems rather risky. Especially considering how sluggish the system has been over the last few weeks (although now without the Haveland graphs we can't actually see that occurring anymore). Grant Darwin NT |
©2024 University of California
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.