SETI@Home will now cache 150 CPU work units and 150 per installed GPU

Message boards : Number crunching : SETI@Home will now cache 150 CPU work units and 150 per installed GPU
Jimbocous (Project Donor)
Volunteer tester
Joined: 1 Apr 13
Posts: 1856
Credit: 268,616,081
RAC: 1,349
United States
Message 2021994 - Posted: 7 Dec 2019, 2:53:58 UTC - in response to Message 2021993.  
Last modified: 7 Dec 2019, 2:57:33 UTC

Have we just crashed?
Just had a bunch of stalled downloads on one system (cleared after 3-4min), and now both systems are getting "Project has no tasks available" Scheduler responses, and not a WU in sight.


Edit: eventually got some work, on both systems. And neither can download it; one system's been going for 4 min and counting with not a bit transferred (and it's just gone into timeout). The other keeps timing out on the retries.

Definitely strangeness going on. What I've never seen before is that on my Win box the stalled downloads are in excess of my full cache, e.g. with 2 GPUs I'm limited to 200 GPU tasks, I have 200 tasks in progress, and it's stalled downloading another 51.
Something threw a wobbly ....
[edit]
A quick peek also shows this:
All tasks for computer 8699207

State: All (872) · In progress (350) ...
With 2 GPUs and CPU, the max should indeed be 300 in progress.
ID: 2021994
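
A minimal sketch of the arithmetic behind the figures in the post above, assuming the caps quoted in this thread (100 CPU tasks plus 100 per GPU) are what the scheduler normally enforces. The function names are hypothetical and only illustrate the calculation. The thread title suggests the per-device figure may become 150, so both limits are left as parameters.

```python
def max_in_progress(gpus, cpu_limit=100, per_gpu_limit=100):
    """Expected cap on in-progress tasks for one host (assumed limits)."""
    return cpu_limit + per_gpu_limit * gpus


def excess(in_progress, gpus, cpu_limit=100, per_gpu_limit=100):
    """Number of tasks held beyond the expected cap (0 if within it)."""
    return max(0, in_progress - max_in_progress(gpus, cpu_limit, per_gpu_limit))


print(max_in_progress(2))  # 300: cap for a host with 2 GPUs
print(excess(350, 2))      # 50: tasks over the cap with 350 in progress
```

With the 150/150 figures from the thread title, `max_in_progress(2, 150, 150)` gives 450, which is still below some of the counts reported later in the thread.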
Jimbocous (Project Donor)
Volunteer tester
Joined: 1 Apr 13
Posts: 1856
Credit: 268,616,081
RAC: 1,349
United States
Message 2022008 - Posted: 7 Dec 2019, 4:14:05 UTC - in response to Message 2021994.  

Have we just crashed?
Just had a bunch of stalled downloads on one system (cleared after 3-4min), and now both systems are getting "Project has no tasks available" Scheduler responses, and not a WU in sight.


Edit: eventually got some work, on both systems. And neither can download it; one system's been going for 4 min and counting with not a bit transferred (and it's just gone into timeout). The other keeps timing out on the retries.

Definitely strangeness going on. What I've never seen before is that on my Win box the stalled downloads are in excess of my full cache, e.g. with 2 GPUs I'm limited to 200 GPU tasks, I have 200 tasks in progress, and it's stalled downloading another 51.
Something threw a wobbly ....
[edit]
A quick peek also shows this:
All tasks for computer 8699207

State: All (872) · In progress (350) ...
With 2 GPUs and CPU, the max should indeed be 300 in progress.

All tasks for computer 8699207

State: All (1042) · In progress (518)
Just keeps getting weirder ...
ID: 2022008
Wiggo
Joined: 24 Jan 00
Posts: 36558
Credit: 261,360,520
RAC: 489
Australia
Message 2022010 - Posted: 7 Dec 2019, 4:33:22 UTC

Has Xmas come early?

Both of my rigs have 490 In progress tasks each. :-O

Cheers.
ID: 2022010
Jimbocous (Project Donor)
Volunteer tester
Joined: 1 Apr 13
Posts: 1856
Credit: 268,616,081
RAC: 1,349
United States
Message 2022019 - Posted: 7 Dec 2019, 5:35:35 UTC - in response to Message 2022018.  

I've set myself to no new task for a while, until the situation gets better. I hope everyone who needs WUs can get them.

Set myself to APs only for a while, given that the Win box which is entitled to 200 GPU tasks now has nearly 500, and a Linux box has over 150 CPU tasks. Presumably they'll figure out what happened to the scheduler check on max tasks per client device.
ID: 2022019
Wiggo
Joined: 24 Jan 00
Posts: 36558
Credit: 261,360,520
RAC: 489
Australia
Message 2022020 - Posted: 7 Dec 2019, 5:41:50 UTC - in response to Message 2022018.  

I've set myself to no new task for a while, until the situation gets better. I hope everyone who needs WUs can get them.
I did that with my 2500K rig when it hit 750 in progress.

Cheers.
ID: 2022020
Grant (SSSF)
Volunteer tester
Joined: 19 Aug 99
Posts: 13841
Credit: 208,696,464
RAC: 304
Australia
Message 2022021 - Posted: 7 Dec 2019, 5:44:59 UTC
Last modified: 7 Dec 2019, 5:47:15 UTC

Well, there's the usual & not so usual after outage weirdness going on.
My Linux system is chewing through work as fast as it can get it, with most Scheduler requests resulting in "Project has no tasks available". The Windows system, on the other hand, is getting work on almost every other request and has gone on a rampage. Its usual limit is 300 WUs, yet it's already up to 469.



I'm thinking that whatever broke, might have taken out the Server side limits.
It might be worth someone letting those in charge know. As much as I would like a larger cache to get through outages such as this most recent one, I don't want the system falling over because the amount of work in the system brings the database to a grinding halt.
Grant
Darwin NT
ID: 2022021
Cruncher-American (Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor)
Joined: 25 Mar 02
Posts: 1513
Credit: 370,893,186
RAC: 340
United States
Message 2022024 - Posted: 7 Dec 2019, 5:51:02 UTC
Last modified: 7 Dec 2019, 5:51:44 UTC

Wow - this reminds me of before the CPU/GPU limits were imposed, and one of my machines had ~3500 WUs on board at one point...Yay!

Right now, both are well over their limits, but one has 0 CPU WUs waiting (with ~700 GPU), and the other has 750 GPU and also 170+ CPU onboard.
Both have 2 GPUs, so should max out at 200 (+100 CPU).
ID: 2022024
Stephen "Heretic" (Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor)
Volunteer tester
Joined: 20 Sep 12
Posts: 5557
Credit: 192,787,363
RAC: 628
Australia
Message 2022028 - Posted: 7 Dec 2019, 6:20:09 UTC - in response to Message 2022008.  


All tasks for computer 8699207

State: All (1042) · In progress (518)

Just keeps getting weirder ...


. . At this point it is safe to say something is broken. Hopefully it will only require a reset of some task/s.

Stephen

<crossing fingers>
ID: 2022028
Keith Myers (Special Project $250 donor)
Volunteer tester
Joined: 29 Apr 01
Posts: 13164
Credit: 1,160,866,277
RAC: 1,873
United States
Message 2022029 - Posted: 7 Dec 2019, 6:27:11 UTC

Other than not getting work on almost all requests, the only thing out of the ordinary is that I have 111 CPU tasks on the Xeon host, which should only ever get 100. The Ryzen 3900X is almost out of CPU tasks, since my hosts always ask for GPU work first to fill up their cache before ever asking for CPU work. All hosts are down on GPU work because the scheduler is sending 0 tasks on request.
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 2022029
Keith Myers (Special Project $250 donor)
Volunteer tester
Joined: 29 Apr 01
Posts: 13164
Credit: 1,160,866,277
RAC: 1,873
United States
Message 2022043 - Posted: 7 Dec 2019, 9:05:15 UTC

Now I have another host with more than 100 CPU tasks. But the host that has been out of CPU work for hours never gets any when it asks, and then sets a 1400-second backoff timer. Strange.

Sat 07 Dec 2019 12:57:32 AM PST | SETI@home | choose_project: scanning
Sat 07 Dec 2019 12:57:32 AM PST | SETI@home | can fetch CPU
Sat 07 Dec 2019 12:57:32 AM PST | SETI@home | CPU needs work - buffer low
Sat 07 Dec 2019 12:57:32 AM PST | SETI@home | checking CPU
Sat 07 Dec 2019 12:57:32 AM PST | SETI@home | [work_fetch] set_request() for CPU: ninst 15 nused_total 4191.00 nidle_now 0.00 fetch share 1.00 req_inst 0.00 req_secs 505762.90
Sat 07 Dec 2019 12:57:32 AM PST | SETI@home | CPU set_request: 505762.895780
Sat 07 Dec 2019 12:57:32 AM PST | SETI@home | checking NVIDIA GPU
Sat 07 Dec 2019 12:57:32 AM PST | SETI@home | [work_fetch] set_request() for NVIDIA GPU: ninst 48 nused_total 4191.00 nidle_now 0.00 fetch share 1.00 req_inst 0.00 req_secs 1637296.40
Sat 07 Dec 2019 12:57:32 AM PST | SETI@home | NVIDIA GPU set_request: 1637296.401672
Sat 07 Dec 2019 12:57:32 AM PST | SETI@home | [sched_op] Starting scheduler request
Sat 07 Dec 2019 12:57:32 AM PST | SETI@home | [work_fetch] request: CPU (505762.90 sec, 0.00 inst) NVIDIA GPU (1637296.40 sec, 0.00 inst)
Sat 07 Dec 2019 12:57:32 AM PST | SETI@home | Sending scheduler request: To fetch work.
Sat 07 Dec 2019 12:57:32 AM PST | SETI@home | Reporting 17 completed tasks
Sat 07 Dec 2019 12:57:32 AM PST | SETI@home | Requesting new tasks for CPU and NVIDIA GPU
Sat 07 Dec 2019 12:57:32 AM PST | SETI@home | [sched_op] CPU work request: 505762.90 seconds; 0.00 devices
Sat 07 Dec 2019 12:57:32 AM PST | SETI@home | [sched_op] NVIDIA GPU work request: 1637296.40 seconds; 0.00 devices
Sat 07 Dec 2019 12:57:37 AM PST | | [work_fetch] Request work fetch: application exited
Sat 07 Dec 2019 12:57:37 AM PST | SETI@home | Computation for task 05mr11ad.28881.3339.3.30.90_0 finished
Sat 07 Dec 2019 12:57:37 AM PST | SETI@home | Starting task 05mr11ad.11989.2112.6.33.83_0
Sat 07 Dec 2019 12:57:39 AM PST | SETI@home | Started upload of 05mr11ad.28881.3339.3.30.90_0_r114699300_0
Sat 07 Dec 2019 12:57:41 AM PST | SETI@home | Finished upload of 05mr11ad.28881.3339.3.30.90_0_r114699300_0
Sat 07 Dec 2019 12:57:41 AM PST | | [work_fetch] Request work fetch: project finished uploading
Sat 07 Dec 2019 12:57:51 AM PST | SETI@home | Scheduler request completed: got 41 new tasks
Sat 07 Dec 2019 12:57:51 AM PST | SETI@home | [sched_op] Server version 709
Sat 07 Dec 2019 12:57:51 AM PST | SETI@home | Project requested delay of 303 seconds
Sat 07 Dec 2019 12:57:51 AM PST | SETI@home | [sched_op] estimated total CPU task duration: 0 seconds
Sat 07 Dec 2019 12:57:51 AM PST | SETI@home | [sched_op] estimated total NVIDIA GPU task duration: 1743 seconds
~~
Sat 07 Dec 2019 12:57:51 AM PST | SETI@home | [work_fetch] backing off CPU 1400 sec
Sat 07 Dec 2019 12:57:51 AM PST | SETI@home | [sched_op] Deferring communication for 00:05:03

And still no cpu tasks.
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 2022043
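
The log excerpt above can be summarized mechanically. This is a rough sketch, assuming the event-log line format shown in that post; the regular expressions and the `summarize` helper are illustrative only and are not part of BOINC.

```python
import re

# Patterns modeled on the log lines quoted above (an assumption about the format).
REQUEST = re.compile(r"\[sched_op\] (.+?) work request: ([\d.]+) seconds")
GRANTED = re.compile(r"\[sched_op\] estimated total (.+?) task duration: ([\d.]+) seconds")
GOT = re.compile(r"Scheduler request completed: got (\d+) new tasks")
BACKOFF = re.compile(r"\[work_fetch\] backing off (.+?) (\d+) sec")

SAMPLE = [
    "Sat 07 Dec 2019 12:57:32 AM PST | SETI@home | [sched_op] CPU work request: 505762.90 seconds; 0.00 devices",
    "Sat 07 Dec 2019 12:57:32 AM PST | SETI@home | [sched_op] NVIDIA GPU work request: 1637296.40 seconds; 0.00 devices",
    "Sat 07 Dec 2019 12:57:51 AM PST | SETI@home | Scheduler request completed: got 41 new tasks",
    "Sat 07 Dec 2019 12:57:51 AM PST | SETI@home | [sched_op] estimated total CPU task duration: 0 seconds",
    "Sat 07 Dec 2019 12:57:51 AM PST | SETI@home | [sched_op] estimated total NVIDIA GPU task duration: 1743 seconds",
    "Sat 07 Dec 2019 12:57:51 AM PST | SETI@home | [work_fetch] backing off CPU 1400 sec",
]


def summarize(lines):
    """Collect requested seconds, granted seconds, task count and backoffs."""
    summary = {"requested": {}, "granted": {}, "new_tasks": None, "backoff": {}}
    for line in lines:
        m = REQUEST.search(line)
        if m:
            summary["requested"][m.group(1)] = float(m.group(2))
            continue
        m = GRANTED.search(line)
        if m:
            summary["granted"][m.group(1)] = float(m.group(2))
            continue
        m = GOT.search(line)
        if m:
            summary["new_tasks"] = int(m.group(1))
            continue
        m = BACKOFF.search(line)
        if m:
            summary["backoff"][m.group(1)] = int(m.group(2))
    return summary


result = summarize(SAMPLE)
print(result["requested"], result["granted"], result["new_tasks"], result["backoff"])
```

On these sample lines the summary shows the pattern described in the post: CPU work was requested, all 41 new tasks were GPU work (0 seconds of CPU work granted), and the client then backed off CPU requests.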
Ghia
Joined: 7 Feb 17
Posts: 238
Credit: 28,911,438
RAC: 50
Norway
Message 2022045 - Posted: 7 Dec 2019, 9:26:51 UTC

I have 553 WUs in progress, 192 are for the CPU, the rest are for the GPU.
My allotment is 100+100, so I have work galore and uploads are working fine (so far.....)
I've set NNT and will just crunch away.
Could have been worse !
Humans may rule the world...but bacteria run it...
ID: 2022045
Jimbocous (Project Donor)
Volunteer tester
Joined: 1 Apr 13
Posts: 1856
Credit: 268,616,081
RAC: 1,349
United States
Message 2022047 - Posted: 7 Dec 2019, 9:38:33 UTC

Just tried backing the work cache down to .5 day with .5 additional, as suggested. We'll see if it has any effect.
At present, my Win box has 500 GPU tasks for 2x 750 Tis, and the little Linux box has 190 CPU and 900ish GPU.
Guess we'll see if limits throttle it.
Meanwhile, the Big Boy is on a lean diet and can't get fed.
ID: 2022047
Grant (SSSF)
Volunteer tester
Joined: 19 Aug 99
Posts: 13841
Credit: 208,696,464
RAC: 304
Australia
Message 2022049 - Posted: 7 Dec 2019, 9:40:32 UTC - in response to Message 2022047.  

Just tried backing the work cache down to .5 day with .5 additional, as suggested. We'll see if it has any effect.
Better yet, try 0.02 for the additional setting (the larger the additional number, the less frequently it asks for work).
Grant
Darwin NT
ID: 2022049
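
A toy model of the point about the additional-days setting in the post above. It assumes the client asks for work whenever its buffer drops below the "store at least" value and then refills to that value plus the "additional" value. That is a simplification (it ignores server-imposed delays, backoffs and task limits), and `count_work_requests` is a hypothetical helper, not a BOINC function.

```python
def count_work_requests(min_days, additional_days, sim_days=10, ticks_per_day=1000):
    """Count work requests over sim_days under the simplified refill model."""
    low = round(min_days * ticks_per_day)
    high = round((min_days + additional_days) * ticks_per_day)
    buffered = high
    requests = 0
    for _ in range(sim_days * ticks_per_day):
        buffered -= 1          # the host crunches one tick's worth of work
        if buffered < low:     # buffer fell below the minimum: ask for work
            buffered = high    # assume the server fills the request completely
            requests += 1
    return requests


print(count_work_requests(0.5, 0.5))   # 0.5 day + 0.5 additional
print(count_work_requests(0.5, 0.02))  # 0.5 day + 0.02 additional
```

Under these assumptions the 0.02 setting produces many more, smaller requests than the 0.5 setting, which matches the advice given: a larger additional value means the client asks for work less often.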
Jimbocous (Project Donor)
Volunteer tester
Joined: 1 Apr 13
Posts: 1856
Credit: 268,616,081
RAC: 1,349
United States
Message 2022051 - Posted: 7 Dec 2019, 9:53:04 UTC - in response to Message 2022049.  

Just tried backing the work cache down to .5 day with .5 additional, as suggested. We'll see if it has any effect.
Better yet, try 0.02 for the additional setting (the larger the additional number, the less frequently it asks for work).

Just waiting for a log file entry telling me I've reached a limit on tasks. Until then, all else seems irrelevant ... appears that the server isn't telling the client to quit asking for work as I think it normally does.
ID: 2022051
Bernie Vine
Volunteer moderator
Volunteer tester
Joined: 26 May 99
Posts: 9957
Credit: 103,452,613
RAC: 328
United Kingdom
Message 2022053 - Posted: 7 Dec 2019, 10:09:34 UTC
Last modified: 7 Dec 2019, 10:12:13 UTC

My three machines seem to have the amount of work actually requested, i.e. my Linux machine has around 800 tasks and is set for 1 day, and 800 is about what it gets through.

One Windows machine has 139 GPU and 109 CPU, and the other has 77 GPU and 62 CPU, again about correct for the cache I have set, so for me there is no real problem.
ID: 2022053
Jimbocous (Project Donor)
Volunteer tester
Joined: 1 Apr 13
Posts: 1856
Credit: 268,616,081
RAC: 1,349
United States
Message 2022055 - Posted: 7 Dec 2019, 10:17:00 UTC - in response to Message 2022053.  

My three machines seem to have the amount of work actually requested, i.e. my Linux machine has around 800 tasks and is set for 1 day, and 800 is about what it gets through.

One Windows machine has 139 GPU and 109 CPU, and the other has 77 GPU and 62 CPU, again about correct for the cache I have set, so for me there is no real problem.

Yes, it is looking like the limits are being respected. Will look a bit weird when I get up, I suspect. Near as I can tell, my 980s tend to do about 1000 MBs per day each, so I could end up with 5000 and 7000 GPU tasks respectively for the Linux boxes. Fortunately each machine has about 100 GB free disk space.
The Win box is no longer asking for GPU work, and has filled its CPU to a reasonable level.
ID: 2022055
Siran d'Vel'nahr
Volunteer tester
Joined: 23 May 99
Posts: 7379
Credit: 44,181,323
RAC: 238
United States
Message 2022062 - Posted: 7 Dec 2019, 11:00:47 UTC

Greetings,

I just set my older Linux PC to No New Tasks. This host should only get 200 WUs and it's got 600 in progress. Looks like something barfed all over the place. ;)

Have a great day! :)

Siran
CAPT Siran d'Vel'nahr - L L & P _\\//
Winders 11 OS? "What a piece of junk!" - L. Skywalker
"Logic is the cement of our civilization with which we ascend from chaos using reason as our guide." - T'Plana-hath
ID: 2022062
Richard Haselgrove (Project Donor)
Volunteer tester
Joined: 4 Jul 99
Posts: 14676
Credit: 200,643,578
RAC: 874
United Kingdom
Message 2022063 - Posted: 7 Dec 2019, 11:02:00 UTC - in response to Message 2022055.  

So, is everybody saying that "We've got what we wished for" - cache sizes controlled only by our own request preferences, not by a rigid server cap?

Use it wisely. I hope we don't come to regret this.
ID: 2022063
Siran d'Vel'nahr
Volunteer tester
Joined: 23 May 99
Posts: 7379
Credit: 44,181,323
RAC: 238
United States
Message 2022064 - Posted: 7 Dec 2019, 11:12:54 UTC - in response to Message 2022063.  

So, is everybody saying that "We've got what we wished for" - cache sizes controlled only by our own request preferences, not by a rigid server cap?

Use it wisely. I hope we don't come to regret this.

Hi Richard,

Are you saying that the powers-that-be undid the 100 max WUs?

It kinda doesn't make sense. My newer, faster Linux PC should have WAY more WUs than my older one then. It's got less than half what the older one has. I also noticed that the RTS is down under 100 WUs. I don't think the splitters are keeping up. ;)

Have a great day! :)

Siran
CAPT Siran d'Vel'nahr - L L & P _\\//
Winders 11 OS? "What a piece of junk!" - L. Skywalker
"Logic is the cement of our civilization with which we ascend from chaos using reason as our guide." - T'Plana-hath
ID: 2022064
Grant (SSSF)
Volunteer tester
Joined: 19 Aug 99
Posts: 13841
Credit: 208,696,464
RAC: 304
Australia
Message 2022067 - Posted: 7 Dec 2019, 11:19:48 UTC - in response to Message 2022063.  
Last modified: 7 Dec 2019, 11:25:30 UTC

I hope we don't come to regret this.
That's the million dollar question. Have we been given what we wished for, or is it just because something's broken?
If we have been given it, I would have thought it would be a case of gradually increasing the limits to make sure the system doesn't fall over in a screaming heap. To just remove them seems rather risky, especially considering how sluggish the system has been over the last few weeks (although now, without the Haveland graphs, we can't actually see that occurring anymore).
Grant
Darwin NT
ID: 2022067


 
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.