Message boards :
Number crunching :
Postponed: Waiting to acquire lock
Grant (SSSF) · Joined: 19 Aug 99 · Posts: 13731 · Credit: 208,696,464 · RAC: 304
> Crunching always uses a physical core, so never utilize more than the physical cores available.

Maybe on Bulldozer CPUs, but for my i7 I've always found that running with HyperThreading on results in more work being done than with it off, even when using all cores (physical & virtual).

Grant
Darwin NT
Zalster · Joined: 27 May 99 · Posts: 5517 · Credit: 528,817,460 · RAC: 242
Yup, hard to keep them straight, lol...
Mike · Joined: 17 Feb 01 · Posts: 34256 · Credit: 79,922,639 · RAC: 80
> Crunching always uses a physical core, so never utilize more than the physical cores available.

Certainly not.

With each crime and every kindness we birth our future.
juan BFP · Joined: 16 Mar 07 · Posts: 9786 · Credit: 572,710,851 · RAC: 3,799
> Crunching always uses a physical core, so never utilize more than the physical cores available.

So on my 6-core / 12-thread CPU, what is the right thing to do, since I already run 4 GPU WUs? Run 2 more CPU WUs, or keep running 4 as I do today? I already know 8 CPU WUs slow everything down. 6 CPU WUs work fine, but I never actually checked the times.
Mike · Joined: 17 Feb 01 · Posts: 34256 · Credit: 79,922,639 · RAC: 80
> Crunching always uses a physical core, so never utilize more than the physical cores available.

It might be worth trying 5 CPU tasks. Give me a notice and I will check your times again. Right now it looks perfect.

With each crime and every kindness we birth our future.
Grant (SSSF) · Joined: 19 Aug 99 · Posts: 13731 · Credit: 208,696,464 · RAC: 304
> I already know 8 CPU WUs slow everything down. 6 CPU WUs work fine, but I never actually checked the times.

On my i7 system (4 cores / 8 threads), running on all available threads does result in longer CPU run times; however, it results in more work being done per hour (like back in the day of CUDA on the GPU, running more than 1 WU: longer run times per WU, but more WUs done per hour).

However, as I mentioned, I have reserved 1 CPU core for every GPU WU that is being run, so no CPU cores are trying to process CPU work and support a GPU WU at the same time. I can see that causing significant processing & system responsiveness slowdowns.

My app_config.xml:

<app_config>
  <app>
    <name>setiathome_v8</name>
    <gpu_versions>
      <gpu_usage>1.00</gpu_usage>
      <cpu_usage>1.00</cpu_usage>
    </gpu_versions>
  </app>
  <app>
    <name>astropulse_v7</name>
    <gpu_versions>
      <gpu_usage>0.5</gpu_usage>
      <cpu_usage>1.0</cpu_usage>
    </gpu_versions>
  </app>
</app_config>

When running AP, 2 WUs at a time gives the best amount of work per hour, so it takes away a CPU WU core to support that extra GPU WU. When the AP WUs are all done, it goes back to crunching CPU WUs.

I also run Seti only. Running multiple projects, particularly if your system has the resources to run 2 of them at the same time, will result in much different system behavior than for me with just the one project.

Grant
Darwin NT
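[Editor's aside] The core-reservation arithmetic being discussed can be sketched in a few lines. This is a simplified model for illustration only (the function name is made up here, and BOINC's real scheduler is considerably more involved):

```python
# Simplified sketch of how per-GPU-task CPU reservation works out in
# practice -- an illustration of the arithmetic, not the client's actual
# scheduler code.
import math

def cpu_tasks_runnable(logical_cpus, gpu_tasks, cpu_usage_per_gpu_task):
    """CPU task slots left after reserving cores for running GPU tasks."""
    reserved = gpu_tasks * cpu_usage_per_gpu_task
    return max(0, logical_cpus - math.ceil(reserved))

# Grant's i7 (4C/8T) with two Astropulse GPU tasks, each reserving a full core:
print(cpu_tasks_runnable(8, 2, 1.0))   # -> 6

# juan's 6C/12T host with four GPU tasks, each reserving a full core:
print(cpu_tasks_runnable(12, 4, 1.0))  # -> 8
```

The point of the reservation is simply that each running GPU task gets a whole logical core to itself, so no core is splitting time between feeding a GPU and crunching a CPU WU.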
juan BFP · Joined: 16 Mar 07 · Posts: 9786 · Credit: 572,710,851 · RAC: 3,799
> It might be worth trying 5 CPU tasks.

Changed to 5 now. It takes about 1 hr to crunch each CPU WU, but it seems the new blc5 type crunches in less time.
Grant (SSSF) · Joined: 19 Aug 99 · Posts: 13731 · Credit: 208,696,464 · RAC: 304
> It might be worth trying 5 CPU tasks.

The best way to check is to compare the Run time to the CPU time for a given CPU WU. At present, the difference for your CPU WUs is around 3 seconds. Mine is up to 3 minutes. If you've got more than 10 min difference on many WUs, the system is showing signs of being over-committed. If it's 15 min or more, it's over-committed. Reserve cores for the GPU, or just use fewer. My personal preference: reserve cores and use them all.

Grant
Darwin NT
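[Editor's aside] The Run time vs CPU time check described above can be automated against the client's job log. A hedged sketch: it assumes the `job_log_*.txt` line layout (`... ct <cpu_time> ... nm <task_name> et <elapsed_time> ...`), which may differ between client versions, and the helper name is made up here:

```python
# Sketch: flag tasks whose elapsed (run) time diverges from CPU time by
# more than a threshold, as a sign of an over-committed host.  Assumes the
# job log's "ct ... nm ... et ..." field layout; treat as illustrative.
def overcommit_report(lines, warn_seconds=600):
    flagged = []
    for line in lines:
        parts = line.split()
        fields = {}
        for key in ("ct", "nm", "et"):
            if key in parts:
                fields[key] = parts[parts.index(key) + 1]
        if not {"ct", "nm", "et"} <= fields.keys():
            continue  # malformed or truncated line
        gap = float(fields["et"]) - float(fields["ct"])
        if gap > warn_seconds:
            flagged.append((fields["nm"], round(gap)))
    return flagged

# Two made-up job-log lines: one healthy task, one with a 1300 s gap.
sample = [
    "1514764800 ue 9000 ct 3580.2 fe 3e13 nm blc5_wu_1 et 3583.9 es 0",
    "1514768400 ue 9000 ct 3100.0 fe 3e13 nm blc5_wu_2 et 4400.0 es 0",
]
print(overcommit_report(sample))  # -> [('blc5_wu_2', 1300)]
```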
Mike · Joined: 17 Feb 01 · Posts: 34256 · Credit: 79,922,639 · RAC: 80
> It might be worth trying 5 CPU tasks.

Please don't give wrong advice. Your CPU times on the i7 are slower than they were on my FX. If you are happy with the way you are running the host, that's fine, but you should know by now that I know what I'm talking about.

With each crime and every kindness we birth our future.
juan BFP · Joined: 16 Mar 07 · Posts: 9786 · Credit: 572,710,851 · RAC: 3,799
This is my app_config file. I believe this already reserves the cores, or no?

<app_config>
  <project_max_concurrent>9</project_max_concurrent>
  <app_version>
    <app_name>setiathome_v8</app_name>
    <plan_class>cuda90</plan_class>
    <avg_ncpus>1.0</avg_ncpus>
    <ngpus>1.0</ngpus>
    <cmdline>-unroll 15 -pfb 16 -pfp 600 -nobs</cmdline>
  </app_version>
  <app_version>
    <app_name>astropulse_v7</app_name>
    <plan_class>opencl_nvidia_100</plan_class>
    <avg_ncpus>1.0</avg_ncpus>
    <ngpus>1.0</ngpus>
    <cmdline>-use_sleep -unroll 15 -sbs 256 -ffa_block 12288 -ffa_block_fetch 6144</cmdline>
  </app_version>
</app_config>

[edit] Just remembering: my host has 4 GPUs.
Mike · Joined: 17 Feb 01 · Posts: 34256 · Credit: 79,922,639 · RAC: 80
Don't change anything for now.

With each crime and every kindness we birth our future.
juan BFP · Joined: 16 Mar 07 · Posts: 9786 · Credit: 572,710,851 · RAC: 3,799
> Don't change anything for now.

By your command. LOL. Running 4 GPU + 5 CPU WUs.

Off topic: blc5 type WUs are crunched in 2:21 min on the GPU and in 45 min on the CPU. I like that.
Grant (SSSF) · Joined: 19 Aug 99 · Posts: 13731 · Credit: 208,696,464 · RAC: 304
> Please don't give wrong advice.

I'm aware of the work you have done; I'm also aware of the number of big crunchers that run more than 1 WU at a time on their GPUs. Yet my personal experience on my system is that running more than 1 GPU WU at a time doesn't result in more work per hour, and with HyperThreading on, and using all the cores, I get more work done per hour than with HyperThreading off, with faster WU run times. True, I haven't tried HyperThreading off with more than 1 GPU WU, but it works & is stable, so I'm not going to fiddle further with it.

Grant
Darwin NT
Keith Myers · Joined: 29 Apr 01 · Posts: 13164 · Credit: 1,160,866,277 · RAC: 1,873
Some of the confusion comes from how the AMD Bulldozer/Piledriver CPU was arranged. It was arguably a 4-core / 8-thread processor. The virtual cores, or threads, had to share time accessing the single FPU in each (2-core) module, so trying to run a CPU thread on the virtual core slowed CPU processing down for both threads in the module. For FX processors it's best to limit CPU usage to only the 4 physical cores for CPU tasks. You can let the virtual threads support any GPU tasks, as a GPU task does not need any math functions from the CPU; the math work is done on the GPU.

With Ryzen, everything has changed. The current AMD architecture bears more resemblance to Intel's with respect to hyperthreading. Each thread has access to its own FPU, so you don't see the slowdown with CPU tasks on a virtual core like you did with FX.

Still, as has been pointed out by Mike and Grant, ideally you should keep cpu_time and run_time equal or very close. If the times diverge by a large amount, it shows the processor is over-committed and is not running the tasks as efficiently as possible.

Seti@Home classic workunits: 20,676 · CPU time: 74,226 hours
A proud member of the OFA (Old Farts Association)
Mike · Joined: 17 Feb 01 · Posts: 34256 · Credit: 79,922,639 · RAC: 80
> Please don't give wrong advice.

Nobody said you should.

With each crime and every kindness we birth our future.
Jeff Buck · Joined: 11 Feb 00 · Posts: 1441 · Credit: 148,764,870 · RAC: 0
Juan, it looks like yesterday afternoon you also had a period of tasks with the "Can't acquire lockfile" messages. Take a look at these 6 tasks while they're still on the server:

https://setiathome.berkeley.edu/result.php?resultid=6285905907
https://setiathome.berkeley.edu/result.php?resultid=6285905928
https://setiathome.berkeley.edu/result.php?resultid=6286130580
https://setiathome.berkeley.edu/result.php?resultid=6286157029
https://setiathome.berkeley.edu/result.php?resultid=6286176312
https://setiathome.berkeley.edu/result.php?resultid=6286204543

The problem started at about 16:15:55 local time and cleared shortly after 17:27:51 local time. I suspect that if you'd let your other 4 tasks continue this morning, they might have eventually cleared, also.

What I would suggest is that you first review your BOINC Event Log for that time period and see if you can determine what might have occurred around 16:15 to trigger the issue. If there's nothing unusual there, then perhaps a system log might tell you something. Or perhaps you can recall running some sort of resource-hogging application on that machine about that time. If you can pinpoint a cause, then look for the same sort of thing when the problem cropped up again this morning. Good luck!
Grant (SSSF) · Joined: 19 Aug 99 · Posts: 13731 · Credit: 208,696,464 · RAC: 304
> True, I haven't tried HyperThreading off and more than 1 GPU WU, but it works & is stable so I'm not going to fiddle further with it.

I know that, but I was just pointing out it was the one thing that I haven't tried (that, and limiting the number of cores used, so that makes it 2 things I haven't tried) to get the same results that others say they are getting.

Grant
Darwin NT
Richard Haselgrove · Joined: 4 Jul 99 · Posts: 14650 · Credit: 200,643,578 · RAC: 874
Wow, you've been busy while I've been out supping a pint or two. Anyone actually matching symptoms to solutions?
juan BFP · Joined: 16 Mar 07 · Posts: 9786 · Credit: 572,710,851 · RAC: 3,799
> What I would suggest is that you first review your BOINC Event Log for that time period

Slowly... what is the file name of this?

> Or, perhaps you can recall running some sort of resource hogging application on that machine about that time. If you can pinpoint a cause, then look for the same sort of thing when the problem cropped up again this morning.

Almost sure not, unless something is running without my knowledge, of course. When I use this host, it's just to browse, besides these forums and the S@H site itself, some very light sites, like reading a newspaper. I don't run anything else on the host, mainly because I haven't learned how to run more things on this Linux host.
Jeff Buck · Joined: 11 Feb 00 · Posts: 1441 · Credit: 148,764,870 · RAC: 0
> Wow, you've been busy while I've been out supping a pint or two. Anyone actually matching symptoms to solutions?

Didn't bring back any to share, huh? ;^)

Well, after discovering that Juan's machine experienced an earlier episode with the same symptoms, I'm thinking that maybe my "overcommitted resources" theory might be back in play. When I was looking at Bill G's earlier problem a couple of months ago, I discovered from an old Process Monitor log that the BOINC client polls all the slots every 5 minutes checking for lockfiles. What those logs don't tell me is how long the client waits for a response from the OS before timing out and deciding that it can't acquire a lockfile in a particular slot. Unless somebody knows that off the top of their head, I'd say it would probably require an experienced code walker to ferret out that info. Know anybody like that?
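[Editor's aside] For readers wondering what "acquiring a lockfile" on a slot involves: below is a generic POSIX sketch of the non-blocking advisory-lock pattern. It is illustrative only, not the BOINC client's actual code (BOINC may use fcntl/lockf rather than flock, and the path and helper name here are made up); the point is that a non-blocking attempt either succeeds or fails immediately, with no OS-level wait to time out:

```python
# Generic sketch of a non-blocking advisory lockfile: the first caller to
# lock the file wins, and any later attempt fails at once instead of
# waiting.  Linux flock() treats locks taken through different file
# descriptors as conflicting, even within one process.
import fcntl
import os

def try_acquire_lock(path):
    """Return an open fd holding an exclusive lock, or None if already held."""
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o644)
    try:
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)  # non-blocking attempt
        return fd
    except OSError:          # lock already held elsewhere
        os.close(fd)
        return None

first = try_acquire_lock("/tmp/slot0.lock")
print(first is not None)     # -> True: first acquisition succeeds
second = try_acquire_lock("/tmp/slot0.lock")
print(second is None)        # -> True: second attempt fails without waiting
```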
©2024 University of California
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.