Postponed: Waiting to acquire lock

Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 13731
Credit: 208,696,464
RAC: 304
Australia
Message 1910926 - Posted: 5 Jan 2018, 21:57:06 UTC - in response to Message 1910920.  

Crunching always uses a physical core, so never utilize more than the physical cores available.
It will slow down significantly.

Maybe on Bulldozer CPUs, but on my i7 I've always found that running with HyperThreading on results in more work being done than with it off, even when using all cores (physical & virtual).
Grant
Darwin NT
ID: 1910926
Zalster (Special Project $250 donor)
Volunteer tester
Joined: 27 May 99
Posts: 5517
Credit: 528,817,460
RAC: 242
United States
Message 1910927 - Posted: 5 Jan 2018, 21:59:25 UTC - in response to Message 1910921.  

Yup, hard to keep them straight, lol...
ID: 1910927
Mike (Special Project $75 donor)
Volunteer tester
Joined: 17 Feb 01
Posts: 34256
Credit: 79,922,639
RAC: 80
Germany
Message 1910930 - Posted: 5 Jan 2018, 22:05:42 UTC - in response to Message 1910926.  

Crunching always uses a physical core, so never utilize more than the physical cores available.
It will slow down significantly.

Maybe on Bulldozer CPUs, but on my i7 I've always found that running with HyperThreading on results in more work being done than with it off, even when using all cores (physical & virtual).


Certainly not.


With each crime and every kindness we birth our future.
ID: 1910930
juan BFP (Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor)
Volunteer tester
Joined: 16 Mar 07
Posts: 9786
Credit: 572,710,851
RAC: 3,799
Panama
Message 1910932 - Posted: 5 Jan 2018, 22:07:22 UTC - in response to Message 1910920.  

Crunching always uses a physical core, so never utilize more than the physical cores available.
It will slow down significantly.
Maybe it's possible to feed the GPUs with the available threads on modern CPUs.

So on my 6-core / 12-thread CPU, what is the right thing to do, since I already run 4 GPU WUs? Run 2 more CPU WUs, or keep running 4 as I do today?
I already know 8 CPU WUs slow everything down. 6 CPU WUs work fine, but I never actually checked the times.
ID: 1910932
Mike (Special Project $75 donor)
Volunteer tester
Joined: 17 Feb 01
Posts: 34256
Credit: 79,922,639
RAC: 80
Germany
Message 1910934 - Posted: 5 Jan 2018, 22:13:57 UTC - in response to Message 1910932.  

Crunching always uses a physical core, so never utilize more than the physical cores available.
It will slow down significantly.
Maybe it's possible to feed the GPUs with the available threads on modern CPUs.

So on my 6-core / 12-thread CPU, what is the right thing to do, since I already run 4 GPU WUs? Run 2 more CPU WUs, or keep running 4 as I do today?
I already know 8 CPU WUs slow everything down. 6 CPU WUs work fine, but I never actually checked the times.


It might be worth trying 5 CPU tasks.
Let me know and I will check your times again.
Right now it looks perfect.


With each crime and every kindness we birth our future.
ID: 1910934
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 13731
Credit: 208,696,464
RAC: 304
Australia
Message 1910936 - Posted: 5 Jan 2018, 22:16:37 UTC - in response to Message 1910932.  
Last modified: 5 Jan 2018, 22:18:04 UTC

I already know 8 CPU WUs slow everything down. 6 CPU WUs work fine, but I never actually checked the times.

On my i7 system (4 cores / 8 threads), running on all available threads does result in longer CPU run times, but it also results in more work being done per hour (much like back in the CUDA days on the GPU, when running more than 1 WU gave longer run times per WU, but more WUs done per hour).
However, as I mentioned, I have reserved 1 CPU core for every GPU WU being run, so no CPU core is trying to process CPU work and support a GPU WU at the same time. I can see that causing significant slowdowns in processing & system responsiveness.

My app_config.xml
<app_config>
 <app>
  <name>setiathome_v8</name>
  <gpu_versions>
   <!-- 1 task per GPU, with 1 full CPU core reserved to support it -->
   <gpu_usage>1.00</gpu_usage>
   <cpu_usage>1.00</cpu_usage>
  </gpu_versions>
 </app>
 <app>
  <name>astropulse_v7</name>
  <gpu_versions>
   <!-- 0.5 GPU per task = 2 AP tasks per GPU, each reserving a full CPU core -->
   <gpu_usage>0.5</gpu_usage>
   <cpu_usage>1.0</cpu_usage>
  </gpu_versions>
 </app>
</app_config>


When running AP, 2 WUs at a time gives the most work per hour, so the config takes a core away from CPU work to support that extra GPU WU. When the AP WUs are all done, that core goes back to crunching CPU WUs.

I also run Seti only.
Running multiple projects, particularly if your system has the resources to run 2 of them at the same time, will result in very different system behaviour from mine with just the one project.
Grant
Darwin NT
ID: 1910936
juan BFP (Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor)
Volunteer tester
Joined: 16 Mar 07
Posts: 9786
Credit: 572,710,851
RAC: 3,799
Panama
Message 1910938 - Posted: 5 Jan 2018, 22:21:45 UTC - in response to Message 1910934.  
Last modified: 5 Jan 2018, 22:24:05 UTC

It might be worth trying 5 CPU tasks.

Changed to 5 now. It takes about 1 hr to crunch each CPU WU, but it seems the new blc5-type WUs crunch in less time.
ID: 1910938
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 13731
Credit: 208,696,464
RAC: 304
Australia
Message 1910941 - Posted: 5 Jan 2018, 22:25:41 UTC - in response to Message 1910938.  
Last modified: 5 Jan 2018, 22:26:08 UTC

It might be worth trying 5 CPU tasks.

Changed to 5 now. It takes about 1 hr to crunch each WU.

The best way to check is to compare the Run time to the CPU time for a given CPU WU.
At present, the difference for your CPU WUs is around 3 seconds. Mine is up to 3 minutes. If you've got more than 10 min of difference on many WUs, the system is showing signs of being overcommitted. If it's 15 min or more, it is overcommitted.
Reserve cores for the GPUs, or just use fewer cores. My personal preference: reserve cores and use them all.
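
To put rough numbers on that check, here's a minimal sketch (the task names and times below are made up for illustration, not from any real host):

# Sketch of the Run time vs. CPU time check described above.
def check_overcommit(tasks):
    # tasks: list of (name, run_time_s, cpu_time_s) for CPU work units
    for name, run_s, cpu_s in tasks:
        gap_min = (run_s - cpu_s) / 60.0
        if gap_min >= 15:
            status = "overcommitted"
        elif gap_min > 10:
            status = "showing signs of overcommitment"
        else:
            status = "OK"
        print(f"{name}: run-cpu gap {gap_min:.1f} min -> {status}")

check_overcommit([
    ("wu_a", 3600, 3597),   # 3 s gap: healthy
    ("wu_b", 3600, 2520),   # 18 min gap: overcommitted
])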
Grant
Darwin NT
ID: 1910941
Mike (Special Project $75 donor)
Volunteer tester
Joined: 17 Feb 01
Posts: 34256
Credit: 79,922,639
RAC: 80
Germany
Message 1910944 - Posted: 5 Jan 2018, 22:30:16 UTC - in response to Message 1910941.  
Last modified: 5 Jan 2018, 22:31:32 UTC

It might be worth trying 5 CPU tasks.

Changed to 5 now. It takes about 1 hr to crunch each WU.

The best way to check is to compare the Run time to the CPU time for a given CPU WU.
At present, the difference for your CPU WUs is around 3 seconds. Mine is up to 3 minutes. If you've got more than 10 min of difference on many WUs, the system is showing signs of being overcommitted. If it's 15 min or more, it is overcommitted.
Reserve cores for the GPUs, or just use fewer cores. My personal preference: reserve cores and use them all.


Please don't give wrong advice.
Your CPU times on the i7 are slower than they were on my FX.
If you are happy with the way you are running the host, that's fine, but you should know by now that I know what I'm talking about.


With each crime and every kindness we birth our future.
ID: 1910944
juan BFP (Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor)
Volunteer tester
Joined: 16 Mar 07
Posts: 9786
Credit: 572,710,851
RAC: 3,799
Panama
Message 1910946 - Posted: 5 Jan 2018, 22:30:43 UTC
Last modified: 5 Jan 2018, 22:32:49 UTC

This is my app_config file. I believe this already reserves the cores, no?

<app_config>
 <project_max_concurrent>9</project_max_concurrent>
 <app_version>
    <app_name>setiathome_v8</app_name>
    <plan_class>cuda90</plan_class>
    <!-- budget one full CPU thread to support each GPU task -->
    <avg_ncpus>1.0</avg_ncpus>
    <ngpus>1.0</ngpus>
    <cmdline>-unroll 15 -pfb 16 -pfp 600 -nobs</cmdline>
 </app_version>
 <app_version>
    <app_name>astropulse_v7</app_name>
    <plan_class>opencl_nvidia_100</plan_class>
    <avg_ncpus>1.0</avg_ncpus>
    <ngpus>1.0</ngpus>
    <cmdline>-use_sleep -unroll 15 -sbs 256 -ffa_block 12288 -ffa_block_fetch 6144</cmdline>
 </app_version>
</app_config>


<edit> Just remembering: my host has 4 GPUs.
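(For what it's worth, assuming stock BOINC client accounting: <avg_ncpus>1.0</avg_ncpus> budgets one full CPU thread per GPU task, so 4 GPU tasks reserve 4 of the 12 threads, and <project_max_concurrent>9</project_max_concurrent> then caps the mix at 4 GPU + 5 CPU tasks.)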
ID: 1910946
Mike (Special Project $75 donor)
Volunteer tester
Joined: 17 Feb 01
Posts: 34256
Credit: 79,922,639
RAC: 80
Germany
Message 1910948 - Posted: 5 Jan 2018, 22:32:16 UTC

Don't change anything for now.


With each crime and every kindness we birth our future.
ID: 1910948
juan BFP (Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor)
Volunteer tester
Joined: 16 Mar 07
Posts: 9786
Credit: 572,710,851
RAC: 3,799
Panama
Message 1910951 - Posted: 5 Jan 2018, 22:35:10 UTC
Last modified: 5 Jan 2018, 22:35:50 UTC

Don't change anything for now.


By your command. LOL

Running 4 GPU + 5 CPU WUs.

Off topic: blc5-type WUs are crunched in 2:21 min on the GPU and in 45 min on the CPU. I like that.
ID: 1910951
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 13731
Credit: 208,696,464
RAC: 304
Australia
Message 1910952 - Posted: 5 Jan 2018, 22:36:59 UTC - in response to Message 1910944.  

Please don't give wrong advice.
Your CPU times on the i7 are slower than they were on my FX.
If you are happy with the way you are running the host, that's fine, but you should know by now that I know what I'm talking about.

I'm aware of the work you have done; I'm also aware of the number of big crunchers that run more than 1 WU at a time on their GPUs.
Yet my personal experience on my system is that running more than 1 GPU WU at a time doesn't result in more work per hour, and with HyperThreading on and all cores in use, I get more work done per hour than with HyperThreading off, despite the faster per-WU run times.
True, I haven't tried HyperThreading off with more than 1 GPU WU, but the current setup works & is stable, so I'm not going to fiddle with it further.
Grant
Darwin NT
ID: 1910952
Keith Myers (Special Project $250 donor)
Volunteer tester
Joined: 29 Apr 01
Posts: 13164
Credit: 1,160,866,277
RAC: 1,873
United States
Message 1910963 - Posted: 5 Jan 2018, 22:50:13 UTC - in response to Message 1910952.  
Last modified: 5 Jan 2018, 22:52:00 UTC

Some of the confusion comes from how the AMD Bulldozer/Piledriver CPU was arranged. It was arguably a 4-core / 8-thread processor. The two threads in each (2-core) module had to share time accessing the module's single FPU, so trying to run a CPU task on a virtual core slowed CPU processing down for both threads in the module. For FX processors it's best to limit CPU task usage to the 4 physical cores only. You can let the virtual threads support any GPU tasks, since a GPU task does not need any math functions from the CPU; the math work is done on the GPU.

With Ryzen, everything has changed. The current AMD architecture bears more resemblance to Intel's with respect to hyperthreading: each thread has access to its own FPU, so you don't see the slowdown with CPU tasks on a virtual core like you did with FX.

Still, as has been pointed out by Mike and Grant, ideally you should keep cpu_time and run_time equal or very close. If the times diverge by a large amount, it shows the processor is overcommitted and is not running the tasks as efficiently as possible.
Seti@Home classic workunits: 20,676 · CPU time: 74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 1910963
Mike (Special Project $75 donor)
Volunteer tester
Joined: 17 Feb 01
Posts: 34256
Credit: 79,922,639
RAC: 80
Germany
Message 1910964 - Posted: 5 Jan 2018, 22:50:39 UTC - in response to Message 1910952.  

Please don't give wrong advice.
Your CPU times on the i7 are slower than they were on my FX.
If you are happy with the way you are running the host, that's fine, but you should know by now that I know what I'm talking about.

I'm aware of the work you have done; I'm also aware of the number of big crunchers that run more than 1 WU at a time on their GPUs.
Yet my personal experience on my system is that running more than 1 GPU WU at a time doesn't result in more work per hour, and with HyperThreading on and all cores in use, I get more work done per hour than with HyperThreading off, despite the faster per-WU run times.
True, I haven't tried HyperThreading off with more than 1 GPU WU, but the current setup works & is stable, so I'm not going to fiddle with it further.


Nobody said you should.


With each crime and every kindness we birth our future.
ID: 1910964
Jeff Buck (Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor)
Volunteer tester
Joined: 11 Feb 00
Posts: 1441
Credit: 148,764,870
RAC: 0
United States
Message 1910966 - Posted: 5 Jan 2018, 22:52:01 UTC

Juan, it looks like yesterday afternoon you also had a period of tasks with the "Can't acquire lockfile" messages. Take a look at these 6 tasks while they're still on the server:

https://setiathome.berkeley.edu/result.php?resultid=6285905907
https://setiathome.berkeley.edu/result.php?resultid=6285905928
https://setiathome.berkeley.edu/result.php?resultid=6286130580
https://setiathome.berkeley.edu/result.php?resultid=6286157029
https://setiathome.berkeley.edu/result.php?resultid=6286176312
https://setiathome.berkeley.edu/result.php?resultid=6286204543

The problem started at about 16:15:55 local time and cleared shortly after 17:27:51 local time. I suspect that if you'd let your other 4 tasks continue this morning, they might eventually have cleared, too.

What I would suggest is that you first review your BOINC Event Log for that time period and see if you can determine what might have occurred around 16:15 to trigger the issue. If there's nothing unusual there, then perhaps a system log might tell you something. Or perhaps you can recall running some sort of resource-hogging application on that machine around that time. If you can pinpoint a cause, then look for the same sort of thing from when the problem cropped up again this morning. Good luck!
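
One way to pull out just that window, as a rough sketch (the log file name and the "DD-MMM-YYYY HH:MM:SS" timestamp format below are assumptions; adjust them to wherever and however your client actually logs):

# Filter BOINC event-log lines to the time window of interest.
from datetime import datetime

LOG = "stdoutdae.txt"   # assumed file name; adjust for your install
START = datetime(2018, 1, 5, 16, 0, 0)
END = datetime(2018, 1, 5, 17, 30, 0)

with open(LOG, errors="replace") as f:
    for line in f:
        try:
            # assumes lines start with e.g. "05-Jan-2018 16:15:55"
            stamp = datetime.strptime(line[:20], "%d-%b-%Y %H:%M:%S")
        except ValueError:
            continue  # not a timestamped line
        if START <= stamp <= END:
            print(line.rstrip())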
ID: 1910966
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 13731
Credit: 208,696,464
RAC: 304
Australia
Message 1910971 - Posted: 5 Jan 2018, 22:57:15 UTC - in response to Message 1910964.  

True, I haven't tried HyperThreading off with more than 1 GPU WU, but the current setup works & is stable, so I'm not going to fiddle with it further.

Nobody said you should.

I know that, but I was just pointing out that it was the one thing I haven't tried (that, and limiting the number of cores used, so that makes it 2 things I haven't tried) to get the same results that others say they are getting.
Grant
Darwin NT
ID: 1910971
Richard Haselgrove (Project Donor)
Volunteer tester
Joined: 4 Jul 99
Posts: 14650
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1910974 - Posted: 5 Jan 2018, 22:59:01 UTC

Wow, you've been busy while I've been out supping a pint or two. Anyone actually matching symptoms to solutions?
ID: 1910974
juan BFP (Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor)
Volunteer tester
Joined: 16 Mar 07
Posts: 9786
Credit: 572,710,851
RAC: 3,799
Panama
Message 1910982 - Posted: 5 Jan 2018, 23:09:02 UTC - in response to Message 1910966.  
Last modified: 5 Jan 2018, 23:09:53 UTC

What I would suggest is that you first review your BOINC Event Log for that time period

Slowly... what is the file name of this?

Or, perhaps you can recall running some sort of resource hogging application on that machine about that time. If you can pinpoint a cause, then look for the same sort of thing when the problem cropped up again this morning.

Almost surely not, unless something is running without my knowledge, of course. When I use this host it's just to browse; besides these forums and the S@H site itself, only some very light sites, like reading a newspaper. I don't run anything else on the host, mainly because I haven't learned how to run more things on this Linux host.
ID: 1910982
Jeff Buck (Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor)
Volunteer tester
Joined: 11 Feb 00
Posts: 1441
Credit: 148,764,870
RAC: 0
United States
Message 1910985 - Posted: 5 Jan 2018, 23:12:12 UTC - in response to Message 1910974.  

Wow, you've been busy while I've been out supping a pint or two. Anyone actually matching symptoms to solutions?
Didn't bring back any to share, huh? ;^)

Well, after discovering that Juan's machine experienced an earlier episode with the same symptoms, I'm thinking that maybe my "overcommitted resources" theory might be back in play. When I was looking at Bill G's earlier problem a couple of months ago, I discovered from an old Process Monitor log that the BOINC client polls all the slots every 5 minutes, checking for lockfiles. What those logs don't tell me is how long the client waits for a response from the OS before timing out and deciding that it can't acquire a lockfile in a particular slot. Unless somebody knows that off the top of their head, I'd say it would probably take an experienced code walker to ferret out that info. Know anybody like that?
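
For anyone inclined to poke at it, the mechanism in question looks roughly like this (a minimal non-blocking lockfile sketch; BOINC's client is C++, the slot path and lockfile name here are assumptions, and the exact wait/timeout behavior is precisely the open question):

# Sketch of non-blocking lockfile acquisition on a slot directory.
import fcntl, os

def try_acquire_lock(slot_dir):
    # Returns an open fd holding the lock, or None if another process has it.
    path = os.path.join(slot_dir, "boinc_lockfile")  # assumed name
    fd = os.open(path, os.O_CREAT | os.O_RDWR)
    try:
        fcntl.lockf(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)  # fail fast, don't block
        return fd
    except OSError:
        os.close(fd)
        return None  # the "Can't acquire lockfile" case

if try_acquire_lock("slots/0") is None:
    print("Postponed: waiting to acquire lock")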
ID: 1910985