Setting up Linux to crunch CUDA90 and above for Windows users
Stephen "Heretic" Send message Joined: 20 Sep 12 Posts: 5557 Credit: 192,787,363 RAC: 628 |
. . Being impatient I decided to try the rig with the 2 x1060s (because it was easier). Strangely there is very little noticeable effect. If anything run times may be a few seconds quicker without it. What exactly does -bs do? With it off CPU use is slightly lower, runtimes are maybe a few seconds faster, GPU temps are up a degree or 3. It seems to be using the GPUs more and the CPU less with it off. Stephen ?? |
Stephen "Heretic" Send message Joined: 20 Sep 12 Posts: 5557 Credit: 192,787,363 RAC: 628 |
I'm sure a lot of people would like to know if and how it effects the performance. . . I have now repeated the change on the Core2 Duo with the GTX1050ti. The results are more definite on this rig. There is a clear reduction in run times. I will have to check again over the next few days with different batches of tasks to confirm it is not random, but it seems clear that these rigs like it better with -bs off, and that's no bs. Stephen :) |
petri33 · Joined: 6 Jun 02 · Posts: 1668 · Credit: 623,086,772 · RAC: 156
Thanks Stephen,
When set, -bs tells the CUDA driver that the CPU side of the code should wait for the GPU to finish its tasks in blocking-sync mode. That should reduce CPU usage and increase run time a bit, and the GPU will run less hot. Blocking sync allows the process to go idle while waiting. Without -bs the CPU spin-loops, actively waiting for the GPU to finish its tasks. That should be quicker, use more CPU, and push the GPU harder because it is getting more work done. So your measurements confirm that without -bs it is faster and produces more heat on the GPU, but it is strange that the CPU is less heat stressed. One explanation could be that you have all CPU threads/cores running setiMBCPU, and the active spin-loop forces the math-intensive setiMBCPU to run less (slowing it down).
Petri
To overcome Heisenbergs: "You can't always get what you want / but if you try sometimes you just might find / you get what you need." -- Rolling Stones
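For readers who want to see what that boils down to at the CUDA runtime level, here is a minimal sketch of the two wait modes Petri describes. This is not the special app's actual code; the kernel, the file name and the argc test standing in for a -bs flag are purely illustrative, but the cudaSetDeviceFlags() scheduling flags are the real runtime mechanism behind blocking sync versus spin waiting.

// blocking_vs_spin.cu - hypothetical illustration, not the SETI app's code.
// Build with: nvcc blocking_vs_spin.cu -o blocking_vs_spin
#include <cstdio>
#include <cuda_runtime.h>

__global__ void busywork(float *x, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) x[i] = sqrtf(x[i] + 1.0f);
}

int main(int argc, char **argv) {
    bool blocking = (argc > 1);   // pretend "-bs" was passed on the command line

    // Must be called before the CUDA context is created (i.e. before any other CUDA call).
    // Blocking sync: the CPU thread sleeps inside the driver until the GPU finishes.
    // Spin: the CPU thread burns a full core polling, but picks up completion slightly sooner.
    cudaSetDeviceFlags(blocking ? cudaDeviceScheduleBlockingSync
                                : cudaDeviceScheduleSpin);

    const int n = 1 << 20;
    float *d = nullptr;
    cudaMalloc(&d, n * sizeof(float));
    cudaMemset(d, 0, n * sizeof(float));

    busywork<<<(n + 255) / 256, 256>>>(d, n);
    cudaDeviceSynchronize();      // waits according to the scheduling flag set above
    std::printf("done (%s)\n", blocking ? "blocking sync" : "spin wait");

    cudaFree(d);
    return 0;
}

Watching this toy program in top with and without the flag shows the same CPU-usage difference being discussed here, just without the science.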
MarkJ · Joined: 17 Feb 08 · Posts: 1139 · Credit: 80,854,192 · RAC: 5
Rather than using an outdated Ubuntu, this is what I used with Debian. It will get you the latest Debian (Jessie), the latest kernel (4.9) and the 7.6.33 BOINC client. It won't give you the CUDA80 app, but you should be up and running with a CUDA-capable machine after doing this.

Part 1 - Install Debian
I used the Debian 8.7 net install for this. You'll need a thumb drive or a blank CD. Download it from http://www.debian.org/distrib/ and write the ISO image to CD or thumb drive. Boot off the thumb drive or CD; it will start the Debian installer. Install Debian, selecting the SSH server and whatever desktop you prefer, and remove all other selections. Once done it will reboot.

Part 2 - Install Nvidia software
Log in as root, open an xterm window and type the following commands:
cd /etc/apt
nano sources.list (nano is a text editor)
Change the "jessie main" lines to "jessie main contrib non-free" and add a jessie-backports line. It should look like this when you're done. I'm using httpredir as it will pick the fastest server.
deb http://httpredir.debian.org/debian/ jessie main contrib non-free
deb http://security.debian.org/ jessie/updates main contrib non-free
deb http://httpredir.debian.org/debian/ jessie-updates main contrib non-free
deb http://httpredir.debian.org/debian/ jessie-backports main contrib non-free
Save the file and exit nano (Control-O to write, then Control-X), then:
apt update
apt install -t jessie-backports firmware-realtek (if needed)
apt install -t jessie-backports linux-image-amd64
apt install -t jessie-backports nvidia-kernel-dkms nvidia-smi nvidia-xconfig
apt install -t jessie-backports nvidia-opencl-icd (if you want OpenCL support)
nvidia-xconfig
sync
reboot

Part 3 - Install BOINC
Log in as root, start xterm again and type the following commands:
apt install -t jessie-backports boinc-nvidia-cuda boinc-manager
I got the CUDA libraries that Petri posted earlier in this thread and put them in the /var/lib/boinc-client directory. They may not be needed. Make sure they're marked as executable and are owned by user boinc (do a chown boinc:boinc lib* command).
sync
reboot
BOINC blog
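If you want to sanity-check the driver install before pointing BOINC at it, a small device-query program is one way to do it. This is not part of MarkJ's guide: it assumes you additionally install nvcc (for example from the nvidia-cuda-toolkit package, which the BOINC app itself does not need), and the file name is made up.

// query_cuda.cu - hypothetical sanity check; build with: nvcc query_cuda.cu -o query_cuda
#include <cstdio>
#include <cuda_runtime.h>

int main() {
    int n = 0;
    cudaError_t err = cudaGetDeviceCount(&n);
    if (err != cudaSuccess) {
        std::printf("cudaGetDeviceCount failed: %s\n", cudaGetErrorString(err));
        return 1;
    }
    for (int i = 0; i < n; ++i) {
        cudaDeviceProp p;
        cudaGetDeviceProperties(&p, i);
        // Prints roughly the same information nvidia-smi and the BOINC startup log report.
        std::printf("GPU %d: %s, compute capability %d.%d, %zu MiB\n",
                    i, p.name, p.major, p.minor, (size_t)(p.totalGlobalMem >> 20));
    }
    return 0;
}

If it lists your card, the boinc-nvidia-cuda package should detect it as well.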
Stephen "Heretic" Send message Joined: 20 Sep 12 Posts: 5557 Credit: 192,787,363 RAC: 628 |
Thanks Stephen, . . On the rig where the reduction in run times is marginal it has only a 2 core Pentium D supporting 2 GTX1060s and both CPU cores are busy doing that, no CPU crunching. The other rig where the run times reduced more significantly has a Core2 Duo with only one GTX1050ti and one CPU core crunching. I guess the better ratio of CPU resources for the GPU made the greater difference. . . The A/C has been working so temps are well down at the moment. But I will be shutting it down for the night ... Stephen . |
TBar · Joined: 22 May 99 · Posts: 5204 · Credit: 840,779,836 · RAC: 2,768
Did you remember to restart BOINC after changing the app_info settings? Changes to the app_info require a restart. Removing the -bs setting will cause the App to use 100% CPU; I've never seen a case where it didn't.
Stephen "Heretic" Send message Joined: 20 Sep 12 Posts: 5557 Credit: 192,787,363 RAC: 628 |
Did you remember to restart BOINC after changing the app_info settings? Changes to the app_info require a restart. Removing the -bs setting will cause the App to use 100% CPU, I've never seen a case where it didn't. . . So selecting "read config files" won't do it ?? . . I will execute a restart Stephen oops. |
petri33 · Joined: 6 Jun 02 · Posts: 1668 · Credit: 623,086,772 · RAC: 156
Did you remember to restart BOINC after changing the app_info settings? Changes to the app_info require a restart. Removing the -bs setting will cause the App to use 100% CPU; I've never seen a case where it didn't.
Right!
To overcome Heisenbergs: "You can't always get what you want / but if you try sometimes you just might find / you get what you need." -- Rolling Stones
petri33 · Joined: 6 Jun 02 · Posts: 1668 · Credit: 623,086,772 · RAC: 156
Did you remember to restart BOINC after changing the app_info settings? Changes to the app_info require a restart. Removing the -bs setting will cause the App to use 100% CPU; I've never seen a case where it didn't.
You could use an app_config.xml, I guess, to make parameter changes take effect from the next task that starts.
To overcome Heisenbergs: "You can't always get what you want / but if you try sometimes you just might find / you get what you need." -- Rolling Stones
Stephen "Heretic" Send message Joined: 20 Sep 12 Posts: 5557 Credit: 192,787,363 RAC: 628 |
Did you remember to restart BOINC after changing the app_info settings? Changes to the app_info require a restart. Removing the -bs setting will cause the App to use 100% CPU, I've never seen a case where it didn't. . . Prepare to have your world rocked :) . . On the Pentium rig I closed and restarted BOINC, no significant change, run times are similar and CPU use is still similar. I shut down and restarted whole rig. Run times are similar and CPU usage is similar. I rechecked app_info.xml and confirmed I definitely have removed -bs. For whatever reason (I am presuming because of the relatively limited resources of the old architecture) the presence or absence of -bs on that machine makes little or no real difference. . . On the Core2 Duo, I had to restart the whole rig anyway because I cannot get Boinc manager to re-launch the BOINC client once halted, but it launches AOK on start up. The CPU use is now as predicted, about 100%, but drops noticeably between running tasks. Run times have formed slightly tighter groupings but show only moderate/minor reductions. Halflings still take 1.66 to 1.75 mins and are tightly grouped, NARA (normal AR Arecibo) take slightly under 5 mins now where they were more typically 5 to 5.25 mins before. Blc04 are taking about the same at 5.5 to 5.66 mins and Blc13 take about the same at 5 to 5.25 mins. The main difference is in NARA runtimes which are significantly quicker with about a 10 to 20 sec reduction, not bad out of a 300 to 330 second runtime, 5% is still 5% :) Stephen . |
TBar · Joined: 22 May 99 · Posts: 5204 · Credit: 840,779,836 · RAC: 2,768
Have you considered how many watts that 5% is costing you? You forget I'm the one that built and tested that App. On a machine with ample CPU resources it was common to see CPU use spike to 110% per task on the CPU monitor before I convinced Petri to add the blocking-sync feature. If you're OK with burning nearly 60% of the CPU wattage on your dual-core CPU for a 5% gain, then go for it.
MarkJ · Joined: 17 Feb 08 · Posts: 1139 · Credit: 80,854,192 · RAC: 5
I got the CUDA libraries that Petri posted earlier in this thread and put them in the /var/lib/boinc-client directory. They may not be needed. Make sure they're marked as executable and are owned by user boinc (do a chown boinc:boinc lib* command).
Apparently you don't need these to get it to recognise CUDA. You will of course need to put them and the app into the projects/setiathome folder along with an app_info.
BOINC blog
Stephen "Heretic" Send message Joined: 20 Sep 12 Posts: 5557 Credit: 192,787,363 RAC: 628 |
Have you considered how many watts that 5% is costing you? You forget I'm the one that built and tested that App. On a machine with ample CPU resources it was common to see the CPU use spike to 110% per task on the CPU monitor before I convinced Petri to add the Blocking Sync feature. If you're OK with burning up near 60% CPU wattage on your Dual core CPU for a 5% gain then go for it. . . My rigs all run off separate power boards that draw their power from the wall via indivdual power meters (I am a very inquisitive sort of person). And with -bs on the C2D was drawing about 115W, now it is drawing about 120W or maybe 125W. So yes it is probably slightly better value running with -bs on but I would have to restart the whole thing again and it is only a small difference after all. The Pentium rig was drawing 330 to 335W, now about 340 to 345W so again not enough of a difference to make me shut it all down again. If I need to make any other changes I can take care of it then. . . If there is anything you want me to try out of curiosity I would be happy to oblige. Stephen :) |
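For what it's worth, a rough back-of-the-envelope check from those meter readings (assuming the roughly 5% shorter run times seen on the C2D and taking about 122 W as the midpoint of the new draw) suggests the energy used per task barely changes:

\[ \frac{E_{\text{without -bs}}}{E_{\text{with -bs}}} \approx \frac{0.95\,T \times 122\ \text{W}}{T \times 115\ \text{W}} \approx 1.01 \]

So on that box it is roughly 1% more energy per task in exchange for about 5% more throughput, while on the Pentium rig, where run times did not change, the extra ~10 W is a straight loss.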
Stephen "Heretic" Send message Joined: 20 Sep 12 Posts: 5557 Credit: 192,787,363 RAC: 628 |
I got the cuda libraries that Petri posted earlier in this thread and put them in the /var/lib/boinc-client directory. They may not be needed. Make sure they're marked as executable and are owned by user boinc (do a chown boinc:boinc lib* command). . . Well am of the understanding that those libraries are essential for CUDA80 to do its thing. But you can discuss that with TBar or Petri, it is their app. Of course if you only want to run stock CUDA60 then no worries :) Stephen . |
TBar · Joined: 22 May 99 · Posts: 5204 · Credit: 840,779,836 · RAC: 2,768
. . If there is anything you want me to try out of curiosity I would be happy to oblige.
I've noticed you're running Sleep with the OpenCL App in Windows. Sleep is basically the same as -bs on the CUDA App. Have you calculated the difference between using Sleep in OpenCL versus not using Sleep? That might be interesting. Prior to Sleep and -bs, both Apps used near 100% CPU; you can even speed up the old CUDA App by using the -poll cmd, which also uses 100% CPU. There is a larger speed-up on the older CUDA App using -poll, but very few people have chosen to use it.
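To illustrate the distinction TBar is drawing, here is a hypothetical sketch of the two waiting styles: a hard poll (roughly what -poll, or running without Sleep, amounts to) versus a sleep-throttled poll (roughly what the Sleep option does). Again, this is not the actual app code; the kernel, the 1 ms interval and the argc test standing in for -poll are made up, but cudaEventQuery() returning cudaErrorNotReady while the GPU is still busy is the real mechanism being polled.

// poll_vs_sleep.cu - hypothetical sketch, not the real app; Linux-only because of usleep().
#include <cstdio>
#include <unistd.h>          // usleep
#include <cuda_runtime.h>

__global__ void busywork(float *x, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) x[i] = x[i] * 1.0001f + 1.0f;
}

static void wait_for(cudaEvent_t done, bool hard_poll) {
    // Hard poll: pins one CPU core at ~100% but notices kernel completion immediately.
    // Sleep poll: checks once per millisecond, leaving the CPU nearly idle between checks.
    while (cudaEventQuery(done) == cudaErrorNotReady) {
        if (!hard_poll) usleep(1000);
    }
}

int main(int argc, char **argv) {
    bool hard_poll = (argc > 1);   // pretend "-poll" was passed on the command line

    const int n = 1 << 20;
    float *d = nullptr;
    cudaMalloc(&d, n * sizeof(float));
    cudaMemset(d, 0, n * sizeof(float));

    cudaEvent_t done;
    cudaEventCreateWithFlags(&done, cudaEventDisableTiming);

    busywork<<<(n + 255) / 256, 256>>>(d, n);
    cudaEventRecord(done);
    wait_for(done, hard_poll);

    std::printf("kernel finished (%s)\n", hard_poll ? "hard poll" : "sleep poll");
    cudaEventDestroy(done);
    cudaFree(d);
    return 0;
}

The trade-off is the same one being measured in this thread: the hard poll buys a small latency gain per kernel at the cost of a whole CPU core, which matters a lot more on a dual-core host than on one with cores to spare.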
Mark Loukko · Joined: 7 Jun 99 · Posts: 52 · Credit: 40,406,567 · RAC: 108
Hi Petri,
How much effort would it take to create a Windows version of your app? It sounds like there are real benefits. I would really like to use the most efficient application possible and contribute as much as possible to SETI using the resources I have.
Cheers
Mark
rob smith · Joined: 7 Mar 03 · Posts: 22441 · Credit: 416,307,556 · RAC: 380
I believe Jason (and maybe a few others) are working on one. It has proven harder than expected because of the way Windows handles the various timing mechanisms that Petri and TBar have utilised to great effect.
Bob Smith
Member of Seti PIPPS (Pluto is a Planet Protest Society)
Somewhere in the (un)known Universe?
petri33 · Joined: 6 Jun 02 · Posts: 1668 · Credit: 623,086,772 · RAC: 156
Hi Petri,
Hi Mark,
rob smith kind of answered that. It would require a working Windows compilation environment and then copying the new code over, followed by some fiddling with C/C++ header files and a lot of testing. Jason_gee is testing; he can compile and he has a version for Windows, but due to the number of errors still in pulseFind it is not ready to be released. My code has an issue with sometimes reporting a wrong pulse as the best found. I'm sure the Windows version will come eventually.
Petri
To overcome Heisenbergs: "You can't always get what you want / but if you try sometimes you just might find / you get what you need." -- Rolling Stones
Stephen "Heretic" Send message Joined: 20 Sep 12 Posts: 5557 Credit: 192,787,363 RAC: 628 |
. . If there is anything you want me to try out of curiosity I would be happy to oblige. . . I didn't even hear about the -poll switch until I was well and truly into SoG and trying to learn it's tuning options. If I had still been doing CUDA50 I would have given that a try. There is a school of thought (Hi Grant and Zalster) that sleep should not be used as it reduces the overall output, but I am definitely not of that school, though there are hardware configurations where it will hold up. But with a nice little i5-6400 I would lose a CPU core to do that, and I am sure the smallish improvement on my modest GTX950 would not come close to making up for the lost productivity of that one core crunching. If this rig was running a 1070 or 1080 then I would probably give that a run. If you are really curious I can change to that setup for a test period. . . I am actually considering reducing the load cycle for the CPU in the C2D from 100% to maybe 80% or 75% to see if the reduced CPU load speeds up the CUDA80 app at all. Damn the torpedos, I'll give that a try . It is always nice to know the answers to questions like this. Stephen :) |
Stephen "Heretic" Send message Joined: 20 Sep 12 Posts: 5557 Credit: 192,787,363 RAC: 628 |
. . I have set Max CPU load to 75% to free some CPU time for CUDA80, it has increased the GPU usage by a small amount, it is sitting in the mid to high 90's now. Mon Apr 17 08:58:47 2017 +-----------------------------------------------------------------------------+ | NVIDIA-SMI 375.39 Driver Version: 375.39 | |-------------------------------+----------------------+----------------------+ | GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC | | Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. | |===============================+======================+======================| | 0 GeForce GTX 105... Off | 0000:01:00.0 On | N/A | | 80% 60C P0 61W / 75W | 1552MiB / 4033MiB | 96% Default | +-------------------------------+----------------------+----------------------+ +-----------------------------------------------------------------------------+ | Processes: GPU Memory | | GPU PID Type Process name Usage | |=============================================================================| | 0 1053 G /usr/lib/xorg/Xorg 101MiB | | 0 3122 G compiz 29MiB | | 0 9009 C ...ome_x41p_zi3k+_x86_64-pc-linux-gnu_cuda80 1417MiB | +-----------------------------------------------------------------------------+ . . The CPU use is cycling between about 60% and 100% as BOINC adjusts the CPU time for crunching. . . Run times seem to have dropped by a few more seconds but not conclusive yet, I will monitor it for while. . . Having fun! Stephen :) |