Setting up Linux to crunch CUDA90 and above for Windows users

Message boards : Number crunching : Setting up Linux to crunch CUDA90 and above for Windows users
Profile Mike Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 17 Feb 01
Posts: 31866
Credit: 75,350,443
RAC: 25,439
Germany
Message 2003804 - Posted: 22 Jul 2019, 16:21:10 UTC
Last modified: 22 Jul 2019, 16:22:33 UTC

And these lines.

GPU not found: type=NVIDIA, opencl_device_index=1, device_num=-1
WARNING: boinc_get_opencl_ids failed with code -1
OpenCL platform detected: NVIDIA Corporation
WARNING: BOINC supplied wrong platform!

With each crime and every kindness we birth our future.
ID: 2003804 · Report as offensive     Reply Quote
Profile Keith Myers Special Project $250 donor
Volunteer tester
Joined: 29 Apr 01
Posts: 9860
Credit: 928,333,861
RAC: 1,523,950
United States
Message 2003808 - Posted: 22 Jul 2019, 16:31:33 UTC

Shut down BOINC and delete the Compute Cache. It is located at /home/{username}/.nv/ComputeCache

Delete the files and folders inside the Compute Cache folder. You could also reboot the host for good measure, so that the Nvidia drivers get loaded cleanly.

The Compute Cache contains the compute kernel primitives for the applications. If they are corrupted or missing, it could cause your errors.

Once you restart BOINC the compute kernels will be regenerated.
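
For anyone who prefers the terminal, here is a minimal sketch of the cleanup described above. The function name clear_compute_cache is made up for illustration, and it assumes the cache sits in the default location for the user that runs BOINC. Stop BOINC first, by whatever method your install uses.

```shell
#!/bin/bash
# Hypothetical helper: empty the NVIDIA compute cache so the kernels
# are rebuilt the next time BOINC starts. Stop BOINC before calling it.
clear_compute_cache() {
    local cache_dir="${1:-$HOME/.nv/ComputeCache}"
    if [ ! -d "$cache_dir" ]; then
        echo "No compute cache found at $cache_dir" >&2
        return 1
    fi
    # Delete everything inside the folder (hidden files too) but keep the folder itself.
    find "$cache_dir" -mindepth 1 -delete
    echo "Cleared $cache_dir"
}

# Usage (default path):  clear_compute_cache
# Usage (explicit path): clear_compute_cache /home/keith/.nv/ComputeCache
```

The folder itself is left in place, so the driver can simply repopulate it on the next run.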
Seti@Home classic workunits:20,676 CPU time:74,226 hours
ID: 2003808 · Report as offensive     Reply Quote
Profile Bernie Vine
Volunteer moderator
Volunteer tester
Joined: 26 May 99
Posts: 9863
Credit: 84,797,402
RAC: 55,653
United Kingdom
Message 2003812 - Posted: 22 Jul 2019, 16:47:34 UTC

It is located at /home/{username}/.nv/ComputeCache


Sorry couldn't find that.

Looking at all the lines that Mike pointed out, it looks like the problem with the riser again, where it suddenly doesn't detect the card. Hence the 142 tasks "waiting to run".

I will see if there are any better risers around.
ID: 2003812 · Report as offensive     Reply Quote
Ian&Steve C.
Joined: 28 Sep 99
Posts: 1805
Credit: 767,218,052
RAC: 2,609,093
United States
Message 2003813 - Posted: 22 Jul 2019, 16:52:23 UTC - in response to Message 2003812.  
Last modified: 22 Jul 2019, 16:53:49 UTC

Do you have the opencl drivers installed?

I can’t see your systems since they are hidden.

What is the output of clinfo?
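
If the full clinfo dump is too long to post, a rough sketch like this pulls out just the platform names. The helper name is hypothetical, and it assumes the usual clinfo text layout where each platform has a line beginning with "Platform Name".

```shell
#!/bin/bash
# Hypothetical helper: extract the platform names from clinfo's text output.
# Assumes lines of the form "  Platform Name      <vendor string>".
opencl_platform_names() {
    awk 'tolower($1) == "platform" && tolower($2) == "name" {
             $1 = ""; $2 = ""       # drop the label fields
             sub(/^ +/, "")         # trim the leading blanks left behind
             print
         }' | sort -u
}

# Usage:
#   clinfo | opencl_platform_names
```

An empty result would suggest that no OpenCL platform is registered for the user running the command.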
Seti@Home classic workunits: 29,492 CPU time: 134,419 hours

ID: 2003813 · Report as offensive     Reply Quote
Profile Keith Myers Special Project $250 donor
Volunteer tester
Joined: 29 Apr 01
Posts: 9860
Credit: 928,333,861
RAC: 1,523,950
United States
Message 2003814 - Posted: 22 Jul 2019, 16:53:07 UTC - in response to Message 2003812.  

It is located at /home/{username}/.nv/ComputeCache


Sorry couldn't find that.

Looking at all the lines that Mike pointed out it looks like the problem with the riser again where it suddenly doesn't detect the card. Hence the 142 "waiting to run ".

I will see if there are any better risers around.

{username} is just the stand in for your user name. The .nv directory is hidden and needs to be shown by toggling the Show Hidden Files in the File Manager.

So for example my ComputeCache is located at /home/keith/.nv/ComputeCache
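
From a terminal the hidden directory can be checked directly, without touching the File Manager setting. A small sketch, with a made-up function name, assuming the default layout under the home directory:

```shell
#!/bin/bash
# Hypothetical helper: report whether a user's compute cache exists.
# Names starting with a dot are hidden from a plain `ls`, so test the
# path directly instead of browsing for it.
show_compute_cache() {
    local home_dir="${1:-$HOME}"
    local cache_dir="$home_dir/.nv/ComputeCache"
    if [ -d "$cache_dir" ]; then
        echo "$cache_dir"
    else
        echo "not found: $cache_dir" >&2
        return 1
    fi
}

# Usage:
#   show_compute_cache            # checks the current user's home
#   ls -a "$HOME"                 # -a also lists hidden entries such as .nv
```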
Seti@Home classic workunits:20,676 CPU time:74,226 hours
ID: 2003814 · Report as offensive     Reply Quote
Profile Bernie Vine
Volunteer moderator
Volunteer tester
Joined: 26 May 99
Posts: 9863
Credit: 84,797,402
RAC: 55,653
United Kingdom
Message 2003817 - Posted: 22 Jul 2019, 17:02:01 UTC
Last modified: 22 Jul 2019, 20:21:01 UTC

Do you have the opencl drivers installed?


See the bottom of my first post: the answer is yes, and I have already successfully completed APs.
ID: 2003817 · Report as offensive     Reply Quote
Ian&Steve C.
Joined: 28 Sep 99
Posts: 1805
Credit: 767,218,052
RAC: 2,609,093
United States
Message 2003821 - Posted: 22 Jul 2019, 17:15:53 UTC - in response to Message 2003817.  

Ah I see. You definitely had a card crash.

Check the stderr output of your v8 tasks also.

https://setiathome.berkeley.edu/result.php?resultid=7884480804

You can see it trying to use a card and failing over and over until the other card becomes available and it fails over.

What kind of system is this that you need risers for only 2 GPUs?
Seti@Home classic workunits: 29,492 CPU time: 134,419 hours

ID: 2003821 · Report as offensive     Reply Quote
Loren Datlof

Joined: 24 Jan 14
Posts: 73
Credit: 19,635,754
RAC: 4,312
United States
Message 2003826 - Posted: 22 Jul 2019, 17:27:58 UTC
Last modified: 22 Jul 2019, 17:30:32 UTC

#!/bin/bash

/usr/bin/nvidia-settings -a "[gpu:0]/GPUPowerMizerMode=1"
/usr/bin/nvidia-settings -a "[gpu:1]/GPUPowerMizerMode=1"
/usr/bin/nvidia-settings -a "[gpu:2]/GPUPowerMizerMode=1"
/usr/bin/nvidia-settings -a "[gpu:0]/GPUFanControlState=1"
/usr/bin/nvidia-settings -a "[fan:0]/GPUTargetFanSpeed=100"
/usr/bin/nvidia-settings -a "[gpu:1]/GPUFanControlState=1"
/usr/bin/nvidia-settings -a "[fan:1]/GPUTargetFanSpeed=100"
/usr/bin/nvidia-settings -a "[gpu:2]/GPUFanControlState=1"
/usr/bin/nvidia-settings -a "[fan:2]/GPUTargetFanSpeed=100"

/usr/bin/nvidia-settings -a "[gpu:0]/GPUMemoryTransferRateOffset[3]=800" -a "[gpu:0]/GPUGraphicsClockOffset[3]=40"
/usr/bin/nvidia-settings -a "[gpu:1]/GPUMemoryTransferRateOffset[3]=800" -a "[gpu:1]/GPUGraphicsClockOffset[3]=0"
/usr/bin/nvidia-settings -a "[gpu:2]/GPUMemoryTransferRateOffset[3]=800" -a "[gpu:2]/GPUGraphicsClockOffset[3]=0"


Thanks for the code, Keith, it is very helpful. I had to add "DISPLAY=:0 XAUTHORITY=/var/run/lightdm/root/:0" before each line because I am running the host headless.
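
One way to avoid repeating that prefix on every line is a small wrapper function. This is only a sketch: the name nvset and the NV_DISPLAY / NV_XAUTHORITY override variables are made up, and the defaults are the lightdm-specific values quoted above, so other display managers will need a different XAUTHORITY path.

```shell
#!/bin/bash
# Hypothetical wrapper: run nvidia-settings against the X display of a headless host.
# The defaults match the lightdm values from the post above.
NVIDIA_SETTINGS="${NVIDIA_SETTINGS:-/usr/bin/nvidia-settings}"

nvset() {
    # The two assignments apply only to this one command invocation.
    DISPLAY="${NV_DISPLAY:-:0}" \
    XAUTHORITY="${NV_XAUTHORITY:-/var/run/lightdm/root/:0}" \
    "$NVIDIA_SETTINGS" "$@"
}

# Usage:
#   nvset -a "[gpu:1]/GPUPowerMizerMode=1"
#   nvset -a "[gpu:1]/GPUFanControlState=1" -a "[fan:1]/GPUTargetFanSpeed=100"
```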

loren@T7400 ~ $ inxi -AG
Graphics:  Card-1: NVIDIA GP107 [GeForce GTX 1050]
           Card-2: NVIDIA GP106 [GeForce GTX 1060 3GB]
           Display Server: N/A driver: nvidia tty size: 142x39 Advanced Data: N/A out of X


I run this script on the GTX 1060 3GB because it runs very hot and your script seems to have cooled it down by about 5C. Here is my script:
#!/bin/bash

nvidia-smi -i 1 -pl 60 ### set power consumption to 60 watts

DISPLAY=:0 XAUTHORITY=/var/run/lightdm/root/:0 /usr/bin/nvidia-settings -a "[gpu:1]/GPUPowerMizerMode=1"

DISPLAY=:0 XAUTHORITY=/var/run/lightdm/root/:0 /usr/bin/nvidia-settings -a "[gpu:1]/GPUFanControlState=1"
DISPLAY=:0 XAUTHORITY=/var/run/lightdm/root/:0 /usr/bin/nvidia-settings -a "[fan:1]/GPUTargetFanSpeed=100"


My question is whether limiting the 1060 to 60 watts and using PowerMizer = 1 are at odds with each other. The GPU seems to run just fine like this.
ID: 2003826 · Report as offensive     Reply Quote
Profile Keith Myers Special Project $250 donor
Volunteer tester
Joined: 29 Apr 01
Posts: 9860
Credit: 928,333,861
RAC: 1,523,950
United States
Message 2003830 - Posted: 22 Jul 2019, 17:45:56 UTC - in response to Message 2003826.  
Last modified: 22 Jul 2019, 17:46:56 UTC

My question is whether limiting the 1060 to 60 watts and using PowerMizer = 1 are at odds with each other. The GPU seems to run just fine like this.

If you don't pass any value to the PowerMizer parameter, the host boots the GPUs in Adaptive mode (mode 0). You can set the value to 0, 1 or 2. A value of 1 sets Prefer Maximum Performance, which is not the one to use if you are having trouble cooling the card. A value of 2 sets PowerMizer to Auto. I've never found a good explanation of the difference between Adaptive and Auto. It has something to do with how aggressively the card is downclocked for power savings, especially on battery-powered laptops.
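
To see which mode a card is actually in, something like the sketch below may help. The function name is hypothetical and the mapping covers only the three values listed above; the commented usage assumes GPU index 1 and a reachable X session.

```shell
#!/bin/bash
# Hypothetical helper: translate a GPUPowerMizerMode number into the
# names used above (0 = Adaptive, 1 = Prefer Maximum Performance, 2 = Auto).
powermizer_mode_name() {
    case "$1" in
        0) echo "Adaptive" ;;
        1) echo "Prefer Maximum Performance" ;;
        2) echo "Auto" ;;
        *) echo "Unknown mode: $1" >&2; return 1 ;;
    esac
}

# Usage (assumes GPU index 1):
#   mode=$(nvidia-settings -q "[gpu:1]/GPUPowerMizerMode" -t)
#   powermizer_mode_name "$mode"
#   nvidia-smi -i 1 -q -d POWER   # shows the power limit alongside the current draw
```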
Seti@Home classic workunits:20,676 CPU time:74,226 hours
ID: 2003830 · Report as offensive     Reply Quote
elec999 Project Donor

Joined: 24 Nov 02
Posts: 366
Credit: 341,628,938
RAC: 730,084
Canada
Message 2003843 - Posted: 22 Jul 2019, 19:23:16 UTC

Can I mix an AMD R260x with the NVIDIA Quadro P2000? Will this cause issues, and do I need to change anything in the config files?
ID: 2003843 · Report as offensive     Reply Quote
Profile Keith Myers Special Project $250 donor
Volunteer tester
Avatar

Send message
Joined: 29 Apr 01
Posts: 9860
Credit: 928,333,861
RAC: 1,523,950
United States
Message 2003847 - Posted: 22 Jul 2019, 20:03:25 UTC - in response to Message 2003843.  

Can I mix a AMD R260x with the NVIDIA Quadro P2000 ? Will this cause issues, do I need to change anything in the config files?

Should be doable. You will just have to have both types of drivers loaded. The number of tasks per card would depend on the OS and the application, so setting up different parameters for each card would be more involved than a simple blanket configuration. I don't know if those two cards are similar enough in compute power.
Seti@Home classic workunits:20,676 CPU time:74,226 hours
ID: 2003847 · Report as offensive     Reply Quote
Profile Bernie Vine
Volunteer moderator
Volunteer tester
Joined: 26 May 99
Posts: 9863
Credit: 84,797,402
RAC: 55,653
United Kingdom
Message 2003849 - Posted: 22 Jul 2019, 20:18:44 UTC

What kind of system is this that you need risers for only 2 GPUs?


Er, a cheap and cheerful one. You know, you might have seen them: ordinary PC boards with only one full-length PCIe x16 slot.

This particular board was the only new one I bought for my Linux conversions.

Oh yes, and my GPUs are all second hand as well.
ID: 2003849 · Report as offensive     Reply Quote
Ian&Steve C.
Joined: 28 Sep 99
Posts: 1805
Credit: 767,218,052
RAC: 2,609,093
United States
Message 2003856 - Posted: 22 Jul 2019, 21:05:28 UTC - in response to Message 2003849.  

Ah so you’re using a splitter on the only PCIe slot. I get it.

No issue with second hand cards. Almost all of mine are second hand also. Else I wouldn’t have so many haha.
Seti@Home classic workunits: 29,492 CPU time: 134,419 hours

ID: 2003856 · Report as offensive     Reply Quote
Profile Keith Myers Special Project $250 donor
Volunteer tester
Joined: 29 Apr 01
Posts: 9860
Credit: 928,333,861
RAC: 1,523,950
United States
Message 2003860 - Posted: 22 Jul 2019, 21:31:49 UTC - in response to Message 2003843.  

Can I mix a AMD R260x with the NVIDIA Quadro P2000 ? Will this cause issues, do I need to change anything in the config files?

I remember reading in the fora about issues with the two different vendors' OpenCL drivers stepping on each other. So you might have difficulty keeping the Nvidia OpenCL drivers installed along with the AMD OpenCL drivers, and vice versa. I have seen most of the issues over at Einstein. I have never dealt with AMD before, so I am of no help.
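
A quick way to see whether one vendor's install has displaced the other is to look at the vendor files the OpenCL loader reads. A sketch only: the function name is made up, and /etc/OpenCL/vendors is the usual location on Linux but may differ on some distributions.

```shell
#!/bin/bash
# Hypothetical helper: list the OpenCL vendor (.icd) files and the library each one names.
# Returns non-zero when no .icd file is found in the directory.
list_opencl_icds() {
    local icd_dir="${1:-/etc/OpenCL/vendors}"
    local f found=1
    for f in "$icd_dir"/*.icd; do
        [ -e "$f" ] || continue    # the glob stays literal when nothing matches
        printf '%s -> %s\n' "$(basename "$f")" "$(head -n 1 "$f")"
        found=0
    done
    return "$found"
}

# Usage:
#   list_opencl_icds
```

With both vendors installed correctly you would expect one line per vendor; a missing line after a driver update points at the kind of conflict described above.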
Seti@Home classic workunits:20,676 CPU time:74,226 hours
ID: 2003860 · Report as offensive     Reply Quote
elec999 Project Donor

Joined: 24 Nov 02
Posts: 366
Credit: 341,628,938
RAC: 730,084
Canada
Message 2003869 - Posted: 22 Jul 2019, 21:47:06 UTC - in response to Message 2003860.  

Can I mix a AMD R260x with the NVIDIA Quadro P2000 ? Will this cause issues, do I need to change anything in the config files?

I remembered reading in the fora about issues with the two different vendors OpenCL drivers stepping on each other. So you might have difficulty keeping the Nvidia OpenCL drivers installed along with the AMD OpenCL drivers. And vice versa. Seen most of the issues over at Einstein. Never dealt with AMD before so I am of no help.


I decided not to mix. This GPU had been sitting on my shelf for some time. I also got two 280Xs, which are power hungry. This is a complete system, ready to go.
ID: 2003869 · Report as offensive     Reply Quote
Profile Tom M
Volunteer tester

Joined: 28 Nov 02
Posts: 3551
Credit: 210,724,292
RAC: 524,951
United States
Message 2003890 - Posted: 23 Jul 2019, 0:23:20 UTC - in response to Message 2003860.  

Can I mix a AMD R260x with the NVIDIA Quadro P2000 ? Will this cause issues, do I need to change anything in the config files?

I remembered reading in the fora about issues with the two different vendors OpenCL drivers stepping on each other. So you might have difficulty keeping the Nvidia OpenCL drivers installed along with the AMD OpenCL drivers. And vice versa. Seen most of the issues over at Einstein. Never dealt with AMD before so I am of no help.


I certainly had that issue when I tried to run a GTX 750 Ti or two on an AMD 2400G motherboard, trying to crunch with the iGPU and the NVIDIA cards. Threw up my hands.

Tom
Oh NO.... I lost my tagline....
ID: 2003890 · Report as offensive     Reply Quote
Profile Tom M
Volunteer tester

Joined: 28 Nov 02
Posts: 3551
Credit: 210,724,292
RAC: 524,951
United States
Message 2004027 - Posted: 24 Jul 2019, 4:17:18 UTC - in response to Message 2001937.  

Quick question

If I have a standard Linux install of Boinc and I wanted to change to the AIO, what is the easiest way?


EASIEST?

In my opinion. Run your current Linux install dry (NNT).
Download Tbar's AIO.
Do a clean reinstall of Linux.
Then follow the directions for installing AIO.

Plan B is run your Linux dry (NNT).
Download/unarchive the AIO in downloads.
Locate the current places where BOINC is installed. A non-trivial task.
Copy files from the unpacked archive to the equivalent places in your current install.
Confirm via properties that the exe files can be executed by "anyone".
Restart BOINC.
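
The permissions step can also be done from a terminal. A rough sketch, not a tested procedure: the function name is made up, and the default file names (boinc, boincmgr, boinccmd) and the unpack location are assumptions about what the archive contains.

```shell
#!/bin/bash
# Hypothetical helper: make the BOINC binaries in an unpacked folder executable by everyone.
# Pass the folder first, then optionally the file names to fix.
make_boinc_executable() {
    local boinc_dir="$1"; shift
    local names=("$@")
    # Fall back to assumed binary names when none are given.
    [ "${#names[@]}" -gt 0 ] || names=(boinc boincmgr boinccmd)
    local n
    for n in "${names[@]}"; do
        if [ -f "$boinc_dir/$n" ]; then
            chmod a+x "$boinc_dir/$n"
        else
            echo "skipping missing file: $boinc_dir/$n" >&2
        fi
    done
}

# Sketch of the surrounding steps (project URL and paths are placeholders):
#   boinccmd --project <project URL> nomorework   # No New Tasks, then let the cache run dry
#   7z x BOINC.7z -o"$HOME"                        # unpack the archive
#   make_boinc_executable "$HOME/BOINC"
```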

Both of these procedures have been previously described here.
Running Seti dry is not officially required but I believe it is safer.

What is NOT recommended is trying to move the original Seti install to another location. There has been discussion of trying that, and it turns out there are permissions and other issues that I don't understand, which made it less trouble to follow the reinstall-the-OS process.

HTH,
Tom


Apparently there has been a problem with the links to Tbar's All-in-One in some older messages:
http://www.arkayn.us/lunatics/BOINC.7z Should get you what you want.

Tom
Oh NO.... I lost my tagline....
ID: 2004027 · Report as offensive     Reply Quote
Stephen "Heretic" Special Project $250 donor
Volunteer tester
Joined: 20 Sep 12
Posts: 4697
Credit: 150,743,756
RAC: 241,059
Australia
Message 2004031 - Posted: 24 Jul 2019, 5:14:08 UTC - in response to Message 2001907.  

Quick question

If I have a standard Linux install of Boinc and I wanted to change to the AIO, what is the easiest way?


. . I noticed Tom's recent response to this message but was thinking that with the age of the original message you have probably sorted it out by now, am I right?

Stephen

ID: 2004031 · Report as offensive     Reply Quote
Profile Tom M
Volunteer tester

Joined: 28 Nov 02
Posts: 3551
Credit: 210,724,292
RAC: 524,951
United States
Message 2004111 - Posted: 24 Jul 2019, 20:11:32 UTC - in response to Message 2004031.  

Quick question

If I have a standard Linux install of Boinc and I wanted to change to the AIO, what is the easiest way?


. . I noticed Tom's recent response to this message but was thinking that with the age of the original message you have probably sorted it out by now, am I right?

Stephen



I quoted it for context in case someone, like the person I was PMing with, came in at the end of the message thread. I hope the original question was fully answered, but I couldn't find a direct reference to Tbar's All-In-One (aka AIO), so I thought I would refresh that link.

Tom
Oh NO.... I lost my tagline....
ID: 2004111 · Report as offensive     Reply Quote
Stephen "Heretic" Special Project $250 donor
Volunteer tester
Joined: 20 Sep 12
Posts: 4697
Credit: 150,743,756
RAC: 241,059
Australia
Message 2004153 - Posted: 24 Jul 2019, 23:37:16 UTC - in response to Message 2004111.  
Last modified: 24 Jul 2019, 23:41:50 UTC

Quick question
If I have a standard Linux install of Boinc and I wanted to change to the AIO, what is the easiest way?

. . I noticed Tom's recent response to this message but was thinking that with the age of the original message you have probably sorted it out by now, am I right?
Stephen

I quoted it for context in case someone, like the person I was PMing with, came in at the end of the message thread. I hope the original question was fully answered, but I couldn't find a direct reference to Tbar's All-In-One (aka AIO), so I thought I would refresh that link.
Tom


. . Sorry Tom, I wasn't having a dig at you, I was just hoping it had been sorted out. I just didn't want to seem too interrogative or accusative. But it certainly doesn't hurt to have a current link available.

Stephen

< shrug>
ID: 2004153 · Report as offensive     Reply Quote


 
©2019 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.