Setting up Linux to crunch CUDA90 and above for Windows users

Stephen "Heretic" Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 20 Sep 12
Posts: 5557
Credit: 192,787,363
RAC: 628
Australia
Message 1931023 - Posted: 20 Apr 2018, 11:46:58 UTC - in response to Message 1931011.  

Hi Stephen,
When using the repository version of BOINC, don't you have to install OpenCL and Cuda for BOINC as well as for the video driver?
Open Synaptic and type in BOINC, you should see OpenCL and Cuda listings for it. You need to install those also.
Maybe that will fix your problem. Hope so.
Good Luck.


. . Thanks Bruce,

. . The openCL support is in the video drivers and the app. The AP app used to work on that machine but then suddenly it didn't. It is very hard to even guess when because it used to get so very few AP tasks. My guess is that because I have been having problems with Linux upgrades failing to boot off the system flashdrive causing me to revert to the last version that works properly, along the way something got twisted around and lost the plot. Hard to say what or when but since AP tasks just slow things down compared to CUDA80 Special I thought it easier just to give up and stop receiving AP work. Problem solved. The strange thing is that the AP task would start and try to run for a few seconds but fail and go to "waiting to run" status.

Stephen

<shrug>
ID: 1931023
Profile Keith Myers Special Project $250 donor
Volunteer tester
Joined: 29 Apr 01
Posts: 13164
Credit: 1,160,866,277
RAC: 1,873
United States
Message 1931048 - Posted: 20 Apr 2018, 14:57:07 UTC - in response to Message 1931023.  

Hi Stephen,
When using the repository version of BOINC, don't you have to install OpenCL and Cuda for BOINC as well as for the video driver?
Open Synaptic and type in BOINC, you should see OpenCL and Cuda listings for it. You need to install those also.
Maybe that will fix your problem. Hope so.
Good Luck.


. . Thanks Bruce,

. . The openCL support is in the video drivers and the app. The AP app used to work on that machine but then suddenly it didn't. It is very hard to even guess when because it used to get so very few AP tasks. My guess is that because I have been having problems with Linux upgrades failing to boot off the system flashdrive causing me to revert to the last version that works properly, along the way something got twisted around and lost the plot. Hard to say what or when but since AP tasks just slow things down compared to CUDA80 Special I thought it easier just to give up and stop receiving AP work. Problem solved. The strange thing is that the AP task would start and try to run for a few seconds but fail and go to "waiting to run" status.

Stephen

<shrug>

No, that is a negative on the OpenCL support being in the video drivers and the app. You MUST install the support libraries along with the video driver. That means the OpenCL-ICD libraries for the video driver version level along with the CUDA libraries matching the driver level. Normally when you select the graphics driver in Synaptic it selects all the support libraries to be installed along with the driver.

As you state, in all your kernel mishaps, it looks like the link dependencies between the driver and the OpenCL library got twisted up. You should just start from scratch as the easiest solution.
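A rough way to check whether the system loader can find those support libraries at all is sketched below. It assumes Python 3 is installed. The library names (OpenCL, cuda, cudart, cufft) are an assumption about what a CUDA/OpenCL app typically links against, not a list taken from the SETI apps themselves.

```python
import ctypes.util


def find_gpu_libraries(names=("OpenCL", "cuda", "cudart", "cufft")):
    """Map each short library name to the file the loader resolves, or None."""
    return {name: ctypes.util.find_library(name) for name in names}


if __name__ == "__main__":
    for name, path in find_gpu_libraries().items():
        # None means the loader cannot see the library on its default search path.
        print(f"lib{name}: {path or 'NOT FOUND'}")
```

A "NOT FOUND" for OpenCL after a driver reinstall would point at the missing ICD loader rather than at BOINC. Libraries shipped alongside an app in the project folder will not show up here, since only the system search path is checked.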
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 1931048
Stephen "Heretic" Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 20 Sep 12
Posts: 5557
Credit: 192,787,363
RAC: 628
Australia
Message 1931104 - Posted: 20 Apr 2018, 22:43:55 UTC - in response to Message 1931048.  
Last modified: 20 Apr 2018, 22:49:30 UTC

Hi Stephen,
When using the repository version of BOINC, don't you have to install OpenCL and Cuda for BOINC as well as for the video driver?
Open Synaptic and type in BOINC, you should see OpenCL and Cuda listings for it. You need to install those also.
Maybe that will fix your problem. Hope so.
Good Luck.


. . Thanks Bruce,

. . The openCL support is in the video drivers and the app. The AP app used to work on that machine but then suddenly it didn't. It is very hard to even guess when because it used to get so very few AP tasks. My guess is that because I have been having problems with Linux upgrades failing to boot off the system flashdrive causing me to revert to the last version that works properly, along the way something got twisted around and lost the plot. Hard to say what or when but since AP tasks just slow things down compared to CUDA80 Special I thought it easier just to give up and stop receiving AP work. Problem solved. The strange thing is that the AP task would start and try to run for a few seconds but fail and go to "waiting to run" status.

Stephen

<shrug>

No, that is a negative on the OpenCL support being in the video drivers and the app. You MUST install the support libraries along with the video driver. That means the OpenCL-ICD libraries for the video driver version level along with the CUDA libraries matching the driver level. Normally when you select the graphics driver in Synaptic it selects all the support libraries to be installed along with the driver.

As you state, in all your kernel mishaps, it looks like the link dependencies between the driver and the OpenCL library got twisted up. You should just start from scratch as the easiest solution.


. . Hi Keith,

. . OK I guess this is down to my layman's way of looking at things. What I meant was that I don't believe I have to install OpenCL support into BOINC. It is in the video drivers (executable and libraries and other support files that get installed with it) and the app (again, the executables and support files). I guess it is because, as a primarily Windows user, I am accustomed to ready-mades with everything included in the installation package; I am not accustomed to the "roll your own" world of Linux :)

. . But TBar and yourself are probably right that only a rebuild from the ground up will resolve the issue. At some stage I intend to upgrade to a later version of Ubuntu (probably Lubuntu) with the later versions of BOINC and CUDA90. That will probably (hopefully) fix the issue, but as TBar also said, with the special app AP tasks just slow things down :), so I will probably still not run them. I have gotten used to not getting them anyway.

Stephen

:)
ID: 1931104
Profile Keith Myers Special Project $250 donor
Volunteer tester
Joined: 29 Apr 01
Posts: 13164
Credit: 1,160,866,277
RAC: 1,873
United States
Message 1936111 - Posted: 18 May 2018, 3:05:10 UTC

Just put my first Windows7 to Ubuntu 18.04 LTS conversion back online. Hardware was easy. Software took me a few days to sort out because nvidia-xconfig was pulling some shenanigans. Hope to recover my lost RAC while the system was offline getting converted. Should make it back up and start increasing it shortly with the special app and more cpu cores in play.
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 1936111
Stephen "Heretic" Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 20 Sep 12
Posts: 5557
Credit: 192,787,363
RAC: 628
Australia
Message 1936120 - Posted: 18 May 2018, 4:17:33 UTC - in response to Message 1936111.  

Just put my first Windows7 to Ubuntu 18.04 LTS conversion back online. Hardware was easy. Software took me a few days to sort out because nvidia-xconfig was pulling some shenanigans. Hope to recover my lost RAC while the system was offline getting converted. Should make it back up and start increasing it shortly with the special app and more cpu cores in play.


. . If CreditScrew plays nice ...

Stephen

:)
ID: 1936120
Ian&Steve C.
Joined: 28 Sep 99
Posts: 4267
Credit: 1,282,604,591
RAC: 6,640
United States
Message 1936611 - Posted: 22 May 2018, 2:27:02 UTC

so i finally converted my linux machines over to the special app.

basically just extracted the archive, and dumped the various files into the seti project folder right? (i chmod 777 the whole directory for ease with permissions, heh). everything seems to be working.

is there no need to tweak the cmd line parameters with the special app like you do with the SoG app? i don't even see a text file for that.
Seti@Home classic workunits: 29,492 CPU time: 134,419 hours

ID: 1936611
Profile Keith Myers Special Project $250 donor
Volunteer tester
Joined: 29 Apr 01
Posts: 13164
Credit: 1,160,866,277
RAC: 1,873
United States
Message 1936623 - Posted: 22 May 2018, 3:56:45 UTC - in response to Message 1936611.  

No, there is not much you need to do with the special app. It is designed to use all of the resources available, so it uses all of the graphics card and grabs a full cpu core to feed it. The only useful parameter is -nobs, which should be placed either in the app_info command line or in the command line in an app_config file. The application itself has the built-in -auto unroll parameter which handles disparate cards with different quantities of compute units. You can play with some other parameters like -pfp and such, just look for the docs on the old x41zi application for tuning. I think Brent or Grant uses some of those. I have tried them and really see no improvement in processing times over just letting the app run natively. Your $0.02 may make a difference.

The -nobs stands for no blocking sync and makes the card use all of a cpu core. If you are running low on cpu resources you could remove that and the impact on the cpu will diminish but your processing times will increase.
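For illustration, here is a minimal sketch that builds an app_config.xml carrying -nobs in the command line. The app name setiathome_v8 and the plan class cuda90 are assumptions made for the example; use whatever names your own app_info.xml shows, and treat the output as a starting point rather than a tested configuration.

```python
import xml.etree.ElementTree as ET


def build_app_config(app_name, plan_class, cmdline):
    """Return app_config.xml text with one <app_version> entry."""
    root = ET.Element("app_config")
    version = ET.SubElement(root, "app_version")
    ET.SubElement(version, "app_name").text = app_name
    ET.SubElement(version, "plan_class").text = plan_class
    # The parameters passed to the science app, e.g. "-nobs".
    ET.SubElement(version, "cmdline").text = cmdline
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Hypothetical names; check them against your own app_info.xml.
    print(build_app_config("setiathome_v8", "cuda90", "-nobs"))
```

The resulting text would go into app_config.xml in the project folder. Leaving the cmdline empty is the equivalent of removing -nobs as described above.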
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 1936623
Stephen "Heretic" Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 20 Sep 12
Posts: 5557
Credit: 192,787,363
RAC: 628
Australia
Message 1939020 - Posted: 10 Jun 2018, 21:45:26 UTC - in response to Message 1936623.  


The -nobs stands for no blocking sync and makes the card use all of a cpu core. If you are running low on cpu resources you could remove that and the impact on the cpu will diminish but your processing times will increase.


. . The increase in run times with -nobs removed is not that great, but I would suggest removing it is only of benefit when CPU resources are very limited or with single/low end GPUs, such as with older dual or quad core CPUs. If using a more modern quad core or better and multiple GPUs then I think you will achieve better results with -nobs on and devoting that extra CPU core per GPU. On my i5-6600 I am crunching on only one CPU core and using the other 3 for housekeeping and support to keep the 970s purring along.

Stephen

:)
ID: 1939020
J. Mileski
Volunteer tester
Joined: 9 Jun 02
Posts: 632
Credit: 172,116,532
RAC: 572
United States
Message 1942973 - Posted: 7 Jul 2018, 1:00:52 UTC

It seems like I have a host throwing errors. https://setiathome.berkeley.edu/results.php?hostid=8538256
I tried the cuda90 on this Linux machine, but I got it wrong. It has a GTX 750 with 1 gig of ram. I'm not sure, but I think it has the GM107 rather than the newer GM206
ID: 1942973
J. Mileski
Volunteer tester
Joined: 9 Jun 02
Posts: 632
Credit: 172,116,532
RAC: 572
United States
Message 1942985 - Posted: 7 Jul 2018, 1:22:18 UTC - in response to Message 1942973.  

It seems like I have a host throwing errors. https://setiathome.berkeley.edu/results.php?hostid=8538256
I tried the cuda90 on this Linux machine, but I got it wrong. It has a GTX 750 with 1 gig of ram. I'm not sure, but I think it has the GM107 rather than the newer GM206

I need some help. On the first run I got out of memory after 1 second, then I entered -unroll 1 in the command line. Now, after about 4 seconds, I get "Cuda error 'cufftPlan1d(&fft_analysis_plans[FftNum][0], FftLen, CUFFT_C2C, NumDataPoints / FftLen)' in file 'cuda/cudaAcc_fft.cu' in line 29 : invalid argument."
ID: 1942985
TBar
Volunteer tester

Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1942990 - Posted: 7 Jul 2018, 1:34:44 UTC - in response to Message 1942973.  

Yep,
Cuda error 'cufftPlan1d(&fft_analysis_plans[FftNum][0], FftLen, CUFFT_C2C, NumDataPoints / FftLen)' in file 'cuda/cudaAcc_fft.cu' in line 29 : invalid argument.
Is the error you get when you don't have enough vRam.
The thing to remember about vRam usage is generally, the newer the Code, the More vRam it uses. Newer OSes, and CUDA Apps use more.
Also, the Monitor is usually what pushes it over the edge, so, if possible don't connect a Monitor to a GPU with Low vRam. It might work if you add a 2 GB GPU to the machine and connect the Monitor to that 2 GB GPU. Also, you could try the CUDA 6.0 App. 6.0 uses the least vRam while 9.0 uses the most. As for the OS, Mint uses more vRam than say Lubuntu.
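To see how much vRam is actually free before the app starts, something along these lines could be used. It is only a sketch: it assumes the NVIDIA driver's nvidia-smi tool is on the path and that it reports plain numbers for the memory fields (on some setups a field can come back as N/A, which this parser does not handle). The function names are made up for the example.

```python
import subprocess

# Ask nvidia-smi for one CSV line per GPU: name, total MiB, used MiB.
QUERY = [
    "nvidia-smi",
    "--query-gpu=name,memory.total,memory.used",
    "--format=csv,noheader,nounits",
]


def parse_vram(csv_text):
    """Turn nvidia-smi CSV output into a list of per-GPU dictionaries."""
    gpus = []
    for line in csv_text.strip().splitlines():
        # Split from the right so a comma inside the GPU name is harmless.
        name, total, used = [field.strip() for field in line.rsplit(",", 2)]
        gpus.append({
            "name": name,
            "total_mib": int(total),
            "free_mib": int(total) - int(used),
        })
    return gpus


def query_vram():
    """Run nvidia-smi and return the parsed result."""
    result = subprocess.run(QUERY, capture_output=True, text=True, check=True)
    return parse_vram(result.stdout)
```

Running query_vram() with BOINC suspended shows what the desktop and monitor alone are taking. Comparing that against the figure with a task running gives a feel for how close a 1 GB card is to the edge.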
ID: 1942990
J. Mileski
Volunteer tester
Joined: 9 Jun 02
Posts: 632
Credit: 172,116,532
RAC: 572
United States
Message 1943002 - Posted: 7 Jul 2018, 2:11:18 UTC - in response to Message 1942990.  

Yep,
Cuda error 'cufftPlan1d(&fft_analysis_plans[FftNum][0], FftLen, CUFFT_C2C, NumDataPoints / FftLen)' in file 'cuda/cudaAcc_fft.cu' in line 29 : invalid argument.
Is the error you get when you don't have enough vRam.
The thing to remember about vRam usage is generally, the newer the Code, the More vRam it uses. Newer OSes, and CUDA Apps use more.
Also, the Monitor is usually what pushes it over the edge, so, if possible don't connect a Monitor to a GPU with Low vRam. It might work if you add a 2 GB GPU to the machine and connect the Monitor to that 2 GB GPU. Also, you could try the CUDA 6.0 App. 6.0 uses the least vRam while 9.0 uses the most. As for the OS, Mint uses more vRam than say Lubuntu.


I have the monitor connected to the onboard VGA. This Dell does not have a PCIe power connector, so I need to find a new card that uses less than 75 watts
ID: 1943002
Stephen "Heretic" Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 20 Sep 12
Posts: 5557
Credit: 192,787,363
RAC: 628
Australia
Message 1943019 - Posted: 7 Jul 2018, 8:31:53 UTC - in response to Message 1942990.  
Last modified: 7 Jul 2018, 8:44:48 UTC

Yep,
Cuda error 'cufftPlan1d(&fft_analysis_plans[FftNum][0], FftLen, CUFFT_C2C, NumDataPoints / FftLen)' in file 'cuda/cudaAcc_fft.cu' in line 29 : invalid argument.
Is the error you get when you don't have enough vRam.
The thing to remember about vRam usage is generally, the newer the Code, the More vRam it uses. Newer OSes, and CUDA Apps use more.
Also, the Monitor is usually what pushes it over the edge, so, if possible don't connect a Monitor to a GPU with Low vRam. It might work if you add a 2 GB GPU to the machine and connect the Monitor to that 2 GB GPU. Also, you could try the CUDA 6.0 App. 6.0 uses the least vRam while 9.0 uses the most. As for the OS, Mint uses more vRam than say Lubuntu.

. . Is it possible that Cuda80 might save enough vram over cuda90 to get him out of trouble? Or is the difference too small to likely be of any help?

. . If he has 2 slots available maybe some old clunker of a video card to drive the monitor and leave the 750 as crunching only?? ... sorry answered the message too early. :)

. . Is your Dell a full height or low profile unit? My HP is a low profile unit and I am running a low profile GTX1050ti card in it without an external PCIe power connection (card is rated at 75W) and finding no problems. The unit only has a 220W PSU but at full crunch it is only drawing about 110W. Both Gigabyte and MSI make these units in low profile. If your 750 needs the external PCIe power connection and you have 2 (one might be enough) spare molex connectors a simple power adaptor will provide that connection ...

Stephen

??
ID: 1943019
J. Mileski
Volunteer tester
Joined: 9 Jun 02
Posts: 632
Credit: 172,116,532
RAC: 572
United States
Message 1943070 - Posted: 7 Jul 2018, 14:28:07 UTC - in response to Message 1943019.  


. . Is your Dell a full height or low profile unit? My HP is a low profile unit and I am running a low profile GTX1050ti card in it without an external PCIe power connection (card is rated at 75W) and finding no problems. The unit only has a 220W PSU but at full crunch it is only drawing about 110W. Both Gigabyte and MSI make these units in low profile. If your 750 needs the external PCIe power connection and you have 2 (one might be enough) spare molex connectors a simple power adaptor will provide that connection ...

Stephen

??


It's a poweredge t-320 raid server without the redundant power supply, that I got without hard drives. I took out the raid card and connected an ssd to the motherboard. I'm out of town, away from my computer until Tuesday
ID: 1943070
Stephen "Heretic" Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 20 Sep 12
Posts: 5557
Credit: 192,787,363
RAC: 628
Australia
Message 1943209 - Posted: 8 Jul 2018, 1:01:35 UTC - in response to Message 1943070.  


It's a poweredge t-320 raid server without the redundant power supply, that I got without hard drives. I took out the raid card and connected an ssd to the motherboard. I'm out of town, away from my computer until Tuesday


. . I couldn't find any reference to molex connectors in the manual so I am presuming the PSU only provides SATA power connectors. So, since I have never found any SATA to PCIe adaptors, I guess you will be limited to GPUs rated at 75W or less and without need of the external PCIe power connector. But the good news is that unit supports full height and double width PCI cards so you can choose from almost all the 1050/1050ti products on the market except for the high end games oriented units.

. . Good luck with it.

Stephen

. .
ID: 1943209
Profile Keith Myers Special Project $250 donor
Volunteer tester
Joined: 29 Apr 01
Posts: 13164
Credit: 1,160,866,277
RAC: 1,873
United States
Message 1943219 - Posted: 8 Jul 2018, 4:19:24 UTC - in response to Message 1943209.  


. . I couldn't find any reference to molex connectors in the manual so I am presuming the PSU only provides SATA power connectors. So, since I have never found any SATA to PCIe adaptors, I guess you will be limited to GPUs rated at 75W or less and without need of the external PCIe power connector. But the good news is that unit supports full height and double width PCI cards so you can choose from almost all the 1050/1050ti products on the market except for the high end games oriented units.

. . Good luck with it.

Stephen

. .

I found SATA to PCIe adaptors in my first Google search. Pretty much any cable can be adapted and has been by smart vendors who don't leave any stone unturned to make a profit.
Branded 8inch SATA 15pin to 6pin PCI Express Card Power Cable
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 1943219
Stephen "Heretic" Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 20 Sep 12
Posts: 5557
Credit: 192,787,363
RAC: 628
Australia
Message 1943227 - Posted: 8 Jul 2018, 6:33:04 UTC - in response to Message 1943219.  


I found SATA to PCIe adaptors in my first Google search. Pretty much any cable can be adapted and has been by smart vendors who don't leave any stone unturned to make a profit.
Branded 8inch SATA 15pin to 6pin PCI Express Card Power Cable

. . Typical, when I was trying to find one for my HP there was nada, zilch and nothing. Timing is everything :)

. . But there is always a catch ... price $5-79 US, shipping to Oz $30.07 US, cost to me delivered about $50 AUD :(

. . There were some alternate links to Asian sellers where the shipping is much more attractively priced :) I am almost tempted to buy a few just in case ... You never know I may decide to upgrade to a better GPU in my i5 with the 950 :)

. . In any case it can get our friend out of trouble if his 750 needs external PCIe power.

Stephen

:)
ID: 1943227
Profile Brent Norman Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester

Joined: 1 Dec 99
Posts: 2786
Credit: 685,657,289
RAC: 835
Canada
Message 1943230 - Posted: 8 Jul 2018, 6:42:10 UTC - in response to Message 1943227.  

. . But there is always a catch ... price $5-79 US, shipping to Oz $30.07 US, cost to me delivered about $50 AUD :(
You can always go the SATA --> Molex --> PCI route, if you already have the Molex --> PCI cable. SATA --> Molex is common and likely cheap.
ID: 1943230
Profile Keith Myers Special Project $250 donor
Volunteer tester
Joined: 29 Apr 01
Posts: 13164
Credit: 1,160,866,277
RAC: 1,873
United States
Message 1943308 - Posted: 8 Jul 2018, 16:21:17 UTC - in response to Message 1943227.  

Just remember that the SATA connection can only deliver about 54 watts at 12V per cable (12V@4.5A). The 12V lines are the only ones that can be used for gpus. The 5V and 3.3V lines do nothing for you.

So if considering a beefier gpu with either two 6-pin or a 6-pin and an 8-pin connection, you will always need one SATA cable per 6-pin connection and two SATA cables per 8-pin connection. Don't think that just plugging an adapter into two ports on the same SATA cable nets you twice the 54W. It's one SATA cable connection each at the power supply ports.
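The rule of thumb above can be written down as a small sketch. The function name is made up for the example, and the numbers are just the ones quoted in this post (12V at 4.5A per SATA cable, one cable per 6-pin, two per 8-pin), not values taken from a specification.

```python
# 12 V at 4.5 A, as quoted above: 54 W per SATA power cable.
SATA_CABLE_WATTS = 12 * 4.5


def sata_cables_needed(six_pin=0, eight_pin=0):
    """Separate SATA power cables needed for the given PCIe power plugs."""
    # One cable per 6-pin plug, two cables per 8-pin plug.
    return six_pin * 1 + eight_pin * 2


if __name__ == "__main__":
    # A card with one 6-pin and one 8-pin plug.
    print(sata_cables_needed(six_pin=1, eight_pin=1), "cables")
```

Each cable counted here has to run back to its own port on the power supply; two connectors on the same cable still share the same 54W.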
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 1943308
Stephen "Heretic" Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 20 Sep 12
Posts: 5557
Credit: 192,787,363
RAC: 628
Australia
Message 1943380 - Posted: 8 Jul 2018, 22:25:24 UTC - in response to Message 1943308.  

Just remember that the SATA connection can only deliver about 54 watts at 12V per cable (12V@4.5A). The 12V lines are the only ones that can be used for gpus. The 5V and 3.3V lines do nothing for you.

So if considering a beefier gpu with either two 6-pin or a 6-pin and an 8-pin connection, you will always need one SATA cable per 6-pin connection and two SATA cables per 8-pin connection. Don't think that just plugging an adapter into two ports on the same SATA cable nets you twice the 54W. It's one SATA cable connection each at the power supply ports.


. . Good point Keith. If needing 2 SATA connectors using 2 on the same SATA power cable will spread the load between the connectors but will NOT increase the power available to the GPU which is limited to the 54W Keith mentions. You would need two SATA power cables and to use only one connector of each to achieve 108W for the GPU, so with less than 75W from the socket this would still limit you to a 180W GPU. Even if you have 3 SATA ports to use and enough adaptors, the extra one only takes you up to a 230W GPU. So NO Titan V's :)

. . But it would be enough for 1060 or 1070 cards.
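. . The arithmetic behind those figures, as a rough sketch. It assumes the full 75W from the slot and 54W per separate SATA cable, so it gives the upper bound that the rounded 180W and 230W numbers above sit just under. The function name is made up for the example.

```python
def max_gpu_watts(sata_cables, slot_watts=75.0, cable_watts=54.0):
    """Upper bound on GPU power: slot power plus one share per SATA cable."""
    return slot_watts + sata_cables * cable_watts


if __name__ == "__main__":
    for cables in (0, 1, 2, 3):
        print(cables, "SATA cable(s):", max_gpu_watts(cables), "W")
```

Two cables work out to 183W and three to 237W, so allowing a little margin under those totals gives the 180W and 230W limits.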

Stephen

:)
ID: 1943380
 
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.