Setting up Linux to crunch CUDA90 and above for Windows users

Joseph Stateson
Message 2009870 - Posted: 29 Aug 2019, 18:26:46 UTC - in response to Message 2009864.  
Last modified: 29 Aug 2019, 18:29:30 UTC


This is the board in that system:
https://www.gigabyte.com/Motherboard/G1Sniper-Z87-rev-11#kf


I also have GEN=4 (two identical socket 1050 TB85) and I had to disable the on-board graphics on the system with the i7 chip. The other board (the one I just posted about) had a Xeon chip, which did not have Intel graphics. I originally tried getting the Intel Gen4 HD graphics to run OpenCL, but from what I read Intel does not support Gen4 or earlier, so for the 3rd board I built I got a Xeon, which was a lot cheaper.

You might check your on-board graphics output and see if the desktop is there, or simply make sure the on-board graphics is disabled on that Gigabyte board.

HTH

elec999
Message 2009872 - Posted: 29 Aug 2019, 18:29:08 UTC - in response to Message 2009870.  


I have it disabled. I can see the computer POST and Ubuntu boot. Then it goes to a black screen.

Keith Myers
Message 2009876 - Posted: 29 Aug 2019, 18:44:21 UTC - in response to Message 2009872.  


I would say an update bunged up the Nvidia drivers. Try reverting to the Nouveau drivers in recovery mode and then reinstall the Nvidia drivers.
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)

elec999
Message 2009968 - Posted: 30 Aug 2019, 11:50:49 UTC

Anyone know what this error means?

<core_client_version>7.14.2</core_client_version>
<![CDATA[
<message>
process exited with code 1 (0x1, -255)</message>
<stderr_txt>
Cuda error 'Couldn't get cuda device count
' in file 'cuda/cudaAcceleration.cu' in line 154 : no CUDA-capable device is detected.

</stderr_txt>
]]>

Richard Haselgrove
Message 2009969 - Posted: 30 Aug 2019, 12:01:15 UTC - in response to Message 2009968.  

It means that the SETI application couldn't find any CUDA devices when it tried to run a SETI task.

The SETI (CUDA) application wouldn't have been launched by BOINC if BOINC couldn't find any capable GPUs. So the cards, or drivers, must have been working when BOINC started, but must have failed subsequently. Something must have changed.

What were you doing on the machine after starting BOINC, but before getting that message?
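
One way to check whether the driver still sees the cards at the moment the errors appear is to ask it directly. The sketch below is illustrative only and is not part of BOINC or the SETI application: it assumes `nvidia-smi` is on the PATH, and the helper names `parse_gpu_list` and `visible_gpus` are made up for this example. It runs `nvidia-smi -L` and collects the GPU names that come back.

```python
import subprocess


def parse_gpu_list(text):
    """Return the GPU names found in `nvidia-smi -L` style output."""
    names = []
    for line in text.splitlines():
        line = line.strip()
        # Expected shape: "GPU 0: GeForce GTX 1050 Ti (UUID: GPU-...)"
        if line.startswith("GPU ") and ":" in line:
            name = line.split(":", 1)[1]
            name = name.split("(UUID", 1)[0].strip()
            names.append(name)
    return names


def visible_gpus():
    """Ask the driver which GPUs it can see; empty list if the call fails."""
    try:
        result = subprocess.run(
            ["nvidia-smi", "-L"], capture_output=True, text=True, timeout=30
        )
    except (FileNotFoundError, subprocess.TimeoutExpired):
        return []
    if result.returncode != 0:
        return []
    return parse_gpu_list(result.stdout)


if __name__ == "__main__":
    gpus = visible_gpus()
    print(f"{len(gpus)} GPU(s) visible to the driver: {gpus}")
```

If this reports fewer GPUs than BOINC listed in its startup messages, the cards or the driver dropped out after BOINC started.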

elec999
Message 2009978 - Posted: 30 Aug 2019, 14:16:39 UTC - in response to Message 2009969.  

I didn't do anything; this is a dedicated machine. I noticed it's giving these errors after the GPU is done processing the work.

juan BFP
Message 2009981 - Posted: 30 Aug 2019, 14:43:28 UTC - in response to Message 2009978.  
Last modified: 30 Aug 2019, 14:45:58 UTC

Besides the obvious possibility of a GPU malfunction, there is a high probability that something has corrupted your host.

Normally a simple reset of the host solves the problem.

rob smith
Message 2009983 - Posted: 30 Aug 2019, 14:58:33 UTC

Or a GPU has failed.
Bob Smith
Member of Seti PIPPS (Pluto is a Planet Protest Society)
Somewhere in the (un)known Universe?

elec999
Message 2010374 - Posted: 1 Sep 2019, 19:43:06 UTC

Do the CPU and RAM have a huge effect on results? I noticed my dual 2060 host doing really well, beating even my Super card and the 2070.

Keith Myers
Message 2010376 - Posted: 1 Sep 2019, 20:05:40 UTC - in response to Message 2010374.  
Last modified: 1 Sep 2019, 20:10:12 UTC

Since you seem to be referring to GPU performance: the GPUs need enough CPU support to run their best. Data has to flow through both the memory and the CPU to get from the task in storage to the GPU card for crunching, so speeding up either or both of those components will increase the throughput of all GPUs in the host.

[Edit] Looking at those hosts, I also see differences in BOINC version, OS version and kernel version. Generally the 5.0 kernels are seeing a speed regression from earlier kernels like 4.15.0.
For reference look at some of the charts and tests at phoronix.com for kernel benchmark testing.

Ian&Steve C.
Message 2010377 - Posted: 1 Sep 2019, 20:09:52 UTC - in response to Message 2010374.  

For GPU work, no.

Just make sure the CPU isn't overcommitted (i.e. trying to run 100% CPU work plus GPU work on top) and you'll be fine. I run one of my systems with low-power 2.4GHz CPUs and DDR3 RAM with zero impact on the GPU work.
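
As a rough illustration of leaving CPU headroom, the sketch below works out a value for BOINC's "Use at most __ % of the CPUs" preference. It assumes one free CPU thread per running GPU task, which is a rule of thumb and not something measured in this thread, and the function name `cpu_percent_for_boinc` is hypothetical.

```python
def cpu_percent_for_boinc(cpu_threads, gpu_tasks, threads_per_gpu_task=1):
    """Percentage of CPU threads left for CPU work after reserving
    threads for GPU tasks (assumed: one thread per GPU task)."""
    if cpu_threads <= 0:
        raise ValueError("cpu_threads must be positive")
    reserved = gpu_tasks * threads_per_gpu_task
    usable = max(cpu_threads - reserved, 0)
    # Integer division rounds down, which errs toward leaving more CPU free.
    return (100 * usable) // cpu_threads


# Example: 8 CPU threads and 2 GPU tasks running at once.
print(cpu_percent_for_boinc(8, 2))
```

For an 8-thread host running two GPU tasks this gives 75, which keeps two threads away from CPU work.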
Seti@Home classic workunits: 29,492 CPU time: 134,419 hours

Keith Myers
Message 2010378 - Posted: 1 Sep 2019, 20:13:14 UTC - in response to Message 2010377.  

YMMV. I do see a difference in performance when comparing hosts that run slower CPU and memory clocks against other hosts that run faster CPU and memory clocks on the same CPU and hardware.

TBar
Message 2010382 - Posted: 1 Sep 2019, 20:23:18 UTC - in response to Message 2010374.  

If you compare the CPU tasks of those two systems, you will see that the CPU in the host with the 2070 is overcommitted compared to the one with the two 2060s.
GPUs must receive free CPU time to run their best. It's the number one item in the ReadMe.
Try reducing the number of CPU tasks on the machine with the 2070 using the "Use at most __ % of the CPUs" setting.
2070 CPU = Computer ID 8765031
Run time 2 hours 16 min 39 sec
CPU time 1 hours 43 min 27 sec

2060 CPU = Computer ID 8775954
Run time 1 hours 45 min 14 sec
CPU time 1 hours 44 min 41 sec
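
The two sets of figures above can be put into numbers. The sketch below converts the quoted run times and CPU times to seconds and computes what fraction of the wall-clock run time each CPU task actually spent on the CPU. The helper names are hypothetical, and reading a low fraction as "the task was waiting for a free core" is an interpretation, not something the timings prove on their own.

```python
def to_seconds(hours, minutes, seconds):
    """Convert an h/m/s triple to seconds."""
    return hours * 3600 + minutes * 60 + seconds


def cpu_share(run_time_s, cpu_time_s):
    """Fraction of wall-clock run time that was spent on the CPU."""
    return cpu_time_s / run_time_s


# (run time, CPU time) for one CPU task on each host, as quoted above.
hosts = {
    "8765031 (2070 host)": (to_seconds(2, 16, 39), to_seconds(1, 43, 27)),
    "8775954 (2060 host)": (to_seconds(1, 45, 14), to_seconds(1, 44, 41)),
}

for name, (run_s, cpu_s) in hosts.items():
    print(f"{name}: CPU time is {cpu_share(run_s, cpu_s):.1%} of run time")
```

This works out to about 75.7% on the 2070 host and about 99.5% on the 2060 host, so the CPU task on the 2070 host spent roughly a quarter of its run time not executing.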

elec999
Message 2010632 - Posted: 4 Sep 2019, 12:58:12 UTC

Is there any way of knowing, without checking the BOINC client, whether both GPUs are being used?

Sleepy
Message 2010641 - Posted: 4 Sep 2019, 14:32:45 UTC - in response to Message 2010632.  

nvidia-smi?

Ian&Steve C.
Message 2010643 - Posted: 4 Sep 2019, 15:06:07 UTC - in response to Message 2010632.  
Last modified: 4 Sep 2019, 15:23:31 UTC

Please be more specific when asking questions. You have 16 active hosts, each with a different OS/hardware config. Without context no one knows what you are asking about.

Include which host you're referencing with your questions. Providing a link to the host details page, or at the very least describing which host you are asking about, will go a long way, instead of having the first responses be requests for more information from you.

But in general, there are several ways to tell if a GPU is being used.

check the nvidia-smi output (linux) or GPUz for windows
check the boincmgr 'Tasks' tab and see if you see 2 GPUs processing tasks there
put your hand on the GPU, does it feel warm?
look at the GPU, are the fans spinning?
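
The first check in the list can be scripted. This is a minimal sketch, assuming `nvidia-smi` is installed and accepts the `--query-gpu` and `--format` options; the 50% threshold and the function names are arbitrary choices made for illustration.

```python
import subprocess

QUERY = [
    "nvidia-smi",
    "--query-gpu=index,name,utilization.gpu",
    "--format=csv,noheader,nounits",
]


def parse_utilization(csv_text):
    """Turn lines like '0, Quadro P2000, 97' into (index, name, percent)."""
    rows = []
    for line in csv_text.splitlines():
        parts = [p.strip() for p in line.split(",")]
        if len(parts) != 3:
            continue
        index, name, util = parts
        try:
            rows.append((int(index), name, int(util)))
        except ValueError:
            continue  # skip fields that are not numeric
    return rows


def busy_gpus(rows, threshold=50):
    """Indices of GPUs at or above the (assumed) utilization threshold."""
    return [index for index, _name, util in rows if util >= threshold]


def read_utilization():
    """Run nvidia-smi once; return an empty list if it is unavailable."""
    try:
        result = subprocess.run(QUERY, capture_output=True, text=True, timeout=30)
    except (FileNotFoundError, subprocess.TimeoutExpired):
        return []
    return parse_utilization(result.stdout) if result.returncode == 0 else []


if __name__ == "__main__":
    rows = read_utilization()
    print("GPUs seen:", rows)
    print("GPUs busy:", busy_gpus(rows))
```

A single reading can catch a GPU between tasks, so it is worth sampling a few times before concluding that a card is idle.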

I looked through all your dual GPU systems, and you have Valid submissions from both GPUs in all cases, so yes, both of your GPUs are being used.

Perhaps you can also explain WHY you think you are not using both of them, such that you need to ask?

Richard Haselgrove
Message 2010644 - Posted: 4 Sep 2019, 15:12:39 UTC - in response to Message 2010643.  

but in general, there are several ways to tell if a GPU is being use.

check the nvidia-smi output (linux) or GPUz for windows
check the boincmgr 'Tasks' tab and see if you see 2 GPUs processing tasks there
put your hand on the GPU, does it feel warm?
look at the GPU, are the fans spinning?
Look at the std_err text in the results listed on this website. The single host I checked showed tasks being completed on Device 1, and other tasks being completed on Device 2. But I'm not going to check all of them without guidance.

Ian&Steve C.
Message 2010645 - Posted: 4 Sep 2019, 15:15:19 UTC - in response to Message 2010644.  

Yeah, I just checked all of his dual GPU hosts and edited my comment. He has valid work from both GPUs in all cases, so I'm not sure why he is worried that he's not using them.

elec999
Message 2010646 - Posted: 4 Sep 2019, 15:17:45 UTC - in response to Message 2010643.  

Host ID: 8759418
Name: z440
Location: home
Avg. credit: 46,645.77
Total credit: 2,395,820
BOINC version: 7.9.3
CPU: GenuineIntel Intel(R) Xeon(R) CPU E5-1620 v3 @ 3.50GHz [Family 6 Model 63 Stepping 2] (8 processors)
GPU: [2] NVIDIA Quadro P2000 (4095MB) driver: 410.10
OS: Linux Ubuntu, Ubuntu 18.04.2 LTS [4.15.0-20-generic|libc 2.27 (Ubuntu GLIBC 2.27-3ubuntu1)]
Last contact: 4 Sep 2019, 14:22:23 UTC

I installed a 1050 Ti this morning. That's why I asked. I forgot to check before I left the datacenter.

Kissagogo27
Message 2010726 - Posted: 5 Sep 2019, 9:06:09 UTC
Last modified: 5 Sep 2019, 9:07:53 UTC

Some examples of validated tasks from that host, one run on each GPU:


<core_client_version>7.9.3</core_client_version>
<![CDATA[
<stderr_txt>
setiathome_CUDA: Found 2 CUDA device(s):
Device 1: Quadro P2000, 5059 MiB, regsPerBlock 65536
computeCap 6.1, multiProcs 8
pciBusID = 3, pciSlotID = 0
Device 2: GeForce GTX 1050 Ti, 4038 MiB, regsPerBlock 65536
computeCap 6.1, multiProcs 6
pciBusID = 2, pciSlotID = 0
In cudaAcc_initializeDevice(): Boinc passed DevPref 2
setiathome_CUDA: CUDA Device 2 specified, checking...
Device 2: GeForce GTX 1050 Ti is okay
<===
SETI@home using CUDA accelerated device GeForce GTX 1050 Ti <===
Unroll autotune 1. Overriding Pulse find periods per launch. Parameter -pfp set to 1

setiathome v8 enhanced x41p_V0.98b1, Cuda 9.00 special
Modifications done by petri33, compiled by TBar



<core_client_version>7.9.3</core_client_version>
<![CDATA[
<stderr_txt>
setiathome_CUDA: Found 2 CUDA device(s):
Device 1: Quadro P2000, 5059 MiB, regsPerBlock 65536
computeCap 6.1, multiProcs 8
pciBusID = 3, pciSlotID = 0
Device 2: GeForce GTX 1050 Ti, 4038 MiB, regsPerBlock 65536
computeCap 6.1, multiProcs 6
pciBusID = 2, pciSlotID = 0
In cudaAcc_initializeDevice(): Boinc passed DevPref 1
setiathome_CUDA: CUDA Device 1 specified, checking...
Device 1: Quadro P2000 is okay
<===
SETI@home using CUDA accelerated device Quadro P2000 <===
Unroll autotune 1. Overriding Pulse find periods per launch. Parameter -pfp set to 1

setiathome v8 enhanced x41p_V0.98b1, Cuda 9.00 special
Modifications done by petri33, compiled by TBar
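
Reading the device out of stderr text like the two samples above can be automated. The following is a sketch only: it assumes the stderr text has already been saved or copied from the task pages, and it relies on the "Boinc passed DevPref N" and "using CUDA accelerated device ..." lines looking exactly as they do in these samples. The function names are hypothetical.

```python
import re
from collections import Counter

DEVPREF_RE = re.compile(r"Boinc passed DevPref (\d+)")
USING_RE = re.compile(r"using CUDA accelerated device (.+?)\s*<===")


def device_used(stderr_text):
    """Return (device number, device name) for one task's stderr text."""
    pref = DEVPREF_RE.search(stderr_text)
    using = USING_RE.search(stderr_text)
    number = int(pref.group(1)) if pref else None
    name = using.group(1) if using else None
    return number, name


def tally_devices(stderr_texts):
    """Count how many tasks ran on each (number, name) device."""
    return Counter(device_used(text) for text in stderr_texts)


# Shortened copies of the two samples above.
task_a = (
    "In cudaAcc_initializeDevice(): Boinc passed DevPref 2\n"
    "SETI@home using CUDA accelerated device GeForce GTX 1050 Ti <===\n"
)
task_b = (
    "In cudaAcc_initializeDevice(): Boinc passed DevPref 1\n"
    "SETI@home using CUDA accelerated device Quadro P2000 <===\n"
)

for device, count in tally_devices([task_a, task_b]).items():
    print(device, count)
```

If every device number reported by "Found N CUDA device(s)" shows up in the tally, all of the GPUs in the host are completing work.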
