CUDA for boinc 6.6.41/6.10.17 on Linux 2.6.31

ML1
Volunteer moderator
Volunteer tester
Joined: 25 Nov 01
Posts: 20265
Credit: 7,508,002
RAC: 20
United Kingdom
Message 946585 - Posted: 11 Nov 2009, 20:54:37 UTC
Last modified: 11 Nov 2009, 20:56:07 UTC

OK, so my Mandriva Linux system has:

Linux 2.6.31.5-server-1mnb #1 SMP Fri Oct 23 01:19:00 EDT 2009 x86_64

CUDA toolkit v2.3 (2.2 is available also)

nVidia driver version 185.18.36

on a GeForce GTS 250

and... BOINC 6.6.41 doesn't find the card for CUDA:


Starting BOINC client version 6.6.40 for x86_64-pc-linux-gnu
[---] log flags: task, file_xfer, sched_ops
[---] Libraries: libcurl/7.18.0 OpenSSL/0.9.8g zlib/1.2.3 c-ares/1.5.1
[---] Data directory: /home/bgusr/.boinc
[---] Processor: 4 AuthenticAMD AMD Phenom(tm) II X4 955 Processor [Family 16 Model 4 Stepping 2]
[---] Processor features: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm 3dnowext 3dnow constant_tsc rep_good nonstop_tsc extd_apicid pni monitor cx16 popcnt lahf_lm cmp_leg
[---] OS: Linux: 2.6.31.5-server-1mnb
[---] Memory: 7.82 GB physical, 15.99 GB virtual
[---] Disk: 64.00 GB total, 4.11 GB free
[---] Local time is UTC +0 hours
[---] No CUDA-capable NVIDIA GPUs found
[---] No coprocessors
[SETI@home] Found app_info.xml; using anonymous platform
[---] [error] Application uses nonexistent coprocessor CUDA
[---] Not using a proxy
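
The decisive line in a startup log like the one above is the GPU detection message. A minimal sketch for picking it out of a saved log, assuming the two message strings quoted in this thread (other client versions may word them differently). The function name gpu_status is made up for illustration:

```shell
# Classify a BOINC startup log read from stdin.
# Both patterns are copied from log lines quoted in this thread;
# they are assumptions for any other client version.
gpu_status() {
    log=$(cat)
    if printf '%s\n' "$log" | grep -q 'No CUDA-capable NVIDIA GPUs found'; then
        echo "not detected"
    elif printf '%s\n' "$log" | grep -q 'NVIDIA GPU [0-9][0-9]*:'; then
        echo "detected"
    else
        echo "unknown"
    fi
}
```

For example, `gpu_status < boinc_startup.log`, where boinc_startup.log is a hypothetical file holding the client's startup messages.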


The really interesting bit is that the boinc_cuda application for s@h gets deleted, and the links to the CUDA libraries get deleted as well. All of those files are named in the app_info.xml file.

Also, I downloaded BOINC 6.6.41 for x86_64 and note that it reports v6.6.40...


Is this a known problem for that card/CUDA combination? Any suggestions as to what to try?

Cheers,
Martin
See new freedom: Mageia Linux
Take a look for yourself: Linux Format
The Future is what We all make IT (GPLv3)
ID: 946585

Jord
Volunteer tester
Joined: 9 Jun 99
Posts: 15184
Credit: 4,362,181
RAC: 3
Netherlands
Message 946608 - Posted: 11 Nov 2009, 23:17:58 UTC - in response to Message 946585.  

Can you also check that Mandriva actually reports a GTS 250? The message you get now suggests the card is being misidentified: you see "No CUDA-capable NVIDIA GPUs found", whereas a failure to load the library file (libcudart.so) would give "Can't load library libcudart".


ID: 946608

ML1
Volunteer moderator
Volunteer tester
Joined: 25 Nov 01
Posts: 20265
Credit: 7,508,002
RAC: 20
United Kingdom
Message 946710 - Posted: 12 Nov 2009, 13:07:45 UTC - in response to Message 946608.  

Jord wrote:
> Can you also check that Mandriva actually reports a GTS 250? The message you get now suggests the card is being misidentified: you see "No CUDA-capable NVIDIA GPUs found", whereas a failure to load the library file (libcudart.so) would give "Can't load library libcudart".


The GUI tools all name it as a GTS 250...

lspci lists:
01:00.0 VGA compatible controller: nVidia Corporation G92 [GeForce GTS 250] (rev a2)

And the nVidia display tool lists:

Graphics Processor: GeForce GTS 250
VBIOS Version: 62.92.5d.00.01
Memory: 512 MB

file /usr/lib64/libcudart.so.2.3
/usr/lib64/libcudart.so.2.3: ELF 64-bit LSB shared object, x86-64, version 1 (SYSV), dynamically linked, stripped

And the various libcudart names all point to that version via filesystem symbolic links.
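
A minimal sketch for confirming where each libcudart name ends up, assuming GNU `readlink -f` is available. The function name show_cudart_targets is made up for illustration, and /usr/lib64 is simply the location used on this system:

```shell
# Print every libcudart* entry in a directory together with the file it
# finally resolves to. The default /usr/lib64 is an assumption taken from
# the paths in this thread; adjust it for other distributions.
show_cudart_targets() {
    dir=${1:-/usr/lib64}
    for f in "$dir"/libcudart*; do
        # Skip the unexpanded pattern when nothing matches
        # (-L also catches dangling symlinks, which -e would miss).
        [ -e "$f" ] || [ -L "$f" ] || continue
        printf '%s -> %s\n' "$f" "$(readlink -f "$f")"
    done
}
```

Called without arguments it inspects /usr/lib64; pass the BOINC directory instead to check the links there.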


Putting an old libcudart in place (it might be v2.1?) and voilà:

[---] NVIDIA GPU 0: GeForce GTS 250 (driver version 0, compute capability 1.1, 512MB, est. 84GFLOPS)
[SETI@home] Found app_info.xml; using anonymous platform
[...]
[---] file projects/setiathome.berkeley.edu/setiathome-6.08.CUDA_2.2_x86_64-pc-linux-gnu not found
[---] file projects/setiathome.berkeley.edu/libcudart.so.2 not found
[---] file projects/setiathome.berkeley.edu/libcufft.so.2 not found
[---] [error] No URL for file transfer of setiathome-6.08.CUDA_2.2_x86_64-pc-linux-gnu
[---] [error] No URL for file transfer of libcudart.so.2
[---] [error] No URL for file transfer of libcufft.so.2
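
The "not found" lines above name three files that the client expects in the project directory. A minimal sketch for checking them before starting the client. The function name missing_files is made up for illustration, and the file names in the usage note are the ones from the log:

```shell
# List which of the named files are absent from a directory.
# Usage: missing_files DIR NAME...
missing_files() {
    dir=$1
    shift
    for name in "$@"; do
        # -e follows symlinks, so a dangling link counts as missing too.
        [ -e "$dir/$name" ] || printf '%s\n' "$name"
    done
}
```

Run from the BOINC data directory, for example: `missing_files projects/setiathome.berkeley.edu setiathome-6.08.CUDA_2.2_x86_64-pc-linux-gnu libcudart.so.2 libcufft.so.2`. No output means all three are present.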



OK... So why doesn't v2.3 work?

Is it worth risking the old cudart even though I have the v2.3 nVidia toolkit installed?

Cheers,
Martin



ID: 946710

ML1
Volunteer moderator
Volunteer tester
Joined: 25 Nov 01
Posts: 20265
Credit: 7,508,002
RAC: 20
United Kingdom
Message 946715 - Posted: 12 Nov 2009, 13:45:37 UTC - in response to Message 946710.  

Upgrading to the dkms nVidia 190.42 drivers with the Mandriva package manager now allows BOINC to find the graphics card OK, even with the v2.3 CUDA toolkit.

Is there anything to be gained by jumping up to the Boinc 6.10.x series?

Happy crunchin',
Martin

ID: 946715

Jord
Volunteer tester
Joined: 9 Jun 99
Posts: 15184
Credit: 4,362,181
RAC: 3
Netherlands
Message 946719 - Posted: 12 Nov 2009, 13:59:25 UTC - in response to Message 946715.  

Other than that the detection of the CUDA GPUs is different in 6.10 and there are some improvements to the CPU/GPU schedulers? No, not really. ;-)

By the way, BOINC 6.6 came with its own libcudart.so, so unless you overwrote that one... I don't think there's much difference between the 2.1, 2.2 and 2.3 versions, other than that the later versions can recognize more (newer) GPUs.

In BOINC 6.10 the detection is changed to using an Nvidia API in combination with the drivers you have installed. I'll explain that one again on a cold winter evening in front of the fire. :-)
ID: 946719

ML1
Volunteer moderator
Volunteer tester
Joined: 25 Nov 01
Posts: 20265
Credit: 7,508,002
RAC: 20
United Kingdom
Message 946723 - Posted: 12 Nov 2009, 14:22:06 UTC - in response to Message 946719.  
Last modified: 12 Nov 2009, 14:22:28 UTC

Jord wrote:
> Other than that the detection of the CUDA GPUs is different in 6.10 and there are some improvements to the CPU/GPU schedulers? No, not really. ;-)
>
> By the way, BOINC 6.6 came with its own libcudart.so, so unless you overwrote that one... I don't think there's much difference between the 2.1, 2.2 and 2.3 versions, other than that the later versions can recognize more (newer) GPUs.
>
> In BOINC 6.10 the detection is changed to using an Nvidia API in combination with the drivers you have installed. I'll explain that one again on a cold winter evening in front of the fire. :-)

OK, so for the sake of experimenting and furthering the development and whatnot...

[---] Starting BOINC client version 6.10.17 for x86_64-pc-linux-gnu
[---] log flags: file_xfer, sched_ops, task
[---] Libraries: libcurl/7.18.0 OpenSSL/0.9.8g zlib/1.2.3 c-ares/1.5.1
[---] Data directory: /home/bgusr/.boinc
[---] Processor: 4 AuthenticAMD AMD Phenom(tm) II X4 955 Processor [Family 16 Model 4 Stepping 2]
[---] Processor: 512.00 KB cache
[---] Processor features: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm 3dnowext 3dnow constant_tsc rep_good nonstop_tsc extd_apicid pni monitor cx16 popcnt lahf_lm cmp_leg
[---] OS: Linux: 2.6.31.5-server-1mnb
[---] Memory: 7.82 GB physical, 15.99 GB virtual
[---] Disk: 64.00 GB total, 3.83 GB free
[---] Local time is UTC +0 hours
[---] NVIDIA GPU 0: GeForce GTS 250 (driver version unknown, CUDA version 2030, compute capability 1.1, 512MB, 470 GFLOPS peak)
[SETI@home] Found app_info.xml; using anonymous platform
[---] Not using a proxy
[---] Version change (6.6.40 -> 6.10.17)


I noticed that boinc_6.10.17_x86_64-pc-linux-gnu.sh comes with its own libcudart.so. That binary has remained unchanged from previous installs since before 2009-09-21. I've moved it aside and linked in the version that came with the nVidia v2.3 toolkit:


# Run from the BOINC directory
mv libcudart.so libcudart.so.boinc
ln -s /usr/lib64/libcudart.so libcudart.so
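
The same swap can be sketched as a small function that refuses to overwrite an earlier backup and can safely be run twice. This is only an illustration: the name relink_cudart is made up, and the default library path is the one used on this system.

```shell
# Move BOINC's bundled libcudart.so aside and link to the toolkit copy.
# Usage: relink_cudart BOINC_DIR [SYSTEM_LIB]
relink_cudart() {
    boinc_dir=$1
    system_lib=${2:-/usr/lib64/libcudart.so}   # assumed toolkit location
    target="$boinc_dir/libcudart.so"

    # Refuse to continue if the toolkit library is missing.
    [ -e "$system_lib" ] || { echo "missing: $system_lib" >&2; return 1; }

    # A regular file here is the bundled binary: keep it as *.boinc,
    # but never clobber a backup that already exists.
    if [ -f "$target" ] && [ ! -L "$target" ]; then
        [ ! -e "$target.boinc" ] || { echo "backup already exists: $target.boinc" >&2; return 1; }
        mv "$target" "$target.boinc"
    fi

    # -f replaces an existing link; -n avoids following it first.
    ln -sfn "$system_lib" "$target"
}
```

Stop the BOINC client before running it, then pass the BOINC directory as the first argument.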


And have there been API convolutions with just nVidia, or are there the usual Microsoft twists in there also?...

Hope you're keeping warm!

Happy crunchin',
Martin
ID: 946723

Jord
Volunteer tester
Joined: 9 Jun 99
Posts: 15184
Credit: 4,362,181
RAC: 3
Netherlands
Message 946725 - Posted: 12 Nov 2009, 14:37:24 UTC - in response to Message 946723.  

ML1 wrote:
> And have there been API convolutions with just nVidia, or are there the usual Microsoft twists in there also?...

In the past (6.4, 6.5, 6.6, 6.8) BOINC would detect your GPU using the libcudart.* file that came with BOINC. Since 6.10, it's done through an API that Nvidia provided, with a minimum driver requirement of CUDA 2.2/185.85 (as you know).

Checking through the coproc.cpp code, I see that the new library file Linux BOINC checks for is libcuda.so, which is the API in this case.
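
A minimal sketch for checking whether the dynamic loader knows about that driver library, assuming a glibc system where `ldconfig -p` prints the loader cache. The function name find_libcuda is made up for illustration, and how BOINC itself searches for the library is not shown here:

```shell
# Filter `ldconfig -p` style output (read from stdin) down to the driver
# library libcuda.so. The escaped dot keeps libcudart.so from matching.
# Exit status is 0 when at least one matching line was found.
find_libcuda() {
    grep 'libcuda\.so'
}
```

Typical use would be `ldconfig -p | find_libcuda`. On some systems ldconfig lives in /sbin and is not on an ordinary user's PATH.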
ID: 946725

ML1
Volunteer moderator
Volunteer tester
Joined: 25 Nov 01
Posts: 20265
Credit: 7,508,002
RAC: 20
United Kingdom
Message 946758 - Posted: 12 Nov 2009, 19:46:03 UTC
Last modified: 12 Nov 2009, 19:46:35 UTC

OK, the first WU is through ok and just awaiting validation. Looking cool...

resultid=1419291897


Happy crunchin',
Martin
ID: 946758

Raymond Sutton
Joined: 14 Sep 09
Posts: 2
Credit: 5,578,151
RAC: 3
United States
Message 948198 - Posted: 19 Nov 2009, 5:20:06 UTC

CUDA not working!

I have a GeForce 9800 based graphics card with 1024 MB of video RAM on a Core i7, 12 GB, X58-based system running openSUSE 11.2 (2.6.31-5 kernel) with the NVIDIA 190.42 driver installed (I also tried 190.18 from the CUDA download page). BOINC 6.10.17 reports no usable GPU found.

WHY?

Active projects: Seti@home, Einstein@home & World Community Grid.


As I understand it, this is a valid CUDA 2.3 environment, so I don't understand why the GPU is not detected (at least for the S@H project).

Thoughts, Opinions, Suggestions (if legal and relevant :-)) Welcome

Thanks

Ray
ID: 948198

Gundolf Jahn
Joined: 19 Sep 00
Posts: 3184
Credit: 446,358
RAC: 0
Germany
Message 948212 - Posted: 19 Nov 2009, 9:05:36 UTC - in response to Message 948198.  

Raymond Sutton wrote:
> As I understand it, this is a valid CUDA 2.3 environment, so I don't understand why the GPU is not detected (at least for the S@H project).

The detection is done by BOINC independently of the projects.

Does the BOINC user/group have sufficient access to the graphics drivers (/dev)?

Regards,
Gundolf
Computers aren't everything in life. (Just kidding)

SETI@home classic workunits 3,758
SETI@home classic CPU time 66,520 hours
ID: 948212

ML1
Volunteer moderator
Volunteer tester
Joined: 25 Nov 01
Posts: 20265
Credit: 7,508,002
RAC: 20
United Kingdom
Message 948245 - Posted: 19 Nov 2009, 15:19:14 UTC - in response to Message 948212.  

Gundolf Jahn wrote:
> > As I understand it, this is a valid CUDA 2.3 environment, so I don't understand why the GPU is not detected (at least for the S@H project).
>
> The detection is done by BOINC independently of the projects.
>
> Does the BOINC user/group have sufficient access to the graphics drivers (/dev)?

Good thought there.

To give further detail, for my system from a command line:

ls -lh /dev/nv*

crw-rw---- 1 root video 195, 0 2009-11-17 19:50 /dev/nvidia0
crw-rw---- 1 root video 195, 255 2009-11-17 19:50 /dev/nvidiactl

and the user id for the Boinc process is a member of the video group.
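
The check described above can be sketched as a small function, to be run as the account that runs BOINC. The function name check_rw is made up for illustration. The device node names are the ones from the listing above and may differ on other systems:

```shell
# Report whether the current user can read and write each given path.
# Returns non-zero if any path is not accessible.
check_rw() {
    rc=0
    for node in "$@"; do
        if [ -r "$node" ] && [ -w "$node" ]; then
            echo "ok: $node"
        else
            echo "no access: $node"
            rc=1
        fi
    done
    return "$rc"
}
```

For example: `check_rw /dev/nvidia0 /dev/nvidiactl`. If that reports "no access", `id -nG` shows which groups the current user belongs to, which can be compared against the group owning the device nodes (video in the listing above).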

Can you post what you see on the problem system, and fix it if it differs?

Also describe how you installed Boinc.

Happy crunchin',
Martin

ID: 948245

Raymond Sutton
Joined: 14 Sep 09
Posts: 2
Credit: 5,578,151
RAC: 3
United States
Message 948379 - Posted: 20 Nov 2009, 1:27:40 UTC - in response to Message 948245.  

Thanks everybody, it was the file permissions on /dev/nv*. GPU now reporting for duty as ordered :-)
ID: 948379

David Trethewey
Joined: 23 Jan 03
Posts: 1
Credit: 909
RAC: 0
United Kingdom
Message 948492 - Posted: 20 Nov 2009, 10:06:57 UTC - in response to Message 948379.  

I'm using OpenSUSE 11.2, nvidia 190.42 (GeForce 9500 GT), BOINC 6.10.17.

I downloaded and installed CUDA toolkit 2.3

However the messages still say "No Usable GPUs found"
ID: 948492

ML1
Volunteer moderator
Volunteer tester
Joined: 25 Nov 01
Posts: 20265
Credit: 7,508,002
RAC: 20
United Kingdom
Message 948500 - Posted: 20 Nov 2009, 13:20:34 UTC - in response to Message 948492.  
Last modified: 20 Nov 2009, 13:23:00 UTC

David Trethewey wrote:
> I'm using OpenSUSE 11.2, nvidia 190.42 (GeForce 9500 GT), BOINC 6.10.17.
>
> I downloaded and installed CUDA toolkit 2.3
>
> However the messages still say "No Usable GPUs found"

Sorry, more detail needed.

Have you used the SUSE package manager for all the installs?

Have you read through and checked each point in this thread to see if everything is in place? Especially the symbolic links in the Boinc directories to link over to the CUDA libs?

For example, I have:

In the boinc directory:
lrwxrwxrwx  1 bgusr bgusr   21 2009-01-22 21:06 libboinc_api.so -> libboinc_api.so.6.6.2*
lrwxrwxrwx  1 bgusr bgusr   21 2009-01-22 21:06 libboinc_api.so.6 -> libboinc_api.so.6.6.2*
-rwxr-xr-x  1 bgusr bgusr 178K 2009-01-22 19:03 libboinc_api.so.6.6.2*
lrwxrwxrwx  1 bgusr bgusr   17 2009-01-22 21:06 libboinc.so -> libboinc.so.6.6.2*
lrwxrwxrwx  1 bgusr bgusr   17 2009-01-22 21:06 libboinc.so.6 -> libboinc.so.6.6.2*
-rwxr-xr-x  1 bgusr bgusr 1.5M 2009-01-22 19:03 libboinc.so.6.6.2*
lrwxrwxrwx  1 bgusr bgusr   21 2009-01-22 21:06 libboinc_zip.so -> libboinc_zip.so.6.6.2*
lrwxrwxrwx  1 bgusr bgusr   21 2009-01-22 21:06 libboinc_zip.so.6 -> libboinc_zip.so.6.6.2*
-rwxr-xr-x  1 bgusr bgusr 741K 2009-01-22 19:03 libboinc_zip.so.6.6.2*
lrwxrwxrwx  1 bgusr bgusr   23 2009-11-12 14:09 libcudart.so -> /usr/lib64/libcudart.so*
lrwxrwxrwx  1 bgusr bgusr   25 2009-01-19 15:35 libcudart.so.2 -> /usr/lib64/libcudart.so.2*
lrwxrwxrwx  1 bgusr bgusr   27 2009-01-19 15:35 libcudart.so.2.1 -> /usr/lib64/libcudart.so.2.1*
lrwxrwxrwx  1 bgusr bgusr   27 2009-11-11 17:43 libcudart.so.2.3 -> /usr/lib64/libcudart.so.2.3*
-rwxr-xr-x  1 bgusr bgusr 249K 2009-09-21 23:01 libcudart.so.boinc*
lrwxrwxrwx  1 bgusr bgusr   25 2009-11-11 17:47 libcufftemu.so -> /usr/lib64/libcufftemu.so*
lrwxrwxrwx  1 bgusr bgusr   27 2009-11-11 18:58 libcufftemu.so.2 -> /usr/lib64/libcufftemu.so.2*
lrwxrwxrwx  1 bgusr bgusr   29 2009-11-11 17:47 libcufftemu.so.2.3 -> /usr/lib64/libcufftemu.so.2.3*
lrwxrwxrwx  1 bgusr bgusr   22 2009-11-11 17:48 libcufft.so -> /usr/lib64/libcufft.so*
lrwxrwxrwx  1 bgusr bgusr   24 2009-11-11 18:58 libcufft.so.2 -> /usr/lib64/libcufft.so.2*
lrwxrwxrwx  1 bgusr bgusr   26 2009-11-11 17:47 libcufft.so.2.3 -> /usr/lib64/libcufft.so.2.3*


And in the projects/setiathome.berkeley.edu directory:
lrwxrwxrwx 1 bgusr bgusr   23 2009-01-19 15:40 libcudart.so -> /usr/lib64/libcudart.so*
lrwxrwxrwx 1 bgusr bgusr   25 2009-11-12 13:37 libcudart.so.2 -> /usr/lib64/libcudart.so.2*
lrwxrwxrwx 1 bgusr bgusr   27 2009-01-19 15:40 libcudart.so.2.1 -> /usr/lib64/libcudart.so.2.1*
lrwxrwxrwx 1 bgusr bgusr   27 2009-11-11 17:45 libcudart.so.2.3 -> /usr/lib64/libcudart.so.2.3*
lrwxrwxrwx 1 bgusr bgusr   25 2009-11-11 17:46 libcufftemu.so -> /usr/lib64/libcufftemu.so*
lrwxrwxrwx 1 bgusr bgusr   27 2009-11-11 18:57 libcufftemu.so.2 -> /usr/lib64/libcufftemu.so.2*
lrwxrwxrwx 1 bgusr bgusr   29 2009-11-11 17:46 libcufftemu.so.2.3 -> /usr/lib64/libcufftemu.so.2.3*
lrwxrwxrwx 1 bgusr bgusr   22 2009-01-19 15:40 libcufft.so -> /usr/lib64/libcufft.so*
lrwxrwxrwx 1 bgusr bgusr   24 2009-11-12 13:37 libcufft.so.2 -> /usr/lib64/libcufft.so.2*
lrwxrwxrwx 1 bgusr bgusr   26 2009-01-19 15:40 libcufft.so.2.1 -> /usr/lib64/libcufft.so.2.1*
lrwxrwxrwx 1 bgusr bgusr   26 2009-11-11 17:46 libcufft.so.2.3 -> /usr/lib64/libcufft.so.2.3*


(I upgraded from cuda 2.1 -> cuda 2.3. You don't need the 2.1 links.)
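
The links in the two listings above could be created in one go with something like the following. It is only a sketch: the function name link_cuda_libs is made up for illustration, and it links whatever libcudart, libcufft and libcufftemu names exist in the source directory rather than a fixed list.

```shell
# Link every libcudart/libcufft/libcufftemu file found in SRC into DEST.
# Usage: link_cuda_libs SRC DEST
link_cuda_libs() {
    src=$1
    dest=$2
    for lib in "$src"/libcudart.so* "$src"/libcufft.so* "$src"/libcufftemu.so*; do
        [ -e "$lib" ] || continue          # pattern matched nothing
        name=${lib##*/}
        # Leave real files (such as a bundled library) untouched.
        if [ -e "$dest/$name" ] && [ ! -L "$dest/$name" ]; then
            echo "skipping existing file: $dest/$name" >&2
            continue
        fi
        ln -sfn "$lib" "$dest/$name"
    done
}
```

Use an absolute SRC such as /usr/lib64, because the link stores the path exactly as given. Run it once with the BOINC directory as DEST and once with projects/setiathome.berkeley.edu.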

Also, once the GPU is detected, you will need an app_info.xml with a CUDA enabled optimised s@h application...

Good luck,
Martin
ID: 948500

ML1
Volunteer moderator
Volunteer tester
Joined: 25 Nov 01
Posts: 20265
Credit: 7,508,002
RAC: 20
United Kingdom
Message 948501 - Posted: 20 Nov 2009, 13:29:30 UTC - in response to Message 946719.  

Jord wrote:
> ... By the way, BOINC 6.6 came with its own libcudart.so, so unless you overwrote that one... I don't think there's much difference between the 2.1, 2.2 and 2.3 versions, other than that the later versions can recognize more (newer) GPUs. ...

I suspect that the Boinc bundled libcudart.so is cuda 2.1. I renamed it out of the way to libcudart.so.boinc during my tests.

There's a comment from Lunatics that there was about a 30% speedup moving from CUDA 2.1 -> 2.2, and about the same again from 2.2 -> 2.3. They've found a much smaller speedup for CUDA 2.3 -> CUDA 3.0.


At the moment, I seem to have stumbled across a problem with the Boinc scheduler in Boinc 6.10.17 whereby the s@h cuda app can be left idle. Details posted on the devs list.

Happy fast crunchin',
Martin

ID: 948501


 