CUDA Errors Out on Linux

Terror Australis
Volunteer tester

Joined: 14 Feb 04
Posts: 1817
Credit: 262,693,308
RAC: 44
Australia
Message 932760 - Posted: 12 Sep 2009, 13:50:32 UTC

Hi All
I'm attempting to get CUDA running on my Linux boxes. I'm using crunch3r's Linux 6.08 app with the Nvidia v185.18 video drivers and BOINC v6.4.5 for x86_64. The OS is Mandriva 2008.1 PowerPack, kernel version 2.6.24.7. The CUDA card is an 8600GT; if I can get this working I'll upgrade the card.

The problem is that when BOINC starts it appears to recognise the card OK, but the unit errors out straight away. I've checked the permissions of the BOINC directory and they're OK. Starting BOINC from a console shows no errors. It also deletes the AK_v8 app on start-up.

The stderr output looks like this:
<core_client_version>6.4.5</core_client_version>
<![CDATA[
<message>
process exited with code 193 (0xc1, -63)
</message>
<stderr_txt>

SETI@home MB CUDA 608 Linux 64bit SM 1.0 - r06 by Crunch3r :p

setiathome_CUDA: Found 1 CUDA device(s):
Device 1 : GeForce 8600 GT
totalGlobalMem = 268107776
sharedMemPerBlock = 16384
regsPerBlock = 8192
warpSize = 32
memPitch = 262144
maxThreadsPerBlock = 512
clockRate = 1188000
totalConstMem = 65536
major = 1
minor = 1
textureAlignment = 256
deviceOverlap = 1
multiProcessorCount = 4
setiathome_CUDA: CUDA Device 1 specified, checking...
Device 1: GeForce 8600 GT is okay
SIGSEGV: segmentation violation
Stack trace (16 frames):
setiathome-CUDA-6.08.x86_64-pc-linux-gnu[0x47cba9]
/lib64/libpthread.so.0[0x2b1ddde43350]
/usr/lib64/libcuda.so.1[0x2b1ddd6ba940]
/usr/lib64/libcuda.so.1[0x2b1ddd6c06a4]
/usr/lib64/libcuda.so.1[0x2b1ddd689a2f]
/usr/lib64/libcuda.so.1[0x2b1ddd415296]
/usr/lib64/libcuda.so.1[0x2b1ddd425bab]
/usr/lib64/libcuda.so.1[0x2b1ddd40d198]
/usr/lib64/libcuda.so.1(cuCtxCreate+0xaa)[0x2b1ddd40700a]
setiathome-CUDA-6.08.x86_64-pc-linux-gnu[0x5ace4b]
setiathome-CUDA-6.08.x86_64-pc-linux-gnu[0x40d4ca]
setiathome-CUDA-6.08.x86_64-pc-linux-gnu[0x419f23]
setiathome-CUDA-6.08.x86_64-pc-linux-gnu[0x424c7d]
setiathome-CUDA-6.08.x86_64-pc-linux-gnu[0x407f60]
/lib64/libc.so.6(__libc_start_main+0xf4)[0x2b1dde06f074]
setiathome-CUDA-6.08.x86_64-pc-linux-gnu(__gxx_personality_v0+0x241)[0x407be9]

Exiting...

</stderr_txt>
]]>

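The three numbers in the "process exited with code" line are one value written three ways. A minimal sketch of the arithmetic, assuming the last figure is the same byte reinterpreted as a signed 8-bit integer (the helper name is made up for illustration). If memory serves, BOINC uses 193 to mean "terminated by a signal", which would fit the SIGSEGV in the log, but treat that as an assumption to check against the client source.

```python
def exit_code_forms(code):
    """Return the decimal, hex and signed 8-bit forms of an exit status byte."""
    byte = code & 0xFF                      # exit statuses are a single byte
    signed = byte - 256 if byte >= 128 else byte
    return byte, hex(byte), signed


if __name__ == "__main__":
    print(exit_code_forms(193))             # (193, '0xc1', -63)
```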

libcuda.so.1 is actually a symlink pointing to libcuda.so.185.18.36, which is present in both the /usr/lib and /usr/lib64 directories.
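Since the library exists in two directories, one thing worth ruling out is that each copy has the word size its directory implies. The following is only a sketch: it resolves the symlink and reads the ELF header, where byte 4 (EI_CLASS) is 1 for 32-bit and 2 for 64-bit objects. The function name is hypothetical, and the two paths are the ones mentioned in this post.

```python
import os

ELF_CLASSES = {1: "32-bit", 2: "64-bit"}


def describe_library(path):
    """Return (resolved_path, word_size) for a shared library, following symlinks."""
    real = os.path.realpath(path)
    with open(real, "rb") as fh:
        header = fh.read(5)
    # A valid ELF file starts with the magic bytes 0x7f 'E' 'L' 'F'.
    if len(header) < 5 or header[:4] != b"\x7fELF":
        return real, "not an ELF file"
    return real, ELF_CLASSES.get(header[4], "unknown")


if __name__ == "__main__":
    for candidate in ("/usr/lib/libcuda.so.1", "/usr/lib64/libcuda.so.1"):
        if os.path.exists(candidate):
            print(candidate, "->", *describe_library(candidate))
        else:
            print(candidate, "is missing")
```

A 64-bit app such as setiathome-CUDA-6.08.x86_64-pc-linux-gnu needs the 64-bit copy; a mismatch would normally show up as a load failure rather than a segfault, so a clean result here only eliminates one possibility.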

My app_info file, cobbled together from the existing app_info file and the sample provided by crunch3r, looks like this:
<app_info>
<app>
<name>setiathome_enhanced</name>
</app>
<file_info>
<name>setiathome-CUDA-6.08.x86_64-pc-linux-gnu</name>
<executable/>
</file_info>
<file_info>
<name>libcudart.so.2.1</name>
<executable/>
</file_info>
<file_info>
<name>libcufft.so.2.1</name>
<executable/>
</file_info>
<app_version>
<app_name>setiathome_enhanced</app_name>
<version_num>608</version_num>
<plan_class>cuda</plan_class>
<avg_ncpus>0.050000</avg_ncpus>
<max_ncpus>0.050000</max_ncpus>
<coproc>
<type>CUDA</type>
<count>1</count>
</coproc>
<file_ref>
<file_name>setiathome-CUDA-6.08.x86_64-pc-linux-gnu</file_name>
<main_program/>
</file_ref>
<file_ref>
<file_name>libcudart.so.2.1</file_name>
</file_ref>
<file_ref>
<file_name>libcufft.so.2.1</file_name>
</file_ref>
</app_version>

<app>
<name>setiathome_enhanced</name>
</app>
<file_info>
<name>AK_V8_linux64_ssse3</name>
<executable/>
</file_info>
<app_version>
<app_name>setiathome_enhanced</app_name>
<version_num>603</version_num>
<file_ref>
<file_name>AK_V8_linux64_ssse3</file_name>
<main_program/>
</file_ref>
</app_version>
<app>
<name>astropulse</name>
</app>
<file_info>
<name>astropulse-5.0.i686-pc-linux-gnu</name>
<executable/>
</file_info>
<app_version>
<app_name>astropulse</app_name>
<version_num>500</version_num>
<file_ref>
<file_name>astropulse-5.0.i686-pc-linux-gnu</file_name>
<main_program/>
</file_ref>
</app_version>
<app>
<name>astropulse_v5</name>
</app>
<file_info>
<name>astropulse-5.03.x86_64-pc-linux-gnu</name>
<executable/>
</file_info>
<app_version>
<app_name>astropulse_v5</app_name>
<version_num>503</version_num>
<file_ref>
<file_name>astropulse-5.03.x86_64-pc-linux-gnu</file_name>
<main_program/>
</file_ref>
</app_version>
</app_info>

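A hand-merged app_info.xml is easy to get subtly wrong, so a quick consistency check may help. The sketch below is not a BOINC tool and its function name is invented. It assumes the layout shown above, with <app>, <file_info> and <app_version> as direct children of <app_info>. It reports files that are declared but missing from the project directory, files referenced by an <app_version> without a matching <file_info>, and <app> names declared more than once.

```python
import os
import sys
import xml.etree.ElementTree as ET
from collections import Counter


def check_app_info(xml_text, project_dir):
    """Return a dict of possible problems found in an app_info.xml document."""
    root = ET.fromstring(xml_text)
    declared = [n.text.strip() for n in root.findall("file_info/name")]
    referenced = [n.text.strip()
                  for n in root.findall("app_version/file_ref/file_name")]
    app_names = Counter(n.text.strip() for n in root.findall("app/name"))
    return {
        "missing_on_disk": [f for f in declared
                            if not os.path.isfile(os.path.join(project_dir, f))],
        "referenced_not_declared": [f for f in referenced if f not in declared],
        "duplicate_apps": sorted(a for a, c in app_names.items() if c > 1),
    }


if __name__ == "__main__" and len(sys.argv) == 2:
    # Usage: python check_app_info.py <project directory containing app_info.xml>
    project_dir = sys.argv[1]
    with open(os.path.join(project_dir, "app_info.xml")) as fh:
        print(check_app_info(fh.read(), project_dir))
```

Run against the file above, this would list setiathome_enhanced under duplicate_apps, because that <app> block appears twice. Whether the client cares about the repetition is not something this thread establishes.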

The BOINC start-up log looks like this:
Sat 12 Sep 2009 22:54:36 CST||Starting BOINC client version 6.4.5 for x86_64-pc-linux-gnu
Sat 12 Sep 2009 22:54:36 CST||log flags: task, file_xfer, sched_ops
Sat 12 Sep 2009 22:54:36 CST||Libraries: libcurl/7.18.0 OpenSSL/0.9.8g zlib/1.2.3 c-ares/1.5.1
Sat 12 Sep 2009 22:54:36 CST||Data directory: /home/brodo/BOINC
Sat 12 Sep 2009 22:54:36 CST|SETI@home|Found app_info.xml; using anonymous platform
Sat 12 Sep 2009 22:54:36 CST||Processor: 4 GenuineIntel Intel(R) Core(TM)2 Quad CPU Q6600 @ 2.40GHz [Family 6 Model 15 Stepping 11]
Sat 12 Sep 2009 22:54:36 CST||Processor features: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall lm constant_tsc pebs bts rep_good pni monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr lahf_lm
Sat 12 Sep 2009 22:54:36 CST||OS: Linux: 2.6.24.7-desktop-2mnb
Sat 12 Sep 2009 22:54:36 CST||Memory: 1.97 GB physical, 2.16 GB virtual
Sat 12 Sep 2009 22:54:36 CST||Disk: 5.03 GB total, 3.94 GB free
Sat 12 Sep 2009 22:54:36 CST||Local time is UTC +9 hours
Sat 12 Sep 2009 22:54:36 CST||Not using a proxy
Sat 12 Sep 2009 22:54:36 CST||CUDA devices found
Sat 12 Sep 2009 22:54:36 CST||Coprocessor: GeForce 8600 GT (1)
Sat 12 Sep 2009 22:54:36 CST|SETI@home|URL: http://setiathome.berkeley.edu/; Computer ID: 4518307; location: school; project prefs: school
Sat 12 Sep 2009 22:54:36 CST|World Community Grid|URL: http://www.worldcommunitygrid.org/; Computer ID: 674391; location: (none); project prefs: default
Sat 12 Sep 2009 22:54:36 CST|SETI@home Beta Test|URL: http://setiweb.ssl.berkeley.edu/beta/; Computer ID: 42131; location: (none); project prefs: default
Sat 12 Sep 2009 22:54:36 CST||General prefs: from SETI@home (last modified 02-May-2009 11:58:29)
Sat 12 Sep 2009 22:54:36 CST||Computer location: school
Sat 12 Sep 2009 22:54:36 CST||General prefs: using separate prefs for school
Sat 12 Sep 2009 22:54:36 CST||Preferences limit memory usage when active to 1006.86MB
Sat 12 Sep 2009 22:54:36 CST||Preferences limit memory usage when idle to 2013.71MB
Sat 12 Sep 2009 22:54:36 CST||Preferences limit disk usage to 1.00GB


As usual any assistance given will be greatly appreciated.

TIA
Brodo
trigggl
Volunteer tester
Joined: 9 Jan 09
Posts: 5
Credit: 1,120,435
RAC: 0
United States
Message 932835 - Posted: 12 Sep 2009, 18:42:00 UTC - in response to Message 932760.  

I'm having the same problem in Ubuntu Jaunty with NVIDIA driver 190 and CUDA lib 2.3.

1357953361

I also tried it with the supplied CUDA libs. No difference.

When I did get one to crunch, it used the CPU instead and gave a bogus result.
Terror Australis
Volunteer tester

Joined: 14 Feb 04
Posts: 1817
Credit: 262,693,308
RAC: 44
Australia
Message 933081 - Posted: 13 Sep 2009, 16:41:55 UTC

If someone could just explain what the error messages mean, it would be a help. It appears I'm not the only one having this problem.
What driver versions, client versions, etc. are others using successfully? Are there any additional libraries that need to be installed? Are the libs in the correct directory to begin with?
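On reading the trace: frames are listed innermost first, so the SIGSEGV was raised inside /usr/lib64/libcuda.so.1, in code reached from cuCtxCreate, which the app calls right after reporting the device as okay. One way to tell a driver problem from an app problem is to create a context without the SETI app at all. The sketch below does that through ctypes with the CUDA driver API calls cuInit, cuDeviceGet, cuCtxCreate and cuCtxDestroy. The function and helper names are invented, and the fallback to the "_v2" symbols is an assumption made so the same script also works with newer drivers.

```python
import ctypes


def _symbol(lib, *names):
    """Return the first exported function among names, or raise AttributeError."""
    for name in names:
        try:
            return getattr(lib, name)
        except AttributeError:
            continue
    raise AttributeError(names[0])


def try_create_context(libname="libcuda.so.1", ordinal=0):
    """Load the driver library, then create and destroy one CUDA context.

    Returns a short status string. If the driver itself segfaults, as in the
    trace above, the Python process dies before anything is returned.
    """
    try:
        cuda = ctypes.CDLL(libname)
    except OSError as exc:
        return f"could not load {libname}: {exc}"

    rc = cuda.cuInit(0)
    if rc != 0:
        return f"cuInit returned CUresult {rc}"

    device = ctypes.c_int()
    rc = cuda.cuDeviceGet(ctypes.byref(device), ordinal)
    if rc != 0:
        return f"cuDeviceGet returned CUresult {rc}"

    # Newer drivers export versioned entry points; a 2009-era driver such as
    # 185.18 is assumed to export only the plain names.
    create = _symbol(cuda, "cuCtxCreate_v2", "cuCtxCreate")
    destroy = _symbol(cuda, "cuCtxDestroy_v2", "cuCtxDestroy")

    context = ctypes.c_void_p()
    rc = create(ctypes.byref(context), 0, device)
    if rc != 0:
        return f"cuCtxCreate returned CUresult {rc}"
    destroy(context)
    return "context created and destroyed"


if __name__ == "__main__":
    print(try_create_context())
```

If this script also dies with a segmentation fault, the problem most likely sits with the driver installation rather than with the app or app_info.xml. If it succeeds, the app and the libcudart/libcufft versions it ships with become the more likely suspects.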

TIA
Brodo
