linux cuda computation errors

Chuck Gorish

Joined: 19 Jun 00
Posts: 156
Credit: 29,589,106
RAC: 0
United States
Message 908246 - Posted: 16 Jun 2009, 23:58:52 UTC

Every CUDA work unit on this host ends with a 'computation error'.

Our hardware is a dual-processor (single-core) Opteron server with an NVIDIA 8400 GS PCI card mounted on a riser, since it is a 1U unit.

We are running an AMD64 (x86_64) OS with a 64-bit build of BOINC 6.6.20 and the V 6.08 Linux x64 SM 1.0 CUDA app by Crunch3r. The NVIDIA driver, version 180.51, is loaded. There is no X installed since this is a server. Below is the complete uploaded record of a typical failed work unit; each one errors out within seconds:

Name 01mr09ad.20702.8252.5.8.218_2
Workunit 459529162
Created 16 Jun 2009 14:37:13 UTC
Sent 16 Jun 2009 23:28:26 UTC
Received 16 Jun 2009 23:31:04 UTC
Server state Over
Outcome Client error
Client state Compute error
Exit status 193 (0xc1)
Computer ID 4874432
Report deadline 10 Jul 2009 15:39:13 UTC
Run time 33.01
stderr out

<core_client_version>6.6.20</core_client_version>
<![CDATA[
<message>
process exited with code 193 (0xc1, -63)
</message>
<stderr_txt>

SETI@home MB CUDA 608 Linux 64bit SM 1.0 - r06 by Crunch3r :p

setiathome_CUDA: Found 1 CUDA device(s):
Device 1 : GeForce 8400 GS
totalGlobalMem = 536608768
sharedMemPerBlock = 16384
regsPerBlock = 8192
warpSize = 32
memPitch = 262144
maxThreadsPerBlock = 512
clockRate = 1400000
totalConstMem = 65536
major = 1
minor = 1
textureAlignment = 256
deviceOverlap = 0
multiProcessorCount = 1
setiathome_CUDA: CUDA Device 1 specified, checking...
Device 1: GeForce 8400 GS is okay
SETI@home using CUDA accelerated device GeForce 8400 GS
setiathome_enhanced 6.01 Revision: 402 g++ (GCC) 4.1.2 20070925 (Red Hat 4.1.2-33)
libboinc: BOINC 6.5.0

Work Unit Info:
...............
WU true angle range is : 0.411166
Optimal function choices:
-----------------------------------------------------
name
-----------------------------------------------------
v_BaseLineSmooth (no other)
v_GetPowerSpectrum 0.00075 0.00000
v_ChirpData 0.02875 0.00000
v_Transpose4 0.01596 0.00000
FPU opt folding 0.00284 0.00000
SIGILL: illegal instruction
Stack trace (9 frames):
setiathome-CUDA-6.08.x86_64-pc-linux-gnu[0x47cba9]
/lib/libc.so.6[0x7f187825b2b0]
setiathome-CUDA-6.08.x86_64-pc-linux-gnu(_Z16v_BaseLineSmoothPA2_fiii+0x25e)[0x40be98]
setiathome-CUDA-6.08.x86_64-pc-linux-gnu[0x40da23]
setiathome-CUDA-6.08.x86_64-pc-linux-gnu[0x419f23]
setiathome-CUDA-6.08.x86_64-pc-linux-gnu[0x424c7d]
setiathome-CUDA-6.08.x86_64-pc-linux-gnu[0x407f60]
/lib/libc.so.6(__libc_start_main+0xe6)[0x7f1878247486]
setiathome-CUDA-6.08.x86_64-pc-linux-gnu(__gxx_personality_v0+0x241)[0x407be9]

Exiting...

</stderr_txt>
]]>

Validate state Invalid
Claimed credit 0.074659318426745
Granted credit 0
application version 6.08



Any ideas what is going on?


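Two details in the stderr output above can be checked mechanically. The sketch below is only an illustration and is not part of the BOINC client or the SETI@home app; the helper names `as_signed_byte` and `symbol_frames` are made up for this example, and the embedded text is an excerpt of the log, not the full trace. It shows that 193, 0xc1 and -63 are the same 8-bit exit status, and it pulls out the stack frames that carry a symbol name. The first such frame is `_Z16v_BaseLineSmoothPA2_fiii`, the mangled C++ name of `v_BaseLineSmooth`.

```python
import re

# Excerpt copied from the stderr output quoted in the first post.
STDERR = """\
process exited with code 193 (0xc1, -63)
SIGILL: illegal instruction
Stack trace (9 frames):
setiathome-CUDA-6.08.x86_64-pc-linux-gnu[0x47cba9]
/lib/libc.so.6[0x7f187825b2b0]
setiathome-CUDA-6.08.x86_64-pc-linux-gnu(_Z16v_BaseLineSmoothPA2_fiii+0x25e)[0x40be98]
setiathome-CUDA-6.08.x86_64-pc-linux-gnu[0x40da23]
/lib/libc.so.6(__libc_start_main+0xe6)[0x7f1878247486]
"""


def as_signed_byte(code):
    """Reinterpret an exit status in 0..255 as a signed 8-bit value."""
    return code - 256 if code > 127 else code


# Matches frames of the form: module(symbol+0xoffset)[0xaddress]
# Frames without a symbol, such as module[0xaddress], do not match.
FRAME = re.compile(
    r"^(?P<module>[^\s(\[]+)"
    r"\((?P<symbol>[^+)]+)\+0x(?P<offset>[0-9a-f]+)\)"
    r"\[0x(?P<addr>[0-9a-f]+)\]$"
)


def symbol_frames(text):
    """Return (module, symbol, offset) for every frame that names a symbol."""
    frames = []
    for line in text.splitlines():
        m = FRAME.match(line.strip())
        if m:
            frames.append((m["module"], m["symbol"], int(m["offset"], 16)))
    return frames


if __name__ == "__main__":
    print(hex(193), as_signed_byte(193))
    for module, symbol, offset in symbol_frames(STDERR):
        print(f"{module}: {symbol} +{offset:#x}")
```

The symbol frame only says where the illegal instruction was hit; it does not say why, so the cause still has to be worked out from the hardware and the build of the app.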
Evangelos Katikos
Joined: 18 Oct 99
Posts: 17
Credit: 101,612,081
RAC: 0
Greece
Message 908948 - Posted: 18 Jun 2009, 22:11:50 UTC - in response to Message 908246.  

Are you absolutely sure that the nvidia module is loaded? Have you checked with lsmod?

Have you installed the CUDA libraries properly?

Can you post a link to your host?
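The lsmod check suggested above can also be scripted, which helps on a headless server. lsmod formats the contents of /proc/modules, where the first field of each line is the module name. The following is a minimal sketch under that assumption; the function names are made up for this example, and the `SAMPLE` text is hypothetical, not output from the host in this thread.

```python
def loaded_modules(proc_modules_text):
    """Return the set of module names found in /proc/modules-style text."""
    names = set()
    for line in proc_modules_text.splitlines():
        fields = line.split()
        if fields:
            # The first whitespace-separated field is the module name.
            names.add(fields[0])
    return names


def nvidia_module_loaded(path="/proc/modules"):
    """Check the running kernel's module list for a module named 'nvidia'."""
    with open(path) as f:
        return "nvidia" in loaded_modules(f.read())


# Hypothetical sample in the /proc/modules layout, for illustration only.
SAMPLE = """\
nvidia 8000000 0 - Live 0xffffffffa0000000 (P)
ext3 125000 2 - Live 0xffffffffa0100000
"""

print("nvidia" in loaded_modules(SAMPLE))
```

On the server itself, calling `nvidia_module_loaded()` reads the real /proc/modules. A loaded module only shows that the driver is present; it does not prove that the card was initialized correctly.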
Chuck Gorish

Joined: 19 Jun 00
Posts: 156
Credit: 29,589,106
RAC: 0
United States
Message 909358 - Posted: 20 Jun 2009, 1:09:30 UTC - in response to Message 908948.  


The nvidia module is loaded, and the CUDA SDK bandwidth and nbody tests gave proper results. I have installed 64-bit CUDA on several machines, including my own workstation. The only differences are that this is the first Opteron and the first machine without X installed that I have tried this on. I have many standard 64-bit multibeam installations running on Opterons, but I had never tried CUDA on an Opteron server before.

I suspect this is simply not usable in this machine. First, even though Tyan assures me it is compatible, I do not like the idea of sticking a standard PCI card into a PCI-X slot. Tyan also states that their server BIOS does not properly initialize 'external' video cards, since the board has onboard video. I am about to give this project up and advise my tech to install the card in a workstation instead.



