SETI Cuda Errors

kevin6912
Volunteer tester

Joined: 18 Jul 99
Posts: 17
Credit: 10,539,602
RAC: 0
United States
Message 843990 - Posted: 23 Dec 2008, 4:01:47 UTC

This Host

Windows Vista 32

Nvidia 9800GT 1024MB
nvlddmkm: 7.15.11.8048 ForceWare: 180.48

SETI Cuda version: 6.05

Workunit with errors:

Workunit 383990980 Task 1098128917
Message(s):
- exit code -5 (0xfffffffb)

Cuda error 'find_pulse_kernel2<3, false>' in file 'c:/sw/gpgpu/seti/seti_boinc/client/cuda/cudaAcc_pulsefind.cu' in line 1166 : unknown error.
SETI@home error -5 Can't open file
(work_unit.sah) in read_wu_state() errno=2

File: ..\worker.cpp
Line: 122
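
The two forms of the exit code in the first message, -5 and 0xfffffffb, are the same 32-bit value read as signed and as unsigned. A minimal Python sketch of that conversion follows; the helper names are illustrative and not part of BOINC or the SETI@home application.

```python
def to_unsigned32(code: int) -> int:
    """Reinterpret a signed 32-bit exit code as an unsigned value."""
    return code & 0xFFFFFFFF


def to_signed32(value: int) -> int:
    """Reinterpret an unsigned 32-bit value as a signed exit code."""
    return value - 0x100000000 if value & 0x80000000 else value


print(hex(to_unsigned32(-5)))   # prints 0xfffffffb
print(to_signed32(0xFFFFFFFB))  # prints -5
```

Small positive codes such as the "exit code 1 (0x1)" reported further down are unchanged by either conversion.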


Workunit 383990999 Task 1098128915
Message(s):
Incorrect function. (0x1) - exit code 1 (0x1)


Workunit 383990770 Task 1098128475
Message(s):
Cuda error 'find_pulse_kernel2<3, false>' in file 'c:/sw/gpgpu/seti/seti_boinc/client/cuda/cudaAcc_pulsefind.cu' in line 1166 : unknown error.


Workunit 383991014 Task 1098128933
Message(s):
No heartbeat from core client for 30 sec - exiting
Cuda error 'find_pulse_kernel2<3, false>' in file 'c:/sw/gpgpu/seti/seti_boinc/client/cuda/cudaAcc_pulsefind.cu' in line 1166 : unknown error.


Workunit 383991490 Task 1098129942
Message(s):
Cuda error 'find_pulse_kernel2<3, false>' in file 'c:/sw/gpgpu/seti/seti_boinc/client/cuda/cudaAcc_pulsefind.cu' in line 1166 : unknown error.

Processing task 1098129942 caused the video driver to fail and recover with Windows BSoD.

These are the errors I have so far.

Thanks
Kevin
ID: 843990

Maik
Joined: 15 May 99
Posts: 163
Credit: 9,208,555
RAC: 0
Germany
Message 844079 - Posted: 23 Dec 2008, 7:50:22 UTC
Last modified: 23 Dec 2008, 8:22:32 UTC

And here is my list of CUDA errors ;)

  • OS: WinXP Pro x86 SP3
  • Co-CPU: GeForce 9600 GT, installed driver: nv4_disp 6.14.11.8048 - nVIDIA ForceWare 180.48
  • Cuda-app: MB_6.04_Winx86_CUDA.exe file-version 6.2.0.0


Tasks aborted manually via BM (because the CUDA app was stuck):


  • Task: 1098051286
    Error: Unhandled Exception Detected...
    - Unhandled Exception Record -
    Reason: Breakpoint Encountered (0x80000003) at address 0x7C91120E
  • Task: 1097953753
    Error: same as above
  • Task: 1097953246
    Error: same as above


Tasks aborted by the CUDA app (compute error):


  • Task: 1098051279
    Error: Cuda error 'cudaMemcpy(best_PoT, dev_tmp_pot, max_nb_of_elems * sizeof(float), cudaMemcpyDeviceToHost)' in file 'd:/BTR/seticuda/seti_boinc/client/cuda/cudaAcc_pulsefind.cu' in line 1265 : unknown error.
  • Task: 1098051244
    Error: same as above
  • Task: 1098051239
    Error: same as above
  • Task: 1097953252
    Error: same as above
  • Task: 1097953247
    Error: same as above
  • Task: 1097953233
    Error: same as above
  • Task: 1094382786
    ==>> I think this WU was crunched [edit] with beta drivers (CUDA v2.1, ForceWare 180.60), if I remember correctly[/edit]
    Error: SETI@home error -12 Unknown error
    cudaAcc_find_triplets doesn't support more than MAX_TRIPLETS_ABOVE_THRESHOLD numBinsAboveThreshold in find_triplets_kernel
    File: c:/sw/gpgpu/seti/seti_boinc/client/cuda/cudaAcc_pulsefind.cu
    Line: 232


I have a lot more errors, but they all seem to report the same problem ...


ID: 844079

Jord
Volunteer tester
Joined: 9 Jun 99
Posts: 15184
Credit: 4,362,181
RAC: 3
Netherlands
Message 844132 - Posted: 23 Dec 2008, 11:50:21 UTC

Thanks guys, forwarded to the developers.
ID: 844132

Maik
Joined: 15 May 99
Posts: 163
Credit: 9,208,555
RAC: 0
Germany
Message 844511 - Posted: 24 Dec 2008, 9:29:27 UTC
Last modified: 24 Dec 2008, 9:57:31 UTC

Some "new" crashed tasks ... it looks like all of them are "VLAR" (Very Low Angle Range) WUs.

  • OS: WinXP Pro x86 SP3
  • Co-CPU: GeForce 9600 GT, installed driver: nv4_disp 6.14.11.8048 - nVIDIA ForceWare 180.48
  • Cuda-app: MB_6.04_Winx86_CUDA.exe file-version 6.2.0.0

All tasks were aborted by the CUDA app (compute error) with this error message:
Cuda error 'cudaMemcpy(best_PoT, dev_tmp_pot, max_nb_of_elems * sizeof(float), cudaMemcpyDeviceToHost)' in file 'd:/BTR/seticuda/seti_boinc/client/cuda/cudaAcc_pulsefind.cu' in line 1265 : unknown error.


  • Task: 1098051287
    WU true angle range is : 0.008693
  • Task: 1098051279
    WU true angle range is : 0.008693
  • Task: 1098051254
    WU true angle range is : 0.008693
  • Task: 1098051250
    WU true angle range is : 0.011198
  • Task: 1098051248
    WU true angle range is : 0.011198
  • Task: 1098051244
    WU true angle range is : 0.008693
  • Task: 1098051239
    WU true angle range is : 0.011198


[edit]
Last night, after the project downtime, I got a lot of new WUs for CUDA.
One task blocked CUDA crunching for half the night. I stopped SETI and searched for those VLAR WUs, as reported in this thread.
I aborted 29 (!) tasks and ... surprise, surprise ... no errors ;P
[/edit]
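
The search described in the edit above can be sketched in Python: read the "WU true angle range is" line from each task's stderr text and flag the low values. This is only a sketch. The cutoff of 0.05 is an assumption (the thread does not state one), and the function names are made up for illustration.

```python
import re

# Matches the line the application prints, e.g. "WU true angle range is : 0.008693"
AR_PATTERN = re.compile(r"WU true angle range is\s*:\s*([0-9]*\.?[0-9]+)")

# Assumed cutoff for "very low" angle range; not taken from the thread.
VLAR_THRESHOLD = 0.05


def true_angle_range(stderr_text):
    """Return the angle range found in a task's stderr text, or None."""
    match = AR_PATTERN.search(stderr_text)
    return float(match.group(1)) if match else None


def is_vlar(stderr_text, threshold=VLAR_THRESHOLD):
    """True if the task reports an angle range below the assumed cutoff."""
    ar = true_angle_range(stderr_text)
    return ar is not None and ar < threshold


# Two of the tasks listed above, keyed by task ID.
tasks = {
    1098051287: "WU true angle range is : 0.008693",
    1098051250: "WU true angle range is : 0.011198",
}
for task_id, text in tasks.items():
    print(task_id, true_angle_range(text), is_vlar(text))
```

Both angle ranges reported in this post fall well below the assumed cutoff, so any reasonable threshold would flag them.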


ID: 844511

Jörg
Joined: 10 Dec 02
Posts: 51
Credit: 1,547,286
RAC: 0
Germany
Message 844572 - Posted: 24 Dec 2008, 15:03:26 UTC - in response to Message 844511.  
Last modified: 24 Dec 2008, 15:04:43 UTC

Hello,

same problem here.

Vista Ultimate 64bit
8800 GTS (512MB)
Driver 180.48

WU-ID:
385018060
385018044
385018054
385018038
385018048
385018020
385018024
385017996
385018000
385017994
384429952
384603180
384603280
384583107
384574941
384574934
384574928
384556073
384556136
384555982
384556124
384555847
384555841
384555835
384555812
384555823
384551996
384551153
384551091
384547812
384520477
384520467
ID: 844572

Maik
Joined: 15 May 99
Posts: 163
Credit: 9,208,555
RAC: 0
Germany
Message 844975 - Posted: 25 Dec 2008, 13:22:14 UTC
Last modified: 25 Dec 2008, 13:55:13 UTC

A further request from the developers. When you report errors, could you also please:

a) state if you overclock the GPU and if so by how much?
b) tell if you have tried to clock the GPU to default speeds and still see the problem?
c) did you set the fan speed manually to anything else than default?
d) which program(s) do you use to overclock the GPU, fan etc.?
e) which program(s) do you use to keep check of the GPU?
source

a) I did not overclock my GPU myself. I have no experience with that.
b) ~
c) Fan-speed is not set manually.
d) ~
e) CUDA-Z | Everest Ultimate Edition

Everest information (seti-cuda is running)

    GPU-Diode: 49°C
    Fan-Speed: 35%
    GPU Clock (Shader Domain): 1600 MHz (original: 1625 MHz)
    GPU Clock (Geometric Domain): 650 MHz (original: 650 MHz)
    RAMDAC Clock: 400 MHz
    Memory Bus Type: GDDR3
    Memory-Bus Width: 256-bit
    Memory-Real-Clock: 900 MHz (DDR) (original: 900 MHz)
    Memory-Effective-Clock: 1800 MHz
    Bandwidth: 56.3 GB/s


CUDA-Z Report (report was created while seti-cuda was running)
=============
Version: 0.4.74
http://cuda-z.sourceforge.net/
OS Version: Windows x86 5.1.2600 Service Pack 3

Core Information
----------------


    Name: GeForce 9600 GT
    Compute Capability: 1.1
    Clock Rate: 1600 MHz
    Multiprocessors: 8
    Warp Size: 32
    Regs Per Block: 8192
    Threads Per Block: 512
    Threads Dimentions: 512 x 512 x 64
    Grid Dimentions: 65535 x 65535 x 1


Memory Information
------------------


    Total Global: 511.688 MB
    Shared Per Block: 16 KB
    Pitch: 256 KB
    Total Constant: 64 KB
    Texture Alignment: 256
    GPU Overlap: Yes


Performance Information
-----------------------
Memory Copy


    Host Pinned to Device: 1611.2 MB/s
    Host Pageable to Device: 719.592 MB/s
    Device to Host Pinned: 1565.75 MB/s
    Device to Host Pageable: 768.56 MB/s
    Device to Device: 9582.05 MB/s


GPU Core Performance


    Single-precision Float: 99915.7 Mflop/s
    Double-precision Float: Not Supported
    32-bit Integer: 37078.1 Miop/s
    24-bit Integer: 134620 Miop/s



Official Specifications from nvidia (source)

GPU Engine Specs:


    Processor Cores 64
    Graphics Clock (MHz) 650 MHz
    Processor Clock (MHz) 1625 MHz
    Texture Fill Rate (billion/sec) 20.8

Memory Specs:


    Memory Clock (MHz) 900 MHz
    Standard Memory Config 512 MB
    Memory Interface Width 256-bit
    Memory Bandwidth (GB/sec) 57.6
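
The official 57.6 GB/sec figure follows from the other numbers in this post: a 900 MHz real memory clock, two transfers per clock (DDR), and a 256-bit bus. A small Python sketch of that arithmetic, with an illustrative function name and 1 GB taken as 10^9 bytes:

```python
def memory_bandwidth_gb_s(real_clock_mhz, bus_width_bits, transfers_per_clock=2):
    """Peak memory bandwidth in GB/s, with 1 GB = 10**9 bytes."""
    bytes_per_transfer = bus_width_bits / 8
    transfers_per_second = real_clock_mhz * 1e6 * transfers_per_clock
    return transfers_per_second * bytes_per_transfer / 1e9


# 900 MHz real clock, DDR, 256-bit bus (values from the listings above)
print(memory_bandwidth_gb_s(900, 256))  # prints 57.6
```

Everest's 56.3 GB/s is slightly lower. That may come from a different unit convention, but this is a guess rather than something the report states.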


ID: 844975

kevin6912
Volunteer tester
Joined: 18 Jul 99
Posts: 17
Credit: 10,539,602
RAC: 0
United States
Message 845090 - Posted: 25 Dec 2008, 20:28:39 UTC

A further request from the developers. When you report errors, could you also please:

a) state if you overclock the GPU and if so by how much?
b) tell if you have tried to clock the GPU to default speeds and still see the problem?
c) did you set the fan speed manually to anything else than default?
d) which program(s) do you use to overclock the GPU, fan etc.?
e) which program(s) do you use to keep check of the GPU?

Quoted Source


a) GPU is running at default speed.
b) GPU has always been running at default speed and has had problems.
c) The GPU fan is under default control.
d) I have not used any programs to overclock GPU
e) CUDA-Z and/or GPU-Z

Thanks,
Kevin
ID: 845090

alpina
Joined: 18 Dec 08
Posts: 22
Credit: 32,011
RAC: 0
Belgium
Message 846196 - Posted: 29 Dec 2008, 0:00:50 UTC
Last modified: 29 Dec 2008, 0:01:49 UTC

The CUDA application was going quite well (I aborted the VLARs) until the following result went wrong with an error that's new to me:

task

OS= Vista 32bit
GPU= 8800GTS 320 MB
driver version= 178.24

My graphics card is not overclocked and the temperature is a constant 81 °C (the highest of the 3 reported temps) under full stress. This is the same temperature I get with the Folding@home application, and that application runs flawlessly.
ID: 846196

alpina
Joined: 18 Dec 08
Posts: 22
Credit: 32,011
RAC: 0
Belgium
Message 846684 - Posted: 30 Dec 2008, 2:28:38 UTC

Something quite remarkable is going on with this workunit. While my CUDA task ran fine, the other CUDA task somehow reported an overflow. I thought the "overflow" type of errors was consistent and dependent on the true angle value, but this workunit proves otherwise.
ID: 846684

alpina
Joined: 18 Dec 08
Posts: 22
Credit: 32,011
RAC: 0
Belgium
Message 846902 - Posted: 30 Dec 2008, 15:31:35 UTC

Another strange phenomenon I discovered: this result got validated although it has no output giving the spike counts etc. found. On what basis did this result pass validation?
ID: 846902

Byron S Goodgame
Volunteer tester
Joined: 16 Jan 06
Posts: 1145
Credit: 3,936,993
RAC: 0
United States
Message 846908 - Posted: 30 Dec 2008, 15:43:56 UTC - in response to Message 846902.  
Last modified: 30 Dec 2008, 16:08:23 UTC

Another strange phenomenon I discovered: this result got validated although it has no output giving the spike counts etc. found. On what basis did this result pass validation?

You think that's strange? Check this task, done with CUDA and validated ;). Apparently, even though the stderr part is incomplete, the science part is intact and valid.

Something quite remarkable is going on with this workunit. While my CUDA task ran fine, the other CUDA task somehow reported an overflow. I thought the "overflow" type of errors was consistent and dependent on the true angle value, but this workunit proves otherwise.


That was explained in the message
ID: 846908

Andrew Roberts
Joined: 27 Jun 99
Posts: 1
Credit: 1,026,974
RAC: 0
United Kingdom
Message 847139 - Posted: 31 Dec 2008, 8:39:38 UTC

I've seen multiple errors with CUDA.

1) work units with errors
2) multiple graphics driver restarts "Display driver nvlddmkm stopped responding and has successfully recovered."
3) most recently snow on the display

1 and 2 happened in the last couple of weeks, 3 today.

I've now disabled CUDA via the online profile.

Microsoft Windows Vista: Ultimate x64 Edition, Service Pack 1, (06.00.6001.00)
GeForce 9800 GTX
Driver 181.00 (official Nvidia OpenGL 3 driver)
Default clock speeds (675MHz Core, 1688MHz Shader, 1100MHz Memory)
8 GB system RAM, 512 MB GPU RAM
video BIOS 62.92.38.00.05
BOINC client version 6.4.5 for windows_x86_64
SETI@home Enhanced 6.06 (cuda)
Processor: 8 GenuineIntel Intel(R) Xeon(R) CPU E5462 @ 2.80GHz [Intel64 Family 6 Model 23 Stepping 6]

No other issues on system prior to installing CUDA client.

Regards

Andrew


ID: 847139

Holmis
Volunteer tester
Joined: 1 Jun 99
Posts: 30
Credit: 951,184
RAC: 0
Sweden
Message 847359 - Posted: 31 Dec 2008, 16:45:48 UTC

I've run almost 60 cuda-tasks and only had one error, this one:

Cuda error 'cufftExecC2C' in file 'c:/sw/gpgpu/seti/seti_boinc/client/cuda/cudaAcc_fft.cu' in line 54 : unknown error.

Intel Q9450, Win Vista SP1 64-bit, 4 GB RAM.
GeForce 9600 GT 512 MB
All running at stock speed.
Did not try to alter fan-speed.
Using "Nvidia System Monitor" v6.02.13.01 to monitor GPU-temp.
ID: 847359

Slow_Target
Joined: 5 Oct 02
Posts: 58
Credit: 6,704,641
RAC: 2
United States
Message 847368 - Posted: 31 Dec 2008, 17:08:49 UTC

I stopped a while ago because of so many errors and driver troubles. I installed the newer beta 180.84 drivers (for my card) and decided to give it another try. It downloaded a bunch of tasks (~50) along with 6.06. They all errored out until I aborted entire batches with the same WU numbering, and then all the remaining ones. I haven't looked into what the issues were, but I'm backing off again until fixes are in place. The drivers seemed better though, with no pixelation or screen scrambling.
ID: 847368

mr.kjellen
Volunteer tester
Joined: 4 Jan 01
Posts: 195
Credit: 71,324,196
RAC: 0
Sweden
Message 848608 - Posted: 3 Jan 2009, 10:09:04 UTC

Had a VLAR unit that errored out (nothing unusual, of course), but it gave quite a list of errors. A little while later the other GPU (both GPUs are 8800 GTS cards, as seen below) was hit with a similar unit and produced the same list. All subsequent units errored out.

On the host where I have a 280GTX this does not happen. It does error out the VLARs but then it continues the next unit as usual without errors. Anyone else get this?

Both boxes have Penryn CPUs (X9650 and Q9450), Win Vista32 and driver 180.48. One is a P965 and one an X38 chipset.


<core_client_version>6.4.5</core_client_version>
<![CDATA[
<stderr_txt>
setiathome_CUDA: Found 2 CUDA device(s):
Device 1 : GeForce 8800 GTS 512
totalGlobalMem = 536870912
sharedMemPerBlock = 16384
regsPerBlock = 8192
warpSize = 32
memPitch = 262144
maxThreadsPerBlock = 512
clockRate = 1625000
totalConstMem = 65536
major = 1
minor = 1
textureAlignment = 256
deviceOverlap = 0
multiProcessorCount = 16
Device 2 : GeForce 8800 GTS 512
totalGlobalMem = 536870912
sharedMemPerBlock = 16384
regsPerBlock = 8192
warpSize = 32
memPitch = 262144
maxThreadsPerBlock = 512
clockRate = 1625000
totalConstMem = 65536
major = 1
minor = 1
textureAlignment = 256
deviceOverlap = 0
multiProcessorCount = 16
setiathome_CUDA: CUDA Device 2 specified, checking...
Device 2: GeForce 8800 GTS 512 is okay
SETI@home using CUDA accelerated device GeForce 8800 GTS 512
Rise priority modification by Raistmer based on rev380 of SETI@home sources
Priority of worker thread rised successfully
setiathome_enhanced 6.02 Visual Studio/Microsoft C++
libboinc: 6.3.22

Work Unit Info:
...............
WU true angle range is : 0.007320
Optimal function choices:
-----------------------------------------------------
name
-----------------------------------------------------
v_BaseLineSmooth (no other)
v_GetPowerSpectrum 0.00014 0.00000
v_ChirpData 0.01085 0.00000
v_Transpose4 0.00278 0.00000
FPU opt folding 0.00516 0.00000
Cuda error 'find_pulse_kernel2<3, false>' in file 'd:/BTR/seticuda/Berkeley_rep/client/cuda/cudaAcc_pulsefind.cu' in line 1166 : unknown error.
Cuda error 'find_pulse_kernel2<4, true>' in file 'd:/BTR/seticuda/Berkeley_rep/client/cuda/cudaAcc_pulsefind.cu' in line 1172 : unknown error.
Cuda error 'find_pulse_kernel2<4, true>' in file 'd:/BTR/seticuda/Berkeley_rep/client/cuda/cudaAcc_pulsefind.cu' in line 1172 : unknown error.
Cuda error 'find_pulse_kernel2<5, true>' in file 'd:/BTR/seticuda/Berkeley_rep/client/cuda/cudaAcc_pulsefind.cu' in line 1178 : unknown error.
Cuda error 'find_pulse_kernel2<5, true>' in file 'd:/BTR/seticuda/Berkeley_rep/client/cuda/cudaAcc_pulsefind.cu' in line 1178 : unknown error.
Cuda error 'cudaMemcpy(&flags, dev_find_pulse_flag, sizeof(*dev_find_pulse_flag), cudaMemcpyDeviceToHost)' in file 'd:/BTR/seticuda/Berkeley_rep/client/cuda/cudaAcc_pulsefind.cu' in line 1250 : unknown error.
Cuda error 'cudaMemcpy(PulseResults, dev_PulseResults, 4 * (cudaAcc_NumDataPoints / AdvanceBy + 1) * sizeof(*dev_PulseResults), cudaMemcpyDeviceToHost)' in file 'd:/BTR/seticuda/Berkeley_rep/client/cuda/cudaAcc_pulsefind.cu' in line 1262 : unknown error.
Cuda error 'cudaAcc_transpose' in file 'd:/BTR/seticuda/Berkeley_rep/client/cuda/cudaAcc_transpose.cu' in line 74 : unknown error.
Cuda error 'cudaAcc_transpose' in file 'd:/BTR/seticuda/Berkeley_rep/client/cuda/cudaAcc_transpose.cu' in line 74 : unknown error.
Cuda error 'cudaMemcpy(best_PoT, dev_tmp_pot, max_nb_of_elems * sizeof(float), cudaMemcpyDeviceToHost)' in file 'd:/BTR/seticuda/Berkeley_rep/client/cuda/cudaAcc_pulsefind.cu' in line 1265 : unknown error.
Cuda error 'cudaAcc_transpose' in file 'd:/BTR/seticuda/Berkeley_rep/client/cuda/cudaAcc_transpose.cu' in line 74 : unknown error.
Cuda error 'cudaAcc_transpose' in file 'd:/BTR/seticuda/Berkeley_rep/client/cuda/cudaAcc_transpose.cu' in line 74 : unknown error.
Cuda error 'cudaMemcpy(tmp_PoT, dev_tmp_pot, max_nb_of_elems * sizeof(float), cudaMemcpyDeviceToHost)' in file 'd:/BTR/seticuda/Berkeley_rep/client/cuda/cudaAcc_pulsefind.cu' in line 1269 : unknown error.
Cuda error 'cufftExecC2C' in file 'd:/BTR/seticuda/Berkeley_rep/client/cuda/cudaAcc_fft.cu' in line 63 : unknown error.
Cuda error 'cudaAcc_GetPowerSpectrum_kernel' in file 'd:/BTR/seticuda/Berkeley_rep/client/cuda/cudaAcc_PowerSpectrum.cu' in line 56 : unknown error.
Cuda error 'cudaAcc_GetPowerSpectrum_kernel' in file 'd:/BTR/seticuda/Berkeley_rep/client/cuda/cudaAcc_PowerSpectrum.cu' in line 56 : unknown error.
Cuda error 'cudaAcc_summax32_kernel' in file 'd:/BTR/seticuda/Berkeley_rep/client/cuda/cudaAcc_summax.cu' in line 147 : unknown error.
Cuda error 'cudaAcc_summax32_kernel' in file 'd:/BTR/seticuda/Berkeley_rep/client/cuda/cudaAcc_summax.cu' in line 147 : unknown error.
Cuda error 'cudaMemcpy(PowerSpectrumSumMax, dev_PowerSpectrumSumMax, cudaAcc_NumDataPoints / fftlen * sizeof(*dev_PowerSpectrumSumMax), cudaMemcpyDeviceToHost)' in file 'd:/BTR/seticuda/Berkeley_rep/client/cuda/cudaAcc_summax.cu' in line 160 : unknown error.
Cuda error 'find_triplets_kernel' in file 'd:/BTR/seticuda/Berkeley_rep/client/cuda/cudaAcc_pulsefind.cu' in line 224 : unknown error.
Cuda error 'find_triplets_kernel' in file 'd:/BTR/seticuda/Berkeley_rep/client/cuda/cudaAcc_pulsefind.cu' in line 224 : unknown error.
Cuda error 'cudaMemcpy(&flags, dev_flag, sizeof(*dev_flag), cudaMemcpyDeviceToHost)' in file 'd:/BTR/seticuda/Berkeley_rep/client/cuda/cudaAcc_pulsefind.cu' in line 228 : unknown error.
Cuda error 'find_pulse_kernel2<3, false>' in file 'd:/BTR/seticuda/Berkeley_rep/client/cuda/cudaAcc_pulsefind.cu' in line 1166 : unknown error.
Cuda error 'find_pulse_kernel2<3, false>' in file 'd:/BTR/seticuda/Berkeley_rep/client/cuda/cudaAcc_pulsefind.cu' in line 1166 : unknown error.
Cuda error 'find_pulse_kernel2<4, true>' in file 'd:/BTR/seticuda/Berkeley_rep/client/cuda/cudaAcc_pulsefind.cu' in line 1172 : unknown error.
Cuda error 'find_pulse_kernel2<4, true>' in file 'd:/BTR/seticuda/Berkeley_rep/client/cuda/cudaAcc_pulsefind.cu' in line 1172 : unknown error.
Cuda error 'find_pulse_kernel2<5, true>' in file 'd:/BTR/seticuda/Berkeley_rep/client/cuda/cudaAcc_pulsefind.cu' in line 1178 : unknown error.
Cuda error 'find_pulse_kernel2<5, true>' in file 'd:/BTR/seticuda/Berkeley_rep/client/cuda/cudaAcc_pulsefind.cu' in line 1178 : unknown error.
Cuda error 'cudaMemcpy(&flags, dev_find_pulse_flag, sizeof(*dev_find_pulse_flag), cudaMemcpyDeviceToHost)' in file 'd:/BTR/seticuda/Berkeley_rep/client/cuda/cudaAcc_pulsefind.cu' in line 1250 : unknown error.
Cuda error 'cudaMemcpy(PulseResults, dev_PulseResults, 4 * (cudaAcc_NumDataPoints / AdvanceBy + 1) * sizeof(*dev_PulseResults), cudaMemcpyDeviceToHost)' in file 'd:/BTR/seticuda/Berkeley_rep/client/cuda/cudaAcc_pulsefind.cu' in line 1262 : unknown error.
Cuda error 'cudaAcc_transpose' in file 'd:/BTR/seticuda/Berkeley_rep/client/cuda/cudaAcc_transpose.cu' in line 74 : unknown error.
Cuda error 'cudaAcc_transpose' in file 'd:/BTR/seticuda/Berkeley_rep/client/cuda/cudaAcc_transpose.cu' in line 74 : unknown error.
Cuda error 'cudaMemcpy(best_PoT, dev_tmp_pot, max_nb_of_elems * sizeof(float), cudaMemcpyDeviceToHost)' in file 'd:/BTR/seticuda/Berkeley_rep/client/cuda/cudaAcc_pulsefind.cu' in line 1265 : unknown error.
Cuda error 'cudaAcc_transpose' in file 'd:/BTR/seticuda/Berkeley_rep/client/cuda/cudaAcc_transpose.cu' in line 74 : unknown error.
Cuda error 'cudaAcc_transpose' in file 'd:/BTR/seticuda/Berkeley_rep/client/cuda/cudaAcc_transpose.cu' in line 74 : unknown error.
Cuda error 'cudaMemcpy(tmp_PoT, dev_tmp_pot, max_nb_of_elems * sizeof(float), cudaMemcpyDeviceToHost)' in file 'd:/BTR/seticuda/Berkeley_rep/client/cuda/cudaAcc_pulsefind.cu' in line 1269 : unknown error.
Cuda error 'cufftExecC2C' in file 'd:/BTR/seticuda/Berkeley_rep/client/cuda/cudaAcc_fft.cu' in line 63 : unknown error.
Cuda error 'cudaAcc_GetPowerSpectrum_kernel' in file 'd:/BTR/seticuda/Berkeley_rep/client/cuda/cudaAcc_PowerSpectrum.cu' in line 56 : unknown error.
Cuda error 'cudaAcc_GetPowerSpectrum_kernel' in file 'd:/BTR/seticuda/Berkeley_rep/client/cuda/cudaAcc_PowerSpectrum.cu' in line 56 : unknown error.
Cuda error 'cudaAcc_summax32_kernel' in file 'd:/BTR/seticuda/Berkeley_rep/client/cuda/cudaAcc_summax.cu' in line 147 : unknown error.
Cuda error 'cudaAcc_summax32_kernel' in file 'd:/BTR/seticuda/Berkeley_rep/client/cuda/cudaAcc_summax.cu' in line 147 : unknown error.
Cuda error 'cudaMemcpy(PowerSpectrumSumMax, dev_PowerSpectrumSumMax, cudaAcc_NumDataPoints / fftlen * sizeof(*dev_PowerSpectrumSumMax), cudaMemcpyDeviceToHost)' in file 'd:/BTR/seticuda/Berkeley_rep/client/cuda/cudaAcc_summax.cu' in line 160 : unknown error.
SETI@Home Informational message -9 result_overflow
NOTE: The number of results detected exceeds the storage space allocated.

Flopcounter: 20886869681.456421

Spike count: 30
Pulse count: 0
Triplet count: 0
Gaussian count: 0
called boinc_finish

</stderr_txt>
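
A log this long is easier to compare across hosts once the repeated lines are collapsed. The Python sketch below counts "Cuda error" lines by source file name, line number and failing call. It is a minimal example under assumptions: the regular expression is inferred from the lines quoted in this thread, and the function name is made up for illustration. Only the file name is kept, since the build directory differs between builds in this thread (c:/sw/gpgpu/... versus d:/BTR/...).

```python
import re
from collections import Counter

# Line format inferred from the stderr output quoted in this thread.
CUDA_ERROR = re.compile(
    r"Cuda error '(?P<call>.*)' in file '(?P<file>[^']*)' "
    r"in line (?P<line>\d+) : (?P<message>.*)"
)


def summarize_cuda_errors(stderr_text):
    """Count error lines by (source file name, line number, failing call)."""
    counts = Counter()
    for raw in stderr_text.splitlines():
        m = CUDA_ERROR.search(raw)
        if m:
            # Drop the build directory and keep only the file name.
            file_name = m.group("file").rsplit("/", 1)[-1]
            counts[(file_name, int(m.group("line")), m.group("call"))] += 1
    return counts


sample = """\
Cuda error 'find_pulse_kernel2<3, false>' in file 'd:/BTR/seticuda/Berkeley_rep/client/cuda/cudaAcc_pulsefind.cu' in line 1166 : unknown error.
Cuda error 'find_pulse_kernel2<4, true>' in file 'd:/BTR/seticuda/Berkeley_rep/client/cuda/cudaAcc_pulsefind.cu' in line 1172 : unknown error.
Cuda error 'find_pulse_kernel2<4, true>' in file 'd:/BTR/seticuda/Berkeley_rep/client/cuda/cudaAcc_pulsefind.cu' in line 1172 : unknown error.
"""
for key, n in summarize_cuda_errors(sample).most_common():
    print(n, *key)
```

The earliest matching line in a log is usually the most informative one, because later calls may simply be failing as a consequence of the first error.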
ID: 848608

Maik
Joined: 15 May 99
Posts: 163
Credit: 9,208,555
RAC: 0
Germany
Message 848651 - Posted: 3 Jan 2009, 11:46:09 UTC - in response to Message 844975.  
Last modified: 3 Jan 2009, 11:48:38 UTC

A further request from the developers. When you report errors, could you also please:

a) state if you overclock the GPU and if so by how much?
b) tell if you have tried to clock the GPU to default speeds and still see the problem?
c) did you set the fan speed manually to anything else than default?
d) which program(s) do you use to overclock the GPU, fan etc.?
e) which program(s) do you use to keep check of the GPU?
source

a) GPU isn't overclocked.
b) ~
c) Fan-speed is now under control of RivaTuner. GPU temp is ~45-47°C while cuda is running.
d) see c)
e) CUDA-Z | Everest Ultimate Edition | RivaTuner


This one killed my host's graphics driver this morning ...
http://setiathome.berkeley.edu/result.php?resultid=1111080345
After that I got a bunch of compute errors / -9 result overflows ...
I had to restart my host. Now everything seems to be running normally.
ID: 848651

Jord
Volunteer tester
Joined: 9 Jun 99
Posts: 15184
Credit: 4,362,181
RAC: 3
Netherlands
Message 848654 - Posted: 3 Jan 2009, 12:08:05 UTC - in response to Message 848651.  

This one killed my host's graphics driver this morning ...
http://setiathome.berkeley.edu/result.php?resultid=1111080345

Nice one, but since you're using Raistmer's modified CUDA application, it's of no use to report it here. The Seti CUDA developers won't be able to fix that.

Try if you can reproduce the problem with the 6.06 stock application.
ID: 848654

Terror Australis
Volunteer tester
Joined: 14 Feb 04
Posts: 1817
Credit: 262,693,308
RAC: 44
Australia
Message 848699 - Posted: 3 Jan 2009, 14:05:47 UTC

Two tasks crashed today - both with the same error

Cuda error 'cudaMemcpy(best_PoT, dev_tmp_pot, max_nb_of_elems * sizeof(float), cudaMemcpyDeviceToHost)' in file 'c:/sw/gpgpu/seti/seti_boinc/client/cuda/cudaAcc_pulsefind.cu' in line 1265 : unknown error.

These were tasks
1111358087
1110302101

There was a third earlier on but it has gone from the host tasks list

Have also had to abort two units today
1111358665
1111358623

These two just did nothing, with no CPU or GPU activity indicated. I let the first one "run" for 2.5 hours and it still showed zero progress.

Computer is a P4/3000 + Geforce 8600GT - All running at stock speeds
OS= XP-Home 32bit SP3
Nvidia driver is V6.14.11.7828
BOINC V6.4.5
CUDA V6.06
GPU-Z 0.3.0 indicates a GPU temp of 62 degrees C

Hyperthreading is disabled, normal CPU usage is 10 to 20% bursting to 100% for approx 30secs at completion of one unit and the start of the next

The only other problem is the CUDA client overclaiming credit by approx. 20% on long WUs compared to the normal MB CPU client.

Regards
Brodo
ID: 848699

Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 6325
Credit: 106,370,077
RAC: 121
Russia
Message 848894 - Posted: 3 Jan 2009, 21:14:47 UTC - in response to Message 848654.  

This one killed my host's graphics driver this morning ...
http://setiathome.berkeley.edu/result.php?resultid=1111080345

Nice one, but since you're using Raistmer's modified CUDA application, it's of no use to report it here. The Seti CUDA developers won't be able to fix that.

Try if you can reproduce the problem with the 6.06 stock application.


My build is almost identical to stock 6.06. The changes do not touch any CUDA logic, so the devs can surely use errors reported by my build too. In most cases even the line numbers will be the same ;)

And I personally want to know whether any results differ between my build and stock 6.06.
Please report such differences to me via PM, or post about them in the thread dedicated to the modded build (or here; I will monitor this thread too).
ID: 848894

Holmis
Volunteer tester
Joined: 1 Jun 99
Posts: 30
Credit: 951,184
RAC: 0
Sweden
Message 849201 - Posted: 4 Jan 2009, 10:34:29 UTC

This one crashed for me.

Work Unit Info:
...............
WU true angle range is : 0.006683
Optimal function choices:
-----------------------------------------------------
name
-----------------------------------------------------
v_BaseLineSmooth (no other)
v_GetPowerSpectrum 0.00021 0.00000
v_ChirpData 0.02319 0.00000
v_Transpose4 0.00400 0.00000
FPU opt folding 0.00807 0.00000
Cuda error 'find_pulse_kernel2<3, false>' in file 'c:/sw/gpgpu/seti/seti_boinc/client/cuda/cudaAcc_pulsefind.cu' in line 1166 : unknown error.


Intel Q9450, Win Vista SP1 64-bit, 4 GB RAM.
GeForce 9600 GT 512 MB
Driver version 180.48
All running at stock speed.
Did not try to alter fan-speed.
Using "Nvidia System Monitor" v6.02.13.01 to monitor GPU-temp, it's about 40C when crunching and 35C idle.
ID: 849201


 
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.