Cuda error 'find_pulse_kernel'

Questions and Answers : GPU applications : Cuda error 'find_pulse_kernel'
Fred W
Volunteer tester

Joined: 13 Jun 99
Posts: 2524
Credit: 11,954,210
RAC: 0
United Kingdom
Message 916773 - Posted: 11 Jul 2009, 8:15:56 UTC

I'm now running BOINC 6.6.37 and driver 186.18 on my GTX295. The graphics card is underclocked by 15% with the fan at 100% to keep the GPU core temperature around 80C. It had been running for several weeks without the underclock, at around 85C, intermittently causing the host to reboot but never giving me this error.

Over the past couple of days it has started producing this error:

Work Unit Info:
...............
WU true angle range is :  0.439247
Optimal function choices:
-----------------------------------------------------
name                
-----------------------------------------------------
              v_BaseLineSmooth (no other)
            v_GetPowerSpectrum 0.00015 0.00000 
                   v_ChirpData 0.01706 0.00000 
                  v_Transpose4 0.00338 0.00000 
               FPU opt folding 0.00248 0.00000 
Cuda error 'find_pulse_kernel' in file 'c:/sw/gpgpu/seti/seti_boinc/client/cuda/cudaAcc_pulsefind.cu' in line 622 : unknown error.
Cuda error 'find_pulse_kernel' in file 'c:/sw/gpgpu/seti/seti_boinc/client/cuda/cudaAcc_pulsefind.cu' in line 627 : unknown error.
Cuda error 'find_pulse_kernel' in file 'c:/sw/gpgpu/seti/seti_boinc/client/cuda/cudaAcc_pulsefind.cu' in line 627 : unknown error.
Cuda error 'find_pulse_kernel' in file 'c:/sw/gpgpu/seti/seti_boinc/client/cuda/cudaAcc_pulsefind.cu' in line 632 : unknown error.
Cuda error 'find_pulse_kernel' in file 'c:/sw/gpgpu/seti/seti_boinc/client/cuda/cudaAcc_pulsefind.cu' in line 632 : unknown error.
Cuda error 'cudaMemcpy(&flags, dev_find_pulse_flag, sizeof(*dev_find_pulse_flag), cudaMemcpyDeviceToHost)' in file 'c:/sw/gpgpu/seti/seti_boinc/client/cuda/cudaAcc_pulsefind.cu' in line 1318 : unknown error.
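For anyone sifting through a lot of stderr output like the above, here is a minimal Python sketch that tallies these messages by failing call and source line. The regular expression is an assumption based only on the lines quoted in this post, and `summarize_cuda_errors` is a hypothetical helper, not part of BOINC or the SETI@home client.

```python
import re
from collections import Counter

# Assumed message layout, taken from the error lines quoted in this post:
# Cuda error '<call>' in file '<path>' in line <n> : <reason>.
CUDA_ERROR = re.compile(
    r"Cuda error '(?P<call>.+)' in file '(?P<file>[^']+)' "
    r"in line (?P<line>\d+) : (?P<reason>.+?)\.?$"
)


def summarize_cuda_errors(text):
    """Count CUDA error messages by (failing call, source line number)."""
    counts = Counter()
    for raw in text.splitlines():
        match = CUDA_ERROR.search(raw.strip())
        if match:  # lines that are not CUDA errors are skipped
            counts[(match["call"], int(match["line"]))] += 1
    return counts


# A few lines copied from the output above, plus one non-error line.
sample = """Work Unit Info:
Cuda error 'find_pulse_kernel' in file 'c:/sw/gpgpu/seti/seti_boinc/client/cuda/cudaAcc_pulsefind.cu' in line 622 : unknown error.
Cuda error 'find_pulse_kernel' in file 'c:/sw/gpgpu/seti/seti_boinc/client/cuda/cudaAcc_pulsefind.cu' in line 627 : unknown error.
Cuda error 'find_pulse_kernel' in file 'c:/sw/gpgpu/seti/seti_boinc/client/cuda/cudaAcc_pulsefind.cu' in line 627 : unknown error.
Cuda error 'cudaMemcpy(&flags, dev_find_pulse_flag, sizeof(*dev_find_pulse_flag), cudaMemcpyDeviceToHost)' in file 'c:/sw/gpgpu/seti/seti_boinc/client/cuda/cudaAcc_pulsefind.cu' in line 1318 : unknown error.
"""

if __name__ == "__main__":
    for (call, line), count in sorted(summarize_cuda_errors(sample).items()):
        print(f"{count} x {call} (line {line})")
```

Grouping by call and line makes it easier to see whether a failed task always dies at the same kernel launch or whether the failures are scattered.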



Previous discussion of this error seems to centre on VLARs, and these are not VLARs. It happens intermittently and sometimes (only sometimes), when it does, all subsequent CUDA WUs report -9 until the machine is fully rebooted.

Have tried so far:
- Copying all CUDA-related EXEs and DLLs from backup
- Upgrading graphics driver to 186.18
- Upgrading BOINC from 6.6.36 to 6.6.37 (clutching at any straw within reach!)

Any thoughts on possible causes would be welcome. I am nervous about leaving the box crunching unattended for any length of time, as I am likely to come back and find it has downloaded its daily quota of WUs as CUDA tasks and returned them all as -9s ;((

F.
ID: 916773
Byron S Goodgame
Volunteer tester

Joined: 16 Jan 06
Posts: 1145
Credit: 3,936,993
RAC: 0
United States
Message 916778 - Posted: 11 Jul 2009, 9:51:16 UTC - in response to Message 916773.  

Hi,

Just a thought, since you didn't mention it, but if the PC is OC'd, you might try bringing that down somewhat, if not to stock, and see what effect that has.
ID: 916778
Fred W
Volunteer tester

Joined: 13 Jun 99
Posts: 2524
Credit: 11,954,210
RAC: 0
United Kingdom
Message 916780 - Posted: 11 Jul 2009, 10:03:41 UTC - in response to Message 916778.  

Byron S Goodgame wrote:
> Hi,
>
> Just a thought, since you didn't mention it, but if the PC is OC'd, you might try bringing that down somewhat, if not to stock, and see what effect that has.

Yes, the CPU is overclocked. But that hasn't changed in years and it has always been solid as a rock. And this error is so CUDA-specific that I'll put that one on the back burner for now and see if any more CUDA-specific solutions are proposed. Thanks for the input.

F.
ID: 916780
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 13731
Credit: 208,696,464
RAC: 304
Australia
Message 916781 - Posted: 11 Jul 2009, 10:28:29 UTC - in response to Message 916780.  

Fred W wrote:
>> Hi,
>>
>> Just a thought, since you didn't mention it, but if the PC is OC'd, you might try bringing that down somewhat, if not to stock, and see what effect that has.
>
> Yes, the CPU is overclocked. But that hasn't changed in years and it has always been solid as a rock. And this error is so CUDA specific

Is the PCIe bus running at stock speed?
Years back I had problems with a system because the Adaptec SCSI card didn't like the PCI bus running any faster than stock.
Grant
Darwin NT
ID: 916781
Fred W
Volunteer tester

Joined: 13 Jun 99
Posts: 2524
Credit: 11,954,210
RAC: 0
United Kingdom
Message 916783 - Posted: 11 Jul 2009, 11:12:30 UTC - in response to Message 916781.  

Grant (SSSF) wrote:
>>> Hi,
>>>
>>> Just a thought, since you didn't mention it, but if the PC is OC'd, you might try bringing that down somewhat, if not to stock, and see what effect that has.
>>
>> Yes, the CPU is overclocked. But that hasn't changed in years and it has always been solid as a rock. And this error is so CUDA specific
>
> Is the PCIe bus running at stock speed?
> Years back I had problems with a system because the Adaptec SCSI card didn't like the PCI bus running any faster than stock.

Yup. Locked to 105 - again unchanged since I built the machine. Thanks for the input.

F.
ID: 916783
Sutaru Tsureku
Volunteer tester

Joined: 6 Apr 07
Posts: 7105
Credit: 147,663,825
RAC: 5
Germany
Message 919974 - Posted: 21 Jul 2009, 2:49:43 UTC
Last modified: 21 Jul 2009, 2:56:34 UTC


Uhh.. normally I don't look in here, only in NC and a little in the Café forum..

Since you mention 105..

I remember when I OC'd my old QX6700 on an Intel D975XBX2 mobo..
The PCIe was locked @ 100.

(On my current mobo everything is on AUTO; so far I haven't found time to learn AMD OC ;-)

I did a quick Google search, and the first result covers the topic a little:
[http://www.google.de/search?hl=de&q=pcie+100+mhz&meta=]---
------> [http://www.tomshardware.co.uk/forum/page-247128_11_0.html]


Hope this helps.


[ EDIT: Maybe this is also why your GPUs run too hot? ]



BTW.
Mods/Admins..
It would be better to keep CUDA in the NC forum..
Or make a new (NC) forum for CUDA.
The Q&A section is not the right place for it.
If you think it is, then the NC forum should also be under Q&A.

ID: 919974
Fred W
Volunteer tester

Joined: 13 Jun 99
Posts: 2524
Credit: 11,954,210
RAC: 0
United Kingdom
Message 920020 - Posted: 21 Jul 2009, 8:02:06 UTC - in response to Message 919974.  

Sutaru Tsureku wrote:
> I did a quick Google search, and the first result covers the topic a little:
> [http://www.google.de/search?hl=de&q=pcie+100+mhz&meta=]---
> ------> [http://www.tomshardware.co.uk/forum/page-247128_11_0.html]
>
> Hope this helps.
>
> [ EDIT: Maybe this is also why your GPUs run too hot? ]


Thanks for the suggestion. I have had the PCI-E locked at 105 for years now, since I first overclocked my Q6600. It has been running perfectly stable on my current setup with the Q9450 / GTX295 for several months.

I know that the overclock on the Q9450 has taken the North Bridge on the Asus P5K motherboard very close to its limit, and now I have this graphics card squirting out air at 80+ deg C just below the North Bridge. I have insulated the North Bridge with a length of corrugated cardboard and am trying ways of improving the airflow through the lower part of the case (currently an old AMD CPU cooler, which fits without case modification but is too noisy; soon a new hole in the side of the case for a 120mm fan right by the GTX295).

I take some pride in the fact that my rig does not (normally) give errors or fail to validate, so either of these occurrences is of concern to me.

F.
ID: 920020
Sutaru Tsureku
Volunteer tester

Joined: 6 Apr 07
Posts: 7105
Credit: 147,663,825
RAC: 5
Germany
Message 920038 - Posted: 21 Jul 2009, 10:35:00 UTC


You are welcome.

Is it possible to reduce the PCIe to 100?
I guess overclocking the slot is normally not needed for CUDA crunching.

Yes.. GPU cooling.
When I had only two GPUs on the mobo, they ran ~ 5 °C cooler.
Now, with 4 double-slot GPUs, they sit very close together and can't get enough fresh air.

Ambient ~ 22 °C, GPUs ~ 80 °C, fan AUTO ~ 60 %.

It's good that my GPU cruncher sits at the bottom of the house.
It's very cool there in winter (~ 11 °C), and in summer it stays at ~ 23 °C even when it's > 30 °C outside.

But I also wanted to make them cooler.
So, as a test, I took a sheet of cardboard for the left side of the PC case and cut 5 holes in it for 120 mm fans.
But whether 'out-in', 'in-out', or 'in-out and out-in', the GPUs ran ~ 3 °C hotter.
And with 'in-out', almost no (hot) air came out at the back of the GPUs.
The air came out on the left of the GPU case instead. (SLI connection openings)
But your GTX295 has heatsink openings in the side of the GPU case.. ;-)

So for now I have replaced the left door with a thin metal sheet that has openings every few mm, so that hot air can get out and fresh air can get in.
Now a little hot air also comes out of the SLI openings.

But dust can also get into the PC case - not good.

ID: 920038
