Could someone look at this please

Profile Neil Blaikie
Volunteer tester
Joined: 17 May 99
Posts: 143
Credit: 6,652,341
RAC: 0
Canada
Message 1025010 - Posted: 14 Aug 2010, 14:26:54 UTC

Could someone please have a look at this and tell me what seems to be going on? I am getting errors from CUDA that are becoming annoying now.

http://setiathome.berkeley.edu/result.php?resultid=1681831492
Profile Dave Cummings
Volunteer tester

Joined: 16 May 09
Posts: 219
Credit: 1,193,729
RAC: 0
United Kingdom
Message 1025014 - Posted: 14 Aug 2010, 14:39:24 UTC - in response to Message 1025010.  

Hi,
Can you please give us some more details - do you know what model of graphics card you have and what driver you are running?

Also, does this happen right away or, as with mine, after a few hours? I have mine set to reboot once every 12 hours to stop problems like this; it's easily done and helps clear the system out. I also have it set to auto-login and auto-lock at boot.

dave
Profile Jord
Volunteer tester
Joined: 9 Jun 99
Posts: 15184
Credit: 4,362,181
RAC: 3
Netherlands
Message 1025022 - Posted: 14 Aug 2010, 15:18:06 UTC

Are you using a screen saver and CUDA at the same time on the same video card?

For more solutions to "Incorrect function. (0x1) - exit code 1 (0x1)", see this FAQ. If it is not the screen saver, I'd be going for a task stuck in memory as well.
Profile Area 51
Joined: 31 Jan 04
Posts: 965
Credit: 42,193,520
RAC: 0
United Kingdom
Message 1025061 - Posted: 14 Aug 2010, 17:41:15 UTC - in response to Message 1025010.  

Build features: Non-graphics CUDA VLAR autokill enabled FFTW USE_SSE x86


Are you using the VLAR kill optimised app, by any chance?
Profile -= Vyper =-
Volunteer tester
Joined: 5 Sep 99
Posts: 1652
Credit: 1,065,191,981
RAC: 2,537
Sweden
Message 1025065 - Posted: 14 Aug 2010, 17:59:56 UTC

Hi

If this behaviour has become worse, I suggest you try the latest Lunatics unified installer over at lunatics.kwsn.net.

If that doesn't work, I'd download GPU-Z to monitor temperatures and behaviour.
I'd be worried if the GPU temperature goes over 90 degrees C.

You could even download the GPU Caps Viewer benchmark and run the furry cube test to see if your GPU locks up.

If it doesn't, uninstall your driver completely and go back to an NVIDIA 19x.xx driver to see if that helps.

If nothing else works, there is a test program for memory errors called Memtest G80. With it you can allocate blocks to check whether your GPU memory is healthy.

And if that doesn't help either, check your PSU voltages to make sure it can supply enough power, but that's just a hunch!

As you can see, there are loads of things that could make the GPU behave strangely!


Kind regards Vyper

_________________________________________________________________________
Addicted to SETI crunching!
Founder of GPU Users Group
Profile Neil Blaikie
Volunteer tester
Joined: 17 May 99
Posts: 143
Credit: 6,652,341
RAC: 0
Canada
Message 1025240 - Posted: 15 Aug 2010, 6:55:15 UTC - in response to Message 1025014.  

The graphics card is a GTS250, running driver 258.96 on Windows 7 64-bit with the Lunatics unified installer package for x64.

It comes up with computation errors almost immediately; however, on some occasions units do complete.

Temperatures are fine, as I have a large fan blowing into the open-sided case. The case had to be modded to fit my water cooling loop (which does not include the GPU).

I do not have a screensaver running.

To -= Vyper =-: thank you, I will give Memtest G80 a try. I have used other memory testers and nothing showed up with them, but using another one can't hurt. I do not have anything that I can check the PSU voltages with; the PSU, though, is 750 W.

I will run Memtest sometime soon and post any additional findings here. Thank you to everyone who has suggested fixes.
Profile -= Vyper =-
Volunteer tester
Joined: 5 Sep 99
Posts: 1652
Credit: 1,065,191,981
RAC: 2,537
Sweden
Message 1025270 - Posted: 15 Aug 2010, 11:37:41 UTC - in response to Message 1025240.  

No problem mate.

If and when you find a solution, please let us know what it was.

Hey, check your PCI-E clock too! Sometimes if you overclock a computer, the PCI-E bridge tends to follow the FSB clock.
If your BIOS gives you the opportunity, lock the PCI-E frequency to 100 MHz.

Regards Vyper

_________________________________________________________________________
Addicted to SETI crunching!
Founder of GPU Users Group
Profile Neil Blaikie
Volunteer tester
Joined: 17 May 99
Posts: 143
Credit: 6,652,341
RAC: 0
Canada
Message 1025294 - Posted: 15 Aug 2010, 13:45:44 UTC

Overnight with no intervention from me, I have had a lot of CUDA units complete.

Just downloaded and ran the Furry Cube test from GPU Caps Viewer, and got the dreaded "Nvidia Driver Kernel module 258.96 stopped responding and recovered" after a few seconds of running it.

GPU temp is around 55 °C; as mentioned, the fan is doing its job.

I have not run Memtest G80 yet; that will have to wait until I finish work later today, and I will post the results here once complete.

Added info: the computer is not overclocked at all, but I will check the PCI-E frequency anyway just to make sure.
Profile Neil Blaikie
Volunteer tester
Joined: 17 May 99
Posts: 143
Credit: 6,652,341
RAC: 0
Canada
Message 1025352 - Posted: 15 Aug 2010, 18:18:52 UTC

It seems I did not check properly whether I had a screensaver running. Sure enough, there was one, but with the monitors set to turn off after a few minutes while I was at work, I didn't notice.

Fingers crossed: so far CUDA seems to be running without a hitch and is validating results instead of erroring out. Minor "artifacts" show up on my screens occasionally, but nothing to get too worried about yet.

I will probably add the GPU to the water cooling loop at some point, so I will get a 900 W or 1 kW PSU to allow for future expansion. My 750 W unit can handle things at the moment.

Thank you to all for your help and I will not post any more to this thread unless something goes wrong again.

Happy Crunching to all.
Profile -= Vyper =-
Volunteer tester
Joined: 5 Sep 99
Posts: 1652
Credit: 1,065,191,981
RAC: 2,537
Sweden
Message 1025363 - Posted: 15 Aug 2010, 19:19:10 UTC

Ok

It seems you have some sort of issue with the GPU or with the power going to it.

It's not a good sign that the furry cube test halts and wrecks the driver after running for only a few seconds.
The GPU doesn't even have time to heat up before it breaks.

I'm quite positive that you have an issue with the graphics card. Try running a demanding game at high resolution and detail settings, and you should notice whether there are more errors.

I suspect RAM errors on the GPU!

Regards Vyper

_________________________________________________________________________
Addicted to SETI crunching!
Founder of GPU Users Group
Profile Neil Blaikie
Volunteer tester
Joined: 17 May 99
Posts: 143
Credit: 6,652,341
RAC: 0
Canada
Message 1025381 - Posted: 15 Aug 2010, 20:50:11 UTC - in response to Message 1025363.  

I think I do have a problem: I ran Memtest a few minutes ago and it crashed on test 15, coming up with the driver kernel module error again :-(

Time to send off the GPU and get it looked at; back to the old GPU for now, which can do CUDA but not as fast as the GTS250. I have ordered a 1 kW PSU, which should be here on Tuesday morning. I will see if that helps; if not, the GPU is off for repair or replacement.

Ah well, confirmation: Flight Sim X does not like high graphics settings with a lot of detail. It came up with the same driver kernel error and crashed the game.
Profile Neil Blaikie
Volunteer tester
Joined: 17 May 99
Posts: 143
Credit: 6,652,341
RAC: 0
Canada
Message 1026684 - Posted: 20 Aug 2010, 13:34:03 UTC

So I have a new GTS250 and a 1 kW PSU. I installed my old CUDA card (GT8500) alongside the new 250, but I am having a problem getting them both to crunch CUDA: it only seems to crunch on the slower card and I can't figure out why.

Any help appreciated.

Running Windows 7 x64. I have included the app_info in case I missed something from the Lunatics unified installer.

<app_info>
<app>
<name>setiathome_enhanced</name>
</app>
<file_info>
<name>AK_v8b_win_x64_SSE3.exe</name>
<executable/>
</file_info>
<app_version>
<app_name>setiathome_enhanced</app_name>
<version_num>603</version_num>
<platform>windows_intelx86</platform>
<file_ref>
<file_name>AK_v8b_win_x64_SSE3.exe</file_name>
<main_program/>
</file_ref>
</app_version>
<app_version>
<app_name>setiathome_enhanced</app_name>
<version_num>603</version_num>
<platform>windows_x86_64</platform>
<file_ref>
<file_name>AK_v8b_win_x64_SSE3.exe</file_name>
<main_program/>
</file_ref>
</app_version>
<app>
<name>astropulse_v505</name>
</app>
<file_info>
<name>ap_5.05r409_SSE.exe</name>
<executable/>
</file_info>
<app_version>
<app_name>astropulse_v505</app_name>
<version_num>505</version_num>
<platform>windows_intelx86</platform>
<file_ref>
<file_name>ap_5.05r409_SSE.exe</file_name>
<main_program/>
</file_ref>
</app_version>
<app_version>
<app_name>astropulse_v505</app_name>
<version_num>505</version_num>
<platform>windows_x86_64</platform>
<file_ref>
<file_name>ap_5.05r409_SSE.exe</file_name>
<main_program/>
</file_ref>
</app_version>
<app>
<name>setiathome_enhanced</name>
</app>
<file_info>
<name>MB_6.08_CUDA_V12_VLARKill_FPLim2048.exe</name>
<executable/>
</file_info>
<file_info>
<name>cudart.dll</name>
<executable/>
</file_info>
<file_info>
<name>cufft.dll</name>
<executable/>
</file_info>
<file_info>
<name>libfftw3f-3-1-1a_upx.dll</name>
<executable/>
</file_info>
<app_version>
<app_name>setiathome_enhanced</app_name>
<version_num>608</version_num>
<platform>windows_intelx86</platform>
<plan_class>cuda</plan_class>
<avg_ncpus>0.040000</avg_ncpus>
<max_ncpus>0.040000</max_ncpus>
<coproc>
<type>CUDA</type>
<count>1</count>
</coproc>
<file_ref>
<file_name>MB_6.08_CUDA_V12_VLARKill_FPLim2048.exe</file_name>
<main_program/>
</file_ref>
<file_ref>
<file_name>cudart.dll</file_name>
</file_ref>
<file_ref>
<file_name>cufft.dll</file_name>
</file_ref>
<file_ref>
<file_name>libfftw3f-3-1-1a_upx.dll</file_name>
</file_ref>
</app_version>
<app_version>
<app_name>setiathome_enhanced</app_name>
<version_num>608</version_num>
<platform>windows_intelx86</platform>
<plan_class>cuda23</plan_class>
<avg_ncpus>0.040000</avg_ncpus>
<max_ncpus>0.040000</max_ncpus>
<coproc>
<type>CUDA</type>
<count>1</count>
</coproc>
<file_ref>
<file_name>MB_6.08_CUDA_V12_VLARKill_FPLim2048.exe</file_name>
<main_program/>
</file_ref>
<file_ref>
<file_name>cudart.dll</file_name>
</file_ref>
<file_ref>
<file_name>cufft.dll</file_name>
</file_ref>
<file_ref>
<file_name>libfftw3f-3-1-1a_upx.dll</file_name>
</file_ref>
</app_version>
<app_version>
<app_name>setiathome_enhanced</app_name>
<version_num>608</version_num>
<platform>windows_x86_64</platform>
<plan_class>cuda</plan_class>
<avg_ncpus>0.040000</avg_ncpus>
<max_ncpus>0.040000</max_ncpus>
<coproc>
<type>CUDA</type>
<count>1</count>
</coproc>
<file_ref>
<file_name>MB_6.08_CUDA_V12_VLARKill_FPLim2048.exe</file_name>
<main_program/>
</file_ref>
<file_ref>
<file_name>cudart.dll</file_name>
</file_ref>
<file_ref>
<file_name>cufft.dll</file_name>
</file_ref>
<file_ref>
<file_name>libfftw3f-3-1-1a_upx.dll</file_name>
</file_ref>
</app_version>
<app_version>
<app_name>setiathome_enhanced</app_name>
<version_num>608</version_num>
<platform>windows_x86_64</platform>
<plan_class>cuda23</plan_class>
<avg_ncpus>0.040000</avg_ncpus>
<max_ncpus>0.040000</max_ncpus>
<coproc>
<type>CUDA</type>
<count>1</count>
</coproc>
<file_ref>
<file_name>MB_6.08_CUDA_V12_VLARKill_FPLim2048.exe</file_name>
<main_program/>
</file_ref>
<file_ref>
<file_name>cudart.dll</file_name>
</file_ref>
<file_ref>
<file_name>cufft.dll</file_name>
</file_ref>
<file_ref>
<file_name>libfftw3f-3-1-1a_upx.dll</file_name>
</file_ref>
</app_version>
</app_info>

Profile -= Vyper =-
Volunteer tester
Joined: 5 Sep 99
Posts: 1652
Credit: 1,065,191,981
RAC: 2,537
Sweden
Message 1026691 - Posted: 20 Aug 2010, 14:08:06 UTC

Take a look here.

http://boinc.berkeley.edu/wiki/Client_configuration

Create a cc_config.xml file in your BOINC ProgramData directory with the option use_all_gpus set to 1, as described on that page.
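
A minimal sketch of such a file, assuming the default Windows 7 data directory (C:\ProgramData\BOINC, which may differ on your install) and that no cc_config.xml exists there yet:

```xml
<!-- cc_config.xml: save in the BOINC data directory (assumed here to be C:\ProgramData\BOINC) -->
<cc_config>
  <options>
    <!-- 1 = use every detected GPU; otherwise the client uses only the most capable one(s) -->
    <use_all_gpus>1</use_all_gpus>
  </options>
</cc_config>
```

Restart the BOINC client afterwards so the file is read; the startup messages should then show whether both cards are being used.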

Kind regards Vyper


_________________________________________________________________________
Addicted to SETI crunching!
Founder of GPU Users Group
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.