Cuda work units ending in error


ctymountie
Joined: 23 Jul 05
Posts: 20
Credit: 28,604
RAC: 0
United States
Message 1278449 - Posted: 31 Aug 2012, 20:55:02 UTC - in response to Message 1278435.
Last modified: 31 Aug 2012, 21:01:26 UTC

Mind answering the rest as well?
Do you uninstall older drivers before installing the newer ones?
Do you use Driver Sweeper or similar to remove all the remnants of old drivers between uninstalling & reinstalling?

What were the Seti preferences that you changed (it shouldn't make a difference, but tell us anyway)?


Added to that, how large is the hard drive that BOINC's Data directory is installed on?



500 GB, with about 450 GB free, so that is not an issue.

I mentioned how I uninstalled the drivers: I uninstalled them, deleted the leftover folders, and used a registry cleaner to remove the NVIDIA entries.


And here are my account preferences:




Suspend work while computer is on battery power? no (matters only for portable computers)
Suspend work while computer is in use? no
Suspend GPU work while computer is in use? no (enforced by version 6.6.21+)
'In use' means mouse/keyboard activity in last: 3 minutes
Suspend work if no mouse/keyboard activity in last: --- minutes (needed to enter low-power mode on some computers)
Suspend work when non-BOINC CPU usage is above: --- % (0 means no restriction; enforced by version 6.10.30+)
Do work only between the hours of: --- (no restriction if equal)
Leave tasks in memory while suspended? yes (suspended tasks will consume swap space if 'yes')
Switch between tasks every: 60 minutes (recommended: 60 minutes)
On multiprocessors, use at most: --- processors
On multiprocessors, use at most: 100% of the processors (enforced by version 6.1+)
Use at most: 50% of CPU time (can be used to reduce CPU heat)

Disk and memory usage

Disk: use at most 10 GB
Disk: leave free at least 0.1 GB (values smaller than 0.001 are ignored)
Disk: use at most 50% of total
Tasks checkpoint to disk at most every: 60 seconds
Swap space: use at most 75% of total
Memory: when computer is in use, use at most 90% of total
Memory: when computer is not in use, use at most 95% of total

Network usage

Maintain enough tasks to keep busy for at least: 0.1 days (max 10 days)
... and up to an additional: 0.5 days
Confirm before connecting to Internet? no (matters only if you have a modem, ISDN or VPN connection)
Disconnect when done? no (matters only if you have a modem, ISDN or VPN connection)
Maximum download rate: --- Kbytes/sec
Maximum upload rate: --- Kbytes/sec
Use network only between the hours of: ---
Transfer at most: --- Mbytes every --- days (enforced by version 6.10.46+)
Skip image file verification? no (Check this ONLY if your Internet provider modifies image files (UMTS does this, for example). Skipping verification reduces the security of BOINC.)

Resource share: 100 (Determines the proportion of your computer's resources allocated to this project. Example: if you participate in two BOINC projects with resource shares of 100 and 200, the first will get 1/3 of your resources and the second will get 2/3.)
Use CPU: yes (enforced by version 6.10+)
Use ATI GPU: yes (enforced by version 6.10+)
Use NVIDIA GPU: yes (enforced by version 6.10+)
Is it OK for SETI@home and your team (if any) to email you? yes (Emails will be sent from setiweb@ssl.berkeley.edu; make sure your spam filter accepts this address.)
Should SETI@home show your computers on its web site? yes
Default computer location: home
Graphics preferences: SETI@home classic (select Custom to edit individual parameters)
Color preferences: Rainbow (select Custom to edit individual parameters)
URL of background image: [blank] (this JPEG image will be displayed in the background)
URL of logo image: [blank] (this JPEG image will be displayed in the lower corner)

Run only the selected applications:
SETI@home Enhanced: yes
Astropulse v505: no
SETI@home v7: yes
AstroPulse v6: no

If no work for selected applications is available, accept work from other applications? no

____________

John McLeod VII
Volunteer developer
Volunteer tester
Joined: 15 Jul 99
Posts: 24522
Credit: 521,562
RAC: 62
United States
Message 1278522 - Posted: 1 Sep 2012, 0:40:00 UTC

The only problem I see (and it would NOT be causing GPU tasks to error out) is that you allow the GPU to crunch while the user is active at the keyboard. This setting causes many computers to become unresponsive. Of course, YMMV.
____________


BOINC WIKI

ctymountie
Joined: 23 Jul 05
Posts: 20
Credit: 28,604
RAC: 0
United States
Message 1279234 - Posted: 2 Sep 2012, 12:46:46 UTC - in response to Message 1278522.

Okay, here is something I noticed. First off, if I uninstall BOINC, remove all folders, then reinstall, it works fine until I shut my computer off. When I restart, I start getting the errors.

Next thing I noticed, though I'm not sure yet: when I see the errors start, the WUs usually start out at 13.99%. If I suspend the current WU (the one that is about to error out), BOINC skips to the next WU and completes it without any errors.
____________

John McLeod VII
Volunteer developer
Volunteer tester
Joined: 15 Jul 99
Posts: 24522
Credit: 521,562
RAC: 62
United States
Message 1279449 - Posted: 3 Sep 2012, 0:36:08 UTC - in response to Message 1279234.

Okay, here is something I noticed. First off, if I uninstall BOINC, remove all folders, then reinstall, it works fine until I shut my computer off. When I restart, I start getting the errors.

Next thing I noticed, though I'm not sure yet: when I see the errors start, the WUs usually start out at 13.99%. If I suspend the current WU (the one that is about to error out), BOINC skips to the next WU and completes it without any errors.

Now that is interesting information. I am, however, not certain what it means.
____________


BOINC WIKI

ctymountie
Joined: 23 Jul 05
Posts: 20
Credit: 28,604
RAC: 0
United States
Message 1279507 - Posted: 3 Sep 2012, 4:18:18 UTC - in response to Message 1279449.

Okay, here is something I noticed. First off, if I uninstall BOINC, remove all folders, then reinstall, it works fine until I shut my computer off. When I restart, I start getting the errors.

Next thing I noticed, though I'm not sure yet: when I see the errors start, the WUs usually start out at 13.99%. If I suspend the current WU (the one that is about to error out), BOINC skips to the next WU and completes it without any errors.

Now that is interesting information. I am, however, not certain what it means.


I guess it's info for whoever got their crayon out and wrote the programming for BOINC. They don't seem to be too on top of fixing the problems that everyone keeps talking about on these message boards.
____________

rob smith
Project donor
Volunteer tester
Joined: 7 Mar 03
Posts: 8404
Credit: 56,958,444
RAC: 76,397
United Kingdom
Message 1279532 - Posted: 3 Sep 2012, 5:42:52 UTC - in response to Message 1275196.

OK, so a quick check shows you are using a BETA version of the NVIDIA driver, so roll back to the current advised production version. NVIDIA advises that you use version 301.42 for your card. If you run a beta version you must expect problems; they are sent out by manufacturers as a cheap way of stress testing.

(I've seen a couple of adverse comments about the particular version you are using, particularly to do with greatly extended run times.)
____________
Bob Smith
Member of Seti PIPPS (Pluto is a Planet Protest Society)
Somewhere in the (un)known Universe?

ctymountie
Joined: 23 Jul 05
Posts: 20
Credit: 28,604
RAC: 0
United States
Message 1279689 - Posted: 3 Sep 2012, 16:57:24 UTC - in response to Message 1279532.

OK, so a quick check shows you are using a BETA version of the NVIDIA driver, so roll back to the current advised production version. NVIDIA advises that you use version 301.42 for your card. If you run a beta version you must expect problems; they are sent out by manufacturers as a cheap way of stress testing.

(I've seen a couple of adverse comments about the particular version you are using, particularly to do with greatly extended run times.)



I've tried all versions of the NVIDIA driver from 2xx to 3xx. I'm currently on the beta 306.
It seems to have stabilized somewhat. It is running some WUs. It still seems to be encountering errors when it finishes up and changes to new WUs. I don't know if it is helping, but I manually snooze the CPU before I shut the computer off. The WUs then start up again without the error when I turn the computer back on.

I don't know what is going on; they definitely have some glitches in the matrix.

____________

rob smith
Project donor
Volunteer tester
Joined: 7 Mar 03
Posts: 8404
Credit: 56,958,444
RAC: 76,397
United Kingdom
Message 1279709 - Posted: 3 Sep 2012, 17:30:53 UTC

With this having continued through a number of versions, I would start to look at the common element: your GPU.
The fact that you are saying it is better soon after a restart makes me think this is a thermal problem rather than anything to do with software. GPUs are "famous" for being very good dust collectors, and when the cooling fins and fan get coated in dust they run hot. Not quite "too hot", at which point they shut down, but hot enough to start throwing errors. Visually they work OK, but for intensive processing they are just not right. Next time you power down, take the card out and give it a good clean (I use a vacuum cleaner on mine every few months).
____________
Bob Smith
Member of Seti PIPPS (Pluto is a Planet Protest Society)
Somewhere in the (un)known Universe?

Ageless
Joined: 9 Jun 99
Posts: 12300
Credit: 2,599,207
RAC: 998
Netherlands
Message 1279725 - Posted: 3 Sep 2012, 17:51:35 UTC - in response to Message 1279709.

(I use a vacuum cleaner on mine every few months).

Great way to lose a piece of hardware! Vacuum cleaners, especially the hoses and thus mouth-pieces, are famous for building up large amounts of static electricity when they run (suck dust).

I'd rather use an old toothbrush and, if the dust is really stuck, a toothpick to pick it off.
____________
Jord

Fighting for the correct use of the apostrophe, together with Weird Al Yankovic

ctymountie
Joined: 23 Jul 05
Posts: 20
Credit: 28,604
RAC: 0
United States
Message 1279834 - Posted: 4 Sep 2012, 0:04:32 UTC - in response to Message 1279725.

Dust is not the issue. It's a brand new card; I just installed it about 2 weeks ago. When I mention restarts, I mean turning the computer off and then turning it right back on, so there is no time for the CPU to cool down.
____________

John McLeod VII
Volunteer developer
Volunteer tester
Joined: 15 Jul 99
Posts: 24522
Credit: 521,562
RAC: 62
United States
Message 1279871 - Posted: 4 Sep 2012, 3:06:55 UTC - in response to Message 1279834.

Dust is not the issue. It's a brand new card; I just installed it about 2 weeks ago. When I mention restarts, I mean turning the computer off and then turning it right back on, so there is no time for the CPU to cool down.

That makes it sound like a driver issue.
____________


BOINC WIKI

Dell560
Joined: 5 Sep 12
Posts: 1
Credit: 0
RAC: 0
United States
Message 1280412 - Posted: 5 Sep 2012, 21:15:40 UTC

I have the same GeForce 210 card, running Windows 7 with a 500 GB hard drive with 400 GB free space. I have 4 GB of DDR3 RAM. This GeForce 210 video card does not have a fan on it, and it gets VERY hot to the touch when I'm running basic stuff on the net. The picture looks like it is separating into lines and sliding a little, as if pushing it back up would make one picture again. The computer completely freezes on me; I can't Ctrl+Alt+Del to the Task Manager and have to force a power down by holding the power button, then restart it. I too am getting the 0000005 error, whatever it is, when it does it. I'm pretty sure it has something to do with this card or GPU, possibly overheating? If I take my card out, I get no errors and my computer runs fine. I am running AVG Internet Security for viruses, Spybot S&D, and Malwarebytes Anti-Malware. If it's not the card overheating, it's causing an error writing back to memory or something to that effect. I would love to hear anyone else's opinions or ideas on how to fix this issue. I too JUST bought my card about a month ago, and I think they are putting out some stinkers.

Gatekeeper
Joined: 14 Jul 04
Posts: 887
Credit: 176,479,616
RAC: 0
United States
Message 1280416 - Posted: 5 Sep 2012, 21:22:44 UTC - in response to Message 1280412.

I have the same GeForce 210 card, running Windows 7 with a 500 GB hard drive with 400 GB free space. I have 4 GB of DDR3 RAM. This GeForce 210 video card does not have a fan on it, and it gets VERY hot to the touch when I'm running basic stuff on the net. The picture looks like it is separating into lines and sliding a little, as if pushing it back up would make one picture again. The computer completely freezes on me; I can't Ctrl+Alt+Del to the Task Manager and have to force a power down by holding the power button, then restart it. I too am getting the 0000005 error, whatever it is, when it does it. I'm pretty sure it has something to do with this card or GPU, possibly overheating? If I take my card out, I get no errors and my computer runs fine. I am running AVG Internet Security for viruses, Spybot S&D, and Malwarebytes Anti-Malware. If it's not the card overheating, it's causing an error writing back to memory or something to that effect. I would love to hear anyone else's opinions or ideas on how to fix this issue. I too JUST bought my card about a month ago, and I think they are putting out some stinkers.


What does any of this have to do with Seti or BOINC? You have no computers attached to the project.
____________

ctymountie
Joined: 23 Jul 05
Posts: 20
Credit: 28,604
RAC: 0
United States
Message 1280626 - Posted: 6 Sep 2012, 16:39:29 UTC - in response to Message 1280412.

I have the same GeForce 210 card, running Windows 7 with a 500 GB hard drive with 400 GB free space. I have 4 GB of DDR3 RAM. This GeForce 210 video card does not have a fan on it, and it gets VERY hot to the touch when I'm running basic stuff on the net. The picture looks like it is separating into lines and sliding a little, as if pushing it back up would make one picture again. The computer completely freezes on me; I can't Ctrl+Alt+Del to the Task Manager and have to force a power down by holding the power button, then restart it. I too am getting the 0000005 error, whatever it is, when it does it. I'm pretty sure it has something to do with this card or GPU, possibly overheating? If I take my card out, I get no errors and my computer runs fine. I am running AVG Internet Security for viruses, Spybot S&D, and Malwarebytes Anti-Malware. If it's not the card overheating, it's causing an error writing back to memory or something to that effect. I would love to hear anyone else's opinions or ideas on how to fix this issue. I too JUST bought my card about a month ago, and I think they are putting out some stinkers.


The 210 I have has a small fan; it is the 1 GB DDR3 card. It was on sale for only $30 from CompUSA. If you want to keep your current card, I think you can buy an auxiliary fan for your tower. Check out CompUSA, they always have stuff on sale.
____________

BilBg
Volunteer tester
Joined: 27 May 07
Posts: 2695
Credit: 6,101,462
RAC: 5,115
Bulgaria
Message 1281760 - Posted: 9 Sep 2012, 2:21:15 UTC - in response to Message 1279234.

Okay, here is something I noticed. First off, if I uninstall BOINC, remove all folders, then reinstall, it works fine until I shut my computer off.
When I restart, I start getting the errors.

Next thing I noticed, though I'm not sure yet: when I see the errors start, the WUs usually start out at 13.99%. If I suspend the current WU (the one that is about to error out), BOINC skips to the next WU and completes it without any errors.

Maybe something is not fully loaded (after the computer restarts) when BOINC tries to start the apps.
(What happens if you restart BOINC without restarting the computer?)

You may try this in cc_config.xml:
<cc_config>
  <options>
    <start_delay>33</start_delay>
  </options>
</cc_config>
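
For anyone who would rather script that change than edit the file by hand, here is a minimal Python sketch that builds the same cc_config.xml with a start_delay value, one tag per line. It is only an illustration: the function names build_cc_config and write_cc_config are made up for this example, the BOINC data directory has to be passed in by you (the location mentioned in the comment is an assumption about a default Windows 7 install, so check your own), and the 33-second delay is simply the value suggested above.

```python
from pathlib import Path
import xml.etree.ElementTree as ET


def build_cc_config(start_delay_seconds: int) -> str:
    """Return cc_config.xml text containing only a start_delay option."""
    if start_delay_seconds < 0:
        raise ValueError("start_delay must not be negative")
    text = (
        "<cc_config>\n"
        "  <options>\n"
        f"    <start_delay>{int(start_delay_seconds)}</start_delay>\n"
        "  </options>\n"
        "</cc_config>\n"
    )
    # Sanity check: raises xml.etree.ElementTree.ParseError if malformed.
    ET.fromstring(text)
    return text


def write_cc_config(data_dir, start_delay_seconds: int = 33) -> Path:
    """Write cc_config.xml into data_dir, refusing to overwrite an existing file.

    data_dir is the BOINC data directory. Assumption: on a default
    Windows 7 install this is often C:\\ProgramData\\BOINC, but verify it
    on your own machine before writing anything there.
    """
    target = Path(data_dir) / "cc_config.xml"
    if target.exists():
        # An existing file may hold other options, so merge those by hand.
        raise FileExistsError(f"{target} already exists; edit it manually")
    target.write_text(build_cc_config(start_delay_seconds), encoding="utf-8")
    return target


if __name__ == "__main__":
    # Only prints the XML; nothing is written to disk here.
    print(build_cc_config(33))
```

The sketch deliberately refuses to overwrite an existing cc_config.xml, since that file may already carry other options. Because the delay applies when the client starts, any effect should only show up after BOINC itself has been restarted.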



Have you possibly set your HDD to sleep after a short period (1 minute)?


____________



- ALF - "Find out what you don't do well ..... then don't do it!" :)

ctymountie
Joined: 23 Jul 05
Posts: 20
Credit: 28,604
RAC: 0
United States
Message 1282491 - Posted: 11 Sep 2012, 3:17:07 UTC - in response to Message 1281760.

Okay, here is something I noticed. First off, if I uninstall BOINC, remove all folders, then reinstall, it works fine until I shut my computer off.
When I restart, I start getting the errors.

Next thing I noticed, though I'm not sure yet: when I see the errors start, the WUs usually start out at 13.99%. If I suspend the current WU (the one that is about to error out), BOINC skips to the next WU and completes it without any errors.

Maybe something is not fully loaded (after the computer restarts) when BOINC tries to start the apps.
(What happens if you restart BOINC without restarting the computer?)

You may try this in cc_config.xml:
<cc_config>
  <options>
    <start_delay>33</start_delay>
  </options>
</cc_config>



Have you possibly set your HDD to sleep after a short period (1 minute)?




Nope, don't have the power setting set to put the HDD to sleep. The only thing I have it set for is the BOINC screensaver.

Maybe I should point in another direction. I have both Norton and Microsoft AntiVirus installed on this machine. I excluded the BOINC directories from the scans, but maybe I'm missing something.
____________


Copyright © 2014 University of California