CUDA work units ending in error


ctymountie
Joined: 23 Jul 05
Posts: 20
Credit: 28,604
RAC: 0
United States
Message 1275196 - Posted: 25 Aug 2012, 4:53:26 UTC

The last 25 CUDA work units I started today ended with ERROR WHILE COMPUTING.
I just installed the GPU in this new computer. It was working fine for the first two days, processing WUs in about 25 minutes. Now I'm getting errors with CUDA packets.

I'm still using the same NVIDIA driver as yesterday. I've changed some of the computing preferences and added BoincLogX and SETI map view. Now I'm getting the errors. Any suggestions?

Gatekeeper
Joined: 14 Jul 04
Posts: 887
Credit: 176,479,616
RAC: 36
United States
Message 1275225 - Posted: 25 Aug 2012, 5:56:32 UTC
Last modified: 25 Aug 2012, 5:57:17 UTC

Well, the predominant error seems to be: SETI@home error -1 Can't create file -- disk full?
in checkpoint()
File: ..\seti.cpp
Line: 395

Off the top of my head, I'd say either your HDD is nearly full or you don't have enough space allocated to BOINC/S@H.
____________

rob smith
Volunteer tester
Joined: 7 Mar 03
Posts: 8122
Credit: 52,313,310
RAC: 78,066
United Kingdom
Message 1275253 - Posted: 25 Aug 2012, 6:49:18 UTC

You also get this error if the directory is write-protected against the BOINC user (or at least that's what happens under Linux).
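
Both explanations so far (a full disk, or a directory the BOINC user cannot write to) can be tested by trying to create a file in the data directory, which is what the checkpoint step has to do. A minimal Python sketch; the helper name `can_create_file` and the path in the usage lines are illustrative assumptions, not part of BOINC, and the result only reflects the account that runs the script:

```python
import os
import tempfile


def can_create_file(directory):
    """Return (True, "") if a small file can be created in `directory`,
    otherwise (False, reason)."""
    try:
        # NamedTemporaryFile creates the file and deletes it again on close.
        with tempfile.NamedTemporaryFile(dir=directory, prefix="probe_") as handle:
            handle.write(b"checkpoint probe")
            handle.flush()
        return True, ""
    except OSError as exc:
        # Covers "permission denied", "no space left on device"
        # and a directory that does not exist.
        return False, str(exc)


if __name__ == "__main__":
    # Hypothetical location; substitute your own BOINC data directory.
    ok, reason = can_create_file(r"C:\ProgramData\BOINC")
    print("writable" if ok else "not writable: " + reason)
```

If BOINC runs as a service under a different account, the probe would have to be run as that account to be meaningful.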
____________
Bob Smith
Member of Seti PIPPS (Pluto is a Planet Protest Society)
Somewhere in the (un)known Universe?

ctymountie
Joined: 23 Jul 05
Posts: 20
Credit: 28,604
RAC: 0
United States
Message 1275339 - Posted: 25 Aug 2012, 12:01:25 UTC - in response to Message 1275253.


I was playing with the directories in an attempt to get BoincLogX to work. I had to change the properties of the ProgramData folder so that it was no longer hidden. But I don't think any of them were changed to Read Only...
____________

ctymountie
Joined: 23 Jul 05
Posts: 20
Credit: 28,604
RAC: 0
United States
Message 1275572 - Posted: 26 Aug 2012, 0:07:52 UTC - in response to Message 1275339.

It has run an Astropulse work unit without error. We'll see what happens with the next CUDA unit I get.
____________

BilBg
Volunteer tester
Joined: 27 May 07
Posts: 2568
Credit: 5,856,675
RAC: 2,242
Bulgaria
Message 1275624 - Posted: 26 Aug 2012, 5:32:05 UTC - in response to Message 1275572.
Last modified: 26 Aug 2012, 5:57:11 UTC


http://setiathome.berkeley.edu/result.php?resultid=2574833931

Since sometimes tasks start/restart OK and only sometimes error:
Restarted at 7.08 percent. (OK)
Restarted at 8.33 percent. (OK)
Restarted at 9.21 percent. (SETI@home error -1 Can't create file -- disk full?)

- How much free space do you have on the HDD (on the partition where BOINC Data is, probably C:)?
- How much HDD space do you allow for BOINC use (look in Disk tab in BOINC Manager)?
- Do you see high HDD load (HDD LED/light is ON almost all the time)?

- Your antivirus may be locking files for too long when it scans them (when a file is created or written, it is usually scanned by the resident (on-access) antivirus module).
- Check the antivirus logs and quarantine for any BOINC-related files (some antivirus programs remove 'suspicious' files by scrambling them and moving them to something called 'Quarantine', 'Vault', 'Chest' or similar).
- If you see that the antivirus is responsible for the errors, try excluding the BOINC Data directory from active monitoring/scans.

- Did you try rebooting (restarting Windows)?
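
The free-space question at the top of this list can be answered with a few lines of Python. This is only a sketch: `free_space_report` is a made-up helper, and which path to pass depends on where your BOINC Data directory actually lives.

```python
import shutil


def free_space_report(path):
    """Return size figures (in GiB) for the volume that holds `path`."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (bytes)
    gib = 1024 ** 3
    return {
        "total_gb": usage.total / gib,
        "free_gb": usage.free / gib,
        "percent_used": 100.0 * usage.used / usage.total if usage.total else 0.0,
    }


if __name__ == "__main__":
    # "." means the volume of the current directory; pass the BOINC Data
    # directory (or "C:\\") to check that partition instead.
    report = free_space_report(".")
    print("{free_gb:.1f} GB free of {total_gb:.1f} GB "
          "({percent_used:.0f}% used)".format(**report))
```

This reports what the operating system sees; the limit BOINC applies to itself is set separately in the Disk preferences.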


____________



- ALF - "Find out what you don't do well ..... then don't do it!" :)

ctymountie
Joined: 23 Jul 05
Posts: 20
Credit: 28,604
RAC: 0
United States
Message 1276417 - Posted: 27 Aug 2012, 20:15:58 UTC - in response to Message 1275624.

I don't see any history in Norton that indicates it's blocking BOINC. Disk space is not a problem. I've got 100 gigs set aside on a 500 gig drive, and the drive itself is only about 10% full.

So, a further update on the errors.

SETI Enhanced works.
Astropulse works.

It's only the CUDA work units that end in error; every one of them ends in error.
____________

ctymountie
Joined: 23 Jul 05
Posts: 20
Credit: 28,604
RAC: 0
United States
Message 1276418 - Posted: 27 Aug 2012, 20:17:51 UTC - in response to Message 1276417.


The work units say Error-True Exit -1 (0xffffffff)
____________

BilBg
Volunteer tester
Joined: 27 May 07
Posts: 2568
Credit: 5,856,675
RAC: 2,242
Bulgaria
Message 1276595 - Posted: 28 Aug 2012, 8:44:45 UTC - in response to Message 1276418.
Last modified: 28 Aug 2012, 9:15:08 UTC


It's only the CUDA work units that end in error; every one of them ends in error.

The work units say Error-True Exit -1 (0xffffffff)

We can see the 'Error tasks for computer 6748434' here:
http://setiathome.berkeley.edu/results.php?hostid=6748434&offset=0&show_names=0&state=5&appid=

Some of the tasks crash:
- exit code -1073741819 (0xc0000005)
Restarted at 13.99 percent.


... but most tasks give:

- exit code -1 (0xffffffff)
Restarted at 13.99 percent.
SETI@home error -1 Can't create file -- disk full?

It is very strange that all of them 'Restarted at 13.99 percent.'
(They stop processing and restart again probably because you set 'Do not use GPU while user active')
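
The hex values next to the two exit codes above are the same numbers read as unsigned 32-bit integers. A small Python sketch of the conversion (the function name is illustrative); 0xc0000005 is the Windows status code for an access violation:

```python
def exit_code_to_hex(code):
    """Show a signed 32-bit process exit code as unsigned hexadecimal."""
    return "0x{:08x}".format(code & 0xFFFFFFFF)


print(exit_code_to_hex(-1))           # 0xffffffff
print(exit_code_to_hex(-1073741819))  # 0xc0000005
```

So -1 is the code the application itself returns after the "Can't create file" message, while 0xc0000005 is a crash reported by Windows.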

Try (temporarily, to isolate the problem):
- From the Activity menu -> Use GPU always
- Do not touch the mouse and keyboard for 5-10 minutes
- Exit as many programs as possible (e.g. browsers, media players, games, hardware monitors, gadgets)
- Set 'Suspend work when non-BOINC CPU usage is above' to 0 (zero)

- And really test whether disabling Norton for 5-10 minutes eliminates the errors.
For me this is a very probable cause of the errors, based on many past reports - try to Google for: BOINC Antivirus Norton

I use NOD32 (which is not causing problems for me), but there are many reports about Trend Micro, Avast!, Avira, McAfee, Kaspersky, Comodo, Norton
http://boinc.berkeley.edu/dev/forum_thread.php?id=7633
http://boinc.berkeley.edu/dev/forum_thread.php?id=7365
http://boinc.berkeley.edu/dev/forum_thread.php?id=2486

http://milkyway.cs.rpi.edu/milkyway/forum_thread.php?id=2466
http://boinc.bakerlab.org/forum_thread.php?id=5463
http://boinc01.cern.ch/test4theory/forum_thread.php?id=518
http://www.primegrid.com/forum_thread.php?id=1184


(If you are afraid to disable Norton, disconnect from the Internet first (e.g. switch off the router) and do not use the computer; just let BOINC run for a while.
Of course, you have to have CUDA tasks already downloaded to do the test.
You may suspend them all and [Resume] them one by one for the tests.)

If you see that no more CUDA errors happen with Norton disabled, exclude the BOINC Data directory from active monitoring/scans.


This is shown for almost all of your CUDA tasks:
Stderr output
<core_client_version>7.0.28</core_client_version>
<![CDATA[
<message>
- exit code -1 (0xffffffff)
</message>
<stderr_txt>
setiathome_CUDA: Found 1 CUDA device(s):
Device 1 : GeForce 210
totalGlobalMem = 1073741824
sharedMemPerBlock = 16384
regsPerBlock = 16384
warpSize = 32
memPitch = 2147483647
maxThreadsPerBlock = 512
clockRate = 1238000
totalConstMem = 65536
major = 1
minor = 2
textureAlignment = 256
deviceOverlap = 1
multiProcessorCount = 2
setiathome_CUDA: CUDA Device 1 specified, checking...
Device 1: GeForce 210 is okay
SETI@home using CUDA accelerated device GeForce 210
Restarted at 13.99 percent.
SETI@home error -1 Can't create file -- disk full?
in checkpoint()
File: ..\seti.cpp
Line: 395
</stderr_txt>
]]>

____________



- ALF - "Find out what you don't do well ..... then don't do it!" :)

ctymountie
Joined: 23 Jul 05
Posts: 20
Credit: 28,604
RAC: 0
United States
Message 1276951 - Posted: 29 Aug 2012, 10:19:18 UTC - in response to Message 1276595.

Okay, I removed BOINC and cleaned it from the registry. I did a fresh install, and CUDA packets seemed to work until one was suspended and tried to restart. At that point, the CUDA unit hit an error, and the rest of the units in my cache quickly ran and hit errors too. So I seem to be having a restart issue.

I downloaded the latest beta version of the NVIDIA driver and put an exception in Norton for the BOINC directories. Now I just need Berkeley to come back online so I can get some more work units.
____________

BilBg
Volunteer tester
Joined: 27 May 07
Posts: 2568
Credit: 5,856,675
RAC: 2,242
Bulgaria
Message 1277169 - Posted: 29 Aug 2012, 16:44:55 UTC - in response to Message 1276951.
Last modified: 29 Aug 2012, 16:52:10 UTC

So I seem to be having a restart issue.

Hmm, I seem to remember that some BOINC versions have this 'restart issue' (e.g. with 'Snooze GPU' from the tray icon), but I do not remember which BOINC version it was.

P.S.
You have 'Computers hidden'. Is that on purpose, or did you just forget to set:
Should SETI@home show your computers on its web site? yes
http://setiathome.berkeley.edu/prefs.php?subset=project

With 'Computers hidden' you will get much less help.
Anyone will get tired of asking for every single small detail, such as "What BOINC version do you use?"


____________



- ALF - "Find out what you don't do well ..... then don't do it!" :)

ctymountie
Joined: 23 Jul 05
Posts: 20
Credit: 28,604
RAC: 0
United States
Message 1277872 - Posted: 31 Aug 2012, 2:53:12 UTC - in response to Message 1277169.

This is getting ridiculous. I removed and reinstalled everything: GPU, drivers, BOINC.

Now every CUDA WU ends in error. And it won't download anything but CUDA work units, so it isn't downloading "enhanced" work units anymore.

WTF....?
____________

John McLeod VII
Volunteer developer
Volunteer tester
Joined: 15 Jul 99
Posts: 24055
Credit: 516,934
RAC: 134
United States
Message 1277898 - Posted: 31 Aug 2012, 3:53:38 UTC

There are two orthogonal items.

Enhanced vs AstroPulse

CPU vs GPU. (CUDA is NVIDIA's platform for programming GPUs.)

Have you blown the dust bunnies out of your computer recently?

Have you overclocked your computer?

Have you checked the temperatures in your computer recently?
____________


BOINC WIKI

ctymountie
Joined: 23 Jul 05
Posts: 20
Credit: 28,604
RAC: 0
United States
Message 1278045 - Posted: 31 Aug 2012, 9:17:58 UTC - in response to Message 1277898.


I installed my GPU about a week and a half ago. It worked fine for the first few days, but then I changed some of the SETI@home computing preferences and upgraded the NVIDIA driver, and I started having problems. The CUDA WUs started having errors. The Enhanced and Astropulse WUs worked. I tried uninstalling and reinstalling everything, and I'm having no luck. I can't stop the errors. Now it seems that SETI only wants to run one WU (a CUDA WU) at a time, even though I don't have that option selected. I'm at a loss.

If there are any screenshots or work results that I can post to assist, then let me know.
____________

ctymountie
Joined: 23 Jul 05
Posts: 20
Credit: 28,604
RAC: 0
United States
Message 1278127 - Posted: 31 Aug 2012, 11:41:25 UTC - in response to Message 1278045.

Here is the latest result. I had just reinstalled BOINC and it was running fine. Then I restarted the computer and got the error. Any thoughts? It's the "Can't create file -- disk full?" error again.


<core_client_version>7.0.28</core_client_version>
<![CDATA[
<message>
- exit code -1 (0xffffffff)
</message>
<stderr_txt>
setiathome_enhanced 6.02 DevC++/MinGW
libboinc: 6.3.6

Work Unit Info:
...............
WU true angle range is : 0.428419
Optimal function choices:
-----------------------------------------------------
name
-----------------------------------------------------
v_BaseLineSmooth (no other)
v_vGetPowerSpectrumUnrolled2 0.00012 0.00000
v_ChirpData 0.00591 0.00000
v_vTranspose4x16ntw 0.00203 0.00000
BH SSE folding 0.00051 0.00000
Restarted at 13.99 percent.
SETI@home error -1 Can't create file -- disk full?
in checkpoint()
File: ../seti.cpp
Line: 395


</stderr_txt>
]]>
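
When many results have to be compared, the interesting lines of the stderr text can be pulled out with a short script. A hedged Python sketch: `summarize_stderr` is a hypothetical helper, and the patterns are based only on the output quoted in this thread, so other application versions may print something different.

```python
import re

RESTART_RE = re.compile(r"Restarted at ([0-9.]+) percent")
ERROR_RE = re.compile(r"SETI@home error (-?\d+) (.+)")
EXIT_RE = re.compile(r"exit code (-?\d+) \((0x[0-9a-fA-F]+)\)")


def summarize_stderr(text):
    """Collect restart points, the SETI@home error and the exit code."""
    restarts = [float(value) for value in RESTART_RE.findall(text)]
    error = ERROR_RE.search(text)
    exit_code = EXIT_RE.search(text)
    return {
        "restarts": restarts,
        "error_code": int(error.group(1)) if error else None,
        "error_text": error.group(2).strip() if error else None,
        "exit_code": int(exit_code.group(1)) if exit_code else None,
    }


sample = """- exit code -1 (0xffffffff)
Restarted at 13.99 percent.
SETI@home error -1 Can't create file -- disk full?
in checkpoint()
"""
print(summarize_stderr(sample))
```

The error pattern expects one message per line, as in the result above; text that has been flattened onto a single line would need a stricter pattern.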

____________

Ageless
Joined: 9 Jun 99
Posts: 12258
Credit: 2,544,727
RAC: 264
Netherlands
Message 1278214 - Posted: 31 Aug 2012, 14:29:39 UTC - in response to Message 1278045.
Last modified: 31 Aug 2012, 14:29:56 UTC

It worked fine for the first few days but then I changed some of the seti@home computing preferences and upgraded the NVIDIA driver then I started having problems.

I tried uninstalling/reinstalling everything

So it ran fine with a previous videocard driver and (whichever) preference settings, but then to fix things you uninstall what? Did you uninstall your present videocard driver and reinstall the previous one that you were using? Or did you uninstall & reinstall BOINC?

I ask this as I can tell you that the latter won't have any impact on the situation, while when you perform the former (uninstall present vid driver and reinstall the previous one) you will probably have more luck.

The latest driver isn't always the best one (for your card/system).
Before you say you don't have the older driver installer on your system anymore, do know that you can either always re-download it from Nvidia's web site or it probably came on the driver CD you got with the videocard.
____________
Jord

Fighting for the correct use of the apostrophe, together with Weird Al Yankovic

ctymountie
Joined: 23 Jul 05
Posts: 20
Credit: 28,604
RAC: 0
United States
Message 1278306 - Posted: 31 Aug 2012, 16:46:54 UTC - in response to Message 1278214.


Currently using the 306 beta driver, but I've tried all of them: 306, 301, 2xx.... No luck.
____________

Ageless
Joined: 9 Jun 99
Posts: 12258
Credit: 2,544,727
RAC: 264
Netherlands
Message 1278328 - Posted: 31 Aug 2012, 17:11:49 UTC - in response to Message 1278306.

Do you uninstall older drivers before installing the newer ones?
Do you use Driver Sweeper or similar to remove all the remnants of old drivers between uninstalling & reinstalling?

What were the Seti preferences that you changed (it shouldn't make a difference, but tell us anyway)?

Since you're now also returning error 0xc0000005s, please see this BOINC FAQ for pointers.

As for the error 0xffffffffs, please state all the settings for
Disk: use at most GB
Disk: leave free at least GB
Disk: use at most of total
Swap space: use at most of total

Check these values both in your computing preferences AND in BOINC Manager->Advanced view->Tools->Computing preferences->Disk and memory usage. If you don't (want to) use the latter, click the Clear button to leave here. This will force BOINC to use the online preferences.
____________
Jord

Fighting for the correct use of the apostrophe, together with Weird Al Yankovic

ctymountie
Joined: 23 Jul 05
Posts: 20
Credit: 28,604
RAC: 0
United States
Message 1278395 - Posted: 31 Aug 2012, 18:40:14 UTC - in response to Message 1278328.
Last modified: 31 Aug 2012, 18:44:23 UTC


Disk: use at most 10 GB
Disk: leave free at least 0.1 GB
Disk: use at most 50% of total (leave 50% free)
Swap space: use at most 75% of total

And I deleted leftover subfolders and used a registry cleaner after removing the old drivers.
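
To see what the three disk values add up to, here is a rough Python model. It assumes BOINC treats each preference as a separate ceiling and applies the smallest one; the function and parameter names are invented for illustration, the 500 GB drive that is about 10% full comes from earlier in this thread, and the 1 GB already used by BOINC is a guess.

```python
def allowed_boinc_disk_gb(total_gb, free_gb, boinc_used_gb,
                          max_used_gb, min_free_gb, max_used_pct):
    """Smallest of the three ceilings implied by the disk preferences."""
    limit_abs = float(max_used_gb)                  # "use at most N GB"
    limit_pct = total_gb * max_used_pct / 100.0     # "use at most N% of total"
    # "leave free at least N GB": what BOINC already holds plus what is
    # still free, minus the amount that must stay free.
    limit_min_free = boinc_used_gb + free_gb - min_free_gb
    return max(0.0, min(limit_abs, limit_pct, limit_min_free))


# 500 GB drive, about 450 GB free, BOINC assumed to hold 1 GB so far.
print(allowed_boinc_disk_gb(500, 450, 1, 10, 0.1, 50))  # 10.0
```

Under these assumptions the 10 GB cap is the smallest of the three limits, so the other two settings would not come into play on this drive.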
____________

Ageless
Joined: 9 Jun 99
Posts: 12258
Credit: 2,544,727
RAC: 264
Netherlands
Message 1278435 - Posted: 31 Aug 2012, 20:07:08 UTC - in response to Message 1278395.

Mind answering the rest as well?

Do you uninstall older drivers before installing the newer ones?
Do you use Driver Sweeper or similar to remove all the remnants of old drivers between uninstalling & reinstalling?

What were the Seti preferences that you changed (it shouldn't make a difference, but tell us anyway)?


Added to that, how large is the hard drive that BOINC's Data directory is installed on?
____________
Jord

Fighting for the correct use of the apostrophe, together with Weird Al Yankovic


Copyright © 2014 University of California