CUDA work units ending in error

Troubleshooting : Windows : CUDA work units ending in error

Jord
Volunteer tester
Joined: 9 Jun 99
Posts: 15170
Credit: 4,362,181
Recent average credit: 3
Netherlands
Message 1278214 - Posted: 31 Aug 2012, 14:29:39 UTC - in response to Message 1278045.
Last modified: 31 Aug 2012, 14:29:56 UTC

It worked fine for the first few days, but then I changed some of the SETI@home computing preferences and upgraded the NVIDIA driver, and then I started having problems.

I tried uninstalling/reinstalling everything

So it ran fine with a previous video card driver and (whichever) preference settings, but to fix things you uninstalled what? Did you uninstall your present video card driver and reinstall the one you were using before? Or did you uninstall and reinstall BOINC?

I ask because the latter won't have any impact on the situation, whereas with the former (uninstalling the present video driver and reinstalling the previous one) you will probably have more luck.

The latest driver isn't always the best one (for your card/system).
Before you say you no longer have the older driver installer on your system: you can always re-download it from Nvidia's web site, and it probably also came on the driver CD you got with the video card.
ID: 1278214

ctymountie
Joined: 23 Jul 05
Posts: 20
Credit: 71,081
Recent average credit: 0
United States
Message 1278127 - Posted: 31 Aug 2012, 11:41:25 UTC - in response to Message 1278045.

Here is the latest result. I had just reinstalled BOINC and it was running fine, then I restarted the computer and got the error. Any thoughts? It is the "Can't create file -- disk full?" error.


<core_client_version>7.0.28</core_client_version>
<![CDATA[
<message>
- exit code -1 (0xffffffff)
</message>
<stderr_txt>
setiathome_enhanced 6.02 DevC++/MinGW
libboinc: 6.3.6

Work Unit Info:
...............
WU true angle range is : 0.428419
Optimal function choices:
-----------------------------------------------------
name
-----------------------------------------------------
v_BaseLineSmooth (no other)
v_vGetPowerSpectrumUnrolled2 0.00012 0.00000
v_ChirpData 0.00591 0.00000
v_vTranspose4x16ntw 0.00203 0.00000
BH SSE folding 0.00051 0.00000
Restarted at 13.99 percent.
SETI@home error -1 Can't create file -- disk full?
in checkpoint()
File: ../seti.cpp
Line: 395


</stderr_txt>
]]>

ID: 1278127

ctymountie
Joined: 23 Jul 05
Posts: 20
Credit: 71,081
Recent average credit: 0
United States
Message 1278045 - Posted: 31 Aug 2012, 9:17:58 UTC - in response to Message 1277898.

There are two orthogonal items.

Enhanced vs AstroPulse

CPU vs GPU. (CUDA is NVIDIA's GPU programming platform.)

Have you blown the dust bunnies out of your computer recently?

Have you overclocked your computer?

Have you checked the temperatures in your computer recently?


I installed my GPU about a week and a half ago. It worked fine for the first few days, but then I changed some of the SETI@home computing preferences and upgraded the NVIDIA driver, and then I started having problems. The CUDA WUs started having errors. The Enhanced and AstroPulse WUs worked. I tried uninstalling/reinstalling everything and I'm having no luck. I can't stop the errors. Now it seems that SETI only wants to run one WU (a CUDA WU) at a time, even though I don't have that option selected. I'm at a loss.

If there are any screen shots or work results that I can post up to assist then let me know.
ID: 1278045

John McLeod VII
Volunteer developer
Volunteer tester
Joined: 15 Jul 99
Posts: 24806
Credit: 790,712
Recent average credit: 0
United States
Message 1277898 - Posted: 31 Aug 2012, 3:53:38 UTC

There are two orthogonal items.

Enhanced vs AstroPulse

CPU vs GPU. (CUDA is NVIDIA's GPU programming platform.)

Have you blown the dust bunnies out of your computer recently?

Have you overclocked your computer?

Have you checked the temperatures in your computer recently?


BOINC WIKI
ID: 1277898

ctymountie
Joined: 23 Jul 05
Posts: 20
Credit: 71,081
Recent average credit: 0
United States
Message 1277872 - Posted: 31 Aug 2012, 2:53:12 UTC - in response to Message 1277169.

This is getting ridiculous. I removed and reinstalled everything: GPU, drivers, BOINC.

Now every CUDA WU ends in error, and it won't download anything but CUDA work units. So it isn't downloading "enhanced" work units anymore.

WTF....?
ID: 1277872

BilBg
Volunteer tester
Joined: 27 May 07
Posts: 3720
Credit: 9,385,827
Recent average credit: 0
Bulgaria
Message 1277169 - Posted: 29 Aug 2012, 16:44:55 UTC - in response to Message 1276951.
Last modified: 29 Aug 2012, 16:52:10 UTC

So I seem to be having a restart issue.

Hmm, I seem to remember that some BOINC versions have this 'restart issue' (e.g. with 'Snooze GPU' from the tray icon), but I do not remember which BOINC version it was.

P.S.
You have 'Computers hidden'. Is that on purpose, or did you just forget to set:
Should SETI@home show your computers on its web site? yes
http://setiathome.berkeley.edu/prefs.php?subset=project

With 'Computers hidden' you will get much less help.
People get tired of asking for every small detail, such as "What BOINC version do you use?"


 


- ALF - "Find out what you don't do well ..... then don't do it!" :)
 
ID: 1277169

ctymountie
Joined: 23 Jul 05
Posts: 20
Credit: 71,081
Recent average credit: 0
United States
Message 1276951 - Posted: 29 Aug 2012, 10:19:18 UTC - in response to Message 1276595.

Okay, I removed BOINC and cleaned it from the registry. I did a fresh install, and CUDA packets seemed to work until it suspended and tried to restart. At that point the CUDA unit hit an error, and the rest of the units in my cache quickly ran and hit errors too. So I seem to be having a restart issue.

I downloaded the latest beta version of the NVIDIA driver and put an exception in Norton for the BOINC directories. Now I just need Berkeley to come back online so I can get some more work units.
ID: 1276951

BilBg
Volunteer tester
Joined: 27 May 07
Posts: 3720
Credit: 9,385,827
Recent average credit: 0
Bulgaria
Message 1276595 - Posted: 28 Aug 2012, 8:44:45 UTC - in response to Message 1276418.
Last modified: 28 Aug 2012, 9:15:08 UTC


It's only the CUDA work units that end in error; every one of them ends in error.

The work units say Error-True Exit -1 (0xffffffff)

We can see the 'Error tasks for computer 6748434' here:
http://setiathome.berkeley.edu/results.php?hostid=6748434&offset=0&show_names=0&state=5&appid=

Some of the tasks crash:
- exit code -1073741819 (0xc0000005)
Restarted at 13.99 percent.


... but most tasks give:

- exit code -1 (0xffffffff)
Restarted at 13.99 percent.
SETI@home error -1 Can't create file -- disk full?

It is very strange that all of them 'Restarted at 13.99 percent.'
(They stop processing and restart again probably because you set 'Do not use GPU while user active')
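
As a side note, the decimal and hex forms in those exit codes are the same 32-bit value, printed once as a signed integer and once as unsigned hex. 0xc0000005 is the Windows access violation status, so those tasks really crashed, while -1 is simply the error value the application returned. A minimal Python sketch of the conversion (the helper name `to_unsigned_hex` is made up for illustration and is not part of BOINC):

```python
def to_unsigned_hex(exit_code: int) -> str:
    """Render a signed 32-bit exit code as 8-digit unsigned hex."""
    # Masking with 0xFFFFFFFF reinterprets the value as unsigned 32-bit.
    return f"0x{exit_code & 0xFFFFFFFF:08x}"

print(to_unsigned_hex(-1))           # 0xffffffff
print(to_unsigned_hex(-1073741819))  # 0xc0000005
```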

Try (temporarily, to isolate the problem):
- From the Activity menu -> Use GPU always
- Do not touch the mouse and keyboard for 5-10 minutes
- Close as many other programs as possible (e.g. browsers, media players, games, hardware monitors, gadgets)
- Set 'Suspend work when non-BOINC CPU usage is above' to 0 (zero)

- And really test whether disabling Norton for 5-10 minutes eliminates the errors
To me this is a very probable cause of the errors, based on many past reports - try Googling for: BOINC Antivirus Norton

I use NOD32 (which is not causing problems for me), but there are many reports about Trend Micro, Avast!, Avira, McAfee, Kaspersky, Comodo, Norton
http://boinc.berkeley.edu/dev/forum_thread.php?id=7633
http://boinc.berkeley.edu/dev/forum_thread.php?id=7365
http://boinc.berkeley.edu/dev/forum_thread.php?id=2486

http://milkyway.cs.rpi.edu/milkyway/forum_thread.php?id=2466
http://boinc.bakerlab.org/forum_thread.php?id=5463
http://boinc01.cern.ch/test4theory/forum_thread.php?id=518
http://www.primegrid.com/forum_thread.php?id=1184


(If you are afraid to disable Norton, disconnect from the Internet first (e.g. switch off the router) and do not use the computer; just let BOINC run for a while.
Of course, you need to have CUDA tasks already downloaded to do the test.
You may suspend them all and [Resume] them one by one for the tests.)

If you see that no more CUDA errors happen with Norton disabled, exclude the BOINC Data directory from active monitoring/scans.


This is shown for almost all of your CUDA tasks:
Stderr output
<core_client_version>7.0.28</core_client_version>
<![CDATA[
<message>
 - exit code -1 (0xffffffff)
</message>
<stderr_txt>
setiathome_CUDA: Found 1 CUDA device(s):
   Device 1 : GeForce 210 
           totalGlobalMem = 1073741824 
           sharedMemPerBlock = 16384 
           regsPerBlock = 16384 
           warpSize = 32 
           memPitch = 2147483647 
           maxThreadsPerBlock = 512 
           clockRate = 1238000 
           totalConstMem = 65536 
           major = 1 
           minor = 2 
           textureAlignment = 256 
           deviceOverlap = 1 
           multiProcessorCount = 2 
setiathome_CUDA: CUDA Device 1 specified, checking...
   Device 1: GeForce 210 is okay
SETI@home using CUDA accelerated device GeForce 210
Restarted at 13.99 percent.
SETI@home error -1 Can't create file -- disk full?
in checkpoint()
File: ..\seti.cpp
Line: 395


</stderr_txt>
]]>
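
If you want to compare many results like this one, something along these lines could pull the restart point and the error line out of the stderr text. It is only a rough Python sketch: `summarize_stderr` is a name I made up, and it assumes the two line formats shown above ("Restarted at ... percent." and "SETI@home error ...").

```python
import re

def summarize_stderr(text: str) -> dict:
    """Extract the restart percentage and the SETI@home error line, if present."""
    restart = re.search(r"Restarted at ([0-9.]+) percent\.", text)
    error = re.search(r"SETI@home error (-?\d+) (.+)", text)
    return {
        "restart_percent": float(restart.group(1)) if restart else None,
        "error_code": int(error.group(1)) if error else None,
        "error_text": error.group(2).strip() if error else None,
    }

# Sample copied from the stderr output above.
sample = """Restarted at 13.99 percent.
SETI@home error -1 Can't create file -- disk full?
in checkpoint()
"""
print(summarize_stderr(sample))
```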

 


- ALF - "Find out what you don't do well ..... then don't do it!" :)
 
ID: 1276595

ctymountie
Joined: 23 Jul 05
Posts: 20
Credit: 71,081
Recent average credit: 0
United States
Message 1276418 - Posted: 27 Aug 2012, 20:17:51 UTC - in response to Message 1276417.

I don't see any history in Norton that indicates it's blocking BOINC. Disk space is not a problem. I've got 100 gigs set aside on a 500 gig drive, and the drive itself is only about 10% full.

So a further update on the errors.

SETI Enhanced works.
AstroPulse works.

It's only the CUDA work units that end in error; every one of them ends in error.


The work units say Error-True Exit -1 (0xffffffff)
ID: 1276418

ctymountie
Joined: 23 Jul 05
Posts: 20
Credit: 71,081
Recent average credit: 0
United States
Message 1276417 - Posted: 27 Aug 2012, 20:15:58 UTC - in response to Message 1275624.

I don't see any history in Norton that indicates it's blocking BOINC. Disk space is not a problem. I've got 100 gigs set aside on a 500 gig drive, and the drive itself is only about 10% full.

So a further update on the errors.

SETI Enhanced works.
AstroPulse works.

It's only the CUDA work units that end in error; every one of them ends in error.
ID: 1276417

BilBg
Volunteer tester
Joined: 27 May 07
Posts: 3720
Credit: 9,385,827
Recent average credit: 0
Bulgaria
Message 1275624 - Posted: 26 Aug 2012, 5:32:05 UTC - in response to Message 1275572.
Last modified: 26 Aug 2012, 5:57:11 UTC


http://setiathome.berkeley.edu/result.php?resultid=2574833931

Since tasks sometimes start/restart OK and only sometimes error out:
Restarted at 7.08 percent. (OK)
Restarted at 8.33 percent. (OK)
Restarted at 9.21 percent. (SETI@home error -1 Can't create file -- disk full?)

- How much free space do you have on the HDD (on the partition where BOINC Data is, probably C:)?
- How much HDD space do you allow BOINC to use (look in the Disk tab in BOINC Manager)?
- Do you see high HDD load (the HDD LED/light is ON almost all the time)?

- Your antivirus may be locking files for too long while it scans them (when a file is created/written, it is usually scanned by the resident (on-access) antivirus module)
- Check the antivirus logs/quarantine for any BOINC-related files (some antivirus programs remove 'suspicious' files by scrambling them and moving them to something called 'Quarantine'/'Vault'/'Chest' or similar)
- If you see that the antivirus is responsible for the errors, try excluding the BOINC Data directory from active monitoring/scans.

- Did you try a reboot (restart Windows)?
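
The free-space question can also be answered from a script instead of Explorer. Below is a small Python sketch, assuming Python is installed. The path `C:\ProgramData\BOINC` is my guess at the default BOINC Data location, so adjust it to your setup; `free_space_gib` is just an illustrative helper name.

```python
import os
import shutil

def free_space_gib(path: str) -> float:
    """Return free space, in GiB, on the partition that holds `path`."""
    return shutil.disk_usage(path).free / 2**30

# Assumed default BOINC Data location; fall back to the current directory
# if that folder does not exist on this machine.
data_dir = r"C:\ProgramData\BOINC"
if not os.path.isdir(data_dir):
    data_dir = "."
print(f"{free_space_gib(data_dir):.1f} GiB free on the partition holding {data_dir}")
```

Note that this only reports what the partition has left. It says nothing about the disk limit set in the BOINC preferences, which still has to be checked in BOINC Manager.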


 


- ALF - "Find out what you don't do well ..... then don't do it!" :)
 
ID: 1275624

ctymountie
Joined: 23 Jul 05
Posts: 20
Credit: 71,081
Recent average credit: 0
United States
Message 1275572 - Posted: 26 Aug 2012, 0:07:52 UTC - in response to Message 1275339.

It has run an AstroPulse work unit without error. We'll see what happens with the next CUDA unit I get.
ID: 1275572

ctymountie
Joined: 23 Jul 05
Posts: 20
Credit: 71,081
Recent average credit: 0
United States
Message 1275339 - Posted: 25 Aug 2012, 12:01:25 UTC - in response to Message 1275253.

You also get this error if the directory is write protected to the BOINC user (or at least that's what happens under Linux)


I was playing with the directories in an attempt to get BOINCLogX to work. I had to change the properties of the ProgramData folder so that it was no longer hidden. But I don't think any of them were changed to Read Only...
ID: 1275339

rob smith
Volunteer moderator
Volunteer tester
Joined: 7 Mar 03
Posts: 18752
Credit: 416,307,556
Recent average credit: 380
United Kingdom
Message 1275253 - Posted: 25 Aug 2012, 6:49:18 UTC

You also get this error if the directory is write protected to the BOINC user (or at least that's what happens under Linux)
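
One way to test for that is to try creating a scratch file in the BOINC Data directory, using the same account that BOINC runs under. A minimal Python sketch of the idea follows; `can_create_file` is a hypothetical helper name, not something BOINC provides.

```python
import os
import tempfile

def can_create_file(directory: str) -> bool:
    """Try to create (and automatically delete) a scratch file in `directory`."""
    try:
        with tempfile.NamedTemporaryFile(dir=directory):
            pass  # the file is removed again when the block exits
        return True
    except OSError:
        # Covers a missing directory, a permission problem, or a full disk.
        return False

# Example: check the current directory; pass the BOINC Data path instead.
print(can_create_file(os.getcwd()))
```

A False result does not say which of the causes applies, only that file creation fails there for that account.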
Bob Smith
Member of Seti PIPPS (Pluto is a Planet Protest Society)
Somewhere in the (un)known Universe?
ID: 1275253

Gatekeeper
Joined: 14 Jul 04
Posts: 887
Credit: 176,479,616
Recent average credit: 0
United States
Message 1275225 - Posted: 25 Aug 2012, 5:56:32 UTC
Last modified: 25 Aug 2012, 5:57:17 UTC

Well, the predominant error seems to be: SETI@home error -1 Can't create file -- disk full?
in checkpoint()
File: ..\seti.cpp
Line: 395

Off the top of my head, I'd say either your HDD is nearly full or you don't have enough space allocated to BOINC/S@H.
ID: 1275225

ctymountie
Joined: 23 Jul 05
Posts: 20
Credit: 71,081
Recent average credit: 0
United States
Message 1275196 - Posted: 25 Aug 2012, 4:53:26 UTC

The last 25 CUDA work units I started today ended with ERROR WHILE COMPUTING.
I just installed the GPU in this new computer. It was working fine for the first two days, processing WUs in about 25 minutes. Now I'm getting errors with CUDA packets.

I'm still using the same NVIDIA driver from yesterday. I've changed some of the computing preferences and added BOINCLogX and SETI map view. Now I'm getting the errors. Any suggestions?
ID: 1275196
©2020 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.