My S@H Cuda 6.06 stops running at 00:00:21 seconds....

Message boards : Number crunching : My S@H Cuda 6.06 stops running at 00:00:21 seconds....
Riker
Joined: 10 May 01
Posts: 2
Credit: 11,970,797
RAC: 12
Sweden
Message 850219 - Posted: 6 Jan 2009, 22:45:01 UTC

I just discovered a quite serious problem: my CUDA SETI has stopped crunching, but it still seems active. I get the status "Running", then the CPU time counts up to 00:19, freezes for a second, then displays 00:21, and no work is done. A single task just keeps running until I pause or abort it. I tried leaving it alone to see if it was crunching, but after 91:21:12 I aborted it. The machine had been running fine before. Does anybody have a clue?
I'm running XP Pro SP2, a Q6600 and a Gainward GF8600GT on that specific machine. All the AstroPulse tasks running on the CPU run fine.

/Goran
ID: 850219
Byron S Goodgame
Volunteer tester
Joined: 16 Jan 06
Posts: 1145
Credit: 3,936,993
RAC: 0
United States
Message 850233 - Posted: 6 Jan 2009, 23:07:07 UTC - in response to Message 850219.  
Last modified: 6 Jan 2009, 23:07:47 UTC

Without seeing the tasks it's hard to say exactly what it is, but if it's a CUDA task, the most common reason for this type of behaviour is Very Low Angle Range, or VLAR (angle ranges of 0.1 and below), which CUDA is not able to run. If you could post a link to the task you're talking about, or un-hide your PC, it would be easier to verify.
ID: 850233
RandMan
Joined: 15 Sep 07
Posts: 1
Credit: 143,477
RAC: 0
United States
Message 850242 - Posted: 6 Jan 2009, 23:21:50 UTC

I have the same problem - CUDA job stuck at 00:21.
ID: 850242
Byron S Goodgame
Volunteer tester
Joined: 16 Jan 06
Posts: 1145
Credit: 3,936,993
RAC: 0
United States
Message 850252 - Posted: 6 Jan 2009, 23:35:18 UTC - in response to Message 850242.  
Last modified: 6 Jan 2009, 23:37:24 UTC

I have the same problem - CUDA job stuck at 00:21.

All of your MB tasks that were aborted appear to be Very Low Angle Range. At this time, aborting those tasks is about all you can do when running CUDA.

You also have a number of aborted AstroPulse tasks, but I'm not sure why those were a problem for you. If they were stuck too, snoozing them, making sure you completely close down BOINC Manager, and then restarting BOINC can sometimes get them going again. If that fails, a reboot of the system can sometimes do the trick too.
ID: 850252
eaglescouter
Joined: 28 Dec 02
Posts: 162
Credit: 42,012,553
RAC: 0
United States
Message 850477 - Posted: 7 Jan 2009, 16:56:57 UTC

I aborted one of these 'frozen' CUDA work units; the system had completed no work for several hours :(
It's not too many computers, it's a lack of circuit breakers for this room. But we can fix it :)
ID: 850477
Fusky*
Joined: 9 Apr 07
Posts: 120
Credit: 415,845
RAC: 0
United States
Message 850492 - Posted: 7 Jan 2009, 17:45:32 UTC

The same happened to me: I had a task running, and the next day the same task was still running. So all I did was stop accepting CUDA tasks and abort all the CUDA tasks I had in stock, so I'm back to crunching regular units now.
ID: 850492
skildude
Joined: 4 Oct 00
Posts: 9541
Credit: 50,759,529
RAC: 60
Yemen
Message 850494 - Posted: 7 Jan 2009, 17:47:58 UTC

Someone wrote and posted a script to end the task when CUDA locks up. It was on this forum, but I can't find it now.


In a rich man's house there is no place to spit but his face.
Diogenes Of Sinope
ID: 850494
kevin6912
Volunteer tester
Joined: 18 Jul 99
Posts: 17
Credit: 10,539,602
RAC: 0
United States
Message 850505 - Posted: 7 Jan 2009, 17:59:24 UTC

VBScript thread link: "VBscript - It fights with CUDA app freezing"

Regards,
Kevin
ID: 850505
skildude
Joined: 4 Oct 00
Posts: 9541
Credit: 50,759,529
RAC: 60
Yemen
Message 850513 - Posted: 7 Jan 2009, 18:14:51 UTC - in response to Message 850505.  

That's the one.


In a rich man's house there is no place to spit but his face.
Diogenes Of Sinope
ID: 850513
Krzychu P.
Volunteer tester
Joined: 2 Feb 06
Posts: 1
Credit: 116,830
RAC: 0
Poland
Message 852640 - Posted: 12 Jan 2009, 12:09:30 UTC - in response to Message 850513.  

Hi!
I had the same problem.
After 18-23 sec. the CUDA WUs stopped.
Then I used the VBscript; it helped for ~6 units after the script's start,
but then even this script didn't help.
So after 3 days of trials I had to reset the project in the manager and reinstall all the graphics drivers.
For now it seems to be OK. But I wonder what bug it was...
ID: 852640
Byron S Goodgame
Volunteer tester
Joined: 16 Jan 06
Posts: 1145
Credit: 3,936,993
RAC: 0
United States
Message 852644 - Posted: 12 Jan 2009, 12:37:40 UTC - in response to Message 852640.  
Last modified: 12 Jan 2009, 12:53:49 UTC

Hi!
I had the same problem.
After 18-23 sec. the CUDA WUs stopped.
Then I used the VBscript; it helped for ~6 units after the script's start,
but then even this script didn't help.
So after 3 days of trials I had to reset the project in the manager and reinstall all the graphics drivers.
For now it seems to be OK. But I wonder what bug it was...


There are many threads that discuss the bugs that are causing problems in CUDA. This post probably describes what bug or bugs were in the tasks you had as well.

As well as the VBscript you are running, you should also use the batch file mentioned in several threads, like this thread, which will allow you to spot VLAR tasks and abort them before they run.
ID: 852644
Maik
Joined: 15 May 99
Posts: 163
Credit: 9,208,555
RAC: 0
Germany
Message 852647 - Posted: 12 Jan 2009, 12:55:00 UTC - in response to Message 852644.  
Last modified: 12 Jan 2009, 13:23:26 UTC

Byron S Goodgame wrote:
As well as the VBscript you are running you should also use the batch file mentioned in several threads like this thread which will allow you to spot VLAR tasks before they run and abort them.


That's right. The script can't abort Very Low Angle Range WUs for you the way clicking the abort button in BOINC Manager does. You have to do that yourself, as explained in this thread. What the script does is help keep non-VLAR WUs from getting stuck and blocking your host. And it has been written down often: you have to babysit SETI CUDA. That means: download some WUs (into the cache), suspend computation, suspend network activity (better: set No New Tasks), check the WUs for VLAR, abort the VLAR ones, let the host crunch the complete cache, resume network activity, upload results, report results, and start again from the beginning every time your cache is crunched (if you set NNT, allow new tasks again).
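
For the "check them for VLAR" step, here is a rough Python sketch of what such a check could look like. It is not the batch file from the linked thread. It assumes the workunit header stores the angle range in a `<true_angle_range>` element near the top of each file; the function names, the 16 KB read size, and the idea of pointing it at the project data folder are only illustrative. The 0.1 limit is the one quoted earlier in this thread.

```python
import re
from pathlib import Path

# Limit quoted in this thread: "angle ranges 0.1 and below".
VLAR_LIMIT = 0.1

# Assumption: the workunit header holds the angle range in this element.
_AR_TAG = re.compile(r"<true_angle_range>\s*([^<\s]+)\s*</true_angle_range>")


def angle_range(header_text):
    """Return the angle range found in the text, or None if absent or unreadable."""
    match = _AR_TAG.search(header_text)
    if match is None:
        return None
    try:
        return float(match.group(1))
    except ValueError:
        return None


def is_vlar(header_text, limit=VLAR_LIMIT):
    """True when an angle range is present and at or below the limit."""
    ar = angle_range(header_text)
    return ar is not None and ar <= limit


def find_vlar(project_dir):
    """Return the names of files in project_dir that look like VLAR workunits."""
    hits = []
    for path in sorted(Path(project_dir).iterdir()):
        if not path.is_file():
            continue
        # Read only the start of the file; the header is assumed to sit there.
        with open(path, "r", errors="ignore") as fh:
            head = fh.read(16384)
        if is_vlar(head):
            hits.append(path.name)
    return hits
```

This only lists candidates; aborting the matching tasks is still done by hand in BOINC Manager, with computation suspended first as described above.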

kevin6912 wrote:
VBScript thread link VBscript - It fights with CUDA app freezing

Regards,
Kevin

Please do NOT use the script posted there, use this link ;)
VBscript Fights Cuda - Thread @ lunatics
ID: 852647
