My S@H Cuda 6.06 stops running at 00:00:21 seconds....


Message boards : Number crunching : My S@H Cuda 6.06 stops running at 00:00:21 seconds....

Author Message
Profile Riker
Joined: 10 May 01
Posts: 2
Credit: 7,726,231
RAC: 0
Sweden
Message 850219 - Posted: 6 Jan 2009, 22:45:01 UTC

I just discovered a quite serious problem: my CUDA SETI has stopped crunching, but it still seems active. The task shows the status Running, the CPU time counts up to 00:19, freezes for a second, then displays 00:21, and no more work is done. A single task keeps running until I pause or abort it. I tried leaving it alone to see whether it was crunching, but after 91:21:12 I aborted it. The machine had been running fine before. Does anybody have a clue?
I'm running XP Pro SP2, a Q6600 and a Gainward GF8600GT on that specific machine. All the AstroPulse tasks running on the CPU run fine.

/Goran
____________

Profile Byron S Goodgame
Volunteer tester
Joined: 16 Jan 06
Posts: 1151
Credit: 3,936,993
RAC: 0
United States
Message 850233 - Posted: 6 Jan 2009, 23:07:07 UTC - in response to Message 850219.
Last modified: 6 Jan 2009, 23:07:47 UTC

Without seeing the tasks it's hard to say exactly what it is, but if it's a CUDA task, the most common reason for this type of behaviour is Very Low Angle Range, or VLAR (angle ranges of 0.1 and below), which the CUDA application is not able to run. If you could post a link to the task you're talking about, or un-hide your PC, it would be easier to verify.
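
As a minimal sketch of the cutoff described above, the check comes down to one comparison. The 0.1 threshold is taken from this post; the function and constant names are hypothetical and not part of any SETI@home or BOINC tool.

```python
# Cutoff quoted in the post: "angle ranges 0.1 and below".
VLAR_THRESHOLD = 0.1


def is_vlar(angle_range: float, threshold: float = VLAR_THRESHOLD) -> bool:
    """Return True when an angle range is at or below the VLAR cutoff."""
    return angle_range <= threshold


if __name__ == "__main__":
    # Illustrative values only, not taken from real tasks.
    for ar in (0.0089, 0.1, 0.42):
        label = "VLAR, likely to hang on CUDA" if is_vlar(ar) else "OK for CUDA"
        print(f"angle range {ar}: {label}")
```

The angle range itself still has to be read from the task page or the workunit, which is why a link to the task is needed.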
____________

RandMan
Joined: 15 Sep 07
Posts: 1
Credit: 143,477
RAC: 0
United States
Message 850242 - Posted: 6 Jan 2009, 23:21:50 UTC

I have the same problem - CUDA job stuck at 00:21.

Profile Byron S Goodgame
Volunteer tester
Joined: 16 Jan 06
Posts: 1151
Credit: 3,936,993
RAC: 0
United States
Message 850252 - Posted: 6 Jan 2009, 23:35:18 UTC - in response to Message 850242.
Last modified: 6 Jan 2009, 23:37:24 UTC

RandMan wrote:
I have the same problem - CUDA job stuck at 00:21.

Of your MB tasks that were aborted, all appear to be Very Low Angle Range. At this time, aborting those tasks is about all you can do when running CUDA.

You also have a number of aborted AstroPulse tasks, but I'm not sure why those were a problem for you. If they were stuck too, snoozing them, making sure you completely close down BOINC Manager, and then restarting BOINC can sometimes get them going again. If that fails, a reboot of the system can sometimes do the trick too.
____________

Profile eaglescouter
Joined: 28 Dec 02
Posts: 162
Credit: 42,012,553
RAC: 0
United States
Message 850477 - Posted: 7 Jan 2009, 16:56:57 UTC

I aborted one of these 'frozen' CUDA work units; the system had completed no work for several hours :(
____________
It's not too many computers, it's a lack of circuit breakers for this room. But we can fix it :)

Profile Fusky*
Joined: 9 Apr 07
Posts: 120
Credit: 320,015
RAC: 0
United States
Message 850492 - Posted: 7 Jan 2009, 17:45:32 UTC

The same happened to me: I had a task running, and the next day the same task was still running. So all I did was stop accepting CUDA tasks and abort all the CUDA tasks I had in stock. I'm back to crunching regular units now.
____________

Profile ignorance is no excuse
Joined: 4 Oct 00
Posts: 9529
Credit: 44,433,321
RAC: 0
Korea, North
Message 850494 - Posted: 7 Jan 2009, 17:47:58 UTC

Someone wrote and posted a script to end the task when CUDA locks up. It was on this forum, but I can't find it now.
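
The kind of check such a watchdog needs can be sketched as follows. This is not the VBScript from the forum; it is a hypothetical Python illustration of the symptom reported in this thread, where the status stays Running while CPU time stops advancing. How the CPU time samples are collected is left out, and the function name and sample format are assumptions.

```python
from typing import Sequence, Tuple


def is_stalled(samples: Sequence[Tuple[float, float]],
               min_stall_seconds: float) -> bool:
    """Decide whether a task looks frozen.

    samples: (wall_clock_seconds, cpu_seconds) pairs, oldest first.
    Returns True when CPU time has held its latest value for at least
    min_stall_seconds of wall-clock time.
    """
    if len(samples) < 2:
        return False  # not enough history to judge

    last_wall, last_cpu = samples[-1]
    stall_start = last_wall
    # Walk backwards to find when CPU time first reached its latest value.
    for wall, cpu in reversed(samples):
        if cpu != last_cpu:
            break
        stall_start = wall
    return (last_wall - stall_start) >= min_stall_seconds


if __name__ == "__main__":
    # CPU time climbs to 21 s and then never moves again.
    frozen = [(0, 0), (10, 10), (20, 19), (30, 21), (40, 21), (100, 21)]
    print(is_stalled(frozen, min_stall_seconds=60))
```

A real watchdog would also need a safe way to end or restart the stuck process. That part is deliberately omitted here.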
____________
In a rich man's house there is no place to spit but his face.
Diogenes Of Sinope

End terrorism by building a school

kevin6912
Volunteer tester
Joined: 18 Jul 99
Posts: 17
Credit: 10,539,502
RAC: 307
United States
Message 850505 - Posted: 7 Jan 2009, 17:59:24 UTC

VBScript thread link: "VBscript - It fights with CUDA app freezing"

Regards,
Kevin

Profile ignorance is no excuse
Joined: 4 Oct 00
Posts: 9529
Credit: 44,433,321
RAC: 0
Korea, North
Message 850513 - Posted: 7 Jan 2009, 18:14:51 UTC - in response to Message 850505.

That's the one.
____________
In a rich man's house there is no place to spit but his face.
Diogenes Of Sinope

End terrorism by building a school

Profile Krzychu P.
Volunteer tester
Joined: 2 Feb 06
Posts: 1
Credit: 103,937
RAC: 0
Poland
Message 852640 - Posted: 12 Jan 2009, 12:09:30 UTC - in response to Message 850513.

Hi!
I had the same problem.
After 18-23 sec. the CUDA WUs stopped.
Then I used the VBscript; it helped for ~6 units after the script's start,
but then even the script didn't help.
So after 3 days of trials I had to reset the project in the manager and reinstall all the graphics drivers.
For now it seems to be OK. But I wonder what bug it was...
____________

Profile Byron S Goodgame
Volunteer tester
Joined: 16 Jan 06
Posts: 1151
Credit: 3,936,993
RAC: 0
United States
Message 852644 - Posted: 12 Jan 2009, 12:37:40 UTC - in response to Message 852640.
Last modified: 12 Jan 2009, 12:53:49 UTC

Krzychu P. wrote:
Hi!
I had the same problem.
After 18-23 sec. the CUDA WUs stopped.
Then I used the VBscript; it helped for ~6 units after the script's start,
but then even the script didn't help.
So after 3 days of trials I had to reset the project in the manager and reinstall all the graphics drivers.
For now it seems to be OK. But I wonder what bug it was...


There are many threads that discuss the bugs that are causing problems in CUDA. This post probably describes what bug or bugs were in the tasks you had as well.

As well as the VBscript you are running, you should also use the batch file mentioned in several threads, like this thread, which will allow you to spot VLAR tasks and abort them before they run.
____________

Maik
Joined: 15 May 99
Posts: 163
Credit: 7,215,944
RAC: 0
Germany
Message 852647 - Posted: 12 Jan 2009, 12:55:00 UTC - in response to Message 852644.
Last modified: 12 Jan 2009, 13:23:26 UTC

Byron S Goodgame wrote:
As well as the VBscript you are running you should also use the batch file mentioned in several threads like this thread which will allow you to spot VLAR tasks before they run and abort them.


That's right. The script can't abort Very Low Angle Range WUs by clicking the button in BOINC Manager; you have to do that yourself, as explained in this thread. What the script does is keep non-VLAR WUs from getting stuck and blocking your host. And as has been written many times: you have to babysit SETI CUDA. That means: download some WUs into the cache, suspend computation, suspend network activity (better: set No New Tasks), check the cached WUs for VLAR, abort the VLAR ones, let the host crunch the complete cache, resume network activity, upload and report the results, and start over each time your cache has been crunched (if you set NNT, allow new tasks again).
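
The "check them for VLAR" step of that routine can be sketched in Python as below. This is a hypothetical illustration, not the batch file from the linked thread. It assumes each cached workunit file carries a `<true_angle_range>` element near its start; that tag name, the header size, and the project directory are all assumptions to verify on your own host. The sketch only lists candidates. Aborting them still has to be done by hand in BOINC Manager.

```python
import re
from pathlib import Path
from typing import List, Optional

# Assumed tag name; confirm it against a real workunit file before relying on it.
ANGLE_TAG = re.compile(
    r"<true_angle_range>\s*([0-9.eE+-]+)\s*</true_angle_range>"
)


def read_angle_range(path: Path, header_bytes: int = 65536) -> Optional[float]:
    """Return the angle range found near the start of a file, or None."""
    with path.open("rb") as f:
        head = f.read(header_bytes).decode("ascii", errors="ignore")
    match = ANGLE_TAG.search(head)
    if match is None:
        return None
    try:
        return float(match.group(1))
    except ValueError:
        return None  # tag present but value not a valid number


def find_vlar(project_dir: Path, threshold: float = 0.1) -> List[str]:
    """List file names in project_dir whose angle range is <= threshold."""
    vlar = []
    for path in sorted(project_dir.iterdir()):
        if not path.is_file():
            continue
        angle_range = read_angle_range(path)
        if angle_range is not None and angle_range <= threshold:
            vlar.append(path.name)
    return vlar
```

To use it, call `find_vlar(Path(...))` with the path of your own SETI@home project folder while computation is suspended, then abort the listed tasks before resuming.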

kevin6912 wrote:
VBScript thread link: "VBscript - It fights with CUDA app freezing"

Regards,
Kevin

Please do NOT use the script posted there, use this link ;)
VBscript Fights Cuda - Thread @ lunatics
____________

Copyright © 2014 University of California