Tasks hanging on CUDA 6.08
Author | Message |
---|---|
Borgholio Joined: 2 Aug 99 Posts: 654 Credit: 18,623,738 RAC: 45 |
Overall, CUDA 6.08 works fine with the exception of occasional hanging workunits. This one is an example: http://setiathome.berkeley.edu/workunit.php?wuid=398292134 It doesn't crash my system or anything, it simply doesn't progress. Unlike most other stalled workunits, even pausing / resuming the task doesn't have any effect. I had to abort it. Once I did that, BOINC started working on AP instead of going back to CUDA. Tried pausing the AP tasks and BOINC went nuts downloading tons of work for other projects. :-/ After restarting BOINC, it appears to be chugging away with the 6.08 CPU client instead of CUDA. So it seems there is one small bug left in the CUDA app, and I'm still waiting anxiously for a version of BOINC that properly schedules CUDA work. But since I now have tons of work downloaded for other projects, it seems I have plenty of time. :-P You will be assimilated...bunghole! |
Borgholio Joined: 2 Aug 99 Posts: 654 Credit: 18,623,738 RAC: 45 |
Update: restarting the BOINC client seems to get stuck WUs running again. |
RandyC Joined: 20 Oct 99 Posts: 714 Credit: 1,704,345 RAC: 0 |
I aborted this WU after it ran for 9 hrs and was less than 60% complete. Started running 6.08 last night; I'd been running Raistmer's app up until then. |
MeglaW Joined: 21 Jun 00 Posts: 36 Credit: 479,460 RAC: 0 |
New version? My BOINC is not updating! |
Jord Joined: 9 Jun 99 Posts: 15184 Credit: 4,362,181 RAC: 3 |
Please people, when you report hanging tasks do tell the basics about your system, OS, driver version, CUDA card used, application version, BOINC client version used and that link to your hanging task. |
Borgholio Joined: 2 Aug 99 Posts: 654 Credit: 18,623,738 RAC: 45 |
Jord wrote: "Please people, when you report hanging tasks do tell the basics about your system, OS, driver version, CUDA card used, application version, BOINC client version used and that link to your hanging task." OS: XP 32-bit. Drivers: 181.20 w/ GeForce 9600 GT. CUDA app: 6.08. BOINC: 6.4.5. Failed task: http://setiathome.berkeley.edu/workunit.php?wuid=398292134 |
Maik Joined: 15 May 99 Posts: 163 Credit: 9,208,555 RAC: 0 |
How does it look if the task is stuck? - Is the CPU time of the CUDA process (Windows Task Manager) 0? - What is the temperature of your GPU when a task is stuck, and when it is running? |
Borgholio Joined: 2 Aug 99 Posts: 654 Credit: 18,623,738 RAC: 45 |
Maik wrote: "How does it look if the task is stuck?" The CPU time of the hanging tasks is often at zero, but not always. The temperature of the GPU drops to idle as soon as the task hangs (50-51 °C at full load, ~40 °C at idle). |
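The exchange above hints at one rough way to spot a hang: check whether the CUDA process's CPU time has stopped advancing. Below is a minimal Python sketch of that check, offered as an illustration only. The function name `is_stalled`, the sampling window, and the threshold are all hypothetical, and because the CPU time of hung tasks is reported as "often at zero, but not always", this is a heuristic at best. It is not how BOINC or the CUDA app detects hangs.

```python
def is_stalled(cpu_seconds, window=3, min_delta=0.5):
    """Return True if CPU time advanced by less than `min_delta` seconds
    over the last `window` sampling intervals.

    cpu_seconds: cumulative CPU-time readings for one process, oldest first
                 (for example, read by hand from Task Manager at fixed intervals).
    window, min_delta: hypothetical tuning values, not taken from the thread.
    """
    if len(cpu_seconds) < window + 1:
        return False  # not enough samples to judge yet
    recent = cpu_seconds[-(window + 1):]
    return recent[-1] - recent[0] < min_delta


# A task that keeps accumulating CPU time is not flagged.
print(is_stalled([10.0, 12.1, 14.3, 16.2]))
# A task whose CPU time has flat-lined over the window is flagged.
print(is_stalled([10.0, 12.1, 12.1, 12.1, 12.1]))
```

A GPU temperature falling back to idle, as described above, could serve as a second signal alongside CPU time, but reading it would need vendor tooling that is not covered here.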
javamann Joined: 14 Oct 02 Posts: 14 Credit: 2,877,279 RAC: 0 |
Same here. I just upgraded from version 6.2.19 to 6.6.2 and the CUDA job is not progressing. It shows "Running (0.18CPUs, 1 CUDA)". I am also running four Enhanced 6.03 tasks on the CPUs. Video card: BFG GTX280 1G RAM / water cooled. Video driver: 181.22 (just installed). OS: Windows XP SP3. BOINC: 6.6.2. CUDA: 6.08. CPU: AMD Phenom II X4 (3.62GHz). NOTE: When I added this cc_config (program and data directories), my GPU started processing units. I also did another BOINC restart, which might have fixed it too. cc_config: <cc_config> <options> <ncpus>5</ncpus> </options> </cc_config> |
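For readers who want to reproduce the cc_config quoted above, here is a small Python sketch that builds the same XML with proper indentation. The helper name `build_cc_config` is hypothetical, and the script only prints the text. Where the file must be saved depends on the BOINC installation, so no path is assumed here. `ET.indent` needs Python 3.9 or later.

```python
import xml.etree.ElementTree as ET


def build_cc_config(ncpus):
    """Return cc_config.xml text with the <ncpus> option set (hypothetical helper)."""
    root = ET.Element("cc_config")
    options = ET.SubElement(root, "options")
    ET.SubElement(options, "ncpus").text = str(ncpus)
    ET.indent(root, space="  ")  # pretty-print; requires Python 3.9+
    return ET.tostring(root, encoding="unicode")


# Same value as in the post above: 5 on a quad-core CPU.
print(build_cc_config(5))
```

Setting `ncpus` to 5 on a four-core machine tells the client to act as if one more CPU were present, which appears to be why the CUDA task got scheduled next to the four CPU tasks. As the post notes, the restart alone may also have been what fixed it.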
startrekforever Joined: 5 Dec 99 Posts: 36 Credit: 17,772,420 RAC: 0 |
My 6.08 CUDA task also hung. OS: XP 32-bit. Drivers: 181.20 w/ 9600 GT. CUDA app: 6.08. BOINC: 6.4.5. (Seems to be the same setup as Borgholio, hmm.) http://setiathome.berkeley.edu/workunit.php?wuid=398428488 The CPU is a Q6600 overclocked to 3.2; it should not be a factor, but I thought I'd mention it. This system is totally stable. Non-overclocked video card. Main monitor on a non-CUDA-capable 9300GT. This CUDA task was chugging along for a while and showed 7 hours to go before I noticed it was stuck at about 53% complete. 6.08 is much more stable than 6.06 was (no compute errors so far, just this one hang). I have another system (Q6600 at 3.5) running CUDA on a 260 card under XP 64-bit, with everything else, including the main monitor on a 7300GT, set up the same way, and no issues yet. |
Maik Joined: 15 May 99 Posts: 163 Credit: 9,208,555 RAC: 0 |
@ Al: I have a 9600GT too, same driver, quad-core ... I still have some problems with tasks, like you and Borgholio. Here is the log from a script I've written: 24.01.2009 14:32:32 > 15dc08ag.4820.11933.9.8.153_0 runtime 31s -> compute error? 24.01.2009 17:16:50 > 16dc08aa.9825.5389.12.8.87_0 runtime 357s -> stuck ? 24.01.2009 19:06:19 > 16dc08aa.9825.5389.12.8.83_1 runtime 465s -> stuck ? 24.01.2009 19:16:24 > 15dc08ah.10456.1299.14.8.54_0 runtime 419s -> stuck ? 24.01.2009 22:52:57 > 15dc08ag.25773.23794.15.8.30_0 runtime 248s -> stuck ? 25.01.2009 00:15:35 > 15dc08ah.26302.24203.15.8.218_0 runtime 340s -> stuck ? 25.01.2009 04:01:56 > 16dc08aa.26261.24612.13.8.12_0 runtime 186s -> stuck ? 25.01.2009 04:35:47 > 16dc08aa.26261.24612.13.8.2_0 runtime 31s -> compute error? 25.01.2009 05:11:10 > 15dc08ag.25773.24203.15.8.255_0 runtime 31s -> compute error? 25.01.2009 05:20:12 > 15dc08ag.25773.24203.15.8.253_0 runtime 31s -> compute error? 25.01.2009 12:36:14 > 15dc08ah.26302.24612.15.8.116_1 runtime 303s -> stuck ? 25.01.2009 17:51:04 > 15dc08ah.26302.24612.15.8.252_1 runtime 31s -> compute error? If you want to try it, go to the Lunatics GPU forum. The script detects an idling feeder process when a WU is stuck and terminates it. BM then restarts the task, and it runs to 100% ... |
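The log format in the post above is regular enough to tally by machine. The Python sketch below parses a few of the quoted entries and counts the verdicts. It is not Maik's script, which is not shown in this thread. The regular expression and the names `LINE_RE` and `parse_line` are assumptions based only on the lines quoted here.

```python
import re
from collections import Counter
from datetime import datetime

# Pattern inferred from the quoted log lines, e.g.
# "24.01.2009 17:16:50 > 16dc08aa.9825.5389.12.8.87_0 runtime 357s -> stuck ?"
LINE_RE = re.compile(
    r"(?P<ts>\d{2}\.\d{2}\.\d{4} \d{2}:\d{2}:\d{2}) > "
    r"(?P<task>\S+) runtime (?P<secs>\d+)s -> (?P<verdict>.+?)\s*\?"
)


def parse_line(line):
    """Parse one log entry into a dict, or return None if it does not match."""
    m = LINE_RE.search(line)
    if m is None:
        return None
    return {
        "time": datetime.strptime(m["ts"], "%d.%m.%Y %H:%M:%S"),
        "task": m["task"],
        "runtime_s": int(m["secs"]),
        "verdict": m["verdict"],
    }


# Four entries copied from the post above.
sample = [
    "24.01.2009 14:32:32 > 15dc08ag.4820.11933.9.8.153_0 runtime 31s -> compute error?",
    "24.01.2009 17:16:50 > 16dc08aa.9825.5389.12.8.87_0 runtime 357s -> stuck ?",
    "24.01.2009 19:06:19 > 16dc08aa.9825.5389.12.8.83_1 runtime 465s -> stuck ?",
    "25.01.2009 04:35:47 > 16dc08aa.26261.24612.13.8.2_0 runtime 31s -> compute error?",
]

entries = [e for e in map(parse_line, sample) if e is not None]
counts = Counter(e["verdict"] for e in entries)
print(counts)
```

One pattern is visible in the quoted log even without code: every entry marked "compute error?" has a runtime of exactly 31s, while the "stuck ?" entries range from 186s to 465s.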
Borgholio Joined: 2 Aug 99 Posts: 654 Credit: 18,623,738 RAC: 45 |
Here is another hanging task: http://setiathome.berkeley.edu/result.php?resultid=1136193414 It hung at 6.742%. I'll let it run overnight and see if it un-sticks itself. |
Borgholio Joined: 2 Aug 99 Posts: 654 Credit: 18,623,738 RAC: 45 |
Quoting my previous post: "Here is another hanging task:" Well, it unstuck itself. However, since my last post, BOINC Manager suspended and resumed computation in order to run CPU benchmarks. I wonder if that is what unstuck the task. |
Borgholio Joined: 2 Aug 99 Posts: 654 Credit: 18,623,738 RAC: 45 |
OK, here's another one: http://setiathome.berkeley.edu/result.php?resultid=1136193376 It started at 1:48am and is stuck at 98.817%. It has been that way for nearly 8 hours. Edit: Pausing and resuming the task helped. When I resumed it, however, the % completion dropped to 87.683%, so it doesn't seem to have checkpointed shortly before the task froze. |
Borgholio Joined: 2 Aug 99 Posts: 654 Credit: 18,623,738 RAC: 45 |
Another stuck task: http://setiathome.berkeley.edu/result.php?resultid=1138268168 Task resumed after pausing / unpausing. |
Borgholio Joined: 2 Aug 99 Posts: 654 Credit: 18,623,738 RAC: 45 |
Another one: http://setiathome.berkeley.edu/result.php?resultid=1139774244 It seemed to hang at the same moment that BOINC suspended an AP task and resumed one from World Community Grid. |
Borgholio Joined: 2 Aug 99 Posts: 654 Credit: 18,623,738 RAC: 45 |
And another: http://setiathome.berkeley.edu/result.php?resultid=1141356737 As an aside, should I keep reporting these here? I don't want to keep spamming stuck workunits if this is already being looked at, or if this is the wrong place to do it... |
Jord Joined: 9 Jun 99 Posts: 15184 Credit: 4,362,181 RAC: 3 |
Yes please, so Eric can send those tasks to the Nvidia developer. |
Borgholio Joined: 2 Aug 99 Posts: 654 Credit: 18,623,738 RAC: 45 |
Task hung at 5.261%: http://setiathome.berkeley.edu/result.php?resultid=1141356725 I suspended / resumed the task and it picked up at 4.613%. Crunching fine now. |
Borgholio Joined: 2 Aug 99 Posts: 654 Credit: 18,623,738 RAC: 45 |
Another stuck task: http://setiathome.berkeley.edu/result.php?resultid=1141507512 Hung at ~35%. Suspended / resumed and is working now. |
©2024 University of California
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.