WU issues, 0% progress and stalling at 7-10%

Message boards : Number crunching : WU issues, 0% progress and stalling at 7-10%

Brian Koster

Joined: 3 Apr 01
Posts: 10
Credit: 1,836,801
RAC: 318
United States
Message 1001153 - Posted: 6 Jun 2010, 2:37:21 UTC

I've been having an issue with 6.03 Enhanced WUs: they either run for hours at 0.000% progress or, at best, run to around 7-10% progress and then stop dead.

I've tried exiting and restarting, and I've tried detaching.

The 6.09 CUDA tasks run fine, as do Einstein, Rosetta, and Milkyway.
ID: 1001153
Gundolf Jahn
Joined: 19 Sep 00
Posts: 3184
Credit: 420,877
RAC: 83
Germany
Message 1001192 - Posted: 6 Jun 2010, 7:07:39 UTC - in response to Message 1001153.  

Since your computers are hidden, we can't check the stderr output of your returned tasks.

Regards,
Gundolf
Computers aren't everything in life. (Just kidding)

SETI@home classic workunits 3,758
SETI@home classic CPU time 66,520 hours
ID: 1001192
Brian Koster

Joined: 3 Apr 01
Posts: 10
Credit: 1,836,801
RAC: 318
United States
Message 1001294 - Posted: 6 Jun 2010, 16:07:37 UTC - in response to Message 1001192.  

Fixed
ID: 1001294
Josef W. Segur
Volunteer developer
Volunteer tester

Joined: 30 Oct 99
Posts: 4504
Credit: 1,414,761
RAC: 0
United States
Message 1001315 - Posted: 6 Jun 2010, 17:34:07 UTC - in response to Message 1001294.  

Thanks!

There has been a history of multicore AMD hosts hanging, first noted by Pappa (Al Reust) at SETI Beta before S@H Enhanced was released here. That was about 4 years ago, I guess. The 4 tasks you aborted seem to match the pattern of something going wrong during the "Optimal function choices:" tests. But your host is showing huge negative times for some of those tests, which I hadn't seen before. It gives me an idea which might possibly lead to a fix eventually.

The 7-10% stalls are almost certainly happening while those tests are run after a restart; the retesting isn't shown in the stderr then, but it is done nonetheless.
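
To illustrate why a negative timing is a problem for this kind of startup test, here is a minimal Python sketch. It is not the actual S@H code. It times several candidate functions and keeps the fastest one. A bogus negative time would always look "fastest", so the sketch rejects non-positive timings and retries. All names (`time_candidate`, `pick_fastest`, `max_retries`) are hypothetical, and whether the real hang is related to this at all is an assumption.

```python
import time


def time_candidate(func, arg, clock=time.perf_counter):
    """Return the elapsed time for one call of func(arg) using the given clock."""
    start = clock()
    func(arg)
    return clock() - start


def pick_fastest(candidates, arg, clock=time.perf_counter, max_retries=3):
    """Pick the candidate name with the smallest plausible (positive) timing.

    Non-positive timings are treated as timer glitches and retried.
    A candidate that never yields a plausible timing is skipped.
    Returns None if no candidate could be timed.
    """
    best_name, best_time = None, float("inf")
    for name, func in candidates.items():
        for _ in range(max_retries):
            elapsed = time_candidate(func, arg, clock)
            if elapsed > 0:
                break  # plausible timing, stop retrying
        else:
            continue  # every attempt was implausible, skip this candidate
        if elapsed < best_time:
            best_name, best_time = name, elapsed
    return best_name
```

The `clock` parameter is only there so a misbehaving timer can be simulated; with the default it uses Python's monotonic `time.perf_counter`.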

For now, the only known method of avoiding the problem is to use third-party applications which don't do that testing at startup. If you've read other threads here you'll have seen mention of those optimized applications available from Lunatics. Installing those would fix the problem as well as increase your host's productivity. But it also would lay an obligation on you to check here or at the Lunatics site often for any required updates, since BOINC cannot do automatic updates for those applications.
Joe
ID: 1001315
Brian Koster

Joined: 3 Apr 01
Posts: 10
Credit: 1,836,801
RAC: 318
United States
Message 1001909 - Posted: 8 Jun 2010, 17:48:16 UTC - in response to Message 1001315.  

Well, I guess I'll just have to abort as needed. A batch of 6.03 ran fine, then suddenly a WU at 66% just stopped. I let it go overnight to be sure, but it was still stalled today.
ID: 1001909
skildude
Joined: 4 Oct 00
Posts: 9529
Credit: 44,436,947
RAC: 0
Burma
Message 1001972 - Posted: 9 Jun 2010, 1:59:47 UTC - in response to Message 1001909.  

It's easier to head over to the Lunatics site and use the Unified Installer they created. It automatically installs all the apps and files you need to run the optimized apps, which will prevent your WUs from freezing.

Honestly, it's no harder than installing any other software. Just make sure you stop BOINC before the install and start it again afterwards. You shouldn't have any more problems after that.

You'll get your work done substantially faster than you currently do, with no hangups. What's to lose?
In a rich man's house there is no place to spit but his face.
Diogenes Of Sinope
ID: 1001972
Brian Koster

Joined: 3 Apr 01
Posts: 10
Credit: 1,836,801
RAC: 318
United States
Message 1002783 - Posted: 11 Jun 2010, 2:29:03 UTC

Well, I installed the Lunatics optimized apps, and they seem to be running fine ...

... except for one thing ...

... suddenly my CUDA WUs are showing 100-hour estimated run times. ROFL

If it weren't for the fact that BOINC now thinks they all need to run at high priority, it would be a chuckle.
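
As a rough illustration of how an estimate like that can balloon: as far as I understand it (this is an assumption, not something stated in this thread), the client's estimate is roughly the task's estimated operation count divided by the speed it assumes for the application, scaled by a correction factor. If the assumed speed is far too low for a GPU app, the estimate explodes. The function name and every number below are hypothetical and chosen only to show the arithmetic.

```python
def estimated_runtime_hours(fpops_est, flops, dcf=1.0):
    """Rough runtime estimate in hours.

    fpops_est: estimated floating-point operations for the task (hypothetical value)
    flops:     speed the client assumes for the app, in operations per second
    dcf:       correction factor applied on top of the raw estimate
    """
    if flops <= 0:
        raise ValueError("flops must be positive")
    seconds = fpops_est / flops * dcf
    return seconds / 3600.0


# Hypothetical task of 3.6e13 operations.
task_ops = 3.6e13

# Assumed speed appropriate for a fast GPU app: a short estimate.
gpu_like = estimated_runtime_hours(task_ops, flops=1e11)

# Assumed speed that is a thousand times too low: the estimate balloons.
too_slow = estimated_runtime_hours(task_ops, flops=1e8)

print(f"plausible estimate: {gpu_like:.2f} h, inflated estimate: {too_slow:.2f} h")
```

An estimate that large can make a task look like it will miss its deadline, which would explain the switch to high priority.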
ID: 1002783
Hellsheep
Volunteer tester

Joined: 12 Sep 08
Posts: 428
Credit: 784,780
RAC: 0
Australia
Message 1002784 - Posted: 11 Jun 2010, 2:36:38 UTC - in response to Message 1002783.  

I'm having the same issue, read this:

http://setiathome.berkeley.edu/forum_thread.php?id=60285&nowrap=true#1002735

Richard has been a great help. I'm going to read up more on it before attempting it, and probably let my cache run low too, just in case I screw up.
- Jarryd
ID: 1002784
Brian Koster

Joined: 3 Apr 01
Posts: 10
Credit: 1,836,801
RAC: 318
United States
Message 1003381 - Posted: 12 Jun 2010, 3:20:18 UTC - in response to Message 1002784.  

Heh, I ain't gonna worry about it. Unless WUs get scarce, it just means it'll be DLing one at a time, so <shrug> 8-)

ID: 1003381



 