WU issues, 0% progress and stalling at 7-10%


Message boards : Number crunching : WU issues, 0% progress and stalling at 7-10%

Brian Koster
Joined: 3 Apr 01
Posts: 10
Credit: 1,566,901
RAC: 0
United States
Message 1001153 - Posted: 6 Jun 2010, 2:37:21 UTC

I've been having an issue with 6.03 Enhanced WUs: they either run for hours at 0.000% progress or, at best, run to around 7-10% progress and then stop dead.

I've tried exiting and restarting, and I've tried detaching.

The 6.09 CUDA stuff runs fine, as do Einstein, Rosetta, and Milkyway.
____________

Gundolf Jahn
Joined: 19 Sep 00
Posts: 3184
Credit: 359,533
RAC: 35
Germany
Message 1001192 - Posted: 6 Jun 2010, 7:07:39 UTC - in response to Message 1001153.

Since your computers are hidden, we can't check the stdout messages of your returned tasks.

Regards,
Gundolf
____________
Computers aren't everything in life. (Just kidding)

SETI@home classic workunits 3,758
SETI@home classic CPU time 66,520 hours

Brian Koster
Joined: 3 Apr 01
Posts: 10
Credit: 1,566,901
RAC: 0
United States
Message 1001294 - Posted: 6 Jun 2010, 16:07:37 UTC - in response to Message 1001192.

Fixed; my computers are no longer hidden.
____________

Josef W. Segur
Project donor
Volunteer developer
Volunteer tester
Joined: 30 Oct 99
Posts: 4306
Credit: 1,083,263
RAC: 1,454
United States
Message 1001315 - Posted: 6 Jun 2010, 17:34:07 UTC - in response to Message 1001294.

Thanks!

There has been a history of multicore AMD hosts hanging, first noted by Pappa (Al Reust) at SETI Beta before S@H Enhanced was released here. That was about 4 years ago, I guess. The 4 tasks you aborted seem to match the pattern of something going wrong during the "Optimal function choices:" tests. But your host is showing huge negative times for some of those tests, which I hadn't seen before. That gives me an idea which might possibly lead to a fix eventually.

The 7-10% stalls are almost certainly happening while those tests are rerun after a restart; the retesting isn't shown in the stderr then, but it is done nonetheless.

For now, the only known way to avoid the problem is to use third-party applications that don't do that testing at startup. If you've read other threads here, you'll have seen mention of the optimized applications available from Lunatics. Installing those would fix the problem as well as increase your host's productivity. But it would also oblige you to check here or at the Lunatics site often for any required updates, since BOINC cannot update those applications automatically.
Joe

Brian Koster
Joined: 3 Apr 01
Posts: 10
Credit: 1,566,901
RAC: 0
United States
Message 1001909 - Posted: 8 Jun 2010, 17:48:16 UTC - in response to Message 1001315.

Well... I guess I'll just have to abort as needed. A batch of 6.03 ran fine, then suddenly a WU at 66% just stopped. I let it go overnight to be sure, but it was still stalled today.
____________

ignorance is no excuse
Joined: 4 Oct 00
Posts: 9529
Credit: 44,433,321
RAC: 0
Korea, North
Message 1001972 - Posted: 9 Jun 2010, 1:59:47 UTC - in response to Message 1001909.

It's easier to head over to the Lunatics site and use the Unified Installer they created, which will automatically install all the apps and files you need to run the optimized apps that will prevent your WUs from freezing.

Honestly, it's no harder than installing any other software. Just make sure you stop BOINC, then start it again after the install. You shouldn't have any more problems after that.

You'll get your work done substantially faster than you currently do, with no hang-ups. What's to lose?
____________
In a rich man's house there is no place to spit but his face.
Diogenes Of Sinope

End terrorism by building a school

Brian Koster
Joined: 3 Apr 01
Posts: 10
Credit: 1,566,901
RAC: 0
United States
Message 1002783 - Posted: 11 Jun 2010, 2:29:03 UTC

Well... I installed the Lunatics optimized stuff, and it seems to be running fine...

...except for one thing...

...suddenly my CUDA WUs are showing 100-hour estimated run times. ROFL

If it weren't for the fact that BOINC now thinks they all need to run at high priority, it would be a chuckle.
____________

Hellsheep
Volunteer tester
Joined: 12 Sep 08
Posts: 428
Credit: 784,780
RAC: 0
Australia
Message 1002784 - Posted: 11 Jun 2010, 2:36:38 UTC - in response to Message 1002783.

I'm having the same issue. Read this:

http://setiathome.berkeley.edu/forum_thread.php?id=60285&nowrap=true#1002735

Richard has been a great help. I'm going to read up on it more before attempting it, and I'll probably let my cache run low too, just in case I screw up.
____________
- Jarryd

Brian Koster
Joined: 3 Apr 01
Posts: 10
Credit: 1,566,901
RAC: 0
United States
Message 1003381 - Posted: 12 Jun 2010, 3:20:18 UTC - in response to Message 1002784.

Heh... I ain't gonna worry about it. Unless WUs get scarce, it just means it'll be DLing one at a time, so <shrug> 8-)

____________

Copyright © 2014 University of California