Error while computing

Message boards : Number crunching : Error while computing

bill_mole
Volunteer tester
Joined: 10 Sep 01
Posts: 57
Credit: 1,671,789
RAC: 0
United Kingdom
Message 1028861 - Posted: 27 Aug 2010, 9:47:33 UTC
Last modified: 27 Aug 2010, 9:49:51 UTC

I have looked through the forum, and can't find an explanation for this message - maybe I looked in the wrong place. If so, please point me to the answers.

I have been getting this message more frequently lately. As far as I know I have not changed anything since things were running smoothly. Everything else I run seems OK.

Example cases:
Task 1680837159
Task 1680826663
Task 1680826585
Task 1659527842
Task 1659527838

Any suggestions - or do I just take the hit and waste compute time on dud units?

These units seem to run on and on, long after the time-to-complete counter hits zero.

Sutaru Tsureku
Volunteer tester
Joined: 6 Apr 07
Posts: 7105
Credit: 147,663,825
RAC: 5
Germany
Message 1028862 - Posted: 27 Aug 2010, 10:00:44 UTC - in response to Message 1028861.  
Last modified: 27 Aug 2010, 10:11:07 UTC

http://setiathome.berkeley.edu/result.php?resultid=1680837159
http://setiathome.berkeley.edu/result.php?resultid=1680826663
http://setiathome.berkeley.edu/result.php?resultid=1680826585
http://setiathome.berkeley.edu/result.php?resultid=1659527842
http://setiathome.berkeley.edu/result.php?resultid=1659527838

AFAIK, this -177 error happens if the estimated times in your BOINC Manager are far off from the real calculation times.

You are using the stock/original project applications, so currently I can't imagine what happened there.

To prevent this from happening again you could use Fred's BOINC Rescheduler; this tool will prevent this kind of error.

There is a download URL for this tool in the 'quick instruction' in my profile.

But take care, this tool is only for advanced users!
If you are uncertain, or have problems or questions, ask before you run this tool.

Bernd Noessler
Joined: 15 Nov 09
Posts: 99
Credit: 52,635,434
RAC: 0
Germany
Message 1028879 - Posted: 27 Aug 2010, 12:14:42 UTC

Sorry Sutaru, but I have a different opinion.

The tasks with the -177 crunched 10 times as long as normal. That is what the
-177 errors are meant for.
Using the Rescheduler doesn't make sense in this case.

The seti client had problems selecting the optimal functions. Maybe it's
a hardware problem. I would have a look at the CPU temps first.
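
To make the time limit concrete, here is a minimal Python sketch of the kind of check that ends a task with -177. It assumes the client compares elapsed time against the workunit's `rsc_fpops_bound` divided by the host's estimated speed. The function names and all numbers are made up for illustration and are not taken from the BOINC source.

```python
# Illustrative sketch only; this is not BOINC source code.

# BOINC exit status reported as "maximum elapsed time exceeded".
ERR_RSC_LIMIT_EXCEEDED = -177


def max_elapsed_seconds(rsc_fpops_bound: float, host_flops: float) -> float:
    """Longest run time allowed before the task is given up on."""
    if host_flops <= 0:
        raise ValueError("host_flops must be positive")
    return rsc_fpops_bound / host_flops


def check_task(elapsed_seconds: float, rsc_fpops_bound: float, host_flops: float) -> int:
    """Return -177 if the task ran past its limit, otherwise 0."""
    limit = max_elapsed_seconds(rsc_fpops_bound, host_flops)
    return ERR_RSC_LIMIT_EXCEEDED if elapsed_seconds > limit else 0


# Hypothetical numbers: an estimate of 3e13 operations, a bound 10 times
# larger, and a host assumed to manage 2e9 operations per second.
RSC_FPOPS_EST = 3e13
RSC_FPOPS_BOUND = 10 * RSC_FPOPS_EST
HOST_FLOPS = 2e9

estimated_seconds = RSC_FPOPS_EST / HOST_FLOPS                  # 15000 s
limit_seconds = max_elapsed_seconds(RSC_FPOPS_BOUND, HOST_FLOPS)  # 150000 s
```

With these example numbers the bound is 10 times the estimate, which would match tasks being stopped only after running about 10 times as long as normal.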

perryjay
Volunteer tester
Joined: 20 Aug 02
Posts: 3377
Credit: 20,676,751
RAC: 0
United States
Message 1028883 - Posted: 27 Aug 2010, 12:49:59 UTC - in response to Message 1028861.  

Bill,
I see you are running AMD. This is a known problem with them doing just as you describe. A reboot usually cures it for awhile but it will start doing it again. The best way is to run optimized. That seems to take care of the problem. The Lunatics crew have taken down the installer right now but a new one should be coming very soon. Keep an eye out over there and get the new one as soon as they post it.


PROUD MEMBER OF Team Starfire World BOINC

Sutaru Tsureku
Volunteer tester
Joined: 6 Apr 07
Posts: 7105
Credit: 147,663,825
RAC: 5
Germany
Message 1028884 - Posted: 27 Aug 2010, 12:50:55 UTC - in response to Message 1028879.  

Hmm.. do you know Fred's BOINC Rescheduler tool? ;-)

It's not only for moving VLAR WUs from the GPU to the CPU..

It also modifies the client_state.xml file to prevent the -177 error.

Currently I must run it at least once a day.
All newly downloaded WUs have wrong estimated times, and if the BOINC Rescheduler didn't change the relevant entries in the client_state.xml file, all these WUs would get this -177 error.
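
For anyone curious which numbers are involved, the following read-only Python sketch lists the ratio between `rsc_fpops_bound` and `rsc_fpops_est` for each workunit in a client_state.xml-style document. The sample XML and workunit names are invented, the thread does not say exactly which entries the Rescheduler changes, and the sketch assumes the file parses as well-formed XML. Nothing here writes to the file.

```python
import xml.etree.ElementTree as ET

# Invented sample in the style of client_state.xml; a real file holds
# many more elements per workunit.
SAMPLE = """<client_state>
  <workunit>
    <name>wu_example_1</name>
    <rsc_fpops_est>30000000000000.000000</rsc_fpops_est>
    <rsc_fpops_bound>300000000000000.000000</rsc_fpops_bound>
  </workunit>
  <workunit>
    <name>wu_example_2</name>
    <rsc_fpops_est>30000000000000.000000</rsc_fpops_est>
    <rsc_fpops_bound>45000000000000.000000</rsc_fpops_bound>
  </workunit>
</client_state>"""


def bound_ratios(xml_text: str) -> dict:
    """Map each workunit name to rsc_fpops_bound / rsc_fpops_est."""
    root = ET.fromstring(xml_text)
    ratios = {}
    for wu in root.iter("workunit"):
        name = wu.findtext("name")
        est = float(wu.findtext("rsc_fpops_est"))
        bound = float(wu.findtext("rsc_fpops_bound"))
        ratios[name] = bound / est
    return ratios


ratios = bound_ratios(SAMPLE)
# A small ratio means little headroom if the run-time estimate is too low.
tight = sorted(name for name, ratio in ratios.items() if ratio < 2.0)
```

Always work on a copy of the file, and only while BOINC is stopped, if you go further than reading it.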

Sutaru Tsureku
Volunteer tester
Joined: 6 Apr 07
Posts: 7105
Credit: 147,663,825
RAC: 5
Germany
Message 1028890 - Posted: 27 Aug 2010, 13:17:45 UTC - in response to Message 1028883.  
Last modified: 27 Aug 2010, 13:20:52 UTC

Bill,
I see you are running AMD. This is a known problem with them doing just as you describe. A reboot usually cures it for awhile but it will start doing it again. The best way is to run optimized. That seems to take care of the problem. The Lunatics crew have taken down the installer right now but a new one should be coming very soon. Keep an eye out over there and get the new one as soon as they post it.


This is new to me..
Are you sure? If this is right, has no one looked at the stock/original S@h Enhanced application to prevent this?
Are there too few AMD CPUs out there to justify looking deeper into this problem?

IIRC, I never crunched with the new stock/original S@h Enhanced version on my AMD CPUs.
Always the optimized applications, and I never saw this -177 error. So at least the optimized application doesn't cause this error.

perryjay
Volunteer tester
Joined: 20 Aug 02
Posts: 3377
Credit: 20,676,751
RAC: 0
United States
Message 1028896 - Posted: 27 Aug 2010, 13:40:35 UTC - in response to Message 1028890.  
Last modified: 27 Aug 2010, 13:41:57 UTC

I think the -177 errors are new with the "New Credit System", but AMDs are known to hang on work units. The to-completion and elapsed times keep rising while the % done just sits there going nowhere. It doesn't happen that often, and a reboot usually cures it for a while, so it isn't a big enough deal for the SETI crew to dig into it, but going optimized seems to cure it for good.

One of the guys with AMD will probably be along to explain it a bit better, I've just seen it mentioned on here a few times.
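
The hang described above (elapsed time rising while % done stays put) can be spotted from a series of progress readings. This is a small Python sketch under that assumption. The sample format, the function name and the 30-minute threshold are hypothetical choices, not anything BOINC provides.

```python
def is_stalled(samples, min_stall_seconds=1800.0):
    """Report whether progress has been flat for at least min_stall_seconds.

    samples: list of (elapsed_seconds, fraction_done) pairs, oldest first.
    """
    if len(samples) < 2:
        return False
    last_elapsed, last_fraction = samples[-1]
    stall_start = last_elapsed
    # Walk backwards until we reach a reading with less progress than now.
    for elapsed, fraction in reversed(samples):
        if fraction < last_fraction:
            break
        stall_start = elapsed
    return (last_elapsed - stall_start) >= min_stall_seconds


# Made-up readings: progress stops at 20% while elapsed time keeps rising.
hung = [(0, 0.0), (600, 0.1), (1200, 0.2), (1800, 0.2), (3600, 0.2)]
healthy = [(0, 0.0), (600, 0.1), (1200, 0.2)]

hung_result = is_stalled(hung)
healthy_result = is_stalled(healthy)
```

A task flagged this way is only a candidate for a closer look; some work units legitimately report progress in large steps.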


Sutaru Tsureku
Volunteer tester
Joined: 6 Apr 07
Posts: 7105
Credit: 147,663,825
RAC: 5
Germany
Message 1028909 - Posted: 27 Aug 2010, 14:42:20 UTC - in response to Message 1028883.  

(...)
The Lunatics crew have taken down the installer right now but a new one should be coming very soon. Keep an eye out over there and get the new one as soon as they post it.


http://lunatics.kwsn.net/index.php?module=Downloads;catd=9

The Win98me Installer is still there.. ;-)

The WinXP+ 32-bit and 64-bit installers are gone.

..strange.

perryjay
Volunteer tester
Joined: 20 Aug 02
Posts: 3377
Credit: 20,676,751
RAC: 0
United States
Message 1028921 - Posted: 27 Aug 2010, 15:31:13 UTC - in response to Message 1028909.  

Not so strange Sutaru,
The main problem with the old 0.36 installer was people using it along with the new Fermi 4xx series graphics cards. People weren't reading and would just use 0.36 without changing the CUDA section causing bad results. The new 0.37 installer fixes that. They figured anyone still running Win98 or ME would be highly unlikely to run a new Fermi graphics card.


JohnDK
Volunteer tester
Joined: 28 May 00
Posts: 1222
Credit: 451,243,443
RAC: 1,127
Denmark
Message 1028927 - Posted: 27 Aug 2010, 16:05:00 UTC

The lunatics installer is still available on Arkayn's site for those not willing to wait for v.37...

perryjay
Volunteer tester
Joined: 20 Aug 02
Posts: 3377
Credit: 20,676,751
RAC: 0
United States
Message 1028931 - Posted: 27 Aug 2010, 16:13:10 UTC - in response to Message 1028927.  

The lunatics installer is still available on Arkayn's site for those not willing to wait for v.37...


Thanks JohnDK but please, please, please stress that anyone trying it for the first time read all the warnings and for those with the new graphics cards to pay close attention to how to set up for them.



perryjay
Volunteer tester
Joined: 20 Aug 02
Posts: 3377
Credit: 20,676,751
RAC: 0
United States
Message 1028954 - Posted: 27 Aug 2010, 17:28:24 UTC
Last modified: 27 Aug 2010, 17:35:33 UTC

Just as an example, this is one of my wingmen that just turned in at least three -9 results. I didn't look to see how many more he might have.

setiathome_CUDA: CUDA Device 2 specified, checking...
Device 2: GeForce GTX 480 is okay
SETI@home using CUDA accelerated device GeForce GTX 480
V12 modification by Raistmer
Priority of worker thread rised successfully
Priority of process adjusted successfully
Total GPU memory 1576468480 free GPU memory 1049001984
setiathome_enhanced 6.02 Visual Studio/Microsoft C++


Wanna see something ugly? Here's his list of tasks marked invalid...
http://setiathome.berkeley.edu/results.php?hostid=5293938&offset=0&show_names=0&state=4


JohnDK
Volunteer tester
Joined: 28 May 00
Posts: 1222
Credit: 451,243,443
RAC: 1,127
Denmark
Message 1028958 - Posted: 27 Aug 2010, 17:37:51 UTC - in response to Message 1028931.  

The lunatics installer is still available on Arkayn's site for those not willing to wait for v.37...


Thanks JohnDK but please, please, please stress that anyone trying it for the first time read all the warnings and for those with the new graphics cards to pay close attention to how to set up for them.

There's a warning on the site about fermi cards but guess it can slip through anyway...

JohnDK
Volunteer tester
Joined: 28 May 00
Posts: 1222
Credit: 451,243,443
RAC: 1,127
Denmark
Message 1028959 - Posted: 27 Aug 2010, 17:42:26 UTC - in response to Message 1028954.  

Wanna see something ugly? Here's his list of tasks marked invalid...
http://setiathome.berkeley.edu/results.php?hostid=5293938&offset=0&show_names=0&state=4

Maybe sending him a PM might be good :D

Sutaru Tsureku
Volunteer tester
Joined: 6 Apr 07
Posts: 7105
Credit: 147,663,825
RAC: 5
Germany
Message 1028961 - Posted: 27 Aug 2010, 17:50:30 UTC
Last modified: 27 Aug 2010, 18:13:18 UTC

Anyone who has a GTX4xx (Fermi) GPU and no knowledge of editing app_info.xml files could use the Lunatics Installer V0.36, but without selecting the MultiBeam CUDA app.
That means only optimized project applications for the CPU. The GPU stays idle, at least for S@h.


[EDIT: OTOH, PCs with GTX4xx (Fermi) GPU/s could run only the stock/original project applications.
BOINC will download MB 6.03 (maybe also AstroPulse) and the suitable MB 6.10 CUDA application automatically.
CPU performance is then not as high as with optimized project applications, but that is better than trashing all CUDA WUs.
A GTX4xx outperforms any CPU out there, so overall the whole PC loses little performance.]


Anyone who knows how to edit app_info.xml files could look here: 'Running SETI@home on an nVidia Fermi GPU'.
The download URLs for the MB 6.10 CUDA application and the matching .dll files are in the 2nd message of that thread.


As always, if you are uncertain what to do, ask first.

:-)

perryjay
Volunteer tester
Joined: 20 Aug 02
Posts: 3377
Credit: 20,676,751
RAC: 0
United States
Message 1028962 - Posted: 27 Aug 2010, 17:51:05 UTC - in response to Message 1028959.  

I wasn't the first to be teamed with him. At least one other I know of has tried PMing him but got no reply.


Josef W. Segur
Volunteer developer
Volunteer tester
Joined: 30 Oct 99
Posts: 4504
Credit: 1,414,761
RAC: 0
United States
Message 1029009 - Posted: 27 Aug 2010, 20:14:34 UTC - in response to Message 1028921.  

Not so strange Sutaru,
The main problem with the old 0.36 installer was people using it along with the new Fermi 4xx series graphics cards. People weren't reading and would just use 0.36 without changing the CUDA section causing bad results. The new 0.37 installer fixes that. They figured anyone still running Win98 or ME would be highly unlikely to run a new Fermi graphics card.

LOL, NVIDIA have never made CUDA drivers, etc. for Win9X. Even Win2k isn't officially supported now, though when I checked last year it was possible to use the XP package.
                                                                 Joe

bill_mole
Volunteer tester
Joined: 10 Sep 01
Posts: 57
Credit: 1,671,789
RAC: 0
United Kingdom
Message 1029937 - Posted: 31 Aug 2010, 12:28:25 UTC

Thanks everyone for your help.

It looks like perryjay probably has the answer. I'm not really technical enough these days to mess with the setup/applications, and don't have the time to get into these things at the moment, as Real World issues are getting in the way, so rebooting occasionally sounds like a workaround I can live with for a while.


skildude
Joined: 4 Oct 00
Posts: 9541
Credit: 50,759,529
RAC: 60
Yemen
Message 1029951 - Posted: 31 Aug 2010, 13:47:46 UTC - in response to Message 1029937.  

Bill, the Lunatics installer is no harder than any other program to download and install. Just make sure you have BOINC turned off when you do it. The only real choice to make for your system is which SSE3 app to use for your AMD CPU. You'd need the one for AMD, which the installer clearly designates for you. Then it's just a matter of restarting BOINC and you'll never see WUs hanging again.


In a rich man's house there is no place to spit but his face.
Diogenes Of Sinope
 
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.