Long CPU times


Message boards : Number crunching : Long CPU times

Bill G (Project donor)
Joined: 1 Jun 01
Posts: 471
Credit: 61,157,293
RAC: 78,831
United States
Message 1468132 - Posted: 24 Jan 2014, 11:30:21 UTC
Last modified: 24 Jan 2014, 11:30:56 UTC

Just noticed these rather long CPU times on a computer that errored out against one of my WUs. I hope the operator is able to fix this soon. I sent him an email asking him to come here, and I hope he sees it.

SETI@home classic workunits 4,019
SETI@home classic CPU time 34,348 hours

Richard Haselgrove (Project donor)
Volunteer tester
Joined: 4 Jul 99
Posts: 9868
Credit: 66,513,845
RAC: 43,075
United Kingdom
Message 1468138 - Posted: 24 Jan 2014, 11:43:48 UTC - in response to Message 1468132.

LOL - long indeed, about 600 years.

The interesting thing is that he's running BOINC v5.10.45, which doesn't report elapsed time and CPU time separately. If you look at his pending and valid tasks, you'll see that the server has recorded identical times in both columns - I think that's a server workaround for credit purposes.

It's just possible that this may be outside the user's control, though a reboot or other TLC is never a bad idea. On the other hand, if it is a server problem, I don't see how it could have happened, or what we could do about it.

William (Project donor)
Volunteer tester
Joined: 14 Feb 13
Posts: 1794
Credit: 10,519,512
RAC: 12,606
Message 1468147 - Posted: 24 Jan 2014, 12:13:12 UTC

The errors are -177 (runtime exceeded) aborts. Yes, I'd abort that too if it had been running for 600 years!

So I reckon something local on the host makes it go to unreasonable CPU times. BOINC sees an unreasonable CPU time and aborts the task with -177.
The task is returned to the server, and the server runs a sanity check on the CPU time: it can't be larger than (time received - time sent). The run time (for credit purposes) is set to the maximum possible time, i.e. time received - time sent.
That explains the discrepancy between the fields (the host has a few good tasks, and a pending one with an unreasonable run time).
Looking through the errors, that host seems to have been having problems with that for quite some time.
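
To make the sanity check concrete, here is a rough Python sketch of the clamping as I understand it. This is not the actual server code: the function name `clamp_run_time` and the example timestamps are made up for illustration, and the only rule it encodes is "run time can't exceed time received - time sent".

```python
from datetime import datetime

def clamp_run_time(reported_cpu_seconds, time_sent, time_received):
    """Hypothetical helper: cap a reported CPU time at the interval
    during which the task was actually out on the host."""
    max_possible = (time_received - time_sent).total_seconds()
    return min(reported_cpu_seconds, max_possible)

# Illustrative values only (not taken from the host in question).
sent = datetime(2014, 1, 20, 8, 0, 0)
received = datetime(2014, 1, 22, 8, 0, 0)
insane_cpu = 600 * 365.25 * 24 * 3600  # roughly 600 years, in seconds

# The insane value is cut down to the two days the task was out.
credited = clamp_run_time(insane_cpu, sent, received)

# A sane one-hour CPU time passes through unchanged.
normal = clamp_run_time(3600, sent, received)
```

With these made-up timestamps the task was out for two days, so the 600-year figure would be replaced by 172,800 seconds, while a normal task keeps its reported time.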

So, what on earth might lead to a host suddenly getting an insane CPU time field? Memory glitch?
A person who won't read has no advantage over one who can't read. (Mark Twain)


Copyright © 2015 University of California