Not sure what is happening to my completed WU's

Chris Adamek
Volunteer tester
Joined: 15 May 99
Posts: 251
Credit: 434,772,072
RAC: 236
United States
Message 1644945 - Posted: 21 Feb 2015, 15:41:06 UTC

So a couple of weird things happened yesterday on one of my machines. First, a whole slew of WUs say they were abandoned because they ran out of time. That was not the case; their deadlines were well into the future. More concerning, the machine is getting new WUs, completing them, and reporting them, and then they seemingly disappear... My event log says they were completed and reported, but the machine info on the project site does not show that it has reported any work units in two days... Any thoughts on what might be going on?

http://setiathome.berkeley.edu/results.php?hostid=7159696

That's the machine in question...

Thanks,

Chris
ID: 1644945
OzzFan (Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor)
Volunteer tester
Joined: 9 Apr 02
Posts: 15691
Credit: 84,761,841
RAC: 28
United States
Message 1644953 - Posted: 21 Feb 2015, 15:55:01 UTC - in response to Message 1644945.  

All workunits are removed from the website within 24 hours once a quorum is met.
ID: 1644953
Chris Adamek
Volunteer tester
Joined: 15 May 99
Posts: 251
Credit: 434,772,072
RAC: 236
United States
Message 1644955 - Posted: 21 Feb 2015, 16:03:05 UTC - in response to Message 1644953.  

Yes, but these are freshly downloaded WUs that have not been completed by anyone else yet either, so they aren't just getting validated and trashed by the server. My RAC has also dropped, so I can't really tell whether I am even getting any credit for them. The application details page also indicates that basically no WUs have been completed in the past 24 hrs...

Thanks,

Chris
ID: 1644955
HAL9000
Volunteer tester
Joined: 11 Sep 99
Posts: 6534
Credit: 196,805,888
RAC: 57
United States
Message 1644956 - Posted: 21 Feb 2015, 16:03:28 UTC - in response to Message 1644945.  

So a couple of weird things happened yesterday on one of my machines. First, a whole slew of WUs say they were abandoned because they ran out of time. That was not the case; their deadlines were well into the future. More concerning, the machine is getting new WUs, completing them, and reporting them, and then they seemingly disappear... My event log says they were completed and reported, but the machine info on the project site does not show that it has reported any work units in two days... Any thoughts on what might be going on?

http://setiathome.berkeley.edu/results.php?hostid=7159696

That's the machine in question...

Thanks,

Chris

Not sure, but the easiest way to find out what is happening to the completed work would be to track the tasks that your machine is actively working on.
These are currently the two oldest tasks on that system:
http://setiathome.berkeley.edu/workunit.php?wuid=1712359086
http://setiathome.berkeley.edu/workunit.php?wuid=1712359013
If BOINC is running FIFO, as it normally does, then they should be the next to be completed.
SETI@home classic workunits: 93,865 CPU time: 863,447 hours
Join the BP6/VP6 User Group: http://tinyurl.com/8y46zvu
ID: 1644956
Richard Haselgrove (Project Donor)
Volunteer tester
Joined: 4 Jul 99
Posts: 14650
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1644960 - Posted: 21 Feb 2015, 16:07:22 UTC - in response to Message 1644953.  

All workunits are removed from the website within 24 hours once a quorum is met.

Check the link, Ozz - especially the 'error' filter.

This is a rare, but previously observed, problem. The older tasks have been marked 'abandoned' on the server, but your BOINC client doesn't know that.

You are probably still crunching the 'abandoned' tasks and reporting them - but nothing you can do will reverse the 'abandoned' outcome.

The best you can do is to identify the first task newly-issued after 20 Feb 2015, 3:10:15 UTC, and start crunching again from that point forward (checking that it's still shown as 'in progress' on the task list). Just cut your losses and abort any of the abandoned ones still on your computer.
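
As a rough illustration of the cutoff rule described above, here is a minimal Python sketch that sorts a task list into "sent at or before the abandonment time" and "sent after it". It does not talk to BOINC or the project server at all: the function names, the task names, and the sent times are hypothetical placeholders, and the only thing borrowed from this thread is the timestamp format used on the task list pages. It also assumes an English locale so that month abbreviations like "Feb" parse.

```python
from datetime import datetime, timezone

# Timestamp layout as shown on the forum/task pages, e.g. "20 Feb 2015, 3:10:15 UTC".
STAMP_FORMAT = "%d %b %Y, %H:%M:%S UTC"


def parse_stamp(text):
    """Parse a timestamp string in the format above into an aware UTC datetime."""
    return datetime.strptime(text, STAMP_FORMAT).replace(tzinfo=timezone.utc)


def split_by_cutoff(tasks, cutoff_text):
    """Split (task_name, sent_time) pairs around the abandonment time.

    Tasks sent strictly after the cutoff go in the second list (worth keeping);
    everything else goes in the first list (candidates to abort). Treating a
    task sent exactly at the cutoff as affected is an assumption of this sketch.
    """
    cutoff = parse_stamp(cutoff_text)
    affected, unaffected = [], []
    for name, sent_text in tasks:
        if parse_stamp(sent_text) > cutoff:
            unaffected.append(name)
        else:
            affected.append(name)
    return affected, unaffected


if __name__ == "__main__":
    # Hypothetical task names and sent times, typed in by hand from a task list.
    example_tasks = [
        ("task_a", "19 Feb 2015, 22:00:00 UTC"),
        ("task_b", "20 Feb 2015, 4:05:00 UTC"),
    ]
    affected, unaffected = split_by_cutoff(example_tasks, "20 Feb 2015, 3:10:15 UTC")
    print("candidates to abort:", affected)
    print("keep crunching:", unaffected)
```

Whatever the sketch suggests, the task list on the website remains the authority: a task is only worth crunching if it is still shown there as 'in progress'.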
ID: 1644960
kittyman (Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor)
Volunteer tester
Joined: 9 Jul 00
Posts: 51468
Credit: 1,018,363,574
RAC: 1,004
United States
Message 1644964 - Posted: 21 Feb 2015, 16:13:22 UTC

I thought this had been sorted years ago.
Tasks, when resent, often get rerouted to the GPU rather than the CPU they were originally assigned to. I have thousands of WUs in play at any given time, and this only happens once in a blue moon.
"Freedom is just Chaos, with better lighting." Alan Dean Foster

ID: 1644964
Richard Haselgrove (Project Donor)
Volunteer tester
Joined: 4 Jul 99
Posts: 14650
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1644967 - Posted: 21 Feb 2015, 16:22:05 UTC - in response to Message 1644964.  

I thought this had been sorted years ago.
Tasks, when resent, often get rerouted to the GPU rather than the CPU they were originally assigned to. I have thousands of WUs in play at any given time, and this only happens once in a blue moon.

Different problem, different server message.

We had a spate of these during the network congestion period, about two years ago. I've still got the sched_request/reply files that some users sent in at my request when they experienced it (19 March 2013, it looks like), but I couldn't work out what had gone wrong. We really need someone able (and willing) to dig out the specific host server log files to cover the exact event - and we all know exactly how much spare time Eric has to babysit that (zilch).

We moved to the colo shortly after my attempted research, and the incidence of the problem dropped sharply - though not to zero, as Chris A has demonstrated.
ID: 1644967
Chris Adamek
Volunteer tester
Joined: 15 May 99
Posts: 251
Credit: 434,772,072
RAC: 236
United States
Message 1645121 - Posted: 21 Feb 2015, 23:53:11 UTC - in response to Message 1644967.  

Well, I thought that if I cleared my cache it would straighten itself out. Unfortunately not; they are still disappearing into the ether. I'll track the ones it completes to see where they go...

Thanks,

Chris
ID: 1645121
HAL9000
Volunteer tester
Joined: 11 Sep 99
Posts: 6534
Credit: 196,805,888
RAC: 57
United States
Message 1645130 - Posted: 22 Feb 2015, 0:15:36 UTC - in response to Message 1645121.  

Well, I thought that if I cleared my cache it would straighten itself out. Unfortunately not; they are still disappearing into the ether. I'll track the ones it completes to see where they go...

Thanks,

Chris

Looks like you dumped the tasks at 21 Feb 2015, 22:19:13 UTC. That host shows 0 in progress & no new tasks downloaded since then.
If it happens to be crunching anything for SETI@home it is not associated with that host ID.
ID: 1645130
Chris Adamek
Volunteer tester
Joined: 15 May 99
Posts: 251
Credit: 434,772,072
RAC: 236
United States
Message 1645165 - Posted: 22 Feb 2015, 3:31:25 UTC - in response to Message 1645130.  

Yeah, right now it is not crunching anything. I didn't have any WUs that were expired, so I'm not sure what the problem was. We'll see if it persists when I get some new work. At the moment it is irritated that I abandoned so many tasks today, so I expect it will be tomorrow before it gets anything new.

Thanks,

Chris
ID: 1645165
Chris Adamek
Volunteer tester
Joined: 15 May 99
Posts: 251
Credit: 434,772,072
RAC: 236
United States
Message 1645324 - Posted: 22 Feb 2015, 16:11:23 UTC - in response to Message 1645165.  

It seems to have straightened itself out once I got work again this morning.

Chris
ID: 1645324
Richard Haselgrove (Project Donor)
Volunteer tester
Joined: 4 Jul 99
Posts: 14650
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1645329 - Posted: 22 Feb 2015, 16:35:11 UTC - in response to Message 1645324.  

It seems to have straightened itself out once I got work again this morning.

Chris

Yes, this bug manifests itself like that - it only affects the tasks 'in progress' at the time shown against the tasks marked 'abandoned'. Anything downloaded after that time is OK - you aborted more than you needed to.
ID: 1645329
