Not sure what is happening to my completed WU's
Author | Message |
---|---|
Chris Adamek Joined: 15 May 99 Posts: 251 Credit: 434,772,072 RAC: 236 |
So a couple of weird things happened yesterday on one of my machines. First, a whole slew of WUs say they were abandoned because they ran out of time. This was not the case; their deadlines were well into the future. More concerning, the machine is getting new WUs, completing them, and reporting them, and then they seemingly disappear. My event log says they are completed and reported, but the machine info on the site does not show any work units reported in two days. Any thoughts on what might be going on? The machine in question: http://setiathome.berkeley.edu/results.php?hostid=7159696 Thanks, Chris |
OzzFan Joined: 9 Apr 02 Posts: 15691 Credit: 84,761,841 RAC: 28 |
All workunits are removed from the website within 24 hours once a quorum is met. |
Chris Adamek Joined: 15 May 99 Posts: 251 Credit: 434,772,072 RAC: 236 |
Yes, but these are freshly downloaded WUs that have not been completed by anyone else yet either, so they aren't just getting validated and trashed by the server. My RAC has also dropped, so I can't really tell whether I am even getting any credit for them. Application details also indicate that basically no WUs have been completed in the past 24 hrs. Thanks, Chris |
HAL9000 Joined: 11 Sep 99 Posts: 6534 Credit: 196,805,888 RAC: 57 |
"So a couple of weird things happened yesterday on one of my machines. First a whole slew of WU's say they were abandoned because they ran out of time. This was not the case, their deadlines were well into the future. But more concerning is that the machine is getting new WUs and completing them, reporting them and then they seemingly disappear... My event log says they are completed and reported but the machine info on their site does not indicate that it has reported any work units in two days... Any thoughts on what might be going on?" Not sure, but the easiest way to find out what is happening to the completed work would be to track the tasks that your machine is actively working on. These are currently the two oldest tasks on that system: http://setiathome.berkeley.edu/workunit.php?wuid=1712359086 http://setiathome.berkeley.edu/workunit.php?wuid=1712359013 If BOINC is running FIFO, as it normally does, they should be the next to be completed. SETI@home classic workunits: 93,865 CPU time: 863,447 hours Join the BP6/VP6 User Group: http://tinyurl.com/8y46zvu |
Richard Haselgrove Joined: 4 Jul 99 Posts: 14650 Credit: 200,643,578 RAC: 874 |
"All workunits are removed from the website within 24 hours once a quorum is met." Check the link, Ozz - especially the 'error' filter. This is a rare, but previously observed, problem. The older tasks have been marked 'abandoned' on the server, but your BOINC client doesn't know that. You are probably still crunching the 'abandoned' tasks and reporting them - but nothing you can do will reverse the 'abandoned' outcome. The best you can do is to identify the first task newly issued after 20 Feb 2015, 3:10:15 UTC, and start crunching again from that point forward (checking that it's still shown as 'in progress' on the task list). Just cut your losses and abort any of the abandoned ones still on your computer. |
kittyman Joined: 9 Jul 00 Posts: 51468 Credit: 1,018,363,574 RAC: 1,004 |
I thought this had been sorted years ago. Tasks, when resent, often get rerouted to the GPU rather than the CPU they were originally assigned to. I have thousands of WUs in play at any given time, and this only happens once in a blue moon. "Freedom is just Chaos, with better lighting." Alan Dean Foster |
Richard Haselgrove Joined: 4 Jul 99 Posts: 14650 Credit: 200,643,578 RAC: 874 |
"I thought this had been sorted years ago." Different problem, different server message. We had a spate of these during the network congestion period, about two years ago. I've still got the sched_request/reply files that some users sent in at my request when they experienced it (19 March 2013, it looks like), but I couldn't work out what had gone wrong. We really need someone able (and willing) to dig out the specific host server log files to cover the exact event - and we all know exactly how much spare time Eric has to babysit that (zilch). We moved to the colo shortly after my attempted research, and the incidence of the problem dropped sharply - though not to zero, as Chris A has demonstrated. |
Chris Adamek Joined: 15 May 99 Posts: 251 Credit: 434,772,072 RAC: 236 |
Well, I thought if I cleared my cache it would straighten itself out. Unfortunately no, they are still disappearing into the ether. I'll track the ones it completes to see where they go... Thanks, Chris |
HAL9000 Joined: 11 Sep 99 Posts: 6534 Credit: 196,805,888 RAC: 57 |
"Well I thought if I cleared my cache it would straighten itself out. Unfortunately no, they are still disappearing into the ether. I'll track the ones it completes to see where they go..." Looks like you dumped the tasks at 21 Feb 2015, 22:19:13 UTC. That host shows 0 in progress and no new tasks downloaded since then. If it happens to be crunching anything for SETI@home, it is not associated with that host ID. |
Chris Adamek Joined: 15 May 99 Posts: 251 Credit: 434,772,072 RAC: 236 |
Yeah, right now it is not crunching anything. I didn't have any WUs that were expired, so I'm not sure what the problem was. We'll see if it persists when I get some new work. At the moment it is irritated that I abandoned so many tasks today, so I expect it will be tomorrow before it gets anything new. Thanks, Chris |
Chris Adamek Joined: 15 May 99 Posts: 251 Credit: 434,772,072 RAC: 236 |
It seems to have straightened itself out once I got work again this morning. Chris |
Richard Haselgrove Joined: 4 Jul 99 Posts: 14650 Credit: 200,643,578 RAC: 874 |
"It seems to have straightened itself out once I got work again this morning." Yes, this bug manifests itself like that - it only affects the tasks 'in progress' at the time shown against the tasks marked 'abandoned'. Anything downloaded after that time is OK - you aborted more than you needed to. |
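
The rule described in the thread (only tasks in progress at the time shown against the 'abandoned' tasks are affected, and anything downloaded later is fine) can be sketched roughly as below. This is only an illustration: the task names and sent times are made up, the function `split_tasks` is hypothetical, and nothing here talks to BOINC or the SETI@home server. It also assumes a task sent exactly at the cutoff counts as affected, which the thread does not state. You would still need to check the task list on the website to confirm a task is shown as 'in progress'.

```python
from datetime import datetime, timezone

# Time shown against the 'abandoned' tasks in this thread.
ABANDONED_AT = datetime(2015, 2, 20, 3, 10, 15, tzinfo=timezone.utc)


def split_tasks(tasks, cutoff=ABANDONED_AT):
    """Split (name, sent_time) pairs into (abort, keep) lists of names.

    Assumption: tasks sent at or before the cutoff were in progress when
    the server marked the host's work 'abandoned'; tasks sent later are
    treated as unaffected.
    """
    abort, keep = [], []
    for name, sent in tasks:
        (abort if sent <= cutoff else keep).append(name)
    return abort, keep


# Hypothetical task names and sent times, for illustration only.
example_tasks = [
    ("task_a", datetime(2015, 2, 19, 18, 0, 0, tzinfo=timezone.utc)),
    ("task_b", datetime(2015, 2, 20, 3, 10, 15, tzinfo=timezone.utc)),
    ("task_c", datetime(2015, 2, 20, 9, 30, 0, tzinfo=timezone.utc)),
]

abort, keep = split_tasks(example_tasks)
print("abort:", abort)
print("keep:", keep)
```

Applied to the situation above, only the tasks in the first list would have needed aborting; the rest could have been left to finish.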