Completed WUs going nowhere?

Message boards : Number crunching : Completed WUs going nowhere?
D_S_Spence

Joined: 4 Dec 16
Posts: 7
Credit: 315,189
RAC: 610
Canada
Message 1883875 - Posted: 14 Aug 2017, 19:29:58 UTC

Hi everyone,

I have an old netbook that was working successfully on SETI@home WUs for a few weeks before I went on vacation. I left it running while on vacation, but there was a power failure while I was away, and then another power failure after that.

After I got back, turned the machine back on, and got BOINC running again, it looks (from boincmgr) to be working. It has been submitting completed WUs most days, but the WUs aren't being counted. There is an increasingly large list of "In Progress" tasks, and the total credit has been stuck at 2977 for quite a while (maybe since before I went on vacation?).

Another strange symptom: the "number of times client has contacted server" number was 0 after restart, and stayed at 0 through a few apparent complete WU uploads.

The computer is Squishy.

I'm just hoping someone has some pointers for figuring out what is going wrong.
ID: 1883875
Darrell (Project Donor)
Volunteer tester
Joined: 14 Mar 03
Posts: 267
Credit: 1,399,508
RAC: 28
United States
Message 1883882 - Posted: 14 Aug 2017, 20:10:07 UTC
Last modified: 14 Aug 2017, 20:23:05 UTC

Your Linux machine has a full cache of 56 days of work and your Windows 7 machine only has one task. Is it the one that is having problems?

Oops, I went and clicked on the link in your post and it is the Linux one. Based upon the time it took the one reported task on July 24th to complete, the other 41 tasks will take 56 days. It has contacted the server 5 times, and it will probably be a while before it needs to again.
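The 56-day figure can be sanity-checked with a little arithmetic. This is only a sketch: the per-task time is back-computed from the two numbers quoted above (41 tasks, 56 days), not taken from the host's actual task record, and it assumes the netbook runs one task at a time around the clock. The function name is made up for illustration.

```python
def estimated_queue_days(task_count, hours_per_task):
    """Days needed to finish a queue, assuming one task runs at a time, 24 hours a day."""
    return task_count * hours_per_task / 24.0


# Per-task run time implied by "41 tasks will take 56 days" (an inference, not a measured value).
implied_hours_per_task = 56 * 24 / 41

print(round(implied_hours_per_task, 1))                         # hours per task
print(round(estimated_queue_days(41, implied_hours_per_task)))  # days for the whole queue
```

In other words, each task would have to take roughly a day and a third of continuous crunching for the estimate to hold.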

Are you saying that the 41 tasks are done? In the Manager on the task page, is it set to show only active tasks or all tasks?
... and still I fear, and still I dare not laugh at the Mad Man!

Queen - The Prophet's Song
ID: 1883882
HAL9000
Volunteer tester
Joined: 11 Sep 99
Posts: 6530
Credit: 184,495,338
RAC: 47,345
United States
Message 1883910 - Posted: 14 Aug 2017, 22:25:15 UTC - in response to Message 1883875.  

If you check out the external stat sites you can kind of get an idea of what your computer was doing.
It has been at 2977 credit for a few days and received 988 credit on August 11th.


Validated tasks are cleaned up after ~24 hours.
SETI@home classic workunits: 93,865 CPU time: 863,447 hours
Join the BP6/VP6 User Group today!
ID: 1883910
D_S_Spence

Joined: 4 Dec 16
Posts: 7
Credit: 315,189
RAC: 610
Canada
Message 1884731 - Posted: 18 Aug 2017, 15:16:37 UTC

Thanks for the responses!

What I see now on an external site ( https://boincstats.com/en/stats/0/host/detail/8299899/charts ) matches what I remember, where no results were counted from August 3 to 14. The chart does not match the one posted above by HAL9000. There were definitely results prior to July 24.

After the power failure, the computer has been returning results from Aug 7 or 8 until now, but it only started getting points on the 15th.

There is a big list of "In Progress" tasks listed here on the website ( https://setiathome.berkeley.edu/results.php?hostid=8299899 ). Some of them are timing out now. But on the computer, there are currently only 7 tasks listed on BOINC Manager. 2 are in progress, 5 are waiting to run.

I think the "In Progress" WUs that don't show up on the BOINC Manager interface are in the projects/setiathome.berkeley.edu directory on the computer. Is there a way to get BOINC to see them again? Maybe some .xml file can be edited by hand to add them into the list again? Is there a way to cancel those WUs so that someone else can pick them up, instead of waiting for them to time out?

The units that were reported from the 7th or 8th to the 15th are still a mystery, as is why the stats show up differently on different sites at different times.

Maybe something got screwed up with the computer's ID?

At any rate, things seem to be returning to normal. Squishy is getting credit again. New WUs aren't disappearing.
ID: 1884731
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 9807
Credit: 126,655,896
RAC: 85,439
Australia
Message 1884811 - Posted: 18 Aug 2017, 22:05:59 UTC - in response to Message 1884731.  

There is a big list of "In Progress" tasks listed here on the website ( https://setiathome.berkeley.edu/results.php?hostid=8299899 ). Some of them are timing out now. But on the computer, there are currently only 7 tasks listed on BOINC Manager. 2 are in progress, 5 are waiting to run.

I think the "In Progress" WUs that don't show up on the BOINC Manager interface are in the projects/setiathome.berkeley.edu directory on the computer. Is there a way to get BOINC to see them again? Maybe some .xml file can be edited by hand to add them into the list again? Is there a way to cancel those WUs so that someone else can pick them up, instead of waiting for them to time out?

Ghosts.
Someone posted what was required to recover them in another thread, but it's a lot of mucking about.

They're generally the result of a network issue: the server allocates you work upon a request, and you start downloading it. Then for whatever reason the download doesn't complete, but the scheduler thinks it has. The end result is work that is in progress according to the BOINC servers but isn't actually on your system.
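Put another way, a ghost is any task that appears in the server's "In Progress" list for the host but not in the list the client actually holds. A minimal sketch of that comparison is below; the function name and the task names are hypothetical, and how the two lists are gathered (for example, by reading them off the results page and BOINC Manager) is left out.

```python
def find_ghosts(server_in_progress, client_tasks):
    """Return task names the server counts as in progress that the client does not have."""
    return sorted(set(server_in_progress) - set(client_tasks))


# Made-up task names, purely for illustration.
on_server = ["task_a", "task_b", "task_c", "task_d"]
on_client = ["task_b", "task_d"]

print(find_ghosts(on_server, on_client))
```

Tasks that exist only on the client side are ignored here, since those are not ghosts.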
Grant
Darwin NT
ID: 1884811
Jeff Buck (Special Project $250 donor)
Volunteer tester

Joined: 11 Feb 00
Posts: 1441
Credit: 148,764,815
RAC: 12
United States
Message 1884832 - Posted: 18 Aug 2017, 23:31:49 UTC - in response to Message 1884731.  

There is a big list of "In Progress" tasks listed here on the website ( https://setiathome.berkeley.edu/results.php?hostid=8299899 ). Some of them are timing out now. But on the computer, there are currently only 7 tasks listed on BOINC Manager. 2 are in progress, 5 are waiting to run.

I think the "In Progress" WUs that don't show up on the BOINC Manager interface are in the projects/setiathome.berkeley.edu directory on the computer. Is there a way to get BOINC to see them again? Maybe some .xml file can be edited by hand to add them into the list again? Is there a way to cancel those WUs so that someone else can pick them up, instead of waiting for them to time out?
You can't cancel them, but there is a way to recover "lost" tasks (or "ghosts") so they can still be run on your host. See Message 1865033 for step-by-step instructions.
ID: 1884832
D_S_Spence

Joined: 4 Dec 16
Posts: 7
Credit: 315,189
RAC: 610
Canada
Message 1889981 - Posted: 15 Sep 2017, 18:10:08 UTC - in response to Message 1884832.  

You can't cancel them, but there is a way to recover "lost" tasks (or "ghosts") so they can still be run on your host. See Message 1865033 for step-by-step instructions.


Thank you for the instructions, Jeff! I did try it a couple of times on Squishy, but didn't manage to do it. It has a tiny little screen I can barely read or arrange windows on.

Anyway, I went to install a VNC server on Squishy a couple of days ago and realized what caused the original problem! I put Tiny Core Linux on that machine (quite an old version, because there was a BOINC extension for it and not for any newer versions). When I set it up, I made the home directory persistent, but I forgot to remove "home" from the list of places to back up when shutting down or calling "backup" from the terminal.

Apparently on July 19, I executed a backup that stored everything in my home folder. I probably rebooted the machine and forgot to select "none" from the "Backup Options" dropdown. So when the power failures happened, each time the computer rebooted, it restored everything from July 19. So Squishy re-did some WUs, reported them, and got no credit. She also had no memory of the big queue of WUs she had downloaded before I went on vacation.

I've fixed the configuration so it shouldn't happen again.
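
For anyone hitting the same thing, the fix amounts to taking "home" out of the backup list. The sketch below assumes the list is the plain-text file Tiny Core normally uses, /opt/.filetool.lst, with one path per line; the post doesn't name the file, so treat that as an assumption and check your own version. The function name is made up, and it only transforms text rather than touching any file.

```python
def drop_backup_entry(filetool_text, entry="home"):
    """Return the backup list text with the line naming `entry` removed.

    Only a line that is exactly the entry (ignoring surrounding whitespace and
    slashes) is dropped; deeper paths such as home/tc/.profile are kept.
    """
    kept = [
        line for line in filetool_text.splitlines()
        if line.strip().strip("/") != entry
    ]
    if not kept:
        return ""
    return "\n".join(kept) + "\n"


# Example contents, assumed for illustration.
before = "opt\nhome\n"
print(drop_backup_entry(before), end="")
```

With "home" gone from the list, a backup taken at shutdown no longer captures the BOINC data directory, so a later reboot has nothing stale to restore over the persistent home directory.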
ID: 1889981



 