Completed WUs going nowhere?

Message boards : Number crunching : Completed WUs going nowhere?

D_S_Spence
Joined: 4 Dec 16
Posts: 7
Credit: 140,284
RAC: 955
Canada
Message 1883875 - Posted: 14 Aug 2017, 19:29:58 UTC

Hi everyone,

I have an old netbook that was working successfully on SETI@home WUs for a few weeks before I went on vacation. I left it running while on vacation, but there was a power failure while I was away, and then another power failure after that.

After I got back, turned the machine back on, and got BOINC running again, it appeared (from boincmgr) to be working. It has been submitting completed WUs most days, but the WUs aren't being counted. There is an increasingly large list of "In Progress" tasks, and the total credit has been stuck at 2977 for quite a while (maybe since before I went on vacation?).

Another strange symptom: the "number of times client has contacted server" count was 0 after the restart, and stayed at 0 through a few apparently completed WU uploads.

The computer is Squishy.

I'm just hoping someone has some pointers for figuring out what is going wrong.
ID: 1883875
Darrell (Project Donor)
Volunteer tester
Joined: 14 Mar 03
Posts: 267
Credit: 1,373,515
RAC: 450
United States
Message 1883882 - Posted: 14 Aug 2017, 20:10:07 UTC
Last modified: 14 Aug 2017, 20:23:05 UTC

Your Linux machine has a full cache of 56 days of work and your Windows 7 machine only has one task. Is it the one that is having problems?

Oops, I went and clicked on the link in your post, and it is the Linux one. Based on the time it took the one task reported on July 24th to complete, the other 41 tasks will take 56 days. It has contacted the server 5 times, and it will probably be a while before it needs to again.
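
As a rough check on that estimate, the figures above imply an average run time of about a day and a third per task. A minimal sketch of the arithmetic, assuming the tasks run one at a time and around the clock (the function name is only for illustration):

```python
def implied_hours_per_task(total_days: float, task_count: int) -> float:
    """Average run time per task, in hours, implied by a queue estimate.

    Assumes tasks run one after another, 24 hours a day.
    """
    if task_count <= 0:
        raise ValueError("task_count must be positive")
    return total_days * 24 / task_count


# Figures from this thread: 41 queued tasks, estimated at 56 days in total.
hours = implied_hours_per_task(56, 41)
print(f"{hours:.1f} hours per task")  # about 32.8
```

If the netbook only crunches part of the day, the real wall-clock time per task would be longer than this.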

Are you saying that the 41 tasks are done? In the Manager on the task page, is it set to show only active tasks or all tasks?
... and still I fear, and still I dare not laugh at the Mad Man!

Queen - The Prophet's Song
ID: 1883882
HAL9000
Volunteer tester
Joined: 11 Sep 99
Posts: 6466
Credit: 175,956,075
RAC: 56,339
United States
Message 1883910 - Posted: 14 Aug 2017, 22:25:15 UTC - in response to Message 1883875.  

If you check out the external stat sites you can kind of get an idea of what your computer was doing.
It has been at 2977 credit for a few days and received 988 credit on August 11th.


Validated tasks are cleaned up after ~24 hours.
SETI@home classic workunits: 93,865 CPU time: 863,447 hours
Join the BP6/VP6 User Group today!
ID: 1883910
D_S_Spence
Joined: 4 Dec 16
Posts: 7
Credit: 140,284
RAC: 955
Canada
Message 1884731 - Posted: 18 Aug 2017, 15:16:37 UTC

Thanks for the responses!

What I see now on an external site ( https://boincstats.com/en/stats/0/host/detail/8299899/charts ) matches what I remember, where no results were counted from August 3 to 14. The chart does not match the one posted above by HAL9000. There were definitely results prior to July 24.

After the power failure, the computer has been returning results from Aug 7 or 8 until now, but it only started getting points on the 15th.

There is a big list of "In Progress" tasks listed here on the website ( https://setiathome.berkeley.edu/results.php?hostid=8299899 ). Some of them are timing out now. But on the computer, there are currently only 7 tasks listed on BOINC Manager. 2 are in progress, 5 are waiting to run.

I think the "In Progress" WUs that don't show up on the BOINC Manager interface are in the projects/setiathome.berkeley.edu directory on the computer. Is there a way to get BOINC to see them again? Maybe some .xml file can be edited by hand to add them into the list again? Is there a way to cancel those WUs so that someone else can pick them up, instead of waiting for them to time out?

The units that got reported from the 7th or 8th to the 15th are still a mystery, as is why the stats show up differently on different sites at different times.

Maybe something got screwed up with the computer's ID?

At any rate, things seem to be returning to normal. Squishy is getting credit again. New WUs aren't disappearing.
ID: 1884731
Grant (SSSF)
Volunteer tester
Joined: 19 Aug 99
Posts: 8892
Credit: 115,219,704
RAC: 70,427
Australia
Message 1884811 - Posted: 18 Aug 2017, 22:05:59 UTC - in response to Message 1884731.  

There is a big list of "In Progress" tasks listed here on the website ( https://setiathome.berkeley.edu/results.php?hostid=8299899 ). Some of them are timing out now. But on the computer, there are currently only 7 tasks listed on BOINC Manager. 2 are in progress, 5 are waiting to run.

I think the "In Progress" WUs that don't show up on the BOINC Manager interface are in the projects/setiathome.berkeley.edu directory on the computer. Is there a way to get BOINC to see them again? Maybe some .xml file can be edited by hand to add them into the list again? Is there a way to cancel those WUs so that someone else can pick them up, instead of waiting for them to time out?

Ghosts.
Someone posted what is required to recover them in another thread, but it's a lot of mucking about.

They're generally the result of a network issue: the server allocates you work upon a request, and you start downloading it. Then, for whatever reason, you don't complete the download, but the Scheduler thinks you have. The end result is work that is in progress according to the BOINC servers but isn't actually on your system.
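
Put another way, a ghost is any task the server lists as in progress for a host that the client has no record of. A minimal sketch of that comparison, assuming the task names have been copied by hand from the website's task list and from BOINC Manager (the names below are made up for illustration and are not real SETI@home tasks):

```python
def find_ghosts(server_in_progress, client_tasks):
    """Return task names the server thinks are in progress but the client lacks."""
    return sorted(set(server_in_progress) - set(client_tasks))


# Hypothetical task names, for illustration only.
server_side = ["task_a", "task_b", "task_c", "task_d"]
client_side = ["task_b", "task_d"]

print(find_ghosts(server_side, client_side))  # ['task_a', 'task_c']
```

This only identifies the ghosts; it does not recover them.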
Grant
Darwin NT
ID: 1884811
Jeff Buck
Volunteer tester
Joined: 11 Feb 00
Posts: 1276
Credit: 134,491,865
RAC: 253,011
United States
Message 1884832 - Posted: 18 Aug 2017, 23:31:49 UTC - in response to Message 1884731.  

There is a big list of "In Progress" tasks listed here on the website ( https://setiathome.berkeley.edu/results.php?hostid=8299899 ). Some of them are timing out now. But on the computer, there are currently only 7 tasks listed on BOINC Manager. 2 are in progress, 5 are waiting to run.

I think the "In Progress" WUs that don't show up on the BOINC Manager interface are in the projects/setiathome.berkeley.edu directory on the computer. Is there a way to get BOINC to see them again? Maybe some .xml file can be edited by hand to add them into the list again? Is there a way to cancel those WUs so that someone else can pick them up, instead of waiting for them to time out?
You can't cancel them, but there is a way to recover "lost" tasks (or "ghosts") so they can still be run on your host. See Message 1865033 for step-by-step instructions.
ID: 1884832
D_S_Spence
Joined: 4 Dec 16
Posts: 7
Credit: 140,284
RAC: 955
Canada
Message 1889981 - Posted: 15 Sep 2017, 18:10:08 UTC - in response to Message 1884832.  

You can't cancel them, but there is a way to recover "lost" tasks (or "ghosts") so they can still be run on your host. See Message 1865033 for step-by-step instructions.


Thank you for the instructions, Jeff! I did try it a couple of times on Squishy, but didn't manage to do it. It has a tiny little screen I can barely read or arrange windows on.

Anyway, I went to install a VNC server on Squishy a couple of days ago and I realized what caused the original problem! I put Tiny Core Linux on that machine (quite an old version because there was a BOINC extension for it and not for any newer versions) and when I set it up, I made the home directory persistent, but I forgot to remove "home" from the list of places to make a backup of when shutting down or calling "backup" from the terminal.

Apparently on July 19, I executed a backup that stored everything in my home folder. Probably I rebooted the machine and I forgot to select "none" from the "Backup Options" dropdown. So when the power failures happened, each time the computer rebooted, it restored everything from July 19. So Squishy re-did some WUs, reported them, and got no credit. She also had no memory of the big queue of WUs she had downloaded before I went on vacation.

I've fixed the configuration so it shouldn't happen again.
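
For anyone wanting to check for the same mistake, here is a minimal sketch of the test. It assumes the usual Tiny Core layout, where the backup list is a plain text file with one path per line (normally /opt/.filetool.lst, as far as I know). The function name and the sample contents are only for illustration.

```python
def backup_includes_home(filetool_text: str) -> bool:
    """True if any entry in the backup list is 'home' or a path under it."""
    for line in filetool_text.splitlines():
        entry = line.strip().strip("/")
        if entry == "home" or entry.startswith("home/"):
            return True
    return False


# Sample contents, assumed for illustration (one path per line).
sample = "opt\nhome\n"
print(backup_includes_home(sample))  # True
```

To run it against a real system, read the file first and pass its text in. If the result is True while the home directory is also persistent, a reboot can restore an old snapshot over newer BOINC state, which is what happened here.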
ID: 1889981
