Ghost work units


SciManStev (Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor)
Volunteer tester
Joined: 20 Jun 99
Posts: 6652
Credit: 121,090,076
RAC: 0
United States
Message 1023681 - Posted: 8 Aug 2010, 14:25:03 UTC

It does seem like a spiraling problem. I think ghosts have always been out there, but with the 3 day outages, and maxed bandwidth being sustained for longer times, the ghost army is becoming more powerful. I am not sure if it will continue to expand, or reach some limit. I wonder if a cruncher with a RAC of 10 could ever end up with a pending so large it couldn't be reported. Perhaps the new server will somehow reduce the ghost generation as all the servers are shuffled around. Right now what I'm seeing is that the pendings, pendings not working, and ghosts are all interconnected.

Steve
Warning, addicted to SETI crunching!
Crunching as a member of GPU Users Group.
GPUUG Website
ID: 1023681 · Report as offensive
James Nelson
Volunteer tester
Joined: 23 Mar 02
Posts: 381
Credit: 4,806,382
RAC: 0
United States
Message 1023682 - Posted: 8 Aug 2010, 14:26:39 UTC - in response to Message 1023669.  

OMG I hope I am doing this wrong....................

Due to the current project limits I have 800 wu on each of my computers at this moment. I checked the "In progress" list on my first computer and it tops out at 1802.

Please tell me I don't have 1002 ghost work units on that machine. I had no idea that this problem had grown to this magnitude. I thought I probably had a couple hundred or less. Now I have to seriously consider the detach-attach procedure but I don't want to lose the work I have on board. Any other options??


That number is all your WUs: in progress, pending, and credit granted. I'm not sure how long the granted ones stay visible before they are removed; 24 hours, I think.
ID: 1023682 · Report as offensive
Geek@Play
Volunteer tester
Joined: 31 Jul 01
Posts: 2467
Credit: 86,146,931
RAC: 0
United States
Message 1023686 - Posted: 8 Aug 2010, 14:37:35 UTC

I am looking only at the work "In progress" which removes pending and finished work granted credit from the list.

Here are the ghosts from my 4 active computers................

Computer 1: 1802 - 800 = 1002 ghosts
Computer 2: 2764 - 800 = 1964 ghosts
Computer 3:  903 - 800 =  103 ghosts
Computer 4:  842 - 800 =   42 ghosts


Computers 1 and 2 have a serious problem!
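
The subtraction above can be written as a short sketch. It assumes, as the post does, that each machine really holds 800 tasks (the current project limit) and that every other task in the website's "In progress" list is a ghost. The dictionary and function names are made up for illustration.

```python
# "In progress" counts shown on the website, keyed by computer number
# (the figures are the ones quoted in the post above).
in_progress_on_server = {1: 1802, 2: 2764, 3: 903, 4: 842}

# Tasks actually present on each machine (assumed equal to the project limit).
tasks_on_host = 800


def ghost_count(server_count, local_count):
    """Tasks the server lists as in progress that are not on the host."""
    return max(server_count - local_count, 0)


ghosts = {
    host: ghost_count(count, tasks_on_host)
    for host, count in in_progress_on_server.items()
}

for host, count in ghosts.items():
    print(f"Computer {host}: {count} ghosts")
print("Total ghosts:", sum(ghosts.values()))
```

If a machine holds fewer than 800 real tasks, the estimate is too low for that machine, so the local count is best read from the BOINC client rather than assumed.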
Boinc....Boinc....Boinc....Boinc....
ID: 1023686 · Report as offensive
perryjay
Volunteer tester
Joined: 20 Aug 02
Posts: 3377
Credit: 20,676,751
RAC: 0
United States
Message 1023687 - Posted: 8 Aug 2010, 14:38:40 UTC - in response to Message 1023682.  

James,
I started to post the same thing but then thought about it and went to look. There is a page that only shows work units in progress and not the pending or complete.


PROUD MEMBER OF Team Starfire World BOINC
ID: 1023687 · Report as offensive
Geek@Play
Volunteer tester
Joined: 31 Jul 01
Posts: 2467
Credit: 86,146,931
RAC: 0
United States
Message 1023696 - Posted: 8 Aug 2010, 14:57:57 UTC

What would happen if I did the detach-attach but quickly stopped the WU downloads after the attach? Stop BOINC and remove the account file account_setiathome.berkeley.edu.xml. Restart BOINC and join the SETI project again, which would get me a new client account number. Then let it download the 800 new WUs on a new client computer account.

What would happen then if I later merged the two computer accounts again?
Boinc....Boinc....Boinc....Boinc....
ID: 1023696 · Report as offensive
James Nelson
Volunteer tester
Joined: 23 Mar 02
Posts: 381
Credit: 4,806,382
RAC: 0
United States
Message 1023700 - Posted: 8 Aug 2010, 15:06:08 UTC - in response to Message 1023696.  

What would happen if I did the detach-attach but quickly stopped the WU downloads after the attach? Stop BOINC and remove the account file account_setiathome.berkeley.edu.xml. Restart BOINC and join the SETI project again, which would get me a new client account number. Then let it download the 800 new WUs on a new client computer account.

What would happen then if I later merged the two computer accounts again?


You wouldn't need to do all that. When you reattach, all the old work would get reassigned to other hosts, and you would download new work.
ID: 1023700 · Report as offensive
SciManStev (Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor)
Volunteer tester
Joined: 20 Jun 99
Posts: 6652
Credit: 121,090,076
RAC: 0
United States
Message 1023702 - Posted: 8 Aug 2010, 15:11:05 UTC - in response to Message 1023696.  
Last modified: 8 Aug 2010, 15:11:33 UTC

What would happen if I did the detach-attach but quickly stopped the WU downloads after the attach? Stop BOINC and remove the account file account_setiathome.berkeley.edu.xml. Restart BOINC and join the SETI project again, which would get me a new client account number. Then let it download the 800 new WUs on a new client computer account.

What would happen then if I later merged the two computer accounts again?

When I detach and reattach, I quickly stop the downloads. I hadn't seen it before, but Claggy's suggestion above of copying out the statistics XML file would be an excellent idea. I don't think the outcome would be good with real units still on your rig, but before I detach, I copy my seti directory to the desktop first. Then I do the detach and reattach and quickly stop the new work from downloading. I copy the contents of the saved seti directory back into the new seti directory, and re-enable new tasks. It frees the ghosts, and downloads new work. The app_info stays as it was. The statistics file should keep the statistics graph as it was as well. I am looking forward to trying that out.
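
The copy-out and copy-back steps described above can be sketched in Python. This is only an illustration on throwaway directories: the project directory name and the helper function names are assumptions, and the detach and reattach are done by hand in BOINC, not by this script. The poster's own caveat still applies: he does not expect a good outcome if real work units are still on the machine.

```python
import shutil
import tempfile
from pathlib import Path


def backup_project_dir(project_dir, backup_dir):
    """Copy the whole project directory aside before detaching."""
    shutil.copytree(project_dir, backup_dir)


def restore_project_dir(backup_dir, project_dir):
    """Copy the saved contents back into the freshly created project directory."""
    shutil.copytree(backup_dir, project_dir, dirs_exist_ok=True)


# Demonstration with temporary directories standing in for the real ones.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    project = root / "projects" / "setiathome.berkeley.edu"  # assumed layout
    project.mkdir(parents=True)
    (project / "app_info.xml").write_text("<app_info/>")

    backup = root / "desktop_copy"
    backup_project_dir(project, backup)

    # Stand-in for the manual detach and reattach: the directory is
    # emptied and recreated.
    shutil.rmtree(project)
    project.mkdir()

    restore_project_dir(backup, project)
    restored_files = sorted(p.name for p in project.iterdir())

print(restored_files)
```

`dirs_exist_ok=True` needs Python 3.8 or later. It lets the restore copy into a directory that the reattach has already created.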

Steve
Warning, addicted to SETI crunching!
Crunching as a member of GPU Users Group.
GPUUG Website
ID: 1023702 · Report as offensive
Geek@Play
Volunteer tester
Joined: 31 Jul 01
Posts: 2467
Credit: 86,146,931
RAC: 0
United States
Message 1023703 - Posted: 8 Aug 2010, 15:11:34 UTC
Last modified: 8 Aug 2010, 15:12:44 UTC

On the other hand maybe I should just wait and let Berkeley deal with the problem. They created it.

But that's not fair to my fellow crunchers. These ghosts don't even begin to expire until 21 August (UTC), another 12 days from now.
Boinc....Boinc....Boinc....Boinc....
ID: 1023703 · Report as offensive
Bearcat

Joined: 10 Sep 99
Posts: 106
Credit: 10,778,506
RAC: 0
United States
Message 1023717 - Posted: 8 Aug 2010, 15:59:05 UTC - in response to Message 1023703.  

Eventually the number of ghosts created per day and the number of ghost units timing out will balance, and the problem will not grow further.
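
A toy simulation shows why the population would level off. It assumes a constant number of new ghosts per day (1000 here, a made-up figure) and a single fixed deadline of 35 days, roughly the five weeks mentioned elsewhere in this thread for MB tasks. Real creation rates and deadlines vary, so this is a sketch of the argument, not a forecast.

```python
from collections import deque


def simulate_ghosts(created_per_day, deadline_days, days):
    """Return the ghost population at the end of each simulated day."""
    batches = deque()  # one entry per daily batch, oldest first
    history = []
    for _ in range(days):
        batches.append(created_per_day)
        if len(batches) > deadline_days:
            batches.popleft()  # the oldest batch reaches its deadline and times out
        history.append(sum(batches))
    return history


history = simulate_ghosts(created_per_day=1000, deadline_days=35, days=60)
print(history[0], history[34], history[59])
```

Under these assumptions the population stops growing once the first batch times out, at the creation rate multiplied by the deadline.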

The only remaining problem will be the pending units for big crunchers. Just put the total number of pending credit on the account page as one number instead of a separate line for each pending WU, and even the big crunchers can keep track of their numbers.
ID: 1023717 · Report as offensive
Geek@Play
Volunteer tester
Joined: 31 Jul 01
Posts: 2467
Credit: 86,146,931
RAC: 0
United States
Message 1023731 - Posted: 8 Aug 2010, 17:05:12 UTC

At one time a while back the pending total was listed on the account page but again it was a drain on the servers to keep it there for everyone. So they made it to be "on demand".
Boinc....Boinc....Boinc....Boinc....
ID: 1023731 · Report as offensive
Josef W. Segur
Volunteer developer
Volunteer tester

Joined: 30 Oct 99
Posts: 4504
Credit: 1,414,761
RAC: 0
United States
Message 1023743 - Posted: 8 Aug 2010, 17:39:48 UTC

Looking at Scarecrow's 60 day charts the overall "in progress" was averaging under 5 million tasks before the first 3 day outage, but has stepped up to about 6.5 million during the most recent outage. Some portion of that increase is due to users having selected longer cache durations, but the average turnaround time hasn't expanded much so I think many have not yet boosted cache. It seems ghosts might be a significant part of that "in progress" growth.

Given a typical deadline of about 5 weeks for MB tasks, the ghost population may not grow much more. I also have the impression fewer were created during recent cycles, I've seen a lot from the July 9 - 12 uptime...

My own modest hosts haven't gotten any, but because I'm on dial-up I try to only allow BOINC network activity when Cricket suggests it's likely to be successful. But my plans often go agley so I end up in the middle of a feeding frenzy.
                                                                 Joe
ID: 1023743 · Report as offensive
soft^spirit
Joined: 18 May 99
Posts: 6497
Credit: 34,134,168
RAC: 0
United States
Message 1023744 - Posted: 8 Aug 2010, 17:53:16 UTC - in response to Message 1023743.  

I believe staff was panicking while people were struggling to get an honest 3 days load. It would seem that most have adjusted accordingly as 10 days grew closer to 10 days.. and reduced the days requested.

The first outage and to a lesser extent the second and third outages definitely left the largest crunchers offline. Progress has been made.
I did have 21 report LATE after the last outage, and these may add to someone else's pending, but at least they are not ghosts. (late by about 1 hour.. it was tough to watch ;) )


Janice
ID: 1023744 · Report as offensive
Ivailo Bonev
Volunteer tester
Joined: 26 Jun 00
Posts: 247
Credit: 35,864,461
RAC: 2
Bulgaria
Message 1023813 - Posted: 8 Aug 2010, 22:52:21 UTC - in response to Message 1023743.  

Given a typical deadline of about 5 weeks for MB tasks, the ghost population may not grow much more. I also have the impression fewer were created during recent cycles, I've seen a lot from the July 9 - 12 uptime...

Yep, I have a "small" crunching PC with a RAC of 6000, and it has 7 ghost units from this period! Imagine how many some big crunching PC will have...
ID: 1023813 · Report as offensive
Zeus Fab3r
Joined: 17 Jan 01
Posts: 649
Credit: 275,335,635
RAC: 597
Serbia
Message 1023856 - Posted: 9 Aug 2010, 2:21:32 UTC - in response to Message 1023813.  

Yep, I have a "small" crunching PC with a RAC of 6000, and it has 7 ghost units from this period! Imagine how many some big crunching PC will have...


Currently I have 881 on my "medium" crunching rig :-)
Last night there were 940, and I just hate the thought of detaching...

Who the hell is General Failure and why is he reading my harddisk?¿
ID: 1023856 · Report as offensive
Uli
Volunteer tester
Joined: 6 Feb 00
Posts: 10923
Credit: 5,996,015
RAC: 1
Germany
Message 1023858 - Posted: 9 Aug 2010, 2:24:24 UTC

Lol, I only have one. It seems to be a ghost for my wingman too, so most likely it will just have to time out.
It is from 7/19.
Pluto will always be a planet to me.

Seti Ambassador
Not too late to order an Anni Shirt
ID: 1023858 · Report as offensive
MadMaC
Volunteer tester
Joined: 4 Apr 01
Posts: 201
Credit: 47,158,217
RAC: 0
United Kingdom
Message 1023909 - Posted: 9 Aug 2010, 9:11:19 UTC - in response to Message 1023858.  

I'm not sure if I have any, but my pending is shooting through the roof so I'm guessing that I do. Surely the best thing to do is just wait, and then the units will get resent out and then we will get the credit for them, but just a bit later than usual?
Not quite sure I understand the concept to be honest...
ID: 1023909 · Report as offensive
Area 51
Joined: 31 Jan 04
Posts: 965
Credit: 42,193,520
RAC: 0
United Kingdom
Message 1023915 - Posted: 9 Aug 2010, 11:45:16 UTC - in response to Message 1023909.  
Last modified: 9 Aug 2010, 11:46:33 UTC

Not quite sure I understand the concept to be honest...


Ghost units are simply where the servers at Berkeley think they have sent you some tasks that they have allocated to you, but the tasks never actually reach your machine(s). However, they are listed in the tasks allocated to your machine(s), Berkeley-side, so to all intents and purposes, Berkeley thinks you have the tasks when in actual fact you don't. Yes, leaving the tasks to time out ensures that they will be re-sent. However, this has a number of consequences:

1) Other people's pending (your wingmen) will rise, waiting for you to process a task that you never received. Conversely, if your wingmen have ghost tasks, and you have processed your corresponding task, your pending will rise whilst you wait for someone to process and return a matching ghost task (which of course they can't, because it's a ghost)!
2) The tasks sit on the servers at Berkeley, occupying disk space until enough tasks are returned such that quorum is reached (which may well be considerably after the original deadline).
3) Not pro-actively detaching/reattaching periodically means that if one of your ghosts times out, your quota for the day will be reset to the minimum -1. This will not be significant for you if you are running optimised apps, since the new quota system is not yet running for anonymous platforms. However, when the new quota system does come into force for you, you will be heavily restricted on your task downloads until you return a number of consecutive valid tasks (which in itself may be delayed by other people's ghosts.....and so on)!

So yes, it is a problem and one which really does need to be addressed - hopefully before the quota system is brought on-line for anonymous platforms otherwise there will be problems that may well have been considerably underestimated. I can easily foresee a situation where the more powerful machines could sit idle for large periods of the day, simply because they have to re-build their quota from a very low value.
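
The effect in point 1 can be illustrated with a small model. It assumes a quorum of 2, a ghost that is only resent once its deadline passes, and a fixed turnaround for the resent copy. The function name, the day numbers, and the turnaround figure are all hypothetical, and the real scheduler is more involved than this.

```python
def day_quorum_reached(return_days, deadline_day, resend_turnaround_days, quorum=2):
    """Day on which enough results are in for the work unit to validate.

    return_days holds one entry per task: the day it was returned, or None
    for a ghost. A ghost is assumed to be resent when its deadline passes and
    returned resend_turnaround_days later. The list must contain at least
    `quorum` entries.
    """
    effective_days = sorted(
        day if day is not None else deadline_day + resend_turnaround_days
        for day in return_days
    )
    return effective_days[quorum - 1]


# You return your task on day 2. In the first case your wingman returns on
# day 3; in the second case your wingman's copy is a ghost.
without_ghost = day_quorum_reached([2, 3], deadline_day=35, resend_turnaround_days=4)
with_ghost = day_quorum_reached([2, None], deadline_day=35, resend_turnaround_days=4)
print(without_ghost, with_ghost)
```

In this model the result you returned on day 2 validates on day 3 with a normal wingman, but sits in pending until day 39 when the wingman's copy is a ghost.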
ID: 1023915 · Report as offensive
MadMaC
Volunteer tester
Joined: 4 Apr 01
Posts: 201
Credit: 47,158,217
RAC: 0
United Kingdom
Message 1023920 - Posted: 9 Aug 2010, 12:10:10 UTC - in response to Message 1023915.  

Nice one..
Cheers for the explanation..
ID: 1023920 · Report as offensive
Josef W. Segur
Volunteer developer
Volunteer tester

Joined: 30 Oct 99
Posts: 4504
Credit: 1,414,761
RAC: 0
United States
Message 1023926 - Posted: 9 Aug 2010, 13:04:33 UTC - in response to Message 1023915.  

...
3) Not pro-actively detaching/reattaching periodically means that if one of your ghosts times out, your quota for the day will be reset to the minimum -1. This will not be significant for you if you are running optimised apps, since the new quota system is not yet running for anonymous platforms. However, when the new quota system does come into force for you, you will be heavily restricted on your task downloads until you return a number of consecutive valid tasks (which in itself may be delayed by other people's ghosts.....and so on)!

So yes, it is a problem and one which really does need to be addressed - hopefully before the quota system is brought on-line for anonymous platforms otherwise there will be problems that may well have been considerably underestimated. I can easily foresee a situation where the more powerful machines could sit idle for large periods of the day, simply because they have to re-build their quota from a very low value.

The new quota system does apply to anonymous platform hosts the same as those running stock. As with the old quota system, because it is done in number of tasks the effects can only be serious for hosts which do many tasks in a day. The new quota system is actually more generous than the old, GPUs used to have a maximum quota of 500 tasks a day but now even just after a single error the quota is 792 and it can grow much larger.

Ghosts can have a large effect on quota because they usually occur in batches. That is, all the tasks which the Scheduler has chosen for one request for work get ghosted together, and many may be from the same group so have identical deadlines. When such a group times out, the "Max tasks per day" can be reduced to much less than 100. If that happens during the last day before an outage, a fast host is likely to run out of work before the end of the outage. Similarly, if enough ghosts time out during an outage a fast host might not be able to get even the reduced limit of tasks immediately following the outage, though enough of the tasks finished during the outage would probably be validated quickly to build the quota up again.
                                                                Joe
ID: 1023926 · Report as offensive
Area 51
Joined: 31 Jan 04
Posts: 965
Credit: 42,193,520
RAC: 0
United Kingdom
Message 1024041 - Posted: 9 Aug 2010, 21:10:57 UTC - in response to Message 1023926.  

The new quota system does apply to anonymous platform hosts the same as those running stock.



Joe

Thanks for your clarification (once again). Perhaps it was a poor choice of words - or maybe I'm missing something. Whilst I can see the quota system working, in that it increments and resets at what appear to be appropriate times, it does however not appear to be enforced for (at least) the anonymous platform. Am I correct in this statement? For example I have exceeded my daily GPU download quota today already (it got zapped earlier today by 40 ghost units that had a two week expiry).


ID: 1024041 · Report as offensive


 
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.