Ghost work units
Author | Message |
---|---|
SciManStev Send message Joined: 20 Jun 99 Posts: 6652 Credit: 121,090,076 RAC: 0 |
It does seem like a spiraling problem. I think ghosts have always been out there, but with the 3 day outages and maxed bandwidth being sustained for longer times, the ghost army is becoming more powerful. I am not sure if it will continue to expand or reach some limit. I wonder if a cruncher with a RAC of 10 could ever end up with a pending so large it couldn't be reported. Perhaps the new server will somehow reduce ghost generation as all the servers are shuffled around. Right now what I'm seeing is that the pendings, pendings not working, and ghosts are all interconnected. Steve Warning, addicted to SETI crunching! Crunching as a member of GPU Users Group. GPUUG Website |
James Nelson Send message Joined: 23 Mar 02 Posts: 381 Credit: 4,806,382 RAC: 0 |
OMG I hope I am doing this wrong.................... That number is all your WUs: in process, pending, and credit granted. Not sure how long the granted ones stay visible before they are removed; 24 hours, I think. |
Geek@Play Send message Joined: 31 Jul 01 Posts: 2467 Credit: 86,146,931 RAC: 0 |
I am looking only at the work "In progress", which removes pending work and finished work granted credit from the list. Here are the ghosts from my 4 active computers: Computer 1: 1802 - 800 = 1002 ghosts; Computer 2: 2764 - 800 = 1964 ghosts; Computer 3: 903 - 800 = 103 ghosts; Computer 4: 842 - 800 = 42 ghosts. Computers 1 and 2 have a serious problem! Boinc....Boinc....Boinc....Boinc.... |
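The arithmetic in the post above can be written as a small sketch. It assumes that 800 is the number of tasks actually held on each host, and the function and variable names are made up for illustration; nothing here talks to the project servers.

```python
def ghost_count(server_in_progress: int, tasks_on_host: int) -> int:
    """Tasks the server lists as in progress that the host never received."""
    # Clamp at zero: a host holding more tasks than the server lists has no ghosts.
    return max(server_in_progress - tasks_on_host, 0)


# "In progress" counts from the post, keyed by computer number.
in_progress = {1: 1802, 2: 2764, 3: 903, 4: 842}

# Assumption: each computer really holds 800 tasks locally.
TASKS_ON_HOST = 800

ghosts = {host: ghost_count(n, TASKS_ON_HOST) for host, n in in_progress.items()}

for host, n in ghosts.items():
    print(f"Computer {host}: {n} ghosts")
```

Running it on the four counts reproduces the figures in the post, with computers 1 and 2 standing out.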
perryjay Send message Joined: 20 Aug 02 Posts: 3377 Credit: 20,676,751 RAC: 0 |
James, I started to post the same thing but then thought about it and went to look. There is a page that only shows work units in progress and not the pending or complete. PROUD MEMBER OF Team Starfire World BOINC |
Geek@Play Send message Joined: 31 Jul 01 Posts: 2467 Credit: 86,146,931 RAC: 0 |
What would happen if I did the detach-attach but quickly stopped the wu downloads after the attach? Stop Boinc and remove the account file account_setiathome.berkeley.edu.xml. Restart Boinc and join the Seti project again, which would get me a new client account number. Then let it download the 800 new wu on a new client computer account. What would happen then if I later merged the two computer accounts again? Boinc....Boinc....Boinc....Boinc.... |
James Nelson Send message Joined: 23 Mar 02 Posts: 381 Credit: 4,806,382 RAC: 0 |
What would happen if I did the detach-attach but quickly stopped the wu downloads after the attach? Stop Boinc and remove the account file account_setiathome.berkeley.edu.xml. Restart Boinc and join the Seti project again, which would get me a new client account number. Then let it download the 800 new wu on a new client computer account. You wouldn't need to do all that. When you reattach, all the old work would get reassigned to other hosts, and you would download new work. |
SciManStev Send message Joined: 20 Jun 99 Posts: 6652 Credit: 121,090,076 RAC: 0 |
What would happen if I did the detach-attach but quickly stopped the wu downloads after the attach? Stop Boinc and remove the account file account_setiathome.berkeley.edu.xml. Restart Boinc and join the Seti project again, which would get me a new client account number. Then let it download the 800 new wu on a new client computer account. When I detach and reattach, I quickly stop the downloads. I hadn't seen it before, but Claggy's suggestion above of copying out the statistics xml file would be an excellent idea. I don't think the outcome would be good with real units still on your rig, but before I detach, I copy my seti directory to the desktop first. Then I do the detach and reattach and quickly stop the new work from downloading. I copy the contents of the seti directory back into the new seti directory, and re-enable new tasks. It frees the ghosts, and downloads new work. The app_info stays as it was. The statistics file should keep the statistics graph as it was as well. I am looking forward to trying that out. Steve Warning, addicted to SETI crunching! Crunching as a member of GPU Users Group. GPUUG Website |
Geek@Play Send message Joined: 31 Jul 01 Posts: 2467 Credit: 86,146,931 RAC: 0 |
On the other hand, maybe I should just wait and let Berkeley deal with the problem. They created it. But that's not fair to my fellow crunchers. These ghosts don't even begin to expire until 21 August (UTC), another 12 days from now. Boinc....Boinc....Boinc....Boinc.... |
Bearcat Send message Joined: 10 Sep 99 Posts: 106 Credit: 10,778,506 RAC: 0 |
Eventually the number of ghosts created per day and the number of ghost units timing out will balance, and the problem will not grow further. The only remaining problem will be the pending units for big crunchers. Just put the total number of pending credit on the account page as one number instead of a separate line for each pending wu, and even the big crunchers can keep track of their numbers. |
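The balance argument above can be illustrated with a toy simulation. This is only a sketch under loud assumptions: ghosts are created at a constant rate (a hypothetical 1000 per day) and every ghost times out after a fixed deadline (a hypothetical 35 days). Neither figure is a measurement from the project.

```python
from collections import deque


def simulate_ghosts(created_per_day: int, deadline_days: int, days: int) -> list[int]:
    """Return the outstanding ghost population at the end of each day."""
    batches = deque()  # one entry per daily batch still outstanding, oldest first
    total = 0
    history = []
    for _ in range(days):
        # Today's new ghosts join the population.
        batches.append(created_per_day)
        total += created_per_day
        # A batch older than the deadline times out and leaves the population.
        if len(batches) > deadline_days:
            total -= batches.popleft()
        history.append(total)
    return history


# Hypothetical inputs: 1000 ghosts per day, 35-day deadline, 60 simulated days.
history = simulate_ghosts(created_per_day=1000, deadline_days=35, days=60)
plateau = 1000 * 35  # creation rate times deadline
print(history[0], history[34], history[-1], plateau)
```

Under these assumptions the population climbs for one deadline period and then levels off at the creation rate times the deadline. If the creation rate varies between outages, the plateau moves with it.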
Geek@Play Send message Joined: 31 Jul 01 Posts: 2467 Credit: 86,146,931 RAC: 0 |
At one time a while back the pending total was listed on the account page, but it was a drain on the servers to keep it there for everyone. So they made it "on demand". Boinc....Boinc....Boinc....Boinc.... |
Josef W. Segur Send message Joined: 30 Oct 99 Posts: 4504 Credit: 1,414,761 RAC: 0 |
Looking at Scarecrow's 60 day charts, the overall "in progress" was averaging under 5 million tasks before the first 3 day outage, but has stepped up to about 6.5 million during the most recent outage. Some portion of that increase is due to users having selected longer cache durations, but the average turnaround time hasn't expanded much, so I think many have not yet boosted cache. It seems ghosts might be a significant part of that "in progress" growth. Given a typical deadline of about 5 weeks for MB tasks, the ghost population may not grow much more. I also have the impression fewer were created during recent cycles; I've seen a lot from the July 9 - 12 uptime... My own modest hosts haven't gotten any, but because I'm on dial-up I try to only allow BOINC network activity when Cricket suggests it's likely to be successful. But my plans often go agley so I end up in the middle of a feeding frenzy. Joe |
soft^spirit Send message Joined: 18 May 99 Posts: 6497 Credit: 34,134,168 RAC: 0 |
I believe staff was panicking while people were struggling to get an honest 3 days load. It would seem that most have adjusted accordingly as 10 days grew closer to 10 days.. and reduced the days requested. The first outage and to a lesser extent the second and third outages definitely left the largest crunchers offline. Progress has been made. I did have 21 report LATE after the last outage, and these may add to someone else's pending, but at least they are not ghosts. (late by about 1 hour.. it was tough to watch ;) ) Janice |
Ivailo Bonev Send message Joined: 26 Jun 00 Posts: 247 Credit: 35,864,461 RAC: 2 |
Given a typical deadline of about 5 weeks for MB tasks, the ghost population may not grow much more. I also have the impression fewer were created during recent cycles; I've seen a lot from the July 9 - 12 uptime... Yep, I have a "small" crunching PC with a RAC of 6000 and have 7 ghost units from this period! Imagine how many some big crunching PC will have... |
Zeus Fab3r Send message Joined: 17 Jan 01 Posts: 649 Credit: 275,335,635 RAC: 597 |
Yep, I have a "small" crunching PC with a RAC of 6000 and have 7 ghost units from this period! Imagine how many some big crunching PC will have... Currently I have 881 on my "medium" crunching rig :-) Last night there were 940, and I just hate the thought of detaching... Who the hell is General Failure and why is he reading my harddisk?¿ |
Uli Send message Joined: 6 Feb 00 Posts: 10923 Credit: 5,996,015 RAC: 1 |
Lol, I only have one. It seems to be a ghost for my wingman too, so most likely it will just have to time out. It is from 7/19. Pluto will always be a planet to me. Seti Ambassador Not to late to order an Anni Shirt |
MadMaC Send message Joined: 4 Apr 01 Posts: 201 Credit: 47,158,217 RAC: 0 |
I'm not sure if I have any, but my pending is shooting through the roof so I'm guessing that I do. Surely the best thing to do is just wait, and then the units will get resent out and then we will get the credit for them, just a bit later than usual? Not quite sure I understand the concept to be honest... |
Area 51 Send message Joined: 31 Jan 04 Posts: 965 Credit: 42,193,520 RAC: 0 |
Not quite sure I understand the concept to be honest... Ghost units are simply where the servers at Berkeley think they have sent you some tasks that they have allocated to you, but the tasks never actually reach your machine(s). However, they are listed in the tasks allocated to your machine(s), Berkeley-side, so to all intents and purposes, Berkeley thinks you have the tasks when in actual fact you don't. Yes, leaving the tasks to time out ensures that they will be re-sent. However, this has a number of consequences: 1) Other people's pending (your wingmen) will rise, waiting for you to process a task that you never received. Conversely, if your wingmen have ghost tasks, and you have processed your corresponding task, your pending will rise whilst you wait for someone to process and return a matching ghost task (which of course they can't, because it's a ghost)! 2) The tasks sit on the servers at Berkeley, occupying disk space until enough tasks are returned such that quorum is reached (which may well be considerably after the original deadline). 3) Not pro-actively detaching/reattaching periodically means that if one of your ghosts times out, your quota for the day will be reset to the minimum -1. This will not be significant for you if you are running optimised apps, since the new quota system is not yet running for anonymous platforms. However, when the new quota system does come into force for you, you will be heavily restricted on your task downloads until you return a number of consecutive valid tasks (which in itself may be delayed by other people's ghosts.....and so on)! So yes, it is a problem and one which really does need to be addressed, hopefully before the quota system is brought on-line for anonymous platforms; otherwise there will be problems that may well have been considerably underestimated. 
I can easily foresee a situation where the more powerful machines could sit idle for large periods of the day, simply because they have to re-build their quota from a very low value. |
MadMaC Send message Joined: 4 Apr 01 Posts: 201 Credit: 47,158,217 RAC: 0 |
Nice one.. Cheers for the explanation.. |
Josef W. Segur Send message Joined: 30 Oct 99 Posts: 4504 Credit: 1,414,761 RAC: 0 |
... The new quota system does apply to anonymous platform hosts the same as those running stock. As with the old quota system, because it is done in number of tasks the effects can only be serious for hosts which do many tasks in a day. The new quota system is actually more generous than the old: GPUs used to have a maximum quota of 500 tasks a day, but now even just after a single error the quota is 792 and it can grow much larger. Ghosts can have a large effect on quota because they usually occur in batches. That is, all the tasks which the Scheduler has chosen for one request for work get ghosted together, and many may be from the same group and so have identical deadlines. When such a group times out, the "Max tasks per day" can be reduced to much less than 100. If that happens during the last day before an outage, a fast host is likely to run out of work before the end of the outage. Similarly, if enough ghosts time out during an outage, a fast host might not be able to get even the reduced limit of tasks immediately following the outage, though enough of the tasks finished during the outage would probably be validated quickly to build the quota up again. Joe |
Area 51 Send message Joined: 31 Jan 04 Posts: 965 Credit: 42,193,520 RAC: 0 |
The new quota system does apply to anonymous platform hosts the same as those running stock. Joe Thanks for your clarification (once again). Perhaps it was a poor choice of words - or maybe I'm missing something. Whilst I can see the quota system working, in that it increments and resets at what appear to be appropriate times, it does however not appear to be enforced for (at least) the anonymous platform. Am I correct in this statement? For example I have exceeded my daily GPU download quota today already (it got zapped earlier today by 40 ghost units that had a two week expiry). |
©2024 University of California
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.