Data Chat
Author | Message |
---|---|
Stephen "Heretic" Joined: 20 Sep 12 Posts: 5557 Credit: 192,787,363 RAC: 628 |
"Already mentioned in the Lost 'Ghost' Task Recovery Protocol thread updated a week ago." . . Ah well, I could have guessed someone would know about it before me, but it was still exciting to discover. And now it is known here too ... . . The first sign I had was recovering 42 lost tasks, which for a short while made me think whoever made the change had a sense of whimsy. I quickly discarded that theory and decided it was the space limit in my cache allowance. So I waited till I had 120 completed tasks, then I got 80, so I know what the actual limit is. And I like it !!! Stephen :) |
Richard Haselgrove Joined: 4 Jul 99 Posts: 14650 Credit: 200,643,578 RAC: 874 |
I did mention to Eric during my day at the lab, that people had found a way of using ghost recovery even though it was turned off, but that they were frustrated by the 20 limit. I hope (though I've been on the road - no confirmation) that the uplift was as a result of that chat - so the procedure has Eric's tacit approval, too. |
Mr. Kevvy Joined: 15 May 99 Posts: 3776 Credit: 1,114,826,392 RAC: 3,319 |
I also bothered him directly as well... I can be quite a pest when required. :^) Edit: Also, for those wondering why automatic ghost recovery was turned off, you can thank computers like this one, with currently 17,845 tasks "in progress" yet not a single valid one returned. Automatically re-sending ghosts would require the scheduler to check its task allotment list of 17,845 against the list actually present on the host whenever it requested work, i.e. every five minutes. So, if it were automatic, three hundred machines as bad as this asking for work would be on average one per second. The scheduler would have time and bandwidth for little else. |
Keith Myers Joined: 29 Apr 01 Posts: 13164 Credit: 1,160,866,277 RAC: 1,873 |
"I also bothered him directly as well... I can be quite a pest when required. :^)" Whoever bugged Eric, or rather got his attention, it is most appreciated to have the resend quota raised to 80 now. You can clear your "ghosts" much more quickly now. All for the benefit of the database. Seti@Home classic workunits: 20,676 CPU time: 74,226 hours. A proud member of the OFA (Old Farts Association) |
Grant (SSSF) Joined: 19 Aug 99 Posts: 13736 Credit: 208,696,464 RAC: 304 |
"...computers like this one with currently 17,845 tasks 'in progress' yet not a single valid one returned." That is impressive. And it shows that the system for limiting & releasing work for problem hosts needs work. "So, if it were automatic, three hundred machines as bad as this asking for work would be on average one per second. The scheduler would have time and bandwidth for little else." It would be good if the BOINC Manager had an option to "Check for ghosts". Grant Darwin NT |
juan BFP Joined: 16 Mar 07 Posts: 9786 Credit: 572,710,851 RAC: 3,799 |
"So, if it were automatic, three hundred machines as bad as this asking for work would be on average one per second. The scheduler would have time and bandwidth for little else." When I suggested doing it automatically, I was thinking of something less than fully automatic. Something like when we ask for a CPU benchmark: you manually ask to start the recovery protocol, and only then will it run, but the rest of the task is then completed automatically, without the several user interactions we need today. |
Richard Haselgrove Joined: 4 Jul 99 Posts: 14650 Credit: 200,643,578 RAC: 874 |
Well, if you can program it, I can submit it to the developers. |
juan BFP Joined: 16 Mar 07 Posts: 9786 Credit: 572,710,851 RAC: 3,799 |
"Well, if you can program it, I can submit it to the developers." Maybe if I had access to the old code Grant mentioned, I could try to modify it and make it run. I can't promise anything, but it is possible. My C++ programming is a little rusty. |
Grant (SSSF) Joined: 19 Aug 99 Posts: 13736 Credit: 208,696,464 RAC: 304 |
"Well, if you can program it, I can submit it to the developers." I'm thinking it would require Scheduler changes: when someone uses "Check for Ghosts" in the BOINC Manager, it sets a flag on the server for the Scheduler to run a check just for that particular host, then resets the flag afterwards so it doesn't keep doing it. And I know they don't like making changes to the Scheduler (and my programming knowledge started and ended with dBase III+). Grant Darwin NT |
Richard Haselgrove Joined: 4 Jul 99 Posts: 14650 Credit: 200,643,578 RAC: 874 |
I don't think we can modify the server code. We know that the code is already there, but that it is deliberately turned off (for the reasons that Mr. Kevvy described). So this would be a change in the client, plus a trigger in the Manager to set it going. And that needs to be a discreet enough trigger that people who need (and know how) to use it can find it, but it doesn't get fired off by every Tom, Dick, or Harriet under the sun. Getting a lift to my last dinner on American soil in five minutes - I'll think about it and post more later. Quick thought before I go - the key trigger is "report the same result twice". The easiest way I can think of to invoke that is "don't process ACKs for the next report". And now dinner calls... |
Grant (SSSF) Joined: 19 Aug 99 Posts: 13736 Credit: 208,696,464 RAC: 304 |
"Well, if you can program it, I can submit it to the developers." "Maybe if I had access to the old code Grant mentioned, I could try to modify it and make it run. I can't promise anything, but it is possible. My C++ programming is a little rusty." On the client side, it'd be a case of adding a Check for Ghosts button to the BOINC Manager under the commands group. Clicking that would result in a new tag `<check_for_ghosts>0</check_for_ghosts>` becoming `<check_for_ghosts>1</check_for_ghosts>` (to be reset to 0 after a successful Scheduler request). The heavy lifting would be in the Scheduler, which would have to look for and act on that flag, for just the host that is requesting it. Edit: or, as Richard suggests, get the Manager to do on request what has to be done manually for the Ghost recovery protocol... Grant Darwin NT |
Stephen "Heretic" Joined: 20 Sep 12 Posts: 5557 Credit: 192,787,363 RAC: 628 |
"I did mention to Eric during my day at the lab, that people had found a way of using ghost recovery even though it was turned off, but that they were frustrated by the 20 limit." . . Mr Kevvy posted a message in the ghost recovery thread that it had been increased to 40 and proved stable, so it was increased again, this time to 80. It has worked nicely for me. . . I feel that it was highly likely your feedback to Eric was significant, so thanks for the effort. Stephen :) |
Stephen "Heretic" Joined: 20 Sep 12 Posts: 5557 Credit: 192,787,363 RAC: 628 |
That is impressive. . . +1 . . That would make it easier to keep tabs. Stephen . . |
Stephen "Heretic" Joined: 20 Sep 12 Posts: 5557 Credit: 192,787,363 RAC: 628 |
"So, if it were automatic, three hundred machines as bad as this asking for work would be on average one per second. The scheduler would have time and bandwidth for little else." . . Yes, a user-selectable 'option' to recover ghosted WUs would be a very good and friendly solution. I don't imagine (though I could be proven wrong) that machines like the one cited by Mr Kevvy would be using that option at any time. Stephen . . |
Stephen "Heretic" Joined: 20 Sep 12 Posts: 5557 Credit: 192,787,363 RAC: 628 |
"Well, if you can program it, I can submit it to the developers." "I'm thinking it would require Scheduler changes: when someone uses 'Check for Ghosts' in the BOINC Manager, it sets a flag on the server for the Scheduler to run a check just for that particular host, then resets the flag afterwards so it doesn't keep doing it." . . Been there :) Stephen :) |
Stephen "Heretic" Joined: 20 Sep 12 Posts: 5557 Credit: 192,787,363 RAC: 628 |
I also bothered him directly as well... I can be quite a pest when required. :^) . . Keep up the good work. :) And as Grant said, there is need for a process to identify hosts such as the one you cited and Stephen :) |
Richard Haselgrove Joined: 4 Jul 99 Posts: 14650 Credit: 200,643,578 RAC: 874 |
"The key trigger is 'report the same result twice'. The easiest way I can think of to invoke that is 'don't process ACKs for the next report'." Well, the final dinner turned out to be 'on American water', rather than 'soil'. And a fine round-off to my trip it was too. Many thanks to Brian, our host here. Too busy attending to the fresh salmon to think about coding, but I still think the line quoted is the easiest. I'll have to close down soon to pack for the flight back tonight, but we can pick this up again after maintenance Tuesday, when I'm back home. |
Speedy Joined: 26 Jun 04 Posts: 1643 Credit: 12,921,799 RAC: 89 |
In regards to the 2 terabytes of SSD storage for the upload server: unless things have changed in the last 4 years, usable space on a 512 gig SSD is about 464 gig, so you are looking at about 1.856 TB of usable space (4 x 464 gig). It will also be interesting to see how long they last, because they will get written to more than the average drive, unless they are industrial drives. |
Unixchick Joined: 5 Mar 12 Posts: 815 Credit: 2,361,516 RAC: 22 |
They gave us a sequence for blc35_2bit_guppi_58642_ but some of the bits are missing. There are gaps in the sequence of ending numbers. I'm not sure what this means. Is the data OK? Why are bits missing? |
Keith Myers Joined: 29 Apr 01 Posts: 13164 Credit: 1,160,866,277 RAC: 1,873 |
"They gave us a sequence for blc35_2bit_guppi_58642_ but some of the bits are missing. There are gaps in the sequence of ending numbers. I'm not sure what this means. Is the data ok? why are bits missing?" A little confused about your comment. What is missing? |
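
The "Check for ghosts" idea discussed in the thread can be sketched in a few lines of Python. This is only a rough illustration under stated assumptions, not BOINC code: the function names, the `Host` class and its `check_for_ghosts` attribute are hypothetical stand-ins for whatever the client and Scheduler would really use. The figures (resend limit of 80, a work request every five minutes, three hundred problem hosts) are the ones quoted in the posts.

```python
from dataclasses import dataclass, field

# Figures quoted in the thread; everything else here is a hypothetical sketch.
RESEND_LIMIT = 80            # ghosts resent per request after the uplift
REQUEST_INTERVAL_S = 5 * 60  # a host asks for work roughly every five minutes


def find_ghosts(server_in_progress, host_present):
    """Tasks the server lists as 'in progress' that the host does not have."""
    return sorted(set(server_in_progress) - set(host_present))


def checks_per_second(n_hosts, interval_s=REQUEST_INTERVAL_S):
    """Average rate of list comparisons if every request triggered one."""
    return n_hosts / interval_s


@dataclass
class Host:
    present: set = field(default_factory=set)  # tasks actually on the host
    check_for_ghosts: bool = False             # hypothetical one-shot flag


def handle_work_request(host, server_in_progress, limit=RESEND_LIMIT):
    """Compare the lists only when the host asked for it, then clear the flag."""
    if not host.check_for_ghosts:
        return []                      # normal request: no expensive comparison
    ghosts = find_ghosts(server_in_progress, host.present)
    host.check_for_ghosts = False      # reset so the check does not repeat
    return ghosts[:limit]              # never resend more than the limit


if __name__ == "__main__":
    server_list = [f"task_{i:03d}" for i in range(120)]
    host = Host(present=set(), check_for_ghosts=True)
    print(len(handle_work_request(host, server_list)))  # 80
    print(handle_work_request(host, server_list))       # [] (flag was reset)
    print(checks_per_second(300))                       # 1.0
```

With 120 lost tasks and the flag set, one request returns 80 of them, which matches what Stephen reported; the next request does no comparison at all because the flag has been cleared. The last line reproduces Mr. Kevvy's estimate: three hundred hosts each asking every 300 seconds average one comparison per second if the check were automatic.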