When should SETI start canceling Unneeded Tasks, to reduce Wasted Effort and electricity?
Author | Message |
---|---|
Keith T. Send message Joined: 23 Aug 99 Posts: 962 Credit: 537,293 RAC: 9 |
Now that many of the Workunits in the BOINC Database have been Validated, but still have outstanding Tasks or Results waiting to be returned, it might be useful for the project to start Canceling Tasks that are not needed. I believe that there are at least 2 ways that this can be done: (1) "Cancel if Not Started", probably the best option to use, because it doesn't waste work already done on slow clients; (2) "Cancel unconditionally", which I don't recommend, because it would be unfair to those who have partly completed work that should get Credit. If the Project Staff can do this, it would get the Wanted Results back more quickly, because all clients that contact the servers regularly could concentrate on Needed Tasks, instead of continuing to process unnecessary duplicates. |
Raistmer Send message Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121 |
Hm, why do you infer there are some unneeded tasks here? We work with a quorum of >1, so a second result is required for validation, to check that the computations were correct. There are quite a lot of examples where a device with a broken driver returns a seemingly "valid" result that is actually just noise. So the quorum should be set to >1. Besides that, "all tasks are equal" :) SETI apps news We're not gonna fight them. We're gonna transcend them. |
Ian&Steve C. Send message Joined: 28 Sep 99 Posts: 4267 Credit: 1,282,604,591 RAC: 6,640 |
"Hm, why do you infer there are some unneeded tasks here?" Because there ARE a lot of unneeded tasks out there. Since the project has been resending tasks to new hosts even before the hosts in progress hit the deadline, there are a lot of WUs that have already reached a quorum of 2 or more and are only waiting on hosts that still have work that may or may not ever return. Examples: https://setiathome.berkeley.edu/workunit.php?wuid=3912028733 https://setiathome.berkeley.edu/workunit.php?wuid=3912072347 https://setiathome.berkeley.edu/workunit.php?wuid=3909767422 https://setiathome.berkeley.edu/workunit.php?wuid=3909767464 https://setiathome.berkeley.edu/workunit.php?wuid=3909767488 and thousands and thousands more. The project could cancel all those outstanding WUs without penalty to their science results. Seti@Home classic workunits: 29,492 CPU time: 134,419 hours |
Keith T. Send message Joined: 23 Aug 99 Posts: 962 Credit: 537,293 RAC: 9 |
"Hm, why do you infer there are some unneeded tasks here?" I mean the Tasks which are connected with Workunits that are Already Validated and already have a Canonical Result. Only those should be cancelled, and only if not already started on the Client. This would be useful to clients like https://setiathome.berkeley.edu/show_host_detail.php?hostid=8652081 I know that client has at least 5 Tasks in Progress that are Unneeded, because they were part of my pendings for the last few weeks :) |
Raistmer Send message Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121 |
Oops... wasn't aware of such tasks, thanks for examples! Indeed, excessive ones... SETI apps news We're not gonna fight them. We're gonna transcend them. |
Raistmer Send message Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121 |
Initial replication is set to 4 now, looks like they want to finish ASAP w/o caring about efficiency.... (seems "efficiency" is my word for today, just as in the film Oscar https://en.wikipedia.org/wiki/Oscar_(1991_film), LoL). SETI apps news We're not gonna fight them. We're gonna transcend them. |
Raistmer Send message Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121 |
Checked my own hosts - they got many tasks recently, and a big share of them are already done and validated. So one could cancel such tasks client-side, but it's quite a big amount of manual work.... better if the server does it once and for all. SETI apps news We're not gonna fight them. We're gonna transcend them. |
Unixchick Send message Joined: 5 Mar 12 Posts: 815 Credit: 2,361,516 RAC: 22 |
I have a slow machine and decided to prioritize the WUs where my result was valuable. I had thought about aborting the others, but I worried that it would cause me to no longer get new SETI WUs. I'll just slowly plod through the meaningless WUs after I've done the important ones. I'll keep my eyes out for new WUs and prioritize those too. |
Kissagogo27 Send message Joined: 6 Nov 99 Posts: 716 Credit: 8,032,827 RAC: 62 |
|
Cavalary Send message Joined: 15 Jul 99 Posts: 104 Credit: 7,507,548 RAC: 38 |
If they were sent out, let people complete them I say (well, complete or time out, whichever's first). Would be messed up to find that you seem to still have work but it was canceled on the server. After all, the early resends were so there won't be a chain of them, as in wait for timeout, only then resend, maybe get a 2nd timeout, resend again only at that point, and maybe even more repeats for a handful, dragging on for months. As it is, you pretty much know the date by which all will be either returned or timed out already. |
Wiggo Send message Joined: 24 Jan 00 Posts: 36162 Credit: 261,360,520 RAC: 489 |
3952528310: That's because the 5th wingman returned the required 2nd canonical result before you did. ;-) Cheers. |
Link Send message Joined: 18 Sep 03 Posts: 834 Credit: 1,807,369 RAC: 0 |
"Would be messed up to find that you seem to still have work but it was canceled on the server." That's not how it works. On a scheduler request the server tells the client to abort the WU (either only if not started or unconditionally; the project admins choose). On the next request the client confirms it, and only at that moment does the WU stop being "in progress" for the server, not before. It's a completely clean way to do it; nothing gets messed up. |
Gene Send message Joined: 26 Apr 99 Posts: 150 Credit: 48,393,279 RAC: 118 |
My answer to the OP thread topic is... NOW. The project admins are already boosting the initial replication counts and resending work units that have not achieved a quorum yet, i.e. those waiting for nonresponsive hosts to reach the deadline. With that strategy the time to validate all workunits is very short. A couple of days, I think. There are hundreds of hosts out here with zero tasks "in progress" just chomping at the bit to return any task we get - probably in a matter of hours. What to do when all work units have canonical results? Just shut off the upload servers! No results uploaded - no credit granted. Period. This phase of the project has ended. I can, of course, hear the outcry, "My GPU/CPU ran for hours and I get nothing for it?" (Bad) luck of the draw. We've known for two months that the work unit processing would end. Most hosts have reached 0 In Progress and have rapidly declining "pending" and "inconclusive" numbers. My humble machine has had 0 In Progress for weeks now, and an average turnaround time of 0.17 days. At the other end of the spectrum I note a host (in the top 20) with 36000+ In Progress and an average turnaround time of 37.2 DAYS! I see no need for the project to wait for those to be reported, nor to give credit when they (eventually) are. (I arrive at the airline departure gate hours after the plane has left and I want my money back.... HAH!) Seti 2.0 will certainly need to rethink the whole cache thing (and the credit thing, but that's another can of worms). |
Raistmer Send message Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121 |
https://setiathome.berkeley.edu/workunit.php?wuid=3873945268 SETI apps news We're not gonna fight them. We're gonna transcend them. |
Keith T. Send message Joined: 23 Aug 99 Posts: 962 Credit: 537,293 RAC: 9 |
Look at the Sent date. There was a batch of those around that time, and most of them were re-sent later. |
Raistmer Send message Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121 |
"Look at the Sent date, there were a batch of those around that time, most of them were re-sent later." But one host still had the task in its queue. That's a waste: it will just be ignored after the result is returned, and w/o any credit for the computations done (reporting after validation at least grants credit for the wasted time :) ). SETI apps news We're not gonna fight them. We're gonna transcend them. |
Scrooge McDuck Send message Joined: 26 Nov 99 Posts: 1024 Credit: 1,674,173 RAC: 54 |
"But one host still had task in its queue. That's a waste, it will be just ignored after returning result." It is only a waste if they are actually crunched, but they won't be. Anyway, after three rounds of forced resends, even the remaining extremely large caches of some high-performance hosts now hardly contain any relevant task needed for validation. How many are still out there? There are some sound assumptions by Unixchick in the last day of s@h thread. |
Raistmer Send message Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121 |
"But one host still had task in its queue. That's a waste, it will be just ignored after returning result." Good point :) Well, maybe the next step would be to increase initial replication to 8? But don't forget to increase the total number of results and the number of errors allowed too. SETI apps news We're not gonna fight them. We're gonna transcend them. |
Filipe Send message Joined: 12 Aug 00 Posts: 218 Credit: 21,281,677 RAC: 20 |
"Well, maybe next step would be increase initial replication to 8? But don't forget to increase total number and number of errors too" Why not send those resends to hosts with a fast turnaround time? Let's say <3 days? |
Raistmer Send message Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121 |
"Well, maybe next step would be increase initial replication to 8? But don't forget to increase total number and number of errors too" Maybe just because such logic is absent from the project server settings? That is, they can't do it just "by button click". And then the question arises whether it is worth the time to do additional coding or not. I think that's the reason. SETI apps news We're not gonna fight them. We're gonna transcend them. |
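
The selection rule discussed in this thread (only tasks whose workunit already has a canonical result, with either a "cancel if not started" or a "cancel unconditionally" policy) can be sketched in a few lines of Python. This is a toy model under stated assumptions, not BOINC server code: `Task`, `CancelPolicy`, and `tasks_to_cancel` are hypothetical names, and in a real deployment whether a task has started is known on the client side, not in the server database.

```python
from dataclasses import dataclass
from enum import Enum


class CancelPolicy(Enum):
    """The two options named in the opening post (names are illustrative)."""
    IF_NOT_STARTED = "if_not_started"
    UNCONDITIONAL = "unconditional"


@dataclass
class Task:
    task_id: int
    workunit_id: int
    in_progress: bool  # sent to a host and not yet reported
    started: bool      # the client has begun computing it (assumed known here)


def tasks_to_cancel(tasks, validated_workunits, policy):
    """Return ids of in-progress tasks that are no longer needed.

    A task is unneeded when its workunit is in `validated_workunits`,
    i.e. it already has a canonical result.
    """
    selected = []
    for task in tasks:
        if not task.in_progress:
            continue  # already reported, nothing to cancel
        if task.workunit_id not in validated_workunits:
            continue  # still needed to reach quorum
        if policy is CancelPolicy.IF_NOT_STARTED and task.started:
            continue  # spare work that is already under way
        selected.append(task.task_id)
    return selected


# Made-up example data: workunit 100 is validated, workunit 200 is not.
example_tasks = [
    Task(task_id=1, workunit_id=100, in_progress=True, started=False),
    Task(task_id=2, workunit_id=100, in_progress=True, started=True),
    Task(task_id=3, workunit_id=200, in_progress=True, started=False),
    Task(task_id=4, workunit_id=100, in_progress=False, started=False),
]
validated = {100}

not_started_only = tasks_to_cancel(example_tasks, validated, CancelPolicy.IF_NOT_STARTED)
unconditional = tasks_to_cancel(example_tasks, validated, CancelPolicy.UNCONDITIONAL)
```

With the example data, the "if not started" policy selects only task 1, while the unconditional policy also selects task 2, which is the partly completed work the opening post wants to protect.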