When should SETI start canceling Unneeded Tasks, to reduce Wasted Effort and electricity ?

Message boards : Number crunching : When should SETI start canceling Unneeded Tasks, to reduce Wasted Effort and electricity ?

Keith T.
Volunteer tester
Joined: 23 Aug 99
Posts: 962
Credit: 537,293
RAC: 9
United Kingdom
Message 2046539 - Posted: 23 Apr 2020, 15:46:21 UTC
Last modified: 23 Apr 2020, 15:47:25 UTC

Now that many of the Workunits in the BOINC Database have been Validated, but still have outstanding Tasks or Results waiting to be returned, it might be useful for the project to start Canceling Tasks that are not needed.

I believe that there are at least 2 ways that this can be done:
"Cancel if Not Started", probably the best option to use, because it doesn't waste work already done on slow clients
"Cancel unconditionally", I don't recommend this, because it would be unfair to those who have partly completed work that should get Credit.

If the Project Staff can do this, it would save time in getting the Wanted Results back more quickly, because all clients that contact the servers regularly could concentrate on Needed Tasks, instead of continuing to process unnecessary duplicates.
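As a rough illustration of the two cancellation modes, here is a minimal sketch of how the unneeded tasks could be picked out. The `Task` record and its fields are hypothetical and are not the BOINC database schema; in practice the "has it started?" check would have to happen on the client, since only the client knows that.

```python
from dataclasses import dataclass

@dataclass
class Task:
    task_id: int
    workunit_validated: bool   # the workunit already has a canonical result
    returned: bool             # this task's result was already reported
    started: bool              # the client has begun computing it

def tasks_to_cancel(tasks, only_if_not_started=True):
    """Return the outstanding tasks the project no longer needs."""
    selected = []
    for t in tasks:
        if not t.workunit_validated or t.returned:
            continue  # still needed, or nothing left to cancel
        if only_if_not_started and t.started:
            continue  # "Cancel if Not Started": spare partly completed work
        selected.append(t)
    return selected

tasks = [
    Task(1, workunit_validated=True, returned=False, started=False),
    Task(2, workunit_validated=True, returned=False, started=True),
    Task(3, workunit_validated=False, returned=False, started=False),
]
print([t.task_id for t in tasks_to_cancel(tasks)])
print([t.task_id for t in tasks_to_cancel(tasks, only_if_not_started=False)])
```

With `only_if_not_started=True` the partly computed task 2 is left alone; the unconditional mode would cancel it as well.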
ID: 2046539
Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 6325
Credit: 106,370,077
RAC: 121
Russia
Message 2046543 - Posted: 23 Apr 2020, 15:54:43 UTC - in response to Message 2046539.  
Last modified: 23 Apr 2020, 15:55:37 UTC

Hm, why do you infer there are unneeded tasks here?
We work with a quorum of >1, so a second result is required for validation, to confirm that the computations were correct.
There are quite a lot of examples where a device with a broken driver returns a seemingly "valid" result that is actually just noise.
So the quorum should be set to >1.
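To make the quorum idea concrete, here is a minimal sketch, assuming each result can be reduced to a single number and compared within a tolerance. The real validator compares reported signals, so `find_canonical` and its parameters are purely illustrative.

```python
def find_canonical(results, min_quorum=2, tolerance=1e-6):
    """Return a result that at least min_quorum results agree on, else None."""
    for candidate in results:
        # Count the candidate itself plus every result that matches it.
        agreeing = [r for r in results if abs(r - candidate) <= tolerance]
        if len(agreeing) >= min_quorum:
            return candidate
    return None

print(find_canonical([0.731]))                # a single result cannot validate
print(find_canonical([0.731, 0.912]))         # two results that disagree
print(find_canonical([0.731, 0.912, 0.731]))  # a third result settles it
```

A lone result, even one from a broken driver, never becomes canonical on its own; it needs an independent match.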

Besides that "all tasks are equal" :)
SETI apps news
We're not gonna fight them. We're gonna transcend them.
ID: 2046543
Ian&Steve C.
Joined: 28 Sep 99
Posts: 4267
Credit: 1,282,604,591
RAC: 6,640
United States
Message 2046548 - Posted: 23 Apr 2020, 16:05:23 UTC - in response to Message 2046543.  

Hm, why do you infer there are unneeded tasks here?


Because there ARE a lot of unneeded tasks out there. Since the project has been resending tasks to new hosts even before the hosts with those tasks in progress hit the deadline, there are a lot of WUs that have already reached a quorum of 2 or more and are only waiting on hosts whose work may or may not ever be returned.

examples:
https://setiathome.berkeley.edu/workunit.php?wuid=3912028733
https://setiathome.berkeley.edu/workunit.php?wuid=3912072347
https://setiathome.berkeley.edu/workunit.php?wuid=3909767422
https://setiathome.berkeley.edu/workunit.php?wuid=3909767464
https://setiathome.berkeley.edu/workunit.php?wuid=3909767488

And thousands and thousands more. The project could cancel all the outstanding tasks for those WUs without any penalty to its science results.
Seti@Home classic workunits: 29,492 CPU time: 134,419 hours

ID: 2046548
Keith T.
Volunteer tester
Joined: 23 Aug 99
Posts: 962
Credit: 537,293
RAC: 9
United Kingdom
Message 2046549 - Posted: 23 Apr 2020, 16:08:27 UTC - in response to Message 2046543.  

Hm, why do you infer there are unneeded tasks here?
We work with a quorum of >1, so a second result is required for validation, to confirm that the computations were correct.
There are quite a lot of examples where a device with a broken driver returns a seemingly "valid" result that is actually just noise.
So the quorum should be set to >1.

Besides that "all tasks are equal" :)


I mean the Tasks connected with Workunits that are Already Validated and already have a Canonical Result.
Only those should be cancelled, and only if not already started on the Client.

This would be useful to clients like https://setiathome.berkeley.edu/show_host_detail.php?hostid=8652081
I know that client has at least 5 Tasks in Progress that are Unneeded because they were part of my pendings for the last few weeks :)
ID: 2046549
Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 6325
Credit: 106,370,077
RAC: 121
Russia
Message 2046552 - Posted: 23 Apr 2020, 16:22:26 UTC - in response to Message 2046549.  
Last modified: 23 Apr 2020, 16:22:56 UTC

Oops... wasn't aware of such tasks, thanks for examples!
Indeed, excessive ones...
SETI apps news
We're not gonna fight them. We're gonna transcend them.
ID: 2046552
Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 6325
Credit: 106,370,077
RAC: 121
Russia
Message 2046554 - Posted: 23 Apr 2020, 16:27:55 UTC

Initial replication is set to 4 now; it looks like they want to finish ASAP without caring about efficiency... (it seems "efficiency" is my word for today, just as in the film Oscar https://en.wikipedia.org/wiki/Oscar_(1991_film), LoL).
SETI apps news
We're not gonna fight them. We're gonna transcend them.
ID: 2046554
Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 6325
Credit: 106,370,077
RAC: 121
Russia
Message 2046563 - Posted: 23 Apr 2020, 17:50:50 UTC
Last modified: 23 Apr 2020, 17:51:02 UTC

Checked my own hosts: they got many tasks recently, and a big share of them belong to workunits that are already done and validated.
So one could cancel such tasks client-side, but that's quite a lot of manual work... better if the server does it once and for all.
SETI apps news
We're not gonna fight them. We're gonna transcend them.
ID: 2046563
Unixchick Project Donor
Joined: 5 Mar 12
Posts: 815
Credit: 2,361,516
RAC: 22
United States
Message 2046612 - Posted: 23 Apr 2020, 21:32:06 UTC

I have a slow machine and decided to prioritize the WUs where my result was valuable. I had thought about aborting the others, but I worried that it would cause me to no longer get new SETI WUs. I'll just slowly plod through the meaningless WUs after I've done the important ones. I'll keep my eyes out for new WUs and prioritize those too.
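That ordering can be sketched in a few lines, assuming you have already looked up by hand whether each task's workunit is validated. The task names and the `(name, validated)` tuples are hypothetical; the BOINC client does not expose such a list in this form.

```python
def crunch_order(tasks):
    """tasks: list of (task_name, workunit_already_validated) pairs.

    Still-needed tasks (False) sort ahead of already-validated ones (True);
    sorted() is stable, so the original order is kept within each group.
    """
    return [name for name, validated in sorted(tasks, key=lambda t: t[1])]

my_tasks = [("a", True), ("b", False), ("c", True), ("d", False)]
print(crunch_order(my_tasks))
```

Nothing is aborted here; the already-validated tasks simply move to the back of the queue.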
ID: 2046612
Kissagogo27 Special Project $75 donor
Joined: 6 Nov 99
Posts: 716
Credit: 8,032,827
RAC: 62
France
Message 2046705 - Posted: 24 Apr 2020, 9:06:39 UTC

3952528310

and some are marked invalid ... but why ? ^^
ID: 2046705
Cavalary
Joined: 15 Jul 99
Posts: 104
Credit: 7,507,548
RAC: 38
Romania
Message 2047028 - Posted: 26 Apr 2020, 1:19:57 UTC

If they were sent out, let people complete them I say (well, complete or time out, whichever's first). Would be messed up to find that you seem to still have work but it was canceled on the server. After all, the early resends were so there won't be a chain of them, as in wait for timeout, only then resend, maybe get a 2nd timeout, resend again only at that point, and maybe even more repeats for a handful, dragging on for months. As it is, you pretty much know the date by which all will be either returned or timed out already.
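A back-of-the-envelope version of that argument, with made-up numbers: the 50-day deadline, the dates, and the 3-day turnaround below are assumptions for illustration only, not project figures.

```python
from datetime import date, timedelta

def serial_finish(first_sent, deadline_days, timeouts, turnaround_days):
    """Resend only after each timeout: every timeout adds a full deadline."""
    return first_sent + timedelta(days=timeouts * deadline_days + turnaround_days)

def early_resend_bound(last_sent, deadline_days):
    """All copies are already out: each is returned or timed out by this date."""
    return last_sent + timedelta(days=deadline_days)

# Two timeouts in a row, then a host that answers in 3 days.
print(serial_finish(date(2020, 4, 1), 50, timeouts=2, turnaround_days=3))
# Last early resend sent on 20 April with the same 50-day deadline.
print(early_resend_bound(date(2020, 4, 20), 50))
```

In the serial case the end date depends on how many hosts time out in a row; with early resends it depends only on the last send date and the deadline.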
ID: 2047028
Wiggo
Joined: 24 Jan 00
Posts: 36118
Credit: 261,360,520
RAC: 489
Australia
Message 2047030 - Posted: 26 Apr 2020, 1:25:28 UTC - in response to Message 2046705.  

3952528310

and some are marked invalid ... but why ? ^^
That's because the 5th wingman returned the required 2nd canonical result before you did. ;-)

Cheers.
ID: 2047030
Link
Joined: 18 Sep 03
Posts: 834
Credit: 1,807,369
RAC: 0
Germany
Message 2047060 - Posted: 26 Apr 2020, 8:27:19 UTC - in response to Message 2047028.  

Would be messed up to find that you seem to still have work but it was canceled on the server.

That's not how it works. On a scheduler request the server tells the client to abort the WU (either only if not started or unconditionally; the project admins choose). On the next request the client confirms it, and only at that moment does the server stop counting the WU as "in progress", not before. It's a completely clean way to do it; nothing gets messed up.
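The two-step exchange described above can be modelled as a toy simulation. The `Server` and `Client` classes and their method names are invented for this sketch and are not the BOINC scheduler API.

```python
class Server:
    def __init__(self, in_progress, unneeded, if_not_started=True):
        self.in_progress = set(in_progress)  # tasks counted as out in the field
        self.unneeded = set(unneeded)        # tasks whose workunit is validated
        self.if_not_started = if_not_started # mode chosen by the project admins

    def handle_request(self, confirmed_aborts):
        # Only a client confirmation removes a task from "in progress".
        self.in_progress -= set(confirmed_aborts)
        # Ask the client to abort whatever is still out and no longer needed.
        return sorted(self.in_progress & self.unneeded), self.if_not_started

class Client:
    def __init__(self, started):
        self.started = set(started)
        self.pending_confirm = []            # aborts to report on the next request

    def contact(self, server):
        abort_ids, if_not_started = server.handle_request(self.pending_confirm)
        self.pending_confirm = [t for t in abort_ids
                                if not (if_not_started and t in self.started)]

server = Server(in_progress=[1, 2, 3], unneeded=[1, 2])
client = Client(started=[2])
client.contact(server)                   # asked to abort 1 and 2; aborts only 1
after_first = set(server.in_progress)
client.contact(server)                   # confirms the abort of task 1
after_second = set(server.in_progress)
print(after_first, after_second)
```

After the first contact the server still counts all three tasks as in progress; task 1 only disappears once the client has confirmed the abort on the second contact.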
ID: 2047060
Gene Project Donor
Joined: 26 Apr 99
Posts: 150
Credit: 48,393,279
RAC: 118
United States
Message 2047300 - Posted: 27 Apr 2020, 19:16:35 UTC

My answer to the OP thread topic is... NOW. The project admins are already boosting the initial replication counts and resending work units that have not yet achieved a quorum, i.e. those waiting for nonresponsive hosts to reach the deadline. With that strategy the time to validate all workunits is very short. A couple of days, I think. There are hundreds of hosts out here with zero tasks "in progress" just chomping at the bit to return any task we get - probably in a matter of hours.

What to do when all work units have canonical results? Just shut off the upload servers! No results uploaded - no credit granted. Period. This phase of the project has ended. I can, of course, hear the outcry, "My GPU/CPU ran for hours and I get nothing for it?" (Bad) luck of the draw. We've known for two months that the work unit processing would end.

Most hosts have reached 0 In Progress and have rapidly declining "pending" and "inconclusive" numbers. My humble machine has had 0 In Progress for weeks now, and an average turnaround time of 0.17 days. At the other end of the spectrum I note a host (in the top 20) with 36000+ In Progress and an average turnaround time of 37.2 DAYS! I see no need for the project to wait for those to be reported, nor to give credit when they (eventually) are. (I arrive at the airline departure gate hours after the plane has left and I want my money back.... HAH!)

Seti 2.0 will certainly need to rethink the whole cache thing (and the credit thing, but that's another can of worms).
ID: 2047300
Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 6325
Credit: 106,370,077
RAC: 121
Russia
Message 2047335 - Posted: 27 Apr 2020, 23:06:58 UTC

https://setiathome.berkeley.edu/workunit.php?wuid=3873945268
SETI apps news
We're not gonna fight them. We're gonna transcend them.
ID: 2047335
Keith T.
Volunteer tester
Joined: 23 Aug 99
Posts: 962
Credit: 537,293
RAC: 9
United Kingdom
Message 2047341 - Posted: 27 Apr 2020, 23:37:39 UTC - in response to Message 2047335.  

Look at the Sent date; there was a batch of those around that time, and most of them were re-sent later.
ID: 2047341
Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 6325
Credit: 106,370,077
RAC: 121
Russia
Message 2047382 - Posted: 28 Apr 2020, 10:37:00 UTC - in response to Message 2047341.  

Look at the Sent date; there was a batch of those around that time, and most of them were re-sent later.

But one host still had the task in its queue. That's a waste; the result will just be ignored when it is returned.
And without any credit for the computations done (reporting after validation at least grants credit for the wasted time :) ).
SETI apps news
We're not gonna fight them. We're gonna transcend them.
ID: 2047382
Scrooge McDuck
Joined: 26 Nov 99
Posts: 1015
Credit: 1,674,173
RAC: 54
Germany
Message 2047388 - Posted: 28 Apr 2020, 12:33:51 UTC - in response to Message 2047382.  
Last modified: 28 Apr 2020, 12:34:49 UTC

But one host still had the task in its queue. That's a waste; the result will just be ignored when it is returned.
And without any credit for the computations done (reporting after validation at least grants credit for the wasted time :) ).

It is only a waste if they are actually crunched, but they won't be. The task from the workunit you mentioned belongs to a host that was abandoned at the end of March, blackholing its 1K tasks.

Anyway, after three rounds of forced resends even the remaining extremely large caches of some high-performance hosts now hardly contain any relevant task needed for validation. How many are still out there? There are some sound assumptions by Unixchick in the last day of s@h thread.
ID: 2047388
Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 6325
Credit: 106,370,077
RAC: 121
Russia
Message 2047452 - Posted: 28 Apr 2020, 20:39:46 UTC - in response to Message 2047388.  

But one host still had the task in its queue. That's a waste; the result will just be ignored when it is returned.
And without any credit for the computations done (reporting after validation at least grants credit for the wasted time :) ).

It is only a waste if they are actually crunched, but they won't be. The task from the workunit you mentioned belongs to a host that was abandoned at the end of March, blackholing its 1K tasks.

Good point :)

Well, maybe the next step would be to increase initial replication to 8? But don't forget to increase the limits on the total number of results and the number of errors too.
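A small sanity check shows why those limits matter when replication goes up. The parameter names follow the BOINC workunit fields as I understand them (`target_nresults`, `min_quorum`, `max_total_results`, `max_error_results`); treat the names, and especially the error-limit rule of thumb, as assumptions of this sketch rather than project policy.

```python
def check_replication(target_nresults, min_quorum,
                      max_total_results, max_error_results):
    """Return a list of likely problems with a replication setup."""
    problems = []
    if min_quorum > target_nresults:
        problems.append("min_quorum exceeds initial replication")
    if max_total_results < target_nresults:
        problems.append("max_total_results is below initial replication")
    # Rule of thumb (assumption): the initial batch alone should be able to
    # lose every copy beyond the quorum without tripping the error limit.
    if max_error_results < target_nresults - min_quorum:
        problems.append("max_error_results may trip before a quorum is reached")
    return problems

print(check_replication(8, 2, max_total_results=10, max_error_results=5))
print(check_replication(8, 2, max_total_results=16, max_error_results=8))
```

The first call flags the error limit; the second, with both limits raised alongside the replication, returns an empty list.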
SETI apps news
We're not gonna fight them. We're gonna transcend them.
ID: 2047452
Filipe
Joined: 12 Aug 00
Posts: 218
Credit: 21,281,677
RAC: 20
Portugal
Message 2047787 - Posted: 1 May 2020, 11:51:14 UTC

Well, maybe the next step would be to increase initial replication to 8? But don't forget to increase the limits on the total number of results and the number of errors too.


Why not send those resends to hosts with a fast turnaround time? Let's say <3 days?
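The selection rule itself is simple to state in code. This is only a sketch of the filter, with hypothetical host names; the 0.17 and 37.2 day figures are the two turnaround times quoted earlier in the thread, and whether the project's server software can apply such a rule is a separate question.

```python
def eligible_hosts(avg_turnaround_days, max_days=3.0):
    """Return the hosts whose average turnaround is under max_days."""
    return sorted(host for host, days in avg_turnaround_days.items()
                  if days < max_days)

hosts = {"host_a": 0.17, "host_b": 37.2, "host_c": 2.5}
print(eligible_hosts(hosts))
```

Only `host_a` and `host_c` would receive resends under the 3-day cutoff.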
ID: 2047787
Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 6325
Credit: 106,370,077
RAC: 121
Russia
Message 2047797 - Posted: 1 May 2020, 13:35:08 UTC - in response to Message 2047787.  

Well, maybe the next step would be to increase initial replication to 8? But don't forget to increase the limits on the total number of results and the number of errors too.


Why not send those resends to hosts with a fast turnaround time? Let's say <3 days?

Maybe just because such logic is absent from the project server settings? That is, they can't do it just "by button click". And then the question arises whether it is worth spending the time on additional coding or not.
I think that's the reason.
SETI apps news
We're not gonna fight them. We're gonna transcend them.
ID: 2047797


 
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.