Validation backlog: So do late returns get no credit then?

Message boards : Number crunching : Validation backlog: So do late returns get no credit then?

Profile Christian Seti (user)
Joined: 31 May 99
Posts: 38
Credit: 73,899,402
RAC: 62
Australia
Message 152927 - Posted: 18 Aug 2005, 7:42:53 UTC

Referring to the August 18 technical news bulletin:

"Such results can appear when a workunit had reached its quorum number of returned results and is passed through validation, assimilation, file (both workunit and result) deletion and finally DB purging and *then* one or more results come in (perhaps they were slowed down by running intermittently on a laptop). The disassociated results are the bulk of what needs deleting."


So if the backlog of results awaiting validation consists of these "orphaned" results, received after their workunit had already reached quorum, does this mean that no credit is going to be issued for them? It's not very fair. Why should someone get credit because they returned a workunit inside the deadline and not if they are late? Sure, only three results are needed for a quorum, but discriminating, in my mind, violates the spirit of BOINC, whereby credit is a reflection of CPU work DONE, not work USED TO ESTABLISH QUORUM. If you "order" something and then say you don't want it when it's delivered, you still have to pay for it!
---------------------------------
Nathan Zamprogno
http://baliset.blogspot.com
ID: 152927 · Report as offensive
W-K 666 Project Donor
Volunteer tester

Joined: 18 May 99
Posts: 19012
Credit: 40,757,560
RAC: 67
United Kingdom
Message 152930 - Posted: 18 Aug 2005, 8:07:04 UTC - in response to Message 152927.  


... Why should someone get credit because they returned a workunit inside the deadline and not if they are late?...


It is these late results that are causing the backlog in the first place. If all units were returned within the initial deadline, the database could be purged regularly and properly, the validation process would work quicker, and there wouldn't be such large backlogs.

It would be preferable if everybody who isn't restricted by a modem, a regularly moved laptop, etc. left the connection option at the default or at a reasonably low figure. This decreases the number of results that have to be kept in the database and reduces the incidence of late returns.

Andy
ID: 152930 · Report as offensive
Big Blue
Volunteer tester

Joined: 8 Feb 05
Posts: 16
Credit: 2,721,283
RAC: 0
Germany
Message 152931 - Posted: 18 Aug 2005, 8:08:48 UTC
Last modified: 18 Aug 2005, 8:19:01 UTC

does this mean that no credit is going to be issued for them? It's not very fair. Why should someone get credit because they returned a workunit inside the deadline and not if they are late?



If you return inside the deadline, you get the credit.
ID: 152931 · Report as offensive
Andrew Casey
Joined: 19 May 99
Posts: 36
Credit: 397,910
RAC: 0
Australia
Message 152966 - Posted: 18 Aug 2005, 10:34:30 UTC

Does someone want to look up the definition of 'deadline'?
ID: 152966 · Report as offensive
Profile Paul D. Buck
Volunteer tester

Joined: 19 Jul 00
Posts: 3898
Credit: 1,158,042
RAC: 0
United States
Message 153032 - Posted: 18 Aug 2005, 14:03:55 UTC - in response to Message 152966.  

Does someone want to look up the definition of 'deadline'?

Actually, the correct term is Result Deadline.
ID: 153032 · Report as offensive
John McLeod VII
Volunteer developer
Volunteer tester
Joined: 15 Jul 99
Posts: 24806
Credit: 790,712
RAC: 0
United States
Message 153103 - Posted: 18 Aug 2005, 15:34:40 UTC - in response to Message 152927.  

Referring to the August 18 technical news bulletin:

"Such results can appear when a workunit had reached its quorum number of returned results and is passed through validation, assimilation, file (both workunit and result) deletion and finally DB purging and *then* one or more results come in (perhaps they were slowed down by running intermittently on a laptop). The disassociated results are the bulk of what needs deleting."


So if the backlog of results awaiting validation consists of these "orphaned" results, received after their workunit had already reached quorum, does this mean that no credit is going to be issued for them? It's not very fair. Why should someone get credit because they returned a workunit inside the deadline and not if they are late? Sure, only three results are needed for a quorum, but discriminating, in my mind, violates the spirit of BOINC, whereby credit is a reflection of CPU work DONE, not work USED TO ESTABLISH QUORUM. If you "order" something and then say you don't want it when it's delivered, you still have to pay for it!

If you turned in your book report in school a week late, were you still able to get an A on it? I rather think not. A deadline is just that. If a correctly processed result that is valid (matches other results for the same WU) is not returned by the deadline, it has to be sent off to someone else to process - hence no credit is a possibility for the late result.


BOINC WIKI
ID: 153103 · Report as offensive
1mp0£173
Volunteer tester

Joined: 3 Apr 99
Posts: 8423
Credit: 356,897
RAC: 0
United States
Message 153134 - Posted: 18 Aug 2005, 16:35:47 UTC - in response to Message 152930.  


... Why should someone get credit because they returned a workunit inside the deadline and not if they are late?...


It is these late results that are causing the backlog in the first place. If all units were returned within the initial deadline, the database could be purged regularly and properly, the validation process would work quicker, and there wouldn't be such large backlogs.

I think it's important to remember that these results aren't merely late. They aren't late by an hour, or a day.

They are well over the deadline. They are so late that the people who designed BOINC never anticipated the problem -- It didn't occur to them that there would be valid results returned far past the deadline.

ID: 153134 · Report as offensive
Profile Digger
Volunteer tester

Joined: 4 Dec 99
Posts: 614
Credit: 21,053
RAC: 0
United States
Message 153139 - Posted: 18 Aug 2005, 16:48:55 UTC - in response to Message 153134.  

They are well over the deadline. They are so late that the people who designed BOINC never anticipated the problem -- It didn't occur to them that there would be valid results returned far past the deadline.


John, Ned, whoever...

When we speak of disassociated result files that have no corresponding row in the database... as per Matt's post... are we speaking only of results that were submitted after the deadline, or are there other factors that may have contributed to these orphaned files? It seems that if it were only a matter of missed deadlines, then the files should simply be deleted and nobody has a right to complain.

I have no credits pending so my interest is certainly not to bitch about the situation... only for information value.

Thanks,

Dig

ID: 153139 · Report as offensive
John McLeod VII
Volunteer developer
Volunteer tester
Joined: 15 Jul 99
Posts: 24806
Credit: 790,712
RAC: 0
United States
Message 153149 - Posted: 18 Aug 2005, 17:15:36 UTC - in response to Message 153139.  

John, Ned, whoever...

When we speak of disassociated result files that have no corresponding row in the database... as per Matt's post... are we speaking only of results that were submitted after the deadline, or are there other factors that may have contributed to these orphaned files? It seems that if it were only a matter of missed deadlines, then the files should simply be deleted and nobody has a right to complain.

I have no credits pending so my interest is certainly not to bitch about the situation... only for information value.

Thanks,

Dig


No, there are several cases. But in general, the solution will need to be to delete the orphans as soon as they can be detected.

1) If a report is made too late (after verification and deadline) just delete the result file - it is not going to be used for anything anyway.
2) After a result is verified, look for strays that were uploaded and not reported.
3) When a result is moved to the science database, search for strays and delete.
4) The remaining case is going to be difficult - those that are uploaded after verification, deadline, and the move to the science DB and that are never reported. These, I suspect, will need to be swept up by a daemon that periodically sweeps through the upload directory.
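
As a rough illustration of case 4, a periodic sweep over the upload directory might look like the sketch below. This is not BOINC code: `sweep_orphans`, the `known_result_names` set (standing in for a lookup of result rows still in the database) and the `min_age_seconds` grace period are all hypothetical names chosen for the example.

```python
import time
from pathlib import Path


def sweep_orphans(upload_dir, known_result_names, min_age_seconds,
                  now=None, dry_run=True):
    """List (and optionally delete) uploaded files that have no DB row.

    known_result_names is an assumed stand-in for a database query; a real
    daemon would look the file name up in the result table instead.
    """
    now = time.time() if now is None else now
    orphans = []
    for path in sorted(Path(upload_dir).iterdir()):
        if not path.is_file():
            continue
        if path.name in known_result_names:
            continue  # the result row still exists, so this is not an orphan
        if now - path.stat().st_mtime < min_age_seconds:
            continue  # too recent: may still be uploading or awaiting a report
        orphans.append(path)
        if not dry_run:
            path.unlink()
    return orphans
```

Running with `dry_run=True` first only lists the candidates, which seems the safer default for anything that deletes volunteers' results.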


BOINC WIKI
ID: 153149 · Report as offensive
Ingleside
Volunteer developer

Joined: 4 Feb 03
Posts: 1546
Credit: 15,832,022
RAC: 13
Norway
Message 153158 - Posted: 18 Aug 2005, 17:28:38 UTC - in response to Message 153149.  

No, there are several cases. But in general, the solution will need to be to delete the orphans as soon as they can be detected.

1) If a report is made too late (after verification and deadline) just delete the result file - it is not going to be used for anything anyway.
2) After a result is verified, look for strays that were uploaded and not reported.
3) When a result is moved to the science database, search for strays and delete.
4) The remaining case is going to be difficult - those that are uploaded after verification, deadline, and the move to the science DB and that are never reported. These, I suspect, will need to be swept up by a daemon that periodically sweeps through the upload directory.


#1, no problem.
#2 and #3: both validation and assimilation will normally happen before the 4th result is in, so any "strays" can still be reported normally. A better solution is to let db_purge look for strays. Since SETI@Home normally waits 7 days after a WU is "done" before purging, that would work here, but some projects are AFAIK running db_purge immediately after "done"...
#4 is the difficult part...
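
To make the timing argument concrete, here is a minimal sketch of how a purge delay decides whether a late result still finds its database row. The function names are hypothetical and this is not how db_purge is actually implemented. It also assumes the purge runs exactly when the delay expires, which a real project would not guarantee.

```python
from datetime import datetime, timedelta


def is_purgeable(done_at, now, delay=timedelta(days=7)):
    """True once the workunit has been "done" for at least the purge delay."""
    return now - done_at >= delay


def late_result_status(done_at, reported_at, delay=timedelta(days=7)):
    """Classify a result reported after its workunit was marked "done".

    "reportable": the row would still exist, so the report can be handled
    normally. "orphaned": the row would already be purged.
    """
    if is_purgeable(done_at, reported_at, delay):
        return "orphaned"
    return "reportable"


done = datetime(2005, 8, 1)
print(late_result_status(done, done + timedelta(days=3)))    # within the delay
print(late_result_status(done, done + timedelta(days=10)))   # after the delay
print(late_result_status(done, done + timedelta(hours=1),
                         delay=timedelta(0)))                # immediate purge
```

With a delay of zero, every result that arrives after "done" is orphaned, which is the concern about projects that purge immediately.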
ID: 153158 · Report as offensive
Profile Digger
Volunteer tester

Joined: 4 Dec 99
Posts: 614
Credit: 21,053
RAC: 0
United States
Message 153159 - Posted: 18 Aug 2005, 17:29:13 UTC - in response to Message 153149.  
Last modified: 18 Aug 2005, 17:59:54 UTC

When we speak of disassociated result files that have no corresponding row in the database... as per Matt's post... are we speaking only of results that were submitted after the deadline, or are there other factors that may have contributed to these orphaned files?...


No, there are several cases. But in general, the solution will need to be to delete the orphans as soon as they can be detected.


Okay, thanks John. So there are at least some cases where credit should have been granted for these orphaned files.

Dig
ID: 153159 · Report as offensive
Profile [B@H] Ray
Volunteer tester
Joined: 1 Sep 00
Posts: 485
Credit: 45,275
RAC: 0
United States
Message 153184 - Posted: 18 Aug 2005, 18:51:40 UTC - in response to Message 153134.  

They are well over the deadline. They are so late that the people who designed BOINC never anticipated the problem -- It didn't occur to them that there would be valid results returned far past the deadline.


If I had some where the deadline passed a few months ago, I would not even run them. It's just as easy to abort them and get new ones. Even with a dial-up connection there is no reason to go 6 months or a year and think they are still good when you can read the date in BOINC.


Pizza@Home Rays Place Rays place Forums
ID: 153184 · Report as offensive
Profile Digger
Volunteer tester

Joined: 4 Dec 99
Posts: 614
Credit: 21,053
RAC: 0
United States
Message 153188 - Posted: 18 Aug 2005, 19:01:31 UTC - in response to Message 153184.  

If I had some where the deadline passed a few months ago, I would not even run them. It's just as easy to abort them and get new ones. Even with a dial-up connection there is no reason to go 6 months or a year and think they are still good when you can read the date in BOINC.


I agree with you there, Ray. But it does seem that at least in some cases, the results were orphaned by reasons other than simply passing the deadline. Perhaps this is why the good folks at Berkeley are trying to resolve the matter as gracefully as possible. I have a lot of respect for their efforts. I was just trying to see all sides of the issue. :)

Dig
ID: 153188 · Report as offensive



 