Aborted by project?

Modesto
Volunteer tester
Joined: 4 Jul 04
Posts: 47
Credit: 321,752
RAC: 0
Canada
Message 592361 - Posted: 25 Jun 2007, 11:18:55 UTC
Last modified: 25 Jun 2007, 11:19:13 UTC

Yesterday I upgraded the Boinc client to 5.10.7 from 5.8.16 and have since then seen quite a few of my SETI WUs change their status to "aborted by project". This happened yesterday and now again today... anybody know anything about this? I'm wondering if the new client is causing problems...
ID: 592361
TheDogFather
Joined: 15 May 03
Posts: 9
Credit: 2,604,609
RAC: 0
United Kingdom
Message 592368 - Posted: 25 Jun 2007, 11:28:58 UTC

If two results have already been received before you start crunching yours, it gets aborted.
ID: 592368
KWSN - MajorKong
Volunteer tester
Joined: 5 Jan 00
Posts: 2892
Credit: 1,499,890
RAC: 0
United States
Message 592374 - Posted: 25 Jun 2007, 11:37:30 UTC - in response to Message 592361.  
Last modified: 29 Jun 2007, 2:24:57 UTC

Yesterday I upgraded the Boinc client to 5.10.7 from 5.8.16 and have since then seen quite a few of my SETI WUs change their status to "aborted by project". This happened yesterday and now again today... anybody know anything about this? I'm wondering if the new client is causing problems...


No, it's not a client error; it is expected behavior.

S@H has recently enabled a server-side feature that will abort a result on your computer under certain circumstances.

The one likely affecting you: you have not yet started a result, but the project has already validated that workunit (2 others have returned their results for it, and the results are strongly similar). In that case the result on your computer is aborted, because it is not needed, the next time your BOINC client contacts the scheduler. Newer BOINC clients such as 5.10.7 can handle this. Older ones cannot, so there is no change in their behavior (i.e., no aborts).

You are not out of any crunch time in this case, since it only triggers if you have not yet started that result. If you have started it, it does not abort. The only downside is that currently, an 'abort by project' will deduct one from your quota on that machine. But this should not be a problem, because every successful result returned and validated doubles the quota (up to the max value).
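
As a rough illustration of the quota arithmetic described above (a sketch, not the scheduler's actual code): each project abort takes one off the daily quota, and each validated result doubles it up to the maximum. The function name, the maximum of 100 and the floor of 1 are assumptions made for this example.

```python
def apply_quota_events(quota, events, max_quota=100, min_quota=1):
    """Replay a list of events against a per-host daily quota.

    "abort" -> quota drops by one (never below min_quota, an assumed floor)
    "valid" -> quota doubles, capped at max_quota (assumed to be 100 here)
    """
    for event in events:
        if event == "abort":
            quota = max(min_quota, quota - 1)
        elif event == "valid":
            quota = min(max_quota, quota * 2)
        else:
            raise ValueError(f"unknown event: {event!r}")
    return quota


# Ten project aborts, then a single validated result.
after_aborts = apply_quota_events(100, ["abort"] * 10)
recovered = apply_quota_events(after_aborts, ["valid"])
print(after_aborts, recovered)
```

Under these assumptions, even a quota knocked all the way down to 1 needs only a handful of validated results to climb back, since it doubles each time.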

Other cases where results in progress *might* be aborted include when the project determines that results are defective (such as being improperly split), or otherwise canceled by the project. I've only seen this happen a couple of times over at the S@H/Astropulse Beta project when all AP workunits were cancelled due to an error in the AP app under testing.

Again, what I believe to be happening to you is normal. The aborted results in your cache should be replaced the next time the BOINC client contacts the S@H scheduler, and you are not out of any crunch time. I could tell you for certain if you did not have your computers hidden.

Hope this helps.

ID: 592374
bounty.hunter
Volunteer tester
Joined: 22 Mar 04
Posts: 442
Credit: 459,063
RAC: 0
India
Message 592385 - Posted: 25 Jun 2007, 12:39:05 UTC - in response to Message 592374.  


You are not out of any crunch time in this case, since it only triggers if you have not yet started that result. If you have started it, it does not abort. The only down side is that currently, an 'abort by project' will deduct one from your quota on that machine. But, this should not be a problem, because every successful result returned and validated doubles the quota (up to the max value).


Actually, you could be out of crunch time in the following case, even if you have started the workunit:

The workunit has started, but before the project server is next contacted, the workunit has already met quorum, credit has been granted, and the result has become overdue and been removed from the results database.

In this case, even if you are halfway through the workunit, it will be aborted.


ID: 592385
Josef W. Segur
Volunteer developer
Volunteer tester
Joined: 30 Oct 99
Posts: 4504
Credit: 1,414,761
RAC: 0
United States
Message 592476 - Posted: 25 Jun 2007, 18:34:56 UTC - in response to Message 592385.  


You are not out of any crunch time in this case, since it only triggers if you have not yet started that result. If you have started it, it does not abort. The only down side is that currently, an 'abort by project' will deduct one from your quota on that machine. But, this should not be a problem, because every successful result returned and validated doubles the quota (up to the max value).


Actually you could be out of crunch time in the following case even if you have started the workunit.....

The workunit has started but before the project server had been contacted, the workunit had already met a quorum, credit had been granted, the result was overdue and has been removed from the results database.

In this case, even if you are halfway through the workunit, it will be aborted.

True, but note that it is a net savings in crunch time. Being past deadline the result would not get credit if crunched to the end, so the abort saves the time which would have been spent continuing to normal termination.
Joe
ID: 592476
Kurt Schmucker
Joined: 11 Jan 00
Posts: 72
Credit: 130,823,400
RAC: 207
United States
Message 592640 - Posted: 26 Jun 2007, 1:08:42 UTC

Upgraded 5.8.16 -> 5.10.7.

BOINC aborted all my WUs that hadn't yet started, and refused to send new ones. ("No work from project.")

Is this expected? Should I not have upgraded?


ID: 592640
Carlos
Volunteer tester
Joined: 9 Jun 99
Posts: 29859
Credit: 57,275,487
RAC: 157
United States
Message 592669 - Posted: 26 Jun 2007, 2:04:41 UTC - in response to Message 592640.  

Upgraded 5.8.16 -> 5.10.7.

BOINC aborted all my WUs that hadn't yet started, and refused to send new ones. ("No work from project.")

Is this expected? Should I not have upgraded?



Yes, this is happening to most of the WUs that have been on your computer for a while. S@H now requires only 2 completed results. If you have a WU that has already received 2 results and has not been started, it will be marked as "not needed". Unfortunately, the aborted unit is reported as a "client error", and because of this your daily quota will be reduced. When I upgraded, one computer was allowed no more than 1 WU per day. However, as you return good results your quota will be increased again. A little frustrating, but it will not take long for your computer to get back to its normal quota of 100 WUs per day.
ID: 592669
Jim-R.
Volunteer tester
Joined: 7 Feb 06
Posts: 1494
Credit: 194,148
RAC: 0
United States
Message 592670 - Posted: 26 Jun 2007, 2:07:08 UTC - in response to Message 592640.  

Upgraded 5.8.16 -> 5.10.7.

BOINC aborted all my WUs that hadn't yet started, and refused to send new ones. ("No work from project.")

Is this expected? Should I not have upgraded?


How large of a cache are you running?
If you have a large cache, then many of the work units you had in it were probably already validated. A new feature of 5.10.7 is the ability of the project to abort work units that have already reached quorum and validated. Since the aborted work units currently show up as a client error, your quota was reduced, but as soon as you return a few good WUs your quota will go back up.

I'm just guessing, as your computers are hidden so we can't tell by the result data.
Jim

ID: 592670
BANZAI56
Volunteer tester
Joined: 17 May 00
Posts: 139
Credit: 47,299,948
RAC: 2
United States
Message 592717 - Posted: 26 Jun 2007, 5:39:38 UTC - in response to Message 592670.  

A new feature of 5.10.7



Did you actually type that with a straight face? ;)


This as we are nearly crunching faster than they can split 'em...
ID: 592717
Morris
Volunteer tester
Joined: 11 Sep 01
Posts: 57
Credit: 9,077,302
RAC: 29
Italy
Message 592809 - Posted: 26 Jun 2007, 10:17:32 UTC - in response to Message 592385.  


You are not out of any crunch time in this case, since it only triggers if you have not yet started that result. If you have started it, it does not abort. The only down side is that currently, an 'abort by project' will deduct one from your quota on that machine. But, this should not be a problem, because every successful result returned and validated doubles the quota (up to the max value).


Actually you could be out of crunch time in the following case even if you have started the workunit.....

The workunit has started but before the project server had been contacted, the workunit had already met a quorum, credit had been granted, the result was overdue and has been removed from the results database.

In this case, even if you are halfway through the workunit, it will be aborted.



Bounty ... let me ask a difficult one ...
Suppose quorum is met (say, 2 results) AND I am NOT running BOINC client 5.10.7 (with 5.10.7, if computation had not yet started, I would get an "aborted by project"). If credit is granted to the first two hosts reporting results and the WU is removed from the DB, am I supposed to get a VALIDATION ERROR when I report my result (say, the 3rd one)? If so, that would explain some "weird" behavior on a couple of hosts I am running...

Hope i made my question "messy" enough :D


M.

ID: 592809
bounty.hunter
Volunteer tester
Joined: 22 Mar 04
Posts: 442
Credit: 459,063
RAC: 0
India
Message 592837 - Posted: 26 Jun 2007, 10:54:05 UTC - in response to Message 592809.  

Bounty ... let me ask a difficult one ...
suppose quorum is met (suppose 2 results) AND i am NOT running BOINC client 5.10.7 (in that case, if computation has not yet started, i will get an "aborted by project") .... if credit is granted (on the first two host reporting the results) and wu removed from db, am i supposed to get a VALIDATION ERROR upon report of my result (suppose 3rd one) ? If it is like that, i could explain myself some "weird" behavior about a couple of host i am running ...

Hope i made my question "messy" enough :D


M.


I believe "aborted by project" should happen only if you are running 5.10.7 and the quorum has been met. However, according to Ingleside over on Beta, here on SETI main it might happen with 5.5.1 and above as well. I suppose Joe or Ingleside can shed light on that.

If the WU has been removed from the DB, then you probably would get a validation error. However, with recent changes to the BOINC servers, a lot of people who have "report immediately" enabled, either because of the BOINC client they are running or because they are using 5.10.7 with connect set to "0", have been getting a validate error.

This is because, after the result is uploaded, it has still not been linked in the DB when the BOINC client reports it, so the result is not found and a validate error follows. The workaround for this is to set the default "connect to" time to 0.001.

I hope this explains what you're asking about...
ID: 592837
michael
Volunteer tester
Joined: 19 Sep 03
Posts: 21
Credit: 183,890
RAC: 0
United Kingdom
Message 592844 - Posted: 26 Jun 2007, 11:23:38 UTC - in response to Message 592361.  

Yesterday I upgraded the Boinc client to 5.10.7 from 5.8.16 and have since then seen quite a few of my SETI WUs change their status to "aborted by project". This happened yesterday and now again today... anybody know anything about this? I'm wondering if the new client is causing problems...


If the work units in question are/were still being sent out to three users, then this could be caused by the auto-cancel: on reaching a quorum of 2 (the first two successful results), the third will be aborted. I think that is how it was explained to me over on SETI@home Beta yesterday. Well worth looking into.
ID: 592844
Alinator
Volunteer tester
Joined: 19 Apr 05
Posts: 4178
Credit: 4,647,982
RAC: 0
United States
Message 592902 - Posted: 26 Jun 2007, 13:42:11 UTC - in response to Message 592809.  
Last modified: 26 Jun 2007, 13:42:31 UTC


Bounty ... let me ask a difficult one ...
suppose quorum is met (suppose 2 results) AND i am NOT running BOINC client 5.10.7 (in that case, if computation has not yet started, i will get an "aborted by project") .... if credit is granted (on the first two host reporting the results) and wu removed from db, am i supposed to get a VALIDATION ERROR upon report of my result (suppose 3rd one) ? If it is like that, i could explain myself some "weird" behavior about a couple of host i am running ...

Hope i made my question "messy" enough :D


M.


The answer to the question is a qualified yes. Keep in mind that the main deciding factor in whether you get credit for a late result is whether the WU has already gone through the validator successfully. If the answer is yes, you'll get a validate error, even if the WU has not been purged. If the answer is no, then you would get credit as long as your host reports before any reissued result is reported which forms a quorum.

Also, once your host is issued a result, there is no way the WU would be purged before your host had reported or the latest deadline for any result in the WU had expired (unless there was a backend problem, of course).
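
A minimal sketch of the late-result rule above, under the assumption that the only thing that matters is whether the WU validated before the late report arrived. The function name and the plain numeric timestamps are hypothetical, chosen only for illustration.

```python
from typing import Optional


def late_result_outcome(report_time: float, validated_time: Optional[float]) -> str:
    """Outcome for a result reported AFTER its deadline.

    validated_time is when a quorum (possibly including a reissued result)
    went through the validator, or None if that has not happened yet.
    """
    if validated_time is not None and validated_time <= report_time:
        # The WU was already validated without this result.
        return "validate error"
    # Still eligible: credit depends on the result passing validation.
    return "eligible for credit"


print(late_result_outcome(report_time=20.0, validated_time=12.0))
print(late_result_outcome(report_time=20.0, validated_time=None))
```

This only covers late results; a result returned before its deadline is a different case.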

HTH,

Alinator
ID: 592902
Morris
Volunteer tester
Joined: 11 Sep 01
Posts: 57
Credit: 9,077,302
RAC: 29
Italy
Message 592912 - Posted: 26 Jun 2007, 14:04:15 UTC - in response to Message 592902.  
Last modified: 26 Jun 2007, 14:05:28 UTC


Bounty ... let me ask a difficult one ...
suppose quorum is met (suppose 2 results) AND i am NOT running BOINC client 5.10.7 (in that case, if computation has not yet started, i will get an "aborted by project") .... if credit is granted (on the first two host reporting the results) and wu removed from db, am i supposed to get a VALIDATION ERROR upon report of my result (suppose 3rd one) ? If it is like that, i could explain myself some "weird" behavior about a couple of host i am running ...

Hope i made my question "messy" enough :D


M.


The answer to the question is a qualified yes. Keep in mind the main deciding factor to whether you get credit for a result when you are late is has the WU gone through the validator successfully. If the answer is yes, you'll get a validate error, even if the WU has not been purged. If the answer is no, then you would get credit as long as your host reports before any reissued result is reported which forms a quorum.

Also, once your host is issued a result, there is no way the WU would be purged before your host had reported or the latest deadline for any result in the WU had expired (unless there was a backend problem, of course).

HTH,

Alinator


Alinator, as far as I knew, credit was granted for ALL valid, meaningful results received within the deadline, no matter whether I was the first to report, the second or, with quorum set to 2, the third. If I understand you correctly, things are different now: credit will be granted *JUST* to the first two valid results reported, and the third one (same example as before, minimum quorum of 2) will obviously find the result purged from the DB once the results are successfully validated (assuming validation does not take much time)...

Is it working like that now or not? Did I get it right?

Thanks
M.
ID: 592912
Astro
Volunteer tester
Joined: 16 Apr 02
Posts: 8026
Credit: 600,015
RAC: 0
Message 592913 - Posted: 26 Jun 2007, 14:06:53 UTC
Last modified: 26 Jun 2007, 14:08:50 UTC

Mauro, look at the wuid in the "P60 thread". The other two finished it 12 days ago (and got credit then), yet I got credit for the third, which took 16 days to complete. It has BOINC 5.10.7 installed.
ID: 592913
Alinator
Volunteer tester
Joined: 19 Apr 05
Posts: 4178
Credit: 4,647,982
RAC: 0
United States
Message 592933 - Posted: 26 Jun 2007, 14:40:13 UTC - in response to Message 592912.  
Last modified: 26 Jun 2007, 15:27:57 UTC

Alinator, as far as i knew, the credit was granted for ALL results (valid, meaningful) received within the deadline, no matter if i was the first one to report, the second one or , in case of quorum set to 2 , the third one. If it is like that, now things are different, credit will be granted *JUST* to the first two valid results reported; obviously, the third one (same example as before, minimum quorum of 2) will find the result purged from db (assuming that validation report does not take much time) once results are successfully validated...

Is it working like that now or not ? did i got it all well or no ?

Thanks
M.


1.) If your host has started a result and gets it back before the deadline expires, you will always get credit for it if it passes validation. Enabling auto-abort doesn't change this at all.

2.) The earliest time a WU can be purged from the BOINC DB:

a.) All of the outstanding results have been returned, a quorum formed, and credit granted.

b.) A quorum has formed and any trailers have been successfully auto-aborted and reported.

c.) The quorum has formed and the deadline for all outstanding results has expired.

The point you're missing here is your scenario is not possible under normal conditions. As long as any result in a WU has time left on the deadline, the WU will not be purged from the DB, and thus the result would be eligible to have credit granted if it passes validation. This has not changed since enabling auto-abort.
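
The three purge conditions above can be sketched as a single check: a quorum has formed and been credited, and every result is either reported (completed or auto-aborted) or past its deadline. This is only an illustration of the rule as stated here, not the BOINC server's code; the `Result` class, its fields and `can_purge` are hypothetical names.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Result:
    reported: bool   # host has reported it (finished or auto-aborted)
    deadline: float  # deadline as a plain timestamp


def can_purge(results: List[Result], quorum_validated: bool, now: float) -> bool:
    """True once no result could still come back in time to earn credit."""
    if not quorum_validated:
        return False
    # Cases a) and b): everything reported. Case c): deadlines expired.
    return all(r.reported or r.deadline < now for r in results)


wu = [Result(True, 10.0), Result(True, 10.0), Result(False, 20.0)]
print(can_purge(wu, quorum_validated=True, now=15.0))  # trailer still has time
print(can_purge(wu, quorum_validated=True, now=25.0))  # trailer's deadline passed
```

While the trailer still has time on its deadline the WU stays in the DB, which is why a started result returned in time remains eligible for credit.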

Proof:

My 'slugs' were almost never in the quorum before auto-abort was enabled but OTOH virtually never miss a deadline. IOW, they are the trailer over 95% of the time.

Since I don't have 5.10.7 on them, they don't have a clue about the 221 and so run whatever they're given to completion. I have yet to have one not get credit for its trailer.

Alinator

ID: 592933
Josef W. Segur
Volunteer developer
Volunteer tester
Joined: 30 Oct 99
Posts: 4504
Credit: 1,414,761
RAC: 0
United States
Message 592936 - Posted: 26 Jun 2007, 14:45:46 UTC - in response to Message 592837.  

...
I beleive aborted by project should happen only if you are running 5.10.7, if the the quorum has been met....however over on Beta, according to Ingleside here on SETI main that might happen with 5.5.1 and above as well. I suppose Joe or Ingleside can shed light on that.


The initial client side implementation was checked in by David Anderson May 25 2006. It used the tags <result_abort> for unconditional project aborts and <result_abort_if_unstarted> for conditional. That is in core client 5.5.1 through 5.8.16, but it was not implemented server side.

On March 21 2007, David checked in a fix for a bug which crashed the client when it got the conditional form, and changed the tag to <result_abort_if_not_started> so it could safely be sent to older clients. Mac and Linux 5.8.17 clients and Windows 5.9.3 clients were built after that date.

The server side Scheduler code changes to allow a project to use the feature were added April 5 2007, and a bugfix May 4 2007.

The server side changes are strictly in the Scheduler, it simply tells the core client to do the aborts. It doesn't change the status of the WU directly. If the core client does the abort then the report of that result allows early cleanup server side.
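
To illustrate the difference between the two tags, here is a minimal sketch of how a client might pick which results to abort from a scheduler reply. Only the tag names <result_abort> and <result_abort_if_not_started> come from the description above; the <scheduler_reply> wrapper, the <name> child element, the result names and the function itself are assumptions made for the example, not the core client's real parsing code.

```python
import xml.etree.ElementTree as ET


def results_to_abort(reply_xml, started):
    """Return the names of results to abort.

    reply_xml: scheduler reply fragment (layout assumed for this sketch)
    started:   set of result names the client has already begun crunching
    """
    root = ET.fromstring(reply_xml)
    to_abort = []
    for elem in root:
        name = (elem.findtext("name") or "").strip()
        if not name:
            continue
        if elem.tag == "result_abort":
            # Unconditional: abort even if work is in progress.
            to_abort.append(name)
        elif elem.tag == "result_abort_if_not_started" and name not in started:
            # Conditional: abort only if no crunching has been done yet.
            to_abort.append(name)
    return to_abort


reply = """
<scheduler_reply>
    <result_abort><name>wu_a_0</name></result_abort>
    <result_abort_if_not_started><name>wu_b_1</name></result_abort_if_not_started>
    <result_abort_if_not_started><name>wu_c_2</name></result_abort_if_not_started>
</scheduler_reply>
"""
print(results_to_abort(reply, started={"wu_b_1"}))  # ['wu_a_0', 'wu_c_2']
```

In this sketch the started result wu_b_1 is left alone, while the unstarted wu_c_2 and the unconditionally aborted wu_a_0 are dropped.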
Joe
ID: 592936
Modesto
Volunteer tester
Joined: 4 Jul 04
Posts: 47
Credit: 321,752
RAC: 0
Canada
Message 593790 - Posted: 27 Jun 2007, 20:09:26 UTC - in response to Message 592374.  
Last modified: 27 Jun 2007, 20:09:47 UTC

Yesterday I upgraded the Boinc client to 5.10.7 from 5.8.16 and have since then seen quite a few of my SETI WUs change their status to "aborted by project". This happened yesterday and now again today... anybody know anything about this? I'm wondering if the new client is causing problems...


No, its not a client error, it is expected behavior.

S@H has recently enabled a server-side feature that will abort a result on your computer under certain circumstances.

The one that is likely affecting you is when you have not yet started a result, but the project already has that workunit validated (2 others have returned their results for that workunit, and the results are strongly similar), that result on your computer is aborted (because it is not needed) when your BOINC client contacts the scheduler. The newer BOINC such as 5.10.7 can handle this. The older ones do not, so there is no change to their behavior (ie. no aborts).

You are not out of any crunch time in this case, since it only triggers if you have not yet started that result. If you have started it, it does not abort. The only down side is that currently, an 'abort by project' will deduct one from your quota on that machine. But, this should not be a problem, because every successful result returned and validated doubles the quota (up to the max value).

Other cases where results in progress *might* be aborted include when the project determines that results are defective (such as being improperly split), or otherwise canceled by the project. I've only seen this happen a couple of times over at the S@H/Astropulse Beta project when all AP workunits were cancelled due to an error in the AP app under testing.

Again, what I believe to be happening to you is normal. The aborted results in your cache should be replaced the next time the BOINC client contacts the S@H scheduler, and you are not out of any crunch time. I could tell you for certain if you did not have your computers hidden.

Hope this helps.


Thank you very much for your reply and explanation. Since I had just upgraded to 5.10.7, I had not seen that behaviour before, though I suspected a possible connection: I did check those units and saw that 2 successful results had come in for each of them. The daily quota did drop quite a bit at that time, but as I was still getting work and had a decent cache, it was/is not a problem bringing said quota back up. In other words, no problems at all; I simply worried something could have been wrong on my end (which seems not to be the case).

Thanks again :)
ID: 593790
KenKLRC
Joined: 12 Jul 06
Posts: 27
Credit: 7,791,658
RAC: 0
United States
Message 595179 - Posted: 29 Jun 2007, 12:42:29 UTC

I upgraded all my machines to 5.10.8 but then started getting a preponderance of Client Errors (see below):


560510765 137433892 28 Jun 2007 18:03:18 UTC 29 Jun 2007 2:26:50 UTC Over Client error Aborted 0.00 0.00 ---

560510757 137433902 29 Jun 2007 12:17:44 UTC 23 Jul 2007 14:30:36 UTC In Progress Unknown New --- --- ---

560510717 137433888 28 Jun 2007 18:03:18 UTC 29 Jun 2007 12:17:44 UTC Over No reply New 0.00 --- ---

560510705 137433876 28 Jun 2007 18:03:18 UTC 28 Jun 2007 20:28:43 UTC Over Client error Aborted 0.00 0.00 ---


As a result I returned back to 5.8.16 last night (23H00 6/28) and so far the Client Errors have not appeared. Can anyone provide insight into this anomaly?

ID: 595179
Henk Haneveld
Volunteer tester
Joined: 16 May 99
Posts: 154
Credit: 1,577,293
RAC: 1
Netherlands
Message 595217 - Posted: 29 Jun 2007, 13:12:38 UTC - in response to Message 595179.  

I upgraded all my machines to 5.10.8 but then started getting a preponderance of Client Errors See Below


560510765 137433892 28 Jun 2007 18:03:18 UTC 29 Jun 2007 2:26:50 UTC Over Client error Aborted 0.00 0.00 ---

560510757 137433902 29 Jun 2007 12:17:44 UTC 23 Jul 2007 14:30:36 UTC In Progress Unknown New --- --- ---

560510717 137433888 28 Jun 2007 18:03:18 UTC 29 Jun 2007 12:17:44 UTC Over No reply New 0.00 --- ---

560510705 137433876 28 Jun 2007 18:03:18 UTC 28 Jun 2007 20:28:43 UTC Over Client error Aborted 0.00 0.00 ---


As a result I returned back to 5.8.16 last night (23H00 6/28) and so far the Client Errors have not appeared. Can anyone provide insight into this anomaly?


Try reading this thread.

ID: 595217


 
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.