Message boards : Number crunching : Too late to validate?
Gatekeeper Send message Joined: 14 Jul 04 Posts: 887 Credit: 176,479,616 RAC: 0

Here is my list of "invalids" for my main cruncher: http://setiathome.berkeley.edu/results.php?hostid=6371091&offset=0&show_names=0&state=4&appid= Note they all show as "completed, too late to validate", but had a turnaround of less than two days. What gets me is that all of them were sent to me as the third system, even though the first two rigs had completed and returned their results. Anybody have a thought as to what's going on here?
rob smith Send message Joined: 7 Mar 03 Posts: 22217 Credit: 416,307,556 RAC: 380

This happens periodically. Most often after a big outage we see clusters of WUs coming out with impossibly short deadlines. In this case they were initially sent out just before the outage to two other users, who processed them during the outage. The server then decided to send them out again just after the outage, but before the other two users had reported, and with impossibly short deadlines. I suspect we are going to see lots of these in the next few days :-(

Bob Smith
Member of Seti PIPPS (Pluto is a Planet Protest Society)
Somewhere in the (un)known Universe?
Gatekeeper Send message Joined: 14 Jul 04 Posts: 887 Credit: 176,479,616 RAC: 0

> This happens periodically. Most often after a big outage we see clusters of WUs coming out with impossibly short deadlines.

Rob, I understand what you're saying, as it happens with VLARs being re-sent on a GPU work request. But these were sent to me as user #3 AFTER users 1 and 2 had reported them. And I don't think they were sent to me with short deadlines; I returned them completed in 2 days (so I don't know what the project deadline for them was). I'm guessing that somehow they got scheduled for resend, even though they had been properly reported, due to some timing issue between the validators and the scheduler, but I was looking for someone to confirm my theory.
Link Send message Joined: 18 Sep 03 Posts: 834 Credit: 1,807,369 RAC: 0

> Note they all show as "completed, too late to validate", but had a turnaround of less than two days.

Well, they were returned about 5 minutes after the deadline and after both your wingmen returned their results, so technically that is correct.

> What gets me is that all of them were sent to me as the third system, even though the first two rigs had completed and returned their results.

No, they were always sent to you before the 2nd result was returned. So that was right as well. The real question is: why did those WUs have such short deadlines? Server bug?
Link Send message Joined: 18 Sep 03 Posts: 834 Credit: 1,807,369 RAC: 0

> And I don't think they were sent to me with short deadlines; I returned them completed in 2 days (so I don't know what the project deadline for them was).

You can see the deadline in the task details, for example: Name 12mr10aa.19911.24347.3.10.48_2
Gatekeeper Send message Joined: 14 Jul 04 Posts: 887 Credit: 176,479,616 RAC: 0

> And I don't think they were sent to me with short deadlines; I returned them completed in 2 days (so I don't know what the project deadline for them was).

OK, it's late here, and I'm tired, and these will all probably be deleted by tomorrow morning my time, but there are a lot of timing issues about these I don't understand. It's not the credits; it's my not understanding how these came about in the first place. Hopefully we won't, as Rob suggested, be seeing a lot of these.

EDIT: For example, it's curious that the project "deadline" is the same in all 8 workunits, and is almost exactly 5 minutes before I returned the workunits. It's as if, when I returned them, S@H said "oops, we don't need these, let's change the deadline and make them invalid".
Lionel Send message Joined: 25 Mar 00 Posts: 680 Credit: 563,640,304 RAC: 597

> This happens periodically. Most often after a big outage we see clusters of WUs coming out with impossibly short deadlines.

Yep, had 250+ of these the other day ... good to see that others are getting the good news as well ...
Josef W. Segur Send message Joined: 30 Oct 99 Posts: 4504 Credit: 1,414,761 RAC: 0

> And I don't think they were sent to me with short deadlines; I returned them completed in 2 days (so I don't know what the project deadline for them was).

Yes, it's another side effect of the server mod to only accept 64 results at a time. At 17:24:30 UTC your host reported more than 64, so the excess became subject to the resend-lost-tasks logic. When that logic found some WUs already had a canonical result, it expired them immediately (set the deadline to 'now') rather than resending them. Then, on the next attempt to report them at 17:29:43 UTC, they were seen as too late.

Joe
Link Send message Joined: 18 Sep 03 Posts: 834 Credit: 1,807,369 RAC: 0

You mean basically every user is now forced to set a limit of max 64 tasks per report in his cc_config.xml, otherwise there's a risk of losing credits? Not that I'm going to run into such issues anytime soon, just curious... However, that still does not explain why some of the _0 and _1 results for these WUs had such short deadlines.
kittyman Send message Joined: 9 Jul 00 Posts: 51468 Credit: 1,018,363,574 RAC: 1,004

Do we have any word on whether this wonderful kluge may be rescinded or fixed during tomorrow's outage?

"Freedom is just Chaos, with better lighting." Alan Dean Foster
Josef W. Segur Send message Joined: 30 Oct 99 Posts: 4504 Credit: 1,414,761 RAC: 0

> You mean basically every user is now forced to set a limit of max 64 tasks per report in his cc_config.xml, otherwise there's a risk of losing credits? Not that I'm going to run into such issues anytime soon, just curious...

Yes, probably any host with a RAC of 5000 or above ought to be using that safety measure.

> However, that still does not explain why some of the _0 and _1 results for these WUs had such short deadlines.

I assume it was the same 64 limit causing the tasks to be resent, but I didn't look at those details while the WUs were unpurged.

Joe
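For anyone wanting to apply that safety measure, the cc_config.xml entry would presumably look like this; it uses the <max_tasks_reported> option HAL9000 mentions below, and as Lionel notes later in the thread, older clients don't support it.

```xml
<!-- cc_config.xml in the BOINC data directory. Caps how many completed
     tasks the client reports per scheduler request, so the server's
     64-result limit is never exceeded; restart the client or re-read
     the config file for it to take effect. -->
<cc_config>
    <options>
        <max_tasks_reported>64</max_tasks_reported>
    </options>
</cc_config>
```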
BilBg Send message Joined: 27 May 07 Posts: 3720 Credit: 9,385,827 RAC: 0

I think if you set NNT (until everything is reported), the server bug will not make false resends to you?

- ALF - "Find out what you don't do well ..... then don't do it!" :)
HAL9000 Send message Joined: 11 Sep 99 Posts: 6534 Credit: 196,805,888 RAC: 57

Resends are still sent even when NNT is set. In related news, I have had <max_tasks_reported>100</max_tasks_reported> set on my faster machines for some time. That seems to keep them happy when a large number of tasks build up.

SETI@home classic workunits: 93,865 CPU time: 863,447 hours
Join the [url=http://tinyurl.com/8y46zvu]BP6/VP6 User Group[/url]
BilBg Send message Joined: 27 May 07 Posts: 3720 Credit: 9,385,827 RAC: 0

Don't you have to actually ask for work, like in:

02-Jun-2012 21:58:25 [SETI@home] Reporting 1 completed tasks, requesting new tasks for CPU and GPU

... for the resends logic to kick in?

- ALF - "Find out what you don't do well ..... then don't do it!" :)
Claggy Send message Joined: 5 Jul 99 Posts: 4654 Credit: 47,537,079 RAC: 4

No they aren't, not at this project anyway; at Einstein and other projects with older schedulers, yes, resends are sent with NNT set. Changeset [trac]changeset:21823[/trac]:

• scheduler: don't resend work if client isn't requesting work

Claggy
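What that changeset's one-line description amounts to might be sketched like this; the names are invented, not BOINC's real code.

```cpp
// Illustrative only: per the changeset note above, the lost-task
// resend path is skipped when the client isn't asking for work.
struct SchedulerRequest {
    double work_req_seconds;  // 0 when a modern client has NNT set
};

bool may_resend_lost_tasks(const SchedulerRequest& req) {
    // No work requested (e.g. NNT set): don't resend lost tasks.
    return req.work_req_seconds > 0;
}
```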
HAL9000 Send message Joined: 11 Sep 99 Posts: 6534 Credit: 196,805,888 RAC: 57

Ah, OK. I must have seen that with an older client version I was running then. By the date of that changeset it looks like 6.10.58 and newer should have that change. I was probably using 6.10.45 or something when I had that occur.

SETI@home classic workunits: 93,865 CPU time: 863,447 hours
Join the [url=http://tinyurl.com/8y46zvu]BP6/VP6 User Group[/url]
Claggy Send message Joined: 5 Jul 99 Posts: 4654 Credit: 47,537,079 RAC: 4

That's the scheduler on the server, i.e. on synergy, not the scheduler in the client. Older versions of BOINC (pre 6.10.x) used to still ask for work even if the preferences were set to not use a resource (it was a server-side preference then). BOINC 6.10.x and later use different preferences (I think they were combined on the website later) that stop those clients from even asking for work.

Claggy
HAL9000 Send message Joined: 11 Sep 99 Posts: 6534 Credit: 196,805,888 RAC: 57

It seems like it was only a few months ago that I set NNT on a machine and then proceeded to get numerous resends. Perhaps it was just much longer ago than it seems.

SETI@home classic workunits: 93,865 CPU time: 863,447 hours
Join the [url=http://tinyurl.com/8y46zvu]BP6/VP6 User Group[/url]
Lionel Send message Joined: 25 Mar 00 Posts: 680 Credit: 563,640,304 RAC: 597

> You mean basically every user is now forced to set a limit of max 64 tasks per report in his cc_config.xml, otherwise there's a risk of losing credits? Not that I'm going to run into such issues anytime soon, just curious...
>
> But since my versions of BOINC do not support setting the limits in cc_config, and the likelihood of me upgrading any of my clients is close to zero, my hope is that they fix the issue at the source.

My thoughts exactly, Sten ...
Claggy Send message Joined: 5 Jul 99 Posts: 4654 Credit: 47,537,079 RAC: 4

> You mean basically every user is now forced to set a limit of max 64 tasks per report in his cc_config.xml, otherwise there's a risk of losing credits? Not that I'm going to run into such issues anytime soon, just curious...

David has already done a changeset, about 6 hours ago; it might even be on Seti now (but I doubt it): Changeset [trac]changeset:25733[/trac]:

• scheduler: if we truncate the # of results accepted

Claggy
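The changeset description above is cut off, so the following is only a guess at the shape of the fix: results dropped by the 64-per-report cap get remembered, so the lost-task pass no longer treats them as lost. All names are invented.

```cpp
// Speculative sketch only (the changeset text above is truncated): a
// result the server itself refused to accept this time around is not
// "lost" - the client will simply report it on its next RPC.
#include <cstddef>
#include <set>
#include <vector>

const std::size_t MAX_RESULTS_ACCEPTED = 64;

void handle_report(const std::vector<int>& reported_ids,
                   std::set<int>& dropped_ids) {
    for (std::size_t i = 0; i < reported_ids.size(); ++i) {
        if (i < MAX_RESULTS_ACCEPTED) {
            // ...process the report normally (omitted)
        } else {
            dropped_ids.insert(reported_ids[i]);  // truncated, not lost
        }
    }
}

bool looks_lost(int result_id, const std::set<int>& dropped_ids) {
    // Skip the expire/resend logic for results we dropped ourselves.
    return dropped_ids.count(result_id) == 0;
}
```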