Can't find good ans. 4 LOSS of WUs..?
Author | Message |
---|---|
[Cx] Joined: 25 Jul 05 Posts: 141 Credit: 25,742 RAC: 0 |
I've tried to INFORM myself as much as possible, cuz we all know how clueless it is to duplicate questions already posted and replied to... I've read the Tech News, and get the general idea as to WHY there's been a delay with u/l / d/l of WUs... I've read through a few threads -- some of them quite long -- in search of Clues, hoping I wouldn't have to post... Now that I am, I'd appreciate constructive and informative answers, please. If you're aching to flame because you think I'm "asking the same question as thousands of others", please don't waste your time. (Nothing you say will move me to give you the time of day to reply to your flame.) ;) Thanks. OK then... SITUATION: Obviously being similar to many at present -- completed WUs aren't going through / being u/l'd. Message screen is reporting "Temporarily failed upload of [WU] / Backing off [min] [sec] on upload.." and "Scheduler request .. failed / No schedulers responded / Deferring communication with project.." ((Sidebar: How do you get your Messages to also list a "Reason"? Someone posted their messages showing a "Reason" line; I don't get any "Reason"s at all!)) PROBLEM: IN STARK CONTRAST to others I've read about, I DO NOT have a "screen-full" (no matter what size! ;) ) of WUs waiting to u/l, AND YET I have **no movement** in my Credit count for WUs completed. .. and it's been A WEEK since my last credit, and I've "completed" several WUs..! QUESTION: Since the implication that NOT having a list of WUs waiting to Upload (u/l) is that they **should have** u/l'd, then where's my Credit? ALTERNATIVELY, Is there any chance that a "failed" u/l eventually removes itself (?!?) from the Transfers Q? ALSO: If you have a COMPLETED WU, and it has FAILED to u/l, and *then* you select it and click "Suspend", WHAT is _supposed_ to happen to/with the WU? (Asking this because.. I tried to "suspend" a WU whose [WORK tab] Status was "Uploading", and I've since Shutdown & rebooted my PC to find the WU is **GONE**!!) 
I have NOT "Reset" the Project. I have NOT "Detach"ed. I HAVE clicked "No New Work" (for the time being). 2ND Q: Could there be "completed" WUs sitting on my PC that BOINC has somehow "lost track" of, that may still be salvaged and u/l'd? Okay.. that's my lot. Hope someone, or several someones, with many more clues than myself, is/are able to Clue Me In. Thanks! -( moi )- |
cjsoftuk Joined: 3 Sep 04 Posts: 248 Credit: 183,721 RAC: 0 |
Question 1: Why haven't I got credit? Answer: The Validators are off at the moment. No credits are being produced. Question 2: Is there a chance that a failed u/l removes itself? Answer: NO. BOINC will never destroy work that is completed. Question 3: If you click suspend on a U/Ling WU, what happens? Answer: As far as I know, nothing should happen. However, it seems that those who are a little inquisitive have found a new bug. Well done on finding this bug. Reporting to BOINC Bug Database. Question 4: Can BOINC possibly have lost track of WUs? Answer: Yes, it is possible. If BOINC was killed by Windows while it was writing the client_state.xml file, it could lose work. You would not be able to recover the WUs without very careful editing of client_state.xml. |
J D K Joined: 26 May 04 Posts: 1295 Credit: 311,371 RAC: 0 |
Your WUs:
40504969 10 Dec 2005 3:50:06 UTC 24 Dec 2005 3:50:06 UTC In Progress Unknown New --- --- ---
167513580 40075906 7 Dec 2005 3:08:59 UTC 8 Dec 2005 3:47:24 UTC Over Client error Downloading 0.00 0.00 ---
166721572 39885694 6 Dec 2005 3:27:22 UTC 20 Dec 2005 3:27:22 UTC In Progress Unknown New --- --- ---
166681972 39875953 6 Dec 2005 2:16:00 UTC 8 Dec 2005 3:47:24 UTC Over Success Done 19,957.33 43.23 pending
165726240 39641989 5 Dec 2005 0:34:55 UTC 8 Dec 2005 3:47:24 UTC Over Success Done 15,864.43 34.37 pending
165719608 39640364 5 Dec 2005 0:24:46 UTC 5 Dec 2005 2:32:14 UTC Over Success Done 6,678.22 13.89 pending
165522804 39592262 4 Dec 2005 19:32:37 UTC 5 Dec 2005 10:56:54 UTC Over Success Done 463.64 0.96 pending
165366054 39554076 4 Dec 2005 15:45:30 UTC 5 Dec 2005 10:56:54 UTC Over Validate error Done 15,751.76 --- ---
164844749 39426971 4 Dec 2005 1:53:03 UTC 4 Dec 2005 19:42:46 UTC Over Success Done 13,756.47 28.60 30.14
163938425 39206217 3 Dec 2005 1:14:52 UTC 4 Dec 2005 20:39:17 UTC Over Validate error Done 17,286.52 --- ---
163188796 39024369 2 Dec 2005 5:21:44 UTC 4 Dec 2005 1:42:53 UTC Over Success Done 15,971.16 33.21 32.52
162968063 38972087 1 Dec 2005 23:07:47 UTC 3 Dec 2005 1:14:52 UTC Over Client error Computing 14,458.08 30.06 ---
162799766 38931183 1 Dec 2005 18:40:03 UTC 1 Dec 2005 23:07:47 UTC Over Client error Downloading 0.00 0.00 ---
162140666 38771431 1 Dec 2005 1:04:41 UTC 2 Dec 2005 5:11:36 UTC Over Success Done 16,480.97 36.76 22.10
160584490 38392070 29 Nov 2005 2:14:16 UTC 1 Dec 2005 0:51:45 UTC Over Success Done 17,873.32 39.87 31.99
157778815 37709437 25 Nov 2005 10:58:30 UTC 26 Nov 2005 19:46:43 UTC Over Success Done 16,227.16 35.66 20.95
166627461 39862590 6 Dec 2005 0:37:10 UTC 20 Dec 2005 0:37:10 UTC In Progress Unknown New --- --- ---
158942330 37990611 26 Nov 2005 23:23:36 UTC 10 Dec 2005 23:23:36 UTC Over No reply New 0.00 --- ---
157396634 37616554 24 Nov 2005 22:28:13 UTC 3 Dec 2005 17:29:51 UTC Over Success Done 15,542.05 23.04 23.04
155245066 37089134 21 Nov 2005 22:27:42 UTC 24 Nov 2005 22:38:22 UTC Over Success Done 18,822.59 28.79 28.79
And the beat goes on Sonny and Cher BOINC Wiki |
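For anyone wanting to see how much credit is actually waiting in that list, here is a small Python sketch that totals it. The tuples are copied by hand from the "Over / Success / Done" rows above as (CPU seconds, claimed credit, granted credit), with `None` standing in for "pending". The variable names are made up for illustration and nothing here talks to the project servers.

```python
# (cpu_seconds, claimed_credit, granted_credit) for each "Over / Success / Done"
# row in the list above; None means the granted column still reads "pending".
success_rows = [
    (19957.33, 43.23, None),
    (15864.43, 34.37, None),
    (6678.22, 13.89, None),
    (463.64, 0.96, None),
    (13756.47, 28.60, 30.14),
    (15971.16, 33.21, 32.52),
    (16480.97, 36.76, 22.10),
    (17873.32, 39.87, 31.99),
    (16227.16, 35.66, 20.95),
    (15542.05, 23.04, 23.04),
    (18822.59, 28.79, 28.79),
]

# Claimed credit on results that are uploaded but not yet validated.
pending_claimed = round(sum(c for _, c, g in success_rows if g is None), 2)

# Credit already granted on validated results.
granted_total = round(sum(g for _, _, g in success_rows if g is not None), 2)

print(f"claimed, still pending: {pending_claimed}")   # 92.45
print(f"already granted:        {granted_total}")     # 189.53
```

Note that the granted figure can differ from the claim (36.76 claimed vs 22.10 granted in one row above), so the pending total is only a rough guide to what will eventually show up.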
Aurora Borealis Joined: 14 Jan 01 Posts: 3075 Credit: 5,631,463 RAC: 0 |
The simple answer is your results have probably uploaded properly onto Beckley's server. To keep things humming a little better, they have probably stopped processing the data for the moment and are just caching it. To me it is logical that they don't want to add to the stress on the database servers until the backlog catches up. I know that my results have not been updated yet, and the credits won't be issued until they have been validated. Credits will probably take up to a week after the current crisis is over, since the backlog will then need to be processed. Boinc V7.2.42 Win7 i5 3.33G 4GB, GTX470 |
Bill Barto Joined: 28 Jun 99 Posts: 864 Credit: 58,712,313 RAC: 91 |
dasilvac, On this workunit: http://setiathome.berkeley.edu/result.php?resultid=167513580 you clicked on the "abort" button under the "transfer" tab. This action deletes the result from the transfer queue and prevents you from getting credit for the result. If you have done this to any other result you have also lost them. |
[Cx] Joined: 25 Jul 05 Posts: 141 Credit: 25,742 RAC: 0 |
On this workunit: Good Gracious..! So, whereas a more sensible assumption that: Aborting a *TRANSFER* would keep a completed Work Unit in the "Work" list as "Ready to Upload",.. INSTEAD, BOINC / SETI decides that aborting any TRANSFER should completely invalidate the whole Work Unit?!?!? Does this suggest that once a WU is tagged for TRANSFER, it's "beyond" just being "completed", and hence it means that touching ANYthing in the TRANSFER screen is practically dangerous? As far as I'm concerned, that's a waste of cycles. *grin* What if I have a legit reason to either shutdown my PC or disconnect my internet or otherwise feel I should stop data moving in/out of my PC? Someone PLEASE tell me why there is an ABORT button on the TRANSFER screen, if its sole purpose is to waste computing cycles by killing a completed Work Unit? OR.. perhaps.. is this going to be amended in a future version? (Or is it in 5.x, since I need to upgrade anyway from my 4.45?) Thanks.. .Cx. THANKS: CJSoft - helped me clarify a few things. Jim K - appreciated, though I knew where to look up my work units. ;) Aurora B - I got that impression from the Tech Notes, but yeah, thanks. :] Bill B - I hadn't bothered to look that deeply before, so thanks for that..! :) |
Paul D. Buck Joined: 19 Jul 00 Posts: 3898 Credit: 1,158,042 RAC: 0 |
How do you get your Messages to also list a "Reason"? Someone posted their messages showing a "Reason" line; I don't get any "Reason"s at all! The messages tab should contain the same data as is posted to the log files. I have seen some rare cases where this is not so, but the most authoritative source of actual messages is the output log files. If you are interested, make a copy and read them. Most messages can be found in the Wiki, but, as the BOINC Client Software has evolved, well, so have the messages. We do try to indicate versions where possible. In the older system the logs contained the error number, -106 for example; in about 4.72 this was changed to "System I/O". The error codes are listed in the Wiki for those that are not printed as text. |
Aurora Borealis Joined: 14 Jan 01 Posts: 3075 Credit: 5,631,463 RAC: 0 |
As always Paul is right. If you hit abort button, it deletes the result and you have lost the credits. It should only be used as a last resort. BTW I've personally never lost more than a few minutes on a WU just from manually shutting down the Boinc software. I sympathize with your loss. But it's just a painful example that you should not try to micromanage the Boinc software. It is designed to provide information, not as a tool for 'control freaks'. I apologize if this term does not apply to you, but this board seems to be filled with people that it does fit. Boinc V7.2.42 Win7 i5 3.33G 4GB, GTX470 |
Bill Barto Joined: 28 Jun 99 Posts: 864 Credit: 58,712,313 RAC: 91 |
As always Paul is right. If you hit abort button, it deletes the result and you have lost the credits. It should only be used as a last resort. Paul? |
Aurora Borealis Joined: 14 Jan 01 Posts: 3075 Credit: 5,631,463 RAC: 0 |
As always Paul is right. If you hit abort button, it deletes the result and you have lost the credits. It should only be used as a last resort. Sorry, still waking up, I was agreeing with your post..... BUT, you have to admit PAUL is always right. Boinc V7.2.42 Win7 i5 3.33G 4GB, GTX470 |
Jord Joined: 9 Jun 99 Posts: 15184 Credit: 4,362,181 RAC: 3 |
Sorry, still waking up Same thing about that Beckley's server, which is probably a hidden commercial for Becks beer, right? ;) |
Bill Barto Joined: 28 Jun 99 Posts: 864 Credit: 58,712,313 RAC: 91 |
As always Paul is right. If you hit abort button, it deletes the result and you have lost the credits. It should only be used as a last resort. Can't argue with that. |
[Cx] Joined: 25 Jul 05 Posts: 141 Credit: 25,742 RAC: 0 |
As always Paul is right. If you hit abort button, it deletes the result and you have lost the credits. It should only be used as a last resort. Thanks again, folks. Paul, I've belatedly realized that the "Reason" message that doesn't show for me is most likely because I'm still running 4.45! So... sorry about that. .. and yes, I know I'm guilty for the occasional micro-management, but you'll note I only did that _once_ and that was before I bothered to read Tech Notes to figure out why things weren't transferring correctly. Though not a "control freak", I do tend to "fiddle" and poke around when something isn't going the way I'd assume it "normally" would... Oh well, what's a few credits / work hours? At least, for The Science, it's been covered by at least another 3 'puters... so, no major loss there, right? :) Say... I'd be happy to close this thread now... Seems appropriate, no? Thx again, .Cx. |
Jack Gulley Joined: 4 Mar 03 Posts: 423 Credit: 526,566 RAC: 0 |
If you hit abort button, it deletes the result and you have lost the credits. Very poor user interface implementation. Obviously by someone that has never done and used serious user interface programming before. The button says "Abort Transfer" and implies ending the current transfer attempt. It does not say "Delete Result", which it should say based on this description of what it does. That panel needs two options to replace the mislabeled one: a "Suspend Transfer" and a "Delete Result" button that does what it says it does. Plus, any "delete" of results or any file must have a second window that asks the user to confirm that they want to "remove that result" and never send it back for credit. 'control freak' Someone who attempts to exert absolute control over someone else so that they become dependent on them and become incapable of independent decisions or breaking free from that control. Hum.. now just who is being a control freak here. (BOINC?) |
Jord Joined: 9 Jun 99 Posts: 15184 Credit: 4,362,181 RAC: 3 |
{edited. I should read} |
Aurora Borealis Joined: 14 Jan 01 Posts: 3075 Credit: 5,631,463 RAC: 0 |
If you hit abort button, it deletes the result and you have lost the credits. I agree this is definitely a bad choice of wording. I also think you're right about an additional button and clarifying the meaning. We must provide for the control freaks. Being a charter member of 'Control Freaks Anonymous', I can see it in others, even when it's in its early stages. I was merely raising a red flag. Boinc V7.2.42 Win7 i5 3.33G 4GB, GTX470 |
Josef W. Segur Joined: 30 Oct 99 Posts: 4504 Credit: 1,414,761 RAC: 0 |
So, whereas a more sensible assumption that: The tooltip when you hover the Abort button says: "Click 'Abort transfer' to delete the file from the transfer queue. This will prevent you from being granted credit for this result." That's fairly clear, though perhaps a confirmation dialog after clicking the button would more likely be read. (Much as I hate those "Are you sure?" dialogs, there ARE cases where they are justified.) Does this suggest that once a WU is tagged for TRANSFER, it's "beyond" just being "completed", and hence it means that touching ANYthing in the TRANSFER screen is practically dangerous? I'd define it the other way; a result is not "completed" until it has been uploaded AND reported to the scheduler. Someone PLEASE tell me why there is an ABORT button on the TRANSFER screen, if its sole purpose is to waste computing cycles by killing a completed Work Unit? Here's one example: In the Test Project there have been several times when a block of setiathome_enhanced WUs had errors so they have been cancelled. That puts them in a state where uploaded results won't be validated, so an Abort makes sense. The present situation suggests that addition of a Suspend/Resume button for Transfers would be a good idea. I suspect if some skilled coder submitted changes to the BOINC code to add that, it would be accepted. It's not just a trivial addition to BOINC Manager; BOINC itself has only the Retry and Abort actions for file transfers. Joe |
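The lifecycle described in the post above (a result only counts once it has been uploaded AND reported, and Abort throws the output file away) can be pictured with a toy Python model. This is purely an illustration: the class, state, and method names are invented here and are not taken from the BOINC source, and the real client tracks far more than this.

```python
from enum import Enum, auto


class ResultState(Enum):
    COMPUTED = auto()  # crunching done, output file sitting in the transfer queue
    UPLOADED = auto()  # output file has reached the project's upload server
    REPORTED = auto()  # scheduler has been told; only now can credit follow
    ABORTED = auto()   # output file dropped from the queue; no credit possible


class ToyResult:
    """Hypothetical stand-in for one result; not BOINC's actual data structure."""

    def __init__(self, name):
        self.name = name
        self.state = ResultState.COMPUTED

    def retry_upload(self, server_up):
        # A failed attempt changes nothing: the file simply stays queued.
        if self.state is ResultState.COMPUTED and server_up:
            self.state = ResultState.UPLOADED
        return self.state

    def report(self, scheduler_up):
        if self.state is ResultState.UPLOADED and scheduler_up:
            self.state = ResultState.REPORTED
        return self.state

    def abort_transfer(self):
        # Abort removes the file from the queue, so later retries cannot help.
        if self.state is ResultState.COMPUTED:
            self.state = ResultState.ABORTED
        return self.state

    def eligible_for_credit(self):
        return self.state is ResultState.REPORTED


# One result waits out a server outage; the other has its transfer aborted.
patient = ToyResult("wu_a")
patient.retry_upload(server_up=False)   # still COMPUTED, nothing lost
patient.retry_upload(server_up=True)
patient.report(scheduler_up=True)

aborted = ToyResult("wu_b")
aborted.abort_transfer()
aborted.retry_upload(server_up=True)    # no effect after an abort

print(patient.eligible_for_credit(), aborted.eligible_for_credit())  # True False
```

In this picture a Suspend/Resume action would be a way to pause retries while leaving the result in the first state, which is why it would be safe where Abort is not.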
Paul D. Buck Joined: 19 Jul 00 Posts: 3898 Credit: 1,158,042 RAC: 0 |
Paul is not always right ... I distinctly remember one mistake in January 1976. :) Seriously guys, I don't have a lock on the truth. Wish I did ... |
[Cx] Joined: 25 Jul 05 Posts: 141 Credit: 25,742 RAC: 0 |
THANKS, EVERYONE... .. for all your responses, comments, and input! .. and thanks for not flaming the nearly-n00b! . o O ( Ruh-roh... guess what you've just invited..! ) Thanks especially to the "regulars" for their extreme patience with, and tolerance of, The Clueless Wandering Sheep Who Ask (too many?) Questions. Your kind assistance is greatly appreciated. Regards, .Cx. |
TPR_Mojo Joined: 18 Apr 00 Posts: 323 Credit: 7,001,052 RAC: 0 |
Paul is not always right ... tut-tut Mr Buck must try harder ;) |
©2024 University of California
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.