Work Unit problem
Author | Message |
---|---|
Claggy Joined: 5 Jul 99 Posts: 4654 Credit: 47,537,079 RAC: 4 |
Well, you could edit it in the project section of the "client_state.xml" file instead. You'll get better results. Did that, set it to 0.9, it showed 9½ hours, then tried a few others until settling on 0.3, shows 3 hours, MB WU's are taking less than that, But it'll do for now, and i might get some work now, thanks. Claggy. |
kittyman Joined: 9 Jul 00 Posts: 51477 Credit: 1,018,363,574 RAC: 1,004 |
Well, you could edit it in the project section of the "client_state.xml" file instead. You'll get better results. I have reset the DCF in client state a few times. Whenever I process one of these bum WUs, it kicks the DCF way up and Boinc goes into EDF panic mode. "Time is simply the mechanism that keeps everything from happening all at once." |
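For anyone who would rather script the edit than do it by hand, here is a minimal sketch. It assumes the project block in "client_state.xml" looks like the hypothetical `SAMPLE` fragment below (a `<project>` element holding `<project_name>` and `<duration_correction_factor>`); the helper name `set_dcf` is made up for illustration. The real file is much larger and may not load in a strict XML parser, so treat this as a sketch to try on a backup copy with BOINC shut down, not a tested tool.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily trimmed stand-in for client_state.xml.
SAMPLE = """<client_state>
  <project>
    <project_name>SETI@home</project_name>
    <duration_correction_factor>9.500000</duration_correction_factor>
  </project>
</client_state>"""


def set_dcf(xml_text, project_name, new_dcf):
    """Return xml_text with the named project's DCF replaced by new_dcf."""
    root = ET.fromstring(xml_text)
    for project in root.iter("project"):
        if project.findtext("project_name") == project_name:
            node = project.find("duration_correction_factor")
            if node is None:
                raise KeyError("project has no duration_correction_factor")
            node.text = f"{new_dcf:.6f}"
            return ET.tostring(root, encoding="unicode")
    raise KeyError(f"no project named {project_name!r}")


if __name__ == "__main__":
    print(set_dcf(SAMPLE, "SETI@home", 0.3))
```

The function works on a string rather than on the file itself, so the result can be inspected before anything is written back.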
jason_gee Joined: 24 Nov 06 Posts: 7489 Credit: 91,093,184 RAC: 0 |
Just be aware there are some truly meaty workunits amongst the noisy ones. this one I got: http://setiathome.berkeley.edu/workunit.php?wuid=148360386 took quite a while but my claim is ~112 credits. I checked and thankfully the partner I am with is running 5.27 , so I'll probably get them. Not a big return for 28000 secs [old machine, used to get 62.4 cr / 14000 secs] , but far better than I've been getting for the noisy ones. "Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions. |
Richard Haselgrove Joined: 4 Jul 99 Posts: 14674 Credit: 200,643,578 RAC: 874 |
...thankfully the partner I am with is running 5.27 ... I think you'll be all right with Byron - and more to the point, he's running BOINC v5.10.13. December deadline, eh? That's meaty all right. |
W-K 666 Joined: 18 May 99 Posts: 19322 Credit: 40,757,560 RAC: 67 |
Just be aware there are some truly meaty workunits amongst the noisy ones. I've had similar high claim resultid=592412486 106cr in 4hrs. Most units on this computer are about 22/hr so this is a slight gain. Andy |
Tklop Joined: 11 May 03 Posts: 175 Credit: 613,952 RAC: 0 |
Hello, again crunchers: msattler: Here's the final outcome of that WU you and I discussed further down this thread... Work Unit ID: 147604236 (09mr07ae.8908.15614.7.4.87) Server State: Over Outcome: Client Error Client State: Compute Error CPU Time in seconds: 54,599.63 Claimed credit: .06 Granted credit: -- (pending) I let it run--but as you can see, it took a while to finally error out... Sadly, it looks like it sent it out again anyway though--see here: http://setiathome.berkeley.edu/workunit.php?wuid=147604236 There used to be only three result ID's (including mine) and now there are four... So, whomever the poor sod is who gets lumped with this nasty bugger, please forgive me... Both I, and my machine worked long and hard to remove it from the queue, but to no avail! Anyway, thought a few of you might be interested in the length of processing time involved to finally get this one to error out... (For those of you who don't feel like browsing this thread for my specs, the computer crunching that one was a Pentium M 1.86GHz, with 1GB RAM, running Windoze XP Pro). P.S. I liked the suggested fix, but I didn't see it here, until after the beastly WU had errored out... Even so, I offer up a big Thank You to Joe for figuring that out! [edit] Also, yes... I too have run through some pretty monstrous but completeable work units from that bunch--and the credit payoff looks promising (once someone else actually manages to finish one! LOL) Keep on crunching, all... SETI@Home Forever! ___Tklop (Step-Founder, U.S. Air Force team) |
bounty.hunter Joined: 22 Mar 04 Posts: 442 Credit: 459,063 RAC: 0 |
|
cRunchy Joined: 3 Apr 99 Posts: 3555 Credit: 1,920,030 RAC: 3 |
I just managed to get rid of a never ending WU (with the minus <triplet_thresh> issue) but now I'm getting units that state they will take 180+ hours yet finish in 8 hours. Is this a result of this same range of problems we have been experiencing or is it a problem somewhere else? |
jason_gee Joined: 24 Nov 06 Posts: 7489 Credit: 91,093,184 RAC: 0 |
Not necessarily faulty workunits in my opinion. I have had some of those long ones finish in 5 to 6 hours on my old machines [with chicken optimised app] for about 112 credits. The prediction mechanism is up to kaka. I think it may be safe to ignore those predictions. [This isn't to say they couldn't also be noisy or faulty in some way too] [More: My understanding, could be wrong, is that some better prediction formulae were made by Joe Segur, but somehow got left out of the current splitter code (not the app's problem, as I understand it). I also understand this may be examined when Eric gets back from holiday] [cRunchy: I just looked at your results and I see that long one you let go. My you have much more patience than I do! :D If you want to crunch those results a bit faster, you might consider going for the chicken/lunatics app, as you are on XP it should be okay if you are game.] "Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions. |
Richard Haselgrove Joined: 4 Jul 99 Posts: 14674 Credit: 200,643,578 RAC: 874 |
I just managed to get rid of a never ending WU (with the minus <triplet_thresh> issue) but now I'm getting units that state they will take 180+ hours yet finish in 8 hours. To add to Jason's comment: It isn't just that the estimates are long, but also because your computer thinks - for the time being - that all WUs are going to be as slow as the one you've just got rid of. It's a safety mechanism called the 'Result Duration Correction Factor' (RDCF), and it's self-correcting: you don't have to do anything about it. As more 'normal' WUs pass through the system, your machine will gradually relax after its panic, and the estimates will get closer and closer to reality. Expect it to take about 20 or 30 WUs, though. |
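To get a feel for why the recovery takes a few dozen WUs, here is a toy model of the behaviour described above. It is not BOINC's actual code: the update rule and the `step=0.1` smoothing fraction are assumptions picked purely for illustration, and `update_dcf` is a made-up name. The only ideas taken from the thread are that one slow WU pushes the factor up sharply and that normal WUs bring it back down a little at a time.

```python
def update_dcf(dcf, raw_estimate_hours, actual_hours, step=0.1):
    """Toy DCF update (illustrative assumption, not BOINC's real rule).

    raw_estimate_hours is the uncorrected estimate; the displayed
    estimate would be raw_estimate_hours * dcf.
    """
    ratio = actual_hours / raw_estimate_hours
    if ratio > dcf:
        # A WU that ran longer than predicted: jump straight up.
        return ratio
    # A WU that ran shorter than predicted: move down by a fraction.
    return dcf + step * (ratio - dcf)


def simulate(dcf, n_normal_wus, raw_estimate_hours=8.0, actual_hours=8.0):
    """Feed n_normal_wus well-behaved WUs through the toy update."""
    history = [dcf]
    for _ in range(n_normal_wus):
        dcf = update_dcf(dcf, raw_estimate_hours, actual_hours)
        history.append(dcf)
    return history


if __name__ == "__main__":
    # 180 h shown for a WU that really takes 8 h implies a factor of 22.5.
    for n, value in enumerate(simulate(180 / 8, 30)):
        if n % 5 == 0:
            print(f"after {n:2d} normal WUs: shown estimate ~{8 * value:.1f} h")
```

With a different `step` the recovery would be faster or slower, so the exact numbers mean little; the shape (quick improvement at first, then a long tail) is the point.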
makosky Joined: 7 Jul 00 Posts: 56 Credit: 3,908,782 RAC: 0 |
i just got a new pc .. and my results from the old pc are still present on status page ..not showing new pc's work ...both pcs ids are on .. it tells me i have to communicate with host i have no idea what that means ..both results are being sent home help someone |
Bob Foertsch Joined: 19 Jun 99 Posts: 25 Credit: 11,969,667 RAC: 25 |
I've had this work unit grinding away for nearly 4 days. Should I abort it? What can I do to prevent someone else from having the same problem? WU ID is 147512879 Wed Aug 15 20:42:12 2007|SETI@home|Starting 04mr07aa.8827.23385.12.4.227_4 Wed Aug 15 20:42:13 2007|SETI@home|Starting task 04mr07aa.8827.23385.12.4.227_4 using setiathome_enhanced version 523 Thu Aug 16 12:18:05 2007|SETI@home|Task 04mr07aa.8827.23385.12.4.227_4 exited with zero status but no 'finished' file Thu Aug 16 12:18:05 2007|SETI@home|If this happens repeatedly you may need to reset the project. Thu Aug 16 12:18:05 2007|SETI@home|Restarting task 04mr07aa.8827.23385.12.4.227_4 using setiathome_enhanced version 523 Thu Aug 16 12:21:11 2007|SETI@home|Task 04mr07aa.8827.23385.12.4.227_4 exited with zero status but no 'finished' file Thu Aug 16 12:21:11 2007|SETI@home|If this happens repeatedly you may need to reset the project. Thu Aug 16 12:21:11 2007|SETI@home|Restarting task 04mr07aa.8827.23385.12.4.227_4 using setiathome_enhanced version 523 Thu Aug 16 22:57:22 2007|SETI@home|Restarting task 04mr07aa.8827.23385.12.4.227_4 using setiathome_enhanced version 523 Sun Aug 19 09:56:27 2007|SETI@home|Sending scheduler request: To fetch work Sun Aug 19 09:56:27 2007|SETI@home|Requesting 73 seconds of new work Sun Aug 19 09:56:32 2007|SETI@home|Scheduler RPC succeeded [server version 511] Sun Aug 19 09:56:32 2007|SETI@home|Deferring communication for 11 sec Sun Aug 19 09:56:32 2007|SETI@home|Reason: requested by project Sun Aug 19 09:56:34 2007|SETI@home|[file_xfer] Started download of file 16fe07ab.13278.19704.10.5.177 It currently shows 38:28:15 CPU time, 0.048% complete, 44:09:25 to completion. While all my other work units have the completion time go DOWN, this one goes UP! |
Dirk Sadowski Joined: 6 Apr 07 Posts: 7105 Credit: 147,663,825 RAC: 5 |
This thread is very, very long. Is there a recommendation yet on what to do with these WUs? So far I have had one such WU: after about 2 hours, the 'completion time' kept going higher and higher. http://setiathome.berkeley.edu/workunit.php?wuid=147939514 I aborted it. |
Josef W. Segur Joined: 30 Oct 99 Posts: 4504 Credit: 1,414,761 RAC: 0 |
That's what most people will probably do; many others will let them run until they either overflow on Pulses or BOINC kills them for Maximum CPU time exceeded. The small proportion of users who read these forums thoroughly can't have much effect on the problem. When the same thing happened not long after _enhanced was first released, Eric Korpela produced a script which helped clean up the situation. If he weren't on vacation he'd probably do so again, or maybe he has already told Matt or Jeff where to find the script so they can tailor it to the present situation. More likely it wasn't saved. The "Hanging workunit and odd credit claims problem solved." thread is where that cleanup was discussed, and gives figures on how many bad WUs were involved that time. The few we've identified are probably only a fraction of what's out there, but each group has 256 WUs so it's a considerable number. For the record here's the list of groups mentioned in this thread: 04mr07ab.10282.4980, 04mr07ab.14840.4980, 04mr07ab.14852.4980, 04mr07ab.32128.5798, 04mr07ab.7106.5389, 04mr07ab.7106.5798, 04mr07ab.7106.6207, 04mr07ab.7106.6616, 05mr07aa.12210.24612, 05mr07aa.12591.24612, 05mr07aa.15859.24612, 05mr07aa.32396.24612, 05mr07aa.3769.20522, 05mr07ab.6072.369046, 05mr07ab.7301.368637, 09mr07ae.8908.15614 Joe |
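A quick way to check a task name against that list is sketched below. It assumes the group is simply the first three dot-separated fields of the WU name, which is inferred from the names quoted in this thread (for example 09mr07ae.8908.15614.7.4.87 belonging to group 09mr07ae.8908.15614) and not from any project documentation. The function names are made up for illustration, and the set holds only the groups from Joe's list at the time of his post.

```python
# Groups listed in Joe's post; later posts in the thread add more.
KNOWN_BAD_GROUPS = {
    "04mr07ab.10282.4980", "04mr07ab.14840.4980", "04mr07ab.14852.4980",
    "04mr07ab.32128.5798", "04mr07ab.7106.5389", "04mr07ab.7106.5798",
    "04mr07ab.7106.6207", "04mr07ab.7106.6616", "05mr07aa.12210.24612",
    "05mr07aa.12591.24612", "05mr07aa.15859.24612", "05mr07aa.32396.24612",
    "05mr07aa.3769.20522", "05mr07ab.6072.369046", "05mr07ab.7301.368637",
    "09mr07ae.8908.15614",
}


def group_of(wu_name):
    """Return the first three dot-separated fields of a WU or task name.

    Assumption: that prefix identifies the group, as the names in this
    thread suggest.
    """
    return ".".join(wu_name.split(".")[:3])


def is_suspect(wu_name, groups=KNOWN_BAD_GROUPS):
    """True if the name falls in one of the listed groups."""
    return group_of(wu_name) in groups


if __name__ == "__main__":
    for name in ("09mr07ae.8908.15614.7.4.87",
                 "04mr07aa.8827.23385.12.4.227_4"):
        print(name, "->", "listed group" if is_suspect(name) else "not listed")
```

A "not listed" answer only means nobody had reported that group when the list was written; as Joe says, the known groups are probably a fraction of what is out there.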
Raistmer Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121 |
It seems that the WU header edit would have to be done on both computers participating in the WU calculation. Otherwise the WU may end with a CPU time overflow and will simply be sent to another computer, slowing that one down in turn. Because the error limit is set to 5, there will be 5 slowed hosts before one WU disappears. For example: http://setiathome.berkeley.edu/workunit.php?wuid=148035389 CPU time 82697.222107 stderr out <core_client_version>5.8.16</core_client_version> <![CDATA[ <message> Maximum CPU time exceeded </message> ]]> |
CougarKy Joined: 20 Aug 01 Posts: 5 Credit: 4,076,741 RAC: 1 |
I have been receiving WU with to completion times of 119 hours+. However as the WU is crunched the to completion time drops dramatically. It ends up being only 5 or 6 hours to process. |
laconic Joined: 19 Jul 99 Posts: 5 Credit: 1,280,678 RAC: 0 |
... For the record here's the list of groups mentioned in this thread:... One more for you. 04mr07aa.16181.20522 laconic . |
Josef W. Segur Joined: 30 Oct 99 Posts: 4504 Credit: 1,414,761 RAC: 0 |
I have been receiving WU with to completion times of 119 hours+. However as the WU is crunched the to completion time drops dramatically. It ends up being only 5 or 6 hours to process. If you've run one of the nasty WUs to completion, BOINC will have adjusted your DCF (Duration Correction Factor) to match its extended time. It will gradually come down as you do normal work, taking about 20 to 30 WUs to get back to normal. If you want to fix it quickly, you can close down BOINC, find the <duration_correction_factor> entry for <project_name>SETI@home in the client_state.xml file, and edit it. Given a shown estimate of 119 hours for an unstarted WU which will actually take 5 hours, multiply the value by 5/119 or 0.042. That should be close enough, if it's slightly too low it will fully correct when one WU completes, if it's slightly high it will creep down as usual. Joe |
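The arithmetic in Joe's post can be written out as a small helper. This is only a sketch of the scaling he describes (new value = current value times actual time over shown estimate); the function name and the example starting value of 23.8 are made up for illustration.

```python
def rescaled_dcf(current_dcf, shown_hours, actual_hours):
    """Scale the current DCF so the shown estimate matches the real run time.

    shown_hours:  estimate BOINC displays for an unstarted WU
    actual_hours: how long such a WU really takes on this host
    """
    if current_dcf <= 0 or shown_hours <= 0 or actual_hours <= 0:
        raise ValueError("all inputs must be positive")
    return current_dcf * actual_hours / shown_hours


if __name__ == "__main__":
    # The multiplier from the post: 5 hours actual vs 119 hours shown.
    print(round(5 / 119, 3))
    # Applied to a hypothetical current value of 23.8 read from the file.
    print(round(rescaled_dcf(23.8, 119, 5), 3))
```

Rounding to two or three decimals is plenty, since the value keeps adjusting on its own once normal WUs start completing.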
Tklop Joined: 11 May 03 Posts: 175 Credit: 613,952 RAC: 0 |
... For the record here's the list of groups mentioned in this thread:... And yet another... (Sigh) 04mr07ab.10898.4571 Thanks again, Joe, for the fix you posted... Has it been 'officially sanctioned' at this point? I only ask--because as best as I can tell, whether letting these error out on their own, or we force the error sooner, they just keep getting passed around... Anyways, I'm hanging in here! Keep on crunching, all... SETI@Home Forever! ___Tklop (Step-Founder, U.S. Air Force team) |
Gary Charpentier Joined: 25 Dec 00 Posts: 30935 Credit: 53,134,872 RAC: 32 |
http://setiathome.berkeley.edu/workunit.php?wuid=147633867
|