Got *much* more work than asked for |
Message boards : Number crunching : Got *much* more work than asked for
| Author | Message |
|---|---|
|
This has got to be the first time someone has complained about receiving work on this board...

21.1.2009 10:37:30|SETI@home|Requesting 6198 seconds of new work, and reporting 1 completed tasks
21.1.2009 10:37:35|SETI@home|Scheduler RPC succeeded [server version 607]
21.1.2009 10:37:35|SETI@home|Deferring communication for 11 sec
21.1.2009 10:37:35|SETI@home|Reason: requested by project

After this, BOINC downloaded 20 workunits, totalling about 100-130 hours of crunch time. Estimated crunch time per workunit is around 3 hours or 10 hours. As far as I can tell, the time stats, DCF and all the other metrics were sane at the time of the request (the scheduler request and reply have not yet been overwritten). I have my cache set to 0.5+0.5, so a hundred hours is way too much.

This may have something to do with reporting a -9 result at the same time, although I am not quite sure if it's related. I think this is a bug of some sort, likely something on the server side. I'm using BOINC 5.10.13 and this is the first time anything like this has happened, so I don't think it is a bug on this end.

I haven't been around the boards for a while, so excuse me if this has already been reported in some other thread.

-Juha | |
| ID: 855982 · | |
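For scale, the figures in the post above can be checked with a few lines of arithmetic. This is only a sketch: the even 10/10 split between ~3-hour and ~10-hour tasks is an assumption (the post gives 20 tasks and two estimate sizes but no exact mix), the variable names are illustrative, and the cache comparison ignores core count and resource share.

```python
# Back-of-the-envelope check of the figures quoted in the post.
# Assumption: 10 tasks at ~3 h and 10 tasks at ~10 h (an even split).
REQUESTED_SECONDS = 6198          # from the "Requesting 6198 seconds" log line
CACHE_DAYS = 0.5 + 0.5            # cache setting quoted in the post

short_tasks, short_hours = 10, 3  # assumed count, estimate from the post
long_tasks, long_hours = 10, 10   # assumed count, estimate from the post

received_hours = short_tasks * short_hours + long_tasks * long_hours
received_seconds = received_hours * 3600

# How far the download overshoots the request and the cache setting.
ratio = received_seconds / REQUESTED_SECONDS
cache_multiple = received_hours / (CACHE_DAYS * 24)

print(f"requested: {REQUESTED_SECONDS / 3600:.2f} h")
print(f"received:  {received_hours} h")
print(f"received / requested: {ratio:.1f}x")
print(f"received / cache:     {cache_multiple:.1f}x")
```

Under that assumed split the download comes to 130 hours, at the top of the 100-130 hour range reported, which is roughly 75 times the amount requested.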
|
I looked at your new work. You should be fine with the work you have. You have about 50% small WUs with a short (1 week) TAT and 50% large WUs with a long (3+ weeks) TAT. I don't think you'll have any time issues, given how fast you are returning the small WUs: 10 small WUs should take you about 30 hours to return. That's not very long considering you have a week to do them. | |
| ID: 855984 · | |
|
I also got a ton of work units, many times the usual number. | |
| ID: 855990 · | |
|
I had the same thing on 15 January, with a 69-second work request being filled with a 12-day allocation - reported in Work fetch anomaly. So although it's still rare, there does seem to be more than a one-off problem: that perhaps makes it worth investigating? | |
| ID: 855992 · | |
|
It looks like they received short WUs with a few long ones sprinkled in. I don't think that it's an anomaly, just that the server has been sending out short WUs recently. | |
| ID: 855996 · | |
It looks like they received short WUs with a few long ones sprinkled in. I don't think that it's an anomaly, just that the server has been sending out short WUs recently.

No. I received 5 Astropulse tasks on a slow, single-core P4 that should never, ever, be allocated more than one AP task at a time - they take 2 days to run, and I have a 1 day, 50% resource share cache. | |
| ID: 855997 · | |
I looked at your new work. You should be fine with the work you have. You have about 50% small WUs with a short (1 week) TAT and 50% large WUs with a long (3+ weeks) TAT. I don't think you'll have any time issues, given how fast you are returning the small WUs: 10 small WUs should take you about 30 hours to return. That's not very long considering you have a week to do them.

Yes, one could say I got lucky. Had I had a larger cache, say 5 days, I think I would have had a hard time returning the shorties in time. I also have some Spinhenge workunits on board and those too have a one-week deadline. For some other cruncher this may have meant missed deadlines.

-Juha | |
| ID: 856000 · | |
I had the same thing on 15 January, with a 69-second work request being filled with a 12-day allocation - reported in Work fetch anomaly. So although it's still rare, there does seem to be more than a one-off problem: that perhaps makes it worth investigating?

I thought this was a new problem and looked at the first page only. Your thread is, of course, on the second page.

IIRC, the server is allowed to send at most 20 workunits at a time. What I find interesting is that that is what I got. As if the server didn't bother counting how much work it had already assigned to me and just gave as much as it's allowed. Yours doesn't quite match that, or maybe the server didn't have enough work at hand at that time.

There's a post at Beta that sounds like the same issue. So that's four reports. Might need investigating.

Btw, I have preferences set to allow AP but it's not included in app_info.xml, so that is why I only got MB workunits. Twenty 300 hour AP workunits would have been fun.

-Juha | |
| ID: 856004 · | |
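The suspicion in the post above ("as if the server didn't bother counting") can be illustrated with a toy model. To be clear, this is not BOINC's scheduler code: `Task`, `fill_request` and the `count_assigned` switch are hypothetical names made up for this sketch, and the cap of 20 is the figure the poster recalls ("IIRC").

```python
from dataclasses import dataclass

MAX_TASKS_PER_REQUEST = 20  # per-request cap as recalled in the post (unverified)


@dataclass
class Task:
    name: str
    est_seconds: float


def fill_request(requested_seconds, available, count_assigned=True):
    """Toy model only; NOT the real scheduler.

    With count_assigned=True, stop as soon as the estimated run time of
    the tasks already picked covers the request. With count_assigned=False,
    only the per-request cap limits what is sent.
    """
    sent = []
    assigned = 0.0
    for task in available:
        if len(sent) >= MAX_TASKS_PER_REQUEST:
            break
        if count_assigned and assigned >= requested_seconds:
            break
        sent.append(task)
        assigned += task.est_seconds
    return sent


# Hypothetical queue of 30 tasks, each estimated at 3 hours.
queue = [Task(f"wu_{i}", 3 * 3600) for i in range(30)]

print(len(fill_request(6198, queue)))                        # 1
print(len(fill_request(6198, queue, count_assigned=False)))  # 20
```

In this model a 6198-second request is covered by a single 3-hour task, while skipping the running total hands out exactly the cap, which matches the pattern described in the post. Whether the real server behaved this way is not established in the thread.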
So that's four reports. Might need investigating.

On two different servers, both of which are likely to have been recently updated with the very latest server patches. Also there are two reports at BOINC dev of work being allocated with no work request at all! Might be related. | |
| ID: 856007 · | |
Also there are two reports at BOINC dev of work being allocated with no work request at all! Might be related.

So what we really have is several cases where the scheduler ignores some of the constraints given to it. It sure looks related.

-Juha | |
| ID: 856020 · | |
|
I have this report: | |
| ID: 856022 · | |
|
You can still see what the messages were, by retrieving the stdoutdae.txt file (I think that's the right name, don't have a copy here) from the BOINC data directory. | |
| ID: 856025 · | |
You can still see what the messages were, by retrieving the stdoutdae.txt file (I think that's the right name, don't have a copy here) from the BOINC data directory.

No, I can't: mine is corrupt for some reason, and has been since June last year. And I have been too busy (maybe should read "idle") to do anything about it. | |
| ID: 856027 · | |
You can still see what the messages were, by retrieving the stdoutdae.txt file (I think that's the right name, don't have a copy here) from the BOINC data directory.

Hm. Well, if it is corrupt, one thing you can do is shut BOINC down and just delete it. It will create a new one from scratch. One thing I did for mine was to set it, in cc_config, to rotate once it gets to 100 MB. Often, with a 3-5 week uptime, a 2 MB log file only goes back about 15-20 days, depending on how much work a system can do and how many (if any) debugging flags you have set. Just more options for you if you decide that you want to do that.

[on topic]
I have also noticed that the requested work seconds and the tasks that get assigned don't really seem to have any correlation. I still use 6.2.19, and I've noticed this oddity since the 5.x.x days. Requesting <100 seconds of work will get either one normal MB, one shorty, or one AP. That doesn't really seem to add up at all. The other thing I've noticed is that sometimes, for example, requesting 1500 seconds of work results in one full-length MB, but requesting 1100 seconds of work results in 20 shorties. I don't know, it just seems like there's an issue somewhere. *shrug*

____________
Linux laptop uptime: 1484d 22h 42m
Ended due to UPS failure, found 14 hours after the fact | |
| ID: 856048 · | |
You can still see what the messages were, by retrieving the stdoutdae.txt file (I think that's the right name, don't have a copy here) from the BOINC data directory.

Easy enough to fix then...

1. Stop BOINC
2. Rename stdoutdae.txt (edit: or delete it)
3. Start BOINC | |
| ID: 856050 · | |
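Step 2 of the fix above (renaming the log rather than deleting it) can be scripted. The following is a minimal sketch, assuming BOINC has already been stopped and that the caller knows where the BOINC data directory is; `set_aside_log` is a hypothetical helper name, not part of BOINC.

```python
import time
from pathlib import Path


def set_aside_log(data_dir, log_name="stdoutdae.txt"):
    """Rename the log so the client can start a fresh one.

    Assumes BOINC is not running and that data_dir is the BOINC data
    directory; checking both is left to the caller.
    Returns the path of the renamed file, or None if no log was found.
    """
    log_path = Path(data_dir) / log_name
    if not log_path.exists():
        return None
    # Timestamped name, e.g. stdoutdae.20090121-103730.txt
    stamp = time.strftime("%Y%m%d-%H%M%S")
    backup = log_path.with_name(f"{log_path.stem}.{stamp}{log_path.suffix}")
    log_path.rename(backup)
    return backup
```

Renaming instead of deleting keeps the old (possibly corrupt) file around for inspection. The same helper would work for stderrdae.txt by passing a different `log_name`.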
|
This is what I get from a new stderrdae.txt file: | |
| ID: 856055 · | |
|
Those messages only tell you that the science application version you are running wasn't built against the latest BOINC API. Seeing how you run an optimized 5.28, that's the reason. | |
| ID: 856057 · | |
Those messages only tell you that the science application version you are running wasn't built against the latest BOINC API. Seeing how you run an optimized 5.28, that's the reason.

Jord,

Now you've really got me confused. AFAIK, Richard runs the same versions of BOINC and the SETI app, but he has a good stderrdae.txt.

Richard's computer ID 3751792
Report deadline 28 Jan 2009 9:36:08 UTC
CPU time 1164.969
stderr out
<core_client_version>5.10.13</core_client_version>
<![CDATA[
<stderr_txt>
Windows optimized S@H Enhanced application by Alex Kan
Version info: SSSE3x (Intel, Core 2-optimized v8-nographics) V5.13 by Alex Kan
SSSE3x Win32 Build 59 , Ported by : Jason G, Raistmer, JDWhale
CPUID: Intel(R) Core(TM)2 Quad CPU @ 2.40GHz
Speed: 4 x 2398 MHz
Cache: L1=64K L2=4096K
Features: MMX SSE SSE2 SSE3 SSSE3

My computer ID 688149
Report deadline 27 Jan 2009 4:31:48 UTC
CPU time 821.7813
stderr out
<core_client_version>5.10.13</core_client_version>
<![CDATA[
<stderr_txt>
Windows optimized S@H Enhanced application by Alex Kan
Version info: SSSE3x (Intel, Core 2-optimized v8-nographics) V5.13 by Alex Kan
SSSE3x Win32 Build 41 , Ported by : Jason G, Raistmer, JDWhale
CPUID: Intel(R) Core(TM)2 Quad CPU Q6600 @ 2.40GHz
Speed: 4 x 2996 MHz
Cache: L1=64K L2=4096K
Features: MMX SSE SSE2 SSE3 SSSE3

Andy | |
| ID: 856065 · | |
|
Richard is still using 5.8.something, isn't he? Or was that Brian Silvers? | |
| ID: 856076 · | |
|
@ Andy,

UNRECOGNIZED: suspend_if_no_recent_input
UNRECOGNIZED: max_ncpus_pct

Yes, I get them too, with a v5 client and a v5 SETI app. Maybe it's because I run Einstein too.

stdoutdae.txt : BOINC (daemon) output, the message log we were looking for in the first place.
stderr.txt : An application (SETI, in this case) output file, copied to the application task page like the sample you showed us. | |
| ID: 856085 · | |
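Once the message log has been retrieved, the work requests can be pulled out of it for comparison with what was actually downloaded. The sketch below assumes the wording of the request message matches the lines pasted in the first post; the exact layout of stdoutdae.txt may differ between client versions, so the pattern matches only the message text and ignores the date and project prefix. `requested_seconds` is a hypothetical helper name.

```python
import re

# Matches the request message as it appears in the first post of this thread.
REQUEST_RE = re.compile(r"Requesting (\d+) seconds of new work")


def requested_seconds(lines):
    """Yield (line, seconds) for every work-request message found."""
    for line in lines:
        match = REQUEST_RE.search(line)
        if match:
            yield line.rstrip("\n"), int(match.group(1))


# Two of the log lines quoted in the first post.
sample = [
    "21.1.2009 10:37:30|SETI@home|Requesting 6198 seconds of new work, and reporting 1 completed tasks",
    "21.1.2009 10:37:35|SETI@home|Scheduler RPC succeeded [server version 607]",
]

for line, seconds in requested_seconds(sample):
    print(seconds)  # 6198
```

To run it against a real log, pass an open file object instead of `sample`, since iterating over a text file yields its lines.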
| Copyright © 2013 University of California |