finish file present too long
| Author | Message |
|---|---|
jason_gee Joined: 24 Nov 06 Posts: 7489 Credit: 91,093,184 RAC: 0
|
Yes, I haven't been following that BOINC dev discussion 100% myself either. For Windows, you have first the process priority class, then second the thread priority level. The first one (process) is absolute, while the thread priority is relative to the process ('normal' thread priority being effectively the same as the process priority). Lower would be a 'more idle' thread, and higher a small bump over the regular process one, used for managing multiple threads within the same app. For this particular client and GUI, lowering any priorities (process, thread, I/O or memory) isn't likely to work well, since there is hardwired time-sensitive code pretty much everywhere. People queuing up for a major Apple store release, after-Christmas clearance sales, or a blockbuster movie premiere have the idea: bring lots of coffee, patience, maybe a chair and a book. The hardwired timeouts in this code are problematic for some normal cases (under contention). Lowering anything will indeed just expose the coded-in limitations even more. Writing low-priority services, and the like, is harder than just changing the assigned priorities. "Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to Live By: The Computer Science of Human Decisions. |
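To make the process-class / thread-level relationship above a bit more concrete, here is a minimal sketch of how the two settings combine into a thread's base priority. The numbers follow Microsoft's published scheduling-priority table as I understand it; the function and dictionary names are invented for this example, and nothing here calls a Windows API.

```python
# Base priority of a thread when its level is "normal", per process priority class.
# (Task Manager's "Low" corresponds to the idle class.)
CLASS_BASE = {
    "idle": 4,
    "below_normal": 6,
    "normal": 8,
    "above_normal": 10,
    "high": 13,
    "realtime": 24,
}

# Relative thread priority levels, expressed as offsets from the class base.
LEVEL_OFFSET = {
    "lowest": -2,
    "below_normal": -1,
    "normal": 0,
    "above_normal": 1,
    "highest": 2,
}


def base_priority(process_class, thread_level):
    """Combine a process priority class and a thread priority level."""
    realtime = process_class == "realtime"
    # The two extreme thread levels saturate instead of using an offset.
    if thread_level == "idle":
        return 16 if realtime else 1
    if thread_level == "time_critical":
        return 31 if realtime else 15
    return CLASS_BASE[process_class] + LEVEL_OFFSET[thread_level]


print(base_priority("normal", "normal"))  # 8
print(base_priority("idle", "normal"))    # 4
```

The point of the sketch is only that the process class sets the absolute level and the thread level nudges it; it says nothing about I/O or memory priority, which are separate settings.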
Richard Haselgrove Joined: 4 Jul 99 Posts: 14141 Credit: 200,643,578 RAC: 874
|
"My concern is that lowering the client thread priority will actually make this problem worse, and then leave lingering collateral damage throughout the ongoing client session until the next startup (from hours for laptops, to weeks for workstations)." I've just been comparing 7.6.9, .15, and .16 for Windows 7/64 (haven't got hold of a .17 yet). The Manager runs at normal priority in all three versions - I think we can discount that. .15 shows the client running at Low priority (according to Task Manager), and .16 at Normal priority - but there's a difference at the thread level. Process Explorer says (values listed for .9 / .15 / .16): App Priority: Normal / Low / Normal. Base Priority: 8 / 4 / 4. Dynamic Priority: 10 / 6 / 6. I/O Priority: Normal / Very Low / Very Low. Memory Priority: 5 / 1 / 1. That's for the worker thread - the one with millions of Cycles Delta. The other threads are in accordance with the main app priorities. |
Jord Joined: 9 Jun 99 Posts: 15170 Credit: 4,362,181 RAC: 3
|
"My concern is that lowering the client thread priority will actually make this problem worse, and then leave lingering collateral damage throughout the ongoing client session until the next startup (from hours for laptops, to weeks for workstations)." I've built 7.6.17 from source; it's back to running the client and Manager at Normal priority, even though it doesn't say that in the change log messages. |
rob smith Joined: 7 Mar 03 Posts: 18752 Credit: 416,307,556 RAC: 380
|
On three of my four the red dot is there for a second or two; on the fourth it is there for maybe thirty seconds on most restarts of BOINC. All four are running Windows 7 and have oodles of RAM. Two of the three "quick starts" are running W7 Pro, the third is running W7 Home; the slow coach is running W7 Pro. Bob Smith Member of Seti PIPPS (Pluto is a Planet Protest Society) Somewhere in the (un)known Universe? |
Richard Haselgrove Joined: 4 Jul 99 Posts: 14141 Credit: 200,643,578 RAC: 874
|
"There are indications (specifically, a 'red dot' on the tray icon, and 'reconnecting to client' in the Manager status bar) that the BOINC client's own initial startup is fighting for resources with Windows - especially Windows 10." Yes, it always starts that way - that is to say that BOINC Manager always starts that way, which is the usual startup routine for "user mode" installations. Service mode starts differently, and doesn't necessarily involve starting the Manager at all. The question which has come up for discussion on the mailing lists is how long it takes for the red dot to disappear (i.e. for the Manager to establish a connection with a running client), whether this is too long in general, whether it is longer for certain computers and/or certain versions of Windows than it needs to be - and whether anything can be done to speed up or disguise the process. |
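For anyone who wants to put a number on "how long until the red dot disappears", a rough way is to time how long it takes before something accepts connections on the client's GUI RPC port. The sketch below is only an illustration under assumptions: 31416 is, as far as I know, the default GUI RPC port (adjust it if yours differs), the helper names are made up, and a successful TCP connect is only a proxy for the Manager being able to talk to the client.

```python
import socket
import time


def port_open(host="127.0.0.1", port=31416, connect_timeout=0.5):
    """Return True if something accepts TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=connect_timeout):
            return True
    except OSError:
        return False


def time_until_ready(probe, timeout=60.0, interval=0.25):
    """Poll probe() until it returns True.

    Returns the number of seconds waited, or None if the timeout expires first.
    """
    start = time.monotonic()
    while True:
        if probe():
            return time.monotonic() - start
        if time.monotonic() - start >= timeout:
            return None
        time.sleep(interval)


if __name__ == "__main__":
    waited = time_until_ready(port_open, timeout=120.0)
    if waited is None:
        print("client did not start listening within the timeout")
    else:
        print(f"client reachable after {waited:.1f} s")
```

Run it right after launching BOINC; comparing the figure across machines or client versions gives something firmer than eyeballing the tray icon.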
Bill G Joined: 1 Jun 01 Posts: 1282 Credit: 187,688,550 RAC: 182
|
"There are indications (specifically, a 'red dot' on the tray icon, and 'reconnecting to client' in the Manager status bar) that the BOINC client's own initial startup is fighting for resources with Windows - especially Windows 10." On a personal note: I see the 'red dot' and the 'reconnecting to client' every time I restart BOINC. This has happened since Windows 7 and is not new to Windows 10 (for me). It may be more evident in Windows 10, but I cannot say that, as it has always been there for me. It shows when I restart BOINC and when I restart Windows 7 or 10. SETI@home classic workunits 4,019 SETI@home classic CPU time 34,348 hours |
William Joined: 14 Feb 13 Posts: 2037 Credit: 17,689,662 RAC: 0 |
"... brings into play terms like 'systems analysis', 'engineering', and even 'theory' - none of which fit comfortably with publishing a rapid-reaction bugfix." In the case of BOINC, taking the bandaids off might have as devastating an effect as taking the wrappings off a mummy. Edit: so the crux is that Windows (10) doesn't play nice (pun intended)? A person who won't read has no advantage over one who can't read. (Mark Twain) |
jason_gee Joined: 24 Nov 06 Posts: 7489 Credit: 91,093,184 RAC: 0
|
"... brings into play terms like 'systems analysis', 'engineering', and even 'theory' - none of which fit comfortably with publishing a rapid-reaction bugfix." *looks out of cave briefly while recovering from a cold*: And possibly throw in a healthy dose of ripping off bandaids to fit the engineered solution(s) in place. Not painless or cost-free. |
Richard Haselgrove Joined: 4 Jul 99 Posts: 14141 Credit: 200,643,578 RAC: 874
|
Apologies in advance, because some of this will only be accessible to those who are following the related "Slow to startup, slow to start running?" boinc_alpha email thread I was alluding to. Sekerob's post at Nov 11 at 9:56 PM is relevant: "Does ol <start_delay>300</start_delay> and the windows service delay" There are issues concerning (both, but separately) BOINC's own startup, and the startup of the science applications under BOINC's control after it has itself started. <start_delay> applies to the second phase only. There are indications (specifically, a 'red dot' on the tray icon, and 'reconnecting to client' in the Manager status bar) that the BOINC client's own initial startup is fighting for resources with Windows - especially Windows 10. My concern is that lowering the client thread priority will actually make this problem worse, and then leave lingering collateral damage throughout the ongoing client session until the next startup (from hours for laptops, to weeks for workstations). Sekerob's "Automatic (Delayed Start)" is ideal (and I've used it myself), but it only applies to service mode, and thus draws blank stares from the portion of the SETI message board readership who use GPUs. 95%? We perhaps need to import tricks from the Linux side of the community, like startup scripts where delays or sequence directives can be included to ensure BOINC starts after the X-server, allowing GPU detection. But that's a feature enhancement, and - as I've already posted this morning - brings into play terms like "systems analysis", "engineering", and even "theory" - none of which fit comfortably with publishing a rapid-reaction bugfix. |
William Joined: 14 Feb 13 Posts: 2037 Credit: 17,689,662 RAC: 0 |
I vaguely remember a setting that tells BOINC how long to wait before starting? RTFM... <start_delay>nseconds</start_delay>: "Specify a number of seconds to delay running applications after client startup." It goes in the options portion of cc_config.xml. It helps especially if there's a lot loading at startup on not-so-fast systems. |
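As a minimal sketch of what that setting looks like on disk, the snippet below builds a bare-bones cc_config.xml body containing only a start_delay option. The helper name is made up for this example, and it deliberately prints the XML rather than writing a file: a real cc_config.xml may already hold other options, which a blind overwrite would lose.

```python
import xml.etree.ElementTree as ET


def build_cc_config(start_delay_seconds):
    """Return a minimal cc_config document with only <start_delay> set."""
    root = ET.Element("cc_config")
    options = ET.SubElement(root, "options")
    ET.SubElement(options, "start_delay").text = str(int(start_delay_seconds))
    return ET.tostring(root, encoding="unicode")


print(build_cc_config(300))
# <cc_config><options><start_delay>300</start_delay></options></cc_config>
```

The 300 here is just the value quoted elsewhere in this thread. The resulting options block would be merged by hand into the cc_config.xml in the BOINC data directory, and the client then needs to re-read its config files or be restarted.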
jason_gee Joined: 24 Nov 06 Posts: 7489 Credit: 91,093,184 RAC: 0
|
For CPU-only tasks, in a *slightly* better implementation for a complex situation, the client might sense the radical overcommit in a similar way to how we do, then take some evasive action. For me, manually, that would be by noticing that the CPU time on a task is only a tiny fraction of elapsed time, and snoozing the client until things settle. A periodic probe would only take a few seconds at most. The same method wouldn't trigger for GPU tasks running alone (or possibly other kinds to come), and something optional/user-configurable, as opposed to one-size-fits-all, seems appropriate, as touched on. I've already mentioned to the devs that the use of fixed magic numbers for file timeouts is problematic. I'm not sure it registered exactly what I was talking about amidst the noise, but I've been gaining a lot of 'faith' in Murphy lately, in that less whining and more patience seems to see the natural order assert itself. [The natural order being that time-sensitive programming breaks on a non-realtime OS, which is why a lot has moved to callbacks/I/O completion ports, away from event/interrupt-driven and synchronous code.] |
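The "CPU time is only a tiny fraction of elapsed" check described above could be sketched roughly as follows. This is not how the BOINC client works; the function name, the 25% threshold and the 30-second minimum window are all assumptions chosen for illustration, and in keeping with the post they are parameters rather than fixed magic numbers.

```python
def looks_starved(cpu_seconds, elapsed_seconds, min_fraction=0.25, min_elapsed=30.0):
    """Guess whether a CPU-only task is being starved of processor time.

    cpu_seconds and elapsed_seconds are the CPU time and wall-clock time
    accumulated over the same probe window. Very short windows are ignored,
    because the ratio is too noisy to mean anything.
    """
    if elapsed_seconds < min_elapsed:
        return False
    return (cpu_seconds / elapsed_seconds) < min_fraction


# A task that got 2 s of CPU in a 60 s window looks starved;
# one that got 55 s in the same window does not.
print(looks_starved(2.0, 60.0))
print(looks_starved(55.0, 60.0))
```

A client (or a user script) could run a probe like this periodically and snooze work while it keeps returning True. As the post notes, the same test would not be meaningful for GPU tasks, whose CPU time is normally a small fraction of elapsed time anyway.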
Richard Haselgrove Joined: 4 Jul 99 Posts: 14141 Credit: 200,643,578 RAC: 874
|
The request came from someone with a laptop who does extreme things like run multiple VMs at startup under Windows 10 - and I guess startup happens more often with a laptop. I've re-quoted "Hard cases make bad law" back at them. |
Bill G Joined: 1 Jun 01 Posts: 1282 Credit: 187,688,550 RAC: 182
|
Personally, I wish there was a way to delay the start of programs waiting to auto-start with the start of Windows. However, this computer runs 24/7, so startup should not have affected it. (It did an upgrade yesterday, which meant I had to reinstall the video drivers before I ran SETI.) |
Richard Haselgrove Joined: 4 Jul 99 Posts: 14141 Credit: 200,643,578 RAC: 874
|
Win10 probably does have something to do with it - I haven't tried it myself, but from what I read, Win10 itself can be enough to over-commit the machine, what with all the i/o and disk access going on, especially at startup. That was why BOINC was asked to reduce its process priority, so that Windows startup didn't get impacted by BOINC getting in the way. But I think they've gone too far, and not allowed BOINC to retain enough resources to do what it needs to do. We'll see. |
Bill G Joined: 1 Jun 01 Posts: 1282 Credit: 187,688,550 RAC: 182
|
Thanks, Richard, for the explanation. I will wait and see about the next BOINC. This computer is just a cruncher, but it does run the latest Windows 10 beta, if that means anything. I have not noticed this error before, as far as I am aware; I just happened to be checking in on an errored WU. I have always run betas on this computer, but it did have a video card upgrade not too long ago. No overclocking, and with the temp in the computer room now running around 4-10C, I do not think cooling had anything to do with it. |
Richard Haselgrove Joined: 4 Jul 99 Posts: 14141 Credit: 200,643,578 RAC: 874
|
It probably means that your computer is (at least marginally) over-committed - trying to do too many things at once, and having to juggle resources around to meet all the demands on it. One known problem is that while the AMD FX(tm)-8320 Eight-Core Processor has eight true CPU cores, it only has four floating-point arithmetic units - so while simple applications run at full speed, complicated mathematical apps like SETI have to wait and share. As for the specific error message: BOINC is fussy about the length of time an application takes to clean up all the housekeeping, release memory, etc. after it finishes. BOINC wants to get busy working on the next task, and if the previous one hangs around and refuses to leave home (like an unwanted teenager....), BOINC just boots it out - no The good news: a forthcoming BOINC update is expected to extend the housekeeping limits, but don't upgrade yet - the current test version (v7.6.15) also fiddles with the working priority of the BOINC client program, and this afternoon I reported three 'finish file present too long' errors on my test machine since loading v7.6.15 four days ago. I don't think that's a coincidence. I also reported a fifty-fold increase, since the upgrade, in the rate of warnings like "11-Nov-2015 22:07:06 [SETI@home] Task 18my11ab.5995.476.8.12.207_0 exited with zero status but no 'finished' file". If BOINC runs at too low a (process/thread) priority, and is hard-pressed on resources anyway, it's more likely that BOINC will fail to notice that its attention is needed to service a heartbeat check in an application, or to do a task cleanup. If this is the first time you've seen the error, it's probably just random bad luck (did two or three tasks all need cleaning up at almost the same time?). If it's repeated, see if the machine is showing signs of stress, like constant hard-disk activity. See what you can do to lighten the load. |
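The "housekeeping limit" idea can be pictured as a small decision function: once the application has signalled that it is finished, the client gives it a grace period to exit before treating the task as failed. The sketch below is purely illustrative and is not BOINC's code; the function name, the return strings and the grace values used in the example are all assumptions.

```python
def finish_file_verdict(seconds_since_finish_file, process_still_running, grace_seconds):
    """Decide what to do with a task that has written its finish file.

    "done"  - the process has exited; collect the result normally
    "wait"  - still cleaning up, but within the grace period
    "abort" - cleanup has overrun the grace period ("present too long")
    """
    if not process_still_running:
        return "done"
    if seconds_since_finish_file <= grace_seconds:
        return "wait"
    return "abort"


# The same slow cleanup (25 s) fails under a tight limit and survives a generous one.
print(finish_file_verdict(25.0, True, grace_seconds=10.0))
print(finish_file_verdict(25.0, True, grace_seconds=300.0))
```

It also shows why lowering the client's priority can make things worse: if the client itself only gets to run this check late, the measured time since the finish file appeared is inflated by the client's own delay, not just by the application's cleanup.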
Bill G Joined: 1 Jun 01 Posts: 1282 Credit: 187,688,550 RAC: 182
|
What does this error mean? One wingmate has finished this WU and his results were the same as mine, except I got this error in my stderr file. The WU is: http://setiathome.berkeley.edu/workunit.php?wuid=1963564702 Mostly just curious whether there is something I am doing wrong. |
©2020 University of California
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.