finish file present too long

Message boards : Number crunching : finish file present too long

jason_gee
Volunteer developer
Volunteer tester

Joined: 24 Nov 06
Posts: 7489
Credit: 91,093,184
RAC: 0
Australia
Message 1744504 - Posted: 24 Nov 2015, 13:20:12 UTC - in response to Message 1744489.
Last modified: 24 Nov 2015, 13:23:31 UTC

Yes, I haven't been following that BOINC dev discussion 100% myself either.

For Windows, you first have the process priority class, then the thread priority level. The first one (process) is absolute, while the thread priority is relative to the process ('normal' thread priority being effectively the same as the process priority). Lower would be a 'more idle' thread and higher a small bump over the regular process one, used for managing multiple threads within the same app.
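As a rough illustration of how the two settings combine, here is a minimal Python sketch of the lookup Windows performs. The numbers are copied from memory of the table in Microsoft's "Scheduling Priorities" documentation, so treat them as illustrative and check the docs before relying on them. The names `LEVELS`, `BASE_PRIORITY` and `base_priority` are made up for this example.

```python
# Thread priority levels, lowest to highest (names are illustrative).
LEVELS = ("idle", "lowest", "below_normal", "normal",
          "above_normal", "highest", "time_critical")

# Base priority per process priority class, one entry per thread level above.
# Values recalled from Microsoft's "Scheduling Priorities" table (assumption).
BASE_PRIORITY = {
    "idle":         (1, 2, 3, 4, 5, 6, 15),
    "below_normal": (1, 4, 5, 6, 7, 8, 15),
    "normal":       (1, 6, 7, 8, 9, 10, 15),
    "above_normal": (1, 8, 9, 10, 11, 12, 15),
    "high":         (1, 11, 12, 13, 14, 15, 15),
    "realtime":     (16, 22, 23, 24, 25, 26, 31),
}


def base_priority(process_class: str, thread_level: str = "normal") -> int:
    """Look up the base priority for a process class and thread level."""
    return BASE_PRIORITY[process_class][LEVELS.index(thread_level)]


print(base_priority("normal"))            # normal class, normal thread
print(base_priority("idle"))              # 'Low' in Task Manager, normal thread
print(base_priority("normal", "lowest"))  # normal class, lowered thread
```

With these values a normal thread in a Normal process sits at base priority 8, and the same thread in an Idle ("Low") process sits at 4.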

For this particular client and GUI, messing with lowering any priorities (process, thread, I/O or memory) isn't likely to work well, since there is hardwired time-sensitive code pretty much everywhere.

People queuing up for a major Apple store release, after-Christmas clearance sales, or a blockbuster movie premiere have the idea: bring lots of coffee, patience, maybe a chair and a book.

The hardwired timeouts in this code are problematic for some normal cases (under contention). Lowering anything will indeed just expose the coded-in limitations even more.

Writing low priority services, and the like, is harder than just changing the assigned priorities.
"Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions.
ID: 1744504
Richard Haselgrove Project Donor
Volunteer tester

Joined: 4 Jul 99
Posts: 14141
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1744489 - Posted: 24 Nov 2015, 11:18:02 UTC - in response to Message 1744385.

My concern is that lowering the client thread priority will actually make this problem worse, and then leave lingering collateral damage throughout the ongoing client session until the next startup (from hours for laptops, to weeks for workstations).

I've built 7.6.17 from source, it's back to running the client and manager at Normal priority, even though it doesn't say that in the change log messages.

I've just been comparing 7.6.9, .15, and .16 for Windows 7/64 (haven't got hold of a .17 yet).

The Manager runs at normal priority in all three versions - I think we can discount that.

.15 shows the client running at Low priority (according to Task Manager), and .16 at Normal priority - but there's a difference at the thread level. Process Explorer says:

                    .9        .15         .16
                    --        ---         ---
App Priority        Normal    Low         Normal
Base Priority       8         4           4
Dynamic Priority    10        6           6
I/O Priority        Normal    Very Low    Very Low
Memory Priority     5         1           1

That's for the worker thread - the one with millions of Cycles Delta. The other threads are in accordance with the main app priorities.
ID: 1744489
Jord
Volunteer tester

Joined: 9 Jun 99
Posts: 15170
Credit: 4,362,181
RAC: 3
Netherlands
Message 1744385 - Posted: 23 Nov 2015, 21:00:59 UTC - in response to Message 1742528.

My concern is that lowering the client thread priority will actually make this problem worse, and then leave lingering collateral damage throughout the ongoing client session until the next startup (from hours for laptops, to weeks for workstations).

I've built 7.6.17 from source, it's back to running the client and manager at Normal priority, even though it doesn't say that in the change log messages.
ID: 1744385
rob smith Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor
Volunteer moderator
Volunteer tester

Joined: 7 Mar 03
Posts: 18752
Credit: 416,307,556
RAC: 380
United Kingdom
Message 1742603 - Posted: 16 Nov 2015, 17:47:04 UTC

On three of my four the red dot is there for a second or two; on the fourth it is there for maybe thirty seconds on most restarts of BOINC. All four are running Windows 7 and have oodles of RAM. Two of the three "quick starts" are running W7 Pro, the third is running W7 Home, and the slowcoach is running W7 Pro.
Bob Smith
Member of Seti PIPPS (Pluto is a Planet Protest Society)
Somewhere in the (un)known Universe?
ID: 1742603
Richard Haselgrove Project Donor
Volunteer tester

Joined: 4 Jul 99
Posts: 14141
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1742567 - Posted: 16 Nov 2015, 15:40:35 UTC - in response to Message 1742538.

There are indications (specifically, a 'red dot' on the tray icon, and 'reconnecting to client' in the Manager status bar) that the BOINC client's own initial startup is fighting for resources with Windows - especially Windows 10.

On a personal note: I see the 'red dot' and the 'reconnecting to client' every time I restart BOINC. This has happened since Windows 7 and is not new to Windows 10 (for me). It may be more evident in Windows 10 but I can not say that as it has always been there for me. It shows when I restart BOINC and when I restart Windows 7 or 10.

Yes, it always starts that way - that is to say that BOINC Manager always starts that way, which is the usual startup routine for "user mode" installations. Service mode starts differently, and doesn't necessarily involve starting the Manager at all.

The question which has come up for discussion on the mailing lists is how long does it take for the red dot to disappear (i.e. for the Manager to establish connection with a running client), whether this is too long in general, and whether it is longer for certain computers, and/or certain versions of Windows, than it needs to be - and whether anything can be done to speed up or disguise the process.
ID: 1742567
Bill G Special Project $75 donor

Joined: 1 Jun 01
Posts: 1282
Credit: 187,688,550
RAC: 182
United States
Message 1742538 - Posted: 16 Nov 2015, 14:08:27 UTC - in response to Message 1742528.

There are indications (specifically, a 'red dot' on the tray icon, and 'reconnecting to client' in the Manager status bar) that the BOINC client's own initial startup is fighting for resources with Windows - especially Windows 10.

On a personal note: I see the 'red dot' and the 'reconnecting to client' every time I restart BOINC. This has happened since Windows 7 and is not new to Windows 10 (for me). It may be more evident in Windows 10 but I can not say that as it has always been there for me. It shows when I restart BOINC and when I restart Windows 7 or 10.

SETI@home classic workunits 4,019
SETI@home classic CPU time 34,348 hours
ID: 1742538
William
Volunteer tester

Joined: 14 Feb 13
Posts: 2037
Credit: 17,689,662
RAC: 0
Message 1742536 - Posted: 16 Nov 2015, 13:42:38 UTC - in response to Message 1742534.
Last modified: 16 Nov 2015, 13:43:36 UTC

... brings into play terms like "systems analysis", "engineering", and even "theory" - none of which fit comfortably with publishing a rapid-reaction bugfix.


*looks out of cave briefly while recovering from a cold*:
And possibly throw in a healthy dose of ripping off bandaids to fit the engineered solution(s) in place. Not painless or cost free.

In the case of BOINC, taking the bandaids off might have as devastating an effect as taking the wrappings off a mummy.

edit: so the crux is Windows (10) doesn't play nice (pun intended)?
A person who won't read has no advantage over one who can't read. (Mark Twain)
ID: 1742536
jason_gee
Volunteer developer
Volunteer tester

Joined: 24 Nov 06
Posts: 7489
Credit: 91,093,184
RAC: 0
Australia
Message 1742534 - Posted: 16 Nov 2015, 13:33:57 UTC - in response to Message 1742528.

... brings into play terms like "systems analysis", "engineering", and even "theory" - none of which fit comfortably with publishing a rapid-reaction bugfix.


*looks out of cave briefly while recovering from a cold*:
And possibly throw in a healthy dose of ripping off bandaids to fit the engineered solution(s) in place. Not painless or cost free.
"Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions.
ID: 1742534
Richard Haselgrove Project Donor
Volunteer tester

Joined: 4 Jul 99
Posts: 14141
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1742528 - Posted: 16 Nov 2015, 12:02:23 UTC - in response to Message 1742511.

Apologies in advance, because some of this will only be accessible to those who are following the related "Slow to startup, slow to start running?" boinc_alpha email thread I was alluding to.

Sekerob's post on Nov 11 at 9:56 PM is relevant:

Does ol <start_delay>300</start_delay> and the windows service delay
function [if installed as service] "Automatic (Delayed Start)" not fit
the bill to take care of sluggish starting? (It does for me!).

There are issues concerning (both, but separately) BOINC's own startup, and the startup of the science applications under BOINC's control after it has itself started. <start_delay> applies to the second phase only.

There are indications (specifically, a 'red dot' on the tray icon, and 'reconnecting to client' in the Manager status bar) that the BOINC client's own initial startup is fighting for resources with Windows - especially Windows 10. My concern is that lowering the client thread priority will actually make this problem worse, and then leave lingering collateral damage throughout the ongoing client session until the next startup (from hours for laptops, to weeks for workstations).

Sekerob's "Automatic (Delayed Start)" is ideal (and I've used it myself), but it only applies to service mode, and thus draws blank stares from the portion of the SETI message board readership who use GPUs. 95%?

We perhaps need to import tricks from the Linux side of the community, like startup scripts where delays or sequence directives can be included to ensure BOINC starts after the X-server, allowing GPU detection. But that's a feature enhancement, and - as I've already posted this morning - brings into play terms like "systems analysis", "engineering", and even "theory" - none of which fit comfortably with publishing a rapid-reaction bugfix.
ID: 1742528
William
Volunteer tester

Joined: 14 Feb 13
Posts: 2037
Credit: 17,689,662
RAC: 0
Message 1742511 - Posted: 16 Nov 2015, 9:55:03 UTC

I vaguely remember a setting that tells boinc how long to wait before starting?
RTFM...

<start_delay>nseconds</start_delay>
Specify a number of seconds to delay running applications after client startup.

Options portion of cc_config

Helps especially if there's a lot loading at startup on not so fast systems.
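
For anyone who would rather generate the file than hand-edit it, here is a minimal Python sketch that builds a cc_config document containing only this option. The function name `build_cc_config` is made up for the example, and the 300-second value is simply the figure mentioned elsewhere in this thread. The sketch only produces the XML text; saving it as cc_config.xml in the BOINC data directory and restarting the client (or re-reading config files) is left to you.

```python
import xml.etree.ElementTree as ET


def build_cc_config(start_delay_seconds: int) -> str:
    """Return cc_config XML with a single <start_delay> option."""
    root = ET.Element("cc_config")
    options = ET.SubElement(root, "options")
    ET.SubElement(options, "start_delay").text = str(start_delay_seconds)
    return ET.tostring(root, encoding="unicode")


# Delay running applications for five minutes after client startup.
print(build_cc_config(300))
```

Note that this builds a fresh document. If you already have a cc_config.xml with other options or log flags, merge the `<start_delay>` element into it rather than overwriting the file.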
A person who won't read has no advantage over one who can't read. (Mark Twain)
ID: 1742511
jason_gee
Volunteer developer
Volunteer tester

Joined: 24 Nov 06
Posts: 7489
Credit: 91,093,184
RAC: 0
Australia
Message 1742400 - Posted: 16 Nov 2015, 0:48:58 UTC - in response to Message 1742389.
Last modified: 16 Nov 2015, 1:16:27 UTC

For CPU-only tasks, in a *slightly* better implementation for a complex situation, the client might sense the radical overcommit in a similar way to how we do, then take some evasive action.

For me manually that would be by noticing the CPU time on a task is only a tiny fraction of elapsed, and snoozing the client until settled. A periodic probe would only take a few seconds at most. The same method wouldn't trigger for GPU tasks running alone (or possibly other kinds to come), and something optional/user-configurable as opposed to one-size-fits-all seems appropriate as touched on.
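
The manual check described above could be sketched roughly like this in Python. This is not how the BOINC client works. `TaskSample`, `looks_overcommitted` and both thresholds are hypothetical names and values picked for illustration. The idea is only that if every CPU task old enough to judge has received a tiny fraction of its elapsed time as CPU time, the machine is probably overcommitted and snoozing would be reasonable.

```python
from typing import Iterable, NamedTuple


class TaskSample(NamedTuple):
    """One probe of a running CPU-only task (hypothetical structure)."""
    cpu_seconds: float
    elapsed_seconds: float


def looks_overcommitted(samples: Iterable[TaskSample],
                        min_ratio: float = 0.2,
                        min_elapsed: float = 5.0) -> bool:
    """True if every task old enough to judge got under min_ratio of wall time."""
    eligible = [s for s in samples if s.elapsed_seconds >= min_elapsed]
    if not eligible:
        # Nothing has run long enough to say anything either way.
        return False
    return all(s.cpu_seconds / s.elapsed_seconds < min_ratio for s in eligible)


# A task that got 1 s of CPU in 60 s of wall time suggests heavy contention.
print(looks_overcommitted([TaskSample(1.0, 60.0)]))
# A task that got 55 s of CPU in 60 s is running normally.
print(looks_overcommitted([TaskSample(55.0, 60.0)]))
```

GPU tasks would have to be left out of the samples, since low CPU time is normal for them, and the thresholds would need to be user-configurable for the reasons given above.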

I've mentioned to the devs before that the use of fixed magic numbers for file timeouts is problematic. I'm not sure it registered exactly what I was talking about amidst the noise, but I've been gaining a lot of 'faith' in Murphy lately, in that less whining and more patience seems to see the natural order assert itself. [The natural order being that time-sensitive programming breaks on a non-realtime OS, which is why a lot has moved to callbacks/IO-completion ports, away from event/interrupt-driven and synchronous code.]
"Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions.
ID: 1742400
Richard Haselgrove Project Donor
Volunteer tester

Joined: 4 Jul 99
Posts: 14141
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1742389 - Posted: 15 Nov 2015, 23:48:42 UTC - in response to Message 1742385.

The request came from someone with a laptop who does extreme things like run multiple VMs at startup under Windows 10 - and I guess startup happens more often with a laptop. I've re-quoted "Hard cases make bad law" back at them.
ID: 1742389
Bill G Special Project $75 donor

Joined: 1 Jun 01
Posts: 1282
Credit: 187,688,550
RAC: 182
United States
Message 1742385 - Posted: 15 Nov 2015, 23:33:09 UTC - in response to Message 1742368.
Last modified: 15 Nov 2015, 23:48:35 UTC

Personally I wish there was a way to delay the Start of programs waiting to auto start with the start of Windows.

However this computer runs 24/7 so startup should not have affected it. (It did an upgrade yesterday which means I had to reinstall the video drivers before I ran SETI)

SETI@home classic workunits 4,019
SETI@home classic CPU time 34,348 hours
ID: 1742385
Richard Haselgrove Project Donor
Volunteer tester

Joined: 4 Jul 99
Posts: 14141
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1742368 - Posted: 15 Nov 2015, 22:05:26 UTC - in response to Message 1742362.

Win10 probably does have something to do with it - I haven't tried it myself, but from what I read, Win10 itself can be enough to over-commit the machine, what with all the i/o and disk access going on, especially at startup.

That was why BOINC was asked to reduce its process priority, so that Windows startup didn't get impacted by BOINC getting in the way. But I think they've gone too far, and not allowed BOINC to retain enough resources to do what it needs to do. We'll see.
ID: 1742368
Bill G Special Project $75 donor

Joined: 1 Jun 01
Posts: 1282
Credit: 187,688,550
RAC: 182
United States
Message 1742362 - Posted: 15 Nov 2015, 21:46:45 UTC - in response to Message 1742351.

Thanks Richard for the explanation. I will wait and see about the next BOINC.

This computer is just a cruncher, but it does run the latest Windows10 beta if that means anything. I have not noticed this error before that I am aware of, just happened to be checking in on an errored WU. I have always run betas on this computer, but it did have a video card upgrade not too long ago. No overclocking and with the temp in the computer room now running around 4-10C I do not think cooling had anything to do with it.

SETI@home classic workunits 4,019
SETI@home classic CPU time 34,348 hours
ID: 1742362
Richard Haselgrove Project Donor
Volunteer tester

Joined: 4 Jul 99
Posts: 14141
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1742351 - Posted: 15 Nov 2015, 21:02:40 UTC - in response to Message 1742344.

It probably means that your computer is (at least marginally) over-committed - trying to do too many things at once, and having to juggle resources around to meet all the demands on it.

One known problem is that while the AMD FX(tm)-8320 Eight-Core Processor has eight true cpu cores, it only has four floating-point arithmetic units - so while simple applications run at full speed, complicated mathematical apps like SETI have to wait and share.

The specific error message - BOINC is fussy about the length of time an application takes to clean up all the housekeeping, release memory, etc. etc. after it finishes. BOINC wants to get busy and working on the next task, and if the previous one hangs around and refuses to leave home (like an unwanted teenager....), BOINC just boots it out - no supper credits for you, my son.
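
To make the check concrete, here is a minimal Python sketch of the decision as described in this post. It is not BOINC's code. The function name, its parameters and the `grace_seconds` limit are all hypothetical, and the real limit is deliberately not stated here. The two failure strings echo the messages quoted in this thread.

```python
from typing import Optional


def finish_file_verdict(finish_seen_at: Optional[float],
                        process_alive: bool,
                        now: float,
                        grace_seconds: float) -> str:
    """Classify a task from its finish-file timestamp and process state.

    finish_seen_at is when the client first saw the finish file,
    or None if the file has not appeared yet.
    """
    if finish_seen_at is None:
        # No finish file: either still crunching, or it exited without one.
        return "running" if process_alive else "exited without finish file"
    if not process_alive:
        # Finish file written and the process has gone away: clean exit.
        return "finished"
    if now - finish_seen_at > grace_seconds:
        # Housekeeping took longer than the client is willing to wait.
        return "finish file present too long"
    return "waiting for exit"


# Finish file seen at t=100; the process is still alive at t=115 with a 10 s limit.
print(finish_file_verdict(100.0, True, 115.0, grace_seconds=10.0))
```

The sketch shows why a fixed limit is fragile. The verdict depends only on wall-clock time, so anything that delays either the application's cleanup or the client's own polling pushes an otherwise good task over the limit.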

The good news: a forthcoming BOINC update is expected to extend the housekeeping limits, but don't upgrade yet - the current test version (v7.6.15) also fiddles with the working priority of the BOINC client program, and this afternoon I reported three 'finish file present too long' errors on my test machine since loading v7.6.15 four days ago. I don't think that's a coincidence. I also reported a fifty-fold increase in the rate of

11-Nov-2015 22:07:06 [SETI@home] Task 18my11ab.5995.476.8.12.207_0 exited with zero status but no 'finished' file
11-Nov-2015 22:07:06 [SETI@home] If this happens repeatedly you may need to reset the project.

warnings since the upgrade. If BOINC runs at too low a (process/thread) priority, and is hard-pressed on resources anyway, it's more likely that BOINC will fail to notice that its attention is needed to service a heartbeat check in an application, or do a task cleanup.

If this is the first time you've seen the error - probably just random bad luck (did two or three tasks all need cleaning up at almost the same time?). If it's repeated - see if the machine is showing signs of stress, like constant hard-disk activity. See what you can do to lighten the load.
ID: 1742351
Bill G Special Project $75 donor

Joined: 1 Jun 01
Posts: 1282
Credit: 187,688,550
RAC: 182
United States
Message 1742344 - Posted: 15 Nov 2015, 20:28:55 UTC

What does this error mean? One wingmate has finished this WU and his results were the same as mine, except I got this error in my stderr file.
The WU is: http://setiathome.berkeley.edu/workunit.php?wuid=1963564702

Mostly just curious if there is something I am doing wrong.

SETI@home classic workunits 4,019
SETI@home classic CPU time 34,348 hours
ID: 1742344
 
©2020 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.