Message boards :
Number crunching :
Optimum Cache Size?
| Author | Message |
|---|---|
|
Jim Wilkins Joined: 11 Oct 99 Posts: 70 Credit: 1,658,376 RAC: 0
|
I have 2 medium-fast machines. Each one runs around 7-8 projects. My cache size is set to zero and I always have work, albeit maybe not SETI work. Works for me. Jim |
gizbar Joined: 7 Jan 01 Posts: 586 Credit: 21,087,774 RAC: 0
|
I am in awe of the programmers. I dabbled with Basic enough to get my Computer Sciences 'O' level (in UK) over 20 years ago. I know what I want to achieve, but normally can't make it do what I want. Back on topic, I let Boinc do the managing. If I come across a problem, I try to let enough people know so that a workaround or fix can be engineered. Normally the problem has already been seen or dealt with. My cache, I believe, is set to 7 days. It's been that long since I checked. regards, Gizbar. A proud GPU User Server Donor! |
|
1mp0£173 Joined: 3 Apr 99 Posts: 8423 Credit: 356,897 RAC: 0
|
"Figure it out. You are the programmers, not I......." People say this like programming is easy. It isn't. First one has to figure out the exact cause. It could be the RPCs; I haven't done the analysis. Makes sense. Then the question is: do you need a new RPC, or do you need a smarter way to use the RPCs you have? Then it's time to write the code. Then there is testing. For each hour of coding, 5 to 10 hours of "bench testing" isn't unusual. If the fix doesn't work, you have to play again. |
Keith T. Joined: 23 Aug 99 Posts: 919 Credit: 537,293 RAC: 20
|
My car speedo goes up to 160MPH, should I drive it at that speed? BOINC allows 10 days cache, should I use it? I usually drive within, or maybe 5 or 10 over the limit. I usually run 2 or 3 days cache. |
kittyman Joined: 9 Jul 00 Posts: 50494 Credit: 1,018,363,574 RAC: 2,276
|
It's a basic flaw in the way Boinc works. Not a simple one-line cockup. I am not the only one who has documented this. And will not be the last. BM has to be fixed. There are far too many RPCs to handle...... It should not have to check every WU every freakin' second to make sure all is well. "Learn from yesterday. Live for today. Hope for tomorrow." Albert Einstein "With cats." kittyman
|
|
Nick Joined: 16 Oct 09 Posts: 81 Credit: 112,909 RAC: 0
|
I guess the testers did not do a stress test on a machine with a Cuda card and the cache set on a high setting? Aren't problems like this usually caught in the testing phase of the project's life cycle? The programmers usually code the thing and do some basic testing. Unless they are experts in testing, problems do and will creep through... |
kittyman Joined: 9 Jul 00 Posts: 50494 Credit: 1,018,363,574 RAC: 2,276
|
So anybody tell the Boinc devs about their 'task handling' problem? Because I CAN........LOL. No, I did not intend for it to get that out of hand.........at all. The rescheduler simply tweaked something in Boinc that made it go rogue. And without me asking it to........it downloaded many thousands of bits of work....which, I might add, it will probably turn around within the allotted timeframe. But my point was....... BOINC SHOULD BE ABLE TO HANDLE IT. The simple fact that the sucker downloaded WAY too much work should not bring the show to its knees. There is a problem with Boinc, and I simply found the flaw. "Learn from yesterday. Live for today. Hope for tomorrow." Albert Einstein "With cats." kittyman
|
Fred J. Verster Joined: 21 Apr 04 Posts: 3252 Credit: 31,903,643 RAC: 0
|
Hi, since the use of CUDA, it seems logical, IMHO, to up your cache by a few days. For over half a year, I used the setting "Maintain enough work for an additional 5 days" (enforced by version 5.10+). It strongly depends on your hardware; that's why I added 2 days to my 3-day cache. You just have to try, for instance by starting with 2 days and upping it slowly if you do run out of work. Once you've set it, let BOINC adapt to it, which takes some time, about a week.
|
Raistmer Joined: 16 Jun 01 Posts: 6242 Credit: 106,370,077 RAC: 275
|
"If ten days cause issues, why can you set for 10 days?" Because they cause issues not for all hosts, only for high-performance ones. And nobody has had time to invent an algorithm to account for this and change the upper limit for each host individually. Each user has their own head to foresee the consequences of his/her actions. 10 days is not the default value. If not sure, use the default; if you change the default, think and learn about the consequences. |
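As a rough illustration of the kind of per-host limit Raistmer describes, here is a minimal Python sketch. It is not how BOINC works: the function name `max_cache_days` is hypothetical, and the 4,000-task ceiling is only an assumption taken from reports elsewhere in this thread of trouble above roughly 4k queued tasks.

```python
def max_cache_days(tasks_per_day: float,
                   task_limit: int = 4000,
                   hard_cap_days: float = 10.0) -> float:
    """Largest cache (in days) that keeps the queued task count under task_limit.

    task_limit is an assumed ceiling, not a documented BOINC value.
    hard_cap_days is the 10-day maximum mentioned in the thread.
    """
    if tasks_per_day <= 0:
        # A host with no measurable throughput is only bound by the hard cap.
        return hard_cap_days
    return min(hard_cap_days, task_limit / tasks_per_day)


# A slow host keeps the full 10 days; a fast GPU host is capped much lower.
print(max_cache_days(13))    # 10.0
print(max_cache_days(800))   # 5.0
```

With a rule like this, the upper limit shrinks only for the high-performance hosts that actually hit the problem.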
James Sotherden Joined: 16 May 99 Posts: 10436 Credit: 110,373,059 RAC: 123
|
I run a 3/4 day cache. I find I can get by most small outages. I run all 3 of my crunchers 24/7 also. If I run out of work there's always the need to blow the dust bunnies out, and then power the things down for a rest. Or crunch another project if you desire. My Mac shares SETI with MilkyWay. That's just my two cents. Old James |
|
Nick Joined: 16 Oct 09 Posts: 81 Credit: 112,909 RAC: 0
|
"If ten days cause issues, why can you set for 10 days?" If you are running a modest machine without a Cuda GPU, a 10 day cache would give you, say in the case of my laptop with the GPU turned off, about 130 80-credit WUs (with an optimizer app). BOINC can handle this without breaking a sweat. Now, my E8400 desktop with its mid-range Cuda GPU would load about 900 WUs. Still not too bad, but I'd hate to have that kind of responsibility sitting on my HD in case I have a hardware failure (it does happen). A super cruncher like the one Sutaru Tsureku runs: http://setiathome.berkeley.edu/show_host_detail.php?hostid=4789793 would probably give BOINC issues if it was set to 10 days... |
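The arithmetic in Nick's post can be sketched in a few lines of Python. This assumes the task count scales linearly with the cache setting, which is a simplification; the helper names are made up for illustration, and the only data used are the 130 and 900 WU figures quoted above.

```python
def tasks_per_day(tasks_at_setting: int, cache_days: float) -> float:
    """Average daily throughput implied by a cache size and its task count."""
    return tasks_at_setting / cache_days


def estimate_cached_tasks(rate_per_day: float, cache_days: float) -> int:
    """Tasks a host would hold for a given cache setting, assuming linear scaling."""
    return round(rate_per_day * cache_days)


laptop_rate = tasks_per_day(130, 10)    # 13 WUs/day, CPU only
desktop_rate = tasks_per_day(900, 10)   # 90 WUs/day, CPU plus mid-range GPU

for days in (1, 3, 10):
    print(days,
          estimate_cached_tasks(laptop_rate, days),
          estimate_cached_tasks(desktop_rate, days))
```

On these figures a 3 day cache is about 39 WUs on the laptop and about 270 on the desktop, so the same setting means very different queue sizes on different hardware.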
Geek@Play Joined: 31 Jul 01 Posts: 2467 Credit: 86,146,931 RAC: 0
|
"So anybody tell the Boinc devs about their 'task handling' problem?" Mark..... Why are you punishing your machines with an excessive cache? It has been known for some time that excessive cache size results in Boinc.exe constantly using CPU time creating updates for the client_state.xml file. This draws CPU resources away from the science app and, as you say, makes the system unstable. At this time my main machine has an uptime of 2 days and 2 hours. Boinc.exe has used 0:12:58 of CPU time in that same time. This is all according to the task manager. I run a 4 day cache and this computer has about 1100 work units in its cache for both the CPU and GPU. I tried a 5 day cache and immediately saw CPU time for Boinc.exe go up, so I reduced it back to a 4 day cache, which I consider optimal. It's long enough to cover a holiday weekend, what else do you need? Boinc....Boinc....Boinc....Boinc.... |
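To put Geek@Play's Task Manager numbers in proportion, the following Python snippet works out what share of one core's time the client used. It only uses the two figures quoted in the post (0:12:58 of CPU time over 2 days 2 hours of uptime); the function name is hypothetical.

```python
from datetime import timedelta


def client_overhead(cpu_time: timedelta, uptime: timedelta) -> float:
    """Fraction of one core's wall-clock time consumed by the client process."""
    return cpu_time / uptime


overhead = client_overhead(
    cpu_time=timedelta(minutes=12, seconds=58),  # Boinc.exe CPU time
    uptime=timedelta(days=2, hours=2),           # machine uptime
)
print(f"{overhead:.2%}")  # about 0.43% of one core
```

Repeating the same measurement after changing the cache size gives a simple way to compare settings on a given host.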
|
Nick Joined: 16 Oct 09 Posts: 81 Credit: 112,909 RAC: 0
|
Holy ET! Came back after Smallville and the thread has lots of great input. Decided to go with 3 days since my GTS250 is crunching away at over 4 times the speed of my E8400. Thanks everyone for the input... |
|
Dena Wiltsie Joined: 19 Apr 01 Posts: 1626 Credit: 24,230,968 RAC: 59
|
Setting ten days would not cause me problems because I could never do more than 48 a day, and most of the time it would be far less. It's the people with super crunchers that have the problems. |
|
Grant (SSSF) Joined: 19 Aug 99 Posts: 12988 Credit: 208,696,464 RAC: 690
|
"So anybody tell the Boinc devs about their 'task handling' problem?" They know of it (along with many other issues). The easiest fix would be to reduce the maximum cache size people can select. Problem solved. Is that what you want? From posts in other threads, they are working on a workaround, but as it involves some significant changes in how things are done, it's not an easy problem to resolve without causing other problems in the process. "The manager should be able to handle it without pulling a rope around the nuts of the system it is being hosted on.......this is pure BS." Take a few deep breaths. Back in the early days of computing, total memory was less than a few kB. Hundreds of kB just wasn't possible to imagine, let alone MBs or GBs of system RAM. Likewise, the thought that 10 days' worth of work would involve 1,000s of work units and not just a few hundred just didn't enter into any calculations until the GPU client came along. Grant Darwin NT |
j mercer Joined: 3 Jun 99 Posts: 2408 Credit: 12,323,733 RAC: 1
|
If ten days cause issues, why can you set for 10 days? ...
|
kittyman Joined: 9 Jul 00 Posts: 50494 Credit: 1,018,363,574 RAC: 2,276
|
So anybody tell the Boinc devs about their 'task handling' problem? Or is their only response to 'not carry such a large cache'? Sorry excuse for such bad manager behaviour. The sucker DOES crash itself trying to keep track of over 4-5k WUs............................... It should be bulletproof, dudes. And don't mind telling me what kind of cache I should carry.......that is NOT the freakin' issue........your manager is. 1000, 2000, 9000.... The manager should be able to handle it without pulling a rope around the nuts of the system it is being hosted on.......this is pure BS. Figure it out. You are the programmers, not I....... I only report reality.......and this just bites. "Learn from yesterday. Live for today. Hope for tomorrow." Albert Einstein "With cats." kittyman
|
Sutaru Tsureku Joined: 6 Apr 07 Posts: 7105 Credit: 147,663,825 RAC: 12
|
From my experiences.. 1,500+ tasks and you have one CPU-Core only for boinc.exe (BOINC client). Windows TaskManager. This is a ~ 3 day WU cache on my GPU cruncher.
|
kittyman Joined: 9 Jul 00 Posts: 50494 Credit: 1,018,363,574 RAC: 2,276
|
Oh Jesus.......let me tell you about this.... Boinc just sux at handling massive caches. My top rig just went through a phase..... Used the 'rescheduler' one too many times, and Boinc went Boinczerk...... It tried to download over 4000 tasks in a couple of days......and then hell struck. There was so much boinc.exe traffic that it could hardly even manage freakin' comms........got errors on almost every try it made.....so it could hardly clear things.......90% of CPU time was just Boinc.exe.........no real crunching getting done.........nasty. Had thousands of WUs it was trying to upload and download........and couldn't get a word in edgewise against the boinc.exe that was taking full control of the CPU. I finally set the master control (took me a loooooooong time to manage to even get there.......) to no new tasks........and it took another 2 days to download, 1 by 1, the tasks it had already been assigned by the scheduler. Very nasty situation....... If you go over 4k tasks, you risk meltdown............ BabyK had over 8000........it is now less than 6000, and things are starting to flow normally again. So I would say anything more than 4k WUs.......you might be in trouble. "Learn from yesterday. Live for today. Hope for tomorrow." Albert Einstein "With cats." kittyman
|
Pappa Joined: 9 Jan 00 Posts: 2562 Credit: 12,301,681 RAC: 0
|
Welcome. Before anyone tells you to set 10 days or your Cuda cards will starve, there are some things to know. With the more recent versions of Boinc you can set a low "connect to" interval (how often the machine phones home, in your preferences) and in a file called "cc_config.xml" you can tell it to maintain work for xx.xx days. If you just set it for xx.xx days then it causes other problems (pending credits, when it actually requests work, and more). So a low connect time coupled with the caching effect gets past a few obstacles. Your computer asks for work when it needs it. So with the "newer versions" of Boinc it is actually attempting to request what is needed for the hardware resource. Regards Please consider a Donation to the Seti Project. |
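For readers who want to see what the two settings Pappa describes look like on disk, here is a small Python sketch that builds the XML. One caveat: the post names "cc_config.xml", but to the best of my knowledge the BOINC client reads these local values from `global_prefs_override.xml`, using the elements `work_buf_min_days` (the connect interval) and `work_buf_additional_days` (the extra work buffer). Treat the file name and the example values (0.1 and 3.0 days) as assumptions and check them against the documentation for your client version.

```python
import xml.etree.ElementTree as ET


def build_override(connect_interval_days: float, additional_days: float) -> str:
    """Build a minimal preferences override with a low connect interval
    and a separate additional work buffer (both in days)."""
    root = ET.Element("global_preferences")
    # How often the client expects to contact the project.
    ET.SubElement(root, "work_buf_min_days").text = f"{connect_interval_days:.2f}"
    # Extra work to keep on hand beyond the connect interval.
    ET.SubElement(root, "work_buf_additional_days").text = f"{additional_days:.2f}"
    return ET.tostring(root, encoding="unicode")


# Hypothetical values: connect often, keep about 3 days of extra work.
print(build_override(0.1, 3.0))
```

The same two values can be set through the computing preferences on the project web site or in the BOINC Manager, which avoids editing files by hand.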
©2020 University of California
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.