Message boards : Number crunching : GeForce 260 question
Author | Message |
---|---|
zpm Send message Joined: 25 Apr 08 Posts: 284 Credit: 1,659,024 RAC: 0 |
The only other thing that would match an i7 would be a dual i7 or a 5500-series processor setup, with a lot of GPUs. |
Westsail and *Pyxey* Send message Joined: 26 Jul 99 Posts: 338 Credit: 20,544,999 RAC: 0 |
The problem I had was that boinc.exe takes nearly a whole core when a couple-day cache is run on the multi-295 rigs. While getting the flops and cache settings dialed in I once had nearly 1000 tasks in cache; boinc.exe used so much CPU that the manager became unable to function and the computer lagged hard. (What is the default process priority for boinc.exe?) 'Responsiveness' is at 95% now; it would sit at 5-20% with boinc.exe pegged at >~800 tasks. I have since got it to only fetch about 200-400 tasks at a time and this makes everything work.. well.. like a computer lol "The most exciting phrase to hear in science, the one that heralds new discoveries, is not Eureka! (I found it!) but rather, 'hmm... that's funny...'" -- Isaac Asimov |
zpm Send message Joined: 25 Apr 08 Posts: 284 Credit: 1,659,024 RAC: 0 |
The best cache I've found on my quad is no more than 600 tasks listed in the BOINC Manager. |
Questor Send message Joined: 3 Sep 04 Posts: 471 Credit: 230,506,401 RAC: 157 |
Does the "Write to disk at most every '60' seconds" setting have any impact on this? Does boinc.exe hit a magic number of tasks whereby it spends all its time making updates and comes back round to the beginning again with no free time? (Where # tasks > ~600.) What happens if it can't complete all the updates before the default 60 seconds is up - some sort of gridlock? GPU Users Group |
MarkJ Send message Joined: 17 Feb 08 Posts: 1139 Credit: 80,854,192 RAC: 5 |
Some versions back there were a couple of changes to help reduce the BOINC overhead: 1. It no longer writes everything to client_state (there are now some separate files flying around). 2. The update frequency was reduced to something like checkpoint interval x number of cores. Were your i7 observations made using a fairly recent BOINC client, or was it some versions ago? BOINC blog |
OzzFan Send message Joined: 9 Apr 02 Posts: 15691 Credit: 84,761,841 RAC: 28 |
Does the "Write to disk at most every '60' seconds" setting have any impact on this? Does boinc.exe hit a magic number of tasks whereby it spends all its time making updates and comes back round to the beginning again with no free time? (Where # tasks > ~600.) As I understand it, the "write to disk" setting was modified in BOINC v6.6.x so that it spreads the writes around based upon the number of cores. For example, a quad-core system with the default "write to" setting will take a total of 4 minutes for all four applications to save to disk. An 8-core system will take a total of 8 minutes for all eight to write to disk with the default setting. |
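A rough sketch of the staggered write behaviour described above: with the default 60-second interval spread across N running tasks, each task checkpoints 60 seconds after the previous one, so a full cycle over all tasks takes N minutes. The scheduling function here is an illustration of that arithmetic, not BOINC's actual code.

```python
# Illustrate staggered checkpoint writes: with a 60 s "write to disk"
# interval offset per core, a full write cycle over all tasks takes
# N minutes (4 min on a quad, 8 min on an 8-thread i7).

def checkpoint_times(n_cores, interval_s=60, cycles=1):
    """Return (time_in_seconds, core_index) pairs for each staggered write."""
    writes = []
    for cycle in range(cycles):
        for core in range(n_cores):
            writes.append((cycle * n_cores * interval_s + core * interval_s, core))
    return writes

quad = checkpoint_times(4)   # writes at 0, 60, 120, 180 s
octo = checkpoint_times(8)   # writes at 0 .. 420 s
print(quad[-1][0] + 60)      # 240 s = 4 minutes per full cycle
print(octo[-1][0] + 60)      # 480 s = 8 minutes per full cycle
```

This matches the 4-minute and 8-minute totals quoted in the post for quad- and 8-core systems.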
Questor Send message Joined: 3 Sep 04 Posts: 471 Credit: 230,506,401 RAC: 157 |
All very interesting. I just checked two of my machines and hadn't twigged that they both have about 1000 WUs each, and both are set to "write to disk = 60 secs". The i7 has 8 active threads, all processing MB, plus 1 GTX 260. The Q6600 quad has 4 active cores, all processing MB, plus 1 GTX 275.

On the Q6600 in normal running, each CPU core is running MB at 23-35%, boinc.exe 0-2%, boincmgr.exe 0-2%, and the CUDA task 0-2%. The i7 is pretty much the same, but with 12-13% per CPU thread running MB. All other processes sit pretty well at 0%. If I suspend a CUDA task, from cold it takes ~20 seconds to start processing and grabs a whole core/thread while loading, but then drops back to 0-2%.

I have just reduced the Q6600 to three cores and it now has ~25% system idle, so the GPU isn't trying to use the total capacity of one core. I reduced the i7 to 6 out of 8 threads and it now has ~25% idle time, so the GTX 260 also isn't trying to use the total capacity of the 2 threads. All of this is not optimal, as task switching will occur when the CPU cores/threads are shared.

I'm not seeing any odd behaviour, sluggishness etc. (I only ever got that when the GPU was running VLAR tasks), or peaks of boinc.exe. I don't have any multi-GPU machines, so I don't know what happens in that case, but from what I'm reading here it sounds as though things don't scale well? This is all with BOINC 6.6.37, AK SSSE3x and SSE41, and the stock CUDA app.

The quad has a RAC of ~8000 and the i7 ~8600, at present usually only running about 12 hrs per day, so in theory 16000 and 17000 per day if run 24/7. On the Q6600, say the new MB tasks take ~2 hrs on CPU and ~30 mins on GPU: in 2 hrs I complete 4 GPU and 4 CPU tasks, so the GPU is responsible for 50% of RAC. If I add another 3 GPUs this gives 3 * 8000 + 16000 = 40000 RAC per day (no overclocking of GPU/CPU). It's late, this is all very approximate and I've probably made some silly error, but I'm interested in trying to move a couple of GPUs to one machine (if I can find the crowbar) to see what happens to these figures. GPU Users Group
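The back-of-envelope RAC arithmetic above can be checked in a few lines. The task times and RAC figures are taken straight from the post; the 50% GPU share and the 40,000 total are just its own sums repeated.

```python
# Reproduce the post's rough RAC arithmetic for the Q6600 + one GPU.
cpu_task_h, gpu_task_h = 2.0, 0.5        # ~2 h per CPU task, ~30 min per GPU task
cores, gpus = 4, 1

# In a 2-hour window: 4 CPU tasks (one per core) and 4 GPU tasks complete,
# so the single GPU accounts for half the credit.
cpu_tasks = cores * (2.0 / cpu_task_h)   # 4 tasks
gpu_tasks = gpus * (2.0 / gpu_task_h)    # 4 tasks
gpu_share = gpu_tasks / (cpu_tasks + gpu_tasks)
print(gpu_share)                         # 0.5

# Quad RAC ~8000 at ~12 h/day -> ~16000 if run 24/7; each extra GPU ~8000 more.
rac_24_7 = 8000 * 2
print(3 * 8000 + rac_24_7)               # 40000, matching the post's estimate
```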
Questor Send message Joined: 3 Sep 04 Posts: 471 Credit: 230,506,401 RAC: 157 |
Back to my original point about VLARkill. If GPU tasks are taking 15 mins each, then you can process 96 in a 24-hour period. If you get a bunch of shorties, or if you use VLARkill, then you are quickly going to exceed your daily quota of 100 per CPU and the GPUs will go idle. My suggestion was based around keeping some CPU processing going so you can swap tasks between CPU and GPU using "Reschedule", to avoid having to kill VLARs. On one machine with 1000 WUs, over 200 of those are currently VLARs which have been rescheduled to the CPU, i.e. a whole processor's worth on a quad machine with one GPU. With no CPU processing, your only option with VLARs, if you don't want to crunch them, is to throw them away. So it 'may' be worth sacrificing a small fraction of GPU processing to keep an extra buffer of CPU tasks which you can swap with the GPU when you have a lot of VLAR tasks, to help avoid running dry and so keep your RAC up. These calcs will all change of course with the longer tasks, but the principle remains the same. GPU Users Group |
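The quota pressure described above is simple arithmetic: at 15 minutes per GPU task, a single GPU already processes almost a full 100-task daily quota on its own, so anything that burns tasks faster (shorties, killed VLARs) tips it over. A minimal check, using only the numbers given in the post:

```python
# Daily GPU throughput vs. the 100-tasks-per-CPU daily quota from the post.
minutes_per_gpu_task = 15
tasks_per_day = 24 * 60 // minutes_per_gpu_task
print(tasks_per_day)   # 96 tasks/day, already brushing against a 100/day quota

# Headroom before the quota is exceeded; shorties or VLAR-killed tasks
# (which, per the post, still consume quota) close this gap quickly.
print(100 - tasks_per_day)   # only 4 tasks of slack
```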
Sutaru Tsureku Send message Joined: 6 Apr 07 Posts: 7105 Credit: 147,663,825 RAC: 5 |
I made some tests with only 2 or 1 CPU tasks, but the GPU performance also went down. Up to now I haven't used the 'rebranding tool'. AFAIK I would need to do it manually.. so nothing for me.. I like it when everything runs automatically. ;-D If you have a GPU installed, the 'Maximum daily WU quota per CPU 100/day' is disabled and only the 'Maximum daily WU quota per GPU 500/day' is used. |
Questor Send message Joined: 3 Sep 04 Posts: 471 Credit: 230,506,401 RAC: 157 |
Hi Sutaru, In what way does the GPU performance go down? I would like to understand this more, to avoid causing myself headaches if I try to run multiple GPUs. Even if the loss is only small, I wondered if it would be worth it to avoid running out of WUs and therefore the GPU sitting idle? Reschedule 1.9 has an automatic mode, so it will run when you want it to. I understand you wanting to do things automatically, given the amount of work you are crunching! Thanks, I didn't realise about the change in quota for GPUs - this explains some things. So are you saying that on my quad I will get 500 for the GPU and 400 for the 4 CPUs, i.e. 900 per day, or just a total of 500 - I assume just 500? So it would still be possible, with a fast GPU and lots of VLARs, to run out of work? I think I have seen some of your messages saying you sometimes have trouble getting enough work to keep your monster cruncher active :-) John. GPU Users Group |
Josef W. Segur Send message Joined: 30 Oct 99 Posts: 4504 Credit: 1,414,761 RAC: 0 |
... Sutaru may have been relying on some of my postings; I'm not totally sure they still apply. There was a post about an 8-core system with one GPU which was being limited by quota to 500 tasks/day. At that time the source code was written so that the total was the quota times the number of GPUs, multiplied by the <gpu_multiplier> value set in the project config.xml file, except that the old CPU formula was used if there were no GPUs. The source code has since been rewritten so it ought to be 100 * nCPUs + 500 * nGPUs; it's hard to tell whether that change is in use here or not. Any GPU which averages less than 172.8 seconds (86400/500) elapsed time per task might still run into the quota, though, so perhaps Sutaru or Vyper can clarify by observation. I thought the '16 core system... now running out of work' thread indicated the CPUs were still not being fed if a GPU was present, but that may have been a bad guess. Joe |
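Joe's two quota formulas can be written out directly. The default values (100/CPU, 500/GPU) and the 8-core, 1-GPU example come from the posts; the function names and the gpu_multiplier default of 1 are just illustrative.

```python
# Sketch of the two daily-quota formulas described in the post.

def quota_old(n_gpus, n_cpus=4, cpu_quota=100, gpu_quota=500, gpu_multiplier=1):
    """Old code: quota * nGPUs * gpu_multiplier; CPU formula only if no GPUs."""
    if n_gpus == 0:
        return cpu_quota * n_cpus
    return gpu_quota * n_gpus * gpu_multiplier

def quota_new(n_cpus, n_gpus, cpu_quota=100, gpu_quota=500):
    """Rewritten code: 100 * nCPUs + 500 * nGPUs."""
    return cpu_quota * n_cpus + gpu_quota * n_gpus

print(quota_old(1, n_cpus=8))   # 500 -- the 8-core, 1-GPU case from the post
print(quota_new(8, 1))          # 1300 under the rewritten formula
print(86400 / 500)              # 172.8 s: fastest average task time before
                                # a single GPU hits its 500/day quota
```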
Questor Send message Joined: 3 Sep 04 Posts: 471 Credit: 230,506,401 RAC: 157 |
Thanks for clarifying the quotas, Joe. Sutaru, I'm still not clear from your earlier post.

"If you have a big WU cache.. you have very big [25% CPU - 100% core] and long boinc.exe peaks in Task Manager." Is this when WUs are running on your CPU, or none? If none, for how long? When my WUs are running on the CPU, each core runs flat out at about 25% (i.e. 100% per core, but Windows divides by the number of cores), which is expected. I do not see big peaks in boinc.exe or System activity, but I only have a cache a third the size of yours. If this CPU activity occurs when you are not running SETI CPU tasks, what processes are peaking all the cores to maximum?

"Every ~5 sec. for ~5 sec. This increases with many ULs/DLs." I can understand all activity peaking while lots of uploads/downloads are happening, but isn't that an unusual case? When things are running smoothly you would have infrequent uploads/downloads, which should not have a big system impact. One exception could of course be the VLARkill - you download a task, it is almost immediately aborted, and more downloading happens.

"Also 'System' in Task Manager goes high, up to ~13% CPU." If the OS is busy this can briefly increase, but it should only be transient?

"If I were also to crunch on the CPU, these boinc.exe/System peaks [normal priority] would disturb everything with lower priority - CPU and GPU tasks." Windows task switching will mean some tasks get a smaller slice of CPU. By the GPU, do you mean the "Windows CUDA app"? That only shows the amount of CPU being used to feed the GPU, so it is very small anyway - or is on my system (0-2% of a core). Task switching has an overhead, but your "Windows CUDA app" will probably task-switch even when not running SETI tasks, as it has no CPU affinity.

"Because of this, the GPUs would idle/stop from time to time." Do you mean that you see all your GPU tasks showing as waiting and none of them running for a period of time - and if so, for how long?

"Old MB AR=0.44x WU on CPU = ~60 min. Same WU on GPU was 6m:45s; now the longer WU [same AR] takes ~10 min. On the CPU it should now be ~120 min. This means the GPU is 12x faster than one CPU core. And these are 4 x overclocked GTX260-216s with an AMD Phenom II X4 940 BE @ 4 x 3.0 GHz." Are these new figures for GPU tasks (10 mins) for the newer tasks, or are they resent older short WUs, or do you just have some shorties? I am seeing GPU tasks completing in 6-10 mins, but that is faster than they ran with the shorter WUs. Even with a 30% lift from CUDA 2.3 this seems too fast for the longer WUs, but I need to monitor for a few days to see some averages. On another machine I am seeing 17-25 mins from a GTS250 (I am also running CUDA 2.3 on all but one machine). Sorry - so many questions............

P.S. As another aside, do you run virus checker software, and does it cover the BOINC data directories? [Edit]On the machine where I was seeing 10-min CUDAs I just saw one at 21 mins - I assume this is a new one.[/edit] GPU Users Group |
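The speed-up figures quoted in the thread for the same AR=0.44x multibeam workunit work out as follows; both ratios use only the times given in the posts.

```python
# GPU-vs-CPU speed-up from the run times quoted for the same-AR MB workunit.
old_cpu_min, old_gpu_min = 60, 6.75      # old WU: ~60 min CPU, 6m45s GPU
new_cpu_min, new_gpu_min = 120, 10       # longer WU: ~120 min CPU, ~10 min GPU

print(round(old_cpu_min / old_gpu_min, 1))   # ~8.9x on the old WUs
print(new_cpu_min / new_gpu_min)             # 12.0x on the new ones, as stated
```

Note the quoted "12x" holds only for the new, longer workunits; the old times imply a somewhat smaller ratio.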
Sutaru Tsureku Send message Joined: 6 Apr 07 Posts: 7105 Credit: 147,663,825 RAC: 5 |
The 500 WUs/GPU scale with the number of GPUs: I had only 2 GPUs -> max. 1,000 WUs/day. Now with 4 GPUs -> max. 2,000 WUs/day.

"If you have a big WU cache.. you have very big [25% CPU - 100% core] and long boinc.exe peaks in Task Manager." Both - with CPU/GPU and with GPU only; it depends how big the WU cache is.

"Every ~5 sec. for ~5 sec." Yes, but with a big WU cache and the normal unplanned server outages at Berkeley, the boinc.exe peaks are long and frequent.

"Also 'System' in Task Manager goes high, up to ~13% CPU." At the same time that boinc.exe has big (25% CPU - 100% core) activity, System has about half that activity. A little strange: if I reduce the boinc.exe priority to 'lower than normal', System shows no peaks/activity.

"If I were also to crunch on the CPU, these boinc.exe/System peaks [normal priority] would disturb everything with lower priority - CPU and GPU tasks." Windows (XP) isn't intelligent enough to keep tasks undisturbed within their 'priority hierarchy'. Task Manager shows this well. For example with BOINC: CPU tasks have 'low' priority, GPU tasks have 'lower than normal' priority, and boinc.exe has 'normal' priority. So if boinc.exe has activity, the CPU and GPU tasks are disturbed. Yes, the GPU only gets CPU support, but if that support drops to 0% CPU, the GPU stops/idles. And if you have high-performance GPUs (GTX 2xx series) this is very bad.

"Because of this, the GPUs would idle/stop from time to time." Sometimes the GPU wall-clock calculation time was ~3x slower for the same-AR WU.

"Old MB AR=0.44x WU on CPU = ~60 min." These are the new, longer MB WUs. A GTX260-216 should have double the performance of a GTS2xx. My GPU cruncher is a 100% pure crunching machine, so no virus protection or anything else. Of course - the firewall (Windows XP).. ;-) I hope I answered all your questions well. If not, we could continue in the new thread here? 'Best GPU performance' |
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.