>1 Task per CPU? (Ans: Probably Not)
Author | Message |
---|---|
Bill Butler Joined: 26 Aug 03 Posts: 101 Credit: 4,270,697 RAC: 0 |
BOINC loads 8 tasks on my Intel i7 board, just as it is supposed to. When I check the account boxes to use GPUs, BOINC puts 1 task on the Intel GPU and 1 on the GeForce GT, for a total of 10 concurrent tasks. Nice. This is all good. Now I get greedy and write an app_config.xml file to put 5 tasks on each GPU and limit the number of concurrent tasks to 18. BOINC works well: it puts 5 tasks on the Intel GPU and 5 on the GeForce GT. Those 10 + the 8 on the CPUs = 18. Very nice. Messing with various combinations in the app_config.xml file seems to work as expected too, except that I never get > 1 task on a CPU. Yes, the CPUs are also concurrently building kernels for the GPUs to use. But each CPU will run just 1 full task at a time, unlike a GPU, which multitasks several. Hence this experiment to try to load > 1 task on a CPU. Clear the account check boxes for the Intel and GeForce GT GPUs. Delete the app_config.xml file, restart, and let the queue of CPU and GPU tasks run out, which takes about a day. Then BOINC starts loading just 8 tasks on the CPU, as before, with no more GPU work in the waiting queue, just plain CPU workunits. This is now a simple, clean test setup of pure unadulterated CPU work. OK so far. Then I stick in a new simplified app.config.xml file. It makes no reference to GPUs. It just says let setiathome_v7 have max_concurrent 10 tasks. I go through initialization. I'd like to see 2 of the CPUs pick up a second task per the .xml specification and see 10 tasks running. Actual result: I get 8 tasks instead. The app.config.xml file is ignored. So I think I have just learned the hard way that you cannot get > 1 task per CPU, unlike a GPU, which munches on several at a time. Has anyone else tried > 1 task per CPU? Thank you. "It is often darkest just before it turns completely black." |
rob smith Joined: 7 Mar 03 Posts: 22190 Credit: 416,307,556 RAC: 380 |
No chance of more than one task per CPU, but your '640 should be able to do two at a time. That is, no more than one per CPU unless the CPU supports hyperthreading - an Intel trick that makes each CPU core behave like two. Running more than 2 or 3 on the '640 will actually give you worse overall throughput. I'm not sure about the Intel GPU, but I wouldn't expect it to be as good as the '640, so cut back to 2 per GPU, and enjoy the ride - it will be faster. Bob Smith Member of Seti PIPPS (Pluto is a Planet Protest Society) Somewhere in the (un)known Universe? |
petri33 Joined: 6 Jun 02 Posts: 1668 Credit: 623,086,772 RAC: 156 |
I think there is an ncpus option or something in cc_config for testing purposes... To overcome Heisenbergs: "You can't always get what you want / but if you try sometimes you just might find / you get what you need." -- Rolling Stones |
HAL9000 Joined: 11 Sep 99 Posts: 6534 Credit: 196,805,888 RAC: 57 |
Indeed, <ncpus>N</ncpus> can be added to cc_config.xml to tell BOINC to simulate any number of CPUs. However, doing so will likely yield little to no gain. More than likely it would just result in less work done in a given time if taken too far, much like what I suspect is happening with the iGPU running 5 tasks at once. Given that people have been trying such things since SETI@home started ~16 years ago, I suspect it will continue. SETI@home classic workunits: 93,865 CPU time: 863,447 hours Join the BP6/VP6 User Group (http://tinyurl.com/8y46zvu) |
Bill Butler Joined: 26 Aug 03 Posts: 101 Credit: 4,270,697 RAC: 0 |
"...but your '640 should be able to do two at a time. That is, unless the CPU supports hyperthreading - an Intel trick that makes each CPU core behave like two." That's a good point. I think that's what this Intel i7-3770S is. The documentation says it has 4 cores with 2 virtual CPUs per core, for 8 CPUs, and BOINC loads up all 8. So I guess the board is already multitasking at 2 tasks per core. |
Bill Butler Joined: 26 Aug 03 Posts: 101 Credit: 4,270,697 RAC: 0 |
"However doing so will likely have little to no gains. More than likely it would just result in less work done in a given time if gone too far. Much like I suspect is likely happening with the iGPU running 5 tasks at once." Indeed, I can load up the 640 with tasks and it slows throughput down. Worse yet, too many running tasks can trash one out. As I recall, a task failed when a branch ran out of memory: the program tried to get more memory with malloc(), and when that failed it took an error exit noting that it had run out of memory. Thanks for the link for <ncpus>. I was unaware of this feature. |
Mr. Kevvy Joined: 15 May 99 Posts: 3776 Credit: 1,114,826,392 RAC: 3,319 |
"However doing so will likely have little to no gains. More than likely it would just result in less work done in a given time if gone too far. Much like I suspect is likely happening with the iGPU running 5 tasks at once." This is pretty much it. CPUs run BOINC code natively, so they are always very close to 100% utilization; there's no need to run >1 instance per core. GPUs, by contrast, have to be loaded with work by the CPU; you can see these helper processes feeding work to the GPU in Task Manager->Processes when you have BOINC GPU work running. If the GPU isn't fed enough work, because the CPU core running the helper process isn't getting enough timeslices or is too slow, or because the bus is overloaded, the GPU will run at less than 100% utilization. In this case, running >1 instance per GPU does have significant gains. When I got my first 980, I tested it. With 0.4 cores reserved per GPU thread, I found that with 1 instance on the GPU there was 45-55% utilization, with 2 it was 70-75%, and with 3 it was 95-100%. So I stuck with 3. Anything more than this is just going to slow it down and waste memory, as the GPU is already maxed out and other factors such as bus oversaturation can then occur. If someone has a slow CPU and a fast GPU, 5 instances could actually be worthwhile, if 4 instances are tested to give consistently and significantly less than 100% utilization, but this would usually be unlikely. |
Bill Butler Joined: 26 Aug 03 Posts: 101 Credit: 4,270,697 RAC: 0 |
"I think there is a ncpus or something in cc_config for testing purposes .." Thanks for the note on this parameter. I was unaware of it. --BB |
Bill Butler Joined: 26 Aug 03 Posts: 101 Credit: 4,270,697 RAC: 0 |
"When I got my first 980, I tested it. With 0.4 cores reserved per GPU thread, I found with 1 instance on the GPU there was 45-55% utilization, with 2 was 70-75% and 3 is 95-100%. So I stuck with 3." How did you measure % utilization on the GPU? |
Mr. Kevvy Joined: 15 May 99 Posts: 3776 Credit: 1,114,826,392 RAC: 3,319 |
"How did you measure % utilization on the GPU?" EVGA Precision. Essential in the cruncher toolkit. :^) |
Bill Butler Joined: 26 Aug 03 Posts: 101 Credit: 4,270,697 RAC: 0 |
"How did you measure % utilization on the GPU?" Nice! I see the system requirements include the 640 card. Do you know if it will look at the Intel GPU on the Intel system board? That GPU is slower than your average bear GPU. It is somewhat faster than the CPU, but not impressively so, the way the Nvidia card is. |
rob smith Joined: 7 Mar 03 Posts: 22190 Credit: 416,307,556 RAC: 380 |
Try GPU-Z or CPU-Z; these allow inspection of the GPU or CPU to see what is running, and one or the other MAY be able to inspect your Intel GPU. |
Claggy Joined: 5 Jul 99 Posts: 4654 Credit: 47,537,079 RAC: 4 |
"Try GPU-Z or CPU-Z, these allow the inspection of the GPU or CPU to see what is running, and one or other MAY be able to inspect your Intel GPU." Or SIV. Claggy |
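
For reference, here is a minimal sketch of the two configuration files discussed in this thread. The values are illustrative assumptions pieced together from the posts (5 tasks per GPU, a cap of 18 running tasks, 0.4 CPU cores reserved per GPU task); they are not the posters' actual files, and the best numbers will differ by machine. The first file goes in the project's directory and, as far as I know, is only read under the exact name app_config.xml, so a file saved as app.config.xml would presumably be ignored.

```xml
<!-- app_config.xml (sketch; all values are assumptions for illustration) -->
<app_config>
  <app>
    <name>setiathome_v7</name>
    <!-- Upper limit on running tasks of this app; it caps, it does not add CPU slots -->
    <max_concurrent>18</max_concurrent>
    <gpu_versions>
      <!-- 0.2 of a GPU per task, i.e. up to 5 tasks sharing one GPU -->
      <gpu_usage>0.2</gpu_usage>
      <!-- CPU fraction budgeted per GPU task (0.4 is the figure Mr. Kevvy mentions) -->
      <cpu_usage>0.4</cpu_usage>
    </gpu_versions>
  </app>
</app_config>
```

Because max_concurrent is only a ceiling, setting it to 10 on an 8-thread CPU still yields 8 CPU tasks, which matches what Bill saw. To make BOINC schedule as if there were more CPUs, the ncpus option that petri33 and HAL9000 mention goes in cc_config.xml in the BOINC data directory:

```xml
<!-- cc_config.xml (sketch; 10 is an assumed value, chosen to match the 10 tasks Bill hoped for) -->
<cc_config>
  <options>
    <!-- Act as if the host had 10 CPUs; 0 means use the real count -->
    <ncpus>10</ncpus>
  </options>
</cc_config>
```

After editing either file, have the BOINC Manager re-read the config files or restart the client. As the replies above point out, oversubscribing the CPU this way will most likely reduce total throughput rather than increase it.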
©2024 University of California
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.