CPU affinity in multicore systems

Message boards : Number crunching : CPU affinity in multicore systems
Profile John Clark
Volunteer tester
Joined: 29 Sep 99
Posts: 16515
Credit: 4,418,829
RAC: 0
United Kingdom
Message 716043 - Posted: 20 Feb 2008, 23:37:11 UTC
Last modified: 20 Feb 2008, 23:37:33 UTC

I remember that earlier versions of Crunch3r's optimised SETI clients included code to set processor affinity.

I see from Task Manager that this is not the case with V2.4.

What are the advantages or disadvantages of having 4 WUs crunching on a Quad, either floating across the 4 cores or with each WU locked to a core?
It's good to be back amongst friends and colleagues



Richard Haselgrove Project Donor
Volunteer tester

Joined: 4 Jul 99
Posts: 14656
Credit: 200,643,578
RAC: 874
United Kingdom
Message 716048 - Posted: 20 Feb 2008, 23:41:08 UTC - in response to Message 716043.  
Last modified: 20 Feb 2008, 23:52:24 UTC

I remember that earlier versions of Crunch3r's optimised SETI clients included code to set processor affinity.

I see from Task Manager that this is not the case with V2.4.

What are the advantages or disadvantages of having 4 WUs crunching on a Quad, either floating across the 4 cores or with each WU locked to a core?

Processor affinity isn't set by the SETI application, but by the BOINC client. I think Crunch3r's recent BOINCs still have it.

I tried one of them (and Trux's, before that) when I was trying to work out the behaviour of my then-new 8-core, 15 months or so ago: I couldn't detect any difference. I think others have reported similar findings.

Edit - results reported here.
Profile John Clark
Volunteer tester
Joined: 29 Sep 99
Posts: 16515
Credit: 4,418,829
RAC: 0
United Kingdom
Message 716113 - Posted: 21 Feb 2008, 1:30:00 UTC

Thanks Richard

I felt there might be an advantage to locking each task to a core. But if past trials showed little difference, then the question has been answered.

I will leave the thread open to see if anyone else wants to chime in. After that I will ask for the thread to be locked.
Alinator
Volunteer tester

Joined: 19 Apr 05
Posts: 4178
Credit: 4,647,982
RAC: 0
United States
Message 716130 - Posted: 21 Feb 2008, 1:52:51 UTC

IIRC, the benefits are:

1.) Locking a task to a specific core avoids the migration overhead incurred if the OS decides to move it to a different core for some reason.

2.) There's an advantage to not having two tasks from the same project running on cores that share a cache. Presumably this is due to reduced contention for the shared resource between the two processes.

Alinator
Profile David
Volunteer tester
Joined: 19 May 99
Posts: 411
Credit: 1,426,457
RAC: 0
Australia
Message 716132 - Posted: 21 Feb 2008, 1:54:57 UTC

I run Crunch3r's SSSE3 version, and when a WU completes, one of the 4 cores drops from 100% to about 15% for a second or so before the next WU starts. I had a run of 23-minute WUs and noted that each time it was a different core that dropped, so I believe each instance runs on one core. It's funny though: on one PC the processor usage drops, but on the other it does not.
Richard Haselgrove Project Donor
Volunteer tester

Joined: 4 Jul 99
Posts: 14656
Credit: 200,643,578
RAC: 874
United Kingdom
Message 716133 - Posted: 21 Feb 2008, 1:55:42 UTC - in response to Message 716130.  

IIRC, the benefits are:
.....

But are they measurable in practice?
kittyman Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 9 Jul 00
Posts: 51469
Credit: 1,018,363,574
RAC: 1,004
United States
Message 716136 - Posted: 21 Feb 2008, 1:58:59 UTC - in response to Message 716133.  

IIRC, the benefits are:
.....

But are they measurable in practice?

NOT..........
"Freedom is just Chaos, with better lighting." Alan Dean Foster

Alinator
Volunteer tester

Joined: 19 Apr 05
Posts: 4178
Credit: 4,647,982
RAC: 0
United States
Message 716139 - Posted: 21 Feb 2008, 2:04:41 UTC - in response to Message 716133.  
Last modified: 21 Feb 2008, 2:07:27 UTC

IIRC, the benefits are:
.....

But are they measurable in practice?


I'm pretty sure archae86 did some testing a ways back (it could have been just for hyperthreaded CPUs, though), and although it wasn't a huge difference, it was measurable. Something like 3 or 4 percent comes to mind.

Most likely it amounts to even less today on the current Intels.

@David: Most likely one of the cores has to pick up the BOINC core client to handle the comms and other 'cleanup' chores for ending the current task and starting the next one.

<edit> LOL... I see Mark saw this thread, and if anyone would know he probably does! ;-)

Alinator
archae86

Joined: 31 Aug 99
Posts: 909
Credit: 1,582,816
RAC: 0
United States
Message 716188 - Posted: 21 Feb 2008, 4:07:17 UTC - in response to Message 716139.  


I'm pretty sure archae86 did some testing a ways back (it could have been just for hyperthreaded CPUs, though), and although it wasn't a huge difference, it was measurable. Something like 3 or 4 percent comes to mind.

I saw benefit from mixing SETI and Einstein that was strong on my hyperthreaded Gallatin, though it varied quite a bit with specific application releases. There remain some interesting effects in mixing Einstein with SETI, and in the interaction of high-VHAR SETI units, but none of those points is the topic of this thread.

I don't recall ever reporting a benefit from restricting SETI or Einstein applications to a specific processor, however, which is the topic of this thread.

I did a brief experiment reported in this post, from which I concluded there was no observable benefit from affinity setting for the case at hand. In fact, while the results were probably below significance, at face value they showed a tiny deficit.

My personal opinion is that a general impression that affinity "ought to" help keeps a durable myth alive in the face of multiple non-confirming tests and a lack of carefully reported confirming tests.

Alinator
Volunteer tester

Joined: 19 Apr 05
Posts: 4178
Credit: 4,647,982
RAC: 0
United States
Message 716199 - Posted: 21 Feb 2008, 4:28:24 UTC
Last modified: 21 Feb 2008, 4:31:10 UTC

Yep, thanks for posting the link to the more recent work you did on the question.

That was what got me thinking of you when I posted earlier, but I couldn't find it right away. Sorry if my post made it sound like you had confirmed that 'urban legend'.

When you stop to think about it, any well-designed SMP kernel would try to avoid scenarios that lead to wasteful 'context switches' across true multi-core processors unless there was no other choice. I suppose it's even possible that restricting the kernel's ability to schedule by setting affinity manually could force it into 'bad' choices for other tasks, with the net effect of degrading everything.

Hyperthreaded CPUs are a whole different story; the well-documented problems there were one reason hyperthreading was withdrawn when the Core family was released.

Alinator
W-K 666 Project Donor
Volunteer tester

Joined: 18 May 99
Posts: 19153
Credit: 40,757,560
RAC: 67
United Kingdom
Message 716280 - Posted: 21 Feb 2008, 9:38:10 UTC

I ran a dual P3 and a P4 HT computer on Einstein and SETI, with no work cache and a 50:50 share. On both there was a benefit to SETI of about 7% compared to running SETI:SETI, but I saw no benefit to Einstein in any configuration.
This was over 18 months ago, and the apps at both sites have changed considerably since then. I have not tried since: the E6600 and Q6600 in the family belong to my sons and the computers are frequently not here, so monitoring would be difficult.
Profile John Clark
Volunteer tester
Joined: 29 Sep 99
Posts: 16515
Credit: 4,418,829
RAC: 0
United Kingdom
Message 716282 - Posted: 21 Feb 2008, 9:57:37 UTC

An interesting debate, and the consensus is that there is no advantage to using processor affinity.

Good!
Richard Haselgrove Project Donor
Volunteer tester

Joined: 4 Jul 99
Posts: 14656
Credit: 200,643,578
RAC: 874
United Kingdom
Message 716286 - Posted: 21 Feb 2008, 10:11:37 UTC

I saw Crunch3r was active in the next-door thread five minutes ago. Surprised he didn't drop in here - he's been the main advocate of CPU affinity in recent years.
Profile Keck_Komputers
Volunteer tester
Joined: 4 Jul 99
Posts: 1575
Credit: 4,152,111
RAC: 1
United States
Message 716307 - Posted: 21 Feb 2008, 11:09:08 UTC - in response to Message 716282.  

An interesting debate, and the consensus is that there is no advantage to using processor affinity.

Good!

I would say the advantage is negligible rather than non-existent.

Earlier comparisons seemed to indicate there was some advantage to setting affinity with discrete processors, though less than 5%. Since there are so few systems of this type, it was not worth the effort to put into the standard client.

BOINC WIKI

BOINCing since 2002/12/8
DJStarfox

Joined: 23 May 01
Posts: 1066
Credit: 1,226,053
RAC: 2
United States
Message 716378 - Posted: 21 Feb 2008, 14:54:27 UTC

It would make sense that a dumber kernel would need CPU affinity set on tasks. However, with the Linux 2.6 kernel, and the work MS did on theirs, schedulers are now smart enough to use CPUs better than ever before.

However, if you use Intel SpeedStep or AMD PowerNow!, keeping a task on one CPU may stop that CPU from speeding up and hitting the brakes back and forth.

I would also venture to guess that having a lot of CPUs (9 or more) would make affinity more worthwhile. It would keep cache misses lower and stop the CPUs fighting each other for memory access.
Profile AlphaLaser
Volunteer tester

Joined: 6 Jul 03
Posts: 262
Credit: 4,430,487
RAC: 0
United States
Message 716388 - Posted: 21 Feb 2008, 15:40:40 UTC

On a related note, Intel Dynamic Acceleration, a feature on mobile Penryn processors meant to increase performance when only one core is in use, was found to work only when the application's CPU affinity was set (source). That was under Windows Vista; other OSes may behave differently.
