one cpu or two cpu's, that is the question
Author | Message |
---|---|
Astro Joined: 16 Apr 02 Posts: 8026 Credit: 600,015 RAC: 0 |
One thing I get comfort from (as I assume many new people should) is that any answer they get to a question can be trusted. Because if the person answering the question makes even the slightest error, or even a perceived error, other members of this forum will quickly correct them. So a person who asks a question just has to wait 20 minutes after the first answer to be sure of its accuracy. Sorry to be off topic. I know if I make a mistake, I'll be quickly corrected, and as long as I see the bigger picture (many others learning from someone else's posts) then I can try to bury my feelings about being corrected. tony |
Captain Avatar Joined: 17 May 99 Posts: 15133 Credit: 529,088 RAC: 0 |
> sorry to be off topic. > > I know if I make a mistake, I'll be quickly corrected, and as long as I see > the bigger picture (many others learning from someone elses posts)then I can > try to bury my feelings about being corrected. > > tony > That's for sure! I'll answer a question, then 20 min later someone else will say the same thing but then add a bunch of unneeded info which confuses the situation.... Hi Tony! |
N/A Joined: 18 May 01 Posts: 3718 Credit: 93,649 RAC: 0 |
Sometimes the best way to speed something up is by leaving it alone. A stripped-down Linux kernel is always fastest - more so when compiled for a particular CPU. And when you have the kernel and the CPU it's being run on trying to manage clock cycles... [shakes head] You only need one traffic cop - Two cause chaos. Turn off HT if you're only crunching. Turn it on when you're shooting up aliens in a 3D game. |
fienna Joined: 23 Aug 99 Posts: 14 Credit: 864,061 RAC: 0 |
yes, disabling HT is an ok thing to do. for single-threaded processes, it will ALWAYS be faster to not have HT enabled. for multi-threaded processes, it will always be faster to have HT, though you'll never get more than around a 40% increase in speed. because BOINC lets you run multiple processes, using HT does mean that you process work units faster; however, it looks like your benchmarks are cut in half, meaning that in effect there's no difference in the total amount of claimed credit. Now this is where things get interesting. Let's say that i'm using 2 HT processors (so 4 logical CPUs total) and all of them are processing BOINC units. My claimed credit per unit will be about 1/2 as much as if i had HT disabled (i measured it and it looks like it's about 42% less per unit). however, i'll be producing roughly 1.8 to 1.9 times as many work units in the same amount of time, meaning that my total claimed credit is only reduced by about 10-20%. HOWEVER, because i'm crunching units SO MUCH FASTER than most of the people out there (xeons are good like that), about 80-90% of the time MY claimed credit is the one that is dropped. the next highest claimed credit (the one that is actually counted) is on AVERAGE 40% or so higher than mine, which offsets the earlier loss of claimed credit. meaning that even though my claimed credit drops when i have HT enabled, i'm making up for it with the higher claims of others. so despite the loss of claimed credit when you have HT enabled, it's much better to use it because you can do twice as many units and get relatively the same credit for each unit (effectively doubling your output). This thinking more or less works with any HT-enabled INTEL Pentium 4 or Xeon processor - there are a LOT of Athlons out there with inferior integer and FPU scores, and people using old processors to run BOINC. -- dead stars still burn dead still stars burn |
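A rough way to check the arithmetic in the post above is to multiply the per-unit credit ratio by the throughput ratio. The sketch below is only illustrative: the function names are made up for this example, and it assumes (the post does not say) that with HT off the granted credit equals the claimed credit, and that whenever your claim is not the one dropped you are granted exactly what you claimed.

```python
def claimed_ratio(per_unit_ratio: float, throughput_ratio: float) -> float:
    """Total claimed credit with HT on, relative to HT off (1.0 means no change)."""
    return per_unit_ratio * throughput_ratio


def granted_ratio(per_unit_ratio: float, throughput_ratio: float,
                  p_dropped: float, uplift: float) -> float:
    """Total granted credit with HT on, relative to HT-off claimed credit.

    p_dropped: fraction of units where your (low) claim is discarded
    uplift:    how much higher the counted claim is, e.g. 0.40 for 40%
    Assumption: when your claim is kept, you are granted what you claimed.
    """
    per_unit = per_unit_ratio * (p_dropped * (1.0 + uplift) + (1.0 - p_dropped))
    return per_unit * throughput_ratio


if __name__ == "__main__":
    # Figures quoted in the post: 42% less per unit, 1.8x to 1.9x the units,
    # claim dropped about 85% of the time, counted claim about 40% higher.
    for throughput in (1.8, 1.9):
        print(f"throughput x{throughput}: "
              f"claimed {claimed_ratio(0.58, throughput):.2f}, "
              f"granted {granted_ratio(0.58, throughput, 0.85, 0.40):.2f}")
```

With the rounder "about 1/2" figure, `claimed_ratio(0.5, 1.8)` comes to 0.9, a loss of roughly 10%; with the measured 42% figure the product lands slightly above 1.0. The inputs are rough estimates, so treat the output as a sanity check rather than a prediction.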
Paul D. Buck Joined: 19 Jul 00 Posts: 3898 Credit: 1,158,042 RAC: 0 |
The use of HT in a single physical processor is, like all attempts to increase throughput, a compromise. As a simple case, let's look at using two physical processors in a system. With that you would think that you would get 2x performance. But now you have two processors sharing the same memory bandwidth, meaning that you have created a new throughput bottleneck. Think of trying to fill the pool while watering the yard ... both tasks take longer because you have only so much capacity to pipe water through your house. With HT turned ON, you get a boost in throughput at the expense of run time. As was stated below, you get 40% to 60% more done in a unit time, but each of the tasks takes longer. If you compare the numbers for a 2.8 GHz single-threaded processor against an HT at 3.0 GHz you will see that the 2.8 GHz machine is faster. But it only does one WU at a time. With HT on, you get two done at the same time (see the numbers). If you look at those numbers you can see that to "beat" the 2.8 GHz machine I had to use a 3.2 GHz chip with four times the cache memory. The up and coming processors are going to be increasing the parallelism with more physical processors and internal logical processors and / or additional special function accelerators (think Floating Point Units). We may even see "designer" chips constructed of different "blends" of accelerators so that you can pick the CPU that will do best for your workloads. For those of us with BOINC farms, we want lots of FPU and some integer, but may trade off on some of the other possible accelerators ... [edit] Forgot, for those interested there are "lessons" where I talk about the evolution of the CPU ... there are also small discussions of logical vs. physical processors and FPU ... also see Floating Point ... |
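The throughput-versus-run-time trade described above can be put into numbers. This is a minimal sketch, not a measurement: the function name is invented for illustration, the 3.0 hour single-task time is a hypothetical input, and the model assumes both logical CPUs are kept busy with identical work units.

```python
def ht_tradeoff(single_task_hours: float, throughput_gain: float,
                logical_cpus: int = 2) -> tuple[float, float]:
    """Return (hours per task, tasks per hour) with HT on.

    single_task_hours: run time of one work unit with HT off
    throughput_gain:   total work done relative to HT off, e.g. 1.5 for 50% more
    """
    if single_task_hours <= 0 or throughput_gain <= 0 or logical_cpus < 1:
        raise ValueError("inputs must be positive")
    # Overall completion rate of the whole chip with HT on.
    tasks_per_hour = throughput_gain / single_task_hours
    # Each of the concurrently running tasks finishes this slowly.
    hours_per_task = logical_cpus / tasks_per_hour
    return hours_per_task, tasks_per_hour


if __name__ == "__main__":
    # Hypothetical 3.0 hour work unit, with the 40% to 60% gains quoted above.
    for gain in (1.4, 1.5, 1.6):
        hours, rate = ht_tradeoff(3.0, gain)
        print(f"gain {gain}: {hours:.2f} h per task, {rate:.3f} tasks per hour")
```

Under these assumptions every individual task runs longer than the HT-off time whenever the gain is below 2.0, even though more tasks finish per hour overall.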
Steve Cressman Joined: 6 Jun 02 Posts: 583 Credit: 65,644 RAC: 0 |
> - there are a LOT of Athlons out there with inferior integer and FPU > scores Why are people so misinformed about the speed of AMD? Measured floating point speed 1909.55 million ops/sec Measured integer speed 3248.21 million ops/sec What is inferior about those numbers? 98SE XP2500+ @ 2.1 GHz Boinc v5.8.8 And God said "Let there be light." But then the program crashed because he was trying to access the 'light' property of a NULL universe pointer. |
fienna Joined: 23 Aug 99 Posts: 14 Credit: 864,061 RAC: 0 |
> - there are a LOT of Athlons out there with inferior integer and FPU > > scores > > Why are people so misinformed about the speed of AMD? > > Measured floating point speed 1909.55 million ops/sec > Measured integer speed 3248.21 million ops/sec > > What is inferior about those numbers? What is your typical Claimed Credit per work unit? Also, there is a wide variety of Athlon processors available. I know that the 3000+ series and up are nice and fast, but still not as good as a Xeon processor. BTW - for a great white paper about HT - go here: http://www.intel.com/business/bss/products/hyperthreading/server/ht_server.pdf -- dead stars still burn dead still stars burn |
Paul D. Buck Joined: 19 Jul 00 Posts: 3898 Credit: 1,158,042 RAC: 0 |
> Measured floating point speed 1909.55 million ops/sec > Measured integer speed 3248.21 million ops/sec > > What is inferior about those numbers? Nothing at all ... they are fine numbers. They, like all benchmarks, are meaningless. The key question is "how fast does the processor do my actual workload?" The Dell dual-Xeon I just bought impresses me in that it is doing 4 work units at the same time my other machines are doing only 2 ... though I wrote down the benchmark numbers, they are useless ... what is interesting is that it does a SETI@Home WU in 2.5 to 3.0 hours ... |
fienna Joined: 23 Aug 99 Posts: 14 Credit: 864,061 RAC: 0 |
> The Dell dual-Xeon I just bought impresses me in that it is doing 4 work units > at the same time my other machines are doing only 2 ... though I wrote down > the benchmark numbers they are useless ... what is interesting is that it does > a SETI@Home WU in 2.5 to 3.0 hours ... yeah, but the sad thing is that your credit for those wu's depends in part on those meaningless benchmarks because of the new system. -- dead stars still burn dead still stars burn |
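How the benchmarks feed into claimed credit can be sketched as follows. This is an assumption-laden illustration: the thread never spells out the formula, so the code assumes the benchmark-based cobblestone definition of that era (100 credits per day of CPU time on a reference machine scoring 1,000 Whetstone MFLOPS and 1,000 Dhrystone MIPS), and the function name is made up for this example.

```python
SECONDS_PER_DAY = 86_400
REFERENCE_MOPS = 1_000.0            # assumed reference machine score on each benchmark
CREDITS_PER_REFERENCE_DAY = 100.0   # assumed cobblestone definition


def claimed_credit(cpu_seconds: float, whetstone_mflops: float,
                   dhrystone_mips: float) -> float:
    """Claimed credit for one result under the assumed benchmark-based scheme."""
    # The two benchmark scores are averaged with equal weight.
    benchmark_avg = (whetstone_mflops + dhrystone_mips) / 2.0
    cpu_days = cpu_seconds / SECONDS_PER_DAY
    return cpu_days * (benchmark_avg / REFERENCE_MOPS) * CREDITS_PER_REFERENCE_DAY


if __name__ == "__main__":
    # Benchmark figures posted earlier in the thread, with a hypothetical
    # 3-hour work unit (the run time is not from that machine).
    full = claimed_credit(3 * 3600, 1909.55, 3248.21)
    halved = claimed_credit(3 * 3600, 1909.55 / 2, 3248.21 / 2)
    print(f"full benchmarks:   {full:.1f} credits")
    print(f"halved benchmarks: {halved:.1f} credits")
```

Under this formula the claim scales linearly with the benchmark scores, which would explain why halved benchmarks with HT on produce roughly halved claims for the same CPU time.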
Grant (SSSF) Joined: 19 Aug 99 Posts: 13739 Credit: 208,696,464 RAC: 304 |
> > Measured floating point speed 1909.55 million ops/sec > > Measured integer speed 3248.21 million ops/sec > > > > What is inferior about those numbers? > > Nothing at all ... they are fine numbers. They, like all benchmarks, are > meaningless. They key question is "how fast is the processor to do my > actual workload?" Yep, Whetstone & Dhrystone have no relationship to actual computing performance, especially the Whetstone. This chart is the perfect example of that. If you were to buy a CPU based on those figures you'd think that any 2.4GHz or faster Intel CPU would be better than the best that AMD has to offer for the desktop. You'd be very disappointed, as in actual applications the AMD is generally faster than the fastest Intel desktop CPU, and by a significant amount in some of them. Grant Darwin NT |
Kajunfisher Joined: 29 Mar 05 Posts: 1407 Credit: 126,476 RAC: 0 |
AuthenticAMD mobile AMD Athlon(tm) XP 2000+ w/1GB RAM Measured floating point speed 1131.93 million ops/sec (today) Measured integer speed 2595.49 million ops/sec (today) It may not be the fastest out there (that's for sure) but I'm content with it for now. As I said before, I have seen these benchmark stats change, up & down. *Question: Do all the BOINC projects use the same benchmark program? I'm sure the answer is in Paul's FAQ, just happened to think about it in the middle of my post... . No matter where you go, there you are... |
1mp0£173 Joined: 3 Apr 99 Posts: 8423 Credit: 356,897 RAC: 0 |
> The use of HT in a single physical processor is like all attempts to increase > throughput, a compromise. In a simple case lets look at using two physical > processors in a system. WIth that you would think that you would get 2x > performance. But now you have two processors using the same memory bandwidth, > meaning that you have created a new throughput bottleneck. ... and you get the additional overhead of trying to manage tasks running on two CPUs. Back in the day (when I had dual and quad Burroughs B-Series mainframes at my disposal) it was generally accepted that each additional processor added something like 0.8, so dual mainframes were 1.8 times the power of a single machine, three came out around 2.6, etc. |
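The rule of thumb in the post above (each extra processor adds about 0.8 of a machine) is easy to tabulate. This is a minimal sketch: the 0.8 increment is the poster's recollection, and the linear model and function name are assumptions made for illustration.

```python
def effective_power(processors: int, increment: float = 0.8) -> float:
    """Relative power of an n-processor machine under a linear rule of thumb.

    The first processor counts as 1.0; every additional one adds `increment`.
    """
    if processors < 1:
        raise ValueError("need at least one processor")
    return 1.0 + increment * (processors - 1)


if __name__ == "__main__":
    for n in range(1, 5):
        power = effective_power(n)
        # Efficiency is the share of ideal n-times scaling actually achieved.
        print(f"{n} processor(s): {power:.1f}x power, {power / n:.0%} efficiency")
```

Per-processor efficiency keeps falling as processors are added, which is the overhead the post describes; a linear rule like this one says nothing about where scaling eventually breaks down.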
Paul D. Buck Joined: 19 Jul 00 Posts: 3898 Credit: 1,158,042 RAC: 0 |
> *Question > Do all the BOINC projects use the same benchmark program? I'm sure the answer > is in Paul's FAQ, just happened to think about it in the middle of my post... Yes. The benchmark is part of the BOINC Daemon. There is code that will allow a project to tilt the scoring if they desire, but as far as I know, no project has chosen to do so. @Ned, Yeah, 1.8 to 1.9 ... though for the iAPX 432 that held good until about the 3rd or 4th processor, and then total performance dropped way low because the processors were talking to each other so much they had no time left to work. :) |
Kajunfisher Joined: 29 Mar 05 Posts: 1407 Credit: 126,476 RAC: 0 |
> > *Question > > Do all the BOINC projects use the same benchmark program? I'm sure the > answer > > is in Paul's FAQ, just happened to think about it in the middle of my > post... > > Yes. The benchmark is part of the BOINC Daemon. > > There is code that will allow a project to tilt the scoring if they desire, > but as far as I know, no project has chosen to do so. > could that possibly mean "in their favor"? what i mean is, that the WU they send is predetermined (larger/slower/more cpu intensive) to maybe avoid server overloads, etc.? (i'm way out on a limb here i think) . No matter where you go, there you are... |
Paul D. Buck Joined: 19 Jul 00 Posts: 3898 Credit: 1,158,042 RAC: 0 |
> could that possibly mean "in their favor"? what i mean is, that the WU they > send is predetermined (larger/slower/more cpu intensive) to maybe avoid server > overloads, etc.? (i'm way out on a limb here i think) Not to tilt it in any direction, but to "balance" the amount of floating point vs. integer work in the science application. |
©2024 University of California
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.