Faster BOINC clients for Linux


Harold Naparst
Volunteer tester

Joined: 11 May 05
Posts: 236
Credit: 91,803
RAC: 0
Sweden
Message 171153 - Posted: 24 Sep 2005, 0:23:26 UTC

I've posted faster BOINC clients for Linux on my website, http://naparst.name. The clients are only for Pentium 4 and AMD64. They may run on AMD 32-bit chips supporting SSE2, but they'll probably run slow.

There is also an extensive discussion on the website about the morality of tweaking benchmark code.

I'm sure this will generate a lot of discussion!!

Happy crunching.
Harold Naparst
ID: 171153
Ned Slider

Joined: 12 Oct 01
Posts: 668
Credit: 4,375,315
RAC: 0
United Kingdom
Message 171265 - Posted: 24 Sep 2005, 10:36:53 UTC

Nice job Harold :)

I'll get the discussion going...

I agree with you on the BOINC optimization issue - anything other than straight compiler optimization of the Berkeley source code would be morally wrong. Of course it would be very easy to alter the source to dramatically increase claimed credit, but that in my book would be cheating. It would also be a somewhat pointless exercise, as the credit averaging system throws away the highest and lowest values anyway, so unless everyone cheats there's no point.

When I did my original optimized boinc client for linux, I did it for one reason - to eliminate the large disparity that existed at that time between credits claimed on Windows vs Linux platforms. Linux crunchers were claiming about half the credit of their Windows cousins on otherwise identical hardware. This was back on v4.13 (or before??) and continued with v4.19. Windows users would complain the Linux users were dragging down their averages, and Linux users would complain they weren't able to compete on a level playing field. My original optimized boinc clients leveled this playing field. Peter Smithson worked with me on this and then came up with the solution we have today (post v4.19) of splitting the benchmark source in an attempt to permanently level out the amount of (over) optimization any compiler could achieve. If one wants to see the effect of (re)combining the benchmark source, one could always use the v4.19 sources from before it was split :)

Of course the more recent introduction of optimized SETI clients for both platforms has confused the issue somewhat, although the credit averaging system on the whole does a good job of leveling out any large discrepancies :)

Ned

*** My Guide to Compiling Optimised BOINC and SETI Clients ***
*** Download Optimised BOINC and SETI Clients for Linux Here ***
ID: 171265
Harold Naparst
Volunteer tester

Joined: 11 May 05
Posts: 236
Credit: 91,803
RAC: 0
Sweden
Message 171270 - Posted: 24 Sep 2005, 10:52:04 UTC - in response to Message 171265.  


I agree with you on the BOINC optimization issue - anything other than straight compiler optimization of the Berkeley source code would be morally wrong.


Well, I'm not saying that, exactly. What I'm saying is this: Suppose you are using the really fast fastsin() function in your SETI code, and this causes your workunits to be processed faster. Then I think you have the moral right to use that same function in the BOINC benchmarks. Do you disagree? There is obviously no "right" answer.
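
To make that concrete, here is the kind of thing I mean. The fastsin() below is just a textbook polynomial approximation I picked for illustration; it is not code from any actual SETI or BOINC client:

#include <cmath>

const double PI = 3.14159265358979323846;

// Hypothetical fast sine: a low-order polynomial approximation, only
// reasonable for x in [-PI, PI]. Illustrative, not real client code.
inline double fastsin(double x) {
    double y = (4.0 / PI) * x - (4.0 / (PI * PI)) * x * std::fabs(x);
    return 0.775 * y + 0.225 * y * std::fabs(y);   // one refinement step
}

// If the science application calls fastsin() in its inner loop
// (phase assumed already reduced to [-PI, PI])...
double process_sample(double phase) {
    return 0.5 * fastsin(phase);
}

// ...then, the argument goes, the benchmark may fairly call the same
// routine instead of the library sin().
double benchmark_loop(int n) {
    double acc = 0.0;
    for (int i = 0; i < n; ++i) {
        double x = std::fmod(0.001 * i, 2.0 * PI) - PI;  // keep x in range
        acc += fastsin(x);
    }
    return acc;
}

Whether using it in the benchmark is fair is, of course, exactly the question.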

Harold
Harold Naparst
ID: 171270
Profile ML1
Volunteer moderator
Volunteer tester

Joined: 25 Nov 01
Posts: 21719
Credit: 7,508,002
RAC: 20
United Kingdom
Message 171310 - Posted: 24 Sep 2005, 13:29:39 UTC - in response to Message 171270.  

Well, I'm not saying that, exactly. What I'm saying is this: Suppose you are using the really fast fastsin() function in your SETI code, and this causes your workunits to be processed faster. Then I think you have the moral right to use that same function in the BOINC benchmarks. ...

This comes back to the question of: How representative is the benchmark in the first place?

There have already been long discussions about the idea of using a calibration WU to directly calibrate a user's system for whatever optimised or non-optimised code they are using. It would also be a good validation test that the system produces correct results. There are various ideas to minimise the calibration overhead, both for slow machines and in general.


Aside: I guess there's a bit of 'translation' for the use of the phrase 'moral right'. :)

My own view is that a 'fiddle factor' to convert the benchmark into a closer match against the "Cobble Computer" reference is fair and good. It is exactly what is being done with the latest client scheduler 'fiddle-factor' that successively brings users' claimed credit into agreement.

I strongly prefer the idea of using a "calibration WU", despite the extra complexity of the calibration. However, the calibration then makes credit calculation much simpler all round.
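
To show why I think it ends up simpler, here is a minimal sketch of the calibration idea. The names and the 30-credit value are my own assumptions, not anything taken from the actual client or scheduler:

// The project decides, once, how many cobblestones its calibration WU is
// worth on the reference "Cobble Computer". 30 is an assumed value.
const double CALIBRATION_WU_CREDIT = 30.0;

// The host runs the calibration WU once and remembers how long it took.
// Claimed credit for any later WU is then just proportional CPU time:
// no synthetic benchmark and no fiddle factor needed.
double claimed_credit(double wu_cpu_seconds, double calibration_cpu_seconds) {
    return CALIBRATION_WU_CREDIT * (wu_cpu_seconds / calibration_cpu_seconds);
}

The 'fiddle factor' then reduces to a single ratio measured on the host itself.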

Regards,
Martin
See new freedom: Mageia Linux
Take a look for yourself: Linux Format
The Future is what We all make IT (GPLv3)
ID: 171310
Harold Naparst
Volunteer tester

Joined: 11 May 05
Posts: 236
Credit: 91,803
RAC: 0
Sweden
Message 171312 - Posted: 24 Sep 2005, 14:11:56 UTC - in response to Message 171310.  


I strongly prefer the idea of using a "calibration WU", despite the extra complexity of the calibration. However, the calibration then makes credit calculation much simpler all round.

Martin


Well, of course, the best benchmark for SETI is a SETI workunit. But the problem is that SETI isn't the only BOINC project. We need a BOINC benchmark that will be representative across all BOINC projects.


Harold Naparst
ID: 171312
Profile ML1
Volunteer moderator
Volunteer tester

Joined: 25 Nov 01
Posts: 21719
Credit: 7,508,002
RAC: 20
United Kingdom
Message 171351 - Posted: 24 Sep 2005, 17:08:44 UTC - in response to Message 171312.  


I strongly prefer the idea of using a "calibration WU", despite the extra complexity of the calibration. However, the calibration then makes credit calculation much simpler all round.

Martin


Well, of course, the best benchmark for SETI is a SETI workunit. But the problem is that SETI isn't the only BOINC project. We need a BOINC benchmark that will be representative across all BOINC projects.

Yes, and so each project just has its own calibration WU, specific to that project and calibrated against the "Cobble Computer" reference.

(Or they can rely on the existing benchmarking if that is accurate enough.)

Regards,
Martin
See new freedom: Mageia Linux
Take a look for yourself: Linux Format
The Future is what We all make IT (GPLv3)
ID: 171351
Harold Naparst
Volunteer tester

Joined: 11 May 05
Posts: 236
Credit: 91,803
RAC: 0
Sweden
Message 171380 - Posted: 24 Sep 2005, 19:08:42 UTC - in response to Message 171351.  


I strongly prefer the idea of using a "calibration WU", despite the extra complexity of the calibration. However, the calibration then makes credit calculation much simpler all round.

Martin


Well, of course, the best benchmark for SETI is a SETI workunit. But the problem is that SETI isn't the only BOINC project. We need a BOINC benchmark that will be representative across all BOINC projects.

Yes, and so each project just has its own calibration WU, specific to that project and calibrated against the "Cobble Computer" reference.

(Or they can rely on the existing benchmarking if that is accurate enough.)

Regards,
Martin


How would you propose then to aggregate credits across projects? If I decide to split my CPU time equally between SETI and Climate Prediction, I want to get paid in BOINC credits, not two different types of currency, so to speak.


Harold Naparst
ID: 171380
Profile michael37
Joined: 23 Jul 99
Posts: 311
Credit: 6,955,447
RAC: 0
United States
Message 171522 - Posted: 25 Sep 2005, 1:29:16 UTC

Harold,

Here is my take on it, as a practical user rather than a purist.

My boinc benchmark using Metod's P4-optimized 4.72 binary.
2005-09-24 20:51:00 [---] Number of CPUs: 1
2005-09-24 20:51:00 [---] 2116 double precision MIPS (Whetstone) per CPU
2005-09-24 20:51:00 [---] 4608 integer MIPS (Dhrystone) per CPU

My boinc benchmark using Harold Naparst's "For Pentium 4" ICC-built client.
2005-09-24 21:17:43 [---] Number of CPUs: 1
2005-09-24 21:17:43 [---] 2833 double precision MIPS (Whetstone) per CPU
2005-09-24 21:17:43 [---] 5846 integer MIPS (Dhrystone) per CPU

Using the first boinc client, my P4 Xeon claimed the lowest score of the three for the past two weeks. On average, the granted credit for this computer is 70% higher than the claimed credit. Assuming granted credit to be "the fair credit", even the newer, faster boinc client will not address the unfairness of the boinc benchmark for the calculation of claimed credit.
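
For context, this is how I understand those benchmark numbers turn into a claimed credit. This is my recollection of the cobblestone definition (100 credits per day of CPU on a reference host doing 1000 double-precision MIPS and 1000 integer MIPS, averaging the two benchmarks), so treat it as approximate:

// Approximate claimed-credit formula as I understand it; not copied from
// the BOINC source, so the details may differ.
double claimed_credit(double cpu_seconds, double whetstone_mips, double dhrystone_mips) {
    double mean_mips = (whetstone_mips + dhrystone_mips) / 2.0;
    return (cpu_seconds / 86400.0) * (mean_mips / 1000.0) * 100.0;
}

// Assuming a 3-hour workunit, purely for the arithmetic:
//   claimed_credit(3 * 3600, 2116, 4608)  is about 42   (Metod's build)
//   claimed_credit(3 * 3600, 2833, 5846)  is about 54   (ICC build)

So the faster client raises my claim by roughly 30%, which still leaves it well short of the roughly 70% gap to granted credit.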
ID: 171522
Profile Tern
Volunteer tester
Joined: 4 Dec 03
Posts: 1122
Credit: 13,376,822
RAC: 44
United States
Message 171526 - Posted: 25 Sep 2005, 2:17:48 UTC - in response to Message 171380.  

How would you propose then to aggregate credits across projects? If I decide to split my CPU time equally between SETI and Climate Prediction, I want to get paid in BOINC credits, not two different types of currency, so to speak.


Credits are credits - the issue is making the benchmark match the project. If it is decided that a "reference" SETI WU run in "x" time equals "y" flops and "z" iops, then the "y" and "z" are what would be in the benchmark fields on the SETI pages. (Which would automatically compensate for ANY given condition of optimized client, type of CPU, amount of level 2 cache, whatever.) Meanwhile if CPDN is happy with the current benchmarks, then that is what would be stored on the CPDN pages.

This isn't a major rewrite of the whole system - it's just an improvement over using some arbitrary benchmark system that has been shown to not "match" reality in way too many cases.
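
A sketch of how I picture the server side, with invented names (none of this is existing BOINC code):

// Per-project reference values: the "y" and "z" described above.
struct ProjectReference {
    double ref_wu_flops;   // total floating-point ops the reference WU does
    double ref_wu_iops;    // total integer ops it does
};

// Run the reference WU on the host, time it, and store these per-project
// effective rates in place of the synthetic benchmark fields.
struct HostBenchmark {
    double p_fpops;        // effective flops/sec for this project
    double p_iops;         // effective iops/sec for this project
};

HostBenchmark calibrate(const ProjectReference& ref, double ref_cpu_seconds) {
    return { ref.ref_wu_flops / ref_cpu_seconds,
             ref.ref_wu_iops / ref_cpu_seconds };
}

A project that is happy with the synthetic benchmark would simply skip the calibration and keep the existing fields.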
ID: 171526
Harold Naparst
Volunteer tester

Joined: 11 May 05
Posts: 236
Credit: 91,803
RAC: 0
Sweden
Message 171527 - Posted: 25 Sep 2005, 2:25:44 UTC - in response to Message 171522.  

On average, the granted credit for this computer is 70% higher than the claimed credit. Assuming granted credit to be "the fair credit", even the newer faster boinc client will not address the unfairness of the boinc benchmark for the calculation of claimed credit.


I'm acutely aware of this. The SETI client is just too fast, because it uses super-optimized IPP routines. It would be more realistic to have the benchmark crunch only in single precision, because that is what SETI does. However, SETI has made the decision to use Whetstone, which is double precision. I can't in good conscience just decide to change the benchmark.



Harold Naparst
ID: 171527
Profile ML1
Volunteer moderator
Volunteer tester

Joined: 25 Nov 01
Posts: 21719
Credit: 7,508,002
RAC: 20
United Kingdom
Message 171668 - Posted: 25 Sep 2005, 14:06:20 UTC - in response to Message 171380.  
Last modified: 25 Sep 2005, 14:07:39 UTC

... Yes, and so each project just has its own calibration WU, specific to that project and calibrated against the "Cobble Computer" reference.

(Or they can rely on the existing benchmarking if that is accurate enough.)

Regards,
Martin


How would you propose then to aggregate credits across projects? If I decide to split my CPU time equally between SETI and Climate Prediction, I want to get paid in BOINC credits, not two different types of currency, so to speak.

Exactly as described above. s@h use a calibrated s@h WU, CPDN use a representative trickle example WU, the other projects similarly use a calibrated example of one of their own WUs. This then gives an accurate calibration for all aspects of the host hardware and client software. You also get a validation by checking that the results are correct.

Or if accurate enough for that project, the existing benchmark can be used instead.

(And yes, CPDN don't use any benchmark or anything. They just award 'trickles' and a completion value regardless of the simulation.)

Regards,
Martin
See new freedom: Mageia Linux
Take a look for yourself: Linux Format
The Future is what We all make IT (GPLv3)
ID: 171668
Profile Paul D. Buck
Volunteer tester

Joined: 19 Jul 00
Posts: 3898
Credit: 1,158,042
RAC: 0
United States
Message 171706 - Posted: 25 Sep 2005, 15:25:16 UTC - in response to Message 171312.  

We need a BOINC benchmark that will be representative across all BOINC projects.

The point I was trying to make with regard to calibration, and using *ONLY* the SETI@Home reference work unit to start, is that the whole point of using a synthetic benchmark is that it is project-neutral.

However, the current benchmark system is so poor, and has so little relationship to reality that going back to a reference work unit type test would be preferable even if we sacrifice cross-project neutrality.

Just watching my systems, using my thumbnail as a gauge, tells me that the amount of work done on each system for each project *IS* proportional. With that in mind, an accurate calibration of the system using SETI@Home work will be relatively accurate for other projects as well.

Though the projects *DO* have substantial differences in what they are doing, the similarities are much bigger. In other words, in the large the projects are more alike than different in what they are doing. Large numbers of iterative calculations. So, benchmarking using one project will in fact give us a better measurement than we are getting with a synthetic benchmark.

The only purpose that the synthetic benchmark will be used for is to see if it makes sense to benchmark the system with the work unit test. If the system is one that takes more than 6 hours to do a current SETI@Home work unit I would use cross-calibration on that system. For faster systems I would run the reference work unit and directly calibrate that system. I would also issue credit as this testing *IS* part of the process.
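
A sketch of that decision rule, where the 6-hour threshold is the one mentioned above and everything else is a placeholder:

enum class CalibrationMethod { ReferenceWU, CrossCalibrate };

CalibrationMethod choose_method(double estimated_ref_wu_hours) {
    // Slow hosts should not spend a day on calibration; derive their
    // factor from an already-calibrated host instead.
    if (estimated_ref_wu_hours > 6.0)
        return CalibrationMethod::CrossCalibrate;
    // Fast hosts run the reference WU directly, and still get credit for
    // it, since the result is real work.
    return CalibrationMethod::ReferenceWU;
}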

The two biggest complaints I have about the synthetic benchmark are:

1) it can vary significantly from one run to the next
2) it does not measure what we intended it to measure very well at all

Because of the variability the synthetic benchmark is so cross-project neutral that it does not have any relationship to the projects, or itself, at all ...
ID: 171706
Harold Naparst
Volunteer tester

Joined: 11 May 05
Posts: 236
Credit: 91,803
RAC: 0
Sweden
Message 171710 - Posted: 25 Sep 2005, 15:39:39 UTC - in response to Message 171706.  


2) it does not measure what we intended it to measure very well at all


This is very true. The Dhrystone overwhelms the benchmark calculation, and most of the Dhrystone is irrelevant for BOINC projects. As an example, 25% of the Dhrystone calculation is (uh, make that was) spent in string copying and string comparison, which BOINC projects do very little of. It might be a better idea to replace the Whetstone+Dhrystone calculation with either just Whetstone or perhaps some other benchmark like single-precision LINPACK. There are probably better choices for standard benchmarks.
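
Just to show the direction I mean, here is a toy single-precision timing loop. It is only an illustration; a real replacement would be a standard kernel such as single-precision LINPACK, not this:

#include <chrono>

// Toy single-precision benchmark: one multiply and one add per iteration.
double single_precision_mflops(long iterations) {
    using clock = std::chrono::steady_clock;
    volatile float acc = 1.0f;              // keeps the loop from being optimized away
    auto t0 = clock::now();
    for (long i = 0; i < iterations; ++i)
        acc = acc * 1.000001f + 0.000001f;  // 2 floating-point ops
    auto t1 = clock::now();
    double seconds = std::chrono::duration<double>(t1 - t0).count();
    return (2.0 * iterations) / (seconds * 1.0e6);
}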


Harold Naparst
ID: 171710
Profile ML1
Volunteer moderator
Volunteer tester

Joined: 25 Nov 01
Posts: 21719
Credit: 7,508,002
RAC: 20
United Kingdom
Message 171771 - Posted: 25 Sep 2005, 18:09:14 UTC - in response to Message 171706.  

... The two biggest complaints I have about the synthetic benchmark are:

1) it can vary significantly from one run to the next
2) it does not measure what we intended it to measure very well at all

Because of the variability the synthetic benchmark is so cross-project neutral that it does not have any relationship to the projects, or itself, at all ...

Ouch!

I agree. And using the s@h calibration as a relative calibration for the other projects is a good starting idea.

Note also though: The present benchmark was a good first attempt at a general benchmark to characterise a host system. We just need someone to volunteer to update the system to something better.

Regards,
Martin
See new freedom: Mageia Linux
Take a look for yourself: Linux Format
The Future is what We all make IT (GPLv3)
ID: 171771
Astro
Volunteer tester
Joined: 16 Apr 02
Posts: 8026
Credit: 600,015
RAC: 0
Message 171803 - Posted: 25 Sep 2005, 19:25:18 UTC

Computers (like my laptop, which started overheating and doubling crunch times) have a funny way of not continually behaving in a predictable manner. To offset errors of this type, the benchmarks are run every 5 days. Now, if we use a "calibration" WU/result to set claimed credit, then that WU will have to be run every so often. What happens to the P2s and P3s that take 24 hours to crunch a WU? That would mean every 5th or 6th WU ends up being a "calibration" WU. NOT fair to them.
ID: 171803
Profile ML1
Volunteer moderator
Volunteer tester

Joined: 25 Nov 01
Posts: 21719
Credit: 7,508,002
RAC: 20
United Kingdom
Message 171809 - Posted: 25 Sep 2005, 19:51:04 UTC - in response to Message 171803.  
Last modified: 25 Sep 2005, 19:53:24 UTC

Computers ... not continually behaving in a predictable manner ... What happens to the P2s and P3s that take 24 hours to crunch a WU? That would mean every 5th or 6th WU ends up being a "calibration" WU. NOT fair to them.

The benchmark is used (with its unpredictability);

Or, those machines are calibrated against another (faster) machine that has already been calibrated and has run the same WU. This is similar in some ways to what is already done with the present 'claimed credit'/'granted credit' fiddling.

Hence, slow machines (and even faster machines) need not run the calibration WU very often at all, provided, that is, that their credit claims continue to closely agree with others' claims. Any big discrepancy triggers a recalibration. Too many recalibrations in succession and that host gets a black mark for claimed credit, and then it simply gets told what it gets!
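
A rough sketch of that trigger logic; the tolerance, the outlier count and the 'black mark' handling are all placeholders, not an agreed design:

#include <cmath>

struct HostState {
    double calibration_factor = 1.0; // from the last calibration WU
    int consecutive_outliers = 0;
    bool black_mark = false;         // claims ignored; host is told its credit
};

void schedule_recalibration(HostState&) { /* queue another calibration WU */ }

void check_claim(HostState& host, double claimed, double quorum_median) {
    const double TOLERANCE = 0.15;   // assumed: 15% disagreement allowed
    const int MAX_OUTLIERS = 3;      // assumed
    double error = std::fabs(claimed - quorum_median) / quorum_median;
    if (error <= TOLERANCE) {
        host.consecutive_outliers = 0;
        return;
    }
    if (++host.consecutive_outliers >= MAX_OUTLIERS)
        host.black_mark = true;      // stop trusting this host's claims
    else
        schedule_recalibration(host);
}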

Yes?

Regards,
Martin
See new freedom: Mageia Linux
Take a look for yourself: Linux Format
The Future is what We all make IT (GPLv3)
ID: 171809
Profile Paul D. Buck
Volunteer tester

Joined: 19 Jul 00
Posts: 3898
Credit: 1,158,042
RAC: 0
United States
Message 172030 - Posted: 26 Sep 2005, 15:24:50 UTC - in response to Message 171803.  

What happens to the P2s and P3s that take 24 hours to crunch a WU? That would mean every 5th or 6th WU ends up being a "calibration" WU. NOT fair to them.

As Martin said, during the comparison of claimed credit, if the machine's claim changed then the calibration constant would be adjusted.

I don't see how it would not be "fair". As I stated, under the current system you get 0 credit for the time used in benchmarking; in calibration, the work unit processed would count like any other work unit. So where is the complaint?

Granted, if the work unit did not return a correct result the credit granted would be zero, but that is always the case. If your machine is not returning valid results, why should you be rewarded?

And last but not least, I already pointed out that computers that take excessive time to complete would be cross-calibrated rather than calibrated directly. Again, this is all design discussion and there are many ways to implement the core idea. The whole point is to move the system onto a more, um, scientific basis. I am still surprised that there has not been a harder push to qualify the work performed.

When we put up a new scientific tool it always goes through a period of testing and calibration ... why we are not doing that with the computation system is beyond me ... especially given that iterative computation systems are more prone to numerical error effects ... but that is just the engineer in me not liking sloppiness ...
ID: 172030
W-K 666 Project Donor
Volunteer tester

Joined: 18 May 99
Posts: 19677
Credit: 40,757,560
RAC: 67
United Kingdom
Message 172058 - Posted: 26 Sep 2005, 16:28:42 UTC

Tony

We had a similar discussion about 3 months ago, and I suggested that once the hosts were calibrated then the claimed credits for each user crunching the same result should be the same.

So suppose a variable is set up and given a value of, say, 500. It would be decreased by one for each returned result whose claimed credit agreed, within fairly tight limits, with the others, and decreased by a greater figure if it exceeded the specified limits. When the variable reaches zero, the host is re-calibrated.

This would give a time between calibration runs of about 5 days for the fastest 4-CPU machines, and longer for us slower people, and it would ensure that machines with suspect 'claimed credits' were calibrated more often.
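
A small sketch of the counter, using the example value of 500 from above; the 5% limit and the size of the bigger decrement are my own guesses:

#include <cmath>

struct Host {
    int countdown = 500;             // starting value suggested above
};

void on_result_returned(Host& host, double claimed, double quorum_claimed) {
    const double TIGHT_LIMIT = 0.05; // "fairly tight limits": assumed 5%
    const int BIG_DECREMENT = 25;    // assumed larger figure for outliers
    double error = std::fabs(claimed - quorum_claimed) / quorum_claimed;
    host.countdown -= (error <= TIGHT_LIMIT) ? 1 : BIG_DECREMENT;
    if (host.countdown <= 0) {
        // re-run the calibration WU, then start the countdown again
        host.countdown = 500;
    }
}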

Andy
ID: 172058
