Message boards :
Number crunching :
Refresh My Memory, Why can't we detect CPU to use optimized
Author | Message |
---|---|
Paul D. Buck Send message Joined: 19 Jul 00 Posts: 3898 Credit: 1,158,042 RAC: 0 |
Can someone refresh my memory why we cannot detect the CPU capability and then use the most efficient processing code that you folks have laboriously tested? I mean, it is not THAT hard, is it? Sorry for the question, but, I am back in confused mode as to why we (the collective we) are being sub-optimal in our approach. It just does not make sense to me that we are not using the fastest processing code possible on the widest possible set of contributors. I mean, if there is a rash of errors, then you fall back to stock (or the next level down) ... As it is, we waste more time than we need to processing ... I suppose it is a silly question ... |
kittyman Send message Joined: 9 Jul 00 Posts: 51468 Credit: 1,018,363,574 RAC: 1,004 |
Can someone refresh my memory why we cannot detect the CPU capability and then use the most efficient processing code that you folks have laboriously tested? Not that old chestnut again....... You know darned well why Seti cannot support all platform tweaks,,,,,,, So quit the argument......please. "Freedom is just Chaos, with better lighting." Alan Dean Foster |
Paul D. Buck Send message Joined: 19 Jul 00 Posts: 3898 Credit: 1,158,042 RAC: 0 |
Not that old chestnut again....... Um, I am not arguing. I asked an honest question. If I knew the answer I would not have asked the question. If I knew the answer at one time in the past, well, I have since forgotten it. At one time in my past I could even drive a car. At this point, there are many things I can no longer perform. |
kittyman Send message Joined: 9 Jul 00 Posts: 51468 Credit: 1,018,363,574 RAC: 1,004 |
Not that old chestnut again....... I am sorry, Sir........ I sometimes forget who I am talking to........ You have my respect, and apologies...... "Freedom is just Chaos, with better lighting." Alan Dean Foster |
Toby Send message Joined: 26 Oct 00 Posts: 1005 Credit: 6,366,949 RAC: 0 |
It's not that it can't be done... It just hasn't been done yet. David Anderson created Trac ticket 562 one month ago that describes how he would like to see this handled. Of course it would only work with reasonably new clients that actually report the CPU capabilities to the server. A member of The Knights Who Say NI! For rankings, history graphs and more, check out: My BOINC stats site |
StokeyBob Send message Joined: 31 Aug 03 Posts: 848 Credit: 2,218,691 RAC: 0 |
Long time, no see! Paul D. Buck I haven't been on the message boards for a long time. It is good to see you still around. |
W-K 666 Send message Joined: 18 May 99 Posts: 19080 Credit: 40,757,560 RAC: 67 |
Joe (Josef W. Segur) in post 729751 Optimised Apps question states that: The stock app has limited support for up to SSE3, but only in certain specific routines. |
MarkJ Send message Joined: 17 Feb 08 Posts: 1139 Credit: 80,854,192 RAC: 5 |
Can someone refresh my memory why we cannot detect the CPU capability and then use the most efficient processing code that you folks have laboriously tested? Not only do you have to put code to determine what the cpu is capable of, you also need all this conditional stuff in there to use the optimizations at the appropriate point. It would make the stock app much larger and harder to maintain. Does it really matter if the app can detect the best capability and use it? After all we have the optimized app (currently from Crunch3r, and another version coming through) that can use your cpu to its potential. It's just that the user has to ascertain their cpu's capability (i.e. run cpu-z) and then use the appropriate app rather than the stock one. BOINC blog |
Josef W. Segur Send message Joined: 30 Oct 99 Posts: 4504 Credit: 1,414,761 RAC: 0 |
Can someone refresh my memory why we cannot detect the CPU capability and then use the most efficient processing code that you folks have laboriously tested? We do, though not yet everything which may eventually be included. If you do a standalone run of the stock S@H app now and use a -verbose argument then stderr.txt will show all the variant routines which were checked: v_GetPowerSpectrum 0.00263 0.00000 test That was on one of my systems which doesn't have more than SSE2, so doesn't show the sse3_ChirpData_ak variant chirping routine. The situation is fairly complex, simply checking the CPU capabilities is not always enough. For instance, on some Core 2 systems the SSE1 chirping turns out faster than the SSE3 version. That's why the app tests all the variants which the host can do for speed and accuracy. Joe |
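The "test every variant for speed and accuracy" approach Joe describes can be illustrated with a minimal sketch. This is not the stock S@H code: the function names, the tolerance, and the toy power-spectrum routines below are all hypothetical stand-ins, chosen only to show the idea of rejecting inaccurate variants and then keeping the fastest of the rest.

```python
import time


def power_spectrum_loop(samples):
    """Baseline (hypothetical): power of each (real, imag) pair, as a plain loop."""
    out = []
    for real, imag in samples:
        out.append(real * real + imag * imag)
    return out


def power_spectrum_comprehension(samples):
    """Alternate variant (hypothetical) that computes the same values."""
    return [real * real + imag * imag for real, imag in samples]


def power_spectrum_broken(samples):
    """Deliberately wrong variant, so the accuracy check has something to reject."""
    return [real + imag for real, imag in samples]


def time_once(func, data):
    """Wall-clock time of a single call."""
    start = time.perf_counter()
    func(data)
    return time.perf_counter() - start


def pick_variant(variants, reference, data, tolerance=1e-9, repeats=5):
    """Return (name, seconds) of the fastest variant that matches the reference.

    Returns (None, inf) if no variant passes the accuracy check.
    """
    expected = reference(data)
    best_name, best_time = None, float("inf")
    for name, func in variants.items():
        result = func(data)
        accurate = len(result) == len(expected) and all(
            abs(a - b) <= tolerance for a, b in zip(result, expected)
        )
        if not accurate:
            continue  # a fast but wrong routine is never selected
        elapsed = min(time_once(func, data) for _ in range(repeats))
        if elapsed < best_time:
            best_name, best_time = name, elapsed
    return best_name, best_time


samples = [(float(i), float(i) / 2.0) for i in range(1000)]
variants = {
    "loop": power_spectrum_loop,
    "comprehension": power_spectrum_comprehension,
    "broken": power_spectrum_broken,
}
chosen, seconds = pick_variant(variants, power_spectrum_loop, samples)
print("selected variant:", chosen)
```

Measuring on the host rather than trusting CPU flags alone is what lets a nominally older variant win when it really is faster, as in the Core 2 chirping case mentioned above.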
Paul D. Buck Send message Joined: 19 Jul 00 Posts: 3898 Credit: 1,158,042 RAC: 0 |
Um, thanks for the answers. Though, the thrust was not to "bulk-up" the stock application or to do conditionals. But, to follow the logic of Josef, to test the CPU and then to download the appropriately "tuned" application. I know we have the "stable" of applications and that there is a great deal of advice on the selection process which is why I posed the question. In the years prior, I remember that this was one of the most common questions on this forum, which app and how to install it ... It seems that this is one more area where we are no further along years later than we were ... We used to call situations like this, when I was a boy, "stuck in the mud" ... Anyway, thanks for the answers. |
ML1 Send message Joined: 25 Nov 01 Posts: 20334 Credit: 7,508,002 RAC: 20 |
Um, thanks for the answers. I think that the only workable solution is indeed to suffer 'bulking up' the stock application. The 'best' optimisation critically depends on what features the CPU supports but also upon all of:
We used to call situations like this, when I was a boy, "stuck in the mud" ... That's more an issue of what development effort is available. Yes, the science application can be improved. There's fantastic volunteer effort working on that. However, for Berkeley, I suspect that mere survival and getting something very visual working such as the NITPICKER are far far greater concerns for the 3(?) people available there. The present s@h is 'ticking along' nicely (server hardware panics aside!). The most urgent problems and developments are being worked on. Adding a few more percent performance is, I would guess, not a hot priority for the time being. Consistency doesn't always mean "thick mud". Good question still :-) Happy crunchin', Martin See new freedom: Mageia Linux Take a look for yourself: Linux Format The Future is what We all make IT (GPLv3) |
1mp0£173 Send message Joined: 3 Apr 99 Posts: 8423 Credit: 356,897 RAC: 0 |
It seems that this is one more area where we are no further along years later than we were ... I don't think I'd agree with the "no further along" argument. As Joe Segur points out, the current stock application does test (using benchmarks, not just CPU ID) and use routines that are more suited to certain processors.... But I also think there is a constant theme of assuming that the project(s) have more developer resources than they actually have. Does anyone know off the top of their heads how many staff developers are working on SETI (not BOINC, SETI) science applications? I think it's just Eric, plus a few volunteers and contributions from those doing the separate optimized apps. |
W-K 666 Send message Joined: 18 May 99 Posts: 19080 Credit: 40,757,560 RAC: 67 |
The other thing about the Seti units is that the various apps available at the moment may be good at one or two portions of the AR range, while another app is better at other parts of the AR range. It's a pity Tony's research has been withdrawn, it illustrated the problem very well. Ned, I'm pretty sure you are correct in your assumption that Eric is the only one working on the Seti app, when he has time. I think he is also involved with the Nitpicker and helping Josh with AstroPulse, plus all the other Seti paperwork etc. And we keep bugging him for updates and news. |
Clyde C. Phillips, III Send message Joined: 2 Aug 00 Posts: 1851 Credit: 5,955,047 RAC: 0 |
Maybe it would be better to compare crunchtimes for several samples of each angle range class for each processor than to use Dhrystone and Whetstone values. Often just the number of credits awarded each unit can serve as a proxy for the comparison of angle-range workunits. It might get tougher when comparing Intels with AMDs, though. Also, for days on end, there might not be workunits of certain angle range/number-of-credits classes distributed. |
1mp0£173 Send message Joined: 3 Apr 99 Posts: 8423 Credit: 356,897 RAC: 0 |
Maybe it would be better to compare crunchtimes for several samples of each angle range class for each processor than to use Dhrystone and Whetstone values. Often just the number of credits awarded each unit can serve as a proxy for the comparison of angle-range workunits. It might get tougher when comparing Intels with AMDs, though. Also, for days on end, there might not be workunits of certain angle range/number-of-credits classes distributed. Absolutely. The very best way to do this is to use the app. as the benchmark, and to use several selected "benchmark work units" to measure performance. Trouble is, running these benchmarks will take hours, and while crunch times are generally similar, you'd want to select the "benchmark" WUs carefully. So (at least for SETI) the best compromise is what we have: whetstone and dhrystone for a rough measure, and duration correction factor, averaged across several work units for a more representative time estimate. Also, for any project that counts flops, the benchmark has no role in claimed credit. If the benchmark is off, you'll overfetch or underfetch work, but it won't change your scores. |
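The compromise described above (a rough benchmark scaled by a correction factor averaged over several work units) can be sketched as follows. This is a simplified illustration, not BOINC's actual update rule: the function names, the smoothing weight, and the example numbers are all assumptions made for the sketch.

```python
def estimate_seconds(fpops_est, benchmark_flops, correction):
    """Rough runtime estimate: estimated operations / benchmark speed, then corrected."""
    return fpops_est / benchmark_flops * correction


def update_correction(correction, raw_estimate, actual_seconds, weight=0.1):
    """Blend the observed actual/estimate ratio into the running correction factor.

    Assumption: a plain exponential moving average; the real client may
    treat overruns and underruns differently.
    """
    ratio = actual_seconds / raw_estimate
    return (1.0 - weight) * correction + weight * ratio


# Hypothetical numbers: a work unit estimated at 3e13 operations on a host
# whose benchmark reports 1e9 floating point operations per second.
raw = estimate_seconds(3.0e13, 1.0e9, 1.0)

# The host actually finished in half the estimated time, so the factor drifts down.
correction = update_correction(1.0, raw, raw / 2.0)
next_estimate = estimate_seconds(3.0e13, 1.0e9, correction)
print(raw, correction, next_estimate)
```

Because the factor only scales the time estimate, a poor benchmark shows up as over- or under-fetching of work rather than as a change in credit, which matches the point made above.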
Josef W. Segur Send message Joined: 30 Oct 99 Posts: 4504 Credit: 1,414,761 RAC: 0 |
Paul D. Buck: Um, thanks for the answers. In essence, I think it's simply a matter of how many different versions of the app the project has the time to test or maintain. They currently have apps for 8 platforms (though some may be duplicates). If they had separate apps for the various CPU architectures, etc. then that might be multiplied by 4 or so. Much optimization involves finding the parts of the code which are executed billions or trillions of times while processing a WU and figuring out more efficient ways of doing those operations. That tends to be fairly small sequences of instructions, so providing alternate variants doesn't "bulk-up" the app significantly. The added variants in stock plus the code to test them amount to about 63 KB now, for instance, and similar for the Lunatics 2.4 builds. But of course using different compiler options for different target architectures makes it necessary to have many more Lunatics builds than the project can support. I do regret that BOINC doesn't provide a convenient method for users to get and update optimized builds from third parties. I wince every time I come across a host running an obsolete optimised version. The anonymous platform mechanism wasn't really designed for that purpose, and some fraction of users who install the optimised apps will fail to check for updates. Joe |
Odysseus Send message Joined: 26 Jul 99 Posts: 1808 Credit: 6,701,347 RAC: 6 |
Does anyone know off the top of their heads how many staff developers are working on SETI (not BOINC, SETI) science applications? I think it's just Eric, plus a few volunteers and contributions from those doing the separate optimized apps. There's a grad-student as well, Josh Von Korff, working on Astropulse. That's two, or maybe one and a half; I don't know whether Josh is full- or part-time. |
Brian Silvers Send message Joined: 11 Jun 99 Posts: 1681 Credit: 492,052 RAC: 0 |
The anonymous platform mechanism wasn't really designed for that purpose, and some fraction of users who install the optimised apps will fail to check for updates. Joe, Does this statement by you match what I had said, that the anonymous platform mechanism should only be for / was originally designed for unsupported OSes and not so much for SIMD optimization levels? Thanks... Brian |
Josef W. Segur Send message Joined: 30 Oct 99 Posts: 4504 Credit: 1,414,761 RAC: 0 |
The anonymous platform mechanism wasn't really designed for that purpose, and some fraction of users who install the optimised apps will fail to check for updates. BOINC provides a set of capabilities which allow the projects to focus on the work they want to get done. The documentation for the anonymous platform mechanism certainly indicates the BOINC developers were not specifically thinking about optimized versions of open source science applications, but the feature is flexible enough to allow that usage (with limitations). As to "should only be for", I think the BOINC developers would rather not spend time creating something new and better to handle optimized apps. They are probably pleased that what they provided is "good enough". I also think that if someone submitted code changes for something better they would accept them. They are currently working on ways to deal efficiently with multi-core and/or CPU plus GPU processing, perhaps some of the additions for that purpose will be adaptable. Joe |