SSE2 or SSE3?
Author | Message |
---|---|
Bert Joined: 12 Oct 06 Posts: 84 Credit: 813,295 RAC: 0 |
When BOINC starts up, it identifies my processor as supporting only SSE2. Here are the first few lines of the messages:

5/2/2008 3:15:00 PM||Starting BOINC client version 5.10.45 for windows_intelx86
5/2/2008 3:15:00 PM||log flags: task, file_xfer, sched_ops
5/2/2008 3:15:00 PM||Libraries: libcurl/7.18.0 OpenSSL/0.9.8e zlib/1.2.3
5/2/2008 3:15:00 PM||Executing as a daemon
5/2/2008 3:15:00 PM||Data directory: C:\Program Files\BOINC
5/2/2008 3:15:00 PM||BOINC is running as a service and as a non-system user.
5/2/2008 3:15:00 PM||No application graphics will be available.
5/2/2008 3:15:00 PM|SETI@home|Found app_info.xml; using anonymous platform
5/2/2008 3:15:00 PM||Processor: 1 AuthenticAMD AMD Athlon(tm) 64 Processor 3500+ [x86 Family 15 Model 47 Stepping 2]
5/2/2008 3:15:00 PM||Processor features: fpu tsc pae nx sse sse2 3dnow mmx
5/2/2008 3:15:00 PM||OS: Microsoft Windows XP: Professional Edition, Service Pack 2, (05.01.2600.00)
5/2/2008 3:15:00 PM||Memory: 2.00 GB physical, 3.85 GB virtual
5/2/2008 3:15:00 PM||Disk: 111.78 GB total, 70.64 GB free
5/2/2008 3:15:00 PM||Local time is UTC -4 hours

CPU-Z identifies the processor as SSE3 capable. The optimized application I have been using up to now is SSE2 only, and I would like to use the AK application, which has no SSE2 version. Which do I believe, BOINC Manager or CPU-Z? I looked it up on Google and read that this CPU (Toledo) is SSE3 capable. |
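Editorial sketch: the by-eye check of the startup messages above can be automated. This is a minimal illustration, not part of BOINC. The function name `features_from_boinc_log` is hypothetical, and the code assumes the log keeps the `Processor features:` layout shown in the post.

```python
def features_from_boinc_log(log_text):
    """Return the set of CPU feature flags found on the
    'Processor features:' line of a BOINC startup log.
    Returns an empty set if no such line is present."""
    marker = "Processor features:"
    for line in log_text.splitlines():
        if marker in line:
            # Everything after the marker is a space-separated flag list.
            return set(line.split(marker, 1)[1].split())
    return set()


# Two lines copied from the startup messages quoted in the post.
sample = (
    "5/2/2008 3:15:00 PM||Processor: 1 AuthenticAMD AMD Athlon(tm) 64 "
    "Processor 3500+ [x86 Family 15 Model 47 Stepping 2]\n"
    "5/2/2008 3:15:00 PM||Processor features: fpu tsc pae nx sse sse2 3dnow mmx\n"
)

flags = features_from_boinc_log(sample)
print("sse2" in flags, "sse3" in flags)  # prints: True False
```

This only reports what BOINC logged. It says nothing about what the CPU itself supports, which is the disagreement the rest of the thread is about.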
Joined: 24 Feb 01 Posts: 131 Credit: 3,739,307 RAC: 0 |
I decided to give my "old" P4 "Prescott", 3.8Ghz, Hyperthreading, Intel, a try on the new app. CPU-Z says it is SSE3 capable, and so does BOINC Manager, but the new app knocked it out in seconds! Is my little baby really too old for V8? Edit, to Bert: please excuse me for joining your thread, but I have almost the same problem! Happy crunchin', Kurt |
Bert Joined: 12 Oct 06 Posts: 84 Credit: 813,295 RAC: 0 |
> I decided to give my "old" P4 "Prescott", 3.8Ghz, Hyperthreading, Intel, a try on the new app.

Not a problem, your joining in. It is an open forum after all. I may have found the answer to my problem at the AMD site. This is what it says about the Athlon processors:
# Sixteen 128-bit SSE/SSE2/SSE3 registers
# AMD Digital Media XPress™ provides support for SSE, SSE2, SSE3 and MMX instructions
So I will assume the chip is SSE3 capable and try the new application. |
John McLeod VII Joined: 15 Jul 99 Posts: 24806 Credit: 790,712 RAC: 0 |
> I decided to give my "old" P4 "Prescott", 3.8Ghz, Hyperthreading, Intel, a try on the new app.

Your operating system also has to know about the extension set. The OS is in charge of doing context switches (when one application stops and another starts on the CPU; this can happen several times per second). During a context switch, ALL of the registers must be saved so that the old application can get its context back when it is switched back onto the CPU.

For example: Application 1 is in the middle of a calculation and leaves the value pi (3.14159...) in a register. The OS does not know about this register. Application 2 comes along and puts the value sqrt(2) (1.414...) in the register. Now Application 1 comes back, sees sqrt(2) instead of pi, and continues along assuming that the value is pi. (I hope that you can see where this is going.)

BOINC asks the OS what the CPU can do. This is safe: if the OS recognizes the extension, then the OS can handle the context switches that use that register set. CPU-Z uses its own code to recognize the hardware. This is dangerous because the OS may NOT be able to do a context switch. BOINC WIKI |
John McLeod VII Joined: 15 Jul 99 Posts: 24806 Credit: 790,712 RAC: 0 |
> I decided to give my "old" P4 "Prescott", 3.8Ghz, Hyperthreading, Intel, a try on the new app.

If the OS says it is SSE3, then the OS knows how to recognize SSE3 and do context switches on that extension. This should be safe. BOINC WIKI |
Joined: 24 Feb 01 Posts: 131 Credit: 3,739,307 RAC: 0 |
> I decided to give my "old" P4 "Prescott", 3.8Ghz, Hyperthreading, Intel, a try on the new app.

Thanks for your reply, John. Call it bad luck, Murphy's Law, kismet or whatever: the fact is, my motherboard broke the very second I started BOINC. This has nothing to do with the new app; it would have broken anyway, as the guy in the store told me this morning. Once it is repaired, I'll give it another try! Best regards, Kurt |
Bert Joined: 12 Oct 06 Posts: 84 Credit: 813,295 RAC: 0 |
> I decided to give my "old" P4 "Prescott", 3.8Ghz, Hyperthreading, Intel, a try on the new app.

Thanks for the information. I understand the explanation (BS in Computer Science here) and will do some more checking before jumping in. There has to be a utility out there somewhere that will check it out. Best regards, Adalberto |
Bert Joined: 12 Oct 06 Posts: 84 Credit: 813,295 RAC: 0 |
> I decided to give my "old" P4 "Prescott", 3.8Ghz, Hyperthreading, Intel, a try on the new app.

I finally tried the AK application. Although BOINC still shows SSE2 found and CPU-Z still shows the processor to be SSE3 capable, I guess running the opti application is the final answer. Good news: the AK application has been running for 3 days, and there is a very definite decrease in crunch time and no errors in any of the WUs crunched. They are also validating just fine. I still don't see much of an increase in RAC, but that takes time to stabilize. |
Joined: 31 Jul 01 Posts: 2467 Credit: 86,146,931 RAC: 0 |
> I finally tried the AK application. Although Boinc still shows SSE2 found and CPU-Z still shows the processor to be SSE3 capable I guess running the opti application is the final answer.

The following is taken from one of your completed work units:

Windows optimized S@H Enhanced application by Alex Kan
Version info: SSE3 (AMD/Intel, Core 2-optimized v8-nographics) V5.13 by Alex Kan
SSE3 Win32 Build 41 , Ported by : Jason G, Raistmer, JDWhale
CPUID: AMD Athlon(tm) 64 Processor 3500+
Speed: 1 x 2200 MHz
Cache: L1=64K L2=512K
Features: MMX SSE SSE2 SSE3

The "Version info" line indicates that you are using the SSE3 build of AKV8, so that is correct. The "Features" line indicates that your CPU is capable of SSE3. It looks like you got the right version of AKV8 and are using it correctly. |
Bert Joined: 12 Oct 06 Posts: 84 Credit: 813,295 RAC: 0 |
> I finally tried the AK application. Although Boinc still shows SSE2 found and CPU-Z still shows the processor to be SSE3 capable I guess running the opti application is the final answer.

Yes, it looks that way. The problem was that although CPU-Z shows the processor to be SSE3 capable, BOINC reports it is SSE2 only. Here are a few lines of the messages when BOINC starts up:

5/17/2008 2:41:52 PM|SETI@home|Found app_info.xml; using anonymous platform
5/17/2008 2:41:53 PM||Processor: 1 AuthenticAMD AMD Athlon(tm) 64 Processor 3500+ [x86 Family 15 Model 47 Stepping 2]
5/17/2008 2:41:53 PM||Processor features: fpu tsc pae nx sse sse2 3dnow mmx
5/17/2008 2:41:53 PM||OS: Microsoft Windows XP: Professional Edition, Service Pack 2, (05.01.2600.00)
5/17/2008 2:41:53 PM||Memory: 2.00 GB physical, 3.85 GB virtual
5/17/2008 2:41:53 PM||Disk: 111.78 GB total, 69.90 GB free

The processor features line only goes up to sse2, and that difference triggered my original question. I am getting good results with the AK/SSE3 version now. My RAC has climbed to an unprecedented 467+. This is not too bad for a single-processor chip that is not overclocked, or at least not overclocked that I know of. ASUS boards do have a tendency to overclock without the owner trying to do it, or so I have read. |
Joined: 9 Jun 99 Posts: 15184 Credit: 4,362,181 RAC: 3 |
> The problem was that although cpuz shows the processor to be SSE3 capable, Boinc reports it is SSE2 only.

BOINC will only tell you which options your operating system supports. It won't read them directly from the CPU or BIOS, as CPU-Z does. For instance, under Windows 2000 my P4 2.8 GHz was reported as supporting only MMX and SSE. It is SSE2 capable, and I was already using the optimized SETI application for that. Under Windows XP, BOINC shows the CPU as SSE2 capable as well. Therefore, always check with CPU-Z what the capabilities of the CPU are. |
John McLeod VII Joined: 15 Jul 99 Posts: 24806 Credit: 790,712 RAC: 0 |
One more time. Using a processor extension that the Operating System does not support is a very bad idea. The Operating System is responsible for saving the state of the processor extension when the Operating System decides to do a context switch. If you have two processes that are using the extension, you will get wrong answers. This is not a maybe, it is a will. I cannot state this strongly enough. Occasionally, you are just going to get wrong answers, and you will not be able to figure out why. Using a CPU extension not supported by the Operating System is harming the science. When you are choosing an optimized application, use the ones that BOINC says the operating system supports, not what CPU-Z says the chipset supports. BOINC WIKI |
Fred W Joined: 13 Jun 99 Posts: 2524 Credit: 11,954,210 RAC: 0 |
> One more time.

Sorry to take issue with you on this, but I feel I must. I have been crunching SETI under BOINC for almost 12 months using XP Home SP2 on a machine that has progressed from an E6400 to a Q6600 to a Q9450. This is my main desktop machine, on which I also do web browsing, email, spreadsheets, Word, etc. BOINC tells me that it is currently capable of supporting:

19/05/2008 14:29:03||Processor: 4 GenuineIntel Intel(R) Core(TM)2 Quad CPU Q9450 @ 2.66GHz [x86 Family 6 Model 23 Stepping 7]
19/05/2008 14:29:03||Processor features: fpu tsc pae nx sse sse2 mmx

and I am sure that sse2 was the max for both of the previous processors too. I have always run at least SSSE3 Optimised Apps on it, and have recently been trying both the latest SSSE3x and SSE4.1 Op Apps, since the Q9450 supports SSE4.1 (I decided that the SSSE3x is probably faster at the moment).

"you will get wrong answers": this would cause validation failure, so it is not "harming the science". I do not recall ever getting a validation failure that could not be explained. For the past 6 months I have also been logging all my WUs fairly rigorously, and, although I cannot guarantee that I have captured every one, there is not a single "Validate Error" in the log. I am sure that, if there were a problem with the Op Apps producing "wrong answers" which failed to validate, it would have been raised as an issue in the Number Crunching forum, since most posters there follow the recommendation of CPU-Z when choosing which Op App to load.

If your suggestion were widely followed, the speed of the SSE2 Op App compared to the SSSE3x Op App would mean that the project would suffer, because I (and many others) could do much less work. Could you, perhaps, publish some evidence to support your assertion that the science will be harmed? F. |
John McLeod VII Joined: 15 Jul 99 Posts: 24806 Credit: 790,712 RAC: 0 |
> One more time.

If you never, ever have another application that uses the extension, and BOINC never starts a second S@H task that uses the unsupported extension while the first is running, you are correct. It will also only affect the one calculation set where the context switch occurs, possibly only one peak or triplet or such. I don't know the code quite well enough to know how large the effect is. Something you have to ask yourself: is faster, less reliable calculation preferred over slower, more reliable calculation?

The way a context switch works, at the end of a timeslice as determined by the OS:
Step 1: Save all registers, including the processor extensions, from the CPU to RAM for the process that the OS is removing from the CPU. Associate this RAM with the task that is being swapped out.
Step 2: Copy all registers, including processor extensions, from RAM to the CPU for the task that is about to run.

If the OS does not know about a register, then the OS does not know to save that register to RAM. If it is not saved to RAM, then it cannot be counted on to have the same value from one OP in the code to the next. Note that the OS can do a context switch at any moment. The only thing that is atomic (cannot be broken up) is a single assembly language OP. There are ways to make certain segments of code appear to be atomic with respect to other known programs that share the same variable space.

I am going to reiterate that running an extension that is not supported by the OS is unsafe. Telling people to use extensions that are not supported by the OS is bad advice. Again, I don't know what the range of the error is going to be, but there will be errors introduced if there are two programs that use the unsupported extension. If you want to use a CPU extension, get an OS that knows what it is, and how to save it and reconstitute it. BOINC WIKI |
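Editorial sketch: the two-step context switch described in the post above, and the failure it predicts, can be modelled in a few lines of Python. This is a toy model of the argument as stated, not of how Windows or any real OS behaves. `ToyCPU`, `context_switch`, `run` and the register names are all made up for illustration.

```python
class ToyCPU:
    """A CPU reduced to a dictionary of named registers (toy model)."""
    def __init__(self):
        self.regs = {"eax": 0, "xmm0": 0.0}


def context_switch(cpu, outgoing, incoming, known_to_os):
    # Step 1: save only the registers the OS knows about.
    outgoing["saved"] = {name: cpu.regs[name] for name in known_to_os}
    # Step 2: restore whatever was saved earlier for the incoming task.
    for name, value in incoming["saved"].items():
        cpu.regs[name] = value


def run(known_to_os):
    """Replay the pi / sqrt(2) example and return what task 1 sees."""
    cpu = ToyCPU()
    task1, task2 = {"saved": {}}, {"saved": {}}
    cpu.regs["xmm0"] = 3.14159                       # task 1 leaves pi
    context_switch(cpu, task1, task2, known_to_os)
    cpu.regs["xmm0"] = 1.414                         # task 2 writes sqrt(2)
    context_switch(cpu, task2, task1, known_to_os)
    return cpu.regs["xmm0"]


print(run(["eax", "xmm0"]))  # OS saves xmm0: task 1 still sees pi
print(run(["eax"]))          # OS ignores xmm0: task 1 sees sqrt(2)
```

The model bakes in the post's key assumption, namely that the OS saves registers one by one from a list it maintains. Whether real operating systems work that way is exactly what later replies in the thread dispute.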
Joined: 9 Jun 99 Posts: 15184 Credit: 4,362,181 RAC: 3 |
> If you want to use a CPU extension, get an OS that knows what it is, and how to save it and reconstitute it.

For SSE3 and SSSE3, that means Windows XP 32-bit SP2. For SSE4, it means Windows XP Professional x64 Edition and 64-bit versions of Vista. Going to be costly. ;) |
Fred W Joined: 13 Jun 99 Posts: 2524 Credit: 11,954,210 RAC: 0 |
> If you want to use a CPU extension, get an OS that knows what it is, and how to save it and reconstitute it.

But I AM using XP (Home) 32-bit SP2, and BOINC says that supports only SSE2!! F. |
Richard Haselgrove Joined: 4 Jul 99 Posts: 14690 Credit: 200,643,578 RAC: 874 |
> At the end of a timeslice as determined by the OS:

That is obviously exactly what needs to happen. But you are making a big leap of intuition if you assume that an OS, Windows for example, achieves that total save/total restore of the registers by explicitly saving/restoring each one separately.

Research in Number Crunching has turned up "the FXSAVE and FXRSTOR instructions, which is the extended pair of instructions which can save all x87 and SSE register states all at once." This is sourced to Wikipedia, so treat it with the usual bargepole, but it changes the landscape quite a bit. If the OS uses a "save everything you know about" instruction, and the CPU knows about its own registers, then the problem goes away.

And another source from early 1998 says "Indeed, it's expected the instructions [FXSAVE and FXRSTOR] will be exploited by Windows 98 and Windows NT 5.0, two upcoming offerings from Microsoft Corp." Which squares the circle, at least as far as Windows is concerned. |
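Editorial sketch: the "save everything you know about" idea from the post above can be contrasted with per-register saving in a small Python model. Again this is a toy. `save_all` and `restore_all` are hypothetical stand-ins for FXSAVE and FXRSTOR, and the register names are illustrative only; nothing here reproduces the real instruction semantics.

```python
class ToyCPU:
    """Toy CPU whose save/restore operations cover all of its own registers."""
    def __init__(self, register_names):
        self.regs = {name: 0.0 for name in register_names}

    def save_all(self):
        # Stand-in for an FXSAVE-style instruction: the CPU dumps every
        # register it has, so the caller never has to name them.
        return dict(self.regs)

    def restore_all(self, image):
        # Stand-in for FXRSTOR: reload registers from a saved image.
        self.regs.update(image)


def switch(cpu, outgoing, incoming):
    """Context switch that never enumerates individual registers."""
    outgoing["image"] = cpu.save_all()
    if incoming["image"] is not None:
        cpu.restore_all(incoming["image"])


cpu = ToyCPU(["st0", "xmm0", "xmm1"])
task1, task2 = {"image": None}, {"image": None}

cpu.regs["xmm0"] = 3.14159      # task 1 leaves pi in a register
switch(cpu, task1, task2)
cpu.regs["xmm0"] = 1.414        # task 2 overwrites it with sqrt(2)
switch(cpu, task2, task1)

seen_by_task1 = cpu.regs["xmm0"]
print(seen_by_task1)            # prints: 3.14159
```

In this model the OS code in `switch` contains no register list at all, so an instruction set that adds operations without adding state beyond what the save instruction covers needs no OS changes.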
Joined: 9 Jun 99 Posts: 15184 Credit: 4,362,181 RAC: 3 |
I used Pentium’s End: Intel Core Exposed for that bit of knowledge. "EM64T/NX/SSE4. This is Intel’s implementation of the AMD64 64-bit extensions for x86 processors, which makes it possible to run 64-bit software (such as Windows XP Professional x64 Edition and 64-bit versions of Vista). In theory, 64-bitness lets the system work with much bigger sets of memory at once, process data in bigger chunks, and run better in general." And PERFORMANCE ANALYSIS OF INTEL CORE 2 DUO PROCESSOR (PDF file) for the other bit of information: Page 12: "Other important features involve support for new SIMD instructions called Supplemental Streaming SIMD Extension 3, coupled with better power saving technologies." Page 13: "All experiments were run on systems with 32 bit Windows XP SP2 operating system and Intel Core 2 Duo processors." So don't look at me. ;-) |
John McLeod VII Joined: 15 Jul 99 Posts: 24806 Credit: 790,712 RAC: 0 |
> I used Pentium’s End: Intel Core Exposed for that bit of knowledge.

If you have one process that is tied to one CPU, then you can get away with using unsupported extensions, as there will not be anything overwriting the information you need. The problem arises when there are two processes using the same unsupported extension: the processes then overwrite each other's data, because the OS does not know enough to move it around so that each process has apparently uninterrupted access to the registers. BOINC WIKI |
Joined: 19 Mar 05 Posts: 551 Credit: 4,673,015 RAC: 0 |
I have been running optimized apps for SSE3+ since they came out. I have not seen one invalid result other than the one in a million that reports an extra signal when it shouldn't. So the question remains: where are these invalid results that should be here but aren't? ~BoB Do you Good Search for Seti@Home? http://www.goodsearch.com/?charityid=888957 Or Good Shop? http://www.goodshop.com/?charityid=888957 |