AMD x87 patch

Bogie · Volunteer tester · Message 1415287 · Posted: 13 Sep 2013, 20:42:31 UTC

Saw this at the overclockers forum: http://www.overclockers.com/forums/showthread.php?t=737243
Would this patch help with FX and Phenom CPUs? Thanks.

cov_route · Message 1415374 · Posted: 14 Sep 2013, 0:33:47 UTC

x87 is mostly deprecated. The apps all use some version of SSE. Maybe there is some x87 code for secondary processing, but there can't be much to gain from improving it.

Josef W. Segur · Volunteer developer, Volunteer tester · Message 1415425 · Posted: 14 Sep 2013, 3:17:38 UTC

Where practical, SSE or better is used for inner loop calculations, of course. However, there are significant sections of 32 bit builds for both SaH v7 and AP v6 which do use x87 code. The patch might make a noticeable speedup, and one of the uses is in the generation of data to implement blanking in AP, so it might help the OpenCL AP apps too. FWIW, I think it would definitely improve the Whetstone benchmark in 32 bit BOINC builds.

OTOH, nobody knows why AMD has deliberately chosen to operate x87 in Bulldozer and its descendants at less than max capability. My hunch is they found a design glitch which they have not yet been able to fix. If so, using The Stilt's patch may cause bad results under some specific unknown conditions.

The patch is not applicable to earlier AMD CPUs.

Joe

Raistmer · Volunteer developer, Volunteer tester · Message 1415902 · Posted: 15 Sep 2013, 11:02:40 UTC · in response to Message 1415425 · Last modified: 15 Sep 2013, 11:10:29 UTC

Would the described patch be applicable to a Trinity APU? Another possible reason for a performance limitation, besides a design flaw, is the thermal design. In other words, the CPU could simply become too hot when it operates at full speed. Their deliberate core downclocking in Trinity APUs (and not only there) is very much in the same line...

EDIT: found the answer to my own question in the article:
Parts affected: AMD Barracuda (Zambezi, Vishera), AMD Comal (Trinity, Richland), AMD Virgo (Trinity, Richland)
Also:
Negative effects: TBD, none found yet. The performance in non x87 applications remains the same or improves very slightly. No instability, increased power consumption

Well... worth testing at least!

SETI apps news
We're not gonna fight them. We're gonna transcend them.

arkayn · Volunteer tester · Message 1415997 · Posted: 15 Sep 2013, 15:42:50 UTC · in response to Message 1415902

> Would the described patch be applicable to a Trinity APU? Another possible reason for a performance limitation, besides a design flaw, is the thermal design. In other words, the CPU could simply become too hot when it operates at full speed. Their deliberate core downclocking in Trinity APUs (and not only there) is very much in the same line...
>
> EDIT: found the answer to my own question in the article:
> Parts affected: AMD Barracuda (Zambezi, Vishera), AMD Comal (Trinity, Richland), AMD Virgo (Trinity, Richland)
> Also:
> Negative effects: TBD, none found yet. The performance in non x87 applications remains the same or improves very slightly. No instability, increased power consumption
>
> Well... worth testing at least!

I read that as "No increased power consumption".

Raistmer · Volunteer developer, Volunteer tester · Message 1416130 · Posted: 15 Sep 2013, 20:38:16 UTC · in response to Message 1415997

Yes, that means no errors (contrary to Joe's assumption) and no increase in heating (contrary to my own assumption). So, definitely worth a try. I will try it when I have more time, maybe tomorrow, and will post bench results here.

Raistmer · Volunteer developer, Volunteer tester · Message 1416345 · Posted: 16 Sep 2013, 12:09:36 UTC · Last modified: 16 Sep 2013, 12:10:23 UTC

Here is the promised benchmark of this patch. For benchmarking I chose ATi AP (it's an APU after all) running a heavily blanked test task.

After reboot, no patch applied:
WU : sigind_v5.wu
AP6_win_x86_SSE2_OpenCL_ATI_r1761.exe -verbose : Elapsed 289.990 secs CPU 125.628 secs

Patch applied:
WU : sigind_v5.wu
AP6_win_x86_SSE2_OpenCL_ATI_r1761.exe -verbose : Elapsed 287.664 secs CPU 126.813 secs

Patch disabled:
WU : sigind_v5.wu
AP6_win_x86_SSE2_OpenCL_ATI_r1761.exe -verbose : Elapsed 289.055 secs CPU 127.359 secs

Enabled again:
WU : sigind_v5.wu
AP6_win_x86_SSE2_OpenCL_ATI_r1761.exe -verbose : Elapsed 287.648 secs CPU 126.985 secs

Pity. No significant difference :/
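
To put rough numbers on "no significant difference", the four timings from the benchmark post can be expressed as percentage changes against the first (unpatched) run. This is only a sketch: the figures are copied from the post, while the helper name `pct_change` and the choice of the first run as the baseline are assumptions made here for illustration.

```python
# Timings copied from the benchmark post (seconds): label, elapsed, CPU.
runs = [
    ("no patch (after reboot)", 289.990, 125.628),
    ("patch applied", 287.664, 126.813),
    ("patch disabled", 289.055, 127.359),
    ("patch enabled again", 287.648, 126.985),
]


def pct_change(baseline, value):
    """Relative change of value against baseline, in percent."""
    return (value - baseline) / baseline * 100.0


# Use the first run as the baseline (an arbitrary choice for this sketch).
base_label, base_elapsed, base_cpu = runs[0]
for label, elapsed, cpu in runs[1:]:
    print(f"{label:22s} elapsed {pct_change(base_elapsed, elapsed):+.2f}%  "
          f"CPU {pct_change(base_cpu, cpu):+.2f}%")
```

Every elapsed-time change comes out below 1% of the baseline. With a single run per configuration, a difference that small cannot be told apart from ordinary run-to-run variation.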

OzzFan · Volunteer tester · Message 1416361 · Posted: 16 Sep 2013, 13:20:38 UTC · in response to Message 1416345

I was under the impression that when this issue was discovered, AMD quickly worked with Microsoft to include a patch for their CPUs to help the Windows task scheduler deal with AMD's architectural differences. If so, that may explain why you see little to no difference with the patch enabled.

Mike · Volunteer tester · Message 1416371 · Posted: 16 Sep 2013, 13:47:44 UTC · in response to Message 1416361

> I was under the impression that when this issue was discovered, AMD quickly worked with Microsoft to include a patch for their CPUs to help the Windows task scheduler deal with AMD's architectural differences. If so, that may explain why you see little to no difference with the patch enabled.

Yes, I remember it the same way.

With each crime and every kindness we birth our future.

Raistmer · Volunteer developer, Volunteer tester · Message 1416381 · Posted: 16 Sep 2013, 14:18:58 UTC · Last modified: 16 Sep 2013, 14:19:25 UTC

Could Windows Server 2008 SP2 include this fix?

OzzFan · Volunteer tester · Message 1416392 · Posted: 16 Sep 2013, 14:52:20 UTC · in response to Message 1416381

I doubt it. The release date for Server 2008 SP2 was July 22nd, 2009. I think the AMD patch was a hotfix. Not sure which one it would be. I'd have to do some research on it.

OzzFan · Volunteer tester · Message 1416400 · Posted: 16 Sep 2013, 15:06:54 UTC · Last modified: 16 Sep 2013, 15:07:58 UTC

Here we go: the hotfix was released by Microsoft in KB2645594. This seems to have been released post Server 2008 SP2 as well as post Server 2008 R2 SP1 (and their corresponding client OSes).

cov_route · Message 1416411 · Posted: 16 Sep 2013, 15:30:59 UTC · in response to Message 1416400 · Last modified: 16 Sep 2013, 15:35:01 UTC

> Here we go: the hotfix was released by Microsoft in KB2645594. This seems to have been released post Server 2008 SP2 as well as post Server 2008 R2 SP1 (and their corresponding client OSes).

That was a different issue: the OS scheduling of threads between cores and modules. The x87 patch gets right in and fiddles with the microcode of the CPU itself. I never heard about AMD patching that; it would be in the form of a BIOS update. I suppose they could have issued such changes directly to OEMs without announcing it, but that would be unlikely.

Edit: also, in his comments, The Stilt said that it wouldn't work on Zambezi-based processors.

cov_route · Message 1416414 · Posted: 16 Sep 2013, 15:54:10 UTC · in response to Message 1415425

> Where practical, SSE or better is used for inner loop calculations, of course. However, there are significant sections of 32 bit builds for both SaH v7 and AP v6 which do use x87 code. The patch might make a noticeable speedup, and one of the uses is in the generation of data to implement blanking in AP, so it might help the OpenCL AP apps too.

Joseph, some compilers (e.g. gcc) can use the SSE units to process non-SIMD floating-point math. I.e., code that historically would have run on the x87 unit is run on the SIMD units in scalar mode. With gcc you use the flag -mfpmath=sse. Is it definite that x87 code is generated and not scalar SIMD code?

OzzFan · Volunteer tester · Message 1416430 · Posted: 16 Sep 2013, 16:38:19 UTC · in response to Message 1416411 · Last modified: 16 Sep 2013, 16:42:06 UTC

>> Here we go: the hotfix was released by Microsoft in KB2645594. This seems to have been released post Server 2008 SP2 as well as post Server 2008 R2 SP1 (and their corresponding client OSes).
>
> That was a different issue: the OS scheduling of threads between cores and modules. The x87 patch gets right in and fiddles with the microcode of the CPU itself. I never heard about AMD patching that; it would be in the form of a BIOS update. I suppose they could have issued such changes directly to OEMs without announcing it, but that would be unlikely.

Hotfixes can be used as work-arounds to existing/known issues that aren't fixed in microcode. Are we certain that this isn't the same issue?

[Edit] Hmmm... maybe you're right. I can't seem to confirm that the hotfix directly fixes the x87 issue.

jason_gee · Volunteer developer, Volunteer tester · Message 1416449 · Posted: 16 Sep 2013, 17:09:55 UTC · in response to Message 1416414 · Last modified: 16 Sep 2013, 17:16:42 UTC

>> Where practical, SSE or better is used for inner loop calculations, of course. However, there are significant sections of 32 bit builds for both SaH v7 and AP v6 which do use x87 code. The patch might make a noticeable speedup, and one of the uses is in the generation of data to implement blanking in AP, so it might help the OpenCL AP apps too.
>
> Joseph, some compilers (e.g. gcc) can use the SSE units to process non-SIMD floating-point math. I.e., code that historically would have run on the x87 unit is run on the SIMD units in scalar mode. With gcc you use the flag -mfpmath=sse. Is it definite that x87 code is generated and not scalar SIMD code?

There is one main problem with direct mapping of x87 FPU code to scalar SIMD: x87 uses 80 bit intermediates, while SSE onward observes the IEEE-754 standards at reduced precision. This means that, without a lot of attention to the choice of algorithms, the results can turn out quite different. For a simple example, the mean of as few as 4096 floats can differ in the 2nd-3rd decimal places, amplifying problems with the use of absolute thresholds for reporting & validation (no hysteresis).

V7 multibeam received some attention to bring this cross platform difference in line between GPUs and CPUs, and that makes direct AMD64/Intel64 (no x87 allowed) builds feasible from the stock codebase (not workable under V6). It has also meant that near direct ports should be more portable to other devices without too much variation (e.g. recent ARM/Android).

To my knowledge AP hasn't received this attention yet, but looking at the high ratio of AP tasks to WUs waiting to purge, it would need that attention to bring the platforms closer numerically.

"Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to Live By: The Computer Science of Human Decisions.

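The precision point made above (wide intermediates versus results rounded to single precision at every step) can be illustrated with a small sketch. Several things here are assumptions rather than anything from the thread: Python has no 80 bit float, so an IEEE double accumulator stands in for the wider x87 intermediate; the helper names `to_f32`, `mean_narrow` and `mean_wide` are hypothetical; and the sample data is invented. The sketch shows the mechanism only and does not reproduce the behaviour of the SETI@home apps.

```python
import math
import struct


def to_f32(x):
    """Round a Python float (IEEE double) to the nearest IEEE single."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def mean_narrow(values):
    """Mean with every intermediate sum rounded to single precision,
    roughly what scalar single-precision SSE arithmetic does."""
    total = 0.0
    for v in values:
        total = to_f32(total + v)
    return to_f32(total / len(values))


def mean_wide(values):
    """Mean with a wider accumulator, rounded to single only at the end.
    The double accumulator is a stand-in for x87's 80 bit intermediates."""
    total = 0.0
    for v in values:
        total += v
    return to_f32(total / len(values))


# 4096 invented single-precision samples, as in the example in the post.
samples = [to_f32(1000.0 + 0.1 * i) for i in range(4096)]
reference = math.fsum(samples) / len(samples)  # accurately rounded sum

print("narrow accumulator:", mean_narrow(samples))
print("wide accumulator:  ", mean_wide(samples))
print("reference:         ", reference)
```

A deliberately extreme input makes the effect unmistakable: for one sample of 2**25 followed by 4095 samples of 1.0, `mean_narrow` returns 8192.0 because each added 1.0 is rounded away, while `mean_wide` returns 8193.0. Results that far apart would land on opposite sides of any absolute threshold set between them.
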
Richard Haselgrove · Volunteer tester · Message 1416461 · Posted: 16 Sep 2013, 17:29:59 UTC · in response to Message 1416449

Eric McIntosh of the LHC@home Classic project has been working on issues of numeric reproducibility. He has recently posted his first notes online: CV and Notes on Floating-Point. x87 is covered in section 3.1.1 of the final piece (The pitfalls of verifying floating-point computations, David Monniaux 2008). I haven't checked the others.

Josef W. Segur · Volunteer developer, Volunteer tester · Message 1416464 · Posted: 16 Sep 2013, 17:36:12 UTC · in response to Message 1416414

>> Where practical, SSE or better is used for inner loop calculations, of course. However, there are significant sections of 32 bit builds for both SaH v7 and AP v6 which do use x87 code. The patch might make a noticeable speedup, and one of the uses is in the generation of data to implement blanking in AP, so it might help the OpenCL AP apps too.
>
> Joseph, some compilers (e.g. gcc) can use the SSE units to process non-SIMD floating-point math. I.e., code that historically would have run on the x87 unit is run on the SIMD units in scalar mode. With gcc you use the flag -mfpmath=sse. Is it definite that x87 code is generated and not scalar SIMD code?

I experimented with that GCC option more than a year ago and decided to stick with the default usage of x87 for 32 bit builds, so all the AKv8c builds do use x87. But I agree it's time to recheck by making and testing some builds with -mfpmath=sse. I have a Trinity A10-4600M laptop which will be suitable for testing.

Josef

jason_gee · Volunteer developer, Volunteer tester · Message 1416493 · Posted: 16 Sep 2013, 18:21:45 UTC · in response to Message 1416461

> Eric McIntosh of the LHC@home Classic project has been working on issues of numeric reproducibility. He has recently posted his first notes online: CV and Notes on Floating-Point. x87 is covered in section 3.1.1 of the final piece (The pitfalls of verifying floating-point computations, David Monniaux 2008). I haven't checked the others.

Very handy, thanks! There are amazingly few solid references in this particular area of Computer Science.
