I guess the burn-in is over?

Author	Message
AlphaLaser Volunteer tester Joined: 6 Jul 03 Posts: 262 Credit: 4,430,487 RAC: 0	Message 902759 - Posted: 2 Jun 2009, 3:27:57 UTC - in response to Message 902719. "since for example the L3 did not exist on most x86 machines until recently." "I believe the first appearance of L3 cache on the x86 architecture was when the K6-III was introduced: with L2 cache built into the processor, the L2 cache on the motherboard became the L3 cache. The first sighting of L3 cache on Intel chips, as far as I can remember, was the Pentium 4 Extreme Edition on the Socket 478 platform. These chips had 16K L1, 512K L2 and 2MB L3. Back then, no additional coding needed to be done to existing applications to take advantage of the third-level cache." Interesting that you note the K6-III. L3 was also used on some high-end Xeons, and very large amounts are used on Itaniums. With AMD's new Phenoms and the Core i5/i7, L3 is becoming more of a "standard" feature and not just exclusive to enthusiast parts like the P4EE or server chips. ID: 902759 ·

Raistmer Volunteer developer Volunteer tester Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121	Message 903036 - Posted: 3 Jun 2009, 0:58:17 UTC For correctness: the current AKv8 build uses software prefetch. For example, v_vpChirpData() contains: " // prefetch 1 loop iteration ahead _mm_prefetch((char *) (d+16), _MM_HINT_NTA); ". The current AP build doesn't use software prefetch. I tried some block prefetch in previous builds, but it seems that method was AMD-specific and gave a speedup mostly on older AMD chips like the Athlon XP. Current CPUs have both hardware and software prefetch (pre-loading data from memory into cache). ID: 903036 ·
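For readers who want to see the "prefetch one loop iteration ahead" idea in a complete function, here is a minimal sketch in C. It is not the AKv8 code: `scale_in_place`, the `PREFETCH_NTA` macro, and the block size of 16 floats are assumptions made for illustration, and only the `_mm_prefetch(..., _MM_HINT_NTA)` call itself comes from the post above. On targets without SSE the macro compiles to nothing, since a prefetch is only a hint.

```c
#include <assert.h>
#include <stddef.h>

#if defined(__SSE__) || defined(_M_X64) || (defined(_M_IX86_FP) && _M_IX86_FP >= 1)
#include <xmmintrin.h>
/* Hint that the cache line holding *p will be read soon (non-temporal). */
#define PREFETCH_NTA(p) _mm_prefetch((const char *)(p), _MM_HINT_NTA)
#else
/* No SSE available: drop the hint, the loop still computes the same result. */
#define PREFETCH_NTA(p) ((void)(p))
#endif

/* Hypothetical helper: multiply n floats by k, 16 elements per iteration,
 * requesting the next block of 16 before working on the current one. */
static void scale_in_place(float *d, size_t n, float k)
{
    assert(d != NULL || n == 0);
    size_t i = 0;
    for (; i + 16 <= n; i += 16) {
        /* d + i + 16 is at most one past the end, so the pointer is valid
         * to form; the prefetch itself never dereferences it in C terms. */
        PREFETCH_NTA(d + i + 16);
        for (size_t j = 0; j < 16; ++j)
            d[i + j] *= k;
    }
    for (; i < n; ++i)   /* leftover elements, no prefetch needed */
        d[i] *= k;
}
```

Calling `scale_in_place(buf, count, 2.0f)` doubles every element of `buf`. Whether the hint helps depends on the CPU and the access pattern, so it is worth timing the loop with and without it.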

PhonAcq Joined: 14 Apr 01 Posts: 1656 Credit: 30,658,217 RAC: 1	Message 903348 - Posted: 3 Jun 2009, 22:41:36 UTC Here is a link that describes a bit more. I'm inferring from this and from Intel's instruction set description that you can use prefetch to notify the CPU that you will want to use a line of cache, but you get no guarantee it will be cached when you use it. I noticed that a different argument is used depending on whether your data will be integer or floating point. It seems that the FPUs draw their data directly from L2, which surprised me. But I know less than I think I know, I'm sure. ID: 903348 ·

1mp0£173 Volunteer tester Joined: 3 Apr 99 Posts: 8423 Credit: 356,897 RAC: 0	Message 903353 - Posted: 3 Jun 2009, 22:48:39 UTC - in response to Message 903348. "Here is a link that describes a bit more. I'm inferring from this and from Intel's instruction set description that you can use prefetch to notify the cpu that you will want to use a line of cache, but you get no guarantee it will be cached when you use it. Noticed that a different argument is used depending on whether your data will be integer or floating point. It seems that the FPU's draw their data directly from L2, which surprised me. But I know less than I think I know, I'm sure." The link points into the Intel Compiler documentation, which says this is a subroutine -- meaning more than one instruction. It'd be interesting to see what is in the actual subroutine. ID: 903353 ·

AlphaLaser Volunteer tester Joined: 6 Jul 03 Posts: 262 Credit: 4,430,487 RAC: 0	Message 903472 - Posted: 4 Jun 2009, 3:41:05 UTC Here are the docs on the available prefetch instructions specified by SSE: - prefetcht0 - prefetcht1 - prefetcht2 - prefetchnta ID: 903472 ·

PhonAcq Joined: 14 Apr 01 Posts: 1656 Credit: 30,658,217 RAC: 1	Message 903715 - Posted: 4 Jun 2009, 20:17:44 UTC - in response to Message 903472. It appears that Ned is rusty on intrinsics. Does anybody have any assembly code to demonstrate the prefetch instruction? ID: 903715 ·
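As one possible answer to the request for assembly, the sketch below emits each of the four SSE prefetch instructions named earlier in the thread as a single instruction, using GCC/Clang extended inline assembly from C. The macro names, `touch_all_hints`, `sum_with_prefetch`, and the lookahead distance of 16 elements are all hypothetical; on other compilers or non-x86 targets the macros fall back to doing nothing.

```c
#include <assert.h>
#include <stddef.h>

#if (defined(__GNUC__) || defined(__clang__)) && (defined(__x86_64__) || defined(__i386__))
/* Each macro emits exactly one prefetch instruction whose memory operand
 * is the byte at address p (AT&T syntax, the GCC/Clang default). */
#define PREFETCH_T0(p)  __asm__ volatile("prefetcht0 %0"  : : "m"(*(const char *)(p)))
#define PREFETCH_T1(p)  __asm__ volatile("prefetcht1 %0"  : : "m"(*(const char *)(p)))
#define PREFETCH_T2(p)  __asm__ volatile("prefetcht2 %0"  : : "m"(*(const char *)(p)))
#define PREFETCH_NTA(p) __asm__ volatile("prefetchnta %0" : : "m"(*(const char *)(p)))
#else
/* Assumption: elsewhere we simply omit the hint. */
#define PREFETCH_T0(p)  ((void)(p))
#define PREFETCH_T1(p)  ((void)(p))
#define PREFETCH_T2(p)  ((void)(p))
#define PREFETCH_NTA(p) ((void)(p))
#endif

/* Hypothetical demo: issue all four hints for the same address. */
static void touch_all_hints(const void *p)
{
    PREFETCH_T0(p);
    PREFETCH_T1(p);
    PREFETCH_T2(p);
    PREFETCH_NTA(p);
}

/* Hypothetical demo: sum n ints, hinting 16 elements ahead once per
 * 16 elements. The hint only ever names an element inside the array. */
static long sum_with_prefetch(const int *v, size_t n)
{
    assert(v != NULL || n == 0);
    long total = 0;
    for (size_t i = 0; i < n; ++i) {
        if (i % 16 == 0 && i + 16 < n)
            PREFETCH_T0(&v[i + 16]);
        total += v[i];
    }
    return total;
}
```

Compiling with `gcc -S` and searching the output for `prefetcht0` shows the single instruction each macro produces. The result of `sum_with_prefetch` is the same with or without the hints; only timing can differ.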

1mp0£173 Volunteer tester Joined: 3 Apr 99 Posts: 8423 Credit: 356,897 RAC: 0	Message 903741 - Posted: 4 Jun 2009, 21:42:55 UTC - in response to Message 903715. "It appears that Ned is rusty on intrinsics. Does anybody have any assembly code to demonstrate the prefetch instruction?" My applications rarely have access patterns that would benefit from triggering a prefetch. My code also needs to run on machines that don't have SSE (my applications do not benefit), so it's not anything I would have used. There are some really good examples in the Intel documentation, including examples where the prefetch would actually hurt. Most are in C, which is pretty close to assembly. Then there are cache-oblivious algorithms, which are incredibly interesting. It wouldn't surprise me to find out that IPP uses the prefetch instructions. FFTW probably doesn't. ID: 903741 ·

©2024 University of California

SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.