Bulldozer vs Vishera vs Sandy Bridge (small comparison)
Author | Message |
---|---|
Raistmer Send message Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121 |
Unfortunately, not. Ivy Bridge is the first Intel "APU" with OpenCL support AFAIK. Sandy Bridge has a non-OpenCL GPU in it. SETI apps news We're not gonna fight them. We're gonna transcend them. |
Raistmer Send message Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121 |
For example, this run from the first post, with 12 simultaneous tasks, where HT efficiency is 16.5%: when I ran it a second time, it gave me an even worse result than 2 sets of 6 tasks, something like "negative HT efficiency". AstroPulse processing is very hungry for processor cache. I think that kills the HT advantage because of cache contention between processes. SETI apps news We're not gonna fight them. We're gonna transcend them. |
Raistmer Send message Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121 |
In that spirit, any chance of getting Intel built-in GPUs to work for SETI? You are our only hope :) Unfortunately, the Intel OpenCL driver exhibits the same behavior as the latest ATi and NV ones. GPU load drops considerably if the CPU is fully busy. Still evaluating whether it is worth freeing 1 CPU core to run GPU AP or not. P.S. I hope AP for Intel GPUs will be available for beta testing soon; the beta test package is being prepared now. SETI apps news We're not gonna fight them. We're gonna transcend them. |
Ex: "Socialist" Send message Joined: 12 Mar 12 Posts: 3433 Credit: 2,616,158 RAC: 2 |
You should run it again using an AVX optimized App on the Sandy Bridge. AVX is new with Sandy Bridge and appears to be killer: Intel® AVX is a new 256-bit instruction set extension to Intel® SSE and is designed for applications that are Floating Point (FP) intensive. Look at the AstroPulse times from this client, All AstroPulse v6 tasks for computer 5643864. He appears to be running the Linux AVX App from here, AstroPulse for Linux. That's about 3.5 hours for a CPU AP. That is fast, about 3x as fast as the last Windows SSE2 CPU AP App from Lunatics, r557. I had never run r557, so I gave it a try after seeing those numbers; I'm disappointed it's so much slower than the AVX. It almost makes me want to build a new machine that can use AVX. Or at least buy another AMD 6850 that my present machines can use. AVX is killer, I'm running AVX Linux apps on my server's Xeon E3-1230 (Sandy Bridge). My RAC of 1700 doesn't seem like much, but it's only 25% use of only one of 4 cores. :-) (Note: setting the app to 25% usually results in 33% use.) (Note: my RAC is rising now because I bumped up my crunch time/cores earlier.) #resist |
Mark Lybeck Send message Joined: 9 Aug 99 Posts: 245 Credit: 216,677,290 RAC: 173 |
Does not process priority take care of giving the GPU loader process enough CPU cycles to feed it? |
Raistmer Send message Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121 |
For discrete ATi GPUs the answer was "no". Looks like Windows is still not enough of an RTOS to strictly obey process priority settings. Still to be tried with the Intel GPU. BTW, the app is already available for testing. But take care, it looks quite imprecise (at least on my host). SETI apps news We're not gonna fight them. We're gonna transcend them. |
Tom* Send message Joined: 12 Aug 11 Posts: 127 Credit: 20,769,223 RAC: 9 |
"Does not process priority take care of giving the GPU loader process enough CPU cycles to feed it?" NO! IMHO I have the BOBW right now, using Jason's CUDA 5 MBs with the stock 6.04 APs. The stock 6.04 APs take 40 minutes on my GTX660 when I allocate 1 CPU per AP task, and 80 minutes when I have something else using that CPU. I also have all AstroPulse and MB tasks running at elevated priority all the time, in all cases, using EfMer's Priority program; it makes no difference for the OpenCL jobs. All OpenCL jobs on my GTX660 need a full core to run the quickest. Bill PS Thanks Claggy for your app_config that allows a full core on my i5 whenever an AstroPulse task runs, as I can use that core for Einstein or Test4Theory when I am not running an AstroPulse task. In other words, Jason's non-OpenCL MBs only need a priority boost to function optimally. |
Mark Lybeck Send message Joined: 9 Aug 99 Posts: 245 Credit: 216,677,290 RAC: 173 |
"BTW, app already available for testing. But take care, looks like it quite imprecise (at least on my host)." Where did you upload the app? On Lunatics? |
Mark Lybeck Send message Joined: 9 Aug 99 Posts: 245 Credit: 216,677,290 RAC: 173 |
So if you have an Nvidia card, the performance should not be so different then? What I have seen is that the CPU tasks run at Low priority and the processes feeding the GPU run at Below Normal. The CPU tasks will always give way to the GPU apps. I have not found a significant speed difference on Intel CPU + Nvidia GPUs depending on whether the CPUs are maxed out. |
Ex: "Socialist" Send message Joined: 12 Mar 12 Posts: 3433 Credit: 2,616,158 RAC: 2 |
I am curious about this as well. If I could be convinced that I don't need a core free, I would fill all my cores and let BOINC fight itself to feed the GPU. As it stands, my Nvidia card is not anything special, but I leave a core free anyway even though it's more than likely unnecessary. #resist |
Raistmer Send message Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121 |
"BTW, app already available for testing. But take care, looks like it quite imprecise (at least on my host)." http://lunatics.kwsn.net/12-gpu-crunching/open-beta-for-intel-opencl-ap-application.msg51109.html;topicseen#msg51109 SETI apps news We're not gonna fight them. We're gonna transcend them. |
Raistmer Send message Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121 |
The situation is quite complex. 1) CUDA and OpenCL differ. CUDA has native control over its syncing mode; OpenCL doesn't. The few proposed tricks to make an OpenCL app switch its syncing mode had no effect on performance. 2) Syncing behavior differs between driver versions. Old enough drivers (OpenCL 1.0 support), up to 267.xx, don't need a free core: performance is good with a fully busy CPU. But after that it looks like something important (allegedly the default syncing mode) was changed. A fully busy CPU now can't provide adequate GPU load. Moreover (I observed this mostly on ATi GPUs), from time to time a whole task can complete almost as fast as with an idle CPU, but at other times the task execution time increases SEVERAL-FOLD. It looks like the task occasionally gets stuck in its processing (with a fully loaded CPU). That never happens with a free CPU. On my ATi GPU (HD6950) I have statistics for a few hundred tasks (see the corresponding threads with the performance pictures I posted), so I am quite convinced that freeing a core is necessary. On the NV GPU I just use old drivers that don't need free cores (the "If It Ain’t Broken, Don’t Fix It" approach), so any NV observations about how many free cores are needed come from third parties. But yes, I once updated to more recent drivers and saw unstable processing on my own GPU (GTX260), so there is a reason for staying with the old drivers. SETI apps news We're not gonna fight them. We're gonna transcend them. |
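
The "free a core or not" question raised in this thread can be weighed with some back-of-the-envelope arithmetic. The sketch below is only an illustration, not a measurement: the 40 and 80 minute GPU times are the figures Tom* reported for his GTX660, the 3.5 hour CPU AstroPulse time is the AVX figure quoted earlier for a different host, and the core count of 4 is a hypothetical value. It also assumes every task is worth the same credit and that CPU tasks do not slow down when they share a core with the GPU feeder.

```python
MINUTES_PER_DAY = 24 * 60


def tasks_per_day(gpu_minutes, cpu_minutes, cpu_cores_crunching):
    """Tasks finished per day by one GPU plus the given number of CPU cores."""
    gpu_rate = MINUTES_PER_DAY / gpu_minutes
    cpu_rate = cpu_cores_crunching * MINUTES_PER_DAY / cpu_minutes
    return gpu_rate + cpu_rate


# Assumptions (see the paragraph above); swap in numbers from your own host.
CORES = 4                  # hypothetical core count
CPU_TASK_MINUTES = 210     # 3.5 h CPU AstroPulse, quoted for another machine
GPU_MINUTES_FREE_CORE = 40  # GPU AP with a dedicated CPU core (Tom*'s report)
GPU_MINUTES_BUSY_CPU = 80   # GPU AP when that core is also crunching

# Option A: reserve one core for feeding the GPU.
reserved = tasks_per_day(GPU_MINUTES_FREE_CORE, CPU_TASK_MINUTES, CORES - 1)
# Option B: crunch on every core and let the GPU feeder compete.
shared = tasks_per_day(GPU_MINUTES_BUSY_CPU, CPU_TASK_MINUTES, CORES)

print(f"core reserved: {reserved:.1f} tasks/day")
print(f"all cores busy: {shared:.1f} tasks/day")
```

Under these assumptions the reserved core pays for itself whenever the GPU gains more tasks per day than one CPU core would have produced. With a slow GPU or a very fast CPU app the balance can tip the other way, so it is worth plugging in your own timings.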
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.