Message boards :
Number crunching :
I'm falling, I bought a parachute. From 100% AP, to 100% MB.
juan BFP Joined: 16 Mar 07 Posts: 9786 Credit: 572,710,851 RAC: 3,799
NV scores about 40-50%, that's incredible (I know you're telling the truth and the graph clearly shows it), and it gives another reason why they need to fix creditscrew. I revise my guess to 12,000 feet (with MB he could free up some cores to crunch too).
Richard Haselgrove Joined: 4 Jul 99 Posts: 14653 Credit: 200,643,578 RAC: 874
Some of the difference between 2:1 for NV and 4:1 for ATI is that with ATI you're crunching MB with an OpenCL application, but with NV you're using a CUDA application. That probably says a lot about the efficiency of OpenCL as a platform.
juan BFP Joined: 16 Mar 07 Posts: 9786 Credit: 572,710,851 RAC: 3,799
Then maybe if AP could run on CUDA we could get a big jump in our production too? Or am I wrong? OK, that's for another thread. OK Sten, I see your point about the cores. I still use a Q8200 on 2 of my hosts and don't allow them to run any CPU WUs or the entire host gets slower, but I'm staying at 12,000 feet.
Mike Joined: 17 Feb 01 Posts: 34258 Credit: 79,922,639 RAC: 80
Surely not. It's all host dependent. It might be true for some but not all. I'm running at 18K without APs and 23K with APs.
With each crime and every kindness we birth our future.
draco Joined: 6 Dec 05 Posts: 119 Credit: 3,327,457 RAC: 0
Where do you see 2:1 for NVIDIA? On my GT630 rev 2 I see about 20 GFLOPS performance on MB (Lunatics, CUDA) and about 80 on AP (also Lunatics for NVIDIA). 80:20 hardly seems to be 2:1, I think...
Richard Haselgrove Joined: 4 Jul 99 Posts: 14653 Credit: 200,643,578 RAC: 874
I was replying to Juan, who had just posted "40-50%", or between 2:1 and 2.5:1. As Mike says, there will be a lot of system dependency in the mix as well.
Mike Joined: 17 Feb 01 Posts: 34258 Credit: 79,922,639 RAC: 80
One reason is I balance it out. I don't get enough APs to run full bore. Of course I could if I really wanted to, but since I'm a Lunatic I don't do that anymore.
With each crime and every kindness we birth our future.
juan BFP Joined: 16 Mar 07 Posts: 9786 Credit: 572,710,851 RAC: 3,799
> I was replying to Juan, who had just posted "40-50%", or between 2:1 and 2.5:1
I can only speak for my NV hosts (high-end NVs powered by slow, non-dedicated CPUs, most of them 2xGPU hosts): for the same crunching time on an average WU on the same host, the 40-50% number is real. I've reported it several times with examples on the forums. It could be totally different on other NV hosts; as always, YMMV due to differences in configuration. But the soul of the problem is the same: AP pays 2x-2.5x more than MB on the same NV host. Sure, Mike has a lot more experience and examples on ATI than me, so he could give us better numbers for ATI configs, but I'm sure the rule still holds there: AP pays a lot more than MB on the same host. That's why Sten needs a big parachute!
TBar Joined: 22 May 99 Posts: 5204 Credit: 840,779,836 RAC: 2,768
Do you have any links demonstrating CUDA superiority in any field other than SETI? It seems to me that if CUDA were that much more efficient, nVidia would be shouting about it everywhere. As far as I know, SETI is the only area where CUDA has such an advantage. Too bad we can't have AMD code an OpenCL SETI MB app the way nVidia coded the first CUDA SETI MB app. I have a feeling the AMD-coded app would be much more efficient than the current AMD MB app.
HAL9000 Joined: 11 Sep 99 Posts: 6534 Credit: 196,805,888 RAC: 57
IIRC, NVIDIA basically wrote the original CUDA app, so any tricks they knew to boost efficiency were taken advantage of. Apparently ATI/AMD contributed by sending a card or two for development and pointing to their SDK.
SETI@home classic workunits: 93,865 CPU time: 863,447 hours. Join the BP6/VP6 User Group (http://tinyurl.com/8y46zvu)
Richard Haselgrove Joined: 4 Jul 99 Posts: 14653 Credit: 200,643,578 RAC: 874
Well, I also crunch for GPUGrid (currently an NVidia-only project), and I read their forums daily: the same sort of question crops up there periodically. It's hard to find a recent quote to link, but this one - from 2009, but by the lead project scientist, Gianni, the guy HTC brought onto the platform in Barcelona for the recent Android launch - gives a flavour of the much-repeated response:
> Hi, Different projects require different branches of mathematics to perform the work that they need to do. Collatz is at one end of the spectrum, requiring only integer (whole) numbers, albeit very large ones. Projects like GPUGrid, Einstein and SETI rely heavily on a mathematical technique called 'FFT' (Fast Fourier Transform): NVidia have provided, and keep improving, FFT libraries specifically written for their GPUs, which is possibly one reason why the projects with those specific mathematical requirements have tended to favour NV.
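As a concrete illustration of the kind of FFT work Gianni describes, here is a minimal sketch in Python/NumPy (not any project's actual code, and the tone frequency is made up): a weak narrowband signal buried in noise collapses into a single dominant bin of the FFT power spectrum, which is roughly the kind of search an MB-style spike analysis performs.

```python
import numpy as np

# Illustrative sketch only (NumPy stand-in, not a real SETI kernel):
# a weak tone buried in noise becomes one dominant power-spectrum bin.
rng = np.random.default_rng(42)
n = 4096
t = np.arange(n)
tone_bin = 200                                  # hypothetical signal frequency
data = 0.5 * np.sin(2 * np.pi * tone_bin * t / n) + rng.normal(0.0, 1.0, n)

power = np.abs(np.fft.rfft(data)) ** 2          # power spectrum, O(n log n)
peak = int(np.argmax(power[1:])) + 1            # strongest bin, skipping DC
print("loudest bin:", peak)                     # recovers the injected tone
```

The FFT does the heavy lifting here, which is why GPU-tuned FFT libraries matter so much to these projects.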
janneseti Joined: 14 Oct 09 Posts: 14106 Credit: 655,366 RAC: 0
Are there different versions of the FFT libraries when using Lunatics apps? I'm using AMD, and FFTW (the "Fastest Fourier Transform in the West") is installed by the Lunatics installer. Since the transition to SETI v7, performance has dropped dramatically, both for GPU and CPU. As far as I can tell there are no GPU optimisations in FFTW for any GPU.
Cruncher-American Joined: 25 Mar 02 Posts: 1513 Credit: 370,893,186 RAC: 340
It's not the performance that has dropped; it's that the typical MB WU now does more work (the "autocorrelation" search) and CreditNew doesn't recognize it (in addition to CreditNew's other problems), so a typical MB WU takes longer AND gets less credit. Quite a neat trick!
Claggy Joined: 5 Jul 99 Posts: 4654 Credit: 47,537,079 RAC: 4
Well, it has dropped too, because of the added autocorrelation searches, at least for the CUDA apps, and especially for the VHAR WUs, since the autocorrelations made the biggest impact on them. About two years ago I was running the x41 apps on my 9800GTX+ at SETI Beta; I noted that the APR fell when doing VHAR tasks and rose back up when normal-AR tasks were done. More recently we worked out why: the autocorrelation searches send thousands of small results across the PCIe bus, and do so particularly inefficiently, slowing the x41zc CUDA app (reduced GPU load). x42 is being designed so as not to be limited by this problem.
On the i7-2600K/GTX460/HD7770 on PCIe 2.0 x8:

bandwidthTest.exe --memory=pinned --mode=shmoo
[CUDA Bandwidth Test] - Starting...
Running on...
Device 0: GeForce GTX 460
Shmoo Mode

Host to Device Bandwidth, 1 Device(s), PINNED Memory Transfers
Transfer Size (Bytes) / Bandwidth (MB/s):
1024 350.4 2048 624.7 3072 852.4 4096 1047.8 5120 1207.2 6144 1340.4 7168 1462.2 8192 1565.0 9216 1654.6 10240 1736.9 11264 1815.5 12288 1891.6 13312 1956.3 14336 2017.2 15360 2068.5 16384 2111.9 17408 2157.2 18432 2194.6 19456 2227.6 20480 2266.4 22528 2323.9 24576 2378.8 26624 2423.5 28672 2463.2 30720 2500.8 32768 2533.9 34816 2558.8 36864 2586.8 38912 2616.0 40960 2637.1 43008 2654.8 45056 2677.0 47104 2693.9 49152 2708.2 51200 2725.3 61440 2790.8 71680 2835.8 81920 2872.6 92160 2902.4 102400 2927.6 204800 3036.9 307200 3075.5 409600 3095.9 512000 3108.5 614400 3117.3 716800 3122.4 819200 3126.8 921600 3130.4 1024000 3133.1 1126400 3128.0 2174976 3142.4 3223552 3147.4 4272128 3149.1 5320704 3151.6 6369280 3152.7 7417856 3153.5 8466432 3154.0 9515008 3152.9 10563584 3154.9 11612160 3155.1 12660736 3155.4 13709312 3155.6 14757888 3154.6 15806464 3154.8 16855040 3153.2 18952192 3151.2 21049344 3149.9 23146496 3150.0 25243648 3150.2 27340800 3150.3 29437952 3150.4 31535104 3150.6 33632256 3150.6 37826560 3150.7 42020864 3150.9 46215168 3151.0 50409472 3150.8 54603776 3151.2 58798080 3151.2 62992384 3151.2 67186688 3151.0

Device to Host Bandwidth, 1 Device(s), PINNED Memory Transfers
Transfer Size (Bytes) / Bandwidth (MB/s):
1024 373.5 2048 672.2 3072 913.7 4096 1107.7 5120 1290.9 6144 1429.4 7168 1551.4 8192 1660.8 9216 1761.8 10240 1838.4 11264 1916.1 12288 1971.0 13312 2047.1 14336 2103.6 15360 2150.1 16384 2186.7 17408 2240.1 18432 2233.0 19456 2297.3 20480 2342.1 22528 2380.8 24576 2453.7 26624 2495.9 28672 2538.6 30720 2585.5 32768 2602.8 34816 2650.3 36864 2652.4 38912 2686.9 40960 2704.3 43008 2732.3 45056 2753.3 47104 2766.1 49152 2790.2 51200 2802.9 61440 2874.5 71680 2913.6 81920 2960.0 92160 2993.2 102400 3007.8 204800 3122.5 307200 3152.8 409600 3170.8 512000 3187.6 614400 3191.6 716800 3196.8 819200 3203.3 921600 3206.5 1024000 3209.6 1126400 3200.5 2174976 3220.6 3223552 3220.5 4272128 3227.1 5320704 3227.4 6369280 3230.5 7417856 3228.9 8466432 3230.3 9515008 3231.2 10563584 3231.2 11612160 3232.9 12660736 3232.4 13709312 3231.8 14757888 3232.6 15806464 3231.5 16855040 3231.6 18952192 3231.3 21049344 3229.9 23146496 3230.5 25243648 3230.9 27340800 3230.9 29437952 3231.5 31535104 3231.2 33632256 3231.3 37826560 3232.1 42020864 3230.8 46215168 3231.6 50409472 3232.4 54603776 3232.3 58798080 3232.3 62992384 3232.1 67186688 3232.3

Device to Device Bandwidth, 1 Device(s), PINNED Memory Transfers
Transfer Size (Bytes) / Bandwidth (MB/s):
1024 1034.5 2048 2179.8 3072 3212.4 4096 4398.9 5120 5155.0 6144 6586.5 7168 7656.7 8192 8346.7 9216 9791.7 10240 10614.8 11264 11321.9 12288 12026.6 13312 11842.6 14336 10761.9 15360 8669.8 16384 16440.4 17408 17264.5 18432 18341.1 19456 18948.8 20480 19978.8 22528 21762.9 24576 23664.7 26624 25390.6 28672 26786.6 30720 28521.1 32768 30375.2 34816 32173.6 36864 33443.9 38912 34772.7 40960 36657.8 43008 38260.8 45056 35013.6 47104 36510.0 49152 37900.2 51200 39125.1 61440 45323.1 71680 50028.8 81920 55235.4 92160 56747.6 102400 60550.8 204800 85423.6 307200 68501.9 409600 71616.5 512000 74215.9 614400 75383.1 716800 76939.7 819200 78112.5 921600 79220.7 1024000 80193.3 1126400 91208.6 2174976 96622.7 3223552 98801.2 4272128 99901.4 5320704 100500.3 6369280 100948.9 7417856 101247.7 8466432 101608.5 9515008 101802.0 10563584 101891.5 11612160 102079.3 12660736 102250.4 13709312 102386.8 14757888 102399.1 15806464 102472.1 16855040 102478.6 18952192 102681.4 21049344 102664.2 23146496 102733.5 25243648 102821.5 27340800 102817.0 29437952 102872.8 31535104 102901.4 33632256 102976.3 37826560 102977.4 42020864 103003.7 46215168 103060.8 50409472 103136.6 54603776 103094.7 58798080 103111.6 62992384 103170.4 67186688 103233.4

Result = PASS
Claggy
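Claggy's shmoo numbers also show why "thousands of small results" hurt so much: a 1024-byte pinned host-to-device copy runs at about 350 MB/s while large copies plateau near 3155 MB/s. A simple latency-plus-throughput model fitted to those two measured points (an illustrative sketch, not a real driver or PCIe model) makes the cost of many small transfers explicit:

```python
# Toy latency + throughput model of PCIe copies, fitted to two points
# from the bandwidthTest shmoo above (illustrative only, not a driver model).
PEAK = 3155e6                          # bytes/s: large-transfer plateau
LAT = 1024 / 350.4e6 - 1024 / PEAK     # ~2.6 us fixed cost, from the 1 KB point

def transfer_time(size_bytes: int) -> float:
    """Seconds to move size_bytes: fixed setup latency plus bytes at peak rate."""
    return LAT + size_bytes / PEAK

one_big = transfer_time(4096 * 1024)       # one 4 MiB copy
many_small = 4096 * transfer_time(1024)    # same data as 4096 separate 1 KiB copies
print(f"one 4 MiB copy:      {one_big * 1e3:.2f} ms")
print(f"4096 x 1 KiB copies: {many_small * 1e3:.2f} ms")
```

Under this model the batched copy is several times faster for the same data, which matches the thread's point: batching the autocorrelation results into fewer, larger transfers (the direction described for x42) is the obvious fix.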
juan BFP Joined: 16 Mar 07 Posts: 9786 Credit: 572,710,851 RAC: 3,799
And still there are some who say nothing is wrong with creditscrew... You need a bigger parachute.
Gone Joined: 31 May 99 Posts: 150 Credit: 125,779,206 RAC: 0
I just hope Sten-Arne doesn't vanish from the radar!
Claggy Joined: 5 Jul 99 Posts: 4654 Credit: 47,537,079 RAC: 4
I also hope he's up to date with his sea drills; he's going to have a 'mare of a time getting out of his parachute and into the dinghy once he's landed in the sea. Good job there isn't a gale on, yet. ;0
Claggy
James Sotherden Joined: 16 May 99 Posts: 10436 Credit: 110,373,059 RAC: 54
I'm wondering when Sten will pull the ripcord? :) Any bets on what the altitude will be?
Old James
Dave Stegner Joined: 20 Oct 04 Posts: 540 Credit: 65,583,328 RAC: 27
Maybe we should start a pool. Put me down for 17,353. That is, if he does not pull the cord: MB only.
Dave
©2024 University of California
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.