Considering new Graphics card
Author | Message |
---|---|
Zalster Send message Joined: 27 May 99 Posts: 5517 Credit: 528,817,460 RAC: 242 |
Congratulations on your new card. You could run 2 APs on that card or between 2-3 MBs or 1 AP/1 MB depending on your chip. I used 41 cuda on mine but that is because I also have a 650 in with it. The cuda 50 is supposed to be better with that GPU but the choice is yours. Zalster |
Grant (SSSF) Send message Joined: 19 Aug 99 Posts: 13755 Credit: 208,696,464 RAC: 304 |
"You guys think that running 2 tasks on the GPU is best? Or shall I try to run 3? Also I wonder if Cuda50 is best for this card?" 2 WUs at a time, and CUDA50 is the way to go. Grant Darwin NT |
qbit Send message Joined: 19 Sep 04 Posts: 630 Credit: 6,868,528 RAC: 0 |
Thx folks! I keep checking results atm and sometimes I'm a bit confused. For example this WU here: http://setiathome.berkeley.edu/workunit.php?wuid=1557775551 As you can see it took my card 1461 seconds to crunch this WU. My wingman was doing this task on a GTX 570, which has fewer CUDA cores, in just 593 seconds. That's 2.5 times faster! Plus, he seems to run stock while I use optimized apps. Is there something wrong with my card or my setup? |
Zalster Send message Joined: 27 May 99 Posts: 5517 Credit: 528,817,460 RAC: 242 |
That is interesting. Probably one of the others has a good explanation. The only thing I see is the difference in your clock speeds. His is clockRate = 1540 MHz; yours is Kepler GPU current clockRate = 1201 MHz. On the other hand, yours is more power efficient and probably runs cooler. But it's still faster than most people's. 24 minutes is still pretty good. How many work units at a time are you running on that GPU? 2 or 3? That also tends to push the time to completion out a little, but overall your total RAC goes up. Happy Crunching.. Zalster |
qbit Send message Joined: 19 Sep 04 Posts: 630 Credit: 6,868,528 RAC: 0 |
Zalster, I run 2 WUs at a time. I use GPU-Z to check load and it goes up to 98% sometimes so I'm not sure if running 3 tasks at once is a good idea. But maybe I will try. PS: Temperature usually stays beyond 60 degrees and the fan runs around 35-37%. It seems to be a very quiet card. |
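The effect of running several work units at once on the raw run times quoted above can be checked with simple arithmetic. A minimal sketch, assuming the wingman's GTX 570 runs one task at a time (the thread does not confirm this) and that both concurrent tasks on the GTX 750 take about equally long; the function name is made up for illustration.

```python
def seconds_per_task(wall_seconds: float, concurrent_tasks: int) -> float:
    """Average GPU time per task when several tasks share one card."""
    if concurrent_tasks < 1:
        raise ValueError("concurrent_tasks must be at least 1")
    return wall_seconds / concurrent_tasks


# Run times quoted in the thread for the same work unit.
gtx750 = seconds_per_task(1461, 2)  # two WUs at a time, as stated above
gtx570 = seconds_per_task(593, 1)   # ASSUMPTION: one WU at a time

print(f"GTX 750: {gtx750:.1f} s per task")
print(f"GTX 570: {gtx570:.1f} s per task")
print(f"GTX 750 takes {gtx750 / gtx570:.2f}x as long per task")
```

Under those assumptions the gap per task is much smaller than the raw 1461 s versus 593 s comparison suggests.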
Zalster Send message Joined: 27 May 99 Posts: 5517 Credit: 528,817,460 RAC: 242 |
I would stick with 2 work units for now. Does your GPU come with a fan regulator program? I have EVGA cards and they have a program that lets me set a temperature-to-fan-speed curve. (My cards never get above 60 C as the fans speed up. I use 70 C as my top temp, but that is me. I know they are rated for higher temps than that, but I like to keep mine below the max.) If it didn't, then I think I would look for one. I know there are some out there and I'm sure once others here start to wake up they would be happy to list them. I don't know them off the top of my head right now. It's 6 am so give them a few hours to wake up and migrate here into the forum, lol. Zalster |
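For readers unfamiliar with a temperature-to-fan-speed curve, the idea can be sketched as piecewise-linear interpolation between a few set points. This is only an illustration: the curve points below are hypothetical (only the 70 C top temperature comes from the post), and real fan tools may interpolate differently.

```python
from bisect import bisect_right

# HYPOTHETICAL curve: (temperature in C, fan speed in %), sorted by temperature.
CURVE = [(30, 30), (50, 40), (60, 60), (70, 100)]


def fan_percent(temp_c: float, curve=CURVE) -> float:
    """Linearly interpolate the fan speed for a given GPU temperature."""
    temps = [t for t, _ in curve]
    if temp_c <= temps[0]:
        return float(curve[0][1])   # below the curve: hold the minimum speed
    if temp_c >= temps[-1]:
        return float(curve[-1][1])  # above the curve: hold the maximum speed
    i = bisect_right(temps, temp_c)  # index of the first point above temp_c
    (t0, f0), (t1, f1) = curve[i - 1], curve[i]
    return f0 + (f1 - f0) * (temp_c - t0) / (t1 - t0)


for temp in (45, 60, 65, 75):
    print(f"{temp} C -> {fan_percent(temp):.0f}% fan")
```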
Wiggo Send message Joined: 24 Jan 00 Posts: 34984 Credit: 261,360,520 RAC: 489 |
"Thx folks!" That sounds about right time-wise (and the 570 is likely only doing 1 task at a time). Just remember your GTX 750 is not in the same class as a GTX 570. Even though the 570 doesn't have as many CUDA cores, its cores are much larger than those found on a 750. Memory bandwidth is another area where the 570 excels, seeing as it has a 320-bit wide bus delivering 152.0 GB/sec while your 750 only has a 128-bit wide bus delivering only 80 GB/sec. Cheers. |
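The bus widths and bandwidth figures quoted above are linked by a standard relation: peak memory bandwidth is roughly the bus width in bytes times the effective memory transfer rate. A minimal sketch using only the numbers from the post; the transfer rates it prints are derived from those numbers, not taken from a spec sheet, and the function names are illustrative.

```python
def bandwidth_gb_s(bus_width_bits: int, rate_gt_s: float) -> float:
    """Peak bandwidth in GB/s from bus width and effective transfer rate."""
    return bus_width_bits / 8 * rate_gt_s


def implied_rate_gt_s(bus_width_bits: int, bandwidth: float) -> float:
    """Effective transfer rate (GT/s) implied by a quoted bandwidth."""
    return bandwidth / (bus_width_bits / 8)


# Figures quoted in the post: (bus width in bits, bandwidth in GB/s).
cards = {"GTX 570": (320, 152.0), "GTX 750": (128, 80.0)}

for name, (bits, gb_s) in cards.items():
    rate = implied_rate_gt_s(bits, gb_s)
    print(f"{name}: {bits}-bit bus, {gb_s} GB/s -> about {rate:.1f} GT/s")
```

The wider bus more than makes up for the lower per-pin transfer rate, which is where the 570's bandwidth advantage comes from.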
juan BFP Send message Joined: 16 Mar 07 Posts: 9786 Credit: 572,710,851 RAC: 3,799 |
Of course there are a lot of other factors, like the number of WUs running, GPU usage, memory bandwidth, etc., but: newer cards are faster, have more cores, use less power, etc. than the older ones, yet there are still some older cards, especially the 570 or 580, that are clearly SETI crunching winners, and some people manage to OC/optimize them to incredible speeds. We have seen and talked about that before. I still miss my heavily OC'd 580s when I compare price/production against my 780 FTWs, but I don't miss the electric bills. What you need to be clear about is that your newer card uses a lot less power to do the same job, so it's fine. My suggestion: compare your card with similar cards to see if your configuration is really optimal. |
qbit Send message Joined: 19 Sep 04 Posts: 630 Credit: 6,868,528 RAC: 0 |
@Zalster: Not yet, but I could always use programs like NvidiaInspector or MSI Afterburner to control the fan and/or OC the card. But atm I will let it run as it is. @Wiggo: Ah, didn't know that. Thx! @juan: According to some charts which I found in magazines and on the net, yes, the 570 is faster than my card, but not by that much. So I just wanted to make sure that everything is ok and nothing's wrong with my card or my setup. Perhaps my rather slow CPU and/or RAM are a bottleneck here. And yes, I'm well aware that my card uses much less power. That was one of the main reasons for buying it. I would never have gotten one of those 180+ watt cards. I'm a low power crunching guy ;-) And in terms of performance/price I guess you can't do much better for Seti than with this card atm: http://www.videocardbenchmark.net/gpu_value.html |
qbit Send message Joined: 19 Sep 04 Posts: 630 Credit: 6,868,528 RAC: 0 |
Haha, now even a GTX 260 is crunching faster than my card: http://setiathome.berkeley.edu/workunit.php?wuid=1558183278 And a GTS 450 also: http://setiathome.berkeley.edu/workunit.php?wuid=1558147245 Well, that's in fact a bit frustrating. But what can I do? Anyway, this morning I used SetiPerformance to test the different CUDA versions and I found out that Cuda50 really is best for my card, although it's just a little bit faster than Cuda42 (less than 10%). BTW: Are there any (test)builds for Cuda55, Cuda60 or even Cuda65? |
Richard Haselgrove Send message Joined: 4 Jul 99 Posts: 14654 Credit: 200,643,578 RAC: 874 |
"BTW: Are there any (test)builds for Cuda55, Cuda60 or even Cuda65?" I ran some test builds for Cuda60 on my GTX 670 last December. They produced an unacceptable rate of invalid results, so they were discontinued as 'platform not ready yet'. |
Grant (SSSF) Send message Joined: 19 Aug 99 Posts: 13755 Credit: 208,696,464 RAC: 304 |
"Haha, now even a GTX 260 is crunching faster than my card:" The advantage of the GTX 750 series isn't its processing rate (that won't be the case till an application comes out that is optimised for that architecture) but how little power it uses to do that crunching. My GTX 750 Tis are slightly slower than the GTX 460 & GTX 560 Ti they replace, but I could run 6 * 750 Tis and still use less power than running the 460 & 560 Ti. Grant Darwin NT |
Zalster Send message Joined: 27 May 99 Posts: 5517 Credit: 528,817,460 RAC: 242 |
Frankie, the 750 is a good GPU. I own 5 myself. The efficiency of the card and its power savings are the great advantage. As Grant said, it's a newer card so the apps aren't optimized for it yet, but once they are, we should see an increase in its output. Don't get discouraged. Zalster |
qbit Send message Joined: 19 Sep 04 Posts: 630 Credit: 6,868,528 RAC: 0 |
Thx folks, so let's hope and wait for new cuda versions. |
qbit Send message Joined: 19 Sep 04 Posts: 630 Credit: 6,868,528 RAC: 0 |
Still trying to figure out where the problem is with my setup. Today I found this result: http://setiathome.berkeley.edu/workunit.php?wuid=1560631270 I was running this WU on my GTX 750, Tom Miller was running the task on his GTX 750 Ti. When I check the details for AP on GPU for our computers I see 245 GFLOPs for Tom and 225 GFLOPs for me. A rather small difference, and still it took my computer almost twice as long to crunch this WU (3046 seconds for Tom, 6073 seconds for me). So maybe the problem is not the card but something else? I see that Tom is using an older driver than I do. And I read something about problems with the current driver here somewhere. Could that be the problem? Or could it be my PCIe slot? It's 1.1 x16, maybe my card needs a 2.0 slot? Any ideas? Tom, if you read this here, do you run 2 AP tasks at a time on your card? |
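To put numbers on the mismatch described above, one can compare the ratio of the reported GFLOPs with the ratio of the run times. A minimal sketch using only the figures from the post; it assumes run time would scale inversely with the reported GFLOPs if nothing else differed, which is a simplification, and the variable names are illustrative.

```python
# Figures quoted in the post.
tom_gflops, my_gflops = 245, 225
tom_seconds, my_seconds = 3046, 6073

speed_ratio = tom_gflops / my_gflops  # how much faster Tom's card is rated
time_ratio = my_seconds / tom_seconds  # how much longer the task actually took

# Part of the slowdown that the GFLOPs rating does not account for
# (ASSUMPTION: run time scales inversely with the reported GFLOPs).
unexplained = time_ratio / speed_ratio

print(f"Rated speed ratio:  {speed_ratio:.2f}x")
print(f"Actual time ratio:  {time_ratio:.2f}x")
print(f"Unexplained factor: {unexplained:.2f}x")
```

A large unexplained factor points away from the card itself and toward differences in the number of concurrent tasks, app revision, driver or tuning parameters.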
HAL9000 Send message Joined: 11 Sep 99 Posts: 6534 Credit: 196,805,888 RAC: 57 |
"Still trying to figure out where the problem is with my setup." They are also using the rev 1316 app vs. the rev 1843 app that you are running. Several variables in play. SETI@home classic workunits: 93,865 CPU time: 863,447 hours Join the BP6/VP6 User Group (http://tinyurl.com/8y46zvu) |
Mike Send message Joined: 17 Feb 01 Posts: 34264 Credit: 79,922,639 RAC: 80 |
The difference is the params in use. Frankie uses standard values whilst Tom is using optimized values: DATA_CHUNK_UNROLL set to: 12; FFA thread block override value: 8192; FFA thread fetchblock override value: 4096. With each crime and every kindness we birth our future. |
qbit Send message Joined: 19 Sep 04 Posts: 630 Credit: 6,868,528 RAC: 0 |
@Hal: Yes, many variables, that's the problem. Google says it's not the PCIe slot. 1.1 should be ok even for new cards. Next I will try to go back to 337.88 and see if that changes anything. @Mike: What do those parameters mean and how can I change them? |
HAL9000 Send message Joined: 11 Sep 99 Posts: 6534 Credit: 196,805,888 RAC: 57 |
"@Hal: Yes, many variables, that's the problem. Google says it's not the PCIe slot. 1.1 should be ok even for new cards. Next I will try to go back to 337.88 and see if that changes anything." You would add them to your ap_cmdline_win_x86_SSE2_OpenCL_NV.txt (or whatever name you have for it), and a full description of what they do can be found in the AstroPulse_OpenCL_NV_ReadMe.txt. SETI@home classic workunits: 93,865 CPU time: 863,447 hours Join the BP6/VP6 User Group (http://tinyurl.com/8y46zvu) |
Mike Send message Joined: 17 Feb 01 Posts: 34264 Credit: 79,922,639 RAC: 80 |
"@Hal: Yes, many variables, that's the problem. Google says it's not the PCIe slot. 1.1 should be ok even for new cards. Next I will try to go back to 337.88 and see if that changes anything." -ffa_block N: sets how many FFA's different period iterations will be processed per kernel call. N should be an even integer less than 32768. -ffa_block_fetch N: sets how many FFA's different period iterations will be processed per "fetch" kernel call (the longest kernel in FFA). N should be a positive integer and a divisor of the ffa_block N. -unroll N: sets the number of data chunks processed per kernel call in the main application loop. N should be an integer; the minimum possible value is 2. Simply add them to your ap_cmdline_win_x86_SSE2_OpenCL_NV.txt. Example: -unroll 6 -ffa_block 8192 -ffa_block_fetch 4096 With each crime and every kindness we birth our future. |
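The constraints described above can be checked before editing the command-line file. A minimal sketch that encodes only the rules stated in the post; the helper names are hypothetical, the requirement that -ffa_block be positive is my own assumption, and the AstroPulse_OpenCL_NV_ReadMe.txt remains the authoritative description.

```python
from typing import List


def check_ap_params(unroll: int, ffa_block: int, ffa_block_fetch: int) -> List[str]:
    """Return a list of problems; an empty list means the values pass."""
    problems = []
    if unroll < 2:
        problems.append("-unroll must be at least 2")
    # ASSUMPTION: ffa_block must also be positive (the post only says
    # "even" and "less than 32768").
    if ffa_block <= 0 or ffa_block % 2 != 0 or ffa_block >= 32768:
        problems.append("-ffa_block must be an even integer below 32768")
    if ffa_block_fetch <= 0 or ffa_block % ffa_block_fetch != 0:
        problems.append("-ffa_block_fetch must be a positive divisor of -ffa_block")
    return problems


def build_cmdline(unroll: int, ffa_block: int, ffa_block_fetch: int) -> str:
    """Format the single line that goes into the ap_cmdline text file."""
    return f"-unroll {unroll} -ffa_block {ffa_block} -ffa_block_fetch {ffa_block_fetch}"


# Values from the example in the post.
values = (6, 8192, 4096)
issues = check_ap_params(*values)
print(build_cmdline(*values) if not issues else issues)
```

Checking the values first avoids handing the app a fetch block that does not divide the block size evenly.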