GTX 970 about as fast as a 670 for crunching

Mark Lybeck

Joined: 9 Aug 99
Posts: 245
Credit: 216,677,290
RAC: 173
Finland
Message 1657376 - Posted: 26 Mar 2015, 20:18:21 UTC

It seems that the GTX 970 is about as fast as the GTX 670 for crunching multibeam work units. The 970 will of course use a little bit less power.

I am surprised that the GTX 970 shows no increase in performance over the previous generations.

Zalster (Special Project $250 donor)
Volunteer tester
Joined: 27 May 99
Posts: 5517
Credit: 528,817,460
RAC: 242
United States
Message 1657378 - Posted: 26 Mar 2015, 20:24:40 UTC - in response to Message 1657376.  

How many work units per card are you running?

jason_gee
Volunteer developer
Volunteer tester
Joined: 24 Nov 06
Posts: 7489
Credit: 91,093,184
RAC: 0
Australia
Message 1657392 - Posted: 26 Mar 2015, 21:23:23 UTC
Last modified: 26 Mar 2015, 21:52:31 UTC

Yes, we're hitting a number of limits in the application rather than in the GPU. On the development side, finding (and understanding) all those limits on my 980 is taking a significant amount of time. They include how chatty the application is, scaling of the fairly small datasets, and some underlying system considerations.

For the time being, the best approach is to raise the process priority and run multiple instances (usually 2-3 per GPU); the combined throughput should then be around 2x that of a 670. The second task on a Maxwell GPU seems to scale very well, even with an old Core 2 Duo driving it.
"Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions.

cliff west
Joined: 7 May 01
Posts: 211
Credit: 16,180,728
RAC: 15
United States
Message 1657424 - Posted: 26 Mar 2015, 22:30:11 UTC - in response to Message 1657392.  

Can anyone post a simple "how to" for getting your GPU to work on more than one work unit at a time?

Thanks

Keith Myers (Special Project $250 donor)
Volunteer tester
Joined: 29 Apr 01
Posts: 13164
Credit: 1,160,866,277
RAC: 1,873
United States
Message 1657445 - Posted: 26 Mar 2015, 23:18:53 UTC - in response to Message 1657424.  

Simple: create a text file called app_config.xml with Notepad and put it into the SETI@home project directory. The contents would look something like this:



<app_config>
  <app>
    <name>setiathome_v7</name>
    <gpu_versions>
      <gpu_usage>0.33</gpu_usage>
      <cpu_usage>0.10</cpu_usage>
    </gpu_versions>
  </app>

  <app>
    <name>astropulse_v6</name>
    <gpu_versions>
      <gpu_usage>0.5</gpu_usage>
      <cpu_usage>0.50</cpu_usage>
    </gpu_versions>
  </app>

  <app>
    <name>astropulse_v7</name>
    <gpu_versions>
      <gpu_usage>0.33</gpu_usage>
      <cpu_usage>0.50</cpu_usage>
    </gpu_versions>
  </app>
</app_config>

But in your case, I would reduce the GPU tasks to two per card with a 0.5 entry for gpu_usage.
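
As a rough illustration of how gpu_usage relates to the number of tasks per card: each task claims that fraction of one GPU, so 0.5 allows two tasks at once and 0.33 allows three. The sketch below generates a one-app app_config.xml from a desired task count. The helper name build_app_config and its parameters are made up for this example; only the XML element names come from the file above.

```python
import xml.etree.ElementTree as ET


def build_app_config(app_name: str, tasks_per_gpu: int, cpu_usage: float) -> str:
    """Return app_config.xml text that lets `tasks_per_gpu` tasks share one GPU.

    Hypothetical helper for illustration; not part of BOINC itself.
    """
    if tasks_per_gpu < 1:
        raise ValueError("tasks_per_gpu must be at least 1")

    root = ET.Element("app_config")
    app = ET.SubElement(root, "app")
    ET.SubElement(app, "name").text = app_name
    versions = ET.SubElement(app, "gpu_versions")
    # Each task claims this fraction of a GPU: 2 tasks -> 0.5, 3 tasks -> 0.33.
    ET.SubElement(versions, "gpu_usage").text = str(round(1 / tasks_per_gpu, 2))
    ET.SubElement(versions, "cpu_usage").text = str(cpu_usage)

    # Pretty-print when available (ET.indent was added in Python 3.9).
    if hasattr(ET, "indent"):
        ET.indent(root)
    return ET.tostring(root, encoding="unicode")


config_text = build_app_config("setiathome_v7", tasks_per_gpu=2, cpu_usage=0.1)
print(config_text)
```

The printed text can be saved as app_config.xml in the project directory; the other apps from the example above would need their own app entries.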

Cheers, Keith
P.S. There is a complete wiki for the config.xml files at the BOINC site, covering their syntax.
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)

Keith Myers (Special Project $250 donor)
Volunteer tester
Joined: 29 Apr 01
Posts: 13164
Credit: 1,160,866,277
RAC: 1,873
United States
Message 1657447 - Posted: 26 Mar 2015, 23:21:40 UTC - in response to Message 1657392.  

Jason, do you want to divulge the predicted timeline for new apps that harness the power of the Kepler and Maxwell hardware? How far out are they: 6 months, one year?

Cheers, Keith

jason_gee
Volunteer developer
Volunteer tester
Joined: 24 Nov 06
Posts: 7489
Credit: 91,093,184
RAC: 0
Australia
Message 1657455 - Posted: 26 Mar 2015, 23:48:03 UTC - in response to Message 1657447.  
Last modified: 27 Mar 2015, 0:02:26 UTC

Hard to predict with current family and personal issues, but CUDA 7.0 was released a couple of days ago, so that's one less technical roadblock. (CUDA 6.0 and 6.5 were ruled out: testers experienced unexplained reliability issues with them, and there were no appreciable performance gains with the current application (x41zc) architecture.)

I've also been migrating the build system to Gradle (see http://en.wikipedia.org/wiki/Gradle), which complicates the timeline a bit. That's an extra development burden up front, but it's expected to ease cross-platform releases in the long run, so it's worthwhile.

Aside from the infrastructure changes, the reengineering involved places alpha-test x42 builds within a 3-month timeframe. That comes after the already confirmed architectural changes needed to reduce the chattiness, raise the load with fewer instances, and scale better from the smallest CUDA device through to the Titan X.

So, short version: ~3 months to x42 alpha, which is more or less a completely reengineered design based on everything we found. Aside from improved CUDA scaling, it's expected to support OpenCL devices and, in a later revision, AP (those come later and are not based on current code or techniques).

[Edit:] Note that the Windows 10 release, and adapting to accommodate WDDM 2.0/DirectX 12 techniques and best practices, may or may not extend the timeline. That'll probably be a bit clearer after I get to play with the tech preview this weekend (the USB is made; the machine and new SSD are waiting).

Richard Haselgrove (Project Donor)
Volunteer tester
Joined: 4 Jul 99
Posts: 14650
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1657462 - Posted: 27 Mar 2015, 0:05:34 UTC - in response to Message 1657445.  

P.S. There is a complete wiki for the config.xml files at the BOINC site, covering their syntax.

Client configuration
Application configuration

Mark Lybeck
Joined: 9 Aug 99
Posts: 245
Credit: 216,677,290
RAC: 173
Finland
Message 1657812 - Posted: 27 Mar 2015, 18:04:28 UTC - in response to Message 1657378.  

How many work units per card are you running?


I am running 3 per card.

Keith Myers (Special Project $250 donor)
Volunteer tester
Joined: 29 Apr 01
Posts: 13164
Credit: 1,160,866,277
RAC: 1,873
United States
Message 1657877 - Posted: 27 Mar 2015, 19:32:57 UTC - in response to Message 1657455.  

Thanks for the update, Jason. Good luck with the Windows 10 Tech Preview. So, likely an x42 general release sometime around the end of the year, barring any roadblocks with the new Windows graphics engine. Looking forward to it.

Cheers, Keith

Zalster (Special Project $250 donor)
Volunteer tester
Joined: 27 May 99
Posts: 5517
Credit: 528,817,460
RAC: 242
United States
Message 1657884 - Posted: 27 Mar 2015, 19:40:43 UTC - in response to Message 1657812.  

The first thing I would do is raise the priority in your mbcuda.cfg.

Find that file in your setiathome folder and open it with Notepad.

Go down to the last few lines, remove the ;; in front of the last 3 lines, and change them to the following:

processpriority = normal
pfblockspersm = 16
pfperiodsperlaunch = 400


Where it says processpriority, you can set it to normal, abovenormal or high, but depending on what else you use the computer for besides crunching, the higher settings will affect how usable it is for that other work. If it's only a cruncher, then you can try abovenormal or high.
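
For anyone who would rather script that edit than do it by hand, here is a minimal sketch of the same steps: strip the leading ;; from the named settings and write the new values. It assumes the file is plain key = value lines with ; comment markers, as the lines above suggest. The function name enable_settings is made up for this example, and the sample text is a stand-in with placeholder values, not the real shipped mbcuda.cfg.

```python
def enable_settings(cfg_text: str, settings: dict) -> str:
    """Uncomment and set the given keys in mbcuda.cfg-style text.

    Hypothetical helper; assumes 'key = value' lines commented out with ';'.
    """
    wanted = {key.lower(): value for key, value in settings.items()}
    result = []
    for line in cfg_text.splitlines():
        # Drop surrounding whitespace and any run of ';' comment markers.
        body = line.strip().lstrip(";").strip()
        key = body.split("=", 1)[0].strip().lower()
        if "=" in body and key in wanted:
            # Rewrite the line uncommented, with the requested value.
            result.append(f"{key} = {wanted[key]}")
        else:
            # Leave every other line exactly as it was.
            result.append(line)
    return "\n".join(result)


# Stand-in file contents (placeholder values, not the shipped defaults).
sample = """\
;;; a comment line that should be left alone
;;processpriority = abovenormal
;;pfblockspersm = 8
;;pfperiodsperlaunch = 200"""

updated = enable_settings(sample, {
    "processpriority": "normal",
    "pfblockspersm": 16,
    "pfperiodsperlaunch": 400,
})
print(updated)
```

Keep a backup copy of the original file before overwriting it with the result.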

I also see that there is a 670 in the same machine as the 970.

One of the others will have to say whether the 670 might be slowing down the 970. I know that when I had a 750 in with some 780s, it tended to slow the 780s somewhat. Others might know more than I do on that.

Happy crunching...

Zalster

cliff west
Joined: 7 May 01
Posts: 211
Credit: 16,180,728
RAC: 15
United States
Message 1657899 - Posted: 27 Mar 2015, 20:19:39 UTC - in response to Message 1657445.  

Thanks... I ordered my GTX 980 today to replace my SLI'd 570s. I will go back to SLI when I have saved enough money to get the second 980.