Comparing the performance of various MB apps

Message boards : Number crunching : Comparing the performance of various MB apps

qbit
Volunteer tester
Joined: 19 Sep 04
Posts: 630
Credit: 6,868,528
RAC: 0
Austria
Message 1704054 - Posted: 22 Jul 2015, 17:38:45 UTC

I wonder what's the best way to do this. Are the application detail values (computer > details > application details) accurate enough to measure the performance?
I was running the regular Lunatics cuda 5 app for some time and was close to 100 GFLOPS (99.7 iirc). Over the last few days I have been testing 2 different cuda 6.5 apps. With the first one, the value dropped to 93 GFLOPS; with the second one it dropped to the current value of 84.8 GFLOPS:
http://setiathome.berkeley.edu/host_app_versions.php?hostid=7563243

I do not have the feeling that it's running slower, and my RAC seems ok as well, so I wonder how accurate these values are?
ID: 1704054 · Report as offensive
Profile Zalster Special Project $250 donor
Volunteer tester
Joined: 27 May 99
Posts: 5517
Credit: 528,817,460
RAC: 242
United States
Message 1704062 - Posted: 22 Jul 2015, 17:52:51 UTC - in response to Message 1704054.  

Where did you get the new apps from?
ID: 1704062 · Report as offensive
qbit
Volunteer tester
Joined: 19 Sep 04
Posts: 630
Credit: 6,868,528
RAC: 0
Austria
Message 1704067 - Posted: 22 Jul 2015, 17:59:36 UTC - in response to Message 1704062.  

Where did you get the new apps from?

They are not really new, those are alpha builds from Jason, but they seem to run very well on my card. You can find them here:
http://www.jgopt.org/download.html
ID: 1704067 · Report as offensive
Richard Haselgrove Project Donor
Volunteer tester

Joined: 4 Jul 99
Posts: 14650
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1704068 - Posted: 22 Jul 2015, 18:00:48 UTC - in response to Message 1704054.  

The figures on the application details page are averages over recent workunits. Maxwell-based NVidia cards, like the GTX 750 in that host, are notably inefficient at running VHAR 'shorty' tasks. A run of shorties would show up as tasks running longer than estimated, or conversely the card running slower. If you want to compare application speeds accurately, you have to use the same mix of tasks for each application, usually in an off-line bench test.
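To illustrate why the task mix matters, here is a minimal sketch. It assumes the displayed figure behaves like a plain mean of per-task rates, which is a simplification, and every number in it is invented for illustration rather than measured on any host.

```python
# Hypothetical illustration: all figures below are invented, not measured.
# Assumption: the displayed value behaves like a plain mean of per-task rates.

def average_rate(tasks):
    """Mean of per-task apparent processing rates (GFLOPS)."""
    return sum(rate for _, rate in tasks) / len(tasks)

# (task kind, apparent GFLOPS) for ONE unchanged app on ONE card.
MID = ("mid-AR", 100.0)
SHORTY = ("VHAR", 70.0)

normal_mix = [MID] * 9 + [SHORTY] * 1    # mostly mid-range tasks
shorty_run = [MID] * 5 + [SHORTY] * 5    # a run of shorties

print(average_rate(normal_mix))  # 97.0
print(average_rate(shorty_run))  # 85.0
```

The app is identical in both cases; only the proportion of shorties changed, yet the average moves by about 12 GFLOPS.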
ID: 1704068 · Report as offensive
Profile jason_gee
Volunteer developer
Volunteer tester
Joined: 24 Nov 06
Posts: 7489
Credit: 91,093,184
RAC: 0
Australia
Message 1704069 - Posted: 22 Jul 2015, 18:04:11 UTC - in response to Message 1704054.  

The application details values can be pretty volatile, but for MB they should at least fluctuate around a discernible mean value, somewhere in the region of 1/3 of the 'actual' figure (over, say, a few days' worth of validations).

In the case of Cuda versions, the current implementation uses minimal latency hiding techniques (features that improve performance more in later Cudas). The result of this is that (for the time being) different generation GPUs tend to favour one version or another. This is due to differences in latency depending on how costly the new features are against possible performance improvement in the Cuda version and its libraries.

Future releases will make better use of the features to hide those latencies. That effort has been accelerated thanks to great help from Petri33, and work is underway to iron out some kinks.

Once this effort is complete, more than likely newer drivers and later Cuda versions will be preferable for many systems (though I'm sure there will always be exceptions, so older versions will be maintained).
"Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions.
ID: 1704069 · Report as offensive
TBar
Volunteer tester

Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1704079 - Posted: 22 Jul 2015, 18:49:50 UTC - in response to Message 1704054.  

I wonder what's the best way to do this. Are the application detail values (computer > details > application details) accurate enough to measure the performance?
I was running the regular Lunatics cuda 5 app for some time and was close to 100 GFLOPS (99.7 iirc). Over the last few days I have been testing 2 different cuda 6.5 apps. With the first one, the value dropped to 93 GFLOPS; with the second one it dropped to the current value of 84.8 GFLOPS:
http://setiathome.berkeley.edu/host_app_versions.php?hostid=7563243

I do not have the feeling that it's running slower, and my RAC seems ok as well, so I wonder how accurate these values are?

The best method for MBs is to compare like Angle Ranges (ARs). There are 3 main ranges; you could call them VHAR (Shorties), MAR (Middle), and LAR (Low). Opinions differ on the exact boundaries, but I would call anything above 2.0 AR a VHAR, around 0.8 AR a MAR, and around 0.4 AR a LAR. So, you would keep a log for each app and record the run times and ARs in those 3 ranges.
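A minimal sketch of such a log summary follows. The 2.0 cutoff for VHAR comes from the ranges above; the windows used for "around 0.8" and "around 0.4" are assumptions chosen for illustration, and the app names and run times are hypothetical sample data.

```python
from collections import defaultdict
from statistics import mean

def ar_class(ar):
    """Classify an angle range. The MAR/LAR windows are assumed tolerances."""
    if ar > 2.0:
        return "VHAR"
    if 0.6 <= ar <= 1.0:   # assumed window for "around 0.8"
        return "MAR"
    if 0.3 <= ar <= 0.5:   # assumed window for "around 0.4"
        return "LAR"
    return None            # outside the three ranges: not compared

def summarize(log):
    """log: iterable of (app, angle_range, run_time_seconds) entries."""
    buckets = defaultdict(list)
    for app, ar, secs in log:
        cls = ar_class(ar)
        if cls is not None:
            buckets[(app, cls)].append(secs)
    return {key: mean(times) for key, times in buckets.items()}

# Hypothetical log entries, not real measurements.
log = [
    ("cuda50", 0.41, 1200.0), ("cuda50", 0.43, 1240.0),
    ("cuda50", 2.70, 300.0),
    ("cuda65", 0.42, 1100.0), ("cuda65", 2.65, 330.0),
    ("cuda65", 1.50, 500.0),  # falls between ranges, ignored
]

for (app, cls), secs in sorted(summarize(log).items()):
    print(f"{app} {cls}: {secs:.0f} s")
```

Comparing the same class across apps (for example LAR against LAR) keeps the task mix out of the comparison.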

Hopefully I'll be able to test my own 750Ti in a few more days...
ID: 1704079 · Report as offensive
qbit
Volunteer tester
Joined: 19 Sep 04
Posts: 630
Credit: 6,868,528
RAC: 0
Austria
Message 1704100 - Posted: 22 Jul 2015, 19:54:57 UTC

Very interesting, thx so far everybody!

Unless my RAC suddenly drops I will stick with Jason's alpha. It seems to put less stress on the hardware. We had 6 days in a row with 35+ degrees here, with 37 degrees today, and the GPU runs at 56-57 degrees most of the time.

BTW: Does anybody know if Cuda will still be officially supported in the future, or if MB will switch from Cuda to OpenCL completely? (OpenCL app already at beta)
ID: 1704100 · Report as offensive
Profile jason_gee
Volunteer developer
Volunteer tester
Joined: 24 Nov 06
Posts: 7489
Credit: 91,093,184
RAC: 0
Australia
Message 1704103 - Posted: 22 Jul 2015, 20:08:08 UTC - in response to Message 1704100.  

Very interesting, thx so far everybody!

Unless my RAC suddenly drops I will stick with Jason's alpha. It seems to put less stress on the hardware. We had 6 days in a row with 35+ degrees here, with 37 degrees today, and the GPU runs at 56-57 degrees most of the time.

BTW: Does anybody know if Cuda will still be officially supported in the future, or if MB will switch from Cuda to OpenCL completely? (OpenCL app already at beta)


I doubt that Cuda support would be dropped, mostly because, from what I understand, the project likes to offer as wide support as possible. Also, as application developers we tend to have different strengths and weaknesses, so one build or another will be preferred by a given system.

As for the third-party Cuda builds, after the last of the current x41 series used for stock is tied off, it'll be gutted and re-engineered to have MB and AP with, where usable, Cuda, OpenCL, RenderScript, and possibly DirectCompute all within the same sources, plus more sophisticated heterogeneous capabilities. Whether some specific configuration(s) of that would be suitable for a stock release is an open question, and would depend on the kinds of tools a user would need, versus a baseline casual user for whom the current simpler designs suffice.
ID: 1704103 · Report as offensive
Wedge009
Volunteer tester
Joined: 3 Apr 99
Posts: 451
Credit: 431,396,357
RAC: 553
Australia
Message 1704116 - Posted: 22 Jul 2015, 20:57:28 UTC - in response to Message 1704100.  

It seems to put less stress on the hardware. We had 6 days with 35+ degrees in a row here with 37 degrees today and the GPU runs at 56-57 degrees most times.

Just a thought, but I wonder if the reduced stress as indicated by (presumably) lower core temperatures despite the (presumably) higher ambient temperature means that the GPU is actually performing less strongly. But if you find your average remains stable, I suppose it's a benefit to have the cooler-running application. As mentioned in several other places, RAC does tend to be very volatile, though.
Soli Deo Gloria
ID: 1704116 · Report as offensive
Profile jason_gee
Volunteer developer
Volunteer tester
Joined: 24 Nov 06
Posts: 7489
Credit: 91,093,184
RAC: 0
Australia
Message 1704121 - Posted: 22 Jul 2015, 21:12:00 UTC - in response to Message 1704116.  
Last modified: 22 Jul 2015, 21:14:38 UTC

Yes, tough call. I didn't see a particular benefit or disadvantage in that particular build myself (980SC), which is the main reason I never pushed it for wider testing.

That's about when things shifted gear to Petri's work and a few of my own tweaks, then to looking at new designs. There's enough going on in the background to change direction, and my broken (unpublished) x41zd attempt showed some great promise, as did Petri's own builds. At the same time there may need to be some very conservative defaults and stern safety warnings, as 'no free lunch' seemed to apply (~30% better load, much shorter runtimes, but some serious heat). If some throttling is needed to keep things sane for basic general use, that'll be quite doable. It's just something we'll have to work out, if it proves necessary, during x41 finalisation. That's the kind of thing x42 is meant to have, so pulling it forward could be a challenge.
ID: 1704121 · Report as offensive
Profile Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 6325
Credit: 106,370,077
RAC: 121
Russia
Message 1704131 - Posted: 22 Jul 2015, 21:53:19 UTC - in response to Message 1704068.  

The figures on the application details page are averages over recent workunits. Maxwell-based NVidia cards, like the GTX 750 in that host, are notably inefficient at running VHAR 'shorty' tasks. A run of shorties would show up as tasks running longer than estimated, or conversely the card running slower. If you want to compare application speeds accurately, you have to use the same mix of tasks for each application, usually in an off-line bench test.

Another way with less processing overhead (but perhaps requiring more work from the operator) is to plot a time vs AR curve for each of the builds. This way real (online) data can be used, and a clearer picture of the relative performance of the builds being compared can be obtained.
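As a rough sketch of that idea without a plotting library: treat each build's (AR, run time) samples as a curve and compare the builds at common AR values by linear interpolation. Linear interpolation is an assumption made here for simplicity, and the build names and sample points are hypothetical.

```python
from bisect import bisect_left

def build_curve(samples):
    """Sort (angle_range, seconds) samples by AR to form a curve."""
    return sorted(samples)

def time_at(curve, ar):
    """Linearly interpolated run time at `ar`; None outside the measured span."""
    if not curve:
        return None
    ars = [a for a, _ in curve]
    if ar < ars[0] or ar > ars[-1]:
        return None
    i = bisect_left(ars, ar)
    if ars[i] == ar:
        return curve[i][1]
    (a0, t0), (a1, t1) = curve[i - 1], curve[i]
    return t0 + (t1 - t0) * (ar - a0) / (a1 - a0)

# Hypothetical samples taken from live tasks; the ARs need not match.
build_a = build_curve([(0.4, 1200.0), (1.0, 800.0), (2.0, 400.0)])
build_b = build_curve([(0.5, 1000.0), (1.5, 600.0), (2.5, 350.0)])

for ar in (0.5, 1.0, 1.5, 2.0):
    ta, tb = time_at(build_a, ar), time_at(build_b, ar)
    if ta is not None and tb is not None:
        print(f"AR {ar:.1f}: A {ta:.0f} s, B {tb:.0f} s, B/A {tb / ta:.2f}")
```

Because each build is sampled wherever its tasks happened to fall, the curves let builds be compared even when they never received tasks with exactly the same AR.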
ID: 1704131 · Report as offensive
Profile Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 6325
Credit: 106,370,077
RAC: 121
Russia
Message 1704135 - Posted: 22 Jul 2015, 22:01:33 UTC - in response to Message 1704100.  


BTW: Does anybody know if Cuda will still be officially supported in the future or if MB will switch from Cuda to open CL completly? (openCL app already at beta)


"CUDA officially supported" is something only nVidia corporation can provide. Only nVidia can either support CUDA in their drivers or drop support. Everyone else can either use CUDA, if it is supported for the particular hardware/OS, or use something else. Just to make the wording a little clearer.
ID: 1704135 · Report as offensive



 