Average processing rate

Message boards : Number crunching : Average processing rate

Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 6325
Credit: 106,370,077
RAC: 121
Russia
Message 1617436 - Posted: 22 Dec 2014, 17:09:58 UTC
Last modified: 22 Dec 2014, 17:10:20 UTC

Does the "Average processing rate" (APR) value for the stock app account for running several instances of the app per device? That is now possible via app_config.xml.

Say one runs 1 instance per device versus 2 instances per device (assuming elapsed time scales linearly, so each task takes exactly twice as long with 2 instances as with a single instance). What will one see in the APR field for the app version?
ID: 1617436
Zalster Special Project $250 donor
Volunteer tester
Joined: 27 May 99
Posts: 5517
Credit: 528,817,460
RAC: 242
United States
Message 1617442 - Posted: 22 Dec 2014, 17:47:51 UTC - in response to Message 1617436.  
Last modified: 22 Dec 2014, 17:48:36 UTC

I'm confused about what you are asking. Are you asking why people don't run more than 1 instance of the stock app per device? Or are you asking about the experience of people running more than 1 instance of the stock app on their device?

I'm going to guess the first.

Example from beta on my GTX 980:

Running 1 AP 7.06 with stock settings takes 18 min 56 sec, but uses 98% of 1 CPU core.

With a modified command line, 1 AP 7.06 now takes 30 min 3 sec and uses only 10% of 1 CPU core.

So you would think running 3 APs with the command line would take 1 hr 30 min and 30% of 1 CPU core.

In reality, I run 5 APs 7.06 with the command line per card in 1 hr 20 min, with only 50-60% of 1 CPU core.

This probably doesn't answer your question, but elapsed time is not fully linear. Hope that helps.
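The figures above can be turned into throughput with a little arithmetic. The following is a minimal sketch, not anything taken from BOINC itself: the function name `tasks_per_hour` and the constants are hypothetical, and the only inputs are the elapsed times quoted in this post.

```python
def tasks_per_hour(elapsed_minutes: float, concurrent_tasks: int) -> float:
    """Completed tasks per hour when `concurrent_tasks` run side by side
    and each one takes `elapsed_minutes` of wall-clock time."""
    return concurrent_tasks * 60.0 / elapsed_minutes


# Elapsed times quoted in the post, converted to minutes.
STOCK_SINGLE_MIN = 18 + 56 / 60   # 1 AP, stock settings
TUNED_SINGLE_MIN = 30 + 3 / 60    # 1 AP, modified command line
TUNED_FIVE_MIN = 80.0             # 5 APs at once, modified command line

stock_single = tasks_per_hour(STOCK_SINGLE_MIN, 1)
tuned_single = tasks_per_hour(TUNED_SINGLE_MIN, 1)
tuned_five = tasks_per_hour(TUNED_FIVE_MIN, 5)

# A fully linear model would predict 5 x 30 min 3 sec for five tasks.
linear_five_min = 5 * TUNED_SINGLE_MIN

print(f"1 AP, stock settings:        {stock_single:.2f} tasks/hour")
print(f"1 AP, modified command line: {tuned_single:.2f} tasks/hour")
print(f"5 APs, modified command line: {tuned_five:.2f} tasks/hour")
print(f"Linear prediction for 5 APs: {linear_five_min:.2f} min elapsed, "
      f"observed {TUNED_FIVE_MIN:.2f} min")
```

Five at a time works out to 3.75 tasks per hour, or 16 minutes of GPU time per task, which is well below the roughly 150 minutes of elapsed time that a linear model would predict for the batch.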

Zalster
ID: 1617442
Claggy
Volunteer tester
Joined: 5 Jul 99
Posts: 4654
Credit: 47,537,079
RAC: 4
United Kingdom
Message 1617448 - Posted: 22 Dec 2014, 18:02:49 UTC - in response to Message 1617436.  

If you run two instances, the APR will be approximately half what it was when running one instance (subject to how well the app loads the GPU).
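A rough way to see why APR halves is sketched below. It assumes APR behaves like estimated work per task divided by that task's elapsed time, which is an approximation of how BOINC derives the figure, not its actual code. The work size, the timings, and both function names are hypothetical.

```python
def apr_per_task(est_gflop_per_task: float, elapsed_seconds: float) -> float:
    """Approximate APR in GFLOPS: estimated work divided by elapsed time.
    This is an assumed simplification of what the server reports."""
    return est_gflop_per_task / elapsed_seconds


def device_rate(apr: float, instances: int) -> float:
    """Aggregate rate of the device when `instances` tasks run at once."""
    return apr * instances


WORK_GFLOP = 40_000.0     # hypothetical estimated work per task
SINGLE_ELAPSED = 1_000.0  # hypothetical elapsed seconds with 1 instance

# Fully linear case: two instances, each taking exactly twice as long.
apr_one = apr_per_task(WORK_GFLOP, SINGLE_ELAPSED)
apr_two = apr_per_task(WORK_GFLOP, 2 * SINGLE_ELAPSED)

print(f"APR with 1 instance:  {apr_one:.1f} GFLOPS")
print(f"APR with 2 instances: {apr_two:.1f} GFLOPS")
print(f"Device rate, 1 instance:  {device_rate(apr_one, 1):.1f} GFLOPS")
print(f"Device rate, 2 instances: {device_rate(apr_two, 2):.1f} GFLOPS")
```

Under the linear assumption the device does the same amount of work either way, yet the reported APR drops from 40 to 20. Reading an APR of 20 without knowing the instance count therefore cannot distinguish a slower GPU running one task from a faster GPU running two.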

Claggy
ID: 1617448
Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 6325
Credit: 106,370,077
RAC: 121
Russia
Message 1617450 - Posted: 22 Dec 2014, 18:06:27 UTC - in response to Message 1617448.  

As I feared... That is, looking at APR alone, without knowing how many app instances are running, it's not possible to say anything about GPU performance at all...

In particular, I tried to compare the performance of 7.03 and 7.06 on the same beta host (my own), but realized that I can't say how many instances were running back in the 7.03 days... So no comparison via APR is possible, a pity...
ID: 1617450
Claggy
Volunteer tester
Joined: 5 Jul 99
Posts: 4654
Credit: 47,537,079
RAC: 4
United Kingdom
Message 1617454 - Posted: 22 Dec 2014, 18:13:39 UTC - in response to Message 1617450.  

Another problem is that the APR varies with the kind of work being done.
Do shorties (especially on CUDA apps) and the APR will drop, because autocorrelations make up a large percentage of the WU.
Do VLARs (especially on CUDA apps) and the APR will drop, because pulse finding makes up a large percentage of the WU.

Claggy
ID: 1617454
Wedge009
Volunteer tester
Joined: 3 Apr 99
Posts: 451
Credit: 431,396,357
RAC: 553
Australia
Message 1617568 - Posted: 22 Dec 2014, 21:38:55 UTC
Last modified: 22 Dec 2014, 21:41:01 UTC

I observe similar figures to what Claggy mentioned: running two instances results in about half the APR of running one instance. APR is also influenced by variations in run time, e.g. what I mentioned previously about AP running shorter and MB running longer when AP and MB run simultaneously, compared with running 2xAP or 2xMB.

It's a shame, because I would want to look at others' APR to get an idea of how well one GPU type compares with another... but without knowing the number of instances they use, it's impractical to do so.
Soli Deo Gloria
ID: 1617568
petri33
Volunteer tester
Joined: 6 Jun 02
Posts: 1668
Credit: 623,086,772
RAC: 156
Finland
Message 1617611 - Posted: 22 Dec 2014, 23:20:06 UTC - in response to Message 1617450.  

The others have confirmed what I have observed. YMMV - your mileage may vary.

I have run a number of configurations in the past few days. Which of them is "best" I cannot say.

One of each at a time for two days: OK, times as short as any of the top ten hosts. No apparent progress on any of the lists or in APR. My calculations show that I gained some negative points on average.

Now running 4 at a time.

A few days ago I was running 8 MB at a time (with a different software version).
To overcome Heisenbergs:
"You can't always get what you want / but if you try sometimes you just might find / you get what you need." -- Rolling Stones
ID: 1617611
petri33
Volunteer tester
Joined: 6 Jun 02
Posts: 1668
Credit: 623,086,772
RAC: 156
Finland
Message 1618482 - Posted: 24 Dec 2014, 23:32:36 UTC - in response to Message 1617611.  
Last modified: 24 Dec 2014, 23:33:13 UTC

An update ...


I have returned to running 8 MB at a time (0.125 GPU for GPU MB) and 3 AP at a time (0.33 for GPU AP).

APR may tell something, and something can be guessed from others' APR. It swings up and down, shows some numbers, and gives me no clue.

So: 8 MB, 3 AP on 4 GPUs, + some on CPU.
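The link between the `gpu_usage` fractions above and the number of tasks per GPU can be sketched as follows. This assumes the client keeps starting tasks of one app on a GPU while their `gpu_usage` values sum to at most 1, which simplifies the real scheduler and ignores mixing MB and AP on the same card. The app names and the `cpu_usage` values in the sample are assumptions for illustration, as are the helper names `tasks_per_gpu` and `concurrency_by_app`.

```python
import math
import xml.etree.ElementTree as ET

# Sample app_config.xml content. The gpu_usage values come from the post;
# the app names and cpu_usage values are assumed for illustration only.
APP_CONFIG = """
<app_config>
  <app>
    <name>setiathome_v7</name>
    <gpu_versions>
      <gpu_usage>0.125</gpu_usage>
      <cpu_usage>0.1</cpu_usage>
    </gpu_versions>
  </app>
  <app>
    <name>astropulse_v7</name>
    <gpu_versions>
      <gpu_usage>0.33</gpu_usage>
      <cpu_usage>0.5</cpu_usage>
    </gpu_versions>
  </app>
</app_config>
""".strip()


def tasks_per_gpu(gpu_usage: float) -> int:
    """How many tasks fit on one GPU if their gpu_usage must sum to <= 1.
    The small epsilon guards against floating-point rounding."""
    return math.floor(1.0 / gpu_usage + 1e-9)


def concurrency_by_app(xml_text: str) -> dict[str, int]:
    """Map each app name in the config to its tasks-per-GPU count."""
    root = ET.fromstring(xml_text)
    result = {}
    for app in root.findall("app"):
        name = app.findtext("name")
        usage = float(app.findtext("gpu_versions/gpu_usage"))
        result[name] = tasks_per_gpu(usage)
    return result


print(concurrency_by_app(APP_CONFIG))
```

With these values the sketch reports 8 tasks per GPU for the MB app and 3 for the AP app, matching the counts given in the post.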
To overcome Heisenbergs:
"You can't always get what you want / but if you try sometimes you just might find / you get what you need." -- Rolling Stones
ID: 1618482
Grant (SSSF)
Volunteer tester
Joined: 19 Aug 99
Posts: 13720
Credit: 208,696,464
RAC: 304
Australia
Message 1618497 - Posted: 24 Dec 2014, 23:58:44 UTC - in response to Message 1618482.  

petri33 wrote: "I have returned to running 8 MB at a time (0.125 GPU for GPU MB)"

Which will result in very low throughput compared to running 2 at a time.

When I had my GTX 460/560Ti I tried running 1, 2 or 3 WUs at a time.
2 at a time gave me the most WUs processed per hour.
When I got my GTX 750Tis I did the same thing: 1, 2 or 3 at a time. The result was the same; 2 at a time gave the most WUs processed per hour.
Although with MBv7, 2 versus 3 at a time was very close: longer-running WUs gave much better throughput per hour running 3 at a time, but shorties actually gave much worse throughput, so in the end 2 at a time was the winner. If the autocorrelations could be significantly optimised on the GPU, then 3 at a time would be viable, with even greater throughput per hour than is possible now with the longer-running WUs.
Grant
Darwin NT
ID: 1618497
petri33
Volunteer tester
Joined: 6 Jun 02
Posts: 1668
Credit: 623,086,772
RAC: 156
Finland
Message 1618500 - Posted: 25 Dec 2014, 0:14:38 UTC - in response to Message 1618497.  
Last modified: 25 Dec 2014, 0:19:19 UTC

I know, I have tried. I compiled my own versions. Tweaked some settings (nv sched yield/sleep/hot). Running on Linux. Tried with different kernel sizes. With different thread counts. Tried with __ldg(&). Tried with unroll N or __restrict__ (google that). Tried and tried.

And got tired.

Now running 8 at a time. Just because of my stubbornness.

P.S. And please divide the run times by the number of tasks running at once.
To overcome Heisenbergs:
"You can't always get what you want / but if you try sometimes you just might find / you get what you need." -- Rolling Stones
ID: 1618500

 
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.