Message boards : Number crunching : Average processing rate
Author | Message |
---|---|
Raistmer Send message Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121 |
Does the "Average processing rate" value for the stock app account for running more than one instance of the app per device? That's now possible via app_config.xml. Say one runs either 1 instance per device or 2 instances per device (assuming elapsed time is fully linear, so running 2 instances takes exactly twice as long as a single instance): what will one see in the APR field for the app version? |
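For reference, the multi-instance setup being asked about is configured with a `gpu_usage` fraction in app_config.xml: a value of 0.5 tells the BOINC client to schedule two tasks per GPU. A minimal sketch, where the app name `setiathome_v7` is an assumption (use the names from your own client_state.xml):

```xml
<!-- app_config.xml: place in the project's directory under the BOINC data dir.
     gpu_usage 0.5 => BOINC runs 2 tasks per GPU;
     cpu_usage is the CPU fraction budgeted per GPU task. -->
<app_config>
  <app>
    <name>setiathome_v7</name>
    <gpu_versions>
      <gpu_usage>0.5</gpu_usage>
      <cpu_usage>0.2</cpu_usage>
    </gpu_versions>
  </app>
</app_config>
```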
Zalster Send message Joined: 27 May 99 Posts: 5517 Credit: 528,817,460 RAC: 242 |
I'm confused about what you are asking. Are you asking why people don't use more than 1 instance of the stock app per device? Or are you asking about the experience of people running more than 1 instance of the stock app on their device? I'm going to guess the first. Example from beta on my GTX 980: running 1 AP 7.06 stock takes 18 min 56 sec but utilizes 98% of 1 core. With the modified command line, AP 7.06 takes 30 minutes 3 secs and only 10% of 1 core, so you would think running 3 APs with the command line would take 1 hr 30 minutes and 30% of 1 CPU core. In reality, I run 5 APs 7.06 with the command line per card in 1 hour 20 minutes with only 50-60% of 1 CPU core. Probably doesn't answer your question, but elapsed time is not fully linear. Hope that helps. Zalster |
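Zalster's figures can be converted into tasks completed per hour, which makes the non-linearity concrete; a quick sketch using the numbers from the post above:

```python
def throughput_per_hour(instances, elapsed_minutes):
    """Tasks completed per hour on one GPU when `instances` tasks
    run concurrently and all finish in `elapsed_minutes`."""
    return instances * 60.0 / elapsed_minutes

# Figures from the GTX 980 example above.
one_stock = throughput_per_hour(1, 18 + 56 / 60)  # 1 stock AP, 18 min 56 s
one_cmd = throughput_per_hour(1, 30 + 3 / 60)     # 1 command-line AP, 30 min 3 s
five_cmd = throughput_per_hour(5, 80)             # 5 command-line APs, 1 h 20 min

print(round(one_stock, 2), round(one_cmd, 2), round(five_cmd, 2))
# prints 3.17 2.0 3.75
```

Five at a time nearly doubles the per-card throughput of a single command-line task (3.75 vs 2.0 tasks/hour), even though each individual task takes much longer to finish.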
Claggy Send message Joined: 5 Jul 99 Posts: 4654 Credit: 47,537,079 RAC: 4 |
Does the "Average processing rate" value for the stock app account for running more than one instance of the app per device? That's now possible via app_config.xml. If you run two instances, the APR will be approximately half what it was when running one instance (subject to how well the app loads the GPU). Claggy |
Raistmer Send message Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121 |
Does the "Average processing rate" value for the stock app account for running more than one instance of the app per device? That's now possible via app_config.xml. As I feared... That is, looking at APR alone, without actually knowing how many app instances are running, it's not possible to say anything about GPU performance at all... In particular, I tried to compare the performance of 7.03 and 7.06 on the same beta host (my own), but realized I can't tell how many instances were running back in the 7.03 days... So no comparison via APR is possible, a pity... |
Claggy Send message Joined: 5 Jul 99 Posts: 4654 Credit: 47,537,079 RAC: 4 |
As I feared... That is, looking at APR alone, without actually knowing how many app instances are running, it's not possible to say anything about GPU performance at all... Another problem is that the APR varies with what work is being done. Do shorties (especially on CUDA apps) and the APR will drop because autocorrelations make up a large percentage of the WU; do VLARs (especially on CUDA apps) and the APR will drop because the pulse finding makes up a large percentage of the WU. Claggy |
Wedge009 Send message Joined: 3 Apr 99 Posts: 451 Credit: 431,396,357 RAC: 553 |
I observe similar figures to what Claggy mentioned: running two instances results in about half the APR of running one instance. APR is also influenced by variances in run time, e.g. what I mentioned previously about AP running shorter and MB running longer when AP/MB run simultaneously vs running 2xAP or 2xMB. It's a shame, because I would like to look at others' APR to get an idea of how well one GPU type compares with another... but without knowing the number of instances they use, it's impractical to do so. Soli Deo Gloria |
petri33 Send message Joined: 6 Jun 02 Posts: 1668 Credit: 623,086,772 RAC: 156 |
The others have confirmed what I have observed. YMMV - your mileage may vary. I have run a number of configurations in the past few days. -- Which of them is "best" I cannot say. One of each at a time for two days: OK - shorter time than any of the ten top hosts. No apparent progress on any of the lists or APR. My calculations show that I gained some negative points on average. Now running 4 at a time. A few days ago I was running 8 MB at a time (with a different sw version). To overcome Heisenbergs: "You can't always get what you want / but if you try sometimes you just might find / you get what you need." -- Rolling Stones |
petri33 Send message Joined: 6 Jun 02 Posts: 1668 Credit: 623,086,772 RAC: 156 |
The others have confirmed what I have observed. YMMV - your mileage may vary. An update ... I have returned to running 8 MB at a time (0.125 GPU for GPU MB) and 3 AP at a time (0.33 for GPU AP). APR may tell something; something can be guessed from others' APR. It swings up and down, shows some numbers, gives me no clue. So: 8 MB, 3 AP on 4 GPUs, + some on CPU. To overcome Heisenbergs: "You can't always get what you want / but if you try sometimes you just might find / you get what you need." -- Rolling Stones |
Grant (SSSF) Send message Joined: 19 Aug 99 Posts: 13720 Credit: 208,696,464 RAC: 304 |
I have returned to running 8 MB at a time (0.125 GPU for GPU MB) Which will result in very low throughput compared to running 2 at a time. When I had my GTX 460/560Ti I tried running 1, 2 or 3 WUs at a time. 2 at a time gave me the most WUs per hour processed. When I got my GTX 750Tis I did the same thing, 1, 2 or 3 at a time. The result was the same: 2 at a time gave the most WUs per hour processed. Although with MBv7, 2 vs 3 at a time was very close - longer running WUs gave much better throughput per hour running 3 at a time, but shorties actually gave much worse throughput - end result, 2 at a time was the winner. If the autocorrelations could be significantly optimised on the GPU then 3 at a time would be possible, with even greater throughput per hour than is possible now with the longer running WUs. Grant Darwin NT |
petri33 Send message Joined: 6 Jun 02 Posts: 1668 Credit: 623,086,772 RAC: 156 |
I have returned to running 8 MB at a time (0.125 GPU for GPU MB) I know, I have tried. I compiled my own versions. Tweaked some settings (nv sched yield/sleep/hot). Running on Linux. Tried with different kernel sizes. With different thread counts. Tried with __ldg(&). Tried with unroll N or __restrict__ (google that). Tried and tried. And got tired. Now running 8 at a time. Just because of my stubbornness. p.s. And please divide the runtimes according to the number of running tasks. To overcome Heisenbergs: "You can't always get what you want / but if you try sometimes you just might find / you get what you need." -- Rolling Stones |
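The postscript above points at the only way to make run times and APR comparable across hosts: scale by the number of concurrent tasks. A minimal sketch of that normalization, assuming elapsed time scales roughly linearly with instance count (which, per the GTX 980 figures earlier in the thread, is only an approximation; the APR values below are made-up numbers for illustration):

```python
def effective_runtime(elapsed_minutes, concurrent_tasks):
    """Approximate single-instance run time from a multi-instance run."""
    return elapsed_minutes / concurrent_tasks

def per_gpu_apr(reported_apr, concurrent_tasks):
    """Scale a reported APR back to a whole-GPU figure: n concurrent
    instances roughly divide the per-task rate by n."""
    return reported_apr * concurrent_tasks

# Hypothetical comparison: host A runs 1 task per GPU, host B runs 2.
host_a = per_gpu_apr(60.0, 1)  # APR 60 at a single instance -> 60 per GPU
host_b = per_gpu_apr(35.0, 2)  # APR 35 at two instances -> 70 per GPU
```

On raw APR host A looks faster; normalized to the whole GPU, host B is actually ahead, which is why the instance count has to be known before APR says anything about the hardware.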
©2024 University of California
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.