the latest on release of AP_v7?

Message boards : Number crunching : the latest on release of AP_v7?
Profile Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 6325
Credit: 106,370,077
RAC: 121
Russia
Message 1574186 - Posted: 19 Sep 2014, 5:50:11 UTC - in response to Message 1574103.  

With AP7 we will get SIMD CPU builds released as stock as well, so the efficiency of AP7 crunching on stock should improve a lot. Nevertheless, the relative GPU crunching efficiency for AP remains considerably higher, even versus SIMD-optimized CPU builds. Actually it will even increase, because the gain from the new blanking approach is considerably bigger for GPU.

All this leaves untouched, or even strengthens, all the conclusions I made about effective use of computational resources here:
http://lunatics.kwsn.net/2-windows/what-is-best-hardware-for-what-seti-application.0.html

Use CPU and NV/iGPU hardware for MultiBeam processing; use ATi for AstroPulse processing.
It will not result in the biggest RAC for the host, but it will make the host most useful to the SETI project.
ID: 1574186 · Report as offensive
Profile Sutaru Tsureku
Volunteer tester

Joined: 6 Apr 07
Posts: 7105
Credit: 147,663,825
RAC: 5
Germany
Message 1574949 - Posted: 20 Sep 2014, 9:08:02 UTC
Last modified: 20 Sep 2014, 9:13:55 UTC

AFAIK ...
Currently APv6 is paid more per hour than SETIv7, because the stock APv6 CPU app is not optimized (it does not use the CPU instruction sets), so its calculation time is very long.
The stock CPU apps are the reference points for calculating the credits per project task.

The "upcoming APv7" CPU apps use the CPU instruction sets (SSE, SSE2, SSE3, depending on the OS).
So the calculation of a project task will be faster (and the reference point will be faster too).

Because of this, after some time there will be fewer credits per AP project task.
Am I correct, or wrong?

After some time the credits per hour for AP will be lower than for SETI.
Am I correct, or wrong?

Then everyone will want only SETI project tasks, and no one will crunch AP. ;-)
Am I correct, or wrong? :o)
ID: 1574949 · Report as offensive
Richard Haselgrove Project Donor
Volunteer tester

Joined: 4 Jul 99
Posts: 14650
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1574955 - Posted: 20 Sep 2014, 9:22:47 UTC - in response to Message 1574949.  

It will be fascinating to wait and watch, and to see what happens and what people do.
ID: 1574955 · Report as offensive
Claggy
Volunteer tester

Joined: 5 Jul 99
Posts: 4654
Credit: 47,537,079
RAC: 4
United Kingdom
Message 1574958 - Posted: 20 Sep 2014, 9:28:46 UTC - in response to Message 1574955.  
Last modified: 20 Sep 2014, 9:31:04 UTC

The "upcoming APv7" CPU apps use the CPU instruction sets (SSE, SSE2, SSE3, depending on the OS).
So the calculation of a project task will be faster (and the reference point will be faster too).

As well as being faster, the reference points will also be slower, because there are still Stock code base AP apps,
and at least on Windows the Stock 7.00 app is a lot slower than the Stock 6.01 app.

Claggy
ID: 1574958 · Report as offensive
Profile Sutaru Tsureku
Volunteer tester

Joined: 6 Apr 07
Posts: 7105
Credit: 147,663,825
RAC: 5
Germany
Message 1574964 - Posted: 20 Sep 2014, 9:43:14 UTC

I (or we) don't know which app Eric will choose as the reference point.

For 'Mac OS X' (32-bit), only the 'non CPU instruction set' app is available.

For other OSs, 'CPU instruction set' apps are also available.

Maybe the mix across all available SETI hosts (whichever apps they use) will speed up the AP CPU reference point (I guess there are not many hosts around that will crunch only with the 'non CPU instruction set' app, i.e. CPUs that are too old) ... -> fewer credits per AP task.

Am I correct, or wrong?
ID: 1574964 · Report as offensive
TBar
Volunteer tester

Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1574965 - Posted: 20 Sep 2014, 9:48:12 UTC - in response to Message 1574949.  

You seem to have forgotten about AstroPulse v7 v7.00. Many have complained about it. Many have been aborted. On a Windows machine that used to complete the Stock AP App in around 20 hours, this App takes around 60 hours. I don't think I ever did complete one. I did complete a couple in Linux, which is much faster on the same machine:
Run time: 1 days 6 hours 21 min 38 sec
I have a couple in Windows standing by; it took 20 hours to reach 30% complete. I've given up on them: they will be finished with AstroPulse v7 v7.03 (sse) or I will abort them. 60 hours on a machine that can complete the AstroPulse v7 v7.03 (sse) version in 9 hours is ridiculous.

If AstroPulse v7 v7.00 is used as the base, look for credits to increase. It is clearly Much slower than the current AP Base CPU App.
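
For anyone who wants to put the run times quoted in this post side by side, here is a minimal sketch of the arithmetic. It only uses the figures given above; the variable names are made up for illustration, and the projection assumes progress is roughly linear, which may not hold for a real task.

```python
# Run times quoted in the post (hours). Variable names are illustrative only.
stock_v6_windows = 20.0      # old Stock AP app, Windows
v700_windows = 60.0          # AstroPulse v7 v7.00, Windows (approximate)
v703_sse_windows = 9.0       # AstroPulse v7 v7.03 (sse), same machine
v700_linux = 24 + 6 + 21 / 60 + 38 / 3600   # 1 day 6 h 21 min 38 s

# How much slower v7.00 is than the old stock app, and than the sse build.
slowdown_vs_v6 = v700_windows / stock_v6_windows
slowdown_vs_sse = v700_windows / v703_sse_windows

# ASSUMPTION: linear progress. 20 hours for 30% implies this total run time.
projected_hours = 20.0 / 0.30

print(f"v7.00 vs stock v6:  {slowdown_vs_v6:.1f}x slower")
print(f"v7.00 vs v7.03 sse: {slowdown_vs_sse:.1f}x slower")
print(f"v7.00 on Linux:     {v700_linux:.1f} h")
print(f"projected Windows run from 30% at 20 h: {projected_hours:.1f} h")
```

The projected figure lands a bit above the "around 60 hours" estimate, which is consistent with it being a rough number.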
ID: 1574965 · Report as offensive
Josef W. Segur
Volunteer developer
Volunteer tester

Joined: 30 Oct 99
Posts: 4504
Credit: 1,414,761
RAC: 0
United States
Message 1575074 - Posted: 20 Sep 2014, 16:46:38 UTC

The credits for AP v7 work at Beta are roughly equal to the credits for AP v6 work here. They may be a little lower, but not by a huge amount.

Jason's analysis indicates that the effective reference becomes whichever CPU app version produces APRs that most exceed the host's Whetstone benchmark. If so, the generic Windows CPU build being terribly slow won't matter long term, except for those running hardware which can't use the 32 bit SSE or 64 bit SSE2 version.
Joe
ID: 1575074 · Report as offensive
qbit
Volunteer tester
Joined: 19 Sep 04
Posts: 630
Credit: 6,868,528
RAC: 0
Austria
Message 1575078 - Posted: 20 Sep 2014, 16:54:28 UTC

Folks, how about commandline for Nvidia CL? Shall we use the same as for V6 or are there any differences?
ID: 1575078 · Report as offensive
Profile Zalster Special Project $250 donor
Volunteer tester
Joined: 27 May 99
Posts: 5517
Credit: 528,817,460
RAC: 242
United States
Message 1575085 - Posted: 20 Sep 2014, 17:23:59 UTC - in response to Message 1575078.  

I wouldn't mess with those just yet. The new builds seem pretty fast. The only effect I saw from the command line was a decrease in CPU usage, at the cost of a big increase in time to completion.
ID: 1575085 · Report as offensive
qbit
Volunteer tester
Joined: 19 Sep 04
Posts: 630
Credit: 6,868,528
RAC: 0
Austria
Message 1575091 - Posted: 20 Sep 2014, 17:44:16 UTC

With my V6 commandline I don't see much improvement in speed. So you would recommend running it without a commandline? I guess I should try, but I'm afraid it could interfere with my vLHC CPU task again when CPU usage goes up.
ID: 1575091 · Report as offensive
Profile Zalster Special Project $250 donor
Volunteer tester
Joined: 27 May 99
Posts: 5517
Credit: 528,817,460
RAC: 242
United States
Message 1575096 - Posted: 20 Sep 2014, 17:56:35 UTC - in response to Message 1575091.  

NX,

With the v6 you should use the commandline as it will help with the processing of the work unit. I was referring to the use of Commandline in v7 as it currently is. The new v7 apps appear to be much faster without them.

Zalster
ID: 1575096 · Report as offensive
qbit
Volunteer tester
Joined: 19 Sep 04
Posts: 630
Credit: 6,868,528
RAC: 0
Austria
Message 1575100 - Posted: 20 Sep 2014, 18:01:49 UTC

Zalster, I know; I was talking about V7 too. I have been running it since yesterday.
ID: 1575100 · Report as offensive
Profile Zalster Special Project $250 donor
Volunteer tester
Joined: 27 May 99
Posts: 5517
Credit: 528,817,460
RAC: 242
United States
Message 1575123 - Posted: 20 Sep 2014, 18:50:47 UTC - in response to Message 1575100.  

NX,

If you want to maximize your GPU utilization, then running 2 APs at a time on your 750 is what you want. If you want to decrease the usage of your CPU while maximizing your GPU, then use the Command lines. If you just want to blaze through the APs without regard to CPU usage, then do 2 APs at a time on the GPU without the Commandline, as they will be done quicker than with it. However, I find that it doesn't matter how fast you get the AP done, as I'm still waiting 1-2 days for my wingman to validate those results. So, to save CPU life and decrease heat, I use the commandline, thereby increasing the time to complete. I can't give you a yes or no. It depends on what your goals are. Hope this helps.

Zalster
ID: 1575123 · Report as offensive
qbit
Volunteer tester
Joined: 19 Sep 04
Posts: 630
Credit: 6,868,528
RAC: 0
Austria
Message 1575133 - Posted: 20 Sep 2014, 19:20:53 UTC

Thx Zalster, but what I really want to know is if the commandline which I used on V6

-unroll 10 -ffa_block 8192 -ffa_block_fetch 4096 -tune 1 64 4 1 -use_sleep


is also ok for V7 or if I should change anything.

Anyway, atm I run some tasks without commandline to see if there is any difference in speed.
ID: 1575133 · Report as offensive
TBar
Volunteer tester

Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1575157 - Posted: 20 Sep 2014, 19:41:10 UTC - in response to Message 1575123.  
Last modified: 20 Sep 2014, 20:08:37 UTC

So, did you ever test 1 APv7 running at 100% against 2 APv7 running at 100%? I think you will find there isn't any advantage. With some lower end cards it is a disadvantage. There is nothing magical about running multiple instances, the whole concept was derived because the Multibeam tasks can Not be adjusted to run at 100%. The Only way you can increase the GPU load on a Multibeam task is to run Multiple instances.
AstroPulses ARE NOT Multibeams.
You CAN adjust a single AP instance to run at 100% using the CMDline settings, therefore running Multiple AP instances has No advantage, except with v6 Blanked tasks, where the GPU spends time waiting on the CPU to work the Blanking. AP v7 will Not have the problem with Blanking that APv6 has, so Multiple tasks with APv7 should have No Advantage at all. The reason you don't need CMDlines when running 2 instances is that running 2 instances will by itself raise the GPU load to 100%, so CMDlines are not needed to raise the load to 100% as they are with 1 task.

Running the GPU at 100% is what matters. Adding tasks to a card already running at 100% will not make it run at 200%, or even 101%.
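
To make the throughput argument concrete, here is a minimal sketch. The timings in it are hypothetical placeholders, not measurements from this thread, and `tasks_per_hour` is an illustrative helper, not part of any SETI@home or BOINC tool. It assumes that when the GPU is already fully loaded, two concurrent tasks take about twice as long as one.

```python
def tasks_per_hour(concurrent: int, minutes_per_batch: float) -> float:
    """Throughput when `concurrent` tasks start together and all finish
    after `minutes_per_batch` minutes."""
    return concurrent * 60.0 / minutes_per_batch


# HYPOTHETICAL timings, for illustration only:
# one task alone on a fully loaded GPU takes 30 minutes,
# two tasks at once take 61 minutes (slightly more than double).
single = tasks_per_hour(1, 30.0)
double = tasks_per_hour(2, 61.0)

print(f"1 at a time: {single:.2f} tasks/hour")
print(f"2 at a time: {double:.2f} tasks/hour")
```

With numbers like these, running two at a time completes no more work per hour than running one. The only way to know for a given card is to measure both cases.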

What will be interesting is how many CPU cores will be needed with APv7 since Blanking is not an issue. I can run three ATI cards on two CPUs with APv7 but when I try 3 ATI cards with One CPU I see the GPU Load drop in SIV. This is similar to what I see with Unblanked APv6 tasks, 3 APs with 2 cores works...on my machines.

...should change anything.

You'll have to try things while looking at programs such as GPU-Z and SIV to see how your system responds. Every system responds a little differently, you'll have to test it yourself.
ID: 1575157 · Report as offensive
Profile Zalster Special Project $250 donor
Volunteer tester
Joined: 27 May 99
Posts: 5517
Credit: 528,817,460
RAC: 242
United States
Message 1575215 - Posted: 20 Sep 2014, 21:11:34 UTC - in response to Message 1575157.  
Last modified: 20 Sep 2014, 21:16:39 UTC

When I first started with those v7 APs I did test 1 vs 2 with/without commandlines. You are right, running 1 vs 2 didn't really make much of a difference. The time to complete differed by only 1-2 minutes, within the +/- of the average time when you doubled it. It was worse with the command line: time to complete went up by about 8-10 minutes for 2 APs.

AP v7 will Not have the problem with Blanking that APv6 has, so Multiple tasks with APv7 should have No Advantage at all. The reason you don't need CMDlines when running 2 instances is that running 2 instances will by itself raise the GPU load to 100%, so CMDlines are not needed to raise the load to 100% as they are with 1 task.


That is what I saw as well. There were only 2 reasons for using the CMDlines that I found. 1. Decrease CPU usage and decrease heat on the CPU. 2....<CreditScrew> I have already had this conversation a few times. If you want me to talk about it I will, but we've been over this ground in several threads.

Why haven't I tested that 750 again... Meltdown on #1 Cruncher. MoBo and GPU #1 fried about 2 hours ago (it wasn't the PSU like I first thought). That, and the Time Capsule bit the dust. You can see my post over on Beta about that... Sorry, been busy.
ID: 1575215 · Report as offensive
TBar
Volunteer tester

Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1575241 - Posted: 20 Sep 2014, 21:46:35 UTC - in response to Message 1575215.  

Ouch.
Well, my theory is that when you run 2 instances with the same CMDline settings you use to raise a single task to 100%, you are overloading the system. It's just a theory, but it seems to agree with your experience.

My sympathies for your loss.
ID: 1575241 · Report as offensive
Profile Zalster Special Project $250 donor
Volunteer tester
Joined: 27 May 99
Posts: 5517
Credit: 528,817,460
RAC: 242
United States
Message 1575257 - Posted: 20 Sep 2014, 22:07:57 UTC - in response to Message 1575241.  
Last modified: 20 Sep 2014, 22:09:25 UTC

It's ok, I just keep remembering this line:

https://www.youtube.com/watch?v=wRxHYHPzs7s

edit...
good excuse to see what the GTX 980s can do...
ID: 1575257 · Report as offensive
TBar
Volunteer tester

Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1575366 - Posted: 21 Sep 2014, 2:52:06 UTC - in response to Message 1575100.  
Last modified: 21 Sep 2014, 2:55:02 UTC

Zalster, I know, I was talking about V7 also, I already run it since yesterday.

Well, the only place you're supposed to Run V7 is at Beta, using the V7 Tasks. AP V7 Results will be different using V6 tasks and shouldn't Validate. You should Run V6 Apps with V6 tasks, not AstroPulse v7 Windows x86 rev 2690.
ID: 1575366 · Report as offensive
Profile Sutaru Tsureku
Volunteer tester

Joined: 6 Apr 07
Posts: 7105
Credit: 147,663,825
RAC: 5
Germany
Message 1575379 - Posted: 21 Sep 2014, 4:02:23 UTC - in response to Message 1575133.  
Last modified: 21 Sep 2014, 4:02:59 UTC

NX-01 wrote:
Thx Zalster, but what I really want to know is if the commandline which I used on V6

-unroll 10 -ffa_block 8192 -ffa_block_fetch 4096 -tune 1 64 4 1 -use_sleep


is also ok for V7 or if I should change anything.

Anyway, atm I run some tasks without commandline to see if there is any difference in speed.

AFAIK, Raistmer's 'rule of thumb' is to take the -ffa_block & -ffa_block_fetch values used for APv6 and divide them by 2 for APv7.

Example for your APv7:
-unroll 10 -ffa_block 4096 -ffa_block_fetch 2048 -tune 1 64 4 1 -use_sleep


But as always, if you want to find and use the best/fastest params, you should do benchmark test runs.
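
The rule of thumb above can be sketched as a small helper that rewrites a V6 commandline. This is only an illustration: `halve_ffa_params` is a hypothetical name, not part of any AstroPulse tool, and the halved values are a starting point for benchmarking, not tuned settings.

```python
def halve_ffa_params(cmdline: str) -> str:
    """Return cmdline with the values after -ffa_block and
    -ffa_block_fetch divided by 2; all other tokens are left unchanged."""
    to_halve = {"-ffa_block", "-ffa_block_fetch"}
    tokens = cmdline.split()
    out = []
    i = 0
    while i < len(tokens):
        tok = tokens[i]
        out.append(tok)
        if tok in to_halve and i + 1 < len(tokens):
            # The next token is the numeric value of this switch.
            out.append(str(int(tokens[i + 1]) // 2))
            i += 2
        else:
            i += 1
    return " ".join(out)


v6 = "-unroll 10 -ffa_block 8192 -ffa_block_fetch 4096 -tune 1 64 4 1 -use_sleep"
v7 = halve_ffa_params(v6)
print(v7)
```

For the V6 commandline quoted above, this reproduces the APv7 example line given in this post.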
ID: 1575379 · Report as offensive


 