Public beta for nVidia AstroPulse, rev 521

Profile Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 6325
Credit: 106,370,077
RAC: 121
Russia
Message 1128927 - Posted: 17 Jul 2011, 20:48:06 UTC
Last modified: 17 Jul 2011, 20:52:10 UTC

The alpha and limited beta on Lunatics were successful, so I'm starting a public beta test of this application.
The binary can be downloaded here: http://files.mail.ru/W3B0CG (as usual, running antivirus checks is your own responsibility; just a reminder).

What is already known and what requires further testing:

1) The app can silently fail, reporting overflow (30 signals found) in single pulses.
Finding 30 single pulses without 30 repetitive ones is highly unlikely and can be a sign of problems. So far, reducing the unroll factor from the default solves this issue (all affected hosts can produce valid tasks with a correctly chosen unroll factor).
What the correct value is remains a task for you, the beta tester, to find.

2) Two hosts have already reported greatly increased CPU consumption when running with 27x.xx drivers. Rolling back to 26x.xx drivers solved the issue in both cases.
But so far both of these hosts have dual NV/ATi GPUs.
Your goal as a beta tester is to check whether this behavior is common or not.

To get familiar with the app's options and behavior, I strongly recommend looking through the ATi AP release notes and the corresponding threads. The NV and ATi apps have much more in common than not (in fact, the ATi version can be run on NV GPUs in many cases).

Here is a short excerpt:

There are command line switches that can be used for app performance tuning:

-unroll N - sets the DATA_CHUNK_UNROLL variable to N. This lets the app process N data chunks per FindSinglePulse kernel call, improving performance (in most cases) but increasing GPU memory requirements. On low-end GPUs it may be worth using lower values. The default is 10.
-ffa_block 8192 (default value) - defines how many different periods the GPU will process per single kernel call
-ffa_block_fetch 2048 (default value) - defines how many threads will be used in the FFA initial fetch kernel
Rules for using these values:
-ffa_block_fetch <number> can be used only if -ffa_block <number> is already listed on the command line
The numbers should be even, preferably powers of 2, and ffa_block should be divisible by ffa_block_fetch.
If you experience lags during application execution, try decreasing these values.
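
The rules above can be sketched as a small checker. This is a minimal illustration in Python, not part of the app: the function names is_power_of_two and check_ffa_settings are made up for this example, and only the defaults (8192 and 2048) and the rules themselves come from the excerpt.

```python
def is_power_of_two(n):
    """True for 1, 2, 4, 8, ... (bit trick: a power of two has a single set bit)."""
    return n > 0 and (n & (n - 1)) == 0


def check_ffa_settings(ffa_block=8192, ffa_block_fetch=2048):
    """Check -ffa_block / -ffa_block_fetch against the rules in the excerpt.

    Returns a list of messages. Lines starting with "error" break a stated
    rule; lines starting with "note" only miss the power-of-2 preference.
    """
    messages = []
    for name, value in (("ffa_block", ffa_block),
                        ("ffa_block_fetch", ffa_block_fetch)):
        if value <= 0 or value % 2 != 0:
            messages.append(f"error: {name}={value} should be a positive even number")
        elif not is_power_of_two(value):
            messages.append(f"note: {name}={value} is even but not a power of 2")
    # ffa_block should be divisible by ffa_block_fetch.
    if ffa_block_fetch > 0 and ffa_block % ffa_block_fetch != 0:
        messages.append(
            f"error: ffa_block={ffa_block} is not divisible by "
            f"ffa_block_fetch={ffa_block_fetch}"
        )
    return messages


if __name__ == "__main__":
    for line in check_ffa_settings(6144, 1536):
        print(line)
```

With the defaults the list comes back empty; a pair such as 6144 and 1536 produces no errors, only notes that the values are not powers of 2.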

-instances_per_device <N> allows running N app instances per single device
-instances_per_device 1 (default value)

Once again, keep in mind: to run 2 instances on a single GPU, use the -instances_per_device 2 parameter and set <count>0.5</count> inside the app_info.xml file.
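
As a rough consistency check for the two settings above, the following Python sketch compares the -instances_per_device value in a cmdline with the <count> value. It assumes BOINC will try to start as many tasks on one GPU as fit with their count values summing to at most 1; that rule and the helper names are assumptions for illustration, not taken from the app or from BOINC sources.

```python
import re
from fractions import Fraction


def instances_from_cmdline(cmdline):
    """Return the -instances_per_device value, or 1 (the documented default)."""
    match = re.search(r"-instances_per_device\s+(\d+)", cmdline)
    return int(match.group(1)) if match else 1


def tasks_per_gpu(count):
    """How many tasks with this <count> fit on one GPU (assumed rule: sum <= 1).

    count is passed as a string such as "0.5" so the arithmetic stays exact.
    """
    return int(1 / Fraction(count))


def settings_are_consistent(cmdline, count):
    """True if the app allows at least as many instances as BOINC may start."""
    return instances_from_cmdline(cmdline) >= tasks_per_gpu(count)


if __name__ == "__main__":
    print(settings_are_consistent("-instances_per_device 2 -hp", "0.5"))
    print(settings_are_consistent("-instances_per_device 1 -hp -unroll 10", "0.5"))
```

The second call models the mistake of setting <count>0.5</count> while leaving -instances_per_device at 1, which this check flags as inconsistent.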

-hp instructs the application to raise its priority class to high. Useful when the host is under high non-BOINC load. Also useful if the BOINC client itself imposes a high load on the CPU.

-no_cpu_lock - disables affinity setting

Example of an app_info section (this one is known to work on my own GSO9600):

<app>
    <name>astropulse_v505</name>
</app>
<file_info>
    <name>ap_5.06_win_x86_SSE3_OpenCL_NV_r521.exe</name>
    <executable/>
</file_info>
<file_info>
    <name>AstroPulse_Kernels_r521.cl</name>
    <executable/>
</file_info>
<app_version>
    <app_name>astropulse_v505</app_name>
    <version_num>505</version_num>
    <platform>windows_intelx86</platform>
    <avg_ncpus>0.04</avg_ncpus>
    <max_ncpus>0.20</max_ncpus>
    <plan_class>cuda</plan_class>
    <cmdline>-instances_per_device 1 -hp -unroll 10</cmdline>
    <flops>30987654321</flops>
    <file_ref>
        <file_name>ap_5.06_win_x86_SSE3_OpenCL_NV_r521.exe</file_name>
        <main_program/>
    </file_ref>
    <file_ref>
        <file_name>AstroPulse_Kernels_r521.cl</file_name>
        <copy_file/>
    </file_ref>
    <coproc>
        <type>CUDA</type>
        <count>1</count>
    </coproc>
</app_version>


Please report any findings about this app in this thread.
ID: 1128927 · Report as offensive
Claggy
Volunteer tester

Joined: 5 Jul 99
Posts: 4654
Credit: 47,537,079
RAC: 4
United Kingdom
Message 1128947 - Posted: 17 Jul 2011, 21:24:13 UTC - in response to Message 1128927.  
Last modified: 17 Jul 2011, 21:28:41 UTC

Anyone with dual-vendor hosts, i.e. with Nvidia & ATI GPUs, who wants to run OpenCL apps on Nvidia and ATI devices at the same time needs to set -instances_per_device to 2 for all OpenCL apps; otherwise only one OpenCL app will run at once,
i.e. BOINC will start one app on each device, but progress will only occur on one of them. The count value should be left as it is.

Claggy
ID: 1128947 · Report as offensive
Profile TRuEQ & TuVaLu
Volunteer tester
Joined: 4 Oct 99
Posts: 505
Credit: 69,523,653
RAC: 10
Sweden
Message 1128949 - Posted: 17 Jul 2011, 21:25:26 UTC

Will there be support for CPUs other than SSE3 in future releases?
TRuEQ & TuVaLu
ID: 1128949 · Report as offensive
Profile Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 6325
Credit: 106,370,077
RAC: 121
Russia
Message 1128954 - Posted: 17 Jul 2011, 21:29:16 UTC - in response to Message 1128949.  

Will there be support for CPUs other than SSE3 in future releases?

It's a GPU app, so I don't think it's very important, unless you have a CPU with lesser capabilities.
ID: 1128954 · Report as offensive
Claggy
Volunteer tester

Joined: 5 Jul 99
Posts: 4654
Credit: 47,537,079
RAC: 4
United Kingdom
Message 1128955 - Posted: 17 Jul 2011, 21:31:57 UTC - in response to Message 1128954.  

Will there be support for CPUs other than SSE3 in future releases?

It's a GPU app, so I don't think it's very important, unless you have a CPU with lesser capabilities.

I have an SSE-only XP3200 with an 8400 GS ;-)

Claggy
ID: 1128955 · Report as offensive
Profile TRuEQ & TuVaLu
Volunteer tester
Joined: 4 Oct 99
Posts: 505
Credit: 69,523,653
RAC: 10
Sweden
Message 1128958 - Posted: 17 Jul 2011, 21:33:33 UTC

An older AMD http://setiathome.berkeley.edu/show_host_detail.php?hostid=6031403

I'm not at its location now, so I can't run CPU-Z at the moment...

TRuEQ & TuVaLu
ID: 1128958 · Report as offensive
Profile Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 6325
Credit: 106,370,077
RAC: 121
Russia
Message 1128960 - Posted: 17 Jul 2011, 21:34:05 UTC - in response to Message 1128955.  

Will there be support for CPUs other than SSE3 in future releases?

It's a GPU app, so I don't think it's very important, unless you have a CPU with lesser capabilities.

I have an SSE-only XP3200 with an 8400 GS ;-)

Claggy


When you are able to run the OpenCL runtime there, we will talk again ;)
ID: 1128960 · Report as offensive
Profile TRuEQ & TuVaLu
Volunteer tester
Joined: 4 Oct 99
Posts: 505
Credit: 69,523,653
RAC: 10
Sweden
Message 1128961 - Posted: 17 Jul 2011, 21:35:10 UTC

Nice Claggy :)

Then I'll try to get it going on my AMD this week.

//TQ
TRuEQ & TuVaLu
ID: 1128961 · Report as offensive
Profile Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 6325
Credit: 106,370,077
RAC: 121
Russia
Message 1128962 - Posted: 17 Jul 2011, 21:35:25 UTC - in response to Message 1128958.  

An older AMD http://setiathome.berkeley.edu/show_host_detail.php?hostid=6031403

I'm not at its location now, so I can't run CPU-Z at the moment...


AMD Athlon64X2 is SSE3 capable.
ID: 1128962 · Report as offensive
JohnDK Crowdfunding Project Donor*Special Project $250 donor
Volunteer tester
Joined: 28 May 00
Posts: 1222
Credit: 451,243,443
RAC: 1,127
Denmark
Message 1128963 - Posted: 17 Jul 2011, 21:36:20 UTC

Well I'm trying it. Wonder if anyone can say beforehand that it's a good idea to change something in the command line when using a GTX460? I've just copied the app_info as supplied by Raistmer as is.
ID: 1128963 · Report as offensive
Profile perryjay
Volunteer tester
Joined: 20 Aug 02
Posts: 3377
Credit: 20,676,751
RAC: 0
United States
Message 1128964 - Posted: 17 Jul 2011, 21:38:21 UTC - in response to Message 1128949.  

One word of warning: be prepared for very high time-to-completion estimates and tasks running at high priority until the DCF adjusts.

Another little thing: if you try to run more than one per GPU but forget to change the instances per device to two (or more), a second one will try to start but will only count up the elapsed time without actually doing any work.


PROUD MEMBER OF Team Starfire World BOINC
ID: 1128964 · Report as offensive
Jamie
Volunteer tester

Joined: 5 Apr 06
Posts: 162
Credit: 9,867,955
RAC: 0
United Kingdom
Message 1128968 - Posted: 17 Jul 2011, 21:39:32 UTC - in response to Message 1128963.  

Here are my cmdline options for my 465:

<cmdline>-ffa_block 6144 -ffa_block_fetch 1536 -hp</cmdline>

Been working fine like this for a few weeks now
ID: 1128968 · Report as offensive
Claggy
Volunteer tester

Joined: 5 Jul 99
Posts: 4654
Credit: 47,537,079
RAC: 4
United Kingdom
Message 1128970 - Posted: 17 Jul 2011, 21:41:54 UTC - in response to Message 1128963.  
Last modified: 17 Jul 2011, 21:44:41 UTC

Well I'm trying it. Wonder if anyone can say beforehand that it's a good idea to change something in the command line when using a GTX460? I've just copied the app_info as supplied by Raistmer as is.

I've been running basically the same as Raistmer's app_info, but without the -hp switch (on my GTX460) and with -ffa_block 6144 -ffa_block_fetch 1536 set on both ATI & Nvidia GPUs.

Claggy
ID: 1128970 · Report as offensive
JohnDK Crowdfunding Project Donor*Special Project $250 donor
Volunteer tester
Joined: 28 May 00
Posts: 1222
Credit: 451,243,443
RAC: 1,127
Denmark
Message 1128971 - Posted: 17 Jul 2011, 21:42:18 UTC

I'm going to give 1 AP the whole GPU to see how it works; if it goes well, I plan to try 1 AP alongside 1 or 2 MBs.
ID: 1128971 · Report as offensive
Profile perryjay
Volunteer tester
Joined: 20 Aug 02
Posts: 3377
Credit: 20,676,751
RAC: 0
United States
Message 1128977 - Posted: 17 Jul 2011, 21:50:17 UTC - in response to Message 1128963.  
Last modified: 17 Jul 2011, 22:08:06 UTC

Well I'm trying it. Wonder if anyone can say beforehand that it's a good idea to change something in the command line when using a GTX460? I've just copied the app_info as supplied by Raistmer as is.



I'm pretty much running just what Raistmer gave on my GTS 450, but I have changed the count to .51 and the number per device to 1 for the AP, and the count to .49 for the MB. That way I will run two MBs or one MB and one AP, but it won't try to start a second AP. I also changed the unroll to four because of some trouble I was having, but I think it was more my overclock than an app problem. I just haven't changed it back to the default ten yet.

Edit: I was just informed I'm an idiot! :-) No, just that the default is actually ten on the unroll. The app I was given early was set at six and I didn't look to see what Raistmer gave out.

edit 2: Ok, I'm back to the default ten. Now to wait for an AP to start to see how it does. That could have been slowing me down some.
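
The arithmetic behind the .51/.49 trick above can be checked with a short Python sketch. It assumes BOINC only runs tasks together on a GPU when their count values sum to at most 1; the COUNTS table and the allowed_pairs helper are hypothetical names for this illustration.

```python
from fractions import Fraction
from itertools import combinations_with_replacement

# <count> values from the post: .51 for AstroPulse (AP), .49 for multibeam (MB).
# Fractions built from strings keep the sums exact (no float rounding).
COUNTS = {"AP": Fraction("0.51"), "MB": Fraction("0.49")}


def allowed_pairs(counts):
    """List the task pairs whose combined count fits on one GPU (sum <= 1)."""
    pairs = []
    for first, second in combinations_with_replacement(sorted(counts), 2):
        if counts[first] + counts[second] <= 1:
            pairs.append((first, second))
    return pairs


if __name__ == "__main__":
    for pair in allowed_pairs(COUNTS):
        print(pair)
```

Under that assumption AP+MB (1.00) and MB+MB (0.98) fit, while AP+AP (1.02) does not, which matches the behavior described in the post.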


PROUD MEMBER OF Team Starfire World BOINC
ID: 1128977 · Report as offensive
halfempty
Joined: 2 Jun 99
Posts: 97
Credit: 35,236,901
RAC: 114
United States
Message 1128991 - Posted: 17 Jul 2011, 22:29:12 UTC

Just installed it on my GTS 450. I changed the unroll to 8 because I think the 450 has 4 processor clusters, so I wanted to make unroll a multiple of that. When I ran an ATi I had to set -ffa_block 6144 -ffa_block_fetch 1536 for screen responsiveness, but I'll wait on that one till it starts crunching. That may be several days, the way AP units are being rationed these days.
ID: 1128991 · Report as offensive
JohnDK Crowdfunding Project Donor*Special Project $250 donor
Volunteer tester
Joined: 28 May 00
Posts: 1222
Credit: 451,243,443
RAC: 1,127
Denmark
Message 1128997 - Posted: 17 Jul 2011, 22:43:03 UTC - in response to Message 1128991.  

That may be several days the way AP units are being rationed these days.

I've disabled CPU & MB work so I'm only requesting AP for the GPU, but so far no APs. They're in demand since it seems there's no MB work being split at the moment.
ID: 1128997 · Report as offensive
Profile perryjay
Volunteer tester
Joined: 20 Aug 02
Posts: 3377
Credit: 20,676,751
RAC: 0
United States
Message 1129014 - Posted: 17 Jul 2011, 23:37:33 UTC - in response to Message 1128991.  

Just installed it on my GTS 450. I changed the unroll to 8 because I think the 450 has 4 processor clusters, so I wanted to make unroll a multiple of that. When I ran an ATi I had to set -ffa_block 6144 -ffa_block_fetch 1536 for screen responsiveness, but I'll wait on that one till it starts crunching. That may be several days, the way AP units are being rationed these days.



I thought about what you said about unroll set to 8, so I decided to try it. The AP task still ran about the same, but I noticed the MB task running with it on my GTS 450 suddenly slowed way down. When I checked, its time-to-completion guesstimate was double what it had been. I stopped and went back up to 10, and now the MB task is running much faster again.



PROUD MEMBER OF Team Starfire World BOINC
ID: 1129014 · Report as offensive
Profile Keith Myers Special Project $250 donor
Volunteer tester
Joined: 29 Apr 01
Posts: 13161
Credit: 1,160,866,277
RAC: 1,873
United States
Message 1129023 - Posted: 18 Jul 2011, 0:44:29 UTC

Please educate me. Do you just drop the files into the Seti project folder and modify the app_info? Do you have to do anything with the stock AK_V8 AP app, like delete the files or remove the relevant section from the app_info? How does the Manager know which application to schedule when using the CUDA OpenCL app?
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 1129023 · Report as offensive
Profile perryjay
Volunteer tester
Joined: 20 Aug 02
Posts: 3377
Credit: 20,676,751
RAC: 0
United States
Message 1129045 - Posted: 18 Jul 2011, 2:18:42 UTC - in response to Message 1129023.  

Sorry it took so long to answer. Yes, drop the files in the SETI projects folder, then copy and paste the app_info section into your app_info file. If you already have regular CPU APs, or if you still want to run them, then leave the existing app section as it is. If you don't have and/or do not want to run any more CPU APs, then you can remove that section. Remember to stop SETI and BOINC before you do anything to your app_info file, and to open it in Notepad. Also remember to save your changes as .xml.

Once you add the new section to your app_info, it will automatically start asking for GPU AP work the same way it asks for the MB CUDA work.
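
Since a typo in app_info.xml is easy to make when pasting by hand, a quick sanity check before restarting BOINC might look like the Python sketch below. It assumes the file has the usual <app_info> root element; the function name and the file path in the usage note are placeholders, not anything BOINC provides.

```python
import xml.etree.ElementTree as ET


def summarize_app_info(xml_text):
    """Parse app_info.xml text and list each app_version entry.

    Raises xml.etree.ElementTree.ParseError if the XML is not well-formed,
    for example when a closing tag was lost during copy and paste.
    """
    root = ET.fromstring(xml_text)
    summary = []
    for version in root.iter("app_version"):
        summary.append((
            version.findtext("app_name", default="?"),
            version.findtext("plan_class", default="(cpu)"),
            version.findtext("cmdline", default=""),
        ))
    return summary


if __name__ == "__main__":
    sample = """<app_info>
      <app_version>
        <app_name>astropulse_v505</app_name>
        <plan_class>cuda</plan_class>
        <cmdline>-instances_per_device 1 -hp -unroll 10</cmdline>
      </app_version>
    </app_info>"""
    for entry in summarize_app_info(sample):
        print(entry)
```

Running summarize_app_info(open("app_info.xml").read()) from the project folder should list the new astropulse_v505 entry alongside any existing ones; a malformed file raises a ParseError instead of a list.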


PROUD MEMBER OF Team Starfire World BOINC
ID: 1129045 · Report as offensive