New ATI OpenCL-based AstroPulse (rev516) released

Profile Gundolf Jahn

Send message
Joined: 19 Sep 00
Posts: 3184
Credit: 446,358
RAC: 0
Germany
Message 1096459 - Posted: 12 Apr 2011, 8:58:24 UTC - in response to Message 1096451.  
Last modified: 12 Apr 2011, 9:01:09 UTC

Any reason why the Pendings from the MB WU's on my rig with this app don't appear in Pendings but the ones from my Dual-Core do?

Quite possibly because your Dual-Core computer is running a much older version of BOINC (v5.10.41). It would be interesting if somebody else could confirm that - it's not been mentioned so far in all the discussions about changes to the pendings list.

Normal behaviour here with 5.10.45 and 5.8.16: pending list is empty though there are pending tasks for both hosts.

Gruß,
Gundolf
Edit: @Blurf: are all pending tasks shown or only very old ones?
ID: 1096459 · Report as offensive
aad

Send message
Joined: 3 Apr 99
Posts: 101
Credit: 204,131,099
RAC: 26
Netherlands
Message 1096477 - Posted: 12 Apr 2011, 11:03:04 UTC

My oldest pending is this one:
http://setiathome.berkeley.edu/result.php?resultid=1763087319

Pending credit: 15,982.68
ID: 1096477 · Report as offensive
Profile dskagcommunity
Volunteer tester
Avatar

Send message
Joined: 24 Feb 11
Posts: 43
Credit: 2,901,049
RAC: 0
Austria
Message 1098126 - Posted: 16 Apr 2011, 18:37:35 UTC
Last modified: 16 Apr 2011, 18:47:32 UTC

I'm a bit of a noob with BOINC and its configuration ^^

But I can't get any work on my work PC for my 4350 card with Cat 11.3 and the SDK (OpenCL). I ticked the option in my preferences to accept other applications, but BOINC says that there is no GPU work for me. Hmm.

Celeron E3200 2*2,4Ghz
2GB RAM
WinXP SP3 32bit
HD4350 512MB
BOINC 6.10.58

In a thread I saw that I possibly need to set something to GPU_MAX_HEAP_SIZE=256,
but where? O.o

Thanks for any help ^^
ID: 1098126 · Report as offensive
terencewee*

Send message
Joined: 10 Oct 09
Posts: 53
Credit: 7,022,510
RAC: 0
Malaysia
Message 1098127 - Posted: 16 Apr 2011, 18:43:16 UTC
Last modified: 16 Apr 2011, 19:12:31 UTC

For a couple of days I left one of my crunchers (TREX: C2Duo E6600g0, 8GB, ATI5850) running 2 instances after updating from r456 to r516. Then I noticed slow reporting/credits on 2 CPU projects (PG + AQUA-fp).

I also noticed that after starting with 2 AP instances, BOINC runs only 1 AP instance some time later.

I took a look at Task Manager: ap_5.06_win_x86_SSE2_OpenCL_ATI_r516.exe is taking ~80 MB RAM and 30-45% CPU.

I'm experiencing high CPU load, so no wonder the CPU projects are slow!

So far, I've tried:
1] Switching -hp on and off
2] Changing -ffa block & fetch sizes
3] Running only 1 instance
4] Initially was using Catalyst APP 11.2, updated to 11.3 - same problem
5] Rebooting
6] Ran Collatz & MilkyWay WU - both ran fine with valid WUs, 0-4% CPU load

Processed AP ran and reported without problem.

The cruncher is running BOINC 6.12.15.
There was 1 validated AP (already removed from DB) and there are 20+ completed AP waiting validation.


I've checked another cruncher, 0-4% CPU load - lots of validated/pending AP. Normal...

Please help me to resolve this problem - the cruncher has got some 60+ AP WUs to process.

Failing that, I will rename r456.exe to r516.exe and put back the old .cl file to trick BOINC into using r456 instead. I don't want to lose these AP WUs. :P

Thank you in advance.

Edit: 7] Installing SDK 2.3 does not help.
ID: 1098127 · Report as offensive
Profile HAL9000
Volunteer tester
Avatar

Send message
Joined: 11 Sep 99
Posts: 6534
Credit: 196,805,888
RAC: 57
United States
Message 1098132 - Posted: 16 Apr 2011, 18:58:57 UTC - in response to Message 1098126.  

I'm a bit of a noob with BOINC and its configuration ^^

But I can't get any work on my work PC for my 4350 card with Cat 11.3 and the SDK (OpenCL). I ticked the option in my preferences to accept other applications, but BOINC says that there is no GPU work for me. Hmm.

Celeron E3200 2*2,4Ghz
2GB RAM
WinXP SP3 32bit
HD4350 512MB
BOINC 6.10.58

In a thread I saw that I possibly need to set something to GPU_MAX_HEAP_SIZE=256,
but where? O.o

Thanks for any help ^^

I think you saw that in the Lunatics release notes. The release notes also say where to change the value.

With AP work you may not get tasks every time you request them.

SETI@home classic workunits: 93,865 CPU time: 863,447 hours
Join the BP6/VP6 User Group: http://tinyurl.com/8y46zvu
ID: 1098132 · Report as offensive
terencewee*

Send message
Joined: 10 Oct 09
Posts: 53
Credit: 7,022,510
RAC: 0
Malaysia
Message 1098133 - Posted: 16 Apr 2011, 19:01:57 UTC - in response to Message 1098126.  
Last modified: 16 Apr 2011, 19:16:06 UTC

I'm a bit of a noob with BOINC and its configuration ^^

But I can't get any work on my work PC for my 4350 card with Cat 11.3 and the SDK (OpenCL). I ticked the option in my preferences to accept other applications, but BOINC says that there is no GPU work for me. Hmm.

Celeron E3200 2*2,4Ghz
2GB RAM
WinXP SP3 32bit
HD4350 512MB
BOINC 6.10.58

In a thread I saw that I possibly need to set something to GPU_MAX_HEAP_SIZE=256,
but where? O.o

Thanks for any help ^^


I think GPU_MAX_HEAP_SIZE is a Windows environment variable. Off the top of my head, in XP: right-click the My Computer icon on your desktop > Properties > Advanced > Environment Variables, then create a new variable in the user variables section.


Also, go through this checklist; it may yield some AP work:
1] Look at the BOINC Manager Event Log
a+ Message: ATI GPU 0: ATI Radeon HD 4xxxx xxx xxx xxx

b+ Message: SETI@home | Found app_info.xml; using anonymous platform

Post your app_info.xml file if you don't see 1b.

2] Set *only* SETI@Home to accept more work

3] Set cache size to 10 days

4] Hit update every 5+mins (or automate it)

5] You'll catch an AP (hopefully), let it process and report

6] You'll start to get more AP units after that.

Remember to set your cache back to 2 days after catching enough AP WUs.
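
Step 4 (hitting update every few minutes) could be automated roughly as below. This is only a sketch and not something from the checklist itself: it assumes the `boinccmd` command-line tool that ships with BOINC is on the PATH and that the project URL matches the one the client is attached with. The function names and the 5-minute interval are illustrative choices.

```python
import subprocess
import time

# Assumption: this is the URL the client is attached to SETI@home with.
PROJECT_URL = "http://setiathome.berkeley.edu/"


def build_update_command(project_url=PROJECT_URL):
    """Return the boinccmd invocation that asks the client to contact the project."""
    return ["boinccmd", "--project", project_url, "update"]


def request_updates(times, interval_seconds=300, run=subprocess.run, sleep=time.sleep):
    """Request a project update `times` times, pausing between requests."""
    for attempt in range(times):
        # check=False: a failed request is simply retried on the next round.
        run(build_update_command(), check=False)
        if attempt < times - 1:
            sleep(interval_seconds)
```

Calling `request_updates(12)` would ask for an update every 5 minutes for roughly an hour. The `run` and `sleep` parameters exist only so the loop can be tried out without a running BOINC client.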

I have 60+ AP WUs each on 2 crunchers.

Best of luck!
ID: 1098133 · Report as offensive
Profile dskagcommunity
Volunteer tester
Avatar

Send message
Joined: 24 Feb 11
Posts: 43
Credit: 2,901,049
RAC: 0
Austria
Message 1098134 - Posted: 16 Apr 2011, 19:15:51 UTC
Last modified: 16 Apr 2011, 19:23:39 UTC

Thanks for the first info.

I know where I can set variables, but... erm, I need to give the variable a name. What is it called? And is the value then this GPU heap size thing?

Second problem: I don't know where to find this app file. I looked in the project folder, the BOINC main directory and the user directory where BOINC keeps its projects.

BOINC finds the graphics card, so point A is OK.
Point B is negative: I can't find any app file.


I have now created a file with this content (from the thread starter's link), but where must I save it?

<app>
<name>astropulse_v505</name>
</app>
<file_info>
<name>ap_5.06_win_x86_SSE2_OpenCL_ATI_r516.exe</name>
<executable/>
</file_info>
<file_info>
<name>AstroPulse_Kernels.cl</name>
<executable/>
</file_info>
<app_version>
<app_name>astropulse_v505</app_name>
<version_num>506</version_num>
<platform>windows_intelx86</platform>
<avg_ncpus>0.04</avg_ncpus>
<max_ncpus>0.20</max_ncpus>
<plan_class>ati13ati</plan_class>
<cmdline>-instances_per_device 1 -hp -unroll 10 -ffa_block 4096 -ffa_block_fetch 2048</cmdline>
<flops>30987654321</flops>
<file_ref>
<file_name>ap_5.06_win_x86_SSE2_OpenCL_ATI_r516.exe</file_name>
<main_program/>
</file_ref>
<file_ref>
<file_name>AstroPulse_Kernels.cl</file_name>
<copy_file/>
</file_ref>
<coproc>
<type>ATI</type>
<count>1</count>
</coproc>
</app_version>
ID: 1098134 · Report as offensive
terencewee*

Send message
Joined: 10 Oct 09
Posts: 53
Credit: 7,022,510
RAC: 0
Malaysia
Message 1098140 - Posted: 16 Apr 2011, 19:28:47 UTC - in response to Message 1098134.  
Last modified: 16 Apr 2011, 19:33:46 UTC

Thanks for the first info.

I know where I can set variables, but... erm, I need to give the variable a name. What is it called? And is the value then this GPU heap size thing?

Second problem: I don't know where to find this app file. I looked in the project folder, the BOINC main directory and the user directory where BOINC keeps its projects.

BOINC finds the graphics card, so point A is OK.
Point B is negative: I can't find any app file.


Aso-deska...

GPU_MAX_HEAP_SIZE is the variable name, the value is 256.
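
To confirm the variable is actually visible, something like the following could be run from a freshly opened command prompt. It is a minimal sketch: the helper name `gpu_max_heap_size` is made up for illustration, and it only shows what a newly started process sees. A program picks up environment variables when it starts, so BOINC most likely needs a restart before the app can see the new value.

```python
import os


def gpu_max_heap_size(env=None):
    """Return GPU_MAX_HEAP_SIZE as an int, or None if it is unset or not a number."""
    env = os.environ if env is None else env
    raw = env.get("GPU_MAX_HEAP_SIZE")
    if raw is None:
        return None
    try:
        return int(raw)
    except ValueError:
        return None


if __name__ == "__main__":
    value = gpu_max_heap_size()
    if value is None:
        print("GPU_MAX_HEAP_SIZE is not set (or not numeric) in this session")
    else:
        print("GPU_MAX_HEAP_SIZE =", value)
```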


0] Exit from BOINC or stop the BOINC service

1] Unpack the RAR file you downloaded from the Lunatics site

2] Copy/move those 2 files (.exe and .cl) into (YOUR-BOINC-DATA-FOLDER)\projects\setiathome.berkeley.edu

3] Open Notepad, then copy and paste the code below
<app_info>
	<app>
		<name>astropulse_v505</name>
	</app>
	<file_info>
		<name>ap_5.06_win_x86_SSE2_OpenCL_ATI_r516.exe</name>
		<executable/>
	</file_info>
	<file_info>
		<name>AstroPulse_Kernels.cl</name>
		<executable/>
	</file_info>
	<app_version>
		<app_name>astropulse_v505</app_name>
		<version_num>506</version_num>
		<flops>48000000000</flops>
		<avg_ncpus>0.04</avg_ncpus>
		<max_ncpus>0.20</max_ncpus>
		<plan_class>ati13ati</plan_class>
		<cmdline>-instances_per_device 1 -hp -unroll 10 -ffa_block 4096 -ffa_block_fetch 2048</cmdline>
		<!--
		Other suggested options posted by Claggy:
		-ffa_block 6144 -ffa_block_fetch 1536
		-ffa_block 4096 -ffa_block_fetch 1024
		-ffa_block 2048 -ffa_block_fetch 512
		-->
		<coproc>
			<type>ATI</type>
			<count>1</count>
		</coproc>
		<file_ref>
			<file_name>ap_5.06_win_x86_SSE2_OpenCL_ATI_r516.exe</file_name>
			<main_program/>
		</file_ref>
		<file_ref>
			<file_name>AstroPulse_Kernels.cl</file_name>
			<copy_file/>
		</file_ref>
	</app_version>
</app_info>


4] Save as app_info.xml into (YOUR-BOINC-DATA-FOLDER)\projects\setiathome.berkeley.edu
++ Ensure you are using ANSI encoding and the file extension is XML not txt

5] Start BOINC-manager/service

AP-go-go-go! :)
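
Before restarting BOINC in step 5, it may be worth checking that app_info.xml parses and that every file it names really is in the project folder, since a one-letter mismatch in a file name is easy to miss. The following is a minimal sketch of such a check, not part of BOINC or the Lunatics package. `check_app_info` is a hypothetical helper, and the rules it applies (an `<app_info>` root, a `<file_info>` for every `<file_ref>`, files present on disk) are assumptions drawn from the example above rather than the client's actual validation logic.

```python
import os
import xml.etree.ElementTree as ET


def check_app_info(project_dir):
    """Return a list of problems found in project_dir/app_info.xml (empty if none)."""
    path = os.path.join(project_dir, "app_info.xml")
    try:
        root = ET.parse(path).getroot()
    except (OSError, ET.ParseError) as err:
        # Covers a missing file as well as malformed XML.
        return ["cannot read app_info.xml: %s" % err]

    problems = []
    if root.tag != "app_info":
        problems.append("root element is <%s>, expected <app_info>" % root.tag)

    # Names declared in <file_info> and names used in <file_ref>.
    declared = {(el.text or "").strip() for el in root.findall(".//file_info/name")}
    referenced = {(el.text or "").strip() for el in root.findall(".//file_ref/file_name")}

    for name in sorted(referenced - declared):
        problems.append("file_ref without matching file_info: " + name)
    for name in sorted(declared):
        if not os.path.isfile(os.path.join(project_dir, name)):
            problems.append("file not found in project folder: " + name)
    return problems
```

Called with the path of the projects\setiathome.berkeley.edu folder, it returns a list of messages; an empty list means none of these particular problems were found.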
ID: 1098140 · Report as offensive
Profile dskagcommunity
Volunteer tester
Avatar

Send message
Joined: 24 Feb 11
Posts: 43
Credit: 2,901,049
RAC: 0
Austria
Message 1098149 - Posted: 16 Apr 2011, 19:47:30 UTC
Last modified: 16 Apr 2011, 19:48:31 UTC

Thx!!!!!!! :)

Seems OK. I needed to change the exe filename in the app_info file because BOINC didn't find it with your paste; perhaps 1 letter is wrong. Now it finds everything. Now I know what everybody means by this app_info thing and custom exes :)

At first I got no work, so I restarted BOINC twice and now I have work!! Veryyyyyyyyyyyyyyyyyyyy slow download, but it doesn't matter ^^

Again, a very big thanks to you!! Happy crunching and a good afternoon :)
*continues watching NASA TV while downloading :)*
ID: 1098149 · Report as offensive
terencewee*

Send message
Joined: 10 Oct 09
Posts: 53
Credit: 7,022,510
RAC: 0
Malaysia
Message 1098350 - Posted: 17 Apr 2011, 9:24:10 UTC - in response to Message 1098149.  
Last modified: 17 Apr 2011, 9:33:18 UTC


Again, a very big thanks to you!! Happy crunching and a good afternoon :)
*continues watching NASA TV while downloading :)*


I'm very glad to know S@H got another AP GPU cruncher; you are welcome.

Now, back to my problem of high CPU-load - anybody (using r516) having the same problem? Is there a solution?

Thanks in advance.

Edit: AP WUs by the cruncher - validated successfully.
ID: 1098350 · Report as offensive
Profile Raistmer
Volunteer developer
Volunteer tester
Avatar

Send message
Joined: 16 Jun 01
Posts: 6325
Credit: 106,370,077
RAC: 121
Russia
Message 1098356 - Posted: 17 Apr 2011, 9:59:04 UTC - in response to Message 1098350.  


Again, a very big thanks to you!! Happy crunching and a good afternoon :)
*continues watching NASA TV while downloading :)*


I'm very glad to know S@H got another AP GPU cruncher; you are welcome.

Now, back to my problem of high CPU-load - anybody (using r516) having the same problem? Is there a solution?

Thanks in advance.

Edit: AP WUs by the cruncher - validated successfully.


Did you check the AP timings graph I posted recently? Do you see higher CPU times than on the graph?
ID: 1098356 · Report as offensive
aad

Send message
Joined: 3 Apr 99
Posts: 101
Credit: 204,131,099
RAC: 26
Netherlands
Message 1098364 - Posted: 17 Apr 2011, 10:56:06 UTC

Results of my HD6970 Win 7 64 cat 11.4
With 3 wu's together on the GPU
http://setiathome.berkeley.edu/results.php?hostid=5750222&offset=0&show_names=0&state=0&appid=5
ID: 1098364 · Report as offensive
Cruncher-American

Send message
Joined: 25 Mar 02
Posts: 1513
Credit: 370,893,186
RAC: 340
United States
Message 1098367 - Posted: 17 Apr 2011, 12:35:30 UTC - in response to Message 1098356.  


Did you check the AP timings graph I posted recently? Do you see higher CPU times than on the graph?


Sorry - where is this graph?
ID: 1098367 · Report as offensive
Profile Raistmer
Volunteer developer
Volunteer tester
Avatar

Send message
Joined: 16 Jun 01
Posts: 6325
Credit: 106,370,077
RAC: 121
Russia
Message 1098369 - Posted: 17 Apr 2011, 12:52:53 UTC - in response to Message 1098367.  

[Image: AP timings graph (apgraph.png)]

ID: 1098369 · Report as offensive
Cruncher-American

Send message
Joined: 25 Mar 02
Posts: 1513
Credit: 370,893,186
RAC: 340
United States
Message 1098373 - Posted: 17 Apr 2011, 13:21:13 UTC

Raistmer:

I don't see any CPU utilization on these graphs; am I nuts?

As I mentioned in another thread, I, like the fellow up above here, was seeing very high CPU utilization for my ATI rig (4 HD 5550s).
ID: 1098373 · Report as offensive
Profile Raistmer
Volunteer developer
Volunteer tester
Avatar

Send message
Joined: 16 Jun 01
Posts: 6325
Credit: 106,370,077
RAC: 121
Russia
Message 1098377 - Posted: 17 Apr 2011, 13:37:17 UTC - in response to Message 1098373.  
Last modified: 17 Apr 2011, 13:38:52 UTC

Raistmer:

I don't see any CPU utilization on these graphs; am I nuts?

As I mentioned in another thread, I, like the fellow up above here, was seeing very high CPU utilization for my ATI rig (4 HD 5550s).


CPU means CPU-only time, while elapsed means wall-clock time spent on both CPU and GPU processing.

[For GPU apps, CPU time is usually an open symbol while elapsed time is a filled symbol of the same color; only early versions don't follow this rule. In any case, there is a legend on the graph.]
ID: 1098377 · Report as offensive
terencewee*

Send message
Joined: 10 Oct 09
Posts: 53
Credit: 7,022,510
RAC: 0
Malaysia
Message 1098381 - Posted: 17 Apr 2011, 13:44:10 UTC - in response to Message 1098356.  


Did you check the AP timings graph I posted recently? Do you see higher CPU times than on the graph?


*red-faced* No, I had images turned off in my community preferences, but I've since corrected that.

Based on 4 validated APs:
http://setiathome.berkeley.edu/result.php?resultid=1872783622
FFA thread block override value:8192
FFA thread fetchblock override value:4096
Run time 4,607.49
CPU time 1,046.74
percent blanked: 7.18

http://setiathome.berkeley.edu/result.php?resultid=1872769917
FFA thread block override value:8192
FFA thread fetchblock override value:4096
Run time 2,848.05
CPU time 689.87
percent blanked: 2.78

http://setiathome.berkeley.edu/result.php?resultid=1872135847
FFA thread block override value:8192
FFA thread fetchblock override value:128
Run time 4,403.89
CPU time 710.10
percent blanked: 2.54

http://setiathome.berkeley.edu/result.php?resultid=1871594876
FFA thread block override value:4096
FFA thread fetchblock override value:2048
Run time 9,880.01
CPU time 643.08
percent blanked: 0.00


I presume I should look at CPU512x2 (green triangle, not filled), as there is no 516 in apgraph.png. Is this correct?

My guesstimate is that the CPU time is about the same.
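
For a rough comparison, the share of each task's run time that was spent on the CPU can be worked out from the four results listed above. This is a small sketch using only those numbers; `cpu_share` is an illustrative helper, and the figure is per task, so it is not directly comparable with the percentage Task Manager shows for the whole machine.

```python
# (result id, run time in seconds, CPU time in seconds), copied from the results above
results = [
    (1872783622, 4607.49, 1046.74),
    (1872769917, 2848.05, 689.87),
    (1872135847, 4403.89, 710.10),
    (1871594876, 9880.01, 643.08),
]


def cpu_share(run_time, cpu_time):
    """Fraction of the wall-clock run time that was accounted as CPU time."""
    return cpu_time / run_time


for result_id, run_time, cpu_time in results:
    print(result_id, "%.1f%%" % (100 * cpu_share(run_time, cpu_time)))
```

This gives about 22.7%, 24.2%, 16.1% and 6.5% respectively.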



Update:
I've since tested with config of:
<cmdline>-instances_per_device 2 -unroll 12 -ffa_block 16384 -ffa_block_fetch 8192</cmdline>

There is no lag at these values.

I notice the CPU load is down to a more acceptable level: there is a tick-tock of 0-4%, then 8-20% (most of the time). Some tocks shoot up to 36%.

Previously the CPU load was in the 30%-45% range (constantly).

I surmise that higher ffa_block values actually help in my case. :)
... and lower ffa_block values give higher CPU load?? Interesting, no? :)

<cmdline>-instances_per_device 2 -unroll 12 -ffa_block 32768 -ffa_block_fetch 8192</cmdline>
Crashes the app - so no go.



ID: 1098381 · Report as offensive
terencewee*

Send message
Joined: 10 Oct 09
Posts: 53
Credit: 7,022,510
RAC: 0
Malaysia
Message 1098382 - Posted: 17 Apr 2011, 13:45:11 UTC - in response to Message 1098373.  
Last modified: 17 Apr 2011, 13:48:44 UTC

Raistmer:

I don't see any CPU utilization on these graphs; am I nuts?

As I mentioned in another thread, I, like the fellow up above here, was seeing very high CPU utilization for my ATI rig (4 HD 5550s).


by "the fellow up above here" you mean moi? :)
(and not Him)
ID: 1098382 · Report as offensive
terencewee*

Send message
Joined: 10 Oct 09
Posts: 53
Credit: 7,022,510
RAC: 0
Malaysia
Message 1098386 - Posted: 17 Apr 2011, 13:52:49 UTC
Last modified: 17 Apr 2011, 13:59:14 UTC

@ Raistmer:

I got a new validated AP with the new (higher) ffa-block:
http://setiathome.berkeley.edu/result.php?resultid=1872783764
Run time 3,721.21
CPU time 541.29
percent blanked: 0.00
Number of app instances per device setted to:2
DATA_CHUNK_UNROLL setted to:12
FFA thread block override value:16384
FFA thread fetchblock override value:8192

Compared to the previous AP with 0.00 percent blanked, the CPU time is down, but that could just be because the wall-clock time is down too.

Edit:
CPU time down by 1 min 42 secs
Run time (wall-clock, CPU + GPU) down by 1 hour 42 mins
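
The two differences can be checked from the run time and CPU time of this result and of the earlier result with 0.00 percent blanked (run time 9,880.01, CPU time 643.08). This is just the arithmetic written out; `fmt` is an illustrative helper.

```python
def fmt(seconds):
    """Format a number of seconds as 'Hh MMm SSs', rounded to whole seconds."""
    total = int(round(seconds))
    hours, rest = divmod(total, 3600)
    minutes, secs = divmod(rest, 60)
    return "%dh %02dm %02ds" % (hours, minutes, secs)


old_run, old_cpu = 9880.01, 643.08   # earlier result, ffa_block 4096 / fetch 2048
new_run, new_cpu = 3721.21, 541.29   # this result, ffa_block 16384 / fetch 8192

cpu_saved = old_cpu - new_cpu        # about 101.79 seconds
run_saved = old_run - new_run        # about 6158.80 seconds

print("CPU time saved:", fmt(cpu_saved))
print("Run time saved:", fmt(run_saved))
```

So the CPU time dropped by about 16% while the run time dropped by about 62%, which means the CPU share of the run time went up even though both figures went down.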
ID: 1098386 · Report as offensive
Profile Raistmer
Volunteer developer
Volunteer tester
Avatar

Send message
Joined: 16 Jun 01
Posts: 6325
Credit: 106,370,077
RAC: 121
Russia
Message 1098388 - Posted: 17 Apr 2011, 14:01:43 UTC - in response to Message 1098381.  


I surmise that higher ffa_block values actually help in my case. :)
... and lower ffa_block values give higher CPU load?? Interesting, no? :)



The dependence is non-linear and not monotonic, so each GPU probably has its own sweet spot in parameter space (that's why I introduced parameter tuning; in earlier versions all parameters were hardwired with defaults measured as good for my own GPU).

But block values that are too low mean too many kernel calls, so CPU overhead should increase. On the other hand, values that are too big waste thread load (because many threads will process unused data: different periods require different array lengths, and a thread block always processes a rectangular data block). So at values that are too high, overhead (not CPU this time but GPU, i.e. elapsed time) should increase again.
The best value is somewhere in between.
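
The two opposing effects can be illustrated with a toy one-dimensional model. It is emphatically not the app's real code or cost function: the work size of 10,000 items is a made-up number, and the model only counts how many fixed-size blocks are needed to cover the work and how many block slots end up processing nothing useful.

```python
import math


def kernel_calls(work_items, block):
    """Number of fixed-size blocks (one call each in this toy model) needed to cover the work."""
    return math.ceil(work_items / block)


def padded_waste(work_items, block):
    """Block slots that are launched but hold no useful data."""
    return kernel_calls(work_items, block) * block - work_items


WORK_ITEMS = 10000  # hypothetical array length, for illustration only

for block in (512, 2048, 4096, 16384):
    print(block, kernel_calls(WORK_ITEMS, block), padded_waste(WORK_ITEMS, block))
```

In this model a block of 512 needs 20 calls with 240 wasted slots, while a block of 16384 needs a single call but wastes 6384 slots. Fewer calls stand in for less CPU overhead and more wasted slots stand in for more GPU time, which is why the best value would have to be measured per GPU rather than predicted.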
ID: 1098388 · Report as offensive
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.