Update for GPU AP (both NV and ATi) to rev560

Message boards : Number crunching : Update for GPU AP (both NV and ATi) to rev560
Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 6325
Credit: 106,370,077
RAC: 121
Russia
Message 1215569 - Posted: 8 Apr 2012, 9:27:49 UTC - in response to Message 1215318.  

I also saw that the sbs can be set in <cmdline>-instances_per_device 2 -unroll 9 -ffa_block 8192 -ffa_block_fetch 4096 -sbs 256 -hp</cmdline>

default is 128

I saw somewhere that it isn't implemented yet.

But in my stderr it looks like it is set to 256....
Can someone explain this please?



It had no effect in the build included in the installer. In r560 and later, the app checks the amount of memory needed (based on the unroll value) against the limit imposed by the -sbs switch and issues a warning if the requested memory exceeds that limit. For now the memory is allocated anyway, but the user will at least know that the app uses more memory for a single buffer than he/she thought.
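To illustrate the check described above, here is a minimal sketch and not the app's actual code. The function name, the MB units, and the message wording are assumptions; the thread does not say how the needed memory is derived from the unroll value, so it is simply passed in.

```python
from typing import Optional


def check_single_buffer(needed: float, sbs_limit: float = 128.0) -> Optional[str]:
    """Return a warning if the needed buffer size exceeds the -sbs limit.

    Both values are assumed to be in MB (the thread gives no units).
    The caller is expected to allocate the buffer either way, which
    mirrors the "allocated anyway for now" behaviour described above.
    """
    if needed > sbs_limit:
        return (
            f"WARNING: single buffer needs {needed:.0f} MB, "
            f"which exceeds the -sbs limit of {sbs_limit:.0f} MB"
        )
    return None


# Hypothetical numbers, purely for illustration.
print(check_single_buffer(300, sbs_limit=256))
print(check_single_buffer(100))
```

The point of the sketch is only that the switch acts as a threshold for a warning, not as a hard cap.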
ID: 1215569
Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 6325
Credit: 106,370,077
RAC: 121
Russia
Message 1215570 - Posted: 8 Apr 2012, 9:30:06 UTC - in response to Message 1215322.  

You should increase unroll factor to 10 at least to get better speed on your card.
Also FFA_block and FFA_block_fetch.

It doesn't happen often that someone talks to me in riddles, but now it's happened. :-)

Care to elaborate, please?
Just think I am a newbie. And be gentle with that whip. Nice....


Please look at the <cmdline> tag in app_info.
It contains the mentioned params with some default values. Increase them as Mike suggested. For more info, please see the app's release notes.
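For anyone unsure what editing the <cmdline> tag amounts to, the sketch below parses the example line quoted earlier in this thread and raises -unroll to 10. It is only an illustration: the helper names are made up, it works on a one-line fragment rather than a real app_info file, and which values actually suit a given card is a matter for the release notes.

```python
import shlex
import xml.etree.ElementTree as ET


def parse_cmdline(cmdline: str) -> dict:
    """Split a switch string into {switch: value or None}, keeping order."""
    tokens = shlex.split(cmdline)
    params = {}
    i = 0
    while i < len(tokens):
        tok = tokens[i]
        has_value = i + 1 < len(tokens) and not tokens[i + 1].startswith("-")
        if has_value:
            params[tok] = tokens[i + 1]  # e.g. "-unroll" -> "9"
            i += 2
        else:
            params[tok] = None  # bare switch such as "-hp"
            i += 1
    return params


def build_cmdline(params: dict) -> str:
    """Join the switches back into a single command-line string."""
    return " ".join(k if v is None else f"{k} {v}" for k, v in params.items())


fragment = (
    "<cmdline>-instances_per_device 2 -unroll 9 -ffa_block 8192 "
    "-ffa_block_fetch 4096 -sbs 256 -hp</cmdline>"
)
elem = ET.fromstring(fragment)
params = parse_cmdline(elem.text)
params["-unroll"] = "10"  # the increase suggested in this thread
elem.text = build_cmdline(params)
new_fragment = ET.tostring(elem, encoding="unicode")
print(new_fragment)
```

In practice most people just edit the line by hand in a text editor; the sketch only shows which part of the tag changes.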
ID: 1215570
Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 6325
Credit: 106,370,077
RAC: 121
Russia
Message 1215571 - Posted: 8 Apr 2012, 9:31:31 UTC - in response to Message 1215323.  


Unroll 10 - 12 should work on high end cards.
Some cards can cope with 16.


I have used 16 on my HD6950 for a few months already w/o issues.
Bigger values may be possible too, but that is still to be tested.
ID: 1215571
Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 6325
Credit: 106,370,077
RAC: 121
Russia
Message 1215572 - Posted: 8 Apr 2012, 9:33:51 UTC - in response to Message 1215553.  



Although, it's "set", not "setted".
The verb goes set, set, set. Very easy that one. :P


Thanks for the correction. As a non-native English speaker I tend to make mistakes :D
ID: 1215572
TRuEQ & TuVaLu
Volunteer tester
Joined: 4 Oct 99
Posts: 505
Credit: 69,523,653
RAC: 10
Sweden
Message 1215578 - Posted: 8 Apr 2012, 9:54:12 UTC - in response to Message 1215316.  
Last modified: 8 Apr 2012, 9:58:07 UTC

I've read somewhere that DATA_CHUNK_UNROLL should be setted to: half of the Max compute number units:

Can anyone confirm this please??

//TRuEQ



Yes, it is supposed to be "set" and not "setted".

I wonder why I didn't see that.
It is often(offen) after I've written something and then click ok to post it I'v(e) need to click on the "edit" button to fix(correct) something I wrote(I've written) incorrectly or just just missed to write something(did I wrote that?)...

I like (to) write stuff in English and then see the errors I make.
It's kinda fun.

It took me several seconds to understand "smth" was actually something.
But I only read that.
:)

Thank you for all the (i)nfo(rmation) here.
It makes sense.

//TRuEQ
//TRuEQ
TRuEQ & TuVaLu
ID: 1215578
Jord
Volunteer tester
Joined: 9 Jun 99
Posts: 15184
Credit: 4,362,181
RAC: 3
Netherlands
Message 1215587 - Posted: 8 Apr 2012, 10:56:12 UTC - in response to Message 1215572.  



Although, it's "set", not "setted".
The verb goes set, set, set. Very easy that one. :P


Thanks for the correction. As a non-native English speaker I tend to make mistakes :D

It's in there among cast, cost, cut, hit, hurt, let, set, shut and slit. Possibly some others too. Just check the list.

And here ended my grammar lessons. :)
ID: 1215587
perryjay
Volunteer tester
Joined: 20 Aug 02
Posts: 3377
Credit: 20,676,751
RAC: 0
United States
Message 1215665 - Posted: 8 Apr 2012, 15:32:56 UTC

Well, I finished one. http://setiathome.berkeley.edu/result.php?resultid=2388692084 Waiting for my wingman to see how I did. Looks okay but I'm a little worried about a message I got. Did I mess up putting everything in place?

INFO: can't open binary kernel file, continue with recompile...
Info : Building Program (source, clBuildProgram):main kernels: OK code 0


INFO: binary kernel file created
WARNING: can't open binary kernel file for oclFFT plan, continue with recompile...



PROUD MEMBER OF Team Starfire World BOINC
ID: 1215665
Claggy
Volunteer tester
Joined: 5 Jul 99
Posts: 4654
Credit: 47,537,079
RAC: 4
United Kingdom
Message 1215668 - Posted: 8 Apr 2012, 15:40:39 UTC - in response to Message 1215665.  
Last modified: 8 Apr 2012, 16:00:27 UTC

Well, I finished one. http://setiathome.berkeley.edu/result.php?resultid=2388692084 Waiting for my wingman to see how I did. Looks okay but I'm a little worried about a message I got. Did I mess up putting everything in place?

INFO: can't open binary kernel file, continue with recompile...
Info : Building Program (source, clBuildProgram):main kernels: OK code 0


INFO: binary kernel file created
WARNING: can't open binary kernel file for oclFFT plan, continue with recompile...

The latest apps compile a couple of files the first time they run. You should have files similar to these (but with your GPU or CPU name instead):

AstroPulse_Kernels_r560.cl_GeForce GTX 460.bin_V6

oclFFTplan_GeForce GTX 460_32768_r560.bin

r560_IntelRCoreTMi72600KCPU340GHz.wisdom
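If you want to check quickly whether all three kinds of file have been created, something like the sketch below would do. The wildcard patterns are guesses generalised from the three example names above, not a documented naming scheme, and the labels are made up for readability.

```python
from fnmatch import fnmatchcase

# Patterns assumed from the three example file names in this post.
PATTERNS = {
    "kernel binary": "AstroPulse_Kernels_r*.cl_*.bin_V6",
    "oclFFT plan": "oclFFTplan_*_r*.bin",
    "wisdom file": "r*_*.wisdom",
}


def classify(name):
    """Return the label of the first matching pattern, or None."""
    for kind, pattern in PATTERNS.items():
        if fnmatchcase(name, pattern):
            return kind
    return None


def missing_kinds(filenames):
    """List the kinds of cache file that are not present in filenames."""
    found = {classify(name) for name in filenames}
    return [kind for kind in PATTERNS if kind not in found]


example = [
    "AstroPulse_Kernels_r560.cl_GeForce GTX 460.bin_V6",
    "oclFFTplan_GeForce GTX 460_32768_r560.bin",
    "r560_IntelRCoreTMi72600KCPU340GHz.wisdom",
]
print(missing_kinds(example))
```

To use it on a real machine you would pass in the names from `os.listdir()` on the folder where the app writes these files.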

Claggy
ID: 1215668
X-Files 27
Joined: 17 May 99
Posts: 104
Credit: 111,191,433
RAC: 0
Canada
Message 1215689 - Posted: 8 Apr 2012, 16:27:24 UTC - in response to Message 1215567.  

This system doesn't suffer from excessive CPU usage:
Win7, i5-2500k
GTX 570
296.35 (quadro driver using modded inf)




I think at most it's using 20% CPU.


Good to know this!
Now it would be interesting to know what caused that: your driver version, your OS version or your GPU version... or smth we can't even imagine ;)


What's bugging me is that I can't replicate this low CPU usage on my other rig - it always settles at 85%.
The good thing is that there's no more sleep bug in this NV driver version.
ID: 1215689
Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 6325
Credit: 106,370,077
RAC: 121
Russia
Message 1215699 - Posted: 8 Apr 2012, 17:40:53 UTC - in response to Message 1215590.  


What is "CPU affinity ajustment"?

May I suggest the word "ADJUSTMENT"?


Hehe, one more typo noticed, thanks :)
ID: 1215699
Fred E.
Volunteer tester
Joined: 22 Jul 99
Posts: 768
Credit: 24,140,697
RAC: 0
United States
Message 1215765 - Posted: 8 Apr 2012, 19:29:56 UTC

My first one validated this morning, so it is looking good. Thought it interesting that it validated against an ATI 2600 with only 256MByte memory.

Workunit is http://setiathome.berkeley.edu/workunit.php?wuid=966607530 -- won't be there long.

Glad to see something go right after a rough week having to replace the power supply & Motherboard. Looks like another winner, Raistmer. Hope the Nvidia driver issues get resolved someday soon.

If there are typos here, notify me at autodiscard@trash.com. :=)
Another Fred
Support SETI@home when you search the Web with GoodSearch or shop online with GoodShop.
ID: 1215765
betreger Project Donor
Joined: 29 Jun 99
Posts: 11361
Credit: 29,581,041
RAC: 66
United States
Message 1215794 - Posted: 8 Apr 2012, 20:03:04 UTC - in response to Message 1215699.  


What is "CPU affinity ajustment"?

May I suggest the word "ADJUSTMENT"?


Hehe, one more typo noticed, thanks :)

If that typo is the worst thing you do this week you will have an excellent week!
ID: 1215794
Mike Special Project $75 donor
Volunteer tester
Joined: 17 Feb 01
Posts: 34258
Credit: 79,922,639
RAC: 80
Germany
Message 1215820 - Posted: 8 Apr 2012, 21:15:34 UTC - in response to Message 1215765.  

My first one validated this morning, so it is looking good. Thought it interesting that it validated against an ATI 2600 with only 256MByte memory.

Workunit is http://setiathome.berkeley.edu/workunit.php?wuid=966607530 -- won't be there long.

Glad to see something go right after a rough week having to replace the power supply & Motherboard. Looks like another winner, Raistmer. Hope the Nvidia driver issues get resolved someday soon.

If there are typos here, notify me at autodiscard@trash.com. :=)


She is running the Brook+ version.



With each crime and every kindness we birth our future.
ID: 1215820
petri33
Volunteer tester
Joined: 6 Jun 02
Posts: 1668
Credit: 623,086,772
RAC: 156
Finland
Message 1216213 - Posted: 9 Apr 2012, 17:29:25 UTC
Last modified: 9 Apr 2012, 17:30:05 UTC

Hi,

If there were an APV6-executable for NV560ti448 on 64bit linux what kind of performance could I expect from it?

My rig uses CPU for APV6(AVX) and setiathomeEnhanced(ssse3) and GPU for setiathomeEnhanced.

The APV6 with AVX gives this kind of performance (the host 5643864): http://setiathome.berkeley.edu/workunit.php?wuid=966928760

petri33
To overcome Heisenbergs:
"You can't always get what you want / but if you try sometimes you just might find / you get what you need." -- Rolling Stones
ID: 1216213
Josef W. Segur
Volunteer developer
Volunteer tester
Joined: 30 Oct 99
Posts: 4504
Credit: 1,414,761
RAC: 0
United States
Message 1216223 - Posted: 9 Apr 2012, 18:29:48 UTC - in response to Message 1216213.  

Hi,

If there were an APV6-executable for NV560ti448 on 64bit linux what kind of performance could I expect from it?

My rig uses CPU for APV6(AVX) and setiathomeEnhanced(ssse3) and GPU for setiathomeEnhanced.

The APV6 with AVX gives this kind of performance (the host 5643864): http://setiathome.berkeley.edu/workunit.php?wuid=966928760

petri33

The GPU performance should not be affected much by what OS the CPU is running; perhaps these valid AP v6 tasks on a 560ti under Windows give a good enough approximation. As you know, the 560ti448 should outperform that regular 560ti somewhat.
Joe
ID: 1216223
Horacio
Joined: 14 Jan 00
Posts: 536
Credit: 75,967,266
RAC: 0
Argentina
Message 1217184 - Posted: 12 Apr 2012, 7:26:48 UTC

I have a question about the NV 560: Process Lasso shows this app as having a default affinity to cores 0,1, while all the other tasks running from BOINC have the default affinity set to all cores...

Is this on purpose? I mean, can I extend the affinity to all cores so it doesn't get restricted? Can this improve (or break) the performance?
ID: 1217184
BilBg
Volunteer tester
Joined: 27 May 07
Posts: 3720
Credit: 9,385,827
RAC: 0
Bulgaria
Message 1217206 - Posted: 12 Apr 2012, 8:57:59 UTC - in response to Message 1217184.  


<cmdline>-no_cpu_lock -instances_per_device 1 ......</cmdline>


 


- ALF - "Find out what you don't do well ..... then don't do it!" :)
 
ID: 1217206
JohnDK Crowdfunding Project Donor*Special Project $250 donor
Volunteer tester
Joined: 28 May 00
Posts: 1222
Credit: 451,243,443
RAC: 1,127
Denmark
Message 1217225 - Posted: 12 Apr 2012, 11:50:20 UTC

So what's the advantage/disadvantage using that option or not using it?
ID: 1217225
HAL9000
Volunteer tester
Joined: 11 Sep 99
Posts: 6534
Credit: 196,805,888
RAC: 57
United States
Message 1217245 - Posted: 12 Apr 2012, 12:44:57 UTC - in response to Message 1217225.  

So what's the advantage/disadvantage using that option or not using it?

If your GPU app is limited to CPUs 0 & 1, and those cores are also running other tasks, then it has to share CPU time with those processes whenever it loads work into the GPU. It may only be a few seconds per task, but with very fast GPUs this could add up to a significant amount of time in the long run.
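For readers new to affinity: on Windows it is expressed as a bitmask with one bit per logical core, so "cores 0,1" is the mask 3 and "all cores" on a quad-core is 15. The short sketch below only shows that arithmetic. The helper names are made up, and how the app itself picks its cores is not shown in this thread.

```python
def mask_from_cores(cores):
    """Build an affinity bitmask with one bit set per listed core."""
    mask = 0
    for core in cores:
        mask |= 1 << core
    return mask


def cores_from_mask(mask):
    """List the core numbers whose bits are set in the mask."""
    return [i for i in range(mask.bit_length()) if (mask >> i) & 1]


def all_cores_mask(core_count):
    """Mask that allows every core from 0 to core_count - 1."""
    return (1 << core_count) - 1


restricted = mask_from_cores([0, 1])  # the default seen in Process Lasso
unrestricted = all_cores_mask(4)      # assuming a 4-core CPU
print(bin(restricted), cores_from_mask(restricted))
print(bin(unrestricted), cores_from_mask(unrestricted))
```

With the restricted mask the scheduler can only place the app on two cores, which is where the sharing described above comes from.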
SETI@home classic workunits: 93,865 CPU time: 863,447 hours
Join the BP6/VP6 User Group: http://tinyurl.com/8y46zvu
ID: 1217245
Mike Special Project $75 donor
Volunteer tester
Joined: 17 Feb 01
Posts: 34258
Credit: 79,922,639
RAC: 80
Germany
Message 1217272 - Posted: 12 Apr 2012, 13:55:09 UTC

Correct in principle.

It helps a lot on heavily blanked units.
Since blanking is calculated by the CPU, it switches between cores while computing.
Blanking is sometimes fragmented, so it uses different cores.



With each crime and every kindness we birth our future.
ID: 1217272