r503 OpenCL AstroPulse for ATI GPUs beta testing

Profile Raistmer
Volunteer developer
Volunteer tester
Avatar

Send message
Joined: 16 Jun 01
Posts: 6325
Credit: 106,370,077
RAC: 121
Russia
Message 1066370 - Posted: 14 Jan 2011, 0:21:56 UTC - in response to Message 1066357.  

r504 has an increased unroll factor (12 instead of 10), and increased memory requirements as well.


IMHO it would be good to make this a command line parameter. E.g., my little ION can handle only 3 (more than 3 causes -5 errors), whereas my 6970 with its 2GB could handle a lot more than 12.


Maybe. Will look at it.
ID: 1066370 · Report as offensive
Profile skildude
Avatar

Send message
Joined: 4 Oct 00
Posts: 9541
Credit: 50,759,529
RAC: 60
Yemen
Message 1066393 - Posted: 14 Jan 2011, 1:12:16 UTC - in response to Message 1066370.  

That workunit appears to have restarted quite a bit, about every 5% or so. This seems like odd behavior.


In a rich man's house there is no place to spit but his face.
Diogenes Of Sinope
ID: 1066393 · Report as offensive
archae86

Send message
Joined: 31 Aug 99
Posts: 909
Credit: 1,582,816
RAC: 0
United States
Message 1066405 - Posted: 14 Jan 2011, 1:38:52 UTC - in response to Message 1066369.  

Raistmer wrote:
Then it's worth updating. The new SDK should make use of all available memory. AP params will change kernel length but not memory requirements.

OK, so my new plan is to run a few more hours to get the clear trend of %completion vs. elapsed for this result in current config.

Then install the 10.12 APP ATI drivers, then install the 2.3 SDK, then reboot and resume running r504 on the same WU.

To keep things clear and clean, I'd rather just install the SDK to see if that alone makes a material difference, but the compatibility list says a minimum of 10.11 driver is required, and as I am at 10.7 I'll just jump to current 10.12 APP.

One other possible concern--my low-end Graphics card HD 4550 is only listed under "beta compatibility" for the SDK, so if I get trouble it could be an SDK incompatibility with my card. Seems worth trying, however.

ID: 1066405 · Report as offensive
archae86

Send message
Joined: 31 Aug 99
Posts: 909
Credit: 1,582,816
RAC: 0
United States
Message 1066407 - Posted: 14 Jan 2011, 1:42:03 UTC - in response to Message 1066393.  

skildude wrote:
That workunit appears to have restarted quite a bit, about every 5% or so. This seems like odd behavior.

archae86 wrote:
I am set to suspend GPU work while computer in use


archae86 wrote:
I have the throttling parameters back at default after trying and failing a few months ago to back them down far enough to give acceptable interactive graphics performance without enabling the BOINC option to disable GPU processing during interactive use.


In other words, normal behavior on this host with the config parameters in use. Actually, this WU had far fewer such events than my norm, as this is my daily driver PC, and I've been staying away from it during this test, mostly.


ID: 1066407 · Report as offensive
Claggy
Volunteer tester

Send message
Joined: 5 Jul 99
Posts: 4654
Credit: 47,537,079
RAC: 4
United Kingdom
Message 1066492 - Posted: 14 Jan 2011, 5:37:19 UTC - in response to Message 1066405.  

Raistmer wrote:
Then it's worth updating. The new SDK should make use of all available memory. AP params will change kernel length but not memory requirements.

OK, so my new plan is to run a few more hours to get the clear trend of %completion vs. elapsed for this result in current config.

Then install the 10.12 APP ATI drivers, then install the 2.3 SDK, then reboot and resume running r504 on the same WU.

To keep things clear and clean, I'd rather just install the SDK to see if that alone makes a material difference, but the compatibility list says a minimum of 10.11 driver is required, and as I am at 10.7 I'll just jump to current 10.12 APP.

One other possible concern--my low-end Graphics card HD 4550 is only listed under "beta compatibility" for the SDK, so if I get trouble it could be an SDK incompatibility with my card. Seems worth trying, however.

You don't need the SDK; that's the whole point of the APP drivers. Installing the SDK is only going to give OpenCL support to your CPU.

Claggy
ID: 1066492 · Report as offensive
archae86

Send message
Joined: 31 Aug 99
Posts: 909
Credit: 1,582,816
RAC: 0
United States
Message 1066494 - Posted: 14 Jan 2011, 6:04:40 UTC - in response to Message 1066492.  

Claggy wrote:
You don't need the SDK; that's the whole point of the APP drivers. Installing the SDK is only going to give OpenCL support to your CPU.
As a neophyte, I found it confusing that, in the list of things to be installed, the driver listed the SDK (though it was unclear to me which revision), and the SDK listed the driver?!?! I did both uninstalls (driver and SDK, including PATH patching) and both installs, and the ap is up and making progress. Initial signs seem ambiguous as to whether the rate of progress has improved. I should have a much better idea of the progress improvement, if any, by morning (USA Mountain Time).

ID: 1066494 · Report as offensive
Profile Raistmer
Volunteer developer
Volunteer tester
Avatar

Send message
Joined: 16 Jun 01
Posts: 6325
Credit: 106,370,077
RAC: 121
Russia
Message 1066510 - Posted: 14 Jan 2011, 7:20:41 UTC - in response to Message 1066405.  
Last modified: 14 Jan 2011, 7:21:23 UTC


One other possible concern--my low-end Graphics card HD 4550 is only listed under "beta compatibility" for the SDK, so if I get trouble it could be an SDK incompatibility with my card. Seems worth trying, however.

1) Catalyst 10.12 APP should be enough w/o SDK at all.
2) Unfortunately, every ATI SDK released so far gives the whole HD4xxx family only "beta" status, and it looks like it will stay that way forever. ATI has produced GPUs suited to GPGPU computing only since the HD5xxx, well behind NV...
ID: 1066510 · Report as offensive
Profile Mike Special Project $75 donor
Volunteer tester
Avatar

Send message
Joined: 17 Feb 01
Posts: 34272
Credit: 79,922,639
RAC: 80
Germany
Message 1066531 - Posted: 14 Jan 2011, 13:24:32 UTC

When I uninstalled SDK 2.3 a few weeks ago, all units errored out.

With the SDK installed again, no further problems.

????



With each crime and every kindness we birth our future.
ID: 1066531 · Report as offensive
Oleg
Volunteer tester
Avatar

Send message
Joined: 23 Jul 09
Posts: 18
Credit: 235,786
RAC: 0
Ukraine
Message 1066532 - Posted: 14 Jan 2011, 13:28:57 UTC
Last modified: 14 Jan 2011, 13:43:40 UTC

When running 2 tasks, the computer lags terribly (half-second freezes) and GPU load is ~60%. I ran with:
-ffa_block 6144 -ffa_block_fetch 1536 -hp -instances_per_device 2
-ffa_block 4096 -ffa_block_fetch 1024 -hp -instances_per_device 2
-ffa_block 2048 -ffa_block_fetch 512 -hp -instances_per_device 2
(count 0.5)
Any change to the cmdline gives 97% GPU load and 0% progress; after a reboot or logging out, it works normally (a trapezoidal GPU load graph from 60 to 98%).
Is it possible to reduce CPU usage (as in MilkyWay)?
ID: 1066532 · Report as offensive
Claggy
Volunteer tester

Send message
Joined: 5 Jul 99
Posts: 4654
Credit: 47,537,079
RAC: 4
United Kingdom
Message 1066539 - Posted: 14 Jan 2011, 14:04:06 UTC - in response to Message 1066531.  
Last modified: 14 Jan 2011, 14:57:29 UTC

When I uninstalled SDK 2.3 a few weeks ago, all units errored out.

With the SDK installed again, no further problems.

????

Yep. When I uninstalled SDK_2.2 (when I had Cat 10.7 installed), I also lost OpenCL support on my Nvidia GPU as well as my ATI GPU,
so it meant that if I wanted to upgrade a Catalyst driver, I also had to reinstall my Nvidia driver to get its OpenCL support back.

It just means that when you uninstall SDK_2.3, the OpenCL files from Cat 10.12 APP get removed too, so a reinstall of Cat 10.12 APP (or of the SDK) is then required.

Claggy
ID: 1066539 · Report as offensive
Profile Raistmer
Volunteer developer
Volunteer tester
Avatar

Send message
Joined: 16 Jun 01
Posts: 6325
Credit: 106,370,077
RAC: 121
Russia
Message 1066541 - Posted: 14 Jan 2011, 14:30:38 UTC - in response to Message 1066532.  

When running 2 tasks, the computer lags terribly (half-second freezes) and GPU load is ~60%. I ran with:
-ffa_block 6144 -ffa_block_fetch 1536 -hp -instances_per_device 2
-ffa_block 4096 -ffa_block_fetch 1024 -hp -instances_per_device 2
-ffa_block 2048 -ffa_block_fetch 512 -hp -instances_per_device 2
(count 0.5)
Any change to the cmdline gives 97% GPU load and 0% progress; after a reboot or logging out, it works normally (a trapezoidal GPU load graph from 60 to 98%).
Is it possible to reduce CPU usage (as in MilkyWay)?


CPU usage should be lower with larger -ffa_* values.
Beyond that, CPU usage cannot be adjusted in any way (it is already as low as the current state of the algorithm allows).
Apparently AstroPulse does not take well to running several copies (possibly a shortage of GPU memory; it uses quite a lot of memory).
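
For anyone comparing option sets like the three listed above, here is a minimal Python sketch that splits such a command line into a dictionary and checks the ratio between -ffa_block and -ffa_block_fetch. The parser and the function names are illustrative assumptions, not the app's own option handling; only the option strings are taken from the post.

```python
import shlex

# The three option sets quoted in the post above.
CMDLINES = [
    "-ffa_block 6144 -ffa_block_fetch 1536 -hp -instances_per_device 2",
    "-ffa_block 4096 -ffa_block_fetch 1024 -hp -instances_per_device 2",
    "-ffa_block 2048 -ffa_block_fetch 512 -hp -instances_per_device 2",
]


def parse_ap_cmdline(cmdline):
    """Split an option string into {name: value}.

    Illustrative parser only (an assumption, not the app's real one):
    an option followed by a token that does not start with '-' is read
    as an integer value; an option with no value (such as -hp) is
    stored as True.
    """
    tokens = shlex.split(cmdline)
    opts = {}
    i = 0
    while i < len(tokens):
        name = tokens[i].lstrip("-")
        if i + 1 < len(tokens) and not tokens[i + 1].startswith("-"):
            opts[name] = int(tokens[i + 1])
            i += 2
        else:
            opts[name] = True
            i += 1
    return opts


def block_to_fetch_ratios(cmdlines):
    """Return ffa_block / ffa_block_fetch for each option string."""
    ratios = []
    for line in cmdlines:
        opts = parse_ap_cmdline(line)
        ratios.append(opts["ffa_block"] // opts["ffa_block_fetch"])
    return ratios


if __name__ == "__main__":
    for line in CMDLINES:
        print(parse_ap_cmdline(line))
    print(block_to_fetch_ratios(CMDLINES))
```

All three sets keep -ffa_block at four times -ffa_block_fetch, so the runs differ only in block size, not in that ratio.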
ID: 1066541 · Report as offensive
archae86

Send message
Joined: 31 Aug 99
Posts: 909
Credit: 1,582,816
RAC: 0
United States
Message 1066553 - Posted: 14 Jan 2011, 15:46:48 UTC - in response to Message 1066405.  

Raistmer wrote:
Then it's worth updating. The new SDK should make use of all available memory. AP params will change kernel length but not memory requirements.

OK, so my new plan is to run a few more hours to get the clear trend of %completion vs. elapsed for this result in current config.

Then install the 10.12 APP ATI drivers, then install the 2.3 SDK, then reboot and resume running r504 on the same WU.

1. So the clear trend on the new WU with my old ATI drivers and old SDK was like the previous one (suggesting about 36 hours to finish, compared to 13 to 14 expected on my host for r456). Specifically, at a reported 5:45:36 of "elapsed time" (which really means task active time, not wall-clock elapsed time), it reported 15.315% complete.

2. I did installs of both the 2.3 SDK and of the 10.12 APP ATI drivers (I had not yet seen the post indicating I could or should dispense with the SDK).

I now have enough subsequent run time to say that the rate of progress seems very close to my previous slow r504 progress. Current elapsed time is 14:41:11 with 39.639% reported progress.
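
The two readings in this post (15.315% at 5:45:36 and 39.639% at 14:41:11) can be turned into a projected finish time with a simple linear extrapolation. The Python sketch below assumes progress is linear in active time, which the post does not guarantee, and the function names are made up for illustration.

```python
def hms_to_seconds(hms):
    """Convert an 'H:MM:SS' string to seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def projected_total_hours(elapsed_hms, percent_done):
    """Extrapolate total run time in hours, assuming linear progress."""
    return hms_to_seconds(elapsed_hms) / (percent_done / 100.0) / 3600.0


def percent_per_hour_between(first, second):
    """Average progress rate (%/hour) between two (elapsed, percent) readings."""
    dt_hours = (hms_to_seconds(second[0]) - hms_to_seconds(first[0])) / 3600.0
    return (second[1] - first[1]) / dt_hours


if __name__ == "__main__":
    early = ("5:45:36", 15.315)    # reading quoted in the post
    later = ("14:41:11", 39.639)   # reading quoted in the post
    print(projected_total_hours(*early))
    print(projected_total_hours(*later))
    print(percent_per_hour_between(early, later))
```

Both readings extrapolate to roughly 37 hours, in the same range as the estimate of about 36 hours given above and far from the 13 to 14 hours expected for r456.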

If there is a further configuration tweak I should attempt with r504, I am happy to try it, but without a good idea of my own, my current plan is to wait a couple of hours for a suggestion from here, then revert the host to r456 while retaining the current SDK and ATI driver configuration. If I am successful in making a "hot change" on the same WU, and progress immediately speeds up by more than a factor of two, it would make it clearer that there is a key difference in the ap itself, rather than a coincidental change in some issue on my host.

The WU currently being processed, for which I reported the above progress points, is again a zero blanking WU according to the two indicators Joe Segur mentioned in this thread.

Any suggestions? My priority at this point is anything that might be helpful to Raistmer, whether it is something to fix in the ap, something to tell users about proper system config, just stating that my grade of card is not suitable, or figuring out what I've done wrong so I can stop being a distraction (assuming I've done something wrong).

ID: 1066553 · Report as offensive
Profile Raistmer
Volunteer developer
Volunteer tester
Avatar

Send message
Joined: 16 Jun 01
Posts: 6325
Credit: 106,370,077
RAC: 121
Russia
Message 1066557 - Posted: 14 Jan 2011, 15:52:04 UTC - in response to Message 1066553.  

Just do what you suggested; it all sounds reasonable.
We need to check whether r504 really is slower than r456 on low-end cards.
ID: 1066557 · Report as offensive
archae86

Send message
Joined: 31 Aug 99
Posts: 909
Credit: 1,582,816
RAC: 0
United States
Message 1066629 - Posted: 14 Jan 2011, 18:15:09 UTC - in response to Message 1066557.  

Just do what you suggested; it all sounds reasonable.
We need to check whether r504 really is slower than r456 on low-end cards.

OK, I succeeded in the hot changeover to r456 continuing with the same WU I was running on r504. I made no other changes save for the copying of .exe and .CL to the SETI and the slot directories, the change to the SETI App_info.xml, and of course stopping boincmgr before and resuming after. No reboot, no changes to driver or SDK...

With about an hour of execution after the change, it is already quite clear that the progress on this WU on my host on r456 is considerably faster than it was on r504. I'll post a graph of %done vs. elapsed time with the version change marked a little later, when the amount of data is more adequate.

ID: 1066629 · Report as offensive
Profile Raistmer
Volunteer developer
Volunteer tester
Avatar

Send message
Joined: 16 Jun 01
Posts: 6325
Credit: 106,370,077
RAC: 121
Russia
Message 1066631 - Posted: 14 Jan 2011, 18:32:15 UTC - in response to Message 1066629.  

It's interesting data!
Could you run both apps under the Stream Profiler that comes with the SDK?
What you need to do for this:
1) locate the directory with the sprofile.exe file
2) copy both the app exe and the CL file there
3) copy init_data.xml there
4) copy one of the AP tasks there and rename it to in.dat
5) run the profiler with the app (sprofile <full_app_path>); no options are required

Wait ~30 min (hopefully enough for a first try), then upload the file Session1.csv

(it will be located in your documents under the Stream SDK directory)

Do the same with the other app build.
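
The file-staging steps above can be scripted. This is only a sketch under stated assumptions: the function name and all paths passed to it are hypothetical, and it merely builds the `sprofile <full_app_path>` command from the post rather than launching the profiler.

```python
import shutil
from pathlib import Path


def stage_profiler_run(profiler_dir, app_exe, cl_file, init_data, task_file):
    """Copy the files listed in the steps above into the profiler directory.

    All arguments are caller-supplied paths (hypothetical here). The task
    file is copied under the name in.dat, as the instructions require.
    Returns the command to run as a list; it does not start the profiler.
    """
    dest = Path(profiler_dir)
    if not (dest / "sprofile.exe").exists():
        raise FileNotFoundError("sprofile.exe not found in " + str(dest))

    staged_exe = dest / Path(app_exe).name
    shutil.copy2(app_exe, staged_exe)                 # step 2: app exe
    shutil.copy2(cl_file, dest / Path(cl_file).name)  # step 2: CL file
    shutil.copy2(init_data, dest / "init_data.xml")   # step 3
    shutil.copy2(task_file, dest / "in.dat")          # step 4: rename to in.dat

    # Step 5: sprofile <full_app_path>, no options.
    return ["sprofile", str(staged_exe)]
```

Repeating the call with the other build's exe and CL file covers the "do the same" step; copying Session1.csv somewhere safe between runs is a sensible precaution, since the post does not say whether a second session overwrites it.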

ID: 1066631 · Report as offensive
archae86

Send message
Joined: 31 Aug 99
Posts: 909
Credit: 1,582,816
RAC: 0
United States
Message 1066637 - Posted: 14 Jan 2011, 18:44:22 UTC - in response to Message 1066631.  

1) locate directory with sprofile.exe file

The system itself is 64-bit Windows 7 Pro.

I find the profiler installation within Program Files (x86). But I find two subdirectories, x64, and x86. The x86 one has sprofile.exe, while the x64 one has sprofile-x64.exe.

For this purpose, shall I follow your direction exactly, copying the required files to the x86 directory, or should I use the x64 version?

Also, would it be best during the test for me to suspend SETI in BOINC, but leave BOINC running--this would mean the system load was 8 Einstein tasks, which would match the usual workload on my system? Or is it better just to exit BOINC and let the system be quiet?
ID: 1066637 · Report as offensive
Profile Raistmer
Volunteer developer
Volunteer tester
Avatar

Send message
Joined: 16 Jun 01
Posts: 6325
Credit: 106,370,077
RAC: 121
Russia
Message 1066641 - Posted: 14 Jan 2011, 18:47:29 UTC - in response to Message 1066637.  

Use the x86 one; the CPU part of the app is x86.
About BOINC: do as you wish (but suspend GPU computation at least!). Just keep in mind that the ATI profiler is very memory hungry; I don't know how it will work with 8 Einstein processes in flight (Einstein uses a lot of memory too).
I usually just disable GPU activity in BOINC while running the profiler.
ID: 1066641 · Report as offensive
archae86

Send message
Joined: 31 Aug 99
Posts: 909
Credit: 1,582,816
RAC: 0
United States
Message 1066643 - Posted: 14 Jan 2011, 18:51:34 UTC - in response to Message 1066641.  

Use the x86 one; the CPU part of the app is x86.
About BOINC: do as you wish (but suspend GPU computation at least!). Just keep in mind that the ATI profiler is very memory hungry; I don't know how it will work with 8 Einstein processes in flight (Einstein uses a lot of memory too).
I usually just disable GPU activity in BOINC while running the profiler.
I have 6 Gbyte of RAM installed, and the Einstein working set measures 250M per task according to Process Explorer, so allowing about 1G for Windows, I should have about 3G available for the profiler. I'll kill off BOINC entirely if the profiler grows beyond 2G.
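
The memory budget in this post works out as follows. It is a back-of-the-envelope Python sketch using only the figures quoted above; the function names are invented for illustration, and the 2G cutoff is the poster's own rule of thumb.

```python
def ram_left_for_profiler(total_gb, tasks, per_task_gb, os_reserve_gb):
    """RAM remaining after the CPU tasks and the OS reserve, in GB."""
    return total_gb - tasks * per_task_gb - os_reserve_gb


def should_stop_boinc(profiler_gb, limit_gb=2.0):
    """True once the profiler's working set exceeds the chosen limit."""
    return profiler_gb > limit_gb


if __name__ == "__main__":
    # 6 GB installed, 8 Einstein tasks at 250M each, about 1G for Windows.
    print(ram_left_for_profiler(6.0, 8, 0.25, 1.0))  # 3.0
```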

ID: 1066643 · Report as offensive
Profile Raistmer
Volunteer developer
Volunteer tester
Avatar

Send message
Joined: 16 Jun 01
Posts: 6325
Credit: 106,370,077
RAC: 121
Russia
Message 1066645 - Posted: 14 Jan 2011, 18:58:39 UTC - in response to Message 1066643.  

Also, please check what you see in stderr.txt in this line:
Global memory size: 268435456

Will it show 512MB now, or 256MB as here?
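
To read that value without counting digits, a short Python sketch can pull the number out of stderr.txt and convert it to MiB. The function name is an illustrative assumption; the line format is the one quoted above.

```python
import re


def global_memory_mib(stderr_text):
    """Return the 'Global memory size' value in MiB, or None if absent."""
    match = re.search(r"Global memory size:\s*(\d+)", stderr_text)
    if match is None:
        return None
    return int(match.group(1)) / (1024 * 1024)


if __name__ == "__main__":
    sample = "Global memory size: 268435456"
    print(global_memory_mib(sample))  # 256.0
```

A reading of 268435456 bytes is exactly 256 MiB; a card exposing 512 MiB would report 536870912.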
ID: 1066645 · Report as offensive
archae86

Send message
Joined: 31 Aug 99
Posts: 909
Credit: 1,582,816
RAC: 0
United States
Message 1066650 - Posted: 14 Jan 2011, 19:10:58 UTC

raistmer wrote:
Also, please check what you see in stderr.txt in this line:
Global memory size: 268435456

For the stderr.txt in the slot directory where the current task was executing the global memory size still shows as 256M.

This is also true for the stderr.txt in the profiler directory. I am currently running profile on r504, and the working set reported by Process Explorer is 1.3G.

By the way, after the 30 minutes, what is the proper way for me to stop the profiler run? It appears that session1.csv is accumulating incrementally; with 10 minutes gone on r504, it is currently about 370 kbytes.
ID: 1066650 · Report as offensive


 
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.