Are we ATI Guys being got at?

Message boards : Number crunching : Are we ATI Guys being got at?

Richard Haselgrove Project Donor
Volunteer tester

Joined: 4 Jul 99
Posts: 14674
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1095400 - Posted: 9 Apr 2011, 20:40:17 UTC - in response to Message 1095396.  

Well, you use 4 GPUs in a single host... AMD's OpenCL runtime still has trouble with such configs. Either try updating to the recently released SDK 2.4, or try to find the info (I can't recall the exact wording now) about a special environment variable that changes the synching style for the ATI OpenCL runtime.
[EDIT: Morten uses many GPUs in a single host too and AFAIK found that setting helpful; look/search in the Lunatics ATI-related threads]

The environment variable I set for better usage of both GPUs in the HD 5970 is "GPU_USE_SYNC_OBJECTS=1". This enables about 85% GPU usage on both GPUs, as opposed to very low and erratic usage without the setting.

This environment variable is undocumented, and could therefore be removed by AMD in a future release.

That sounds very similar to the SWAN_SYNC=0 environment variable used by GPUGrid to invoke polling mode. In their case, the increased GPU usage comes at the expense of increased CPU usage too.

jravin seems to have got the increased CPU usage already, and (as mentioned in message 1095331) 80%-95% GPU usage.

Err, you couldn't have set the environment variable deliberately at some point in the past, could you?

Morten, what CPU usage are you getting?
Morten Ross
Volunteer tester
Joined: 30 Apr 01
Posts: 183
Credit: 385,664,915
RAC: 0
Norway
Message 1095415 - Posted: 9 Apr 2011, 21:10:04 UTC - in response to Message 1095400.  
Last modified: 9 Apr 2011, 21:20:20 UTC

I'm currently running 2 tasks per GPU, and most tasks are around 2% CPU utilization, while about 3 tasks are around 6-7% CPU utilization.

EDIT: Looking at this for a while, it seems to level out with all tasks around 5%, and 3 out of 8 CPU cores fully servicing these tasks.
Morten Ross
Cruncher-American · Crowdfunding Project Donor · Special Project $75 donor · Special Project $250 donor

Joined: 25 Mar 02
Posts: 1513
Credit: 370,893,186
RAC: 340
United States
Message 1095520 - Posted: 10 Apr 2011, 0:01:40 UTC

Well, with my 4 HD 5550s, I am getting the following (all AP, but I believe it applies to MB also): 4 tasks running, CPU 100%; 3, 75%; 2, 50%; and 1 task (WU) running, CPU = 25%. That is on a quad-core 3 GHz Phenom II processor. So 1 GPU requires 1 CPU. Gah!

Interestingly, it takes several hours (4-8) to run an AP, so I am getting roughly 100 credits/hour, or about 2000-2500 a day. That's about half what my GT 240s get. This is with BOINC saying GT 240 - max 257 GFLOPS, HD 5550 - max 352 GFLOPS.

So still no solution to my "overCPUing"; I will try backing off to Cat 11.2 tomorrow. Right now, I have to make sure 3 old dual Athlon MP machines are working, as I am giving them away tomorrow morning...
JLConawayII

Joined: 2 Apr 02
Posts: 188
Credit: 2,840,460
RAC: 0
United States
Message 1095521 - Posted: 10 Apr 2011, 0:23:56 UTC

I was thrilled to get 486 WUs from MW@home. I thought, finally they got rid of the core-based limitation. Then the project went *splat*, and I was sad. :o(
unnefer

Joined: 23 Jan 11
Posts: 5
Credit: 83,513
RAC: 0
Australia
Message 1095590 - Posted: 10 Apr 2011, 6:34:47 UTC - in response to Message 1095336.  
Last modified: 10 Apr 2011, 6:53:20 UTC

unnefer, your app_info and preferences are fine; you just have to wait. AP is fairly rare: you might ask for work 20 times and get none, then get 20 AP tasks on the 21st attempt.

Ah ok thanks for that.

I just added the MultiBeam 6.10 r177 GPU app as well now, and at least it's downloading the MultiBeam enhanced GPU tasks :)

I changed the "count" to 0.5 so it would crunch 2 workunits per GPU, and for me, that seems to offer the best estimated speed per instance.

I also had an issue with it using 100% CPU usage and only ~70% GPU usage, so I changed "instances per device" to 1 and "period_iterations_num" to 4 and that dropped CPU usage down to 55% and increased GPU usage to ~94% :D

For anyone else who wanted to crunch GPU only and never quite got it working, here is my app_info.xml file for reference:
<app_info>
    <app>
        <name>setiathome_enhanced</name>
    </app>
    <file_info>
        <name>MB_6.10_win_SSE3_ATI_HD5_r177.exe</name>
        <executable/>
    </file_info>
    <file_info>
        <name>MultiBeam_Kernels.cl</name>
        <executable/>
    </file_info>
    <app_version>
        <app_name>setiathome_enhanced</app_name>
        <version_num>610</version_num>
        <platform>windows_intelx86</platform>
        <avg_ncpus>0.05</avg_ncpus>
        <max_ncpus>0.05</max_ncpus>
        <plan_class>ati13ati</plan_class>
        <cmdline>-period_iterations_num 4 -instances_per_device 1</cmdline>
        <flops>20987654321</flops>
        <file_ref>
            <file_name>MB_6.10_win_SSE3_ATI_HD5_r177.exe</file_name>
            <main_program/>
        </file_ref>
        <file_ref>
            <file_name>MultiBeam_Kernels.cl</file_name>
            <copy_file/>
        </file_ref>
        <coproc>
            <type>ATI</type>
            <count>0.5</count>
        </coproc>
    </app_version>

    <app>
        <name>astropulse_v505</name>
    </app>
    <file_info>
        <name>ap_5.06_win_x86_SSE2_OpenCL_ATI_r516.exe</name>
        <executable/>
    </file_info>
    <file_info>
        <name>AstroPulse_Kernels.cl</name>
        <executable/>
    </file_info>
    <app_version>
        <app_name>astropulse_v505</app_name>
        <version_num>506</version_num>
        <platform>windows_intelx86</platform>
        <avg_ncpus>0.05</avg_ncpus>
        <max_ncpus>0.05</max_ncpus>
        <plan_class>ati13ati</plan_class>
        <cmdline>-instances_per_device 1 -hp -unroll 10 -ffa_block 4096 -ffa_block_fetch 2048</cmdline>
        <flops>30987654321</flops>
        <file_ref>
            <file_name>ap_5.06_win_x86_SSE2_OpenCL_ATI_r516.exe</file_name>
            <main_program/>
        </file_ref>
        <file_ref>
            <file_name>AstroPulse_Kernels.cl</file_name>
            <copy_file/>
        </file_ref>
        <coproc>
            <type>ATI</type>
            <count>0.5</count>
        </coproc>
    </app_version>
</app_info>


For it to work, you'll need to download the r177 MultiBeam GPU-only app files from HERE and the r516 AstroPulse GPU-only app files from HERE, then add them to your SETI project folder.

Also note, I have ATI HD5850 GPUs, so I am using the HD5 version of the app (MB_6.10_win_SSE3_ATI_HD5_r177.exe) for MultiBeam. If you don't have an HD5-series or HD69XX-series GPU, or you experience errors or driver restarts, then use the default app (MB_6.10_win_SSE3_ATI_r177.exe) for MultiBeam instead.
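
If you do make that swap, only the executable name changes in app_info.xml; a minimal sketch of the affected entries (everything else stays as in the listing above):

    <!-- Sketch only: default (non-HD5) MultiBeam build swapped in. -->
    <file_info>
        <name>MB_6.10_win_SSE3_ATI_r177.exe</name>
        <executable/>
    </file_info>
    <!-- ...and the matching reference inside the app_version block: -->
    <file_ref>
        <file_name>MB_6.10_win_SSE3_ATI_r177.exe</file_name>
        <main_program/>
    </file_ref>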

make sure you have this line in your startup messages:

09/04/2011 19:42:34 SETI@home Found app_info.xml; using anonymous platform

and that you're actually asking for GPU work,

Claggy

Cheers Claggy. Yeah, I was getting that message, but nothing ever downloaded. As soon as I added the r177 MultiBeam GPU app, though, it started downloading tasks and crunching :)

How long should a MultiBeam GPU workunit take on average using an HD5850?

Cheers,
Leslie
unnefer

Joined: 23 Jan 11
Posts: 5
Credit: 83,513
RAC: 0
Australia
Message 1095597 - Posted: 10 Apr 2011, 7:47:27 UTC - in response to Message 1095590.  
Last modified: 10 Apr 2011, 7:48:57 UTC


I changed the "count" to 0.5 so it would crunch 2 workunits per GPU, and for me, that seems to offer the best estimated speed per instance.

I also had an issue with it using 100% CPU usage and only ~70% GPU usage, so I changed "instances per device" to 1 and "period_iterations_num" to 4 and that dropped CPU usage down to 55% and increased GPU usage to ~94% :D

I made three errors with the above comments.


  1. It seems like the main thing that drops CPU usage is the "instances per device" value. When set to 1, CPU usage drops to ~27% per GPU; I have 2 GPUs, so total CPU usage ends up at ~55%. When set to 2, CPU usage sat at 100% and GPU usage actually dropped down to ~70%!
  2. Changing "period_iterations_num" to 4 did not do anything. I also tried values from 1 through 10 and again, I didn't notice any difference.
  3. Even though I set "count" to 0.5 and 4 workunits looked to be crunching, because I also had "instances per device" set to 1, only 1 workunit per GPU was actually being crunched.



So anyhoo, if anyone knows how to decrease CPU usage so that I can run 2x instances on each GPU without topping out at 100% CPU usage, that would be great.

Claggy
Volunteer tester

Joined: 5 Jul 99
Posts: 4654
Credit: 47,537,079
RAC: 4
United Kingdom
Message 1095605 - Posted: 10 Apr 2011, 8:30:27 UTC - in response to Message 1095597.  
Last modified: 10 Apr 2011, 8:34:26 UTC


I changed the "count" to 0.5 so it would crunch 2 workunits per GPU, and for me, that seems to offer the best estimated speed per instance.

I also had an issue with it using 100% CPU usage and only ~70% GPU usage, so I changed "instances per device" to 1 and "period_iterations_num" to 4 and that dropped CPU usage down to 55% and increased GPU usage to ~94% :D

I made three errors with the above comments.


  1. It seems like the main thing that drops CPU usage is the "instances per device" value. When set to 1, CPU usage drops to ~27% per GPU; I have 2 GPUs, so total CPU usage ends up at ~55%. When set to 2, CPU usage sat at 100% and GPU usage actually dropped down to ~70%!
  2. Changing "period_iterations_num" to 4 did not do anything. I also tried values from 1 through 10 and again, I didn't notice any difference.
  3. Even though I set "count" to 0.5 and 4 workunits looked to be crunching, because I also had "instances per device" set to 1, only 1 workunit per GPU was actually being crunched.



So anyhoo, if anyone knows how to decrease CPU usage so that I can run 2x instances on each GPU without topping out at 100% CPU usage, that would be great.


Getting two WUs to run on a device is a two-part operation: the count value and the -instances_per_device value both need to be changed.

The period_iterations_num value splits the single longest PulseFind kernel calls; if you get a laggy GUI or driver restarts, increase this value.
There is no need to change to the non-HD5 version if you get driver restarts.
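
As a rough sketch of what those two coordinated changes look like in app_info.xml (values are illustrative; the rest of the app_version block stays as posted earlier):

    <!-- Sketch: two tasks per GPU. BOINC's count and the app's switch must agree. -->
    <cmdline>-period_iterations_num 4 -instances_per_device 2</cmdline>
    <coproc>
        <type>ATI</type>
        <count>0.5</count>
    </coproc>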

There is no way of lowering CPU usage by adjusting settings; the app will use what it needs.
The high CPU usage might be a byproduct of the speedups in SDK 2.4 RC1; that's why I suggested jravin try Cat 11.2/SDK 2.3 to see if that reduced his CPU usage.

Claggy
Profile Mike Special Project $75 donor
Volunteer tester
Joined: 17 Feb 01
Posts: 34354
Credit: 79,922,639
RAC: 80
Germany
Message 1095644 - Posted: 10 Apr 2011, 12:51:59 UTC
Last modified: 10 Apr 2011, 12:52:44 UTC

Would make sense.

The new version is called APP instead of SDK:

AMD parallel processing


With each crime and every kindness we birth our future.
Profile HAL9000
Volunteer tester
Joined: 11 Sep 99
Posts: 6534
Credit: 196,805,888
RAC: 57
United States
Message 1095646 - Posted: 10 Apr 2011, 13:08:32 UTC - in response to Message 1095644.  

Would make sense.

The new version is called APP instead of SDK:

AMD parallel processing

SDK normally stands for Software Development Kit.
SETI@home classic workunits: 93,865 CPU time: 863,447 hours
Join the [url=http://tinyurl.com/8y46zvu]BP6/VP6 User Group[/url]
unnefer

Joined: 23 Jan 11
Posts: 5
Credit: 83,513
RAC: 0
Australia
Message 1095647 - Posted: 10 Apr 2011, 13:10:44 UTC
Last modified: 10 Apr 2011, 13:15:08 UTC

np Claggy. I'm happy with a single workunit per GPU and ~55% CPU usage. The workunits are being processed pretty quickly - my HD5850s complete a workunit in 30-40 mins.

Interestingly, I also have an HD4870 running the HD5 app and it is working fine - it completes a workunit in 60-80 mins. I didn't think it would run the HD5 app tbh, but it's working without any errors or driver restarts for now.

I'm not too bothered by the higher CPU usage, and I can still crunch some CPU-only workunits without too much impact now.
Claggy
Volunteer tester

Joined: 5 Jul 99
Posts: 4654
Credit: 47,537,079
RAC: 4
United Kingdom
Message 1095648 - Posted: 10 Apr 2011, 13:37:35 UTC - in response to Message 1095644.  

Would make sense.

The new version is called APP instead of SDK:

AMD parallel processing

Yep, it's called APP (Accelerated Parallel Processing) now, and it's the AMD APP SDK instead of the ATI Stream SDK:

Name: Juniper
Vendor: Advanced Micro Devices, Inc.
Driver version: CAL 1.4.1332
Version: OpenCL 1.1 AMD-APP-SDK-v2.4-rc1 (595.9)

Claggy
kittyman · Crowdfunding Project Donor · Special Project $75 donor · Special Project $250 donor
Volunteer tester
Joined: 9 Jul 00
Posts: 51477
Credit: 1,018,363,574
RAC: 1,004
United States
Message 1095676 - Posted: 10 Apr 2011, 15:54:48 UTC - in response to Message 1095602.  



C'mon Mark, you know me better than that; no doubt you saw the smiley after my first post! I just thought we might get some interesting comments ;-)

I note that a third ATI project also went down yesterday for a while, namely PrimeGrid. Not a good week for ATI folks, but I expect it'll sort itself out. Anyhow, these are the SETI boards, so I guess we'd better stick to SETI stuff.

But of course....
I was just poking too.
No doubt the ATI cards will come into their own on SETI as development work on the apps for them continues.
The main reason that NV has a leg up on them right now is the development work that NV helped with to get things rolling.
Many claim that the ATI cards are superior on other projects.
Only a matter of time until they are able to show what they can do here.
"Time is simply the mechanism that keeps everything from happening all at once."

Profile Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 6325
Credit: 106,370,077
RAC: 121
Russia
Message 1095703 - Posted: 10 Apr 2011, 16:53:09 UTC

@unnefer
Change count back to 1 or you will get a -177 error eventually.
Right now you are telling BOINC to run 2 instances per GPU while telling the app to process only a single task at once.
That is, 2 tasks get launched: one processes while the second sits suspended. BOINC increases the elapsed time for both, so at best you will get incorrect run times, or -177 errors if the second task stays suspended for too long.
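
In app_info.xml terms, a sketch of the consistent single-task-per-GPU setup being described here (illustrative fragment; the rest of the app_version block is unchanged):

    <!-- Sketch: one task per GPU, with BOINC's count and the app's switch in agreement. -->
    <cmdline>-period_iterations_num 4 -instances_per_device 1</cmdline>
    <coproc>
        <type>ATI</type>
        <count>1</count>
    </coproc>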
unnefer

Joined: 23 Jan 11
Posts: 5
Credit: 83,513
RAC: 0
Australia
Message 1095994 - Posted: 11 Apr 2011, 6:23:04 UTC - in response to Message 1095703.  

@unnefer
Change count back to 1 or you will get a -177 error eventually.
Right now you are telling BOINC to run 2 instances per GPU while telling the app to process only a single task at once.
That is, 2 tasks get launched: one processes while the second sits suspended. BOINC increases the elapsed time for both, so at best you will get incorrect run times, or -177 errors if the second task stays suspended for too long.

Cheers, yeah I noticed that after I initially set up my app_info.xml file as per my original post - I have since changed it so my "count" is 1 and my "instances_per_device" is 1 :)

So now it only works on a single workunit per GPU and everything is working ok now :)


I tried to edit my earlier post to correct those couple of things, but it won't let me edit it?
Profile Mike Special Project $75 donor
Volunteer tester
Joined: 17 Feb 01
Posts: 34354
Credit: 79,922,639
RAC: 80
Germany
Message 1096009 - Posted: 11 Apr 2011, 7:10:46 UTC

You can only edit a post within 1 hour after it was created.



With each crime and every kindness we birth our future.
Profile Fred J. Verster
Volunteer tester
Joined: 21 Apr 04
Posts: 3252
Credit: 31,903,643
RAC: 0
Netherlands
Message 1096979 - Posted: 13 Apr 2011, 22:37:40 UTC - in response to Message 1096009.  
Last modified: 13 Apr 2011, 22:47:18 UTC

Is it better to run 2 WUs on one HD5870? I have two of them, plus an i7-2600. (rev. 177?)
Getting a little late, 'summer time'; nice in the summer, not now!
I will take a look at Lunatics and try to choose the right apps; should be no trouble, only time.
And a complete app_info.xml file: running the 0.37 installer gives the optimized apps, but also the wrong app for an OpenCL HD5870, so I will add this info.
I run BOINC 6.10.60 (64-bit); a VLAR-marked WU goes to the CPU, but if it's an older WU and doesn't have the marking, what happens? Or can r177 and a 5000-series GPU also crunch a VLAR?
Profile Mike Special Project $75 donor
Volunteer tester
Joined: 17 Feb 01
Posts: 34354
Credit: 79,922,639
RAC: 80
Germany
Message 1096980 - Posted: 13 Apr 2011, 22:41:33 UTC - in response to Message 1096979.  

Is it better to run 2 WUs on one HD5870? I have two of them, plus an i7-2600.
Getting a little late, 'summer time'; nice in the summer, not now!
I will take a look at Lunatics and try to choose the right apps; should be no trouble, only time.


I would say so Fred.

I run 2 on my 5850 with no problems.



With each crime and every kindness we birth our future.
Profile Fred J. Verster
Volunteer tester
Joined: 21 Apr 04
Posts: 3252
Credit: 31,903,643
RAC: 0
Netherlands
Message 1096983 - Posted: 13 Apr 2011, 22:51:04 UTC - in response to Message 1096980.  

Well, then there should be no single reason not to put these apps in their proper place.
I'll put Collatz C. on N.N.T. Sometimes one has to set his priorities ;).

