GPU not running at 100%



Message boards : Number crunching : GPU not running at 100%

1 · 2 · Next
Author Message
Profile David Steel
Volunteer tester
Joined: 12 Jul 01
Posts: 52
Credit: 17,340,196
RAC: 11,172
Australia
Message 1219786 - Posted: 18 Apr 2012, 3:27:41 UTC

Installed Lunatics on a second PC to crunch on a HD6970....

Getting tasks ok but the GPU is only running in bursts from zero to about 25%.

Can anyone help with this please?
____________

Profile Slavac
Volunteer tester
Joined: 27 Apr 11
Posts: 1932
Credit: 17,952,639
RAC: 0
United States
Message 1219795 - Posted: 18 Apr 2012, 3:39:17 UTC - in response to Message 1219786.

How many tasks per card?

Are you OCing at all in a way that might be causing downclocks?
____________


Executive Director GPU Users Group Inc. -
brad@gpuug.org

Profile David Steel
Volunteer tester
Joined: 12 Jul 01
Posts: 52
Credit: 17,340,196
RAC: 11,172
Australia
Message 1219804 - Posted: 18 Apr 2012, 3:48:38 UTC - in response to Message 1219795.

Only 1 task per card; I haven't changed anything in Lunatics.

No OCing, everything is at standard settings...


____________

Profile Mike (Project donor)
Volunteer tester
Joined: 17 Feb 01
Posts: 24017
Credit: 32,972,797
RAC: 23,389
Germany
Message 1219809 - Posted: 18 Apr 2012, 3:54:21 UTC

That's the low GPU usage bug in the drivers.
Suspend the task for a few seconds and resume it again.
That should help.

____________

Profile David Steel
Volunteer tester
Joined: 12 Jul 01
Posts: 52
Credit: 17,340,196
RAC: 11,172
Australia
Message 1219829 - Posted: 18 Apr 2012, 4:41:43 UTC - in response to Message 1219809.

Is there a fix for this bug?

Do I have to suspend and resume every new task or every time I reboot?

Is it a bug in the Catalyst drivers? I have version 12.3.
____________

Profile Mike (Project donor)
Volunteer tester
Joined: 17 Feb 01
Posts: 24017
Credit: 32,972,797
RAC: 23,389
Germany
Message 1219857 - Posted: 18 Apr 2012, 7:06:06 UTC

Yes, it's a bug in the Catalyst drivers.

Sometimes suspending the GPU task cures it for a few units in a row.
Sometimes only for 1 unit.
You need to check every once in a while.

____________

Profile David Steel
Volunteer tester
Joined: 12 Jul 01
Posts: 52
Credit: 17,340,196
RAC: 11,172
Australia
Message 1219946 - Posted: 18 Apr 2012, 14:43:34 UTC - in response to Message 1219857.

Um... not exactly the way I want to crunch.

Who wants to babysit their PC, suspending and resuming every 15 mins...

Is this what you and everyone else does with their ATI card?

What version of Catalyst did it start in, and are we stuck with this crap forever?


____________

Profile ignorance is no excuse
Joined: 4 Oct 00
Posts: 9529
Credit: 44,433,274
RAC: 0
Korea, North
Message 1219993 - Posted: 18 Apr 2012, 18:18:26 UTC - in response to Message 1219946.

You won't get 100% usage out of an ATI 6970 running 1 WU.

Try running 3 at a time.
____________
In a rich man's house there is no place to spit but his face.
Diogenes Of Sinope

End terrorism by building a school

Profile Mike (Project donor)
Volunteer tester
Joined: 17 Feb 01
Posts: 24017
Credit: 32,972,797
RAC: 23,389
Germany
Message 1220036 - Posted: 18 Apr 2012, 20:20:07 UTC - in response to Message 1219946.
Last modified: 18 Apr 2012, 20:21:52 UTC

Um... not exactly the way I want to crunch.

Who wants to babysit their PC, suspending and resuming every 15 mins...

Is this what you and everyone else does with their ATI card?

What version of Catalyst did it start in, and are we stuck with this crap forever?



This bug started with the 11.12 driver, IIRC.
I really hope this will be fixed when OpenCL 1.2 is fully implemented.

Like skildude mentioned, you can get better performance out of your card by running 2 or 3 units at the same time.
I'm running 2 atm on my 5850.
You need to modify your app_info.

Edit
For 2 instances
<cmdline>-help -period_iterations_num 20 -instances_per_device 2</cmdline>
<count>0.5</count>
For 3 it's of course count 0.33.

Don't forget to shut down BOINC before you change it.
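In context, those two lines sit inside an app_version block of app_info.xml; a minimal sketch of the relevant fragment for the 2-instance case (surrounding tags omitted, as in the stock Lunatics app_info):

```xml
<!-- Sketch only: the rest of the app_version block stays unchanged;
     just the cmdline and the coproc count need editing. -->
<cmdline>-help -period_iterations_num 20 -instances_per_device 2</cmdline>
<coproc>
  <type>ATI</type>
  <count>0.5</count>
</coproc>
```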
____________

Profile HAL9000
Volunteer tester
Joined: 11 Sep 99
Posts: 4167
Credit: 113,962,446
RAC: 144,483
United States
Message 1220045 - Posted: 18 Apr 2012, 20:48:21 UTC

I suspended my home crunching with the warmer weather coming in. However, I was using 12.1 on my 6870 without downclocking. I did get a task stuck now and then, where the time continued to increment without any progress being made. I think there were 3 or 4 over 6 months or so.
____________
SETI@home classic workunits: 93,865 CPU time: 863,447 hours

Join the BP6/VP6 User Group today!

Profile Mike (Project donor)
Volunteer tester
Joined: 17 Feb 01
Posts: 24017
Credit: 32,972,797
RAC: 23,389
Germany
Message 1220047 - Posted: 18 Apr 2012, 20:53:17 UTC

It's not downclocking, HAL.
It's just low GPU utilisation.

____________

Profile David Steel
Volunteer tester
Joined: 12 Jul 01
Posts: 52
Credit: 17,340,196
RAC: 11,172
Australia
Message 1220147 - Posted: 19 Apr 2012, 4:19:49 UTC - in response to Message 1220047.

Ok, this is from my app_info...

Mike, do you have to do the suspend/resume thing on your 5850, or does it run at 100% by itself?

Do I need to change the <count>1</count> to 0.50 and the -instances_per_device from 1 to 2, or just the count?

Also, this is in there twice for SETI Enhanced, twice for AstroPulse v6 and four times for AstroPulse v505. Do I have to change them all?


<app_version>
  <app_name>setiathome_enhanced</app_name>
  <version_num>610</version_num>
  <platform>windows_x86_64</platform>
  <avg_ncpus>0.05</avg_ncpus>
  <max_ncpus>0.05</max_ncpus>
  <plan_class>ati13ati</plan_class>
  <cmdline>-period_iterations_num 20 -instances_per_device 1</cmdline>
  <coproc>
    <type>ATI</type>
    <count>1</count>
  </coproc>
  <file_ref>
    <file_name>MB6_win_x86_SSE3_OpenCL_ATi_HD5_r390.exe</file_name>
    <main_program/>
  </file_ref>
  <file_ref>
    <file_name>MultiBeam_Kernels_r390.cl</file_name>
    <copy_file/>
  </file_ref>
</app_version>
</app_info>

____________

Profile Mike (Project donor)
Volunteer tester
Joined: 17 Feb 01
Posts: 24017
Credit: 32,972,797
RAC: 23,389
Germany
Message 1220179 - Posted: 19 Apr 2012, 5:47:31 UTC
Last modified: 19 Apr 2012, 5:51:15 UTC

If you want to run 2 instances you need to change both: instances_per_device to 2 and count to 0.5.
That's different to the CUDA app, because the ATI app is OpenCL.

I need to suspend the app every once in a while as well.

But running 2 or 3 instances is not so bad, because mostly at least 1 unit gets enough utilisation.
I'd strongly suggest 3 instances on your 6970.
So instances_per_device 3 and count 0.33.
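Applied to an app_version block like the one posted above, the 3-instance edit would look roughly like this (illustrative fragment; only the cmdline and count lines change):

```xml
<!-- Illustrative sketch based on the posted app_info fragment. -->
<plan_class>ati13ati</plan_class>
<cmdline>-period_iterations_num 20 -instances_per_device 3</cmdline>
<coproc>
  <type>ATI</type>
  <count>0.33</count>
</coproc>
```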

You can change it as well on the astropulse apps.
____________

Profile David Steel
Volunteer tester
Joined: 12 Jul 01
Posts: 52
Credit: 17,340,196
RAC: 11,172
Australia
Message 1220201 - Posted: 19 Apr 2012, 7:11:01 UTC - in response to Message 1220179.

Ok, thanks Mike.

I have changed Lunatics to run 3 WUs at once; the GPU is running at about 90% and sometimes up to 98%.

When it's doing 3 work units (SETI@home Enhanced 6.10), the GPU seems to be giving 1 work unit about 80% and the other 2 about 5% each... one finishes in about 3 mins and the others take about 30 mins.

Is this normal?
____________

Profile Mike (Project donor)
Volunteer tester
Joined: 17 Feb 01
Posts: 24017
Credit: 32,972,797
RAC: 23,389
Germany
Message 1220210 - Posted: 19 Apr 2012, 8:42:27 UTC

Yes, that's normal.

But under these conditions you get better output.


____________

Profile David Steel
Volunteer tester
Joined: 12 Jul 01
Posts: 52
Credit: 17,340,196
RAC: 11,172
Australia
Message 1220228 - Posted: 19 Apr 2012, 10:55:20 UTC - in response to Message 1220210.

Well I just ran 3 x AstroPulse v6.

All started at the same time,

First: 53 mins
Second: 150 mins
Third: 260 mins

Doesn't like these as much, the GPU would only run at 65%...

____________

Profile Mike (Project donor)
Volunteer tester
Joined: 17 Feb 01
Posts: 24017
Credit: 32,972,797
RAC: 23,389
Germany
Message 1220249 - Posted: 19 Apr 2012, 12:14:38 UTC - in response to Message 1220228.
Last modified: 19 Apr 2012, 12:20:45 UTC

Well I just ran 3 x AstroPulse v6.

All started at the same time,

First: 53 mins
Second: 150 mins
Third: 260 mins

Doesn't like these as much, the GPU would only run at 65%...


Those are not representative, because 3 of them are overflow units.
They stop working after 30 pulses are found.

At least you did 4 APs in 4 hours.
Not so shabby, I would say.

The next step is to increase ffa_block and ffa_block_fetch.
Change the command-line params in the AstroPulse section.

<cmdline>-instances_per_device 3 -unroll 12 -ffa_block 8192 -ffa_block_fetch 4096</cmdline>

This will speed up your card further.

Keep in mind that on AstroPulse, low GPU usage is not always caused by the low-GPU-usage bug.
Go to your Task Manager and check how much CPU is in use by the AP app.
If it's above 3% it's a heavily blanked unit, so low GPU usage is normal, because blanking is calculated by the CPU only.
____________

Profile David Steel
Volunteer tester
Joined: 12 Jul 01
Posts: 52
Credit: 17,340,196
RAC: 11,172
Australia
Message 1220270 - Posted: 19 Apr 2012, 13:03:56 UTC - in response to Message 1220249.

My CPUs are normally always at 100% because I'm crunching PrimeGrid on them.

But I will suspend PG for a while and see what happens with the AstroPulse tasks.

This is what I currently have but I will change it:

<cmdline>-instances_per_device 3 -unroll 4 -ffa_block 2048 -ffa_block_fetch 1024 -sbs 128</cmdline>

What does that do to make the card faster?
____________

Profile Mike (Project donor)
Volunteer tester
Joined: 17 Feb 01
Posts: 24017
Credit: 32,972,797
RAC: 23,389
Germany
Message 1220273 - Posted: 19 Apr 2012, 13:12:57 UTC - in response to Message 1220270.

My CPUs are normally always at 100% because I'm crunching PrimeGrid on them.

But I will suspend PG for a while and see what happens with the AstroPulse tasks.

This is what I currently have but I will change it:

<cmdline>-instances_per_device 3 -unroll 4 -ffa_block 2048 -ffa_block_fetch 1024 -sbs 128</cmdline>

What does that do to make the card faster?


I know what you are using.

A higher unroll will simply load more data at once in one kernel call.
More VRAM in use, of course.
A higher ffa_block will compute them faster.

Of course Raistmer can explain it better, for sure.

____________

Profile David Steel
Volunteer tester
Joined: 12 Jul 01
Posts: 52
Credit: 17,340,196
RAC: 11,172
Australia
Message 1224206 - Posted: 28 Apr 2012, 3:03:02 UTC - in response to Message 1220273.

Hey Mike, do I need to remove -sbs 128 from the <cmdline>?

Also, I have Lunatics installed on another PC running a GTX 590.

That was installed about a year ago; is there a newer version I need to update to, or is the old one ok?

And do you know what to change in the app_info to maximise the 590? I think it's only running 2 WUs at a time at about 80%. Thanks
____________



Copyright © 2014 University of California