How many tasks can a GPU do at once?

Message boards : Number crunching : How many tasks can a GPU do at once?

Super Nova Nerd
Joined: 12 Feb 16
Posts: 73
Credit: 2,915,787
RAC: 0
United States
Message 1768239 - Posted: 27 Feb 2016, 23:48:16 UTC
Last modified: 27 Feb 2016, 23:48:54 UTC

I have an MSI R9 270 with the GPU core overclocked from 955 to 1100 MHz and the memory from 1400 to 1500 MHz. I let it run one task at a time as a test, and it was stable for 2 full days with no errors. I then raised it to 2 tasks, and it has run like that for about 5 days with zero problems. I would like to see whether I can raise that to 3. I know there is a point of diminishing returns and this is not exactly a cutting-edge GPU, but I want to see how far I can push it.

spitfire_mk_2
Joined: 14 Apr 00
Posts: 563
Credit: 27,306,885
RAC: 0
United States
Message 1768323 - Posted: 28 Feb 2016, 7:27:22 UTC - in response to Message 1768239.  

Use GPU-Z to monitor how busy your card is. Look at the GPU Load graph. Most people try to keep it at 90-95%.

rob smith
Volunteer moderator
Volunteer tester
Joined: 7 Mar 03
Posts: 22204
Credit: 416,307,556
RAC: 380
United Kingdom
Message 1768332 - Posted: 28 Feb 2016, 8:34:13 UTC

There are two things to consider.
First is GPU loading: this will increase to just below 100% and then sit there as the number of tasks increases.
Second is task throughput (the number of tasks completed in a given time period): this will improve as the number of concurrent tasks increases up to a point, and will then drop rapidly.

For most GPUs in most systems the sweet spot for throughput is either two or three. However, the number at which GPU usage maximizes is probably one or two higher than that.
Needless to say, the "type" of task being done will have an influence, so let the system stabilise for several days (a week at least) before making each step. This allows a "representative" selection of tasks, rather than a pile of one type that may skew the data.
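
The throughput comparison described above can be sketched in a few lines of Python. This is only an illustration: the function names are made up, and the run times in the example are hypothetical numbers, not measurements from any card in this thread.

```python
def tasks_per_hour(concurrent_tasks, avg_minutes_per_task):
    """Throughput when `concurrent_tasks` run side by side and each one
    takes `avg_minutes_per_task` of wall-clock time on average."""
    if concurrent_tasks < 1 or avg_minutes_per_task <= 0:
        raise ValueError("need at least one task and a positive run time")
    return concurrent_tasks * 60.0 / avg_minutes_per_task


def best_setting(avg_minutes_by_count):
    """Return the concurrency level with the highest throughput.

    `avg_minutes_by_count` maps tasks-at-once -> average minutes per task,
    ideally averaged over a week or so of mixed work for each setting.
    """
    return max(avg_minutes_by_count,
               key=lambda n: tasks_per_hour(n, avg_minutes_by_count[n]))


# Hypothetical numbers, for illustration only.
example = {1: 20.0, 2: 32.0, 3: 54.0}
for n, minutes in sorted(example.items()):
    print(n, "at a time:", round(tasks_per_hour(n, minutes), 2), "tasks/hour")
print("best:", best_setting(example))
```

The point of the comparison is that each task gets slower as more run at once, so a higher count only wins while the per-task slowdown stays smaller than the gain in concurrency.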

As far as I'm aware, the reason is that on most GPUs the bottleneck isn't the processing but the data transfer within the GPU and between the GPU and the host.

It will be interesting to see what effect newer APIs like "Vulkan" and compute technologies like "SoG" have on these sweet spots.
Bob Smith
Member of Seti PIPPS (Pluto is a Planet Protest Society)
Somewhere in the (un)known Universe?

Mike
Volunteer tester
Joined: 17 Feb 01
Posts: 34258
Credit: 79,922,639
RAC: 80
Germany
Message 1768335 - Posted: 28 Feb 2016, 9:33:33 UTC
Last modified: 28 Feb 2016, 9:34:07 UTC

3 instances on a 270 is too much.
On my 380 it's only a small benefit.
So I stick with 2 instances for now.


With each crime and every kindness we birth our future.

iwajabitw
Joined: 18 Mar 00
Posts: 19
Credit: 9,285,966
RAC: 0
United States
Message 1768371 - Posted: 28 Feb 2016, 16:30:58 UTC

Running 3 on a GTX 980.

Cruncher-American
Joined: 25 Mar 02
Posts: 1513
Credit: 370,893,186
RAC: 340
United States
Message 1768379 - Posted: 28 Feb 2016, 17:35:05 UTC - in response to Message 1768371.  

Quoting Message 1768371: "Running 3 on a GTX 980."

On my 2 systems, both of which have dual GTX 980s, I find that running 3 per GPU gives slightly more RAC than 2. It's hard to tell how much because of the varying character of the workloads (I do v8 and AP on mine), but it is barely 10% better, if that. My CPUs are an i7-4790K and an i7-4820K, so not underpowered there, either.

Mike
Volunteer tester
Joined: 17 Feb 01
Posts: 34258
Credit: 79,922,639
RAC: 80
Germany
Message 1768380 - Posted: 28 Feb 2016, 17:38:34 UTC - in response to Message 1768371.  

Quoting Message 1768371: "Running 3 on a GTX 980."

The 980 is a high-end GPU.
The 270 is mid-range at best.
You can't compare those.


With each crime and every kindness we birth our future.

Joe Januzzi
Volunteer tester
Joined: 13 Apr 03
Posts: 54
Credit: 307,134,110
RAC: 492
United States
Message 1768402 - Posted: 28 Feb 2016, 18:48:33 UTC

Both my systems run v8 and AP. Whether v8 runs as Cuda or SoG can affect how many WUs a card can handle, depending on the CPU.

First system: i7-3770K with a mix of 4 GPUs (not using the iGPU). The CPU only feeds the GPUs.
Note: I run 3 @ a time, whether running Cuda or SoG. On this system, SoG wins easily :-)

GTX 980 3 @ a time.
GTX 980 3 @ a time.
GTX 780 3 @ a time.
GTX 960 3 @ a time.
Total of 12 WUs crunching.

Running SoG adds high CPU usage at times. For brief periods my CPU is at 100% (starving the GPUs?). It's mostly between 40% and 90%, depending on AR. On this system, when doing Cuda, it was my GTX 960 keeping me from trying more WUs @ a time. Now, doing SoG, I add the lack of CPU cores to the mix. Once my RAC stops climbing, I might pull the 960 out and try more WUs at a time on the other GPUs. Time will tell. If we could control WUs per card, the 960 would be doing 2.

Second system: Intel Q9550 with 1 GPU.
GTX 560 Ti 2 @ a time. (Cuda)
CPU 1 @ a time.
Total of 3 WUs crunching.

Once RAC stops climbing, I might try 2 CPU WUs. Then try SoG.

I also run APs when we get them. Usually when I run them, it's only 1 WU per system, keeping the crunching total the same.

YMMV
Joe

Real Join Date:
Joe Januzzi (ID 253343) 29 Sep 1999, 22:30:36 UTC
Try to learn something new everyday.

Super Nova Nerd
Joined: 12 Feb 16
Posts: 73
Credit: 2,915,787
RAC: 0
United States
Message 1768515 - Posted: 29 Feb 2016, 2:42:29 UTC
Last modified: 29 Feb 2016, 2:51:23 UTC

Thank you. I think I will stick with 2. This card is no 980, for sure. With the OC I have, it is more in the 760 Ti range. I may just get another of the same card and run 2 cards in that system. Then 4 tasks would not be a problem.

My system runs 2 GPU and 8 CPU tasks at a time. The CPU sits at 100% and the GPU at about 95%.

Darth Beaver
Joined: 20 Aug 99
Posts: 6728
Credit: 21,443,075
RAC: 3
Australia
Message 1768688 - Posted: 29 Feb 2016, 22:29:50 UTC - in response to Message 1768515.  

What CPU do you have?

Remember to leave 2 cores free if doing 2 APs on the GPU, and at least 1 core free for MBs.

Super Nova Nerd
Joined: 12 Feb 16
Posts: 73
Credit: 2,915,787
RAC: 0
United States
Message 1768707 - Posted: 29 Feb 2016, 23:47:12 UTC - in response to Message 1768688.  
Last modified: 29 Feb 2016, 23:50:34 UTC

I have an FX 8350 running at 4.7 GHz.

I wasn't worried about AP units because I never seem to get them on this one. Very few, anyway. Maybe I have it set up wrong. I'm very new to this.

In the app_config file I have each task set to 0.5 GPU and 0.1 CPU, and in the computing preferences I have the maximum CPU that SETI can use set to 100%.

Here is my app_config file:

<app_config>
  <app>
    <name>setiathome_v8</name>
    <gpu_versions>
      <gpu_usage>0.50</gpu_usage>
      <cpu_usage>0.1</cpu_usage>
    </gpu_versions>
  </app>

  <app>
    <name>astropulse_v7</name>
    <gpu_versions>
      <gpu_usage>0.50</gpu_usage>
      <cpu_usage>0.1</cpu_usage>
    </gpu_versions>
  </app>
</app_config>
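
To make the gpu_usage arithmetic explicit: a value of 0.50 says each task needs half a GPU, so two fit on one card, which matches the 2 GPU tasks reported above. Below is a minimal Python sketch using the standard xml.etree parser. The helper name tasks_per_gpu is made up, and the sketch assumes the client simply packs tasks until the gpu_usage fractions add up to one.

```python
import xml.etree.ElementTree as ET

APP_CONFIG = """
<app_config>
  <app>
    <name>setiathome_v8</name>
    <gpu_versions>
      <gpu_usage>0.50</gpu_usage>
      <cpu_usage>0.1</cpu_usage>
    </gpu_versions>
  </app>
  <app>
    <name>astropulse_v7</name>
    <gpu_versions>
      <gpu_usage>0.50</gpu_usage>
      <cpu_usage>0.1</cpu_usage>
    </gpu_versions>
  </app>
</app_config>
"""


def tasks_per_gpu(app_config_xml):
    """Map each app name to how many of its tasks fit on one GPU."""
    root = ET.fromstring(app_config_xml)
    result = {}
    for app in root.findall("app"):
        name = app.findtext("name")
        gpu_usage = float(app.findtext("gpu_versions/gpu_usage"))
        # Small epsilon guards against the division landing just below
        # a whole number because of floating-point rounding.
        result[name] = int(1.0 / gpu_usage + 1e-9)
    return result


print(tasks_per_gpu(APP_CONFIG))
```

Under the same assumption, trying 3 at a time would mean setting gpu_usage to roughly 0.33.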

Cruncher-American
Joined: 25 Mar 02
Posts: 1513
Credit: 370,893,186
RAC: 340
United States
Message 1768716 - Posted: 1 Mar 2016, 0:47:02 UTC
Last modified: 1 Mar 2016, 0:47:39 UTC

IIRC the FX-8350, while having 8 cores, has only 4 Floating Point units, so you might want to not be running so many threads (WUs) simultaneously.

Does anyone out there have a feel - or some data - that would either make me wrong, or quantify the use of cores vs. FPUs on this CPU?

Super Nova Nerd
Joined: 12 Feb 16
Posts: 73
Credit: 2,915,787
RAC: 0
United States
Message 1768720 - Posted: 1 Mar 2016, 1:15:30 UTC - in response to Message 1768716.  
Last modified: 1 Mar 2016, 1:16:11 UTC

The 8350 has four modules, each with two distinct processors, so each module can work on two threads simultaneously. Yes, there is only one floating-point scheduler per module, but it has two 128-bit FPU processing units, so it can execute two floating-point operations at the same time.

HAL9000
Volunteer tester
Joined: 11 Sep 99
Posts: 6534
Credit: 196,805,888
RAC: 57
United States
Message 1768722 - Posted: 1 Mar 2016, 1:31:10 UTC - in response to Message 1768707.  

If you are only running tasks on the GPU, the value of cpu_usage doesn't really matter. However, if you are also running tasks on the CPU, it's generally a good idea to allocate more CPU to the GPU app(s). Otherwise the GPU app(s) may run slower for lack of CPU resources.
Ideally you would want to use:
<gpu_usage>0.5</gpu_usage>
<cpu_usage>1.0</cpu_usage>
However, you may be able to get away with:
<gpu_usage>0.5</gpu_usage>
<cpu_usage>0.5</cpu_usage>
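
A rough way to see what these values do on an 8-core CPU such as the FX 8350 discussed in this thread. This Python sketch assumes a simplified model in which the client adds up cpu_usage over the running GPU tasks and gives up one CPU task for each whole core in that sum. The real BOINC scheduler is more involved, and the function name is made up.

```python
import math


def cpu_tasks_alongside_gpu(ncpus, gpu_tasks, cpu_usage):
    """Estimate how many CPU tasks still run next to the GPU tasks.

    Simplified assumption: the cores reserved for GPU tasks are the sum of
    their cpu_usage values, rounded down to whole cores.
    """
    reserved = math.floor(gpu_tasks * cpu_usage + 1e-9)
    return max(ncpus - reserved, 0)


for usage in (0.1, 0.5, 1.0):
    print(usage, "->", cpu_tasks_alongside_gpu(8, 2, usage), "CPU tasks")
```

Under this model, 0.1 leaves all 8 cores running CPU tasks (consistent with the 2 GPU plus 8 CPU tasks reported earlier in the thread), 0.5 frees one core, and 1.0 frees one core per GPU task.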
SETI@home classic workunits: 93,865 CPU time: 863,447 hours
Join the BP6/VP6 User Group: http://tinyurl.com/8y46zvu

Super Nova Nerd
Joined: 12 Feb 16
Posts: 73
Credit: 2,915,787
RAC: 0
United States
Message 1768723 - Posted: 1 Mar 2016, 1:42:35 UTC - in response to Message 1768722.  
Last modified: 1 Mar 2016, 1:43:03 UTC

Thanks Hal. I was kind of thinking the same thing with the .5. I don't think I need to go to 1 with this GPU anyway.

I just noticed I have 2 AP units running, neither of them on the GPU. I have 2 GPU tasks, plus the 2 AP units running on CPU only. I have never noticed that before. When they started, they were showing 3 days for time remaining, but it looks like they will take about 8 hours. One is almost done.

TBar
Volunteer tester
Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1768726 - Posted: 1 Mar 2016, 1:55:35 UTC - in response to Message 1768723.  

If you check your results, you should see the AMD Multibeams require More CPU than the AMD APs. Assuming you only need to free CPU cores for APs is Not correct when using ATI/AMD cards. You need to have free cores for All GPU tasks with AMD cards, more so with the MBs than the APs. Back with APv6, the type of blanking used would require almost a complete CPU core at times. That changed with APv7, which happened quite a while ago. Ever since APv7, the MBs have required more CPU than the APs on AMD cards.
I would suggest freeing at least one CPU core per GPU task, especially on an AMD CPU.

Super Nova Nerd
Joined: 12 Feb 16
Posts: 73
Credit: 2,915,787
RAC: 0
United States
Message 1768729 - Posted: 1 Mar 2016, 2:09:17 UTC - in response to Message 1768726.  
Last modified: 1 Mar 2016, 2:13:23 UTC

I am very new to this and know very little about it. What is multibeam? How do I know if a task was multibeam? AP is easy. It says it is ;)

All I see is this:

SETI@home v8
Anonymous platform (ATI GPU)

This:

SETI@home v8
Anonymous platform (CPU)

or this:

AstroPulse v7
Anonymous platform (ATI GPU)

Zalster
Volunteer tester
Joined: 27 May 99
Posts: 5517
Credit: 528,817,460
RAC: 242
United States
Message 1768731 - Posted: 1 Mar 2016, 2:19:48 UTC - in response to Message 1768729.  

There are 2 types of work unit at SETI.

AP, which you know because it says so.

MB is all the others: SETI@home v7 and v8.

Brent Norman
Volunteer tester
Joined: 1 Dec 99
Posts: 2786
Credit: 685,657,289
RAC: 835
Canada
Message 1768733 - Posted: 1 Mar 2016, 2:20:56 UTC - in response to Message 1768729.  

Anything that says seti is a multibeam task.

TBar
Volunteer tester
Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1768735 - Posted: 1 Mar 2016, 2:23:27 UTC - in response to Message 1768729.  

Look here: http://setiathome.berkeley.edu/sah_status.html
Under "Splitter status", Progress is listed under just two headings: Multibeam and Astropulse.

There are only two types of tasks; therefore, if it's not an AP, it has to be an MB.
SETI@home is just another name for MultiBeam.
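
The rule is simple enough to write down as a small Python sketch. The function name is made up, and it assumes the application names look like the ones quoted earlier in the thread ("SETI@home v8", "AstroPulse v7").

```python
def task_type(application_name):
    """Classify a task by the application name shown in the results list."""
    if "astropulse" in application_name.lower():
        return "AP"
    # Only two task types exist, so anything that is not AP is Multibeam.
    return "MB"


for name in ("SETI@home v8", "AstroPulse v7"):
    print(name, "->", task_type(name))
```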
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.