More than one WU per GPU at the same time ?

Profile Sunny129
Joined: 7 Nov 00
Posts: 190
Credit: 3,163,755
RAC: 0
United States
Message 1088726 - Posted: 20 Mar 2011, 5:30:06 UTC

ah, i forgot about that other line :). clearly i've only attained "Jedi Knight" status, and am far from being a Master lol. i edited the app info file, but apparently i'll have to wait to test it out again, b/c i'm currently all out of work and the server seems to have been having problems for the last 12 hours now. but i'll update and let you know how it goes when more work is available.
ID: 1088726 · Report as offensive
Profile TOM
Volunteer tester
Joined: 5 Apr 01
Posts: 53
Credit: 65,422,234
RAC: 86
Germany
Message 1088762 - Posted: 20 Mar 2011, 9:41:45 UTC - in response to Message 1088530.  
Last modified: 20 Mar 2011, 9:45:30 UTC

GPU-Z is a good tool for viewing memory and GPU utilisation. I've tested with 3 or 4 WUs and found that 3 WUs per GPU are sufficient. The load on the GPU is then around 70 to 95%, depending on the WU. You can also use the monitoring tools that come with your GPU, like EVGA Precision or MSI Afterburner. If you have a keyboard with an LC display, these tools can be plugged into the display so you can easily monitor the GPU load even without turning on the PC monitor.



TOM
ID: 1088762 · Report as offensive
Profile Sunny129
Joined: 7 Nov 00
Posts: 190
Credit: 3,163,755
RAC: 0
United States
Message 1105891 - Posted: 14 May 2011, 4:18:41 UTC

ok, so now that the S@H server issues have been [mostly] resolved, i've been able to do some "simultaneous task" testing. right now i'm just focusing on Multibeam tasks. as i mentioned previously, i have an ATI GPU, and its memory bandwidth consumption is not displayed by GPU-Z or any other software that i know of (unlike for some nVidia GPUs). not being able to use GPU memory bandwidth consumption as a guideline for how many simultaneous tasks my 5870 2GB GPU can handle, i naturally looked to GPU load per Bill's advice.

GPU load is good indicator - if you see 60% with 2 tasks you may try 3 to get 90-95% GPU load (but if you see 90% with 2 tasks the GPU is unlikely to handle more tasks)

but right away i noticed that GPU load wasn't scaling linearly with the number of tasks running. rather i was seeing GPU utilization in the low to mid 90's with one task running, and full GPU utilization (upper 90's) with both two and three simultaneous tasks running (as opposed to ~30% GPU load w/ 1 task, ~60% w/ 2 tasks, and ~90% w/ 3 tasks). based on this observation, one would think that if 1 solitary MB task uses over 90% of my GPU, then there is no GPU compute power left on tap for an additional task or two on top of the one that's already running. in other words, if one MB task consumes almost 100% of my GPU, then 2 simultaneous MB tasks should take just about twice as long to complete, and 3 tasks should take about 3 times as long to complete ...assuming the GPU memory hasn't yet become a bottleneck as well.

and yet here's the weird part. i've been running MB tasks 1 at a time, 2 at a time, and 3 at a time, recording run times, and comparing them to see what kind of efficiency increases i'm seeing. and although i haven't recorded nearly enough tasks yet to call my findings conclusive, it appears that despite a single MB task consuming a full "GPU core," 2 simultaneous tasks do not seem to take nearly as long to complete as two consecutive tasks would. ditto for 3 simultaneous tasks. i don't know why this is, and perhaps i'm misreading the data b/c the sample population is still far too small, but already i'm seeing trends. to be specific, it appears that simultaneously run tasks take no longer to run than individually run tasks, suggesting that running 2 simultaneous MB tasks results in doubling my efficiency, and that running 3 simultaneous tasks is tripling my efficiency. i'm skeptical to say the least...but i'll keep monitoring tasks and see if my GPU is actually seeing improved efficiencies.
ID: 1105891 · Report as offensive
Profile BilBg
Volunteer tester
Joined: 27 May 07
Posts: 3720
Credit: 9,385,827
RAC: 0
Bulgaria
Message 1105991 - Posted: 14 May 2011, 5:32:38 UTC - in response to Message 1105891.  


Do you check the <true_angle_range> of the running tasks?
Compare only the times for tasks which have close "enough" true_angle_ranges.

To check the angle ranges of the running tasks:
Go to slots directory and look in stderr.txt files for
"WU true angle range is : ..."


Or check the Completed and reported tasks:
http://setiathome.berkeley.edu/results.php?hostid=5800349&offset=0&show_names=1&state=2&appid=

Click in the left column to see the task's Stderr output:
http://setiathome.berkeley.edu/result.php?resultid=1908493734

and find/search (Ctrl-F or F3) for "range" to find:
"WU true angle range is : 0.432713"

(I see it also reported as "ar=0.432713" by the "OpenCL version by Raistmer, rev177"
so you can search for "ar=" to find it.
)
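
The lookup described above can be sketched in Python. This is only an illustration, assuming each slot directory holds a stderr.txt containing one of the two spellings quoted in this post; the function names and the slots path are made up for the example.

```python
import re
from pathlib import Path

# Matches both spellings quoted in this post:
#   "WU true angle range is :  0.432713"  and  "ar=0.432713"
AR_PATTERN = re.compile(
    r"(?:WU true angle range is\s*:\s*|\bar=)([0-9]*\.?[0-9]+)"
)

def find_angle_range(stderr_text):
    """Return the first true angle range found in the text, or None."""
    match = AR_PATTERN.search(stderr_text)
    return float(match.group(1)) if match else None

def scan_slots(slots_dir):
    """Map each slot's stderr.txt path to its angle range (None if absent)."""
    results = {}
    for stderr_file in sorted(Path(slots_dir).glob("*/stderr.txt")):
        text = stderr_file.read_text(errors="replace")
        results[str(stderr_file)] = find_angle_range(text)
    return results

# Hypothetical usage, with the path to your own BOINC slots directory:
#   for path, ar in scan_slots("/path/to/BOINC/slots").items():
#       print(path, ar)
```

The same find_angle_range() helper can be applied to the Stderr output text copied from a task's result page.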


 


- ALF - "Find out what you don't do well ..... then don't do it!" :)
 
ID: 1105991 · Report as offensive
Profile Sunny129
Joined: 7 Nov 00
Posts: 190
Credit: 3,163,755
RAC: 0
United States
Message 1106105 - Posted: 14 May 2011, 12:23:26 UTC - in response to Message 1105991.  
Last modified: 14 May 2011, 12:26:13 UTC


Do you check the <true_angle_range> of the running tasks?
Compare only the times for tasks which have close "enough" true_angle_ranges.

To check the angle ranges of the running tasks:
Go to slots directory and look in stderr.txt files for
"WU true angle range is : ..."


Or check the Completed and reported tasks:
http://setiathome.berkeley.edu/results.php?hostid=5800349&offset=0&show_names=1&state=2&appid=

Click in the left column to see the task's Stderr output:
http://setiathome.berkeley.edu/result.php?resultid=1908493734

and find/search (Ctrl-F or F3) for "range" to find:
"WU true angle range is : 0.432713"

(I see it also reported as "ar=0.432713" by the "OpenCL version by Raistmer, rev177"
so you can search for "ar=" to find it.
)


thanks for the tip Bill...but exactly how close is "close enough?" does the true angle range of tasks have to match for example up to 3 decimal places in order to be worthy of comparison with each other?

i did notice that tasks with similar names generally have matching (or very similar) angle ranges, and that tasks with vastly different names generally have very different angle ranges. so already i see some natural grouping...again, i just want to be clear on how close is "close enough."
ID: 1106105 · Report as offensive
Profile BilBg
Volunteer tester
Joined: 27 May 07
Posts: 3720
Credit: 9,385,827
RAC: 0
Bulgaria
Message 1106372 - Posted: 15 May 2011, 4:43:29 UTC - in response to Message 1106105.  

but exactly how close is "close enough?"

Can't tell "exactly" but maybe 10% difference is OK


 


- ALF - "Find out what you don't do well ..... then don't do it!" :)
 
ID: 1106372 · Report as offensive
Profile Sunny129
Joined: 7 Nov 00
Posts: 190
Credit: 3,163,755
RAC: 0
United States
Message 1106503 - Posted: 15 May 2011, 17:39:32 UTC - in response to Message 1106372.  

but exactly how close is "close enough?"

Can't tell "exactly" but maybe 10% difference is OK


ok thanks. what i'm doing at this point is suspending all MB tasks except for specific groups of MB tasks that have similar names (thus increasing the probability that they came from the same data set and have similar angle ranges). then i'm taking those specific groups of tasks and running some of them one at a time, some of them 2 at a time, and some of them 3 at a time. so far the grouping of tasks with similar names has resulted in identical (or at least very similar) angle ranges, the greatest variation between any two observed angle ranges so far being 0.06% - well below the error margin of 10%.

i believe that with this methodology i'm starting to see differences in run times between single tasks, 2 simultaneous tasks, and 3 simultaneous tasks. a bit more testing and it should start becoming clear as to what kind of improvements in efficiency i'm seeing by crunching more than one task simultaneously.
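
A rough Python sketch of the grouping step, for anyone who wants to automate it. It assumes that the part of the task name before the first dot identifies the group, and it uses the 10% figure suggested above as the tolerance; the task names and angle ranges in the example are made up, and angle ranges are assumed to be greater than zero.

```python
from collections import defaultdict

def relative_spread(values):
    """Largest relative difference within a group: (max - min) / min."""
    lo, hi = min(values), max(values)
    return (hi - lo) / lo

def group_by_prefix(tasks):
    """Group (task_name, angle_range) pairs by the name part before the first dot."""
    groups = defaultdict(list)
    for name, ar in tasks:
        groups[name.split(".", 1)[0]].append(ar)
    return dict(groups)

def comparable_groups(tasks, tolerance=0.10):
    """Keep only groups whose angle ranges differ by at most `tolerance`."""
    return {
        prefix: ars
        for prefix, ars in group_by_prefix(tasks).items()
        if relative_spread(ars) <= tolerance
    }

# Made-up task names and angle ranges, for illustration only.
tasks = [
    ("23se10ae.1.a", 0.4270),
    ("23se10ae.2.b", 0.4327),
    ("05ja11ab.1.a", 0.4100),
    ("05ja11ab.2.b", 1.2500),
]
print(comparable_groups(tasks))
```

Only the first group survives the check here, since the second one mixes two very different angle ranges under one name prefix.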
ID: 1106503 · Report as offensive
Profile BilBg
Volunteer tester
Joined: 27 May 07
Posts: 3720
Credit: 9,385,827
RAC: 0
Bulgaria
Message 1106704 - Posted: 16 May 2011, 5:08:47 UTC - in response to Message 1106503.  

but exactly how close is "close enough?"

Can't tell "exactly" but maybe 10% difference is OK


ok thanks. what i'm doing at this point is suspending all MB tasks except for specific groups of MB tasks that have similar names (thus increasing the probability that they came from the same data set and have similar angle ranges). then i'm taking those specific groups of tasks and running some of them one at a time, some of them 2 at a time, and some of them 3 at a time. so far the grouping of tasks with similar names has resulted in identical (or at least very similar) angle ranges, the greatest variation between any two observed angle ranges so far being 0.06% - well below the error margin of 10%.

i believe that with this methodology i'm starting to see differences in run times between single tasks, 2 simultaneous tasks, and 3 simultaneous tasks. a bit more testing and it should start becoming clear as to what kind of improvements in efficiency i'm seeing by crunching more than one task simultaneously.

Very good methodology! :)


 


- ALF - "Find out what you don't do well ..... then don't do it!" :)
 
ID: 1106704 · Report as offensive
Profile Sunny129
Joined: 7 Nov 00
Posts: 190
Credit: 3,163,755
RAC: 0
United States
Message 1110978 - Posted: 29 May 2011, 6:21:39 UTC

ok, so not long after i started my experiment i realized that, despite being relatively easy, it would take a good deal of time and a great deal of patience. i started by increasing my cache size to several days worth in order to increase the probability that the bulk of downloaded work would be from the same "group" of tasks (i.e. similar task names, and therefore probably similar angle ranges). so far i've been testing tasks from the group 23se10ae.xxxxx.xxxx.xx.xx.xxx_x, whose AR's seem to fall in the 0.42xxxx-0.43xxxx range (a ~2.4% margin of error). i've run several tasks 1 at a time, 2 simultaneously, 3 simultaneously, and 4 simultaneously now, and quite frankly i'm not sure the results are conclusive yet by any means. more specifically, running 2 tasks simultaneously seems to take approx. 91.5% of the time it takes for 2 tasks to crunch individually, running 3 tasks simultaneously seems to take approx. 89.5% of the time it takes for 3 tasks to crunch individually, and running 4 tasks simultaneously seems to take approx. 90.5% of the time it takes for 4 tasks to crunch individually.

if these results are conclusive, then running 3 tasks simultaneously is clearly the best choice. however, given that the increases in efficiency are so similar for running 2, 3, and 4 tasks simultaneously, i'm skeptical. perhaps i need to test tasks in a slightly different angle range, or perhaps i just need to build up a larger sample size. in conclusion to this leg of the testing, i'm going to say it's inconclusive at this point, and will continue testing...
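
for reference, a small Python sketch of the arithmetic behind those percentages. it assumes each figure is the run time of N simultaneous tasks divided by N times the run time of a solitary task; the function names are just for illustration.

```python
def time_ratio(batch_time, single_time, n_tasks):
    """Time for n tasks run together, as a fraction of n tasks run one by one."""
    return batch_time / (n_tasks * single_time)

def throughput_gain(ratio):
    """Relative increase in tasks per unit time implied by a time ratio."""
    return 1.0 / ratio - 1.0

# Ratios reported in this post for 2, 3, and 4 simultaneous tasks.
reported = {2: 0.915, 3: 0.895, 4: 0.905}
for n, ratio in reported.items():
    print(f"{n} at a time: {throughput_gain(ratio):+.1%} throughput")
```

with the reported ratios this works out to roughly 9%, 12%, and 10% more throughput for 2, 3, and 4 tasks respectively, which shows how close together the three cases are.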
ID: 1110978 · Report as offensive
Profile skildude
Joined: 4 Oct 00
Posts: 9541
Credit: 50,759,529
RAC: 60
Yemen
Message 1111359 - Posted: 30 May 2011, 13:06:13 UTC - in response to Message 1110978.  

I think we already had this testing done while it was in beta. It was discovered that anything more than 2 WUs began to slow the GPU down so that the WUs weren't being done efficiently.

The newer 6900 series should theoretically be able to handle 3-4 WU's at a time because of the 2gb of ram onboard. I've not been able to get my Windows stable to the point that I'd be comfortable doing that many without crashing the box

BTW I assume you have a 5800 series gpu and not a 2XXX that your account shows.




In a rich man's house there is no place to spit but his face.
Diogenes Of Sinope
ID: 1111359 · Report as offensive
Profile Lint trap

Joined: 30 May 03
Posts: 871
Credit: 28,092,319
RAC: 0
United States
Message 1111374 - Posted: 30 May 2011, 13:49:20 UTC - in response to Message 1088530.  

i'm hoping TOM will chime in, b/c i'm not sure how he was able to tell that his GPU memory consumption was 998MB.


GPU-Z shows "Memory Used" but maybe not on all video cards:





Also:
http://www.overclock.net/attachments/software-news/156434d1274481904-tpu-gpu-z-v0-4-3-gpuz.jpg


Try an older version if GPU-Z v0.5.1 does not show it:
TechPowerUp GPU-Z v0.4.9
http://www.techpowerup.com/downloads/1907/TechPowerUp%20GPU-Z%20v0.4.9.html

TechPowerUp GPU-Z v0.4.2
http://www.techpowerup.com/downloads/1788/TechPowerUp%20GPU-Z%20v0.4.2.html




Take a look at this. It supposedly works with either Nvidia or ATI/AMD.

I've been running it for over a month and am happy with it. Putting temps in task bar is a nice touch, and there are other options available.

Martin
ID: 1111374 · Report as offensive
Profile Sunny129
Joined: 7 Nov 00
Posts: 190
Credit: 3,163,755
RAC: 0
United States
Message 1111395 - Posted: 30 May 2011, 15:21:00 UTC - in response to Message 1111359.  

BTW I assume you have a 5800 series gpu and not a 2XXX that your account shows.

that is correct. you may recall my month-long struggle a while back to figure out how to use my discrete GPU strictly for crunching and my integrated GPU strictly for display purposes. i was eventually successful...and since my primary (display) GPU is the integrated GPU, my SETI account shows the HD 2XXX series in my host's profile.

I think we already had this testing done while it was in beta. It was discovered that anything more than 2 WUs began to slow the GPU down so that the WUs weren't being done efficiently.

well if it was done during beta testing, i've been unable to find the results with the search engine. so i figured i could at least work toward conclusive results with my particular video card just in case someone with the same card (or a different one for that matter) would like to draw comparisons or make estimates.

The newer 6900 series should theoretically be able to handle 3-4 WU's at a time because of the 2gb of ram onboard. I've not been able to get my Windows stable to the point that I'd be comfortable doing that many without crashing the box

well the thing is, my 5870 is a 2GB model, not a 1GB model. perhaps that is the reason my testing has been relatively inconclusive up to this point. that is, perhaps the reason running 3 or 4 simultaneous tasks doesn't seem any slower than running 2 simultaneous or 2 consecutive solitary tasks on my 5870 is b/c it has an extra 1GB of memory compared to most 5870s. mind you this is without any hiccups, let alone actual system crashes.



Take a look at this. It supposedly works with either Nvidia or ATI/AMD.

I've been running it for over a month and am happy with it. Putting temps in task bar is a nice touch, and there are other options available.

Martin

much appreciated...i'll give it a shot when i get a chance.
ID: 1111395 · Report as offensive
Profile BilBg
Volunteer tester
Joined: 27 May 07
Posts: 3720
Credit: 9,385,827
RAC: 0
Bulgaria
Message 1111585 - Posted: 31 May 2011, 1:30:44 UTC - in response to Message 1111395.  


This is how Open Hardware Monitor (only 240 KB) looks on my PC:
http://openhardwaremonitor.org/

Main window:




"Gadget" window (the gadget works on Win XP, which is where the screenshot is from):




(this is an integrated GPU, "NVIDIA GeForce 6150SE nForce 430", using 128 MB of the "normal" memory as video memory, so not much info is shown about the GPU)

( 3 months ago I posted the link here:
http://setiathome.berkeley.edu/forum_thread.php?id=62044&nowrap=true#1075235
... but forgot about it till now ;)

You can also try GPU Caps Viewer posted in the same place.
)


 


- ALF - "Find out what you don't do well ..... then don't do it!" :)
 
ID: 1111585 · Report as offensive
Profile Sunny129
Joined: 7 Nov 00
Posts: 190
Credit: 3,163,755
RAC: 0
United States
Message 1111603 - Posted: 31 May 2011, 2:45:17 UTC - in response to Message 1111585.  
Last modified: 31 May 2011, 3:10:03 UTC

looks promising Bill. the only thing i have reservations about is whether or not the GPU memory usage will show up while the software is detecting an ATI GPU. i remember i tried GPU-Z b/c folks said that it monitors GPU memory usage...but i came to find out that it only did so for nVidia GPUs. at any rate, i'm gonna give it a try and see how it works for me.


*EDIT* - gave Open Hardware Monitor a try, and unfortunately like GPU-Z, it doesn't display GPU memory consumption for ATI GPUs. not the end of the world though...plus i like the interface, so i may end up using it. i'm starting to think that ATI's Catalyst drivers just aren't designed with such a parameter for GPU utilities to display, unlike nVidia's GeForce drivers.
ID: 1111603 · Report as offensive
Profile Lint trap

Joined: 30 May 03
Posts: 871
Credit: 28,092,319
RAC: 0
United States
Message 1111622 - Posted: 31 May 2011, 3:50:29 UTC

I googled "ati monitor tool" and ended up here http://forums.legitreviews.com/about8827.html.

I see temp readings, but not much else in the way of card details.

BUT, there's an ATI GPU memory viewer on this page, along with some other things...maybe it works for your video/s??

http://www.amd.com/us/products/workstation/graphics/tools/pages/tools.aspx

Martin
ID: 1111622 · Report as offensive
Profile Sunny129
Joined: 7 Nov 00
Posts: 190
Credit: 3,163,755
RAC: 0
United States
Message 1111631 - Posted: 31 May 2011, 4:10:44 UTC - in response to Message 1111622.  

BUT, there's an ATI GPU memory viewer on this page, along with some other things...maybe it works for your video/s??

http://www.amd.com/us/products/workstation/graphics/tools/pages/tools.aspx

Martin

this is exactly what i'm looking for. good find! unfortunately it doesn't have an option to toggle between multiple GPUs. i have a 5870 2GB card that i use strictly for crunching, and an integrated 3300 that i use strictly for display purposes, which is therefore automatically listed as my primary GPU. as such, ATI Memory Viewer only displays the memory usage of my integrated GPU...oh well
ID: 1111631 · Report as offensive
Profile Lint trap

Joined: 30 May 03
Posts: 871
Credit: 28,092,319
RAC: 0
United States
Message 1111644 - Posted: 31 May 2011, 4:37:07 UTC - in response to Message 1111631.  


...snip...as such, ATI Memory Viewer only displays the memory usage of my integrated GPU...oh well


If you disable the onboard video and connect your monitor to your 5870, does the viewer see the 5870 then? or does it only display the onboard video?? (maybe because that is the first video it finds when it scans the system)...

Does your motherboard Bios have an option for which video to initialize first??

Martin
ID: 1111644 · Report as offensive
Profile BilBg
Volunteer tester
Joined: 27 May 07
Posts: 3720
Credit: 9,385,827
RAC: 0
Bulgaria
Message 1111655 - Posted: 31 May 2011, 5:57:56 UTC - in response to Message 1111631.  


Try pressing the ATI logo (it is a button, but what is its function?)


 


- ALF - "Find out what you don't do well ..... then don't do it!" :)
 
ID: 1111655 · Report as offensive
Profile Sunny129
Joined: 7 Nov 00
Posts: 190
Credit: 3,163,755
RAC: 0
United States
Message 1111732 - Posted: 31 May 2011, 13:41:02 UTC - in response to Message 1111644.  

If you disable the onboard video and connect your monitor to your 5870, does the viewer see the 5870 then? or does it only display the onboard video?? (maybe because that is the first video it finds when it scans the system)...

i'm sure it would, b/c then the only active and recognizable GPU in the system would be the 5870. but i'm not going to verify this, b/c it defeats the purpose of my current setup...for more details, see below...

Does your motherboard Bios have an option for which video to initialize first??

yes, i believe there is an option in the BIOS to choose which displays first - PCIe or onboard. but changing it to make my discrete PCIe GPU the primary GPU wouldn't work with my current display preferences. you see, i don't actually have both the discrete GPU and the integrated GPU hooked up to the monitor - only the integrated GPU is hooked up to the monitor, while the discrete GPU is simply attached to a dummy plug. by using my integrated 3300 GPU strictly for display purposes, and by using my discrete 5870 GPU strictly for crunching, i more or less eliminate the GUI lag so many people complain about when they try to crunch DC work on the same GPU that runs their display. so even if i did switch the primary display device to the 5870 in the BIOS, i wouldn't see anything on the monitor until i unplugged the dummy plug and plugged the monitor into the 5870, which again defeats the purpose of the separation of duties between GPUs.


Try pressing the ATI logo (it is a button, but what is its function?)

i did that, but the button seems to be a dud - it doesn't do anything.
ID: 1111732 · Report as offensive
Profile Lint trap

Joined: 30 May 03
Posts: 871
Credit: 28,092,319
RAC: 0
United States
Message 1111779 - Posted: 31 May 2011, 16:24:44 UTC - in response to Message 1111732.  
Last modified: 31 May 2011, 16:25:57 UTC

I understand most of what you are saying because I used to have an 8800 GTS solely for crunching and a 7600 GT for video.

Every time I switched them around, testing performance or trying to improve throughput, (main pci-e is 16x 2.0, secondary is wired for just 4x 1.0) I went thru the song and dance with XP to get video working again. Ah, the good ole' days...

It seems incredible to me that ATI/AMD doesn't have any better tools available (at least not widely known) for what you want. It's not like you're looking for something only you might benefit from.

Martin
ID: 1111779 · Report as offensive


 
©2025 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.