More than one WU per GPU at the same time ?
| Author | Message |
|---|---|
Sunny129 Joined: 7 Nov 00 Posts: 190 Credit: 3,163,755 RAC: 0
|
ah, i forgot about that other line :). clearly i've only attained "Jedi Knight" status, and am far from being a Master lol. i edited the app info file, but apparently i'll have to wait to test it out again, b/c i'm currently all out of work and the server seems to have been having problems for the last 12 hours now. but i'll update and let you know how it goes when more work is available.
|
TOM Joined: 5 Apr 01 Posts: 53 Credit: 65,422,234 RAC: 86
|
GPU-Z is a good tool to view memory and GPU utilisation. I've tested with 3 or 4 WUs and found that 3 WUs per GPU are sufficient. The workload on the GPU is then around 70 to 95%, depending on the WU. You can also use the monitoring tools that come with your GPU, like EVGA Precision or MSI Afterburner and others. If you have a keyboard with an LC display, these tools can be plugged into the display so you can easily monitor the GPU workload even without turning on the PC monitor. TOM |
Sunny129 Joined: 7 Nov 00 Posts: 190 Credit: 3,163,755 RAC: 0
|
ok, so now that the S@H server issues have been [mostly] resolved, i've been able to do some "simultaneous task" testing. right now i'm just focusing on Multibeam tasks. as i mentioned previously, i have an ATI GPU, whose memory bandwidth consumption is not a parameter displayed by GPU-Z or any other software that i know of (unlike for some nVidia GPUs). not being able to use GPU memory bandwidth consumption as a guideline for how many simultaneous tasks my 5870 2GB GPU can handle, i naturally looked to GPU load per Bill's advice: "GPU load is good indicator - if you see 60% with 2 tasks you may try 3 to get 90-95% GPU load (but if you see 90% with 2 tasks the GPU is unlikely to handle more tasks)". but right away i noticed that GPU load wasn't scaling linearly with the number of tasks running. rather, i was seeing GPU utilization in the low to mid 90's with one task running, and full GPU utilization (upper 90's) with both two and three simultaneous tasks running (as opposed to ~30% GPU load w/ 1 task, ~60% w/ 2 tasks, and ~90% w/ 3 tasks). based on this observation, one would think that if 1 solitary MB task uses over 90% of my GPU, then there is no GPU compute power left on tap for an additional task or two on top of the one that's already running. in other words, if one MB task consumes almost 100% of my GPU, then 2 simultaneous MB tasks should take just about twice as long to complete, and 3 tasks should take about 3 times as long to complete...assuming the GPU memory hasn't yet become a bottleneck as well. and yet here's the weird part. i've been running MB tasks 1 at a time, 2 at a time, and 3 at a time, recording run times, and comparing them to see what kind of efficiency increases i'm seeing. and although i haven't recorded nearly enough tasks yet to call my findings conclusive, it appears that despite a single MB task consuming a full "GPU core," 2 simultaneous tasks do not seem to take nearly as long to complete as two consecutive tasks would.
ditto for 3 simultaneous tasks. i don't know why this is, and perhaps i'm misreading the data b/c the sample population is still far too small, but already i'm seeing trends. to be specific, it appears that simultaneously run tasks take no longer to run than individually run tasks, suggesting that running 2 simultaneous MB tasks is doubling my efficiency, and that running 3 simultaneous tasks is tripling my efficiency. i'm skeptical to say the least...but i'll keep monitoring tasks and see if my GPU is actually seeing improved efficiencies.
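
One rough way to put a number on this comparison is to divide the time the tasks would need one after another by the wall-clock time of the simultaneous batch. The sketch below assumes run times recorded in seconds; the function name and the sample values are hypothetical and are not measurements from this thread.

```python
def throughput_gain(single_task_seconds, batch_seconds, batch_size):
    """Return how many times faster a simultaneous batch is than running
    the same number of tasks one after another.

    1.0 means no gain (one task already saturated the GPU);
    a value equal to batch_size means the extra tasks ran "for free".
    """
    if single_task_seconds <= 0 or batch_seconds <= 0 or batch_size < 1:
        raise ValueError("run times must be positive and batch_size >= 1")
    sequential_seconds = single_task_seconds * batch_size
    return sequential_seconds / batch_seconds


# Hypothetical run times, for illustration only.
saturated = throughput_gain(600.0, 1200.0, 2)  # batch of 2 takes twice as long
free_ride = throughput_gain(600.0, 600.0, 2)   # batch of 2 takes no longer
print(saturated, free_ride)
```

A result close to 1.0 would match the "GPU is already full" expectation, while a result close to the batch size would match the "doubling" or "tripling" impression described above.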
|
BilBg Joined: 27 May 07 Posts: 3720 Credit: 9,385,827 RAC: 0
|
Do you check the <true_angle_range> of the running tasks? Compare only the times for tasks which have close "enough" true_angle_ranges. To check the angle ranges of the running tasks: Go to slots directory and look in stderr.txt files for "WU true angle range is : ..." Or check the Completed and reported tasks: http://setiathome.berkeley.edu/results.php?hostid=5800349&offset=0&show_names=1&state=2&appid= Click in the left column to see the task's Stderr output: http://setiathome.berkeley.edu/result.php?resultid=1908493734 and find/search (Ctrl-F or F3) for "range" to find: "WU true angle range is : 0.432713" (I see it also reported as "ar=0.432713" by the "OpenCL version by Raistmer, rev177" so you can search for "ar=" to find it.) - ALF - "Find out what you don't do well ..... then don't do it!" :) |
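
The manual search through the slots directory can also be scripted. This is only a sketch: it assumes each slot subdirectory holds a `stderr.txt` containing one of the two text forms quoted above, and the function names are made up for illustration. The location of the slots directory depends on the BOINC installation and has to be supplied by the user.

```python
import re
from pathlib import Path

# Matches both forms mentioned above:
#   "WU true angle range is : 0.432713"  and  "ar=0.432713"
AR_PATTERN = re.compile(
    r"(?:WU true angle range is\s*:\s*|\bar=)([0-9]*\.?[0-9]+)"
)


def parse_angle_range(text):
    """Return the first angle range found in text as a float, or None."""
    match = AR_PATTERN.search(text)
    return float(match.group(1)) if match else None


def angle_ranges_in_slots(slots_dir):
    """Map each slot's stderr.txt path to the angle range found in it."""
    results = {}
    for stderr_file in sorted(Path(slots_dir).glob("*/stderr.txt")):
        text = stderr_file.read_text(errors="replace")
        angle_range = parse_angle_range(text)
        if angle_range is not None:
            results[str(stderr_file)] = angle_range
    return results
```

Calling `angle_ranges_in_slots()` with the path to the slots directory returns one entry per running task whose stderr.txt already contains the angle range line.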
Sunny129 Joined: 7 Nov 00 Posts: 190 Credit: 3,163,755 RAC: 0
|
thanks for the tip Bill...but exactly how close is "close enough?" does the true angle range of tasks have to match for example up to 3 decimal places in order to be worthy of comparison with each other? i did notice that tasks with similar names generally have matching (or very similar) angle ranges, and that tasks with vastly different names generally have very different angle ranges. so already i see some natural grouping...again, i just want to be clear on how close is "close enough."
|
BilBg Joined: 27 May 07 Posts: 3720 Credit: 9,385,827 RAC: 0
|
"but exactly how close is 'close enough?'" Can't tell "exactly" but maybe 10% difference is OK. - ALF - "Find out what you don't do well ..... then don't do it!" :) |
Sunny129 Joined: 7 Nov 00 Posts: 190 Credit: 3,163,755 RAC: 0
|
"but exactly how close is 'close enough?'" ok thanks. what i'm doing at this point is suspending all MB tasks except for specific groups of MB tasks that have similar names (thus increasing the probability that they came from the same data set and have similar angle ranges). then i'm taking those specific groups of tasks and running some of them one at a time, some of them 2 at a time, and some of them 3 at a time. so far the grouping of tasks with similar names has resulted in identical (or at least very similar) angle ranges, the greatest variation between any two observed angle ranges so far being 0.06% - well below the error margin of 10%. i believe that with this methodology i'm starting to see differences in run times between single tasks, 2 simultaneous tasks, and 3 simultaneous tasks. a bit more testing and it should start becoming clear as to what kind of improvements in efficiency i'm seeing by crunching more than one task simultaneously.
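
The 10% rule of thumb can be written down as a small check. This is a sketch under one assumption: the difference is measured relative to the smaller of the two angle ranges, which is only one of several reasonable definitions. The function names are hypothetical.

```python
def relative_difference(ar_a, ar_b):
    """Relative gap between two angle ranges, measured against the smaller one."""
    smaller, larger = sorted((ar_a, ar_b))
    if smaller <= 0:
        raise ValueError("angle ranges are expected to be positive")
    return (larger - smaller) / smaller


def close_enough(ar_a, ar_b, tolerance=0.10):
    """True if the two angle ranges differ by no more than the tolerance."""
    return relative_difference(ar_a, ar_b) <= tolerance


# 0.432713 appears earlier in the thread; the other values are made up.
print(close_enough(0.432713, 0.4330))  # True
print(close_enough(0.432713, 1.2000))  # False
```

Only run times from tasks that pass this check against each other would then be compared.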
|
BilBg Joined: 27 May 07 Posts: 3720 Credit: 9,385,827 RAC: 0
|
"but exactly how close is 'close enough?'" Very good methodology! :) - ALF - "Find out what you don't do well ..... then don't do it!" :) |
Sunny129 Joined: 7 Nov 00 Posts: 190 Credit: 3,163,755 RAC: 0
|
ok, so not long after i started my experiment i realized that, despite being relatively easy, it would take a good deal of time and a great deal of patience. i started by increasing my cache size to several days' worth in order to increase the probability that the bulk of downloaded work would be from the same "group" of tasks (i.e. similar task names, and therefore probably similar angle ranges). so far i've been testing tasks from the group 23se10ae.xxxxx.xxxx.xx.xx.xxx_x, whose AR's seem to fall in the 0.42xxxx-0.43xxxx range (a ~2.4% margin of error). i've run several tasks 1 at a time, 2 simultaneously, 3 simultaneously, and 4 simultaneously now, and quite frankly i'm not sure the results are conclusive yet by any means. more specifically, running 2 tasks simultaneously seems to take approx. 91.5% of the time it takes for 2 tasks to crunch individually, running 3 tasks simultaneously seems to take approx. 89.5% of the time it takes for 3 tasks to crunch individually, and running 4 tasks simultaneously seems to take approx. 90.5% of the time it takes for 4 tasks to crunch individually. if these results are conclusive, then running 3 tasks simultaneously is clearly the best choice. however, given that the increases in efficiency are so similar for running 2, 3, and 4 tasks simultaneously, i'm skeptical. perhaps i need to test tasks in a slightly different angle range, or perhaps i just need to build up a larger sample size. to conclude this leg of the testing, i'm going to say it's inconclusive at this point, and will continue testing...
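
To see how far apart the three settings really are, the percentages reported above can be turned into throughput multipliers. This is just arithmetic on the figures in this post; the variable and function names are made up for illustration.

```python
# Figures reported in the post above: time for N simultaneous tasks as a
# percentage of the time for N tasks run one at a time.
reported_percent = {2: 91.5, 3: 89.5, 4: 90.5}


def gain_from_percent(percent_of_sequential_time):
    """Convert 'takes X% of the sequential time' into a throughput multiplier."""
    return 100.0 / percent_of_sequential_time


gains = {n: gain_from_percent(p) for n, p in reported_percent.items()}
for n, gain in sorted(gains.items()):
    print(f"{n} at a time: {gain:.3f}x throughput")

# Setting with the highest multiplier, and the gap between best and worst.
best = max(gains, key=gains.get)
spread = max(gains.values()) - min(gains.values())
print(f"best: {best} at a time, spread between settings: {spread:.3f}x")
```

All three multipliers land at roughly 1.1x and differ from each other by only a few hundredths, so run-to-run variation could plausibly account for the ordering, which fits the skepticism expressed above.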
|
skildude Joined: 4 Oct 00 Posts: 9541 Credit: 50,759,529 RAC: 60
|
I think we already had this testing done while it was in beta. It was discovered that anything more than 2 WUs began to slow the GPU down, so the WUs weren't being done efficiently. The newer 6900 series should theoretically be able to handle 3-4 WUs at a time because of the 2GB of RAM onboard. I've not been able to get my Windows stable to the point that I'd be comfortable doing that many without crashing the box. BTW, I assume you have a 5800 series GPU and not a 2XXX that your account shows. In a rich man's house there is no place to spit but his face. Diogenes Of Sinope |
Lint trap Joined: 30 May 03 Posts: 871 Credit: 28,092,319 RAC: 0
|
"i'm hoping TOM will chime in, b/c i'm not sure how he was able to tell that his GPU memory consumption was 998MB." Take a look at this. It supposedly works with either Nvidia or ATI/AMD. I've been running it for over a month and am happy with it. Putting temps in the task bar is a nice touch, and there are other options available. Martin |
Sunny129 Joined: 7 Nov 00 Posts: 190 Credit: 3,163,755 RAC: 0
|
"BTW I assume you have a 5800 series gpu and not a 2XXX that your account shows." that is correct. if you'll recall my month-long hurdle a while back of trying to figure out how to use my discrete GPU strictly for crunching and my integrated GPU strictly for display purposes, i was eventually successful...and since my primary (display) GPU is the integrated GPU, my SETI account shows the HD 2XXX series in my host's profile. "I think we already had this testing done while it was in beta it was discovered that anything more than 2 WU's began to slow the GPU down so that the WU's weren't being done efficiently." well if it was done during beta testing, i've been unable to find the results with the search engine. so i figured i could at least work toward conclusive results with my particular video card just in case someone with the same card (or a different one for that matter) would like to draw comparisons or make estimates. "The newer 6900 series should theoretically be able to handle 3-4 WU's at a time because of the 2gb of ram onboard. I've not been able to get my Windows stable to the point that I'd be comfortable doing that many without crashing the box" well the thing is, my 5870 is a 2GB model, not a 1GB model. perhaps that is the reason my testing has been relatively inconclusive up to this point. that is, perhaps the reason running 3 or 4 simultaneous tasks doesn't seem any slower than running 2 simultaneous or 2 consecutive solitary tasks on my 5870 is b/c it has an extra 1GB of memory compared to most 5870s. mind you this is without any hiccups, let alone actual system crashes. "Take a look at this. It supposedly works with either Nvidia or ATI/AMD." much appreciated...i'll give it a shot when i get a chance.
|
BilBg Joined: 27 May 07 Posts: 3720 Credit: 9,385,827 RAC: 0
|
This is how Open Hardware Monitor (only 240 KB) looks on my PC: http://openhardwaremonitor.org/ Main window: "Gadget" window (the gadget works on Win XP, which is where the screenshot is from): (this is an integrated GPU, "NVIDIA GeForce 6150SE nForce 430", using 128 MB of the "normal" memory as video memory, so not much info is shown about the GPU) (3 months ago I posted the link here: http://setiathome.berkeley.edu/forum_thread.php?id=62044&nowrap=true#1075235 ... but forgot about it till now ;) You can also try GPU Caps Viewer, posted in the same place.) - ALF - "Find out what you don't do well ..... then don't do it!" :) |
Sunny129 Joined: 7 Nov 00 Posts: 190 Credit: 3,163,755 RAC: 0
|
looks promising Bill. the only thing i have reservations about is whether or not the GPU memory usage will show up while the software is detecting an ATI GPU. i remember i tried GPU-Z b/c folks said that it monitors GPU memory usage...but i came to find out that it only did so for nVidia GPUs. at any rate, i'm gonna give it a try and see how it works for me. *EDIT* - gave Open Hardware Monitor a try, and unfortunately like GPU-Z, it doesn't display GPU memory consumption for ATI GPUs. not the end of the world though...plus i like the interface, so i may end up using it. i'm starting to think that ATI's Catalyst drivers just aren't designed with such a parameter for GPU utilities to display, unlike nVidia's GeForce drivers.
|
Lint trap Joined: 30 May 03 Posts: 871 Credit: 28,092,319 RAC: 0
|
I googled "ati monitor tool" and ended up here http://forums.legitreviews.com/about8827.html. I see temp readings, but not much else in the way of card details. BUT, there's an ATI GPU memory viewer on this page, along with some other things...maybe it works for your video/s?? http://www.amd.com/us/products/workstation/graphics/tools/pages/tools.aspx Martin |
Sunny129 Joined: 7 Nov 00 Posts: 190 Credit: 3,163,755 RAC: 0
|
"BUT, there's an ATI GPU memory viewer on this page, along with some other things...maybe it works for your video/s??" this is exactly what i'm looking for. good find! unfortunately it doesn't have an option to toggle between multiple GPUs. i have a 5870 2GB card that i use strictly for crunching, and an integrated 3300 that i use strictly for display purposes and that is therefore automatically listed as my primary GPU. as such, ATI Memory Viewer only displays the memory usage of my integrated GPU...oh well
|
Lint trap Joined: 30 May 03 Posts: 871 Credit: 28,092,319 RAC: 0
|
If you disable the onboard video and connect your monitor to your 5870, does the viewer see the 5870 then? or does it only display the onboard video?? (maybe because that is the first video it finds when it scans the system)... Does your motherboard Bios have an option for which video to initialize first?? Martin |
BilBg Joined: 27 May 07 Posts: 3720 Credit: 9,385,827 RAC: 0
|
Try pressing the ATI logo (it is a button, but what is its function?) - ALF - "Find out what you don't do well ..... then don't do it!" :) |
Sunny129 Joined: 7 Nov 00 Posts: 190 Credit: 3,163,755 RAC: 0
|
"If you disable the onboard video and connect your monitor to your 5870, does the viewer see the 5870 then? or does it only display the onboard video?? (maybe because that is the first video it finds when it scans the system)..." i'm sure it would, b/c then the only active and recognizable GPU in the system would be the 5870. but i'm not going to verify this, b/c it defeats the purpose of my current setup...for more details, see below... "Does your motherboard Bios have an option for which video to initialize first??" yes, i believe there is an option in the BIOS to choose which displays first - PCIe or onboard. but changing it to make my discrete PCIe GPU the primary GPU wouldn't work with my current display preferences. you see, i don't actually have both the discrete GPU and the integrated GPU hooked up to the monitor - only the integrated GPU is hooked up to the monitor, while the discrete GPU is simply attached to a dummy plug. by using my integrated 3300 GPU strictly for display purposes, and by using my discrete 5870 GPU strictly for crunching, i more or less eliminate the GUI lag so many people complain about when they try to crunch DC work on the same GPU that runs their display. so even if i did switch the primary display device to the 5870 in the BIOS, i wouldn't see anything on the monitor until i unplugged the dummy plug and plugged the monitor into the 5870, which again defeats the purpose of the separation of duties between GPUs. "Try to press the ATI logo (it is button but what its function?)" i did that, but the button seems to be a dud - it doesn't do anything.
|
Lint trap Joined: 30 May 03 Posts: 871 Credit: 28,092,319 RAC: 0
|
I understand most of what you are saying because I used to have an 8800 GTS solely for crunching and a 7600 GT for video. Every time I switched them around, testing performance or trying to improve throughput, (main pci-e is 16x 2.0, secondary is wired for just 4x 1.0) I went thru the song and dance with XP to get video working again. Ah, the good ole' days... It seems incredible to me that ATI/AMD doesn't have any better tools available (at least not widely known) for what you want. It's not like you're looking for something only you might benefit from. Martin |