Getting back into it, advice appreciated
Author | Message |
---|---|
Al Send message Joined: 3 Apr 99 Posts: 1682 Credit: 477,343,364 RAC: 482 |
FP32, and to some extent memory speeds if available. Compute efficiency for a single instance, for most GPU applications here, works out to about 5% of theoretical peak (which for GPUs is about 'half-optimised'), therefore you can reasonably estimate expected GFlops at 1/20th of theoretical peak, and give +/- 20% slop to allow for system and running conditions and application. Multiple instances will increase the total efficiency within reason, to the region of 10% [With current app technologies]. Thanks for the detailed reply, Jason! I haven't had my 1st cup of coffee yet though, so I think I'll need to re-read this after one or two... And maybe a couple of times... Or maybe it's just the translation from Aus to American English ;-) |
rob smith Send message Joined: 7 Mar 03 Posts: 22258 Credit: 416,307,556 RAC: 380 |
32 & 64 bit. However I would exercise great caution in following these figures with too much enthusiasm, as in the real world (SETI) there are so many other factors that come into play, not the least of which are the drivers employed, the application being run and the task being run. As Glenn says, it is very hard to beat the GTX750ti in terms of credit/watt, but sadly these cards are getting harder to get - I guess there is some gearing up for the forthcoming release of the next generation of Nvidia GPUs, which is due in a couple of months' time if there is any truth in the rumours. Next up would be a GTX970, but they are a notch up on price (and performance). Beyond that you are into GTX980, 980ti & Titan territory: the capital cost goes up and the power consumption jumps for only small increases in performance. Just take a look at the top 20 computers on SETI: even if one excludes Petri's four GTX980 with a very highly optimised application, there are no AMD/ATI GPUs until one gets to number 8, and then that is in a heterogeneous mix with Nvidia. Bob Smith Member of Seti PIPPS (Pluto is a Planet Protest Society) Somewhere in the (un)known Universe? |
tullio Send message Joined: 9 Apr 04 Posts: 8797 Credit: 2,930,782 RAC: 1 |
I have installed a GeForce GTX 750 OC on my Windows 10 PC and it crunches Einstein@home binary radio pulsar tasks with data from both Arecibo and Parkes with good results, albeit with some computation errors. But I don't get any SETI@home tasks, even though I have allowed them. Why? Tullio |
Al Send message Joined: 3 Apr 99 Posts: 1682 Credit: 477,343,364 RAC: 482 |
Rob, so you are saying that the 750 is still a better choice than the 970 in the real world, and the guesstimates in the comparison I made above don't really hold true? I would of course try to use the bestest, most optimized client and driver version for whatever card I get, but the 4x, 3x, and 2x+ seem to be a pretty good way to go, if you can get past the initial purchase price, of course. But, obviously I don't know, that's why I come here and ask others who do! Maybe in a few months, like was mentioned, when the newest latest and greatest cards come out, there will be a flood of 970's on the secondary market from people upgrading, and I'll be able to find some for substantially less than the approx $300 they are getting now new, hopefully around half of that, give or take? One can always hope! |
petri33 Send message Joined: 6 Jun 02 Posts: 1668 Credit: 623,086,772 RAC: 156 |
From the top hosts list You can find a machine with 3 GTX970's and a machine with 3 GTX750Ti's. Their performance is at about 115000 RAC and 65000 RAC with the best current alpha software. That should give some indication of what to expect in near future. He's running one task at a time on each GPU. To overcome Heisenbergs: "You can't always get what you want / but if you try sometimes you just might find / you get what you need." -- Rolling Stones |
HAL9000 Send message Joined: 11 Sep 99 Posts: 6534 Credit: 196,805,888 RAC: 57 |
From the top hosts list You can find a machine with 3 GTX970's and a machine with 3 GTX750Ti's. Their performance is at about 115000 RAC and 65000 RAC with the best current alpha software. That should give some indication of what to expect in near future. He's running one task at a time on each GPU. So the system with three 970's has about 1.77 times as much output for about 2.42 times as much power used compared to the system with three 750Ti's. EDIT: It made me sad when I looked at the MB run times on the three 750Ti system. They run MB work in about the same time my R9 390x does. However looking at the AP run times on the 750Ti's and the 970's made me happy again. SETI@home classic workunits: 93,865 CPU time: 863,447 hours Join the BP6/VP6 User Group (http://tinyurl.com/8y46zvu) |
Chris Adamek Send message Joined: 15 May 99 Posts: 251 Credit: 434,772,072 RAC: 236 |
From the top hosts list You can find a machine with 3 GTX970's and a machine with 3 GTX750Ti's. Their performance is at about 115000 RAC and 65000 RAC with the best current alpha software. That should give some indication of what to expect in near future. He's running one task at a time on each GPU. And three times the upfront cost... |
petri33 Send message Joined: 6 Jun 02 Posts: 1668 Credit: 623,086,772 RAC: 156 |
I think he runs AP's two at a time. I'm not sure. To overcome Heisenbergs: "You can't always get what you want / but if you try sometimes you just might find / you get what you need." -- Rolling Stones |
Chris Adamek Send message Joined: 15 May 99 Posts: 251 Credit: 434,772,072 RAC: 236 |
Yeah, there is still a lot of unrealized potential in the AMD cards. Hopefully some of the streams/stream concurrency methods used in Cuda will translate to some extent to the OpenCL apps. Probably wishful thinking but hopefully not lol. |
HAL9000 Send message Joined: 11 Sep 99 Posts: 6534 Credit: 196,805,888 RAC: 57 |
I think he runs AP's two at a time. I'm not sure. They would need to run 3-4 AP at a time to average the same kind of output. I'm not sure the 970 could manage that. It's only ~3500 GFLOPS (970) vs ~5900 GFLOPS (R9 390x). I tried looking at some 980Ti and Titan X systems but they had run times in the 45 min to 1 hr range. If they are running 4 at a time, that puts them about on par with my ~12 min run times. SETI@home classic workunits: 93,865 CPU time: 863,447 hours Join the BP6/VP6 User Group (http://tinyurl.com/8y46zvu) |
Zalster Send message Joined: 27 May 99 Posts: 5517 Credit: 528,817,460 RAC: 242 |
A lot depends on customization of those systems beyond just plugging in a GPU and telling it to run 3. |
HAL9000 Send message Joined: 11 Sep 99 Posts: 6534 Credit: 196,805,888 RAC: 57 |
OpenCL is pretty high level compared to CUDA from my understanding, which might be a limiting factor. If that is true, perhaps Vulkan has the potential to improve on any shortcomings of OpenCL. SETI@home classic workunits: 93,865 CPU time: 863,447 hours Join the BP6/VP6 User Group (http://tinyurl.com/8y46zvu) |
Keith Myers Send message Joined: 29 Apr 01 Posts: 13164 Credit: 1,160,866,277 RAC: 1,873 |
Petri, what's your best guesstimate about when your optimized applications make it to Main? Seti@Home classic workunits:20,676 CPU time:74,226 hours A proud member of the OFA (Old Farts Association) |
Al Send message Joined: 3 Apr 99 Posts: 1682 Credit: 477,343,364 RAC: 482 |
Hmm, this is good information. If this is an example of real-world performance, I have a bit of thinking to do. |
Al Send message Joined: 3 Apr 99 Posts: 1682 Credit: 477,343,364 RAC: 482 |
K, thanks Jason. Quick question: I just got my new system fired up and test running, and I'm tuning it for ultimate performance, in the tweaking stage right now. I need some advice on app_info/app_config files (I presume the app_info, because I found that one - which I can post if you'd like - but not the config one). Which would be necessary, and what is the best way to configure them to utilize this setup most effectively? It is ID 7949285, the one with the i7-5930K CPU currently running @ 4.50GHz using a H100 water cooler (coretemp says the cores are fluctuating between 61-73C - usually 66-69 - while crunching, which I think is less than impressive, anyone's thoughts on this?) and a 1600 watt EVGA PSU. It is running 12 CPU tasks and one GPU task, and I'd think I'd be best to tone down the CPU tasks a little and up the GPU tasks to 3, but it's been so long, I've forgotten how to configure it, and which file would be best to modify to do so. If anyone could toss me a bone and post a file that I could use, or even suggestions on what to modify to get it configured correctly, I'd appreciate it, as I would really like to see what this baby can do. It's the highest performance system I've ever built, and I have been fighting it for a couple of weeks now with the M.2 slot and the Samsung V-NAND 950 Pro card, on the original Asus X99-Deluxe board (not the most current one, it's been sitting on the shelf till I could spare the extra $ for the CPU, they sure aren't cheap!). In fact, I tossed in the towel on it (M.2 boot drive), and will be returning it to MicroCenter, because the technology is still kind of kludgy when trying to install Win 7 (UEFI BIOS changes that didn't work, etc, and yes I did the latest BIOS, etc, even spent an hour on the phone with ASUS, they couldn't figure it out either), and I don't want to play around with it any more, I just want to use the computer. Would have been nice, but not critical, and I can save the $ for something else down the road. Installed an Intel 850 SSD, that will be good enough for me. Thanks for any suggestions guys! Oh, one last thing, I have a couple of GTX670 cards laying around, waiting for me to complete a build, and was wondering what I would have to do differently with those files if I wanted to toss 2 of them in there with the 980? Would one config cover all 3 cards, or would I need to break out configs for each card or model? Thanks! |
petri33 Send message Joined: 6 Jun 02 Posts: 1668 Credit: 623,086,772 RAC: 156 |
Petri, what's your best guesstimate about when your optimized applications make it to Main? I'm sorry I have no idea. That is all in jason_gee's hands. He's busy and hard working. I'd let him do his magic in peace and quiet. TBar has done some testing in Mac environment and from what I have seen it looks promising on his 750Ti. fawkesguy is running fine with his 980, 970 and 750Ti under Linux. The source is available and someone with enough knowledge how to compile for windows could do that and test in BETA. I myself have problems with my 980's giving different results to 780's. It might be a driver or compiler related thing or a feature that shows itself only when run in different internal GPU configuration: number of cores, number of logical or special instructions units, execution order, queues, ... A hidden bug, something. And now I'm going out with my children to a snowy hill with these: To overcome Heisenbergs: "You can't always get what you want / but if you try sometimes you just might find / you get what you need." -- Rolling Stones |
Zalster Send message Joined: 27 May 99 Posts: 5517 Credit: 528,817,460 RAC: 242 |
Hello Al, Nice selection on your system. Closely parallels my own. First, M.2: my understanding is that only Win 10 can be used to boot from the M.2 slot. Something to do with how Windows is set up. (Personally I prefer Win 7, so that is why I never upgraded.) Even if you did figure it out, the video that I found showing the install took forever.. like 2 weeks. So you aren't missing anything. I'm assuming you are running Lunatics? You installed the CUDA 5.0 app, I hope, for that 980? The easiest way to configure your system is with app_config.xml. I will supply you with one here with an edit, so check back for that. You are going to get the most RAC with your GPUs. Since it's only 1 GPU you don't really need an extra core to support it. If you place a second GPU in there you will need at least 2 cores free to support the GPUs and any OS function. I would start with 3 work units at a time on the GPU and see how it does, temperature-wise and in time to finish. After about a day, you can change the number of work units on it to 4 and see if the times improve or get worse. (You do this by taking the average time it takes to complete and dividing by the number of instances per card. If the number is going up instead of down, then it's counterproductive and you should go down by 1.) 4.5 is a pretty impressive number; is the mobo adjusting the voltage or are you doing it manually? Keep an eye on the CPU temps. Once you get the GPU up and running at speed, you may find that you have to back down on the CPU speed. (These are the tweaks you have to deal with with these things.) Ok, I'm going to post this and then get the app_config.xml for you. Do you know how to make the xml file? I'm guessing you probably do. Edit coming.... `<app_config> <app_version> <app_name>setiathome_v8</app_name> <plan_class>cuda50</plan_class> <avg_ncpus>0.35</avg_ncpus> <ngpus>0.33</ngpus> </app_version> <app_version> <app_name>astropulse_v7</app_name> <plan_class>opencl_nvidia_100</plan_class> <avg_ncpus>0.35</avg_ncpus> <ngpus>0.33</ngpus> </app_version> </app_config>` You can change the CPU ratio to whatever you like. I just use that value. Zalster |
jason_gee Send message Joined: 24 Nov 06 Posts: 7489 Credit: 91,093,184 RAC: 0 |
In principle, sounds close to my 2009 Mac Pro rather than my conventional PC hosts. I'd go app_config.xml to add as many instances to the GPUs as seem efficient, and free some CPU cores accordingly. Others can help with settings, as I use customised old Boinc that has no app_config. FWIW, my 680, 780 and 980 are showing similar performance, and that's mostly due to the feeding systems being crap (along with using them to do stuff). So for exemplars it is best to look at top hosts rather than mine. That said, with v8, and then with new technology introductions, a lot is changing. That makes things pretty murky and frustrating for a little while as far as choosing the best way to run. Next generation applications will likely include tools to help you work that out. [Edit: Thanks for the detail Zalster :D] "Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions. |
jason_gee Send message Joined: 24 Nov 06 Posts: 7489 Credit: 91,093,184 RAC: 0 |
Looks like fun @Petri :D, yes I didn't get to the roadmap this weekend as hoped. The 'safest' streaming optimisations (straight from yours) come right after Mac and Linux buildsystems pull into line with Windows (documentation and experience lacking at the moment). These then form the interface template for the plugin architecture needed for x42..... but a roadmap is warranted to illustrate that. "Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions. |
Al Send message Joined: 3 Apr 99 Posts: 1682 Credit: 477,343,364 RAC: 482 |
Thanks guys. I got a wild hair up my rear after I posted that request, because I said what the heck, shut it down and tossed in those 2 670's, figuring How Hard Could It Be?, and it has taken until now to get my system up and running.... It Really didn't like those older cards, I had no video and had to hard boot it a couple times. I had to drive a stake in everything UEFI in the BIOS to allow it to finally boot. Unfortunately, as it had corrupted something which couldn't be repaired and I had to do a restore to a point quite a bit earlier today, that meant backing up the BOINC data dir after the restore, and then started reinstalling much of what I had installed this morning. Man I love computers! Well, after reinstalling the latest video drivers and BOINC again, and attaching it to the project, closing it and installing the optimized app, and then copying the data dir back, it seems to be running ok, though I am still only running one task on the video card, and on only one card, the 980. So, I guess I would need the magic config to run multiple tasks on multiple cards, do you think 2 or 3 would be good for the 670's? I planned on 3 for the 980, that shouldn't stress it too much, at least I wouldn't hope it would...? Oh, and I am with you on Win 7, it will be my "New XP", till they pry it out of my cold dead hands like they finally did to me with XP, at least on my main machines ... lol |
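
Several rules of thumb from the posts above are plain arithmetic, so here is a minimal Python sketch of them. It is an illustration only, not anything the project or BOINC ships. The function names are made up for this example. The 5% efficiency and +/- 20% slop are the rough estimates Al quotes from jason_gee. The RAC figures and the 2.42 power ratio are the ones petri33 and HAL9000 give. The ~3500 GFLOPS figure is taken here to be the GTX 970. The run times in the last example are invented sample numbers, not measurements. The `gpu_fraction` helper assumes that the `ngpus` fractions of the tasks sharing a card must not add up to more than 1.0.

```python
import math


def estimated_gflops(peak_gflops, efficiency=0.05, slop=0.20):
    """Rough sustained throughput from theoretical peak.

    efficiency=0.05 is the ~5% single-instance estimate from the thread
    (roughly 0.10 with multiple instances); slop is the +/- 20% allowance.
    Returns (low, expected, high).
    """
    expected = peak_gflops * efficiency
    return expected * (1 - slop), expected, expected * (1 + slop)


def credit_per_watt_ratio(output_ratio, power_ratio):
    """Credit per watt of host A relative to host B (1.0 means equal)."""
    return output_ratio / power_ratio


def gpu_fraction(instances):
    """ngpus value that lets `instances` tasks share one GPU.

    Rounded DOWN to two decimals so the fractions never sum above 1.0
    (3 instances gives 0.33, not 0.34).
    """
    return math.floor(100 / instances) / 100


def best_instance_count(avg_minutes_by_instances):
    """Pick the instance count with the lowest effective time per task.

    Effective time = average elapsed time / instances per card,
    which is the check Zalster describes.
    """
    return min(
        avg_minutes_by_instances,
        key=lambda n: avg_minutes_by_instances[n] / n,
    )


if __name__ == "__main__":
    # ~3500 GFLOPS: the lower figure HAL9000 quotes, assumed to be the GTX 970.
    low, expected, high = estimated_gflops(3500)
    print(f"970, one instance: {low:.0f}-{high:.0f} GFLOPS (expected {expected:.0f})")

    # RAC figures and the 2.42 power ratio are the ones quoted in the thread.
    ratio = credit_per_watt_ratio(115000 / 65000, 2.42)
    print(f"3x970 credit per watt relative to 3x750Ti: {ratio:.2f}")

    # Invented sample run times in minutes, NOT measurements.
    sample = {1: 12.0, 2: 20.0, 3: 27.0, 4: 40.0}
    n = best_instance_count(sample)
    print(f"best instance count: {n}, ngpus = {gpu_fraction(n)}")
```

With the thread's numbers, the three-970 host comes out at roughly three quarters of the credit per watt of the three-750Ti host, which matches Rob's point about the 750Ti. For the instance count, the useful input is a day or so of real average run times at each setting, as Zalster suggests, rather than the sample values used here.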
©2024 University of California
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.