Getting back into it, advice appreciated

Al Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Joined: 3 Apr 99
Posts: 1682
Credit: 477,343,364
RAC: 482
United States
Message 1768989 - Posted: 2 Mar 2016, 13:30:52 UTC - in response to Message 1768987.  
Last modified: 2 Mar 2016, 13:32:37 UTC

FP32, and to some extent memory speeds if available. Compute efficiency for a single instance, for most GPU applications here, works out to about 5% of theoretical peak (which for GPUs is about 'half-optimised'), therefore you can reasonably estimate expected GFlops at 1/20th of theoretical peak, and give +/- 20% slop to allow for system and running conditions and application. Multiple instances will increase the total efficiency within reason, to the region of 10% [With current app technologies].

Thanks for the detailed reply, Jason! I haven't had my 1st cup of coffee yet though, so I think I'll need to re-read this again after one or two... And maybe a couple times... Or maybe it's just the translation from Aus to American english ;-)
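
Put into numbers, the rule of thumb quoted above (about 5% of theoretical peak for a single instance, roughly 10% with multiple instances, +/- 20% slop) looks like the sketch below. The function name and its defaults are made up for illustration, and the 3500 GFLOPS peak is only an example input, not a measured value.

```python
def estimate_gflops(peak_gflops, efficiency=0.05, slop=0.20):
    """Return (low, expected, high) sustained GFLOPS for one GPU.

    efficiency: fraction of theoretical peak actually achieved
                (0.05 for a single instance, about 0.10 with several).
    slop:       +/- allowance for system, running conditions and application.
    """
    expected = peak_gflops * efficiency
    return expected * (1 - slop), expected, expected * (1 + slop)


# Example input only: a card with a theoretical FP32 peak of 3500 GFLOPS.
low, expected, high = estimate_gflops(3500)
print(f"single instance:    {low:.0f} to {high:.0f} GFLOPS (centre {expected:.0f})")

low, expected, high = estimate_gflops(3500, efficiency=0.10)
print(f"multiple instances: {low:.0f} to {high:.0f} GFLOPS (centre {expected:.0f})")
```

This is only a back-of-envelope estimate; as noted in the thread, drivers, the application and the task type all move the real figure.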

ID: 1768989 · Report as offensive
rob smith Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer moderator
Volunteer tester

Joined: 7 Mar 03
Posts: 22258
Credit: 416,307,556
RAC: 380
United Kingdom
Message 1768991 - Posted: 2 Mar 2016, 13:40:31 UTC

32 & 64 bit.
However I would exercise great caution in following these figures with too much enthusiasm as in the real world (SETI) there are so many other factors that come into play, not the least of which are the drivers employed, the application being run and the task being run.

As Glenn says, it is very hard to beat the GTX750ti in terms of credit/watt, but sadly these cards are getting harder to get - I guess there is some gearing up for the forthcoming release of the next generation of Nvidia GPUs, which is due in a couple of months' time if there is any truth in the rumours.
Next up would be a GTX970, but they are a notch up on price (and performance).
Beyond that you are into GTX980, 980ti & Titan territory, where the capital cost goes up and the power consumption jumps for only small increases in performance.

Just take a look at the top 20 computers on SETI: even if one excludes Petri's four GTX980 with a very highly optimised application, there are no AMD/ATI GPUs until one gets to number 8, and then that is in a heterogeneous mix with Nvidia.
Bob Smith
Member of Seti PIPPS (Pluto is a Planet Protest Society)
Somewhere in the (un)known Universe?
ID: 1768991 · Report as offensive
Profile tullio
Volunteer tester

Joined: 9 Apr 04
Posts: 8797
Credit: 2,930,782
RAC: 1
Italy
Message 1768996 - Posted: 2 Mar 2016, 14:59:18 UTC
Last modified: 2 Mar 2016, 15:00:06 UTC

I have installed a Geforce GTX 750 OC on my Windows 10 PC and it crunches Einstein@home binary radio pulsars tasks with data from both Arecibo and Parkes with good results, even if with some computation errors. But I don't get any SETI@home task, even if I have allowed them. Why?
Tullio
ID: 1768996 · Report as offensive
Al Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Joined: 3 Apr 99
Posts: 1682
Credit: 477,343,364
RAC: 482
United States
Message 1769019 - Posted: 2 Mar 2016, 17:39:44 UTC - in response to Message 1768991.  
Last modified: 2 Mar 2016, 18:27:29 UTC

Rob, so you are saying that the 750 is still a better choice than the 970 in the real world, and that the guesstimates in the comparison I made above don't really hold true? I would of course try to use the bestest, most optimized client and driver version for whatever card I get, but the 4x, 3x, and 2x+ seem to be a pretty good way to go, if you can get past the initial purchase price, of course. But, obviously I don't know, that's why I come here and ask others who do!

Maybe in a few months, like was mentioned, when the newest latest and greatest cards come out, there will be a flood of 970's on the secondary market from people upgrading, and I'll be able to find some for substantially less than the approx $300 they are going for new now, hopefully around half of that, give or take? One can always hope!

ID: 1769019 · Report as offensive
Profile petri33
Volunteer tester

Joined: 6 Jun 02
Posts: 1668
Credit: 623,086,772
RAC: 156
Finland
Message 1769027 - Posted: 2 Mar 2016, 18:42:58 UTC

From the top hosts list you can find a machine with 3 GTX970's and a machine with 3 GTX750Ti's. Their performance is at about 115000 RAC and 65000 RAC with the best current alpha software. That should give some indication of what to expect in the near future. He's running one task at a time on each GPU.
To overcome Heisenbergs:
"You can't always get what you want / but if you try sometimes you just might find / you get what you need." -- Rolling Stones
ID: 1769027 · Report as offensive
Profile HAL9000
Volunteer tester
Joined: 11 Sep 99
Posts: 6534
Credit: 196,805,888
RAC: 57
United States
Message 1769034 - Posted: 2 Mar 2016, 19:00:16 UTC - in response to Message 1769027.  
Last modified: 2 Mar 2016, 19:07:39 UTC

From the top hosts list you can find a machine with 3 GTX970's and a machine with 3 GTX750Ti's. Their performance is at about 115000 RAC and 65000 RAC with the best current alpha software. That should give some indication of what to expect in the near future. He's running one task at a time on each GPU.

So the system with three 970's has about 1.77 times as much output for about 2.42 times as much power used compared to the system with three 750Ti's.

EDIT: It made me sad when I looked at the MB run times on the three 750Ti system. They run MB work in about the same time my R9 390x does. However looking at the AP run times on the 750Ti's and the 970's made me happy again.
SETI@home classic workunits: 93,865 CPU time: 863,447 hours
Join the BP6/VP6 User Group (http://tinyurl.com/8y46zvu)
ID: 1769034 · Report as offensive
Chris Adamek
Volunteer tester

Joined: 15 May 99
Posts: 251
Credit: 434,772,072
RAC: 236
United States
Message 1769037 - Posted: 2 Mar 2016, 19:07:46 UTC - in response to Message 1769034.  

From the top hosts list you can find a machine with 3 GTX970's and a machine with 3 GTX750Ti's. Their performance is at about 115000 RAC and 65000 RAC with the best current alpha software. That should give some indication of what to expect in the near future. He's running one task at a time on each GPU.

So the system with three 970's has about 1.77 times as much output for about 2.42 times as much power used compared to the system with three 750Ti's.


And three times the upfront cost...
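
The three ratios traded in this exchange (output, power, upfront cost) can be combined into output per watt and output per dollar. A rough sketch using only the figures quoted above; the power ratio (2.42) and cost ratio (3) are taken at face value from the posts, and the variable and function names are made up for illustration.

```python
# RAC figures quoted for the two top hosts (three cards each).
rac_970 = 115_000
rac_750ti = 65_000

# Ratios as stated in the posts, 970 system relative to 750Ti system.
power_ratio = 2.42   # "about 2.42 times as much power used"
cost_ratio = 3.0     # "three times the upfront cost"


def relative_value(output_ratio, input_ratio):
    """Output gained per unit of extra input (1.0 would mean break-even)."""
    return output_ratio / input_ratio


output_ratio = rac_970 / rac_750ti
per_watt = relative_value(output_ratio, power_ratio)
per_dollar = relative_value(output_ratio, cost_ratio)

print(f"output ratio:     {output_ratio:.2f}x")
print(f"RAC per watt:     {per_watt:.2f}x that of the 750Ti system")
print(f"RAC per dollar:   {per_dollar:.2f}x that of the 750Ti system")
```

Anything below 1.0 in the last two lines means the 970 system buys more total output but less output per watt or per dollar than the 750Ti system.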
ID: 1769037 · Report as offensive
Profile petri33
Volunteer tester

Joined: 6 Jun 02
Posts: 1668
Credit: 623,086,772
RAC: 156
Finland
Message 1769040 - Posted: 2 Mar 2016, 19:12:42 UTC - in response to Message 1769034.  

I think he runs AP's two at a time. I'm not sure.
To overcome Heisenbergs:
"You can't always get what you want / but if you try sometimes you just might find / you get what you need." -- Rolling Stones
ID: 1769040 · Report as offensive
Chris Adamek
Volunteer tester

Joined: 15 May 99
Posts: 251
Credit: 434,772,072
RAC: 236
United States
Message 1769042 - Posted: 2 Mar 2016, 19:16:53 UTC - in response to Message 1769034.  


EDIT: It made me sad when I looked at the MB run times on the three 750Ti system. They run MB work in about the same time my R9 390x does. However looking at the AP run times on the 750Ti's and the 970's made me happy again.


Yeah, there is still a lot of unrealized potential in the AMD cards. Hopefully some of the streams/stream concurrency methods used in Cuda will translate to some extent to the OpenCL apps. Probably wishful thinking but hopefully not lol.
ID: 1769042 · Report as offensive
Profile HAL9000
Volunteer tester
Joined: 11 Sep 99
Posts: 6534
Credit: 196,805,888
RAC: 57
United States
Message 1769045 - Posted: 2 Mar 2016, 19:34:42 UTC - in response to Message 1769040.  

I think he runs AP's two at a time. I'm not sure.

They would need to run 3-4 AP at a time to average the same kind of output. I'm not sure the 970 could manage that. It's only ~3500 GFLOP vs ~5900 GFLOP. I tried looking at some 980Ti and Titan X systems but they had run times in the 45min to 1hr range. If they are running 4 at a time, that puts them about on par with my ~12 min run times.
SETI@home classic workunits: 93,865 CPU time: 863,447 hours
Join the BP6/VP6 User Group (http://tinyurl.com/8y46zvu)
ID: 1769045 · Report as offensive
Profile Zalster Special Project $250 donor
Volunteer tester
Joined: 27 May 99
Posts: 5517
Credit: 528,817,460
RAC: 242
United States
Message 1769049 - Posted: 2 Mar 2016, 19:45:16 UTC - in response to Message 1769045.  

A lot depends on customization of those systems beyond just plugging in a GPU and telling it to run 3.
ID: 1769049 · Report as offensive
Profile HAL9000
Volunteer tester
Joined: 11 Sep 99
Posts: 6534
Credit: 196,805,888
RAC: 57
United States
Message 1769050 - Posted: 2 Mar 2016, 19:46:36 UTC - in response to Message 1769042.  


EDIT: It made me sad when I looked at the MB run times on the three 750Ti system. They run MB work in about the same time my R9 390x does. However looking at the AP run times on the 750Ti's and the 970's made me happy again.


Yeah, there is still a lot of unrealized potential in the AMD cards. Hopefully some of the streams/stream concurrency methods used in Cuda will translate to some extent to the OpenCL apps. Probably wishful thinking but hopefully not lol.

OpenCL is pretty high level compared to CUDA from my understanding. Which might be a limiting factor. If that is true perhaps Vulkan has the potential to improve on any shortcomings of OpenCL.
SETI@home classic workunits: 93,865 CPU time: 863,447 hours
Join the BP6/VP6 User Group (http://tinyurl.com/8y46zvu)
ID: 1769050 · Report as offensive
Profile Keith Myers Special Project $250 donor
Volunteer tester
Joined: 29 Apr 01
Posts: 13164
Credit: 1,160,866,277
RAC: 1,873
United States
Message 1769052 - Posted: 2 Mar 2016, 19:59:48 UTC - in response to Message 1769040.  

Petri, what's your best guesstimate about when your optimized applications make it to Main?
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 1769052 · Report as offensive
Al Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Joined: 3 Apr 99
Posts: 1682
Credit: 477,343,364
RAC: 482
United States
Message 1769254 - Posted: 3 Mar 2016, 16:37:00 UTC - in response to Message 1769037.  


So the system with three 970's has about 1.77 times as much output for about 2.42 times as much power used compared to the system with three 750Ti's.


And three times the upfront cost...

Hmm, this is good information. If this is an example of real world performance, I have a bit of thinking to do.

ID: 1769254 · Report as offensive
Al Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Joined: 3 Apr 99
Posts: 1682
Credit: 477,343,364
RAC: 482
United States
Message 1769733 - Posted: 5 Mar 2016, 15:08:21 UTC - in response to Message 1768987.  
Last modified: 5 Mar 2016, 15:27:59 UTC

K, thanks Jason. Quick question: I just got my new system fired up and am test running and tuning it for ultimate performance, so it's in the tweaking stage right now. I need some advice on app_info/config files (I presume the app_info, because I found that one - which I can post if you'd like - but not the config one). Which would be necessary, and what is the best way to configure them to utilize this setup most effectively?

It is ID 7949285, the one with the i7-5930K CPU currently running @ 4.50GHz using a H100 water cooler (coretemp says the cores are fluctuating between 61-73C - usually 66-69 - while crunching, which I think is less than impressive; anyone's thoughts on this?) and a 1600 watt EVGA PSU. It is running 12 CPU tasks and one GPU task, and I'd think I'd be best to tone down the CPU's a little and up the GPU's to 3, but it's been so long, I've forgotten how to configure it, and which file would be best to modify to do so.

If anyone could toss me a bone and post a file that I could use, or even suggestions on what to modify to get it configed correctly, I'd appreciate it, as I would really like to see what this baby can do. It's the highest performance system I've ever built, and have been fighting it for a couple weeks now with the M.2 slot and the Samsung V-NAND 950 Pro card, on the original Asus X99-Deluxe board (not the most current one, it's been sitting on the shelf till I could spare the extra $ for the CPU, they sure aren't cheap!).

In fact, I tossed in the towel on it (M.2 boot drive), and will be returning it to MicroCenter, because the technology is still kind of kludgey when trying to install Win 7 (UEFI BIOS changes that didn't work, etc; and yes, I did the latest BIOS, even spent an hour on the phone with ASUS, and they couldn't figure it out either), and I don't want to play around with it any more, I just want to use the computer. Would have been nice, but not critical, and I can save the $ for something else down the road. Installed an Intel 850 SSD, that will be good enough for me.

Thanks for any suggestions guys!

Oh, one last thing, I have a couple GTX670 cards laying around, waiting for me to complete a build, and was wondering what I would have to do differently with those files if I wanted to toss 2 of them in there with the 980? Would one config cover all 3 cards, or would I need to break out configs for each card or model? Thanks!

ID: 1769733 · Report as offensive
Profile petri33
Volunteer tester

Joined: 6 Jun 02
Posts: 1668
Credit: 623,086,772
RAC: 156
Finland
Message 1769745 - Posted: 5 Mar 2016, 15:49:04 UTC - in response to Message 1769052.  

Petri, what's your best guesstimate about when your optimized applications make it to Main?


I'm sorry I have no idea. That is all in jason_gee's hands. He's busy and hard working. I'd let him do his magic in peace and quiet.

TBar has done some testing in Mac environment and from what I have seen it looks promising on his 750Ti.

fawkesguy is running fine with his 980, 970 and 750Ti under Linux.

The source is available, and someone with enough knowledge of how to compile for Windows could do that and test in BETA.

I myself have problems with my 980's giving different results to 780's. It might be a driver or compiler related thing or a feature that shows itself only when run in different internal GPU configuration: number of cores, number of logical or special instructions units, execution order, queues, ... A hidden bug, something.

And now I'm going out with my children to a snowy hill with these:
To overcome Heisenbergs:
"You can't always get what you want / but if you try sometimes you just might find / you get what you need." -- Rolling Stones
ID: 1769745 · Report as offensive
Profile Zalster Special Project $250 donor
Volunteer tester
Joined: 27 May 99
Posts: 5517
Credit: 528,817,460
RAC: 242
United States
Message 1769748 - Posted: 5 Mar 2016, 15:49:28 UTC - in response to Message 1769733.  
Last modified: 5 Mar 2016, 15:55:50 UTC

Hello Al,

Nice selection on your system. Closely parallels my own.

First, M.2: my understanding is that only Win 10 can be used to boot from the M.2 slot.
Something to do with how Windows is set up. (Personally I prefer Win 7, so that is why I never upgraded.) Even if you did figure it out, the video that I found showing the install took forever... like 2 weeks. So you aren't missing anything.

I'm assuming you are running Lunatics? You installed the cuda 5.0 I hope for that 980?

Easiest way to configure your system is with the app_config.xml

I will supply you with one here with an edit so check back for that.

You are going to get the most RAC with your GPUs. Since it's only 1 GPU you don't really need an extra core to support it. If you place a second GPU in there you will need at least 2 cores free to support the GPUs and any OS function.

I would start with 3 work units at a time on the GPU and see how it does, Temp wise and time to finish. After about a day, you can change the number of work on it to 4 and see if the times improve or get worse. (you do this by taking the average time it takes to complete and divide by the number of instances per card. If the number is going up instead of down, then it's counter productive and you should go down 1)
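
The check described above (average completion time divided by the number of instances per card, then see whether that number goes up or down) can be sketched as follows. The function names are made up, and the run times in the example are hypothetical placeholders, not measurements from any host in this thread.

```python
def effective_minutes_per_task(avg_wall_minutes, instances):
    """Average wall-clock time per task, divided by how many run at once."""
    if instances < 1:
        raise ValueError("instances must be at least 1")
    return avg_wall_minutes / instances


def best_instance_count(measurements):
    """Pick the instance count with the lowest effective time per task.

    measurements maps instances-per-GPU -> average wall-clock minutes
    per task observed while running that many at once.
    """
    return min(
        measurements,
        key=lambda n: effective_minutes_per_task(measurements[n], n),
    )


# Hypothetical observations after about a day at each setting.
observed = {3: 36.0, 4: 52.0}

for n, minutes in sorted(observed.items()):
    print(f"{n} at a time: {effective_minutes_per_task(minutes, n):.2f} min per task")

print("keep running", best_instance_count(observed), "at a time")
```

With these made-up numbers, going from 3 to 4 raises the effective time per task from 12 to 13 minutes, so 4 would be counterproductive and you would go back down to 3.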

4.5 is a pretty impressive number, is the Mobo adjusting the voltage or are you doing it manually? Keep an eye on the CPU temps.

Once you get the GPU up and running at speed, you may find that you have to back down on the CPU speed. (These are the tweaks you have to deal with on these things.)

Ok, I'm going to post this and then get the app_config.xml for you. Do you know how to make the xml file? I'm guessing you probably do.

Edit coming....

<app_config>
  <app_version>
    <app_name>setiathome_v8</app_name>
    <plan_class>cuda50</plan_class>
    <avg_ncpus>0.35</avg_ncpus>
    <ngpus>0.33</ngpus>
  </app_version>
  <app_version>
    <app_name>astropulse_v7</app_name>
    <plan_class>opencl_nvidia_100</plan_class>
    <avg_ncpus>0.35</avg_ncpus>
    <ngpus>0.33</ngpus>
  </app_version>
</app_config>

You can change the CPU ratio to whatever you like. I just use that value.
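
In the file above, ngpus of 0.33 is roughly one third of a GPU per task, which is what gives 3 tasks at a time per card. If you want to try 2 or 4 instead, a small script can regenerate the same file with a different share. This is only a sketch: the app names and plan classes are copied from the file above, while the function name and the round-down choice are my own assumptions.

```python
# (app_name, plan_class) pairs copied from the app_config.xml above.
APPS = [
    ("setiathome_v8", "cuda50"),
    ("astropulse_v7", "opencl_nvidia_100"),
]


def build_app_config(tasks_per_gpu, avg_ncpus=0.35):
    """Return app_config.xml text that asks for tasks_per_gpu tasks per GPU."""
    if tasks_per_gpu < 1:
        raise ValueError("tasks_per_gpu must be at least 1")
    # Round the GPU share DOWN to two decimals so that tasks_per_gpu
    # shares never add up to more than one whole GPU (1/3 becomes 0.33).
    ngpus = int(100 / tasks_per_gpu) / 100
    lines = ["<app_config>"]
    for app_name, plan_class in APPS:
        lines += [
            "  <app_version>",
            f"    <app_name>{app_name}</app_name>",
            f"    <plan_class>{plan_class}</plan_class>",
            f"    <avg_ncpus>{avg_ncpus:.2f}</avg_ncpus>",
            f"    <ngpus>{ngpus:.2f}</ngpus>",
            "  </app_version>",
        ]
    lines.append("</app_config>")
    return "\n".join(lines) + "\n"


# Print a 3-per-GPU config; change the argument to experiment.
print(build_app_config(3))
```

Calling build_app_config(3) reproduces the values posted above; build_app_config(4) would write 0.25 for ngpus.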

Zalster
ID: 1769748 · Report as offensive
Profile jason_gee
Volunteer developer
Volunteer tester
Joined: 24 Nov 06
Posts: 7489
Credit: 91,093,184
RAC: 0
Australia
Message 1769751 - Posted: 5 Mar 2016, 15:52:28 UTC - in response to Message 1769733.  
Last modified: 5 Mar 2016, 15:55:28 UTC

In principle, sounds close to my 2009 Mac Pro rather than my conventional PC hosts. I'd go app_config.xml to add as many instances as seem efficient to the GPUs as possible, and free some CPU cores accordingly. Others can help with settings, as I use customised old Boinc that has no app_config.

FWIW, my 680, 780 and 980 are showing similar performance, and that's mostly due to the feeding systems being crap (along with using them to do stuff), so for exemplars it's best to look at top hosts rather than mine.

That said, with v8, and then with new technology introductions, a lot is changing. That makes things pretty murky and frustrating for a little while as far as choosing the best way to run. Next generation applications will likely include tools to help you work that out.

[Edit: Thanks for the detail Zalster :D]
"Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions.
ID: 1769751 · Report as offensive
Profile jason_gee
Volunteer developer
Volunteer tester
Joined: 24 Nov 06
Posts: 7489
Credit: 91,093,184
RAC: 0
Australia
Message 1769759 - Posted: 5 Mar 2016, 16:06:53 UTC - in response to Message 1769745.  
Last modified: 5 Mar 2016, 16:07:40 UTC

Looks like fun @Petri :D, yes I didn't get to the roadmap this weekend as hoped. The 'safest' streaming optimisations (straight from yours) come right after Mac and Linux buildsystems pull into line with Windows (documentation and experience lacking at the moment). These then form the interface template for the plugin architecture needed for x42..... but a roadmap is warranted to illustrate that.
"Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions.
ID: 1769759 · Report as offensive
Al Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Joined: 3 Apr 99
Posts: 1682
Credit: 477,343,364
RAC: 482
United States
Message 1769788 - Posted: 5 Mar 2016, 17:43:37 UTC - in response to Message 1769759.  
Last modified: 5 Mar 2016, 18:07:41 UTC

Thanks guys. I got a wild hair up my rear after I posted that request, because I said what the heck, shut it down and tossed in those 2 670's, figuring How Hard Could It Be?, and it has taken until now to get my system up and running.... It Really didn't like those older cards, I had no video and had to hard boot it a couple times. I had to drive a stake in everything UEFI in the BIOS to allow it to finally boot. Unfortunately, as it had corrupted something which couldn't be repaired and I had to do a restore to a point quite a bit earlier today, that meant backing up the BOINC data dir after the restore, and then started reinstalling much of what I had installed this morning. Man I love computers!

Well, after reinstalling the latest video drivers and BOINC again, and attaching it to the project, closing it and installing the optimized app, and then copying the data dir back, it seems to be running ok, though I am still only running one task on the video card, and on only one card, the 980. So, I guess I would need the magic config to run multiple tasks on multiple cards, do you think 2 or 3 would be good for the 670's? I planned on 3 for the 980, that shouldn't stress it too much, at least I wouldn't hope it would...?

Oh, and I am with you on Win 7, it will be my "New XP", till they pry it out of my cold dead hands like they finally did to me with XP, at least on my main machines ... lol

ID: 1769788 · Report as offensive


 
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.