?'s About Crossfire X: Twice As Good Or Wasted Efforts?

Message boards : Number crunching : ?'s About Crossfire X: Twice As Good Or Wasted Efforts?
Richard Haselgrove Project Donor
Volunteer tester

Send message
Joined: 4 Jul 99
Posts: 14650
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1173700 - Posted: 25 Nov 2011, 19:10:10 UTC - in response to Message 1173695.  

To clarify, I'm talking about MB on GTX470 and GTX580 Nvidia Fermi cards using Lunatics apps.

I have checked this thoroughly with VHAR, VLAR units and everything in between. When running 2 units, they take twice as long to within a few seconds as when running one. Either way the GPU is running at around 95% load.

T.A.


It's not that I don't believe this holds true on your system. :)

Which Lunatics app? x38g?
I'm not sure throughput got reevaluated after x32f, so it would be interesting to hear what other people running optimised apps on Fermis see.
Maybe we can discern a pattern in when, or whether, increasing the count increases throughput.

Edit: we'll have x41g out next week, which would be a good time for everybody to reevaluate.

NB When it comes to counts, the advice has always been to go and check what gives the best throughput.

He's running x39e on the 580, and x38g on the 470 - that's what I was looking for when I spotted the odd pending list.

He's also running two 580s in the box, and three 470s (if the BOINC host report is to be relied on - remember it can only ever display one card type per manufacturer). It's just possible that heavily-loaded hosts might give different timings, because of PCI-e bus contention and suchlike.
ID: 1173700 · Report as offensive
LadyL
Volunteer tester
Avatar

Send message
Joined: 14 Sep 11
Posts: 1679
Credit: 5,230,097
RAC: 0
Message 1173701 - Posted: 25 Nov 2011, 19:13:32 UTC - in response to Message 1173700.  

To clarify, I'm talking about MB on GTX470 and GTX580 Nvidia Fermi cards using Lunatics apps.

I have checked this thoroughly with VHAR, VLAR units and everything in between. When running 2 units, they take twice as long to within a few seconds as when running one. Either way the GPU is running at around 95% load.

T.A.


It's not that I don't believe this holds true on your system. :)

Which Lunatics app? x38g?
I'm not sure throughput got reevaluated after x32f, so it would be interesting to hear what other people running optimised apps on Fermis see.
Maybe we can discern a pattern in when, or whether, increasing the count increases throughput.

Edit: we'll have x41g out next week, which would be a good time for everybody to reevaluate.

NB When it comes to counts, the advice has always been to go and check what gives the best throughput.

He's running x39e on the 580, and x38g on the 470 - that's what I was looking for when I spotted the odd pending list.

He's also running two 580s in the box, and three 470s (if the BOINC host report is to be relied on - remember it can only ever display one card type per manufacturer). It's just possible that heavily-loaded hosts might give different timings, because of PCI-e bus contention and suchlike.


AND most importantly he is running XP...
IIRC most Fermis run under Win7.
Jason keeps pointing out that XP and Vista/Win7 use different driver models; that may be having an effect.
ID: 1173701 · Report as offensive
Profile Mike Special Project $75 donor
Volunteer tester
Avatar

Send message
Joined: 17 Feb 01
Posts: 34258
Credit: 79,922,639
RAC: 80
Germany
Message 1173732 - Posted: 25 Nov 2011, 22:24:27 UTC

I installed a GT 450 on a friend's machine running Vista 32.
Just changing from 2 to 3 instances increased RAC from 7000 to 10000.
If that's no benefit, I don't know what is.

Maybe WinXP is the problem.



With each crime and every kindness we birth our future.
ID: 1173732 · Report as offensive
Profile jason_gee
Volunteer developer
Volunteer tester
Avatar

Send message
Joined: 24 Nov 06
Posts: 7489
Credit: 91,093,184
RAC: 0
Australia
Message 1173756 - Posted: 25 Nov 2011, 23:41:56 UTC - in response to Message 1173732.  
Last modified: 26 Nov 2011, 0:10:01 UTC

I installed a GT 450 on a friend's machine running Vista 32.
Just changing from 2 to 3 instances increased RAC from 7000 to 10000.
If that's no benefit, I don't know what is.

Maybe WinXP is the problem.


XP's driver model is inherently 'Lower Latency', since it uses physical VRAM. All Vista/Win7 VRAM (WDDM driver) memory is 'virtualised', so there is added latency to be hidden that isn't there under XP. Running multiple simultaneous tasks is one effective technique for hiding it, and there are others yet to be tried deeper in the applications. In the end, when it comes down to underlying infrastructure, the newer models are more capable & end up more efficient, though the transition is most certainly full of pitfalls & side effects as well.

Technologically speaking, the XP Driver Model is dead, but not completely buried yet, because the better replacement methods for doing things are still under continual refinement by all involved, including in the applications. Not all of the changes are directed at performance either, but at making GPUs 'better' in the long term (reliability, security, flexibility etc., none of which are small changes or guaranteed to have no performance impact).

[Edit:] Don't forget to throw in the usual heavy dose of 'your mileage may vary'. On my systems running 3 tasks at a time per GPU amounts to a clear 20%-30% throughput increase, but those are Win7x64 & different cards as well.

Jason
"Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions.
ID: 1173756 · Report as offensive
Horacio

Send message
Joined: 14 Jan 00
Posts: 536
Credit: 75,967,266
RAC: 0
Argentina
Message 1173870 - Posted: 26 Nov 2011, 14:29:48 UTC - in response to Message 1173687.  

To clarify, I'm talking about MB on GTX470 and GTX580 Nvidia Fermi cards using Lunatics apps.

I have checked this thoroughly with VHAR, VLAR units and everything in between. When running 2 units, they take twice as long to within a few seconds as when running one. Either way the GPU is running at around 95% load.

T.A.



If "Either way the GPU is running at around 95% load" means that the GPU load is also at 95% when running one task, then I wouldn't expect any increase in throughput from running 2 tasks.

And maybe we have found the common ground here: running more concurrent tasks will be effective only if your GPU load is below a certain value (a necessary condition, but probably not a sufficient one).
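
To put a number on that reasoning, here is a minimal Python sketch. The function name and the second example figure are my own assumptions for illustration; only the "twice as long" case comes from T.A.'s report. It compares N concurrent tasks against one task at a time, given how much longer each task takes when sharing the GPU.

```python
def relative_throughput(n_tasks: int, time_ratio: float) -> float:
    """Throughput with n_tasks running at once, relative to one at a time.

    time_ratio is (elapsed time per task with n_tasks running) divided by
    (elapsed time per task when it has the GPU to itself).
    """
    if n_tasks < 1 or time_ratio <= 0:
        raise ValueError("n_tasks must be >= 1 and time_ratio must be positive")
    return n_tasks / time_ratio


# T.A.'s observation: two tasks each take twice as long, so nothing is gained.
print(relative_throughput(2, 2.0))

# Hypothetical case: two tasks each take 1.6x as long, which would be a net gain.
print(relative_throughput(2, 1.6))
```

A result of 1.0 means no change; anything above 1.0 means the extra instance pays off on that particular host.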



ID: 1173870 · Report as offensive
Terror Australis
Volunteer tester

Send message
Joined: 14 Feb 04
Posts: 1817
Credit: 262,693,308
RAC: 44
Australia
Message 1174016 - Posted: 27 Nov 2011, 8:20:17 UTC - in response to Message 1173693.  

Edit - your GTX 470 host is showing some odd results at the moment - Pending tasks for computer 5467867. Many inconclusive validations for CPU apps, and NVidia-allocated tasks run on CPU.

Thanks for the tip, Richard. I've checked back, and the inconclusives I found on that box were mostly due to -9's where the wingman and I disagreed on just how they -9'ed, or due to the wingman's card -9'ing while my box found what appears to be a valid result. There were two or three that I had put down to the increased sensitivity of the Lunatics app over the stock version.

I couldn't find any inconclusive CPU tasks, and the GPU to CPU units you found were probably due to me shunting some excess GPU units over to the CPUs to keep them warm during the recent server outages.

One question: why does this unit show as inconclusive? Both the wingman and I appear to have the same result (2, 3, 2, 1).

T.A.
ID: 1174016 · Report as offensive
Profile Mike Special Project $75 donor
Volunteer tester
Avatar

Send message
Joined: 17 Feb 01
Posts: 34258
Credit: 79,922,639
RAC: 80
Germany
Message 1174025 - Posted: 27 Nov 2011, 9:58:53 UTC - in response to Message 1174016.  

Edit - your GTX 470 host is showing some odd results at the moment - Pending tasks for computer 5467867. Many inconclusive validations for CPU apps, and NVidia-allocated tasks run on CPU.

Thanks for the tip, Richard. I've checked back, and the inconclusives I found on that box were mostly due to -9's where the wingman and I disagreed on just how they -9'ed, or due to the wingman's card -9'ing while my box found what appears to be a valid result. There were two or three that I had put down to the increased sensitivity of the Lunatics app over the stock version.

I couldn't find any inconclusive CPU tasks, and the GPU to CPU units you found were probably due to me shunting some excess GPU units over to the CPUs to keep them warm during the recent server outages.

One question: why does this unit show as inconclusive? Both the wingman and I appear to have the same result (2, 3, 2, 1).

T.A.


There is probably a difference in the "power" of one or more signals.
Upcoming apps will show the strength of the signals, so it's easier to compare.





With each crime and every kindness we birth our future.
ID: 1174025 · Report as offensive
Profile SilentObserver64
Volunteer tester
Avatar

Send message
Joined: 21 Sep 05
Posts: 139
Credit: 680,037
RAC: 0
United States
Message 1174534 - Posted: 29 Nov 2011, 14:37:35 UTC
Last modified: 29 Nov 2011, 15:24:46 UTC

Thanks everyone for their input. Sorry it took me so long to get back to everyone. Been a busy last week at work and at home.

Since the variables are so diverse in both hardware and software, it is difficult, though not impossible, to say for sure which setup gives the best times and would be most productive. In light of that, I am going to restate the details of what I am using.

(Edit) - Windows OS: Win 7 Pro 64Bit

Motherboard: MSI 890FXA-GD65 Military Class - Link To MSI 890FXA-GD65 Specs
(May possibly upgrade MB to an AM3+ Gaming MB for the new AMD FX-8150 Zambezi 8 Core Processor) - Link to AMD FX-8150

Processor: AMD Phenom II X6 1090T Black Edition OC'd @ 3.6 GHz (Once I get water cooler, will OC to 4.2GHz Stable)

Video Cards: (2) XFX - ATI Radeon HD 6770 1GB GDDR5 PCI Express 2.1 OC'd to 905MHz Core Clock and 1225MHz Memory Clock - Link To Video Card Specs

RAM: Since I'm not at my computer at the moment I don't have specifics, but I have 14 GB of DDR3 RAM OC'ed @ 1402MHz or so. (Using 2 Kingston 4GB sticks, 1 PNY 4GB stick, and a 2GB stick. I hate mixing them up like that, but money has been tight lately.)

Now I do use GPU-Z, so I can check the load and temps of each core.

How should I run my tests to find out which setup is best for me?

How should I mod the appinfo? (I can't tell you currently what I have set it to, because I am not at home, but I do know how to make the changes. What I need are the numbers to change it to with each test.)

I am using Lunatics App Version 0.38 currently.

I am not extremely technically savvy when it comes to BOINC or its projects, because I have yet to study how it works 100%, but I am pretty smart with computers, so forgive me if I seem to ask simple questions as I learn this stuff. We all have to start somewhere.

If you need any other info, just let me know. Thanks in advance for your time and patience, as well as your help.
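
On the app_info question above, a rough Python sketch of the arithmetic. It assumes the `<coproc>` element layout used in BOINC's anonymous-platform app_info.xml and the GPU type string `ATI`; both should be checked against the file the installer actually wrote. The helper names are made up for illustration. The idea is that each task claims 1/N of a GPU so that N tasks run at once.

```python
import math


def gpu_count_for(instances: int) -> float:
    """Fraction of one GPU each task should claim so `instances` run at once.

    Rounded down to two decimals so that instances * count never exceeds 1.
    """
    if instances < 1:
        raise ValueError("instances must be at least 1")
    return math.floor(100 / instances) / 100


def coproc_fragment(instances: int, gpu_type: str = "ATI") -> str:
    """Render a <coproc> element for one <app_version> entry.

    Assumption: layout follows the anonymous-platform app_info.xml format.
    """
    return (
        "<coproc>\n"
        f"    <type>{gpu_type}</type>\n"
        f"    <count>{gpu_count_for(instances)}</count>\n"
        "</coproc>"
    )


for n in (1, 2, 3):
    print(f"{n} instance(s) per GPU -> count {gpu_count_for(n)}")
print(coproc_fragment(2))
```

Change one value at a time, let several work units finish at each setting, and compare throughput rather than individual run times.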

http://www.goodsearch.com/nonprofit/university-of-california-setihome.aspx
ID: 1174534 · Report as offensive
Richard Haselgrove Project Donor
Volunteer tester

Send message
Joined: 4 Jul 99
Posts: 14650
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1174538 - Posted: 29 Nov 2011, 14:47:12 UTC - in response to Message 1174534.  

I am using Lunatics App Version 0.38 currently.

You might want to run installer version 0.39 (released yesterday) to get the latest apps before you start experimentation.
ID: 1174538 · Report as offensive
Profile SilentObserver64
Volunteer tester
Avatar

Send message
Joined: 21 Sep 05
Posts: 139
Credit: 680,037
RAC: 0
United States
Message 1174542 - Posted: 29 Nov 2011, 15:12:23 UTC - in response to Message 1174538.  

I was actually thinking of doing a before and after with Lunatics App 0.38 and 0.39
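
For a before and after like that, run times only compare fairly between work units of a similar angle range, so here is a minimal Python sketch that groups timings by class first. The class labels, function names and every timing below are made-up placeholders, not measurements from any host in this thread.

```python
from collections import defaultdict
from statistics import mean


def mean_time_by_class(results):
    """Average elapsed seconds per work-unit class.

    `results` is an iterable of (wu_class, elapsed_seconds) pairs, where
    wu_class is a label such as "VLAR", "mid" or "VHAR".
    """
    buckets = defaultdict(list)
    for wu_class, seconds in results:
        buckets[wu_class].append(seconds)
    return {wu_class: mean(times) for wu_class, times in buckets.items()}


def speedup_by_class(before, after):
    """Old mean time divided by new mean time, for classes seen in both runs."""
    old, new = mean_time_by_class(before), mean_time_by_class(after)
    return {c: old[c] / new[c] for c in old if c in new}


# Made-up timings in seconds, for illustration only.
run_038 = [("mid", 1200), ("mid", 1240), ("VHAR", 300)]
run_039 = [("mid", 1100), ("mid", 1100), ("VHAR", 290), ("VLAR", 2500)]
print(speedup_by_class(run_038, run_039))
```

Values above 1.0 mean the newer app was faster for that class; classes that appear in only one run are left out rather than guessed at.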

http://www.goodsearch.com/nonprofit/university-of-california-setihome.aspx
ID: 1174542 · Report as offensive
Profile Fred J. Verster
Volunteer tester
Avatar

Send message
Joined: 21 Apr 04
Posts: 3252
Credit: 31,903,643
RAC: 0
Netherlands
Message 1174544 - Posted: 29 Nov 2011, 15:14:07 UTC - in response to Message 1173756.  
Last modified: 29 Nov 2011, 15:18:10 UTC

I installed a GT 450 on a friend's machine running Vista 32.
Just changing from 2 to 3 instances increased RAC from 7000 to 10000.
If that's no benefit, I don't know what is.

Maybe WinXP is the problem.


XP's driver model is inherently 'Lower Latency', since it uses physical VRAM. All Vista/Win7 VRAM (WDDM driver) memory is 'virtualised', so as such there is added latency to be hidden that isn't there under XP. Multiple simultaneous tasks is one effective technique to do so, and there are others yet to be tried deeper in applications. In the end, when it comes down to underlying infrastructure, the newer models are more capable & end up more efficient, though the transition is most certainly full of pitfalls & side effects as well.

Technologically speaking XP Driver Model is dead, but not completely buried yet, just because the better replacement methods for doing things are still under continual ongoing refinement by all involved, including with the applications. Not all of the changes are directed at performance either, but at making GPUs 'better' in the long term (reliability, security, flexibility etc, none of which are small changes, or guaranteed to have no performance impact. )

[Edit:] Don't forget to throw in the usual heavy dose of 'your mileage may vary'. On my systems running 3 tasks at a time per GPU amounts to a clear 20%-30% throughput increase, but those are Win7x64 & different cards as well.

Jason


I run 2 hosts, one with a Q6600 + GTX470 and one with a QX9650 + GTX480. Running 2 MB WUs per GPU does not take double the time: a 0.4 AR WU takes 10-15% more time when doing 2 per GPU. For VHARs the difference is hard to measure; there is hardly any time difference when running 2 per GPU.
But this depends heavily on Compute Capability 2.0/2.1 (Fermi) and the number of CUDA cores, and of course there must be enough memory for 2 or more MB WUs!
ID: 1174544 · Report as offensive
Profile Mike Special Project $75 donor
Volunteer tester
Avatar

Send message
Joined: 17 Feb 01
Posts: 34258
Credit: 79,922,639
RAC: 80
Germany
Message 1174550 - Posted: 29 Nov 2011, 15:49:32 UTC
Last modified: 29 Nov 2011, 15:50:29 UTC

Run the Lunatics installer 0.39 first and finish a couple of work units.
Will look into your results afterwards and tell you what to change.

Mike


With each crime and every kindness we birth our future.
ID: 1174550 · Report as offensive
Profile SilentObserver64
Volunteer tester
Avatar

Send message
Joined: 21 Sep 05
Posts: 139
Credit: 680,037
RAC: 0
United States
Message 1174551 - Posted: 29 Nov 2011, 15:52:27 UTC - in response to Message 1174550.  
Last modified: 29 Nov 2011, 15:53:13 UTC

Run the Lunatics installer 0.39 first and finish a couple of work units.
Will look into your results afterwards and tell you what to change.

Mike


Sounds like a plan. Soon as I get a chance to do that, I'll run a couple WU's and then I'll repost here for input.

http://www.goodsearch.com/nonprofit/university-of-california-setihome.aspx
ID: 1174551 · Report as offensive
Profile SilentObserver64
Volunteer tester
Avatar

Send message
Joined: 21 Sep 05
Posts: 139
Credit: 680,037
RAC: 0
United States
Message 1174558 - Posted: 29 Nov 2011, 16:00:55 UTC

Just read the release notes, as well, on the new Lunatics App 0.39. I am running the current 11.11 drivers, so I guess I am going to have to uninstall them completely and then install an older 11.xx Catalyst, but the question is, which one? 11.02 or 11.03? I'm not sure from quickly browsing through the post. This, however, does raise some concerns for me in terms of stability and the bugs from the older apps. I'm wondering how this is going to affect my machine as a whole, not just BOINC. I don't wanna fry anything or get constant BSODs. Another question: can I use current Crossfire X profiles with an older Catalyst version?

http://www.goodsearch.com/nonprofit/university-of-california-setihome.aspx
ID: 1174558 · Report as offensive
Profile Fred J. Verster
Volunteer tester
Avatar

Send message
Joined: 21 Apr 04
Posts: 3252
Credit: 31,903,643
RAC: 0
Netherlands
Message 1174563 - Posted: 29 Nov 2011, 16:49:59 UTC - in response to Message 1174544.  
Last modified: 29 Nov 2011, 17:06:18 UTC

I installed a GT 450 on a friend's machine running Vista 32.
Just changing from 2 to 3 instances increased RAC from 7000 to 10000.
If that's no benefit, I don't know what is.

Maybe WinXP is the problem.


XP's driver model is inherently 'Lower Latency', since it uses physical VRAM. All Vista/Win7 VRAM (WDDM driver) memory is 'virtualised', so as such there is added latency to be hidden that isn't there under XP. Multiple simultaneous tasks is one effective technique to do so, and there are others yet to be tried deeper in applications. In the end, when it comes down to underlying infrastructure, the newer models are more capable & end up more efficient, though the transition is most certainly full of pitfalls & side effects as well.

Technologically speaking XP Driver Model is dead, but not completely buried yet, just because the better replacement methods for doing things are still under continual ongoing refinement by all involved, including with the applications. Not all of the changes are directed at performance either, but at making GPUs 'better' in the long term (reliability, security, flexibility etc, none of which are small changes, or guaranteed to have no performance impact. )

[Edit:] Don't forget to throw in the usual heavy dose of 'your mileage may vary'. On my systems running 3 tasks at a time per GPU amounts to a clear 20%-30% throughput increase, but those are Win7x64 & different cards as well.

Jason


I run 2 hosts, one with a Q6600 + GTX470 and one with a QX9650 + GTX480. Running 2 MB WUs per GPU does not take double the time: a 0.4 AR WU takes 10-15% more time when doing 2 per GPU. For VHARs the difference is hard to measure; there is hardly any time difference when running 2 per GPU.
But this depends heavily on Compute Capability 2.0/2.1 (Fermi) and the number of CUDA cores, and of course there must be enough memory for 2 or more MB WUs!


And I might add, I've not yet seen a program, for instance MovieMaker, Adobe Photoshop CS5, or a game, that clearly shows the benefits of CrossFire or SLI when using a single output (monitor).
Only (GPU) benchmarks do. :) (Crunching on GPUs doesn't need CrossFire or SLI, or it gets disabled.)

@ Jason, thanks for your clear explanation of the differences between Win XP and Vista/Win 7 and their related drivers and management!

Concerning drivers, I haven't noticed much difference, as long as you can keep your GPUs loaded. Using the Cat. 11.11 drivers at the moment, AstroPulse on (ATI) GPU sometimes gives too low a load. Playing around with the command-line settings (unroll, ffa_block, ffa_block_fetch) and the number of instances can change this.
Down-clocking when too little load is present can become a problem.
ID: 1174563 · Report as offensive
Profile Mike Special Project $75 donor
Volunteer tester
Avatar

Send message
Joined: 17 Feb 01
Posts: 34258
Credit: 79,922,639
RAC: 80
Germany
Message 1174571 - Posted: 29 Nov 2011, 17:05:48 UTC - in response to Message 1174558.  

Just read the release notes, as well, on the new Lunatics App 0.39. I am running the current 11.11 drivers, so I guess I am going to have to uninstall them completely and then install an older 11.xx Catalyst, but the question is, which one? 11.02 or 11.03? I'm not sure from quickly browsing through the post. This, however, does raise some concerns for me in terms of stability and the bugs from the older apps. I'm wondering how this is going to affect my machine as a whole, not just BOINC. I don't wanna fry anything or get constant BSODs. Another question: can I use current Crossfire X profiles with an older Catalyst version?


You can try 11.2; those drivers shouldn't give any trouble.
But I don't think the Crossfire X profiles will work under 11.2.

If anyone knows better, say so; I'm no gamer.




With each crime and every kindness we birth our future.
ID: 1174571 · Report as offensive
Profile Fred J. Verster
Volunteer tester
Avatar

Send message
Joined: 21 Apr 04
Posts: 3252
Credit: 31,903,643
RAC: 0
Netherlands
Message 1174573 - Posted: 29 Nov 2011, 17:12:13 UTC - in response to Message 1174571.  

Just read the release notes, as well, on the new Lunatics App 0.39. I am running the current 11.11 drivers, so I guess I am going to have to uninstall them completely and then install an older 11.xx Catalyst, but the question is, which one? 11.02 or 11.03? I'm not sure from quickly browsing through the post. This, however, does raise some concerns for me in terms of stability and the bugs from the older apps. I'm wondering how this is going to affect my machine as a whole, not just BOINC. I don't wanna fry anything or get constant BSODs. Another question: can I use current Crossfire X profiles with an older Catalyst version?


You can try 11.2; those drivers shouldn't give any trouble.
But I don't think the Crossfire X profiles will work under 11.2.

If anyone knows better, say so; I'm no gamer.



Well you can quickly check this with GPUz, or check this link.


ID: 1174573 · Report as offensive
Profile SilentObserver64
Volunteer tester
Avatar

Send message
Joined: 21 Sep 05
Posts: 139
Credit: 680,037
RAC: 0
United States
Message 1179289 - Posted: 19 Dec 2011, 16:15:50 UTC

In light of a current post, ATI Catalyst Driver 11.12, and despite the initial reports in the Lunatics Windows Installer v0.39 release notes, it appears that the current drivers are doing better than expected, without the problems the notes said the ATI drivers would have (not compatible with the current installer version). So it looks like I can keep my current driver version, which is great, but to be honest I still haven't had time to set up and run BOINC on my "Big Rig".

I did, however, install the current Lunatics installer on 2 of the other machines I have it running on. They do initially appear to be running better, with better times, but again I haven't had time to sit down and compare recent WUs with previous WUs to be sure.

With the holidays coming up, I've been working more and more, because my employees have been taking time off. I am a supervisor for a homeland security classified chemical plant, and in this business we need 24/7 coverage. Unfortunately I am bound by my duties here to work any needed hours on top of my regular hours to ensure that coverage is kept, and we are already understaffed as it is, so you can imagine the stress I am under and why I haven't had time. However, that doesn't mean I'm no longer interested in these projects; I will eventually find the time to apply these apps and turn them on again for my "Big Rig".

So, anyone running ATI HD cards (preferably HD5X and up), please let me know your current setups and how well they are working for you. If you're running a dual ATI setup or some combination of it, please post your current configuration and the results of that configuration, so that I can do the same after figuring out, hopefully with your help, what the best setup is.

Thanks in advance, and happy holidays to everyone who celebrates them. SilentObserver64.

http://www.goodsearch.com/nonprofit/university-of-california-setihome.aspx
ID: 1179289 · Report as offensive


 
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.