Message boards :
Number crunching :
slow computer reaching its limit
Author | Message |
---|---|
arkayn Send message Joined: 14 May 99 Posts: 4438 Credit: 55,006,323 RAC: 0 |
"I run lunatics on a Win7 I7 feeding two GTX770's and have my ap_cmdline_win_x86_SSE2_OpenCL_NV.txt set to:" I think I would go with the 9500GT being an issue. I run this on both my 670 and 760 with no problems: -use_sleep -unroll 18 -ffa_block 6144 -hp |
TBar Send message Joined: 22 May 99 Posts: 5204 Credit: 840,779,836 RAC: 2,768 |
ok now ,,, The strange part is, you say you are running 4 at a time, the times would indicate 4 at a time, but there is only 1 completed task showing in that time period. You appear to be missing a few. Try running just 1 AP at a time and see what that does. I'm confused by your CPU usage statement. Make sure your Computing preferences are set to 'Use at most 100.00% CPU time'. Try 1 at a time with the settings; -unroll 12 -ffa_block 8192 -ffa_block_fetch 4096 -sbs 256 -hp See what that shows for results. BTW. If you are trying to run 4 APs at a time with a 9500, the whole system will probably croak. You can't assign 4 at a time on a system using a 9500. You also appear to have an old ATI card showing up in there; Number of OpenCL platforms: 2 OpenCL Platform Name: ATI Stream http://setiathome.berkeley.edu/result.php?resultid=3533724267 |
juan BFP Send message Joined: 16 Mar 07 Posts: 9786 Credit: 572,710,851 RAC: 3,799 |
On the 770 you could try: -use_sleep -unroll 12 -ffa_block 12288 -ffa_block_fetch 6144 (it may not be the best, but it's fast & stable). But be aware, that will not work with old models like the 9500GT. You can run 2 different families of GPUs on the same host, but you lose on one side or both: it is impossible to optimize the configuration for both architectures (even from the same manufacturer, like NV) at the same time unless you run 2 instances of BOINC on the same host. So the best thing to do is take out the 9500GT, optimize for the 770 and put that old GPU in another host. And BTW, don't forget that the 9500GT, like any other pre-Fermi model, will slow everything down if you run 2 WUs at the same time; the 770, on the other hand, easily handles 2 WUs at a time. My 2 cents |
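For anyone trying to compare the three tuning lines suggested so far, here is a minimal sketch that splits each one into flags and values. The flag strings are copied from the posts above; the `parse_cmdline` helper and the labels in `suggestions` are only illustrative names, and the sketch says nothing about what each flag does inside the app. It simply shows that both lines that set `-ffa_block_fetch` use half of `-ffa_block`.

```python
def parse_cmdline(line):
    """Split a tuning line into {flag: value}; bare switches map to True."""
    tokens = line.split()
    params = {}
    i = 0
    while i < len(tokens):
        flag = tokens[i]
        nxt = tokens[i + 1] if i + 1 < len(tokens) else None
        if nxt is not None and not nxt.startswith("-"):
            params[flag] = int(nxt)   # flag followed by a numeric value
            i += 2
        else:
            params[flag] = True       # switch with no value, e.g. -hp
            i += 1
    return params


# Tuning lines quoted in this thread (labels are just for display).
suggestions = {
    "arkayn (670/760)": "-use_sleep -unroll 18 -ffa_block 6144 -hp",
    "TBar (1 AP at a time)": "-unroll 12 -ffa_block 8192 -ffa_block_fetch 4096 -sbs 256 -hp",
    "juan BFP (770)": "-use_sleep -unroll 12 -ffa_block 12288 -ffa_block_fetch 6144",
}

parsed = {who: parse_cmdline(line) for who, line in suggestions.items()}

for who, params in parsed.items():
    print(who, params)
    if "-ffa_block" in params and "-ffa_block_fetch" in params:
        ratio = params["-ffa_block"] // params["-ffa_block_fetch"]
        print("  ffa_block / ffa_block_fetch =", ratio)
```

Whichever line you pick goes into the ap_cmdline text file as a single line; change one value at a time so you can tell which change helped.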
TBar Send message Joined: 22 May 99 Posts: 5204 Credit: 840,779,836 RAC: 2,768 |
It looks as though not much has changed. Hard to believe all those machines/cards have the same mechanical problem(s). It must be some setting(s). Seems your Vista host is running normally, you could compare settings. I would suggest swapping the Vista card with one of the 770s, if possible, and see what happens. |
EdwardPF Send message Joined: 26 Jul 99 Posts: 389 Credit: 236,772,605 RAC: 374 |
I am still very confused ... maybe this will help me ... on a gpu to gpu basis: are the kepler gpu's 1/3 the speed of a fermi gpu?? I.E are 512 fermi gpu's the same AP crunching power as 1536 kepler gpu's?? Ed F |
tbret Send message Joined: 28 May 99 Posts: 3380 Credit: 296,162,071 RAC: 40 |
I'm confused by your question. You asked "on a gpu to gpu basis," and I don't know what that means. Do you mean a 460 (Fermi) vs a 560 (Fermi) vs 660 (Kepler) vs 750 (Maxwell)? If that's what you mean, I might be able to help. The model numbers don't denote crunching equivalence. It's just a "number" they stick on a card to show where it fits in the current line-up. (a 560 should be "better" than a 550Ti, for instance; but it says nothing about whether a 460 is worse than a 560) My experience with the 600-series is that it crunched about like the model number BELOW it in the prior series. That is, a 670 crunches about like a 560Ti and a 660Ti crunches about like a 560. I think that was temporary as they went from a separate "shader" clock to a "unified" clock. The architectural changes are only "improvements" if the application is written to take advantage of whatever new whiz-bang thing the new architecture allows. So, I have little doubt that a GTX 660Ti might play a new game better than the old 560Ti would, but only because the new game was written that way. Therefore... it's hard to answer the question as you asked it. My 660Ti is not as fast as my 560Ti at crunching. My GT 640 was nowhere close to as fast at crunching as my GT 240. My GTX 470 seems to be faster than my GTX 560Ti-448 even though I would have expected them to be substantially the same. Is that what you were asking-about? |
EdwardPF Send message Joined: 26 Jul 99 Posts: 389 Credit: 236,772,605 RAC: 374 |
"My 660Ti is not as fast as my 560Ti at crunching." Yes (I think) ... My nvidia 770 has 1536 GPU's and it seems to crunch AP's at about the same rate as my older nvidia 560 ti with 448 GPU's ... in my naivete' I expected the nvidia 770 to far outperform the nvidia 560 ti. After some tweaking of the ap_cmdline_win_x86_SSE2_OpenCL_NV file the nvidia 770 is now running about 20% faster ... but the 560 ti may have run faster with the same tweaks ... I don't know ... Anyway ... perhaps buying the 770 was not worth the money ... or my old rig has just hit its performance limit ... I don't know .. I was just looking for ideas from you master-minds out there ... Sigh ... "naivete'" combined with "assume" will do that to the wallet ... Ed F |
Mike Send message Joined: 17 Feb 01 Posts: 34253 Credit: 79,922,639 RAC: 80 |
Regarding APs, the difference between a 560Ti and a 770 is not very big. I would expect less than 10% with the sharpest timings. With each crime and every kindness we birth our future. |
Grant (SSSF) Send message Joined: 19 Aug 99 Posts: 13720 Credit: 208,696,464 RAC: 304 |
"I.E are 512 fermi gpu's the same AP crunching power as 1536 kepler gpu's??" To clarify: GPU is Graphics Processing Unit. That's the chip that is on the video card. Within a GPU there are multiple processing units; in the case of Nvidia, CUDA cores (AMD uses different terminology). Different cards have different numbers of these. Also, different families of cards have different numbers & types of processing units within each CUDA core. For MB, Maxwell is considerably slower at processing shorties than the previous generations were, but about the same for longer running WUs. This is due to changes in the architecture combined with current video driver models. The fact is Maxwell is capable of much, much more, but the application developers need to come to terms with & overcome the limitations of the present video (and operating system) driver implementations. When it comes to the number of WUs crunched per Watt of power used, Maxwell is in a league of its own & leaves all previous generations of card well behind, even with its less than optimal performance. Grant Darwin NT |
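The "WUs per Watt" comparison can be made concrete with a small sketch. The card names and figures below are hypothetical placeholders, not measurements from anyone in this thread; plug in your own WU rate and wall-power reading.

```python
def wus_per_kwh(wu_per_hour, watts):
    """Work units completed per kilowatt-hour of energy used."""
    if watts <= 0:
        raise ValueError("watts must be positive")
    return wu_per_hour * 1000.0 / watts


# Hypothetical placeholder figures (NOT measured): replace with your own.
cards = {
    "older high-end card": {"wu_per_hour": 10.0, "watts": 250.0},
    "newer efficient card": {"wu_per_hour": 8.0, "watts": 60.0},
}

efficiency = {name: wus_per_kwh(**spec) for name, spec in cards.items()}

for name, value in efficiency.items():
    print(f"{name}: {value:.1f} WU per kWh")
```

With these made-up inputs the slower card still wins on efficiency, which is the kind of trade-off being described: raw throughput and work per unit of energy are two different rankings.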
tbret Send message Joined: 28 May 99 Posts: 3380 Credit: 296,162,071 RAC: 40 |
Don't get discouraged. The Kepler and Maxwell architectures will allow for some improvements in the applications. Unfortunately, we have to wait for someone who knows how to sit down and do that over the course of weeks and months and years, as a volunteer effort, when they have time, when they aren't trying to solve other issues, battling with CUDA and OpenCL changes, driver changes, and whatever it is they do when they aren't doing things for free. But that's why I just keep my old cards crunching at this point. Whatever work they do is work that someone else would have to do if my GTX 460 wasn't still on the job for however many years that has been. The day will come when the newer cards are much faster than the older cards and then I'll have to think about replacing them. And then there are other projects where they can't really even cope with the advances that could be made if they could re-write their software to take advantage of FERMI architecture!!! There are still projects running on very old code. But then there's the other side of things. If suddenly SETI could process things twice as fast, could they handle the data-flow? Would they run out of data to crunch? It'll all work itself out over time. |
juan BFP Send message Joined: 16 Mar 07 Posts: 9786 Credit: 572,710,851 RAC: 3,799 |
Some of us noticed that some time ago, especially when the 780 was released. Let me try to explain it using the common high-end NVidia models, the ones I tested at least. The tests show the 680 (or the 770, which is, let's say, a better 680) produces almost the same output as the 580 does, though we all expected it to produce a lot more. Then we noticed the 780 actually produces about the same too. The common explanation for that is the latency problem. In easy-to-understand words: the GPU is faster than the driver/hardware/software can feed it, so even if the GPU is more powerful and has more cores, it "wastes" a big portion of its time doing nothing. AFAIK there are some good people working on that problem and we can expect news about it in some time, but for now, until these new builds are available, the reality is simply the 580 is the winner against the others (680/770/780) in SETI terms. Of course there are some overclocked or newer models (like the 780Ti) that actually are faster, but that's not the point; I'm talking about similar basic models. But you can't forget the power used to crunch: to do the same job the 580 uses a lot more power than the 770/780, and that makes a huge difference when you have a farm with several of them running. And there is another important point: for example, 3x780 produces a lot more, at least in SETI, than 2x780Ti and costs about the same, but uses more power. Almost the same happens with 3x770 vs 2x780. So I imagine the same is happening with the mid-range models. But then you need to remember the slot limitation problem: if your motherboard has only 2 slots and you need more GPU power, you are actually limited to buying the most expensive models. I'm almost sure Nvidia really knows that when they put the price on their products. 
I see some examples of very impressive crunching times with the 750/750Ti/760 models here on SETI, which could indicate something has changed, but I have not seen an actual test that really proves it or not. What is clear is that mid-range GPUs are normally easier to handle than their bigger cousins, mainly due to the way the work is distributed; there are a lot of examples around. Until now there are no high-end Maxwells available to make the comparison, but it seems the latency problem is still happening with the mid-range Maxwells that are available. |
jason_gee Send message Joined: 24 Nov 06 Posts: 7489 Credit: 91,093,184 RAC: 0 |
... That's right. The trick is to get the scaling right in advance of Big Maxwells, and have automatic scaling perfected before Pascal &/or Volta. Some testing to get out of the way behind the scenes with nv, then the Latency issues are getting some hefty glove slap :) "Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions. |
juan BFP Send message Joined: 16 Mar 07 Posts: 9786 Credit: 572,710,851 RAC: 3,799 |
"Some testing to get out of the way behind the scenes with nv, then the Latency issues are getting some hefty glove slap :)" That I really like, and since the information is coming from our "master guru", it seems the end of the tunnel is a lot closer now. I need to be ready with some celebration cigars. :) |
Wedge009 Send message Joined: 3 Apr 99 Posts: 451 Credit: 431,396,357 RAC: 553 |
"...the reality is simply the 580 is the winner against the others (680/770/780) in SETI terms." I realise different people have different set-ups, but in my experience the GF110 GPU used in the GTX 580 - considering S@h processing performance only, of course - is comparable to the GK104 GPU used in the GTX 670/680/760/770. (The GTX 670 I used to have wasn't as good as my GTX 580.) But I find the GK110 GPU in the GTX 780 (even the initial non-Ti release) to be quite a bit better than the GTX 580 - for AP at least, even if the improvement for MB is perhaps not as significant. Latency issues aside, this makes sense to me because I understand the Kepler architecture to be much more focused on graphical performance and power efficiency than Fermi, at the expense of less emphasis on GPGPU performance. And the GK104 seems to have been derived from the GF114 used in the GTX 560 Ti, while the GK110 is the actual successor to the GF110 (AKA 'Big' Kepler). Nonetheless, if work is being done to mitigate the latency issues across the GPUs and OSs then this can only be a good thing, so I agree this is something to look forward to. (: Soli Deo Gloria |
juan BFP Send message Joined: 16 Mar 07 Posts: 9786 Credit: 572,710,851 RAC: 3,799 |
Please let me correct my words in order to avoid any misunderstandings: the 770/780 really are a little faster in SETI than the 580, but not by as much as we expected when you look at the number of cores, newer processor, higher clock speed, etc. When I bought the first 770/780s I expected a much bigger gain in daily production over the 580s they replaced; what I got was very similar numbers, maybe a 10-20% gain only. But something was very clear: they use a lot less power to do the job. When I asked why, I received the explanation of the latency problem. We all hope that when the latency problem is out of the equation, as Jason posted, everything will change and the real power of these babies will be unleashed, especially in the case of the 780, which actually runs far from its true potential. Fingers crossed. Actually I don't have any 580 or 770 anymore, only a few 780FTWs with ACX air cooling from EVGA, and they run quiet and cool (in the range of high 60s to low 70s C), even when they crunch 2 or 3 SETI WUs at a time in my hot tropical country, and I'm very happy with them. |
Wedge009 Send message Joined: 3 Apr 99 Posts: 451 Credit: 431,396,357 RAC: 553 |
Okay, that makes more sense. Also, when we are talking about already small MB tasks (order of a few minutes), then we are probably hitting limits like latency anyway. For AP, I find the GTX 780 getting close to 50% faster than the GTX 580. Soli Deo Gloria |
juan BFP Send message Joined: 16 Mar 07 Posts: 9786 Credit: 572,710,851 RAC: 3,799 |
As always, YMMV, and you are using the newer Lunatics builds, not the original stock ones we used when the GPUs were launched a year or more ago. In my case the 780FTWs with MB were only 10-15% faster than the 770SC but cost 50% more at that time. IIRC the MB app was still the same at that time (x41zc cuda50). If my memory serves, the latency problem made a host with a 780 running a single WU show only 50% GPU usage; running more WUs at a time increased the GPU usage, but still did not use all the GPU's potential. Jason could explain that in more technical detail. BTW, at that time I didn't crunch AP (the creditscrew mess was not so well known then), so I'm not sure about AP crunching times. |
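The 50% figure above suggests a very simple way to think about why running more WUs at a time helps. The sketch below is an idealized toy model, not how the apps or drivers actually behave: it assumes every concurrent task keeps the GPU busy for the same fraction of time and that the idle gaps of one task can be filled perfectly by another. Real hosts do worse than this, as noted above.

```python
def ideal_gpu_utilization(single_task_util, concurrent_tasks):
    """Upper bound on GPU busy fraction when idle gaps overlap perfectly.

    single_task_util: busy fraction observed with one task (0 < x <= 1).
    concurrent_tasks: number of tasks run at the same time (>= 1).
    """
    if not 0.0 < single_task_util <= 1.0:
        raise ValueError("single_task_util must be in (0, 1]")
    if concurrent_tasks < 1:
        raise ValueError("concurrent_tasks must be at least 1")
    return min(1.0, single_task_util * concurrent_tasks)


# 0.5 is the single-WU GPU usage recalled in the post above.
for n in (1, 2, 3):
    print(n, "task(s):", ideal_gpu_utilization(0.5, n))
```

The model says two tasks would be enough to saturate the card; the observation that usage still falls short with more WUs shows how much overhead the toy model leaves out.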
Mike Send message Joined: 17 Feb 01 Posts: 34253 Credit: 79,922,639 RAC: 80 |
The AP apps improved a lot in the last 2 years. When I started with APs, 2 units took over 90 minutes. Now I'm at 20 minutes for zero blanked, of course with some tweaks. With each crime and every kindness we birth our future. |
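For a rough sense of what that improvement means per day, here is the arithmetic as a sketch. It assumes (the post does not say) that the 20-minute figure is also for 2 units at a time, and it treats "over 90 minutes" as exactly 90, so the real gain would be somewhat larger.

```python
MINUTES_PER_DAY = 24 * 60


def units_per_day(batch_size, minutes_per_batch):
    """Units finished per day if batches run back to back."""
    return batch_size * MINUTES_PER_DAY / minutes_per_batch


old_rate = units_per_day(2, 90)   # 2 units per 90 minutes
new_rate = units_per_day(2, 20)   # assumed: 2 units per 20 minutes
speedup = new_rate / old_rate

print(old_rate, new_rate, speedup)  # 32.0 144.0 4.5
```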
Wedge009 Send message Joined: 3 Apr 99 Posts: 451 Credit: 431,396,357 RAC: 553 |
Just for the record, I'm comparing like with like when talking about my experiences with GTX 580 and GTX 780. 20 minutes for AP is impressive, especially on Tahiti. Better than what I'm getting on average with Hawaii. Soli Deo Gloria |