slow computer reaching its limit

Message boards : Number crunching : slow computer reaching its limit

arkayn
Volunteer tester
Joined: 14 May 99
Posts: 4438
Credit: 55,006,323
RAC: 0
United States
Message 1520611 - Posted: 24 May 2014, 14:58:14 UTC - in response to Message 1520586.  

I run lunatics on a Win7 I7 feeding two GTX770's and have my ap_cmdline_win_x86_SSE2_OpenCL_NV.txt set to:

-unroll 10 -ffa_block 6144 -ffa_block_fetch 1536 -hp

because the instructions in ReadMe_AstroPulse_OpenCL_NV.txt say that's what to set it to for mid range cards with less than 12 compute units.

The GTX770 only has 8. You may be inducing some memory thrashing because you're trying to feed your 770 too much at a time.

My AP times are averaging around 1,800 seconds while your AP times are around 11,000 seconds.

The only other difference I see is that you have some 9500GTs installed alongside your GTX770's. I have some 9500GTs sitting on a shelf because they're only slightly faster than a core. They're dogs. If I were you, I'd remove them because they may be the biggest factor in slowing down your 770s--but that's only a guess because I'm not absolutely sure how the scheduler deals with two such *widely* different cards in the same system.


I think I would go with the 9500GT being an issue. I run this on both my 670 and 760 with no problems.
-use_sleep -unroll 18 -ffa_block 6144 -hp

ID: 1520611 · Report as offensive
TBar
Volunteer tester

Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1520643 - Posted: 24 May 2014, 17:37:22 UTC - in response to Message 1520575.  
Last modified: 24 May 2014, 18:10:53 UTC

ok now ,,,

I have been running for only 2 days now with the new set up BUT nothing has changed!

I have ap_cmdline_win_x86_SSE2_OpenCL_NV set to


-unroll 12 -ffa_block 8192 -ffa_block_fetch 4096 -hp

and priority is indeed set to "high". The computer can chew through an average
AP in just over 30 minutes, but it is still crunching at about 30,000 cobblestones per day ... again ... the Nvidia 770 is almost exactly the same as the old Nvidia 560 Ti!!

I have tried -unroll 24 -ffa_block 16394 -ffa_block_fetch 8192 -hp for a few hours and all the WUs in that time ran at the same speed!


Anyone out there with an Nvidia 770 who can help me learn its true potential??

Ed F

The strange part is, you say you are running 4 at a time, the times would indicate 4 at a time, but there is only 1 completed task showing in that time period. You appear to be missing a few. Try running just 1 AP at a time and see what that does. I'm confused by your CPU usage statement. Make sure your Computing preferences are set to 'Use at most 100.00% CPU time'. Try 1 at a time with the settings;
-unroll 12 -ffa_block 8192 -ffa_block_fetch 4096 -sbs 256 -hp
See what that shows for results.

BTW. If you are trying to run 4 APs at a time with a 9500, the whole system will probably croak. You can't assign 4 at a time on a system using a 9500. You also appear to have an old ATI card showing up in there;
Number of OpenCL platforms: 2
OpenCL Platform Name: ATI Stream
http://setiathome.berkeley.edu/result.php?resultid=3533724267
ID: 1520643 · Report as offensive
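
A note on comparing the run times quoted above. Elapsed time per task is misleading when one host runs several tasks side by side, so it can help to convert to tasks per day. The sketch below uses only the two figures from the thread (about 1,800 seconds and about 11,000 seconds) and the "4 at a time" remark. It assumes the 1,800-second host runs one AP at a time, which the thread does not actually state, and the function names are made up for illustration.

```python
SECONDS_PER_DAY = 86_400


def tasks_per_day(elapsed_s: float, concurrent: int = 1) -> float:
    """Tasks finished per day by one GPU that runs `concurrent` tasks
    side by side, each taking `elapsed_s` wall-clock seconds."""
    if elapsed_s <= 0 or concurrent < 1:
        raise ValueError("elapsed_s must be > 0 and concurrent must be >= 1")
    return concurrent * SECONDS_PER_DAY / elapsed_s


def effective_seconds_per_task(elapsed_s: float, concurrent: int = 1) -> float:
    """Wall-clock seconds the GPU spends per finished task."""
    if elapsed_s <= 0 or concurrent < 1:
        raise ValueError("elapsed_s must be > 0 and concurrent must be >= 1")
    return elapsed_s / concurrent


if __name__ == "__main__":
    # (label, elapsed seconds from the thread, tasks at once)
    # The "1 at a time" for the 1,800 s host is an assumption.
    cases = [
        ("~1,800 s, assumed 1 at a time", 1_800, 1),
        ("~11,000 s, 4 at a time", 11_000, 4),
    ]
    for label, elapsed, n in cases:
        print(f"{label}: {effective_seconds_per_task(elapsed, n):.0f} s/task, "
              f"{tasks_per_day(elapsed, n):.1f} tasks/day")
```

Under those assumptions the gap between the two hosts is much smaller than 1,800 vs 11,000 suggests, but the 4-at-a-time host still comes out behind, which fits the advice to try 1 at a time first.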
juan BFP (Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor)
Volunteer tester
Joined: 16 Mar 07
Posts: 9786
Credit: 572,710,851
RAC: 3,799
Panama
Message 1520647 - Posted: 24 May 2014, 17:53:50 UTC
Last modified: 24 May 2014, 18:09:31 UTC

On the 770 you could try:

-use_sleep -unroll 12 -ffa_block 12288 -ffa_block_fetch 6144

It may not be the best, but it's fast & stable.

But be aware, that doesn't work with old models like the 9500GT.

You can run 2 different families of GPUs on the same host, but you lose on one side or both. It's impossible to optimize the configuration for both architectures at the same time (even from the same manufacturer, like NV) unless you run 2 instances of BOINC on the same host.

So the best thing to do is take out the 9500GT, optimize for the 770 and put that old GPU in another host.

And BTW, don't forget: the 9500GT, like any other pre-Fermi model, will slow down everything if you run 2 WUs at the same time. The 770, on the other hand, easily handles 2 WUs at a time.

my 2 Cents
ID: 1520647 · Report as offensive
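
With this many command lines flying around, a small helper for comparing them may be useful. The sketch below parses the option strings quoted in this thread into dictionaries and checks one pattern: in most of the suggestions that give both values, -ffa_block is a whole multiple of -ffa_block_fetch. Whether the application actually requires that is not stated here (ReadMe_AstroPulse_OpenCL_NV.txt is the place to check), so treat the check as an assumption. The function names are hypothetical and not part of any SETI@home tool.

```python
def parse_cmdline(line: str) -> dict:
    """Split an option string into {option: value}.
    An option followed by a number gets that int; a bare flag gets True."""
    tokens = line.split()
    opts = {}
    i = 0
    while i < len(tokens):
        name = tokens[i]
        if not name.startswith("-"):
            raise ValueError(f"unexpected token {name!r}")
        if i + 1 < len(tokens) and tokens[i + 1].isdigit():
            opts[name] = int(tokens[i + 1])
            i += 2
        else:
            opts[name] = True
            i += 1
    return opts


def block_is_multiple_of_fetch(opts: dict):
    """True/False if both values are given as numbers, otherwise None."""
    block = opts.get("-ffa_block")
    fetch = opts.get("-ffa_block_fetch")
    if type(block) is not int or type(fetch) is not int or fetch <= 0:
        return None
    return block % fetch == 0


if __name__ == "__main__":
    # Command lines exactly as quoted in the thread.
    quoted = [
        "-unroll 10 -ffa_block 6144 -ffa_block_fetch 1536 -hp",
        "-use_sleep -unroll 18 -ffa_block 6144 -hp",
        "-unroll 12 -ffa_block 8192 -ffa_block_fetch 4096 -hp",
        "-unroll 24 -ffa_block 16394 -ffa_block_fetch 8192 -hp",
        "-unroll 12 -ffa_block 8192 -ffa_block_fetch 4096 -sbs 256 -hp",
        "-use_sleep -unroll 12 -ffa_block 12288 -ffa_block_fetch 6144",
    ]
    for line in quoted:
        print(block_is_multiple_of_fetch(parse_cmdline(line)), line)
```

The only line that fails the check is the one with -ffa_block 16394, which may simply be a typo for 16384 in the original post. That is a guess, not something the thread confirms.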
TBar
Volunteer tester

Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1521028 - Posted: 25 May 2014, 21:49:21 UTC

It looks as though not much has changed. Hard to believe all those machines/cards have the same mechanical problem(s). It must be some setting(s). Seems your Vista host is running normally, you could compare settings. I would suggest swapping the Vista card with one of the 770s, if possible, and see what happens.
ID: 1521028 · Report as offensive
EdwardPF
Volunteer tester

Joined: 26 Jul 99
Posts: 389
Credit: 236,772,605
RAC: 374
United States
Message 1523427 - Posted: 2 Jun 2014, 2:17:24 UTC - in response to Message 1521058.  

I am still very confused ... maybe this will help me ...

on a gpu to gpu basis:

are the kepler gpu's 1/3 the speed of a fermi gpu??

I.E are 512 fermi gpu's the same AP crunching power as 1536 kepler gpu's??

Ed F
ID: 1523427 · Report as offensive
tbret
Volunteer tester
Joined: 28 May 99
Posts: 3380
Credit: 296,162,071
RAC: 40
United States
Message 1523430 - Posted: 2 Jun 2014, 3:24:48 UTC - in response to Message 1523427.  



I am still very confused ... maybe this will help me ...



I'm confused by your question.

You asked "on a gpu to gpu basis," and I don't know what that means. Do you mean a 460 (Fermi) vs a 560 (Fermi) vs 660 (Kepler) vs 750 (Maxwell)?

If that's what you mean, I might be able to help. The model numbers don't denote crunching equivalence.

It's just a "number" they stick on a card to show where it fits in the current line-up. (a 560 should be "better" than a 550Ti, for instance; but it says nothing about whether a 460 is worse than a 560)

My experience with the 600-series is that it crunched about like the model number BELOW it in the prior series. That is, a 670 crunches about like a 560Ti and a 660Ti crunches about like a 560.

I think that was temporary as they went from a separate "shader" clock to a "unified" clock.

The architectural changes are only "improvements" if the application is written to take advantage of whatever new whiz-bang thing the new architecture allows.

So, I have little doubt that a GTX 660Ti might play a new game better than the old 560Ti would, but only because the new game was written that way.

Therefore... it's hard to answer the question as you asked it.

My 660Ti is not as fast as my 560Ti at crunching.
My GT 640 was nowhere close to as fast at crunching as my GT 240.
My GTX 470 seems to be faster than my GTX 560Ti-448 even though I would have expected them to be substantially the same.

Is that what you were asking-about?
ID: 1523430 · Report as offensive
EdwardPF
Volunteer tester

Joined: 26 Jul 99
Posts: 389
Credit: 236,772,605
RAC: 374
United States
Message 1523435 - Posted: 2 Jun 2014, 3:55:11 UTC - in response to Message 1523430.  

My 660Ti is not as fast as my 560Ti at crunching.
My GT 640 was nowhere close to as fast at crunching as my GT 240.
My GTX 470 seems to be faster than my GTX 560Ti-448 even though I would have expected them to be substantially the same.

Is that what you were asking-about?


yes (I think) ...

My nvidia 770 has 1536 GPU's and it seems to crunch AP's at about the same rate as my older nvidia 560 ti with 448 GPU's ... in my naivete' I expected the nvidia 770 to far outperform the nvidia 560 ti.

After some tweaking of the ap_cmdline_win_x86_SSE2_OpenCL_NV file the nvidia 770 is now running about 20% faster ... but the 560 ti may have run faster with the same tweaks ... I don't know ...

Anyway ... perhaps buying the 770 was not worth the money ...

or my old rig has just hit its performance limit ... I don't know ..

I was just looking for ideas from you master-minds out there ...

Sigh ... "naivete'" combined with "assume" will do that to the wallet ...

Ed F
ID: 1523435 · Report as offensive
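
On the 1536 vs 448 question, a back-of-the-envelope sketch may help show why core counts alone mislead. As far as I know, Fermi cards run their shader cores at twice the core clock, while Kepler uses a single unified clock, which is the change mentioned a few posts up. The core counts below are the ones given in the thread. The clock figures are reference-spec values from memory and are NOT from the thread, so treat them as assumptions and substitute your own card's numbers. The 2 FLOPs per core per clock is the usual fused multiply-add convention for theoretical peak.

```python
def peak_gflops(cores: int, shader_clock_mhz: float,
                flops_per_core_per_clock: int = 2) -> float:
    """Theoretical single-precision peak in GFLOPS.
    This is a paper figure, not a crunching benchmark."""
    return cores * shader_clock_mhz * flops_per_core_per_clock / 1000.0


# Core counts are from the thread; clocks are assumed reference values.
CARDS = {
    "GTX 560 Ti (448 cores, Fermi)": {"cores": 448, "shader_clock_mhz": 1464.0},
    "GTX 770 (Kepler)": {"cores": 1536, "shader_clock_mhz": 1046.0},
}

if __name__ == "__main__":
    fermi = CARDS["GTX 560 Ti (448 cores, Fermi)"]
    kepler = CARDS["GTX 770 (Kepler)"]
    core_ratio = kepler["cores"] / fermi["cores"]
    peak_ratio = peak_gflops(**kepler) / peak_gflops(**fermi)
    print(f"core count ratio:       {core_ratio:.2f}x")
    print(f"theoretical peak ratio: {peak_ratio:.2f}x")
```

With these assumed clocks the paper advantage shrinks from the core-count ratio once the slower per-core clock is counted, and the replies in this thread report that real AP and MB results come out far closer still. So theoretical peak is an upper bound, not a prediction.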
Mike (Special Project $75 donor)
Volunteer tester
Joined: 17 Feb 01
Posts: 34253
Credit: 79,922,639
RAC: 80
Germany
Message 1523479 - Posted: 2 Jun 2014, 7:00:36 UTC

For APs, the difference between a 560Ti and a 770 is not very big.
I would expect less than 10% with the sharpest timings.


With each crime and every kindness we birth our future.
ID: 1523479 · Report as offensive
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 13720
Credit: 208,696,464
RAC: 304
Australia
Message 1523486 - Posted: 2 Jun 2014, 7:11:20 UTC - in response to Message 1523427.  

I.E are 512 fermi gpu's the same AP crunching power as 1536 kepler gpu's??


To clarify- GPU is Graphics Processing unit. That's the chip that is on the video card.

Within a GPU there are multiple processing units; in Nvidia's case these are called CUDA cores (AMD uses different terminology). Different cards have different numbers of these. Also, different families of cards have different numbers & types of processing units within each CUDA core.

For MB, Maxwell is considerably slower at processing shorties than the previous generations were, but about the same for longer running WUs. This is due to changes in the architecture combined with current video driver models.
The fact is Maxwell is capable of much, much more, but the application developers need to come to terms with & overcome the limitations of the present video (and operating system) driver implementations.

When it comes to the number of WUs crunched per Watt of power used, Maxwell is in a league of its own & leaves all previous generations of card well behind, even with its less than optimal performance.
Grant
Darwin NT
ID: 1523486 · Report as offensive
tbret
Volunteer tester
Joined: 28 May 99
Posts: 3380
Credit: 296,162,071
RAC: 40
United States
Message 1523529 - Posted: 2 Jun 2014, 9:12:06 UTC - in response to Message 1523435.  



Anyway ... perhaps buying the 770 was not worth the money ...



Don't get discouraged. The Kepler and Maxwell architectures will allow for some improvements in the applications.

Unfortunately, we have to wait for someone who knows how to sit down and do that over the course of weeks and months and years, as a volunteer effort, when they have time, when they aren't trying to solve other issues, battling with CUDA and OpenCL changes, driver changes, and whatever it is they do when they aren't doing things for free.

But that's why I just keep my old cards crunching at this point. Whatever work they do is work that someone else would have to do if my GTX 460 wasn't still on the job for however many years that has been.

The day will come when the newer cards are much faster than the older cards and then I'll have to think about replacing them.

And then there are other projects where they can't really even cope with the advances that could be made if they could re-write their software to take advantage of FERMI architecture!!! There are still projects running on very old code.

But then there's the other side of things. If suddenly SETI could process things twice as fast, could they handle the data-flow? Would they run out of data to crunch?

It'll all work itself out over time.
ID: 1523529 · Report as offensive
juan BFP (Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor)
Volunteer tester
Joined: 16 Mar 07
Posts: 9786
Credit: 572,710,851
RAC: 3,799
Panama
Message 1523564 - Posted: 2 Jun 2014, 11:21:12 UTC
Last modified: 2 Jun 2014, 11:46:19 UTC

Some of us noticed that some time ago, especially when the 780 was released. Let me try to explain it using the common high-end NVidia models, the ones I have tested at least.

The tests show that the 680 (or the 770, which is, let's say, a better 680) produces almost the same output as the 580, though we all expected it to produce a lot more. Then we noticed the 780 actually produces about the same too. The common explanation for that is the latency problem. In easy-to-understand words: the GPU is faster than the driver/hardware/software can feed it, so even though the GPU is more powerful and has more cores, it "wastes" a big portion of its time doing nothing.

AFAIK there are some good people working on that problem and we can expect news about it in some time, but for now, until these new builds are available, the reality is simply the 580 is the winner against the others (680/770/780) in SETI terms.

Of course there are some overclocked or newer models (like the 780Ti) that actually are faster, but that's not the point. I'm talking about similar basic models.

But you can't forget the power used to crunch: to do the same job the 580 uses a lot more power than the 770/780, and that makes a huge difference when you have a farm with several of them running.

And there is another important point. For example, 3x780 produces a lot more, at least in SETI, than 2x780Ti and costs about the same, but uses more power. Almost the same happens with 3x770 vs 2x780. So I imagine it happens with the mid-range models too.

But then you need to remember the slot limitation problem: if your MB has only 2 slots and you need more GPU power, you are actually limited to buying the most expensive models. I'm almost sure Nvidia knows that very well when they put the price on their products.

I have seen some examples of very impressive crunching times with the 750/750Ti/760 models here on SETI, so that could indicate something has changed, but I have not seen an actual test that really proves it or not. What is clear is that mid-range GPUs are normally easier to handle than their bigger cousins, mainly due to the way the work is distributed; there are a lot of examples around.

Until now there are no high-range Maxwells available to make the comparison, but it seems the latency problem is still happening with the mid-range Maxwells available.
ID: 1523564 · Report as offensive
jason_gee
Volunteer developer
Volunteer tester
Joined: 24 Nov 06
Posts: 7489
Credit: 91,093,184
RAC: 0
Australia
Message 1523570 - Posted: 2 Jun 2014, 11:51:24 UTC - in response to Message 1523564.  

...
Until now there are no high-range Maxwells available to make the comparison, but it seems the latency problem is still happening with the mid-range Maxwells available.


That's right. The trick is to get the scaling right in advance of Big Maxwells, and have automatic scaling perfected before Pascal &/or Volta. Some testing to get out of the way behind the scenes with nv, then the Latency issues are getting some hefty glove slap :)
"Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions.
ID: 1523570 · Report as offensive
juan BFP (Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor)
Volunteer tester
Joined: 16 Mar 07
Posts: 9786
Credit: 572,710,851
RAC: 3,799
Panama
Message 1523575 - Posted: 2 Jun 2014, 12:30:09 UTC - in response to Message 1523570.  
Last modified: 2 Jun 2014, 12:31:23 UTC

Some testing to get out of the way behind the scenes with nv, then the Latency issues are getting some hefty glove slap :)

That I really like, and since the information is coming from our "master guru", it seems the end of the tunnel is a lot closer now. I need to be ready with some celebration cigars. :)
ID: 1523575 · Report as offensive
Wedge009
Volunteer tester
Joined: 3 Apr 99
Posts: 451
Credit: 431,396,357
RAC: 553
Australia
Message 1523601 - Posted: 2 Jun 2014, 13:36:09 UTC - in response to Message 1523564.  

...the reality is simply the 580 is the winner against the others (680/770/780) in SETI terms.

I realise different people have different set-ups, but in my experience the GF110 GPU used in the GTX 580 - considering S@h processing performance only, of course - is comparable to the GK104 GPU used in the GTX 670/680/760/770. (The GTX 670 I used to have wasn't as good as my GTX 580.) But I find the GK110 GPU in the GTX 780 (even the initial non-Ti release) to be quite a bit better than the GTX 580 - for AP at least, even if the improvement for MB is perhaps not as significant.

Latency issues aside, this makes sense to me because I understand the Kepler architecture to be much more focused on graphical performance and power efficiency than Fermi, at the expense of less emphasis on GPGPU performance. And the GK104 seems to have been derived from the GF114 used in the GTX 560 Ti, while the GK110 is the actual successor to the GF110 (AKA 'Big' Kepler).

Nonetheless, if work is being done to mitigate the latency issues across the GPUs and OSs then this can only be a good thing, so I agree this is something to look forward to. (:
Soli Deo Gloria
ID: 1523601 · Report as offensive
juan BFP (Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor)
Volunteer tester
Joined: 16 Mar 07
Posts: 9786
Credit: 572,710,851
RAC: 3,799
Panama
Message 1523610 - Posted: 2 Jun 2014, 14:25:58 UTC
Last modified: 2 Jun 2014, 14:57:43 UTC

Please let me correct my words to avoid any misunderstanding: the 770/780 really are a little faster in SETI than the 580, but not as much as we would expect when you look at the number of cores, newer processor, higher clock speed, etc.

When I bought the first 770/780s I expected a lot more gain in daily production over the 580s they replaced. What I got was very similar numbers, maybe a 10-20% gain only, but something was very clear: they use a lot less power to do the job. When I asked why, I received the explanation of the latency problem.

We all hope that when the latency problem is out of the equation, as Jason posted, everything will change and the real power of these babies will be unleashed, especially in the case of the 780, which actually runs far from its true potential. Fingers crossed.

Actually I don't have any 580 or 770 anymore, only a few 780FTWs with ACX air cooling from EVGA. They run quiet and cool (in the range of high 60s to low 70s C), even when they crunch 2 or 3 SETI WUs at a time in my hot tropical country, and I'm very happy with them.
ID: 1523610 · Report as offensive
Wedge009
Volunteer tester
Joined: 3 Apr 99
Posts: 451
Credit: 431,396,357
RAC: 553
Australia
Message 1523778 - Posted: 2 Jun 2014, 18:56:36 UTC
Last modified: 2 Jun 2014, 18:56:58 UTC

Okay, that makes more sense. Also, when we are talking about already small MB tasks (order of a few minutes), then we are probably hitting limits like latency anyway. For AP, I find the GTX 780 getting close to 50% faster than the GTX 580.
Soli Deo Gloria
ID: 1523778 · Report as offensive
juan BFP (Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor)
Volunteer tester
Joined: 16 Mar 07
Posts: 9786
Credit: 572,710,851
RAC: 3,799
Panama
Message 1523822 - Posted: 2 Jun 2014, 20:32:26 UTC - in response to Message 1523778.  

As always, YMMV, and you are using the newer Lunatics builds, not the original stock ones we used when the GPUs were launched a year or more ago.

In my case the 780FTWs with MB were only 10-15% faster than the 770SC but cost 50% more at that time. IIRC the MB app was still the same at that time (x41zc cuda50). If my memory serves, the latency problem makes a 780 host running a single WU show only 50% GPU usage; more WUs at a time increase the GPU usage but still do not use all of the GPU's potential. Jason could explain that with more technical details.

BTW at that time I didn't crunch AP (the creditscrew mess was not so well known then), so I'm not sure about AP crunching times.
ID: 1523822 · Report as offensive
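
The "only 50% GPU usage with a single WU" point can be put into a toy model. Assume each task keeps the GPU busy for some fraction of its wall-clock time and leaves it idle for the rest while it waits on the driver or host, and assume several tasks overlap perfectly so that one task's idle time is filled by another. Both assumptions are idealized and not from the thread, and the function names are hypothetical. The result is therefore an upper bound, which is consistent with the report above that extra WUs raise GPU usage but still do not use the whole card.

```python
def ideal_utilization(single_task_util: float, concurrent: int) -> float:
    """Best-case GPU utilization when `concurrent` tasks overlap perfectly,
    given the utilization one task achieves on its own (0 < util <= 1)."""
    if not 0.0 < single_task_util <= 1.0:
        raise ValueError("single_task_util must be in (0, 1]")
    if concurrent < 1:
        raise ValueError("concurrent must be >= 1")
    return min(1.0, single_task_util * concurrent)


def ideal_throughput_gain(single_task_util: float, concurrent: int) -> float:
    """Best-case throughput relative to running one task at a time."""
    return ideal_utilization(single_task_util, concurrent) / single_task_util


if __name__ == "__main__":
    # 0.5 is the single-WU GPU usage recalled in the post above.
    for n in (1, 2, 3):
        print(f"{n} WU at a time: "
              f"utilization <= {ideal_utilization(0.5, n):.0%}, "
              f"throughput <= {ideal_throughput_gain(0.5, n):.1f}x")
```

In this model nothing is gained past the point where the GPU is saturated, so a third WU on a card that starts at 50% adds no throughput even in the best case. Real gains will be lower because tasks do not interleave perfectly.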
Mike (Special Project $75 donor)
Volunteer tester
Joined: 17 Feb 01
Posts: 34253
Credit: 79,922,639
RAC: 80
Germany
Message 1523839 - Posted: 2 Jun 2014, 20:57:45 UTC
Last modified: 2 Jun 2014, 20:58:24 UTC

The AP apps have improved a lot in the last 2 years.
When I started with APs, 2 units took over 90 minutes.
Now I'm at 20 minutes for zero blanked, of course with some tweaks.


With each crime and every kindness we birth our future.
ID: 1523839 · Report as offensive
Wedge009
Volunteer tester
Joined: 3 Apr 99
Posts: 451
Credit: 431,396,357
RAC: 553
Australia
Message 1523852 - Posted: 2 Jun 2014, 21:27:07 UTC

Just for the record, I'm comparing like with like when talking about my experiences with GTX 580 and GTX 780.

20 minutes for AP is impressive, especially on Tahiti. Better than what I'm getting on average with Hawaii.
Soli Deo Gloria
ID: 1523852 · Report as offensive


 
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.