New card having a few errors...

Message boards : Number crunching : New card having a few errors...
Profile jason_gee
Volunteer developer
Volunteer tester
Joined: 24 Nov 06
Posts: 7489
Credit: 91,093,184
RAC: 0
Australia
Message 1524938 - Posted: 6 Jun 2014, 1:38:44 UTC
Last modified: 6 Jun 2014, 1:40:56 UTC

One additional thing to be aware of when dealing with factory overclocked/superclocked models from any brand (may or may not have been a factor here). Those factory overclocks are typically set to a level of stability determined by some maximum number of acceptable graphical 'artefacts' over a time period. In games, minute or infrequent graphical glitches are often considered acceptable.

For number crunching that acceptable number of glitches is zero, and the work can be using different parts of the chip than the dedicated graphics parts. So sometimes a small core voltage and fan curve boost can be necessary (e.g. using EVGA Precision X or similar). Both my 560ti and 780sc require small voltage bumps for rock solid CUDA operation at the factory (superclocked) frequencies.
"Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions.
ID: 1524938
spitfire_mk_2
Joined: 14 Apr 00
Posts: 563
Credit: 27,306,885
RAC: 0
United States
Message 1524948 - Posted: 6 Jun 2014, 1:54:29 UTC - in response to Message 1524938.  

One additional thing to be aware of when dealing with factory overclocked/superclocked models from any brand (may or may not have been a factor here). Those factory overclocks are typically set to a level of stability determined by some maximum number of acceptable graphical 'artefacts' over a time period. In games, minute or infrequent graphical glitches are often considered acceptable.

For number crunching that acceptable number of glitches is zero, and the work can be using different parts of the chip than the dedicated graphics parts. So sometimes a small core voltage and fan curve boost can be necessary (e.g. using EVGA Precision X or similar). Both my 560ti and 780sc require small voltage bumps for rock solid CUDA operation at the factory (superclocked) frequencies.

Interesting point.
TL could have simply downclocked his card using the free nVidiaInspector utility.
ID: 1524948
Profile Zalster Special Project $250 donor
Volunteer tester
Joined: 27 May 99
Posts: 5517
Credit: 528,817,460
RAC: 242
United States
Message 1524974 - Posted: 6 Jun 2014, 2:44:54 UTC - in response to Message 1524938.  

One additional thing to be aware of when dealing with factory overclocked/superclocked models from any brand (may or may not have been a factor here). Those factory overclocks are typically set to a level of stability determined by some maximum number of acceptable graphical 'artefacts' over a time period. In games, minute or infrequent graphical glitches are often considered acceptable.

For number crunching that acceptable number of glitches is zero, and the work can be using different parts of the chip than the dedicated graphics parts. So sometimes a small core voltage and fan curve boost can be necessary (e.g. using EVGA Precision X or similar). Both my 560ti and 780sc require small voltage bumps for rock solid CUDA operation at the factory (superclocked) frequencies.



Interesting point. It never occurred to me that he might not be using the Precision X that came with the card. I always use it with mine to control the fan speeds and temps. Didn't realize that it also corrected the core voltage.


Zalster
ID: 1524974
tbret
Volunteer tester
Joined: 28 May 99
Posts: 3380
Credit: 296,162,071
RAC: 40
United States
Message 1524976 - Posted: 6 Jun 2014, 2:46:35 UTC - in response to Message 1524938.  

Both my 560ti and 780sc require small voltage bumps for rock solid CUDA operation at the factory (superclocked) frequencies.


I'm really surprised to hear about this.

You helped me with my original Gigabyte SOC 560Ti (clock at 950MHz) because it was throwing errors. You told me to increase the voltage a tad, and I did. After a few weeks the card got worse. You told me to bump the voltage, and I did.

Eventually I had to send the card back to Gigabyte and they updated the firmware (which I could have done) and sent it back. The card got cooler and quit making all of those errors, too.

Your having to do it with the 560Ti isn't shocking, but your having to do that with your "new" 780 is discouraging.
ID: 1524976
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 13736
Credit: 208,696,464
RAC: 304
Australia
Message 1525085 - Posted: 6 Jun 2014, 7:38:11 UTC - in response to Message 1525027.  

I've noticed over the last week or so a big jump in the number of inconclusives on my systems. Usually it's mid to high 30s, lately it's been low 50s and dropping down to mid 40s before climbing back up again.
However, so far none of them have been declared invalid.
Grant
Darwin NT
ID: 1525085
Profile Wiggo
Joined: 24 Jan 00
Posts: 34766
Credit: 261,360,520
RAC: 489
Australia
Message 1525091 - Posted: 6 Jun 2014, 7:48:40 UTC - in response to Message 1525085.  

I've noticed over the last week or so a big jump in the number of inconclusives on my systems. Usually it's mid to high 30s, lately it's been low 50s and dropping down to mid 40s before climbing back up again.
However, so far none of them have been declared invalid.

From what I've just looked at Grant, you've just been paired up with some bad wingmates. ;-)

Cheers.
ID: 1525091
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 13736
Credit: 208,696,464
RAC: 304
Australia
Message 1525108 - Posted: 6 Jun 2014, 8:59:10 UTC - in response to Message 1525091.  

From what I've just looked at Grant, you've just been paired up with some bad wingmates. ;-)

Cheers.

At least they keep validating eventually.
Luckily I haven't had any repeats of 2 dud ones validating each other & mine missing out.
Grant
Darwin NT
ID: 1525108
Profile TimeLord04
Volunteer tester
Joined: 9 Mar 06
Posts: 21140
Credit: 33,933,039
RAC: 23
United States
Message 1525294 - Posted: 6 Jun 2014, 20:12:49 UTC

Well, I've now registered my new GTX760, purchased the 10-year extended warranty ($30) with next-day RMA support ($44.99) from NVIDIA, and downloaded and installed PrecisionX.

My PSU has a standard 5-year warranty from Antec. I'm unsure what my warranties are on the ASUS MB, AMD CPU, RAM, and DVD drive. (I'm assuming that these items have at least a one-year warranty.) The hard drive was mine from stock that I had on hand; no warranty...

PrecisionX is showing that my card is running between 65 and 66 C while crunching Einstein@Home right now (1:11 PM PDT). GPU clock is at 1123 MHz, voltage at 1200 mV. I'm assuming that all of this is normal.
TimeLord04
Have TARDIS, will travel...
Come along K-9!
Join Calm Chaos
ID: 1525294
Profile Zalster Special Project $250 donor
Volunteer tester
Joined: 27 May 99
Posts: 5517
Credit: 528,817,460
RAC: 242
United States
Message 1525300 - Posted: 6 Jun 2014, 20:25:30 UTC - in response to Message 1525294.  

TL,

What does your fan curve look like? Everyone sets it differently to their own preference. Mine is a flat line at 50% fan speed up to 50C; from there it rises to a second point at 70C, where the fan reaches its 100% maximum. So mine looks almost like a straight line up. Make sure the little box at the top, labelled Enable Software automatic fan control, is ticked. Also, in Precision X, make sure Auto is selected just above the fan curve button. Together these make sure that Precision X is controlling fan speed according to the temp of the card.
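
For anyone who wants to see what a curve like that works out to at a given temperature, here is a rough sketch. It is not anything Precision X exposes; the function and parameter names are made up for illustration, and it assumes the fan rises in a straight line between the two points (50% at 50C, 100% at 70C).

```python
def fan_speed_percent(temp_c, flat_until_c=50.0, flat_percent=50.0,
                      max_temp_c=70.0, max_percent=100.0):
    """Fan speed (%) for a GPU temperature (deg C) on a two-point curve.

    Hypothetical helper: flat below the first point, clamped above the
    second point, linear interpolation in between.
    """
    if temp_c <= flat_until_c:
        return flat_percent
    if temp_c >= max_temp_c:
        return max_percent
    # Fraction of the way from the first point to the second point.
    fraction = (temp_c - flat_until_c) / (max_temp_c - flat_until_c)
    return flat_percent + fraction * (max_percent - flat_percent)


if __name__ == "__main__":
    for temp in (40, 50, 60, 70, 80):
        print(f"{temp} C -> {fan_speed_percent(temp):.0f}% fan")
```

Moving the second point lower (say 68C instead of 70C) just makes the ramp steeper, so the fan hits full speed sooner.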


Zalster
ID: 1525300
Profile TimeLord04
Volunteer tester
Joined: 9 Mar 06
Posts: 21140
Credit: 33,933,039
RAC: 23
United States
Message 1525311 - Posted: 6 Jun 2014, 21:15:22 UTC - in response to Message 1525300.  

TL,

What does your fan curve look like? Everyone sets it differently to their own preference. Mine is a flat line at 50% fan speed up to 50C; from there it rises to a second point at 70C, where the fan reaches its 100% maximum. So mine looks almost like a straight line up. Make sure the little box at the top, labelled Enable Software automatic fan control, is ticked. Also, in Precision X, make sure Auto is selected just above the fan curve button. Together these make sure that Precision X is controlling fan speed according to the temp of the card.


Zalster


I have the fan selections set as you have specified (Auto in both areas). I set my target temp a little lower than 70 C; mine is set to achieve the goal of 68 C. Fans are currently running between 84% and 86% to achieve this. Once I set these limits, I heard the fans pick up speed to lower the temp... The temp (while now crunching SETI Beta) had climbed up to 71 C. With the new fan settings in place, that quickly dropped back to 68 C.
TimeLord04
Have TARDIS, will travel...
Come along K-9!
Join Calm Chaos
ID: 1525311
Profile Zalster Special Project $250 donor
Volunteer tester
Joined: 27 May 99
Posts: 5517
Credit: 528,817,460
RAC: 242
United States
Message 1525318 - Posted: 6 Jun 2014, 21:38:12 UTC - in response to Message 1525311.  
Last modified: 6 Jun 2014, 22:06:37 UTC

Sounds good. A little thing I found, if you want to look at different things about your GPUs: at the bottom of Precision X there are GPU power, GPU clock and GPU memory displays. (If you don't see them, look for a little downward arrow in the middle of the bottom part of the graphic interface. It should say something like Show additional setting panel when you hover over it. Just click there and it should extend the bottom of the graphic interface.) If you double click in that area, it will bring up another box that shows GPU power, GPU clock, GPU memory, GPU temp, GPU usage, GPU frontbus FB usage %, GPU VID usage %, GPU bus usage %, GPU memory usage MB, GPU voltage, GPU fan speed %, temp limit, power limit, voltage limit, OV max limit, and utilization limit.

It's helpful when you have more than 1 GPU (as it will show all of them) and when crunching more than 1 work unit on the GPU, so you can tailor your optimization program to get the best results out of the GPU. After you get crunching and are comfortable with your results, at some point you will probably want to try the optimization programs from Lunatics to squeeze even more out of that GPU. Being able to look at all of the above will help you decide how to set your parameters at that time.


Zalster
ID: 1525318
Profile TimeLord04
Volunteer tester
Joined: 9 Mar 06
Posts: 21140
Credit: 33,933,039
RAC: 23
United States
Message 1526155 - Posted: 9 Jun 2014, 21:33:41 UTC

Well, it's been more than a couple of days of running the new NVIDIA GTX760 card. I see no "Inconclusives" or "Invalids" being reported on CUDA42. ...and, the card seems to be getting more CUDA50 work than CUDA42, now. Maybe eight, or so, CUDA42's in my queue.

The new card does seem to pull some power from the PSU that the GTX750 TI Superclocked didn't seem to pull; that's probably normal, however, as the 760 has the 8-pin adapter plugged directly into the PSU. I've noticed that with this configuration the top of the computer (around the PSU) gets warmer than with the old configuration. Again, probably normal, because of the configuration of GPU and new Truepower PSU.

I will continue to keep my eye on things, and report. All seems to be crunching just fine. :-)
TimeLord04
Have TARDIS, will travel...
Come along K-9!
Join Calm Chaos
ID: 1526155
Jim1348

Joined: 13 Dec 01
Posts: 212
Credit: 520,150
RAC: 0
United States
Message 1526571 - Posted: 10 Jun 2014, 21:54:39 UTC - in response to Message 1524856.  

For those that don't know, I'm on an NVIDIA GTX750 TI Superclocked. I have three "Inconclusives" pending and one "Invalid" that was an "Inconclusive"...

Your problem had nothing to do with the power supply. It is because the card is overclocked. Just because it is factory overclocked doesn't make it any more stable, it is still overclocked. If you want to use the 750 Ti's again, reduce the clock to something like the chip default value; my Asus GTX 750 Ti's run at 1072 MHz (1150 MHz boost) and have been totally stable on GPUGrid for over a month. That is the most demanding project I know of.
ID: 1526571
Profile TimeLord04
Volunteer tester
Joined: 9 Mar 06
Posts: 21140
Credit: 33,933,039
RAC: 23
United States
Message 1526716 - Posted: 11 Jun 2014, 5:22:34 UTC - in response to Message 1526571.  

For those that don't know, I'm on an NVIDIA GTX750 TI Superclocked. I have three "Inconclusives" pending and one "Invalid" that was an "Inconclusive"...

Your problem had nothing to do with the power supply. It is because the card is overclocked. Just because it is factory overclocked doesn't make it any more stable, it is still overclocked. If you want to use the 750 Ti's again, reduce the clock to something like the chip default value; my Asus GTX 750 Ti's run at 1072 MHz (1150 MHz boost) and have been totally stable on GPUGrid for over a month. That is the most demanding project I know of.


That may well be; however, now that I'm on the GTX760 and the Antec Truepower 550 Watt PSU, there have been no more "Inconclusives" on CUDA42... This may be, (as you state), because the GTX750 TI Superclocked is overclocked; however, now that I'm on the GTX760 it has become moot... (New card, new parameters.)

I will crunch with the GTX760 for a good while. If I do anything, I'd consider jumping up to the GTX760 TI Superclocked. However, that won't happen for quite a while.

As I'm a paranoid bastard, I took out the Extended Warranty through NVIDIA. $30 for a total of 10 Years Warranty. Since this will be my primary cruncher, and I don't have a steady income to replace parts that wear out, I figured that $30 is a good gamble, and when this card does burn out, I will then step up to the GTX760 TI Superclocked. (NVIDIA's "Step Up" program seems to allow this upgrading for the difference in price of the card.)

If/When this happens, will I then need a stronger PSU; or, will the Antec Truepower 550 Watt supply be enough for that card?

At that time, if the GTX760 TI Superclocked exhibits the same problems with CUDA42 that the GTX750 TI Superclocked showed, I will take your advice and reduce the clock speed of that card. However, again, it will be some time before that change in card happens. This one has to burn out first, and I hope that doesn't happen anytime soon.
TimeLord04
Have TARDIS, will travel...
Come along K-9!
Join Calm Chaos
ID: 1526716
Profile Zalster Special Project $250 donor
Volunteer tester
Joined: 27 May 99
Posts: 5517
Credit: 528,817,460
RAC: 242
United States
Message 1526731 - Posted: 11 Jun 2014, 6:04:24 UTC - in response to Message 1526716.  

TL,

If you do get any more inconclusives, don't do anything unless they eventually become a series of invalids (i.e. 25-30). Most of those inconclusives just didn't match your wingman's result closely enough. After the resend to a 3rd person, most will usually match, but it may take some time (up to weeks), so it's a sit and wait situation with those.

As far as power consumption goes: you haven't specified what company manufactured your cards, but going off Nvidia and EVGA, the normal GTX 760, the GTX 760 Ti and the GTX 760 Ti SC all have a max power consumption of 170 Watts. So your current PSU should be fine.
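
To put rough numbers on that, here is a back-of-the-envelope headroom check. Only the 170 W card figure and the 550 W PSU rating come from this thread; the CPU and rest-of-system draws and the 20% safety margin are made-up placeholders, so plug in the figures for your own parts.

```python
def psu_headroom_watts(psu_rating_w, component_draw_w, safety_margin=0.2):
    """Watts left over after summing peak draws and reserving a margin.

    safety_margin is an assumed fraction of the PSU rating kept in
    reserve; 0.2 is an arbitrary illustrative choice, not a standard.
    """
    usable_w = psu_rating_w * (1.0 - safety_margin)
    return usable_w - sum(component_draw_w.values())


hypothetical_build = {
    "gpu": 170,            # max power figure quoted above for the GTX 760 family
    "cpu": 125,            # hypothetical placeholder, check your own CPU
    "rest_of_system": 75,  # hypothetical placeholder (board, RAM, drives, fans)
}

if __name__ == "__main__":
    headroom = psu_headroom_watts(550, hypothetical_build)
    print(f"Headroom on a 550 W PSU: {headroom:.0f} W")
```

A positive result means there is still room under the reserved margin; a negative one means the build is eating into it.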

Lastly, did those inconclusives from the GTX 750 Ti become invalid, or did they validate later? If they did validate then there wasn't anything wrong with the 750 Ti. On the other hand, you got a much more powerful GPU to crunch with, so enjoy that unit.

Thanks for the update.

Zalster
ID: 1526731
Profile TimeLord04
Volunteer tester
Joined: 9 Mar 06
Posts: 21140
Credit: 33,933,039
RAC: 23
United States
Message 1526938 - Posted: 11 Jun 2014, 17:29:23 UTC - in response to Message 1526731.  

TL,

If you do get any more inconclusives, don't do anything unless they eventually become a series of invalids (i.e. 25-30). Most of those inconclusives just didn't match your wingman's result closely enough. After the resend to a 3rd person, most will usually match, but it may take some time (up to weeks), so it's a sit and wait situation with those.

As far as power consumption goes: you haven't specified what company manufactured your cards, but going off Nvidia and EVGA, the normal GTX 760, the GTX 760 Ti and the GTX 760 Ti SC all have a max power consumption of 170 Watts. So your current PSU should be fine.

Lastly, did those inconclusives from the GTX 750 Ti become invalid, or did they validate later? If they did validate then there wasn't anything wrong with the 750 Ti. On the other hand, you got a much more powerful GPU to crunch with, so enjoy that unit.

Thanks for the update.

Zalster


Both cards are EVGA cards. I was unaware that there were other brands... Central Computers only carries EVGA NVIDIA cards.

All four of the "Inconclusives" on the GTX750 TI Superclocked did become "Invalids" on CUDA42.

Thanks Zalster, Jim, and everyone. :-)
TimeLord04
Have TARDIS, will travel...
Come along K-9!
Join Calm Chaos
ID: 1526938
Profile TimeLord04
Volunteer tester
Joined: 9 Mar 06
Posts: 21140
Credit: 33,933,039
RAC: 23
United States
Message 1527396 - Posted: 12 Jun 2014, 18:15:18 UTC

Well, the GTX760 now has three "Inconclusives"; ALL CUDA42. They most likely will turn into "Invalids". So, it is a CUDA42 issue... (Right???) :-/
TimeLord04
Have TARDIS, will travel...
Come along K-9!
Join Calm Chaos
ID: 1527396
Claggy
Volunteer tester

Joined: 5 Jul 99
Posts: 4654
Credit: 47,537,079
RAC: 4
United Kingdom
Message 1527420 - Posted: 12 Jun 2014, 18:57:58 UTC - in response to Message 1527396.  
Last modified: 12 Jun 2014, 19:14:40 UTC

Well, the GTX760 now has three "Inconclusives"; ALL CUDA42. They most likely will turn into "Invalids". So, it is a CUDA42 issue... (Right???) :-/

Not necessarily, you have wingmen, they have to find the right signals too.

http://setiathome.berkeley.edu/workunit.php?wuid=1519749983
Your GTX760 (running the Cuda50 app) found the same number and type of signals as your wingman running a GTX660 on the Cuda42 app. It could be a precision difference between the apps, but more likely it's a problem with your wingman's GPU: he has ten inconclusives against your three (and even that's not a very high number).

http://setiathome.berkeley.edu/workunit.php?wuid=1519749985
Your wingman's task, run on a GTX660Ti (on the Cuda50 app), overflowed after 2 seconds, while your task took 253 seconds; it's almost 100% certain that it's your wingman's host that's causing the inconclusive.

http://setiathome.berkeley.edu/workunit.php?wuid=1514793538
An old result of yours: your GTX750Ti found 29 Spikes and one triplet, while your wingman running stock 7.00 found one triplet. It's almost 100% certain it was your GTX750Ti that caused the inconclusive (your wingman has a good record).
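
If you want to do this sort of eyeball comparison yourself, the idea can be sketched roughly as below. This is not the project's validator (which does more than count signals); it only lines up the per-type signal counts from two results. The function names are made up, and the 30-signal overflow threshold is an assumption on my part, so check it against what your own task output reports.

```python
from collections import Counter

# Assumption: a task is treated as overflowed once it reports this many signals.
OVERFLOW_SIGNAL_COUNT = 30


def compare_signal_counts(mine, wingman):
    """Return {signal_type: (my_count, wingman_count)} for types that differ."""
    mine, wingman = Counter(mine), Counter(wingman)
    kinds = sorted(set(mine) | set(wingman))
    return {k: (mine[k], wingman[k]) for k in kinds if mine[k] != wingman[k]}


def looks_like_overflow(counts):
    """True if the total signal count reaches the assumed overflow threshold."""
    return sum(counts.values()) >= OVERFLOW_SIGNAL_COUNT


if __name__ == "__main__":
    # Counts from the old GTX750Ti result discussed above.
    old_750ti = {"spike": 29, "triplet": 1}
    stock_wingman = {"triplet": 1}
    print(compare_signal_counts(old_750ti, stock_wingman))
    print(looks_like_overflow(old_750ti), looks_like_overflow(stock_wingman))
```

A result that overflows while the wingman reports only a signal or two is the pattern to look for; which host is at fault still depends on each host's track record.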

Claggy
ID: 1527420
Profile TimeLord04
Volunteer tester
Joined: 9 Mar 06
Posts: 21140
Credit: 33,933,039
RAC: 23
United States
Message 1527633 - Posted: 13 Jun 2014, 6:17:45 UTC

Thanks Claggy. :-)
TimeLord04
Have TARDIS, will travel...
Come along K-9!
Join Calm Chaos
ID: 1527633
Profile James Sotherden
Joined: 16 May 99
Posts: 10436
Credit: 110,373,059
RAC: 54
United States
Message 1527648 - Posted: 13 Jun 2014, 7:18:49 UTC

I think that they should hide the inconclusives. Yes, I still check mine daily, but I don't lose any sleep over them. I find I get them when I'm teamed up with someone running the OpenCL ATI apps or the CAL ATI apps.

It's the errors and invalids I worry about.

Old James
ID: 1527648