1) Questions and Answers : GPU applications : CUDA GPU workunits -- processing errors (Message 1266897)
Posted 3 Aug 2012 by Meerkat
Thanks so much for the always great replies!

Here it goes: the two cards are admittedly different, but only in make and model number, while they are identical in specs:

The only difference is the BIOS version and, to be honest, although I have sound computer hardware knowledge and build experience, I am not prepared to flash a video card BIOS just yet.

Also, I am using MSI AfterBurner to force both cards to identical settings. Following many reports that these 560Ti cards need to be tweaked down in MHz and up in voltage, I have done just that (as you can see from the pic) and hence resolved most, if not all, of my unit "errors".

As you can see on AfterBurner, the usage is only on GPU2:

Interestingly, I have twice set the PhysX card setting in NVIDIA Control Panel to "Auto", which defaults to GPU1; but every time I reboot, the PhysX setting returns to GPU2. Not sure what that really means.

This is the BOINC log, from which it seems the choice of GPU may not necessarily be the best one... is that possible? GPU(1) is not used, but it has a little more memory available and a slightly higher GFLOPS capability...

Lastly, I have searched the entire computer for the "cc_config.xml" file with no luck. Since I am now using the Lunatics optimised apps, should I have found one already there, or do I still have to create one?
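
For reference, here is a minimal sketch of what such a file could contain. It assumes the BOINC client's `use_all_gpus` option, placed in a "cc_config.xml" in the BOINC data directory. The helper name `build_cc_config` is made up for illustration, and the sketch only builds the XML text; it does not write anything to disk.

```python
import xml.etree.ElementTree as ET


def build_cc_config(use_all_gpus: bool = True) -> str:
    """Return the text of a minimal cc_config.xml.

    Assumption: the only option set is <use_all_gpus>, which asks the
    BOINC client to use every GPU rather than only the one it rates
    as most capable.
    """
    root = ET.Element("cc_config")
    options = ET.SubElement(root, "options")
    flag = ET.SubElement(options, "use_all_gpus")
    flag.text = "1" if use_all_gpus else "0"
    # encoding="unicode" returns a str without an XML declaration
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Print the XML so it can be reviewed before saving it by hand
    print(build_cc_config())
```

The printed text would be saved as "cc_config.xml" in the BOINC data directory, and the client restarted so that it picks the file up. Whether this is enough to put both 560Ti cards to work here is untested.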

Thanks again for your help, I will wait for your replies before attempting to create a dangerous config file... ;-)
2) Questions and Answers : GPU applications : CUDA GPU workunits -- processing errors (Message 1266711)
Posted 2 Aug 2012 by Meerkat
While I am on this, I may as well ask a seemingly silly question that has been bugging me...

Why is it that SETI@Home only uses one graphics card?!?

It does seem a real waste that, although there are two identical 560Ti, linked in SLi, recognised by the PC as such and equally accessed (with great results) by all high-end video games, the SETI@Home routines always use one and the same card (#2).

Surely there must be a relatively simple way of getting the application either to share the unit load between the two GPUs or to run two units at the same time, to avoid stressing and heating up only one card...

Thanks for the explanation, anyone!
3) Questions and Answers : GPU applications : CUDA GPU workunits -- processing errors (Message 1263906)
Posted 24 Jul 2012 by Meerkat
Done all that, thanks.

Now will need to watch those results and see what happens.


4) Questions and Answers : GPU applications : CUDA GPU workunits -- processing errors (Message 1262807)
Posted 21 Jul 2012 by Meerkat
Thanks, done a little reading and a little tweaking over the past couple days.

I have separated the two cards in AfterBurner, so I can tweak the settings separately, as they are not exactly the same.

I raised the voltage ever so slightly, considerably lowered all three clocks (Core, Shader and Memory) and raised the fan speed slightly, thus reducing the top temperature at 97% GPU usage.

It seems that most of the errors have disappeared since the first tweak was applied a couple of days ago; now I can only wait to see what results the latest 1100-odd pending WUs will return.

One thing I must disagree with is the assumption that the cards do not run at full steam when playing games. Trust me, when you are in Battlefield 3 with the highest detail settings on, the fans go full speed, both cards are running flat tack, and the heat from the 200mm exhaust fans can be felt. So, if there are no errors then, I do not believe the cards are responsible.

As a further comparison, I looked at Einstein@Home, which also uses CUDA: never any errors at all... so I am afraid the common denominator here is either the SETI app or the CUDA implementation with these particular cards.

Happy to be proven wrong.
5) Questions and Answers : GPU applications : CUDA GPU workunits -- processing errors (Message 1262542)
Posted 20 Jul 2012 by Meerkat
G'Day all from DownUnder!

Unbeknownst to me, the GPU has apparently been kicking out lots of errors...

Rig is a Win 7 x64 with two nVidia 560Ti in SLi, 16GB DDR3 and all top notch components.

I have been a SETI participant since 1999 and have never had any dramas, till some guy from the US messaged me the other day saying my cards send lots of errors and have a serious problem!

How he knows... is beyond me; why he would care... same!

Sure, none of us wants to be a nuisance for the project, but I would have thought the system would have alerted me directly if there was anything untoward.

Now, my machine is in tip-top shape and is also used to play some seriously graphics-intensive, latest-gen video games (not at the same time as crunching). The nVidia cards always run the latest stable driver available and have never caused a BSOD, a freeze or dropped frames anywhere.

Hibernation is physically turned off (elevated DOS prompt > powercfg -h off) and the power scheme is "High Performance" with sleep disabled.

So now this guy got my attention... I had a look at my results and found that most of my CUDA Fermi WUs are marked inconclusive or computation error. Most errors appear to be "-9" (result overflow?) or some other result where triplets exceed the max allowed by the CUDA application; so I am not sure it actually is a graphics card problem.

Nevertheless, I decided to come out here and ask the question, to see whether I could really be harming the project and whether there is something I can do about it.

Any help clarifying this issue would be highly appreciated.


6) Questions and Answers : Windows : Not uploading? (Message 918682)
Posted 17 Jul 2009 by Meerkat
Yes. "Deadline" does not mean that your result is automatically ignored. It is just the trigger for another copy to be sent out in case yours does not come back at all. If yours gets back before the replacement, then you will get the credits (providing it validates OK) and the system will also wait until the replacement is returned or times out before deleting the WU to ensure that that cruncher is not penalised either.

Thanks for the clarification. To be honest, I have been crunching for so long I am way over the "credit" thingy... I do it for the science and for what the project means to me, however small my contribution is, whilst knowing the PCs are doing something useful when idle.

The disappointment over potentially "lost" or ignored units has more to do with all the above than with any credit given.
7) Questions and Answers : Windows : Not uploading? (Message 918620)
Posted 17 Jul 2009 by Meerkat
Thanks for the quick reply, fairly interesting.

My only query is what happens if there were no major outages, and it was simply a case of severely backed-up servers, as indicated by one of the posters above.

Or from another perspective, is my overdue unit (when finally uploaded) going to be used by the project, provided it was received before the replacement?
8) Questions and Answers : Windows : Not uploading? (Message 918582)
Posted 17 Jul 2009 by Meerkat
Same problem here.

Now, I have been crunching for SETI since 1999; it's the reason I started and the reason why I have been leaving the PCs on most of the time (with some added cost to the family electricity bill).

The increased time taken to upload results (at times considerable) is becoming somewhat frustrating... I am not prone to criticism unless truly warranted, and I am afraid this is one of those instances.

I have had to turn my machines off at night and when they are on during the day, they often fail to upload a unit after infinite retries; then when I turn them on again the following morning... you guessed it, it's too late and one of the units is now overdue.

Not a happy camper.

As a side note, I have pretty good crunching and unit turnaround times.

As this appears to be a known problem, may I respectfully suggest that SETI increase the deadline by a few days, a week maybe? Surely the units cannot be needed that urgently... A minor tweak to the deadline would allow the upload servers to get around to receiving the units in time; in turn, this would mean crunchers are not wasting valuable CPU time and electricity (environment, anyone?).

I apologise if this solution was already suggested elsewhere.

