use all gpu setting in cc_config not needed anymore?

Tron
Joined: 16 Aug 09
Posts: 180
Credit: 2,250,468
RAC: 0
United States
Message 1305063 - Posted: 11 Nov 2012, 20:52:59 UTC
Last modified: 11 Nov 2012, 20:57:49 UTC

Ubuntu 12.04 :
This is a new oddity that appeared with a kernel update, I just noticed.

opening log lines wrote:
Sun 11 Nov 2012 03:37:02 PM EST | | Starting BOINC client version 7.0.33 for x86_64-pc-linux-gnu
Sun 11 Nov 2012 03:37:02 PM EST | | log flags: file_xfer, sched_ops, task, coproc_debug, cpu_sched
Sun 11 Nov 2012 03:37:02 PM EST | | Libraries: libcurl/7.22.0 OpenSSL/1.0.1 zlib/1.2.3.4 libidn/1.23 librtmp/2.3
Sun 11 Nov 2012 03:37:02 PM EST | | Data directory: /home/BOINC/BOINC
Sun 11 Nov 2012 03:37:02 PM EST | | Processor: 8 GenuineIntel Intel(R) Core(TM) i7 CPU 860 @ 2.80GHz [Family 6 Model 30 Stepping 5]
Sun 11 Nov 2012 03:37:02 PM EST | | Processor: 8.00 MB cache
Sun 11 Nov 2012 03:37:02 PM EST | | Processor features: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf pni dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm sse4_1 sse4_2 popcnt lahf_lm ida dtherm tpr_shadow vnmi flexpriority ept vpid
Sun 11 Nov 2012 03:37:02 PM EST | | OS: Linux: 3.2.0-32-generic
Sun 11 Nov 2012 03:37:02 PM EST | | Memory: 7.80 GB physical, 8.00 GB virtual
Sun 11 Nov 2012 03:37:02 PM EST | | Disk: 909.02 GB total, 855.93 GB free
Sun 11 Nov 2012 03:37:02 PM EST | | Local time is UTC -5 hours
Sun 11 Nov 2012 03:37:02 PM EST | | NVIDIA GPU 0: GeForce GTX 460 (driver version unknown, CUDA version 5.0, compute capability 2.1, 1024MB, 134214564MB available, 941 GFLOPS peak)
Sun 11 Nov 2012 03:37:02 PM EST | | NVIDIA GPU 1: GeForce GTX 460 (driver version unknown, CUDA version 5.0, compute capability 2.1, 1024MB, 134214623MB available, 941 GFLOPS peak)
Sun 11 Nov 2012 03:37:02 PM EST | | OpenCL: NVIDIA GPU 0: GeForce GTX 460 (driver version 304.43, device version OpenCL 1.1 CUDA, 1024MB, 134214564MB available)
Sun 11 Nov 2012 03:37:02 PM EST | | OpenCL: NVIDIA GPU 1: GeForce GTX 460 (driver version 304.43, device version OpenCL 1.1 CUDA, 1024MB, 134214623MB available)
Sun 11 Nov 2012 03:37:02 PM EST | | NVIDIA library reports 2 GPUs
Sun 11 Nov 2012 03:37:02 PM EST | | No ATI library found
Sun 11 Nov 2012 03:37:02 PM EST | SETI@home | Found app_info.xml; using anonymous platform
Sun 11 Nov 2012 03:37:02 PM EST | SETI@home | URL http://setiathome.berkeley.edu/; Computer ID 6806055; resource share 100
Sun 11 Nov 2012 03:37:02 PM EST | Einstein@Home | URL http://einstein.phys.uwm.edu/; Computer ID 6048447; resource share 100
Sun 11 Nov 2012 03:37:02 PM EST | SETI@home | General prefs: from SETI@home (last modified 21-Oct-2012 23:35:15)
Sun 11 Nov 2012 03:37:02 PM EST | SETI@home | Computer location: home
Sun 11 Nov 2012 03:37:02 PM EST | SETI@home | General prefs: no separate prefs for home; using your defaults
Sun 11 Nov 2012 03:37:02 PM EST | | Reading preferences override file
Sun 11 Nov 2012 03:37:02 PM EST | | Preferences:
Sun 11 Nov 2012 03:37:02 PM EST | | max memory usage when active: 7185.51MB
Sun 11 Nov 2012 03:37:02 PM EST | | max memory usage when idle: 7824.22MB
Sun 11 Nov 2012 03:37:02 PM EST | | max disk usage: 100.00GB
Sun 11 Nov 2012 03:37:02 PM EST | | max CPUs used: 6
Sun 11 Nov 2012 03:37:02 PM EST | | (to change preferences, visit the web site of an attached project, or select Preferences in the Manager)
Sun 11 Nov 2012 03:37:02 PM EST | | Not using a proxy
Sun 11 Nov 2012 03:37:03 PM EST | | Suspending computation - user request

I deleted my cc_config yesterday, since I had no need for the log flags or options at this time.
I forgot about the <use_all_gpus> ("use all coprocessors") line. Well, when I restarted BOINC today, to my surprise both GPUs were detected and in use :-) without a cc_config telling it to do so. (There is a config file present, but it has no options set, only log flags.)
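For reference, the sort of cc_config.xml I'm talking about looks something like this (an illustrative sketch rather than my exact file; <use_all_gpus> is the actual option name, and the log flags match the log above):

<cc_config>
    <log_flags>
        <coproc_debug>1</coproc_debug>
        <cpu_sched>1</cpu_sched>
    </log_flags>
    <options>
        <!-- the line previously thought necessary to use both GPUs -->
        <use_all_gpus>1</use_all_gpus>
    </options>
</cc_config>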
ID: 1305063
Claggy
Volunteer tester
Joined: 5 Jul 99
Posts: 4654
Credit: 47,537,079
RAC: 4
United Kingdom
Message 1305068 - Posted: 11 Nov 2012, 21:04:12 UTC - in response to Message 1305063.  

Ubuntu 12.04 :
This is a new oddity that appeared with a kernel update, I just noticed.

opening log lines wrote:
Sun 11 Nov 2012 03:37:02 PM EST | | Starting BOINC client version 7.0.33 for x86_64-pc-linux-gnu
Sun 11 Nov 2012 03:37:02 PM EST | | log flags: file_xfer, sched_ops, task, coproc_debug, cpu_sched

You had coproc_debug enabled.

Claggy
ID: 1305068
Tron
Joined: 16 Aug 09
Posts: 180
Credit: 2,250,468
RAC: 0
United States
Message 1305070 - Posted: 11 Nov 2012, 21:07:29 UTC
Last modified: 11 Nov 2012, 21:21:35 UTC

So is that why it was able to detect and use both GPUs without the <use_all_gpus> option set?


Edit: coproc_debug is set to try to figure out why about 5% of my GPU tasks end in error.
The information it provides has not helped.
Host with the issue: see the error page of its work database.

Edit 2:
I tested system-wide memory for 48 hours: no errors.
I've tried 5 different versions of the nvidia drivers.
Cleaned and reseated the GPU, even cleaned the contacts.
The power supply tests A-OK, with 40 amps to spare at full load.

It really bugs the heck out of me that it produces normal results 95% of the time, then seemingly randomly poops out an error or two, and lately a couple of invalids.
ID: 1305070
Claggy
Volunteer tester
Joined: 5 Jul 99
Posts: 4654
Credit: 47,537,079
RAC: 4
United Kingdom
Message 1305080 - Posted: 11 Nov 2012, 21:19:04 UTC - in response to Message 1305070.  
Last modified: 11 Nov 2012, 21:19:39 UTC

So is that why it was able to detect and use both GPUs without the <use_all_gpus> option set?

No, you're suffering from the Wacky Nvidia GPU Memory Reporting Bug:

Sun 11 Nov 2012 03:37:02 PM EST | | NVIDIA GPU 0: GeForce GTX 460 (driver version unknown, CUDA version 5.0, compute capability 2.1, 1024MB, 134214564MB available, 941 GFLOPS peak)
Sun 11 Nov 2012 03:37:02 PM EST | | NVIDIA GPU 1: GeForce GTX 460 (driver version unknown, CUDA version 5.0, compute capability 2.1, 1024MB, 134214623MB available, 941 GFLOPS peak)
Sun 11 Nov 2012 03:37:02 PM EST | | OpenCL: NVIDIA GPU 0: GeForce GTX 460 (driver version 304.43, device version OpenCL 1.1 CUDA, 1024MB, 134214564MB available)
Sun 11 Nov 2012 03:37:02 PM EST | | OpenCL: NVIDIA GPU 1: GeForce GTX 460 (driver version 304.43, device version OpenCL 1.1 CUDA, 1024MB, 134214623MB available)


At the moment both GPUs are reporting a similar amount of available GPU memory, so to BOINC they are similar enough to be used together. That half of the Wacky Nvidia GPU Memory Reporting Bug is fixed in BOINC 7.0.36 (the other half is fixed in BOINC 7.0.38/39).

Claggy
ID: 1305080
Tron
Joined: 16 Aug 09
Posts: 180
Credit: 2,250,468
RAC: 0
United States
Message 1305088 - Posted: 11 Nov 2012, 21:28:14 UTC
Last modified: 11 Nov 2012, 21:43:00 UTC

So is there a Linux port of 7.0.36 yet?

Edit: found it; I just changed the 3 to a 6 in the link Ageless sent me for 7.0.33.
ID: 1305088
Claggy
Volunteer tester
Joined: 5 Jul 99
Posts: 4654
Credit: 47,537,079
RAC: 4
United Kingdom
Message 1305108 - Posted: 11 Nov 2012, 21:48:12 UTC - in response to Message 1305088.  

So is there a Linux port of 7.0.36 yet?

There has been for almost two months now:

BOINC 7 Change Log and news

One thing to note about BOINC versions 7.0.32 and later: they supply a higher internal flops value for the GPU, and this higher flops value puts existing GPU tasks on the verge of going Maximum Time Exceeded. Any existing GPU tasks should therefore be completed prior to upgrading; new GPU tasks will get new, higher <rsc_fpops_est> and <rsc_fpops_bound> figures and won't be affected.
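Roughly speaking (a simplified sketch, not the exact client logic), a task's time limit works out to rsc_fpops_bound divided by the flops figure assumed for the app version, so a higher flops figure shrinks the allowed elapsed time for tasks whose bounds were issued under the old estimate. The figures sit in each <workunit> entry of client_state.xml, e.g. (placeholder numbers, not real SETI@home values):

<workunit>
    <name>example_workunit</name>
    <rsc_fpops_est>25000000000000</rsc_fpops_est>
    <rsc_fpops_bound>250000000000000</rsc_fpops_bound>
</workunit>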

Claggy
ID: 1305108
Tron
Joined: 16 Aug 09
Posts: 180
Credit: 2,250,468
RAC: 0
United States
Message 1305123 - Posted: 11 Nov 2012, 22:41:32 UTC

I don't visit the BOINC development forums, so I would not have known.

This memory reporting bug: do you think it is what's causing this CUDA error?
'(cudaMemcpy(PowerSpectrumSumMax, dev_PowerSpectrumSumMax, (cudaAcc_NumDataPoints / fftlen) * sizeof(*dev_PowerSpectrumSumMax), cudaMemcpyDeviceToHost))' in file 'cuda/cudaAcc_summax.cu' in line 239 : unspecified launch failure.
ID: 1305123
Claggy
Volunteer tester
Joined: 5 Jul 99
Posts: 4654
Credit: 47,537,079
RAC: 4
United Kingdom
Message 1305132 - Posted: 11 Nov 2012, 22:58:56 UTC - in response to Message 1305123.  

I don't visit the BOINC development forums, so I would not have known.

This memory reporting bug: do you think it is what's causing this CUDA error?
'(cudaMemcpy(PowerSpectrumSumMax, dev_PowerSpectrumSumMax, (cudaAcc_NumDataPoints / fftlen) * sizeof(*dev_PowerSpectrumSumMax), cudaMemcpyDeviceToHost))' in file 'cuda/cudaAcc_summax.cu' in line 239 : unspecified launch failure.

No, that's totally separate; have a read of one of Jason's recent posts: Message 1302800

Claggy
ID: 1305132
Tron
Joined: 16 Aug 09
Posts: 180
Credit: 2,250,468
RAC: 0
United States
Message 1305138 - Posted: 11 Nov 2012, 23:14:23 UTC - in response to Message 1305132.  

No, that's totally separate; have a read of one of Jason's recent posts: Message 1302800

Claggy



I am a participant in that thread already.

He is saying, in so many words, that they have no idea what that problem is?
ID: 1305138
jason_gee
Volunteer developer
Volunteer tester
Joined: 24 Nov 06
Posts: 7489
Credit: 91,093,184
RAC: 0
Australia
Message 1305312 - Posted: 12 Nov 2012, 9:10:07 UTC - in response to Message 1305138.  
Last modified: 12 Nov 2012, 9:10:52 UTC

He is saying, in so many words, that they have no idea what that problem is?


Sort of correct :D (and rather funny!). The basic timeline goes something like this:
- Everyone uses physical device and driver models and is happy with the limitations (the XP driver model; similar on Linux with the same hardware).
- Microsoft, being Microsoft, decides your GPU is now part of your computer and designs WDDM looking forward some 15-25 years; since the early Vista release that change has not been coped with well, in different ways at different levels.
- Hardware engineers caught up and ran with the new models, outrunning software developers' capacity to read and understand the fundamental changes and to drop deprecated habits.

In answer to a similar question from a user here, right back around when the CUDA apps from nVidia's engineers were first released, I publicly commented that there were parallels (pun intended) to historic supercomputer development (a field I had worked in): the code approaches and technologies would inevitably encounter the same set of difficulties, along with new ones.

At the time I predicted 'Growing pains' and these gotchas are part of that process, and it seems they have been ongoing. You continue to have the choice of moving with the technologies or sticking with what works for you, both being viable approaches from time to time.

As with multithreading on multicore CPUs, it has taken many years for software development to evolve and catch up, lagging far behind the hardware engineering. That includes OSes, drivers, and tools, but most especially 'programmer sensibilities', which can be very resistant to change.

Being relatively 'new' (circa 1998), GPGPU is still evolving very rapidly: what worked well last month may be marginal this month and completely deprecated the next. With CUDA things have stabilised quite a lot, but at the same time recent developments present an entirely new set of challenges. So much of what I've learned is basically undocumented that a series of articles directed at developers is on the cards.

I'm a proponent of learning & adapting as the technology evolves, while also craving some stability.

All that really says is that the technologies are still maturing. Whether that settles down anytime soon, or goes on for a lot longer, well your guess is as good as mine, though software developer inertia remains high as it has with multithreading.

Jason
"Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions.
ID: 1305312
Tron
Joined: 16 Aug 09
Posts: 180
Credit: 2,250,468
RAC: 0
United States
Message 1305551 - Posted: 12 Nov 2012, 21:27:37 UTC

Thank you once more for the narrative. I do understand. Having just scratched the surface of my first "modern" computer in 10 years, I can fully empathise with the plight of the developer.
I am very willing to step back in time/tech to make this work, but my Tardis lacks the proper coordinates.
This computer is for play and distributed computing only, so perhaps an older Linux distro would be more diagnostic?

Lately, I have been mucking around with BIOS settings, disabling various known "new" features in the hope that one of them is the cause of my issue.

Specifically, where memory is concerned, what memory setting would be least prone to failure?
It is currently set to "XMP" (Intel); it is my understanding that this might be up-clocking my RAM on the fly, though during POST the stated settings are in fact the OEM settings for the memory modules.

Should the BIOS be set to allow memory remapping? (Currently enabled.)
ID: 1305551
jason_gee
Volunteer developer
Volunteer tester
Joined: 24 Nov 06
Posts: 7489
Credit: 91,093,184
RAC: 0
Australia
Message 1305711 - Posted: 13 Nov 2012, 5:21:26 UTC - in response to Message 1305551.  
Last modified: 13 Nov 2012, 5:27:06 UTC

Actually, the RAM thing could well be a big issue, as later video integrates RAM, bus, and video RAM all quite closely compared to the past.

If you're using an 'Auto' XMP profile, check that the voltage supplied matches the RAM spec and that the BIOS is set to a command rate of 2T. If not, you have to go to manual settings. Overeager auto settings have been a problem there lately, as most experienced system builders/overclockers find it exceedingly rare for a system memory controller / RAM combination to work reliably at a command rate of 1T, which unfortunately seems to be the default in many situations.

In addition, you want maximum RAM signal integrity without sinking or sourcing excessive current through the memory controller. That's adjusted by a setting called Vtt, which should be around 75-80% of the RAM rating. That's basically radio antenna 'impedance matching' of the terminations & circuit traces. In an increasing number of cases, anything auto could be asking for issues.

These circumstances of 'auto' settings not working right have only really become a big issue as 'mainstream enthusiast' midrange parts have become commonplace. It's a similar situation to the notorious 560 Ti challenges. An exaggerated analogy: it's as if your local used car dealer suddenly started selling refurbished Apache Longbow helicopters due to excess supply and demand from iPhone-bearing gamer teenagers with too much money.

HTH,
Jason
"Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions.
ID: 1305711
Tron
Joined: 16 Aug 09
Posts: 180
Credit: 2,250,468
RAC: 0
United States
Message 1305733 - Posted: 13 Nov 2012, 7:10:07 UTC

new log clip wrote:

Tue 13 Nov 2012 01:33:24 AM EST | | Starting BOINC client version 7.0.36 for x86_64-pc-linux-gnu
Tue 13 Nov 2012 01:33:24 AM EST | | log flags: file_xfer, sched_ops, task, coproc_debug, cpu_sched
Tue 13 Nov 2012 01:33:24 AM EST | | Libraries: libcurl/7.22.0 OpenSSL/1.0.1 zlib/1.2.3.4 libidn/1.23 librtmp/2.3
Tue 13 Nov 2012 01:33:24 AM EST | | Data directory: /home/BOINC/BOINC
Tue 13 Nov 2012 01:33:24 AM EST | | Processor: 8 GenuineIntel Intel(R) Core(TM) i7 CPU 860 @ 2.80GHz [Family 6 Model 30 Stepping 5]
Tue 13 Nov 2012 01:33:24 AM EST | | Processor: 8.00 MB cache
Tue 13 Nov 2012 01:33:24 AM EST | | Processor features: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf pni dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm sse4_1 sse4_2 popcnt lahf_lm ida dtherm tpr_shadow vnmi flexpriority ept vpid
Tue 13 Nov 2012 01:33:24 AM EST | | OS: Linux: 3.2.0-32-generic
Tue 13 Nov 2012 01:33:24 AM EST | | Memory: 3.24 GB physical, 8.00 GB virtual
Tue 13 Nov 2012 01:33:24 AM EST | | Disk: 909.02 GB total, 856.03 GB free
Tue 13 Nov 2012 01:33:24 AM EST | | Local time is UTC -5 hours
Tue 13 Nov 2012 01:33:24 AM EST | | NVIDIA GPU 0: GeForce GTX 460 (driver version unknown, CUDA version 5.0, compute capability 2.1, 1024MB, 907MB available, 941 GFLOPS peak)
Tue 13 Nov 2012 01:33:24 AM EST | | NVIDIA GPU 1: GeForce GTX 460 (driver version unknown, CUDA version 5.0, compute capability 2.1, 1024MB, 991MB available, 941 GFLOPS peak)
Tue 13 Nov 2012 01:33:24 AM EST | | OpenCL: NVIDIA GPU 0: GeForce GTX 460 (driver version 304.43, device version OpenCL 1.1 CUDA, 1024MB, 907MB available)
Tue 13 Nov 2012 01:33:24 AM EST | | OpenCL: NVIDIA GPU 1: GeForce GTX 460 (driver version 304.43, device version OpenCL 1.1 CUDA, 1024MB, 991MB available)
Tue 13 Nov 2012 01:33:24 AM EST | | NVIDIA library reports 2 GPUs

Tue 13 Nov 2012 01:33:24 AM EST | | No ATI library found
Tue 13 Nov 2012 01:33:24 AM EST | SETI@home | Found app_info.xml; using anonymous platform
Tue 13 Nov 2012 01:33:24 AM EST | | Version change (7.0.33 -> 7.0.36)
Tue 13 Nov 2012 01:33:24 AM EST | SETI@home | URL http://setiathome.berkeley.edu/; Computer ID 6806055; resource share 100
Tue 13 Nov 2012 01:33:24 AM EST | Einstein@Home | URL http://einstein.phys.uwm.edu/; Computer ID 6048447; resource share 100
Tue 13 Nov 2012 01:33:24 AM EST | SETI@home | General prefs: from SETI@home (last modified 21-Oct-2012 23:35:15)
Tue 13 Nov 2012 01:33:24 AM EST | SETI@home | Computer location: home
Tue 13 Nov 2012 01:33:24 AM EST | SETI@home | General prefs: no separate prefs for home; using your defaults
Tue 13 Nov 2012 01:33:24 AM EST | | Reading preferences override file
Tue 13 Nov 2012 01:33:24 AM EST | | Preferences:
Tue 13 Nov 2012 01:33:24 AM EST | | max memory usage when active: 2990.18MB
Tue 13 Nov 2012 01:33:24 AM EST | | max memory usage when idle: 3255.97MB
Tue 13 Nov 2012 01:33:24 AM EST | | max disk usage: 100.00GB
Tue 13 Nov 2012 01:33:24 AM EST | | max CPUs used: 6


OK, one bug exterminated :)

Still chucking occasional CUDA memcpy errors.
I looked for the timing setting Jason mentioned and could not find a 1T or 2T setting; the only options are 1N, 2N, 3N <-- same thing?
ID: 1305733
jason_gee
Volunteer developer
Volunteer tester
Joined: 24 Nov 06
Posts: 7489
Credit: 91,093,184
RAC: 0
Australia
Message 1305755 - Posted: 13 Nov 2012, 9:35:12 UTC - in response to Message 1305733.  

...Still chucking occasional CUDA memcpy errors.
I looked for the timing setting Jason mentioned and could not find a 1T or 2T setting; the only options are 1N, 2N, 3N <-- same thing?


Yep, and there can be multiple names for 'Vtt' as well.

"Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions.
ID: 1305755
Tron
Joined: 16 Aug 09
Posts: 180
Credit: 2,250,468
RAC: 0
United States
Message 1306171 - Posted: 14 Nov 2012, 19:08:27 UTC

Argh! ... This is crazy! I watched over 100 tasks complete perfectly and, the minute I went to bed, it crapped out an error every 15 minutes till I woke.
Now I am watching again and, guess what, no errors. Nothing has changed except that I am actively typing and mousing.

Is the sleep bug really sneaky?

Like, could there be afflicted drivers that are not on the list?

What would be some other names for "Vtt" ?
ID: 1306171
jason_gee
Volunteer developer
Volunteer tester
Joined: 24 Nov 06
Posts: 7489
Credit: 91,093,184
RAC: 0
Australia
Message 1306299 - Posted: 15 Nov 2012, 2:33:23 UTC - in response to Message 1306171.  

Sure; Linux drivers often have different build numbers, so something could still reside there. Those memcpy() failures, though, are not generally symptoms of the sleepy bug. By the sound of it, it's more likely some sort of power saving setting, like the CPU throttling down and missing its cue.
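One quick way to rule CPU frequency scaling in or out on Ubuntu is to check the cpufreq governor while the machine is crunching unattended (standard sysfs paths, shown as a sketch rather than anything specific to this host):

# show the current frequency governor for each core
cat /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor

# temporarily pin all cores to "performance" (reverts on reboot)
echo performance | sudo tee /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor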

for aliases for Vtt, try
http://www.masterslair.com/vcore-vtt-dram-pll-pch-voltages-explained-core-i7-i5-i3/#vtt-aka-imc-qpivtt-qpidram
"Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions.
ID: 1306299
Tron
Joined: 16 Aug 09
Posts: 180
Credit: 2,250,468
RAC: 0
United States
Message 1312670 - Posted: 8 Dec 2012, 19:29:57 UTC
Last modified: 8 Dec 2012, 19:30:26 UTC

Jason wrote:

for aliases for Vtt, try
http://www.masterslair.com/vcore-vtt-dram-pll-pch-voltages-explained-core-i7-i5-i3/#vtt-aka-imc-qpivtt-qpidram


Thanks, that web page was very useful. In fact, it happens to be based on my exact model of motherboard.

Updating my issue: I crunched for Einstein while SETI was in repair mode. This shed some light on the problems I have been having,
since I could not (and still can't) get a single GPU CUDA task to even pass the initial validation sanity check at E@H.

Members of that forum pointed out that my "twin" GTX 460 GPU runs in SLI mode with itself, and members had reported problems with SLI, so I disabled it in xorg.conf.
I also changed the "MultiGPU" option to off.
Now both GPUs operate independently of each other.
While this did not help my E@H problem at all, it did seem to stop the constant CUDA memcpy errors I had been getting at S@H (3 WUs errored in 60 hours of operation vs. 10 to 20 per day previously).
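For reference, the relevant part of the Device section now looks something like this (an illustrative sketch, not a verbatim copy of my xorg.conf; running nvidia-xconfig --sli=off --multigpu=off writes the same options):

Section "Device"
    Identifier "Device0"
    Driver     "nvidia"
    Option     "SLI"      "Off"
    Option     "MultiGPU" "Off"
EndSection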

Now I can plainly see some tasks coming back "invalid" here at S@H, and all of those so far have been issued to "GPU 2" in my host. It looks like I have found the culprit, to a degree.
GPU 2 literally runs 30 C hotter than GPU 1 on the same card. Using "Coolbits" I can only control the fan for GPU 1; GPU 2's fan runs at the stock variable speeds.
I can't quite figure out how to get the Coolbits fan settings to apply to both fans.
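A hedged sketch of the usual approach with drivers of that era, not something confirmed in this thread: nvidia-settings generally only offers fan control for GPUs that own an X screen with Coolbits enabled, so each GPU needs its own Device/Screen pair in xorg.conf, e.g.:

sudo nvidia-xconfig --enable-all-gpus --cool-bits=4
# after restarting X, manual fan control for the second GPU becomes available
nvidia-settings -a "[gpu:1]/GPUFanControlState=1" -a "[fan:1]/GPUCurrentFanSpeed=70"
# (on much newer drivers the fan-speed attribute is GPUTargetFanSpeed instead)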
ID: 1312670
