nVidia Titan issue

Message boards : Number crunching : nVidia Titan issue

Profile Woodgie
Joined: 6 Dec 99
Posts: 134
Credit: 89,630,417
RAC: 55
United Kingdom
Message 1866978 - Posted: 12 May 2017, 19:53:33 UTC

OK, so it's been a while. A long while. Hi again guys and gals!

I've recently sparked up Outlander after a year or so of downtime and am back in the crunching fray, but I have encountered a bit of a problem which MIGHT be a case of "Your Graphics Card Is Too Old", and I'm willing to accept that... Begrudgingly.

I have 2 x nVidia 750Ti which crunch away happily, albeit not as fast as I'd like, so I found the old (1st gen) Titan which I used to use, stuck it in the computer, and tried to get it to crunch.

But it won't.

12/05/2017 20:30:44 |  | CUDA: NVIDIA GPU 0 (not used): GeForce GTX TITAN (driver version 382.05, CUDA version 8.0, compute capability 3.5, 4096MB, 4010MB available, 4707 GFLOPS peak)


I have a feeling that that 'compute capability 3.5' is the kicker here. I'm probably missing something about Hardware version/Driver version too. At least that's what my gut tells me.

-I have the Lunatics 0.44 optimised apps installed in Windows 10

-My app_info.xml has each GPU app set with an avg_ncpus of 0.25 and a max_ncpus of 0.5 (we can debate the wisdom of that later unless it's directly related to the issue at hand :-) ).

-I DID install the optimised apps BEFORE swapping in the Titan, I am going to re-install them after I finish this post to see if that makes any difference. I'll report back.

The Titan seems to be working in all other respects. It'll drive a monitor, the nVidia GeForce Experience application sees it and Device Manager reports it is not throwing any hissy-fits. Well OK, it actually reports "This device is working properly."

Other than that, welcome back, me!
~W

ID: 1866978 · Report as offensive
Profile Brent Norman (Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor)
Volunteer tester
Joined: 1 Dec 99
Posts: 2786
Credit: 685,657,289
RAC: 835
Canada
Message 1866979 - Posted: 12 May 2017, 20:02:25 UTC - in response to Message 1866978.  

It sounds like a cc_config.xml problem. Do you have <use_all_gpus>1</use_all_gpus> in it, inside the <options> section?

If not that, did you re-install the driver after adding the card? Clean install of course.
ID: 1866979 · Report as offensive
Profile Zalster Special Project $250 donor
Volunteer tester
Joined: 27 May 99
Posts: 5517
Credit: 528,817,460
RAC: 242
United States
Message 1866980 - Posted: 12 May 2017, 20:04:21 UTC - in response to Message 1866979.  

I agree with Brent. Check whether you have a cc_config.xml in the BOINC folder.
ID: 1866980 · Report as offensive
Profile Woodgie
Joined: 6 Dec 99
Posts: 134
Credit: 89,630,417
RAC: 55
United Kingdom
Message 1866985 - Posted: 12 May 2017, 20:40:20 UTC - in response to Message 1866979.  

Thanks for that.

I didn't have <use_all_gpus>1</use_all_gpus> in cc_config and adding it has made no difference. (It worked without it with both 750Ti, though)

I've re-installed the optimised apps and twiddled with app_info.xml to no avail.

We'll see if any genius can spot what I am sure is obvious. I just can't see it.

I'll re-install the nVidia drivers; if that doesn't work, it's back to the 2 x 750Ti.

~W

ID: 1866985 · Report as offensive
Profile Zalster Special Project $250 donor
Volunteer tester
Joined: 27 May 99
Posts: 5517
Credit: 528,817,460
RAC: 242
United States
Message 1866987 - Posted: 12 May 2017, 20:45:25 UTC - in response to Message 1866985.  

Post the first 30 lines of the BOINC startup log.
ID: 1866987 · Report as offensive
Profile Woodgie
Joined: 6 Dec 99
Posts: 134
Credit: 89,630,417
RAC: 55
United Kingdom
Message 1867000 - Posted: 12 May 2017, 21:32:21 UTC - in response to Message 1866987.  

12/05/2017 22:30:02 |  | Starting BOINC client version 7.6.33 for windows_x86_64
12/05/2017 22:30:02 |  | log flags: file_xfer, sched_ops, task, coproc_debug, mem_usage_debug, task_debug
12/05/2017 22:30:02 |  | Libraries: libcurl/7.47.1 OpenSSL/1.0.2g zlib/1.2.8
12/05/2017 22:30:02 |  | Data directory: C:\ProgramData\BOINC
12/05/2017 22:30:02 |  | Running under account zeus
12/05/2017 22:30:02 |  | [coproc] launching child process at C:\Program Files\BOINC\boinc.exe
12/05/2017 22:30:02 |  | [coproc] relative to directory C:\ProgramData\BOINC
12/05/2017 22:30:02 |  | [coproc] with data directory "C:\ProgramData\BOINC"
12/05/2017 22:30:02 |  | CUDA: NVIDIA GPU 0 (not used): GeForce GTX TITAN (driver version 376.53, CUDA version 8.0, compute capability 3.5, 4096MB, 4010MB available, 4707 GFLOPS peak)
12/05/2017 22:30:02 |  | CUDA: NVIDIA GPU 1: GeForce GTX 750 Ti (driver version 376.53, CUDA version 8.0, compute capability 5.0, 2048MB, 1689MB available, 1388 GFLOPS peak)
12/05/2017 22:30:02 |  | OpenCL: NVIDIA GPU 0 (not used): GeForce GTX TITAN (driver version 376.53, device version OpenCL 1.2 CUDA, 6144MB, 4010MB available, 4707 GFLOPS peak)
12/05/2017 22:30:02 |  | OpenCL: NVIDIA GPU 1: GeForce GTX 750 Ti (driver version 376.53, device version OpenCL 1.2 CUDA, 2048MB, 1689MB available, 1388 GFLOPS peak)
12/05/2017 22:30:02 |  | [coproc] NVIDIA library reports 2 GPUs
12/05/2017 22:30:02 |  | [coproc] No ATI library found.
12/05/2017 22:30:02 | SETI@home | Found app_info.xml; using anonymous platform
12/05/2017 22:30:02 |  | Host name: outlander
12/05/2017 22:30:02 |  | Processor: 8 GenuineIntel Intel(R) Core(TM) i7-4770K CPU @ 3.50GHz [Family 6 Model 60 Stepping 3]
12/05/2017 22:30:02 |  | Processor features: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss htt tm pni ssse3 fma cx16 sse4_1 sse4_2 movebe popcnt aes f16c rdrandsyscall nx lm avx avx2 vmx tm2 pbe fsgsbase bmi1 smep bmi2
12/05/2017 22:30:02 |  | OS: Microsoft Windows 10: Professional x64 Edition, (10.00.15063.00)
12/05/2017 22:30:02 |  | Memory: 15.94 GB physical, 18.31 GB virtual
12/05/2017 22:30:02 |  | Disk: 231.87 GB total, 149.91 GB free
12/05/2017 22:30:02 |  | Local time is UTC +1 hours
12/05/2017 22:30:02 | SETI@home | URL http://setiathome.berkeley.edu/; Computer ID 8246123; resource share 100
12/05/2017 22:30:07 | SETI@home | General prefs: from SETI@home (last modified 24-Mar-2015 23:26:21)
12/05/2017 22:30:07 | SETI@home | Computer location: home
12/05/2017 22:30:07 |  | General prefs: using separate prefs for home
12/05/2017 22:30:07 |  | Preferences:
12/05/2017 22:30:07 |  | max memory usage when active: 8160.73MB
12/05/2017 22:30:07 |  | max memory usage when idle: 13057.18MB
12/05/2017 22:30:07 |  | max disk usage: 10.00GB
12/05/2017 22:30:07 |  | (to change preferences, visit a project web site or select Preferences in the Manager)
12/05/2017 22:30:07 |  | [mem_usage] All others: WS 2126.61MB, swap 1141.48MB, user 22.500s, kernel 20.859s
12/05/2017 22:30:07 |  | [mem_usage] enforce: available RAM 8160.73MB swap 3750.69MB
12/05/2017 22:30:07 | SETI@home | [coproc] Assigning 0.500000 of NVIDIA free instance 0 to 14au08ac.7668.18886.7.34.106_1
12/05/2017 22:30:07 | SETI@home | [coproc] Assigning 0.500000 of NVIDIA instance 0 to 13dc08ac.3470.113515.10.37.180_1
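
For anyone who would rather not eyeball a startup log like the one above, here is a minimal Python sketch that pulls out the GPU detection lines and flags any card marked "(not used)". The regular expression is only fitted to the line format seen in this log, and the names `GPU_LINE`, `find_gpus` and `SAMPLE_LOG` are made up for the illustration; other BOINC versions may format these lines differently.

```python
import re

# Fitted to lines such as:
#   ... |  | CUDA: NVIDIA GPU 0 (not used): GeForce GTX TITAN (driver version ...)
# This pattern is an assumption based on the log in this thread only.
GPU_LINE = re.compile(
    r"\|\s*(?P<api>CUDA|OpenCL): NVIDIA GPU (?P<index>\d+)"
    r"(?P<unused> \(not used\))?: (?P<name>.+?) \("
)


def find_gpus(log_text):
    """Return (api, index, name, used) for each NVIDIA GPU line in the log."""
    gpus = []
    for line in log_text.splitlines():
        match = GPU_LINE.search(line)
        if match:
            gpus.append((
                match["api"],
                int(match["index"]),
                match["name"],
                match["unused"] is None,  # True when "(not used)" is absent
            ))
    return gpus


# Two lines copied from the log above.
SAMPLE_LOG = """\
12/05/2017 22:30:02 |  | CUDA: NVIDIA GPU 0 (not used): GeForce GTX TITAN (driver version 376.53, CUDA version 8.0, compute capability 3.5, 4096MB, 4010MB available, 4707 GFLOPS peak)
12/05/2017 22:30:02 |  | CUDA: NVIDIA GPU 1: GeForce GTX 750 Ti (driver version 376.53, CUDA version 8.0, compute capability 5.0, 2048MB, 1689MB available, 1388 GFLOPS peak)
"""

for api, index, name, used in find_gpus(SAMPLE_LOG):
    print(api, index, name, "used" if used else "NOT USED")
```

Run against the two sample lines, this reports the Titan as not used and the 750 Ti as used.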

~W

ID: 1867000 · Report as offensive
Profile Brent Norman (Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor)
Volunteer tester
Joined: 1 Dec 99
Posts: 2786
Credit: 685,657,289
RAC: 835
Canada
Message 1867008 - Posted: 12 May 2017, 22:01:24 UTC - in response to Message 1867000.  
Last modified: 12 May 2017, 22:23:20 UTC

You should have a line like this in the startup log:
2017-05-12 3:55:43 PM | | Config: use all coprocessors
BOINC doesn't seem to have read the setting.
cc_config.xml should be of the form:

<cc_config>
    <log_flags>
        <file_xfer>1</file_xfer>
        <sched_ops>1</sched_ops>
    </log_flags>
    <options>
        <use_all_gpus>1</use_all_gpus>
    </options>
</cc_config>
ID: 1867008 · Report as offensive
Harri Liljeroos
Joined: 29 May 99
Posts: 3989
Credit: 85,281,665
RAC: 126
Finland
Message 1867009 - Posted: 12 May 2017, 22:08:03 UTC - in response to Message 1867000.  
Last modified: 12 May 2017, 22:09:43 UTC

It seems that your cc_config.xml was not found. It should go in the BOINC data directory, which your log shows as C:\ProgramData\BOINC.
I would expect to see a line like this in the log:

10-May-2017 11:47:48 [---] Config: use all coprocessors
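
A quick way to check for that line is a small Python sketch along these lines. It only searches for the message text quoted above; the function name `config_was_applied` and the two sample strings are invented for the example, and how you obtain the log text (copying it from the Event Log, for instance) is up to you.

```python
def config_was_applied(log_text):
    """Return True if any log line reports that use_all_gpus took effect."""
    return any(
        "Config: use all coprocessors" in line
        for line in log_text.splitlines()
    )


# One log where the setting was picked up and one where it was not.
good_log = "10-May-2017 11:47:48 [---] Config: use all coprocessors"
bad_log = "12/05/2017 22:30:02 |  | [coproc] NVIDIA library reports 2 GPUs"

print(config_was_applied(good_log))
print(config_was_applied(bad_log))
```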


[edit] Brent beat me while I was typing.
ID: 1867009 · Report as offensive
Profile Zalster Special Project $250 donor
Volunteer tester
Joined: 27 May 99
Posts: 5517
Credit: 528,817,460
RAC: 242
United States
Message 1867019 - Posted: 12 May 2017, 22:32:09 UTC - in response to Message 1867009.  

It could also be that he added the cc_config.xml but didn't restart BOINC or tell it to re-read the config files.

Some things to consider.
ID: 1867019 · Report as offensive
Profile Brent Norman (Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor)
Volunteer tester
Joined: 1 Dec 99
Posts: 2786
Credit: 685,657,289
RAC: 835
Canada
Message 1867126 - Posted: 13 May 2017, 15:16:00 UTC

I see it is working now:
setiathome_CUDA: Found 2 CUDA device(s):
nVidia Driver Version 376.53
  Device 1: GeForce GTX TITAN, 4095 MiB, regsPerBlock 65536
     computeCap 3.5, multiProcs 14 
     pciBusID = 1, pciSlotID = 0
  Device 2: GeForce GTX 750 Ti, 2048 MiB, regsPerBlock 65536
     computeCap 5.0, multiProcs 5 
     pciBusID = 2, pciSlotID = 0
ID: 1867126 · Report as offensive
Profile Woodgie
Joined: 6 Dec 99
Posts: 134
Credit: 89,630,417
RAC: 55
United Kingdom
Message 1867260 - Posted: 14 May 2017, 13:18:07 UTC - in response to Message 1867126.  

It worked for about 5 minutes. I was writing a reply explaining I somehow missed the <use_all_gpus>0</use_all_gpus> directive at the bottom of the file (obviously over-riding the <use_all_gpus>1</use_all_gpus> I put earlier in the file) when...

My CPU gave up! (The logic board is reporting through a hex code display that the CPU is not initialising, and there is a very prominent red "your CPU is fried" LED.)

But that's a whole 'nother story...

What I don't quite get is that I never edited the cc_config file when I was running the 2 x 750Ti - so it must have had a <use_all_gpus>0</use_all_gpus> directive - yet BOINC/Seti@Home used both GPUs.
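
The duplicate directive described above is easy to miss by eye, so here is a minimal Python sketch that lists every <use_all_gpus> value in a config. The sample text is a made-up layout, not my actual file, and `use_all_gpus_values` is just a name for the illustration. In my case the later 0 appeared to win, but treat that as an observation rather than documented behaviour.

```python
import xml.etree.ElementTree as ET


def use_all_gpus_values(config_text):
    """Return the text of every <use_all_gpus> element, in file order."""
    root = ET.fromstring(config_text)
    return [(el.text or "").strip() for el in root.iter("use_all_gpus")]


# Hypothetical config with the directive given twice.
DUPLICATED_CONFIG = """<cc_config>
    <options>
        <use_all_gpus>1</use_all_gpus>
        <use_all_gpus>0</use_all_gpus>
    </options>
</cc_config>"""

values = use_all_gpus_values(DUPLICATED_CONFIG)
if len(values) > 1:
    print("use_all_gpus appears", len(values), "times:", values)
```

If the list has more than one entry, delete the extras so only the value you want remains.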

~W

ID: 1867260 · Report as offensive
Profile Zalster Special Project $250 donor
Volunteer tester
Joined: 27 May 99
Posts: 5517
Credit: 528,817,460
RAC: 242
United States
Message 1867284 - Posted: 14 May 2017, 16:58:15 UTC - in response to Message 1867260.  

Sorry to hear about the CPU...

Having that go is never a good experience. I hope you are able to get a replacement for it. Or are you thinking of a new CPU and a new motherboard?
ID: 1867284 · Report as offensive
Profile Brent Norman (Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor)
Volunteer tester
Joined: 1 Dec 99
Posts: 2786
Credit: 685,657,289
RAC: 835
Canada
Message 1867298 - Posted: 14 May 2017, 19:17:47 UTC - in response to Message 1867260.  

OUCH! That's a nasty turn of events! Sorry to hear that.

I heard someone mention that you don't need <use_all_gpus>1</use_all_gpus> IF your cards are identical, or very similar. That must have been what was happening before.
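
As a rough illustration of why that would be: without use_all_gpus, BOINC is documented to use only the most capable GPU(s). The Python sketch below is a deliberately simplified model that ranks cards by compute capability alone. That is an assumption for the example (the real client compares more than this), but it reproduces what the logs in this thread show. The function name `gpus_in_use` and the two lists are made up here.

```python
def gpus_in_use(gpus, use_all_gpus=False):
    """Pick the GPUs a client would use under a simplified model.

    gpus is a list of (name, (major, minor)) compute-capability pairs.
    Assumption: "most capable" is decided by compute capability only.
    """
    if use_all_gpus or not gpus:
        return [name for name, _ in gpus]
    best = max(cc for _, cc in gpus)
    return [name for name, cc in gpus if cc == best]


# Compute capabilities taken from the startup log in this thread.
mixed = [("GeForce GTX TITAN", (3, 5)), ("GeForce GTX 750 Ti", (5, 0))]
matched = [("GeForce GTX 750 Ti", (5, 0)), ("GeForce GTX 750 Ti", (5, 0))]

print(gpus_in_use(mixed))                     # only the 750 Ti
print(gpus_in_use(mixed, use_all_gpus=True))  # both cards
print(gpus_in_use(matched))                   # both 750 Tis, no flag needed
```

Under this model two identical cards tie for "most capable", so both run without the flag, while the Titan loses out to the 750 Ti until use_all_gpus is set.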
ID: 1867298 · Report as offensive
Profile Woodgie
Joined: 6 Dec 99
Posts: 134
Credit: 89,630,417
RAC: 55
United Kingdom
Message 1867398 - Posted: 15 May 2017, 4:57:45 UTC - in response to Message 1867298.  

It's a 3rd act plot twist I could have done without, that's for sure. To answer an earlier question I think I'll just go for a new CPU for the moment.

Makes sense about the cc_config file and identical cards, though, if that is the case.

It's all knowledge and experience which I can pass on if I see someone else in the same situation.

Thanks, all.

~W

ID: 1867398 · Report as offensive

 
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.