GTX 295 Driver Hell

Richard Haselgrove Project Donor
Volunteer tester

Joined: 4 Jul 99
Posts: 14653
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1086503 - Posted: 12 Mar 2011, 19:05:27 UTC - in response to Message 1086499.  
Last modified: 12 Mar 2011, 19:14:40 UTC

Try reinstalling Lunatics with Astropulse off, CUDA on, and SETI set to the x32 SSE (Faster on AMD) option. If you are running Intel, this is suboptimal, but testing it will help us understand why this is failing. Also, I don't think it will amount to a significant performance decrease.

??? (^$(%#&%#$ !!!

How would changing to a different CPU app affect what (I believe) you're trying to test - namely, the performance of the CUDA application when run (correctly) with the CUDA 4.0 DLLs?

Edit - for a discussion about DLL dependencies, I invite you to read my post at http://setiweb.ssl.berkeley.edu/beta/forum_thread.php?id=1778&nowrap=true#39386
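A quick way to check which copy of the runtime actually gets picked up is a small host-side test. This is only a sketch (it assumes the plain Win32 API and the cudart32_40_10.dll name being discussed here, and it needs to be built as a 32-bit program so the 32-bit DLL can load):

[code]
// Sketch only: load the CUDA runtime DLL the same way Windows would for the
// science app, then report the full path of the copy that actually resolved.
#include <windows.h>
#include <stdio.h>

int main(void)
{
    HMODULE h = LoadLibraryA("cudart32_40_10.dll");   // follows the normal DLL search order
    if (h == NULL) {
        // A failure here can also mean one of the DLL's own dependencies is missing.
        printf("cudart32_40_10.dll failed to load (error %lu)\n", GetLastError());
        return 1;
    }
    char path[MAX_PATH];
    GetModuleFileNameA(h, path, MAX_PATH);            // where the loaded copy lives on disk
    printf("loaded from: %s\n", path);
    FreeLibrary(h);
    return 0;
}
[/code]

Run it from the project directory: if it reports a path outside that directory (a toolkit install, for example), that is the "found the right DLLs in a completely different part of the hard disk" behaviour discussed below.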
HAL9000
Volunteer tester
Joined: 11 Sep 99
Posts: 6534
Credit: 196,805,888
RAC: 57
United States
Message 1086510 - Posted: 12 Mar 2011, 19:52:08 UTC - in response to Message 1086499.  

There were some other files with a similar naming convention in the toolkit. Just being lazy, I copied them all over to the project directory. Now I'm thinking the DLLs might be dependent on one of them. I will take all of the 32-bit versions, zip them, and post them shortly. Perhaps this is the reason it is working for me and not for others.

*Edit: Cancel that, works fine without them.

Try reinstalling Lunatics with Astropulse off, CUDA on, and SETI set to the x32 SSE (Faster on AMD) option. If you are running Intel, this is suboptimal, but testing it will help us understand why this is failing. Also, I don't think it will amount to a significant performance decrease.


That is the configuration I use on the machine at work that has a CUDA card. I did the process a few times just to be sure. I even tried one go with just CUDA enabled & no CPU processing. It was a no-go for me.
SETI@home classic workunits: 93,865 CPU time: 863,447 hours
Join the [url=http://tinyurl.com/8y46zvu]BP6/VP6 User Group[/url]
-BeNt-
Joined: 17 Oct 99
Posts: 1234
Credit: 10,116,112
RAC: 0
United States
Message 1086515 - Posted: 12 Mar 2011, 20:11:32 UTC - in response to Message 1086483.  


I wasn't. I confidently predicted that a task would crash during early beta testing, because I knew I was using the wrong DLLs. Instead, Windows found the right DLLs installed in a completely different part of the hard disk, and ran the task to completion for me.


Umm k, it ran fine here; it was an issue with the video card drivers, not the DLLs. I just don't want to run the 266.58s or the dev drivers. I'm perfectly happy with my 267.24s as of right now; as I've said in the past, my machine is more for gaming than SETI@home, so if it affects my gaming performance it's out. Each machine is different. It was using the 4.0 DLLs as prescribed. There must be a mix of drivers, namely older ones, that don't agree with the new CUDA DLLs, or vice versa. Either way, these are not ready for prime time, hence an RC release and not RTM. Testing on my end is complete, because the driver crashes after about 11 hours of running. Saw good gains though; can't wait to see it when they get the mix sorted.

No need to get defensive. ;)
Traveling through space at ~67,000mph!
Al
Joined: 3 Apr 99
Posts: 1682
Credit: 477,343,364
RAC: 482
United States
Message 1091227 - Posted: 28 Mar 2011, 19:01:59 UTC

I remember reading this thread a while ago while working on the upgrade to my other computer, so I thought I would do the same to the new one I just finished building. I upgraded the DLLs to cufft32_40_10.dll & cudart32_40_10.dll plus libfftw3f-3.dll, replaced all the references to the 3.x versions in the app_info file, and even renamed the old files with a .bak extension. When I start BOINC, it seems to say that I am still running 3.x? (3020?)

3/28/2011 12:25:36 PM Starting BOINC client version 6.10.60 for windows_x86_64
3/28/2011 12:25:36 PM log flags: file_xfer, sched_ops, task
3/28/2011 12:25:36 PM Libraries: libcurl/7.19.7 OpenSSL/0.9.8l zlib/1.2.3
3/28/2011 12:25:36 PM Data directory: C:\Documents and Settings\All Users\Application Data\BOINC
3/28/2011 12:25:36 PM Running under account Administrator
3/28/2011 12:25:36 PM Processor: 8 GenuineIntel Intel(R) Core(TM) i7 CPU @ 9500 @ 3.07GHz [Family 6 Model 26 Stepping 5]
3/28/2011 12:25:36 PM Processor: 256.00 KB cache
3/28/2011 12:25:36 PM Processor features: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss htt tm pni ssse3 cx16 sse4_1 sse4_2 syscall nx lm vmx tm2 popcnt pbe
3/28/2011 12:25:36 PM OS: Microsoft Windows XP: Professional x64 Edition, Service Pack 2, (05.02.3790.00)
3/28/2011 12:25:36 PM Memory: 5.99 GB physical, 7.61 GB virtual
3/28/2011 12:25:36 PM Disk: 232.88 GB total, 223.41 GB free
3/28/2011 12:25:36 PM Local time is UTC -5 hours
3/28/2011 12:25:36 PM NVIDIA GPU 0: GeForce GTX 260 (driver version 26658, CUDA version 3020, compute capability 1.3, 896MB, 597 GFLOPS peak)
3/28/2011 12:25:36 PM NVIDIA GPU 1: GeForce GTX 260 (driver version 26658, CUDA version 3020, compute capability 1.3, 896MB, 597 GFLOPS peak)
3/28/2011 12:25:36 PM NVIDIA GPU 2: GeForce GTX 260 (driver version 26658, CUDA version 3020, compute capability 1.3, 896MB, 597 GFLOPS peak)
3/28/2011 12:25:36 PM NVIDIA GPU 3: GeForce GTX 260 (driver version 26658, CUDA version 3020, compute capability 1.3, 896MB, 597 GFLOPS peak)
3/28/2011 12:25:36 PM SETI@home Found app_info.xml; using anonymous platform
3/28/2011 12:25:36 PM SETI@home URL http://setiathome.berkeley.edu/; Computer ID 5873188; resource share 100
3/28/2011 12:25:36 PM SETI@home General prefs: from SETI@home (last modified 26-May-2007 00:12:51)
3/28/2011 12:25:36 PM SETI@home Computer location: home
3/28/2011 12:25:36 PM SETI@home General prefs: no separate prefs for home; using your defaults
3/28/2011 12:25:36 PM Reading preferences override file
3/28/2011 12:25:36 PM Preferences:
3/28/2011 12:25:36 PM max memory usage when active: 3067.44MB
3/28/2011 12:25:36 PM max memory usage when idle: 5828.13MB
3/28/2011 12:25:36 PM max disk usage: 100.00GB
3/28/2011 12:25:36 PM suspend work if non-BOINC CPU load exceeds 85 %
3/28/2011 12:25:36 PM (to change preferences, visit the web site of an attached project, or select Preferences in the Manager)
3/28/2011 12:25:36 PM Not using a proxy

Here are my questions: I have XP 64-bit, the 64-bit BOINC, and the 64-bit Lunatics installer installed, but when I went to my directory to change the above files, I noticed that all that was listed were the 32-bit ones. I copied both sets (32 & 64 bit) over to the correct dir and added .bak to the old 32-bit ones. It appears that it is running the 32-bit file, not the 64-bit. Am I missing something here? How can I get it to run the 64-bit app by default? Thanks

perryjay
Volunteer tester
Joined: 20 Aug 02
Posts: 3377
Credit: 20,676,751
RAC: 0
United States
Message 1091253 - Posted: 28 Mar 2011, 20:43:38 UTC - in response to Message 1091227.  

Okay,
3/28/2011 12:25:36 PM Starting BOINC client version 6.10.60 for windows_x86_64


You are running the 64-bit client version.

CUDA version 3020,


This is the CUDA version you are running... 3.2, not 32-bit.

I'm sure someone who knows a bit more about it will come along to explain it better, but everything looks like a go to me.


PROUD MEMBER OF Team Starfire World BOINC
HAL9000
Volunteer tester
Joined: 11 Sep 99
Posts: 6534
Credit: 196,805,888
RAC: 57
United States
Message 1091280 - Posted: 28 Mar 2011, 22:02:15 UTC - in response to Message 1091227.  

I remember reading this thread a while ago while working on the upgrade to my other computer, so I thought I would do the same to the new one I just finished building. I upgraded the DLLs to cufft32_40_10.dll & cudart32_40_10.dll plus libfftw3f-3.dll, replaced all the references to the 3.x versions in the app_info file, and even renamed the old files with a .bak extension. When I start BOINC, it seems to say that I am still running 3.x? (3020?)
3/28/2011 12:25:36 PM NVIDIA GPU 0: GeForce GTX 260 (driver version 26658, CUDA version 3020, compute capability 1.3, 896MB, 597 GFLOPS peak)
3/28/2011 12:25:36 PM SETI@home Found app_info.xml; using anonymous platform

BOINC is showing you the driver you have installed and the version of CUDA that driver supports, not what you put in your app_info.xml.
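If anyone wants to see that distinction directly, here is a minimal sketch using the standard CUDA runtime API (nothing SETI-specific): the driver reports the highest CUDA version it supports, while the runtime reports the version of the cudart DLL actually in use. Versions are encoded as major*1000 + minor*10, so 3020 means 3.2 and 4000 means 4.0.

[code]
// Sketch only: compare the CUDA version the display driver supports with the
// version of the CUDA runtime DLL this program loaded.
#include <stdio.h>
#include <cuda_runtime.h>

int main(void)
{
    int driverVer = 0, runtimeVer = 0;
    cudaDriverGetVersion(&driverVer);    // what the installed driver supports (this is what BOINC reports)
    cudaRuntimeGetVersion(&runtimeVer);  // version of the cudart DLL actually in use
    printf("driver supports CUDA %d, runtime is CUDA %d\n", driverVer, runtimeVer);
    return 0;
}
[/code]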
Here are my questions: I have XP 64-bit, the 64-bit BOINC, and the 64-bit Lunatics installer installed, but when I went to my directory to change the above files, I noticed that all that was listed were the 32-bit ones. I copied both sets (32 & 64 bit) over to the correct dir and added .bak to the old 32-bit ones. It appears that it is running the 32-bit file, not the 64-bit. Am I missing something here? How can I get it to run the 64-bit app by default? Thanks

The CUDA application is only 32-bit. Hence Lunatics_Win64v0.37_AP505r409_AKv8bx64_Cudax32f.exe
SETI@home classic workunits: 93,865 CPU time: 863,447 hours
Join the [url=http://tinyurl.com/8y46zvu]BP6/VP6 User Group[/url]
Al
Joined: 3 Apr 99
Posts: 1682
Credit: 477,343,364
RAC: 482
United States
Message 1091282 - Posted: 28 Mar 2011, 22:20:40 UTC - in response to Message 1091280.  

Thanks for the education, guys. One last question, what would the 64 bit versions be used for?

zoom3+1=4
Volunteer tester
Joined: 30 Nov 03
Posts: 65758
Credit: 55,293,173
RAC: 49
United States
Message 1091285 - Posted: 28 Mar 2011, 22:29:22 UTC - in response to Message 1091282.  

Thanks for the education, guys. One last question, what would the 64 bit versions be used for?

Well, right now, at least last I heard, CUDA is only available as a 32-bit executable, as the Lunatics do not see any reason for a 64-bit version for Windows.
The T1 Trust, PRR T1 Class 4-4-4-4 #5550, 1 of America's First HST's
HAL9000
Volunteer tester
Joined: 11 Sep 99
Posts: 6534
Credit: 196,805,888
RAC: 57
United States
Message 1091303 - Posted: 28 Mar 2011, 22:58:45 UTC - in response to Message 1091282.  

Thanks for the education, guys. One last question, what would the 64 bit versions be used for?

Presumably to give the people who wish to write 64-bit applications the resources to do so.
SETI@home classic workunits: 93,865 CPU time: 863,447 hours
Join the [url=http://tinyurl.com/8y46zvu]BP6/VP6 User Group[/url]
jason_gee
Volunteer developer
Volunteer tester
Joined: 24 Nov 06
Posts: 7489
Credit: 91,093,184
RAC: 0
Australia
Message 1091319 - Posted: 29 Mar 2011, 0:07:06 UTC - in response to Message 1091303.  
Last modified: 29 Mar 2011, 0:18:18 UTC

Thanks for the education, guys. One last question, what would the 64 bit versions be used for?

Presumably to give the people who wish to write 64-bit applications the resources to do so.


Correct. For some detail as to why a 64-bit CUDA multibeam application is pointless at this stage:
- CUDA code runs on GPUs, not CPUs; the bitness of the Windows OS & CPU doesn't matter for device code.
- As of newer CUDA releases, 64-bit CPU applications must be linked with 64-bit CUDA device code, which is slower due to doubled pointer sizes.
- 64-bit device code allows the use of the massive amounts of memory available on Tesla cards.
- For the most part, in a very general sense, a lot of optimisation involves reorganising & minimising memory access & reducing the footprint. Actually making use of enough memory to justify 64-bit pointers and address space, on the few cards that can, would be akin to de-optimising ... memory is very slow compared to computation.
- Misconceptions about 64-bitness being faster, I suspect, come from the fact that on CPUs it generally is. In simple terms, the special architecture circuitry & OS layers that handle both at once make 64-bit CPU apps faster running natively ... this is not the case with a 64-bit model on GPUs, largely because they don't run an OS.

A 64-bit CPU host & CUDA device code application would be slower (yes, I have tested this), double the maintenance, & likely be restricted to professional supercompute cards & Fermis.
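As a rough illustration of the pointer-size point above, here is a tiny sketch (assumptions: nvcc's -m32/-m64 machine switches and a hypothetical ptrsize.cu file name; this is not from the actual multibeam source):

[code]
// Sketch only: device pointer width follows the host build, so a 64-bit host
// build drags 8-byte pointers into every kernel argument and address calculation.
//   nvcc -m32 ptrsize.cu -o ptrsize32   -> 4-byte pointers, host and device
//   nvcc -m64 ptrsize.cu -o ptrsize64   -> 8-byte pointers, host and device
#include <stdio.h>
#include <cuda_runtime.h>

__global__ void touch(float *buf)        // a pointer-taking kernel, nothing more
{
    if (buf != NULL) buf[0] = 1.0f;
}

int main(void)
{
    printf("pointer size in this build: %u bytes\n", (unsigned)sizeof(float *));

    float *d = NULL;
    cudaMalloc((void **)&d, 1024 * sizeof(float));   // device buffer addressed via that pointer
    touch<<<1, 1>>>(d);
    cudaDeviceSynchronize();
    cudaFree(d);
    return 0;
}
[/code]

The kernel does the same work either way; the 64-bit build just spends more registers and memory traffic on addresses, which is the de-optimising effect described above.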

HTH,
Jason
"Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions.
HAL9000
Volunteer tester
Joined: 11 Sep 99
Posts: 6534
Credit: 196,805,888
RAC: 57
United States
Message 1091327 - Posted: 29 Mar 2011, 0:32:49 UTC - in response to Message 1091319.  

Thanks for the education, guys. One last question, what would the 64 bit versions be used for?

Presumably to give the people who wish to write 64-bit applications the resources to do so.


Correct. For some detail as to why a 64-bit CUDA multibeam application is pointless at this stage:
- CUDA code runs on GPUs, not CPUs; the bitness of the Windows OS & CPU doesn't matter for device code.
- As of newer CUDA releases, 64-bit CPU applications must be linked with 64-bit CUDA device code, which is slower due to doubled pointer sizes.
- 64-bit device code allows the use of the massive amounts of memory available on Tesla cards.
- For the most part, in a very general sense, a lot of optimisation involves reorganising & minimising memory access & reducing the footprint. Actually making use of enough memory to justify 64-bit pointers and address space, on the few cards that can, would be akin to de-optimising ... memory is very slow compared to computation.
- Misconceptions about 64-bitness being faster, I suspect, come from the fact that on CPUs it generally is. In simple terms, the special architecture circuitry & OS layers that handle both at once make 64-bit CPU apps faster running natively ... this is not the case with a 64-bit model on GPUs, largely because they don't run an OS.

A 64-bit CPU host & CUDA device code application would be slower (yes, I have tested this), double the maintenance, & likely be restricted to professional supercompute cards & Fermis.

HTH,
Jason

So... ia64 S@H CUDA app then? ;)
SETI@home classic workunits: 93,865 CPU time: 863,447 hours
Join the [url=http://tinyurl.com/8y46zvu]BP6/VP6 User Group[/url]
Al
Joined: 3 Apr 99
Posts: 1682
Credit: 477,343,364
RAC: 482
United States
Message 1091329 - Posted: 29 Mar 2011, 0:34:29 UTC

Jason, Thank You for the Great answer/explanation. Very informative. You da man :)

jason_gee
Volunteer developer
Volunteer tester
Joined: 24 Nov 06
Posts: 7489
Credit: 91,093,184
RAC: 0
Australia
Message 1091335 - Posted: 29 Mar 2011, 1:04:10 UTC - in response to Message 1091327.  
Last modified: 29 Mar 2011, 2:00:06 UTC

So... ia64 S@H CUDA app then? ;)


Maybe when we all have Tesla cards, and Boinc is changed to allow processing multiple tasks at once through the same application instance :D ... could happen!

[Edit:] BTW I assumed you meant x86_64 aka AMD64 aka EM64T. ia64 is Itanium.
"Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions.
Josef W. Segur
Volunteer developer
Volunteer tester
Joined: 30 Oct 99
Posts: 4504
Credit: 1,414,761
RAC: 0
United States
Message 1091772 - Posted: 30 Mar 2011, 23:37:58 UTC - in response to Message 1091280.  

...
The CUDA application is only 32-bit. Hence Lunatics_Win64v0.37_AP505r409_AKv8bx64_Cudax32f.exe

Just to clarify, that 32 is happenstance since Jason's experimental build branch has gone through x33 and x34 already. But it is true that the released version's full name is "Lunatics_x32f_win32_cuda30_preview.exe" in both the 32 and 64 bit installers.
                                                                    Joe
HAL9000
Volunteer tester
Joined: 11 Sep 99
Posts: 6534
Credit: 196,805,888
RAC: 57
United States
Message 1091786 - Posted: 31 Mar 2011, 1:10:43 UTC - in response to Message 1091335.  

So... ia64 S@H CUDA app then? ;)


Maybe when we all have Tesla cards, and Boinc is changed to allow processing multiple tasks at once through the same application instance :D ... could happen!

[Edit:] BTW I assumed you meant x86_64 aka AMD64 aka EM64T. ia64 is Itanium.

I meant ia64 - Itanium. Mostly just being silly, as ia64 was the losing 64-bit format, with it not supporting x86.

Just to clarify, that 32 is happenstance since Jason's experimental build branch has gone through x33 and x34 already. But it is true that the released version's full name is "Lunatics_x32f_win32_cuda30_preview.exe" in both the 32 and 64 bit installers.

Thanks for pointing that out, Joe. The devs at work use x32/x64 for platform designations so much that I just read x32 as meaning x86 these days.
SETI@home classic workunits: 93,865 CPU time: 863,447 hours
Join the [url=http://tinyurl.com/8y46zvu]BP6/VP6 User Group[/url]