@Pre-FERMI nVidia GPU users: Important warning

Jacob Klein
Volunteer tester
Joined: 15 Apr 11
Posts: 149
Credit: 9,783,406
RAC: 9
United States
Message 1646180 - Posted: 25 Feb 2015, 2:45:56 UTC
Last modified: 25 Feb 2015, 2:47:01 UTC

Oh hell yeah, fellas! We are IN BUSINESS! I have confirmed (via an initial test only) that AstroPulse pulse correction is now being CORRECTLY DETECTED on R340 driver 341.44!!! I'll continue testing, but things are working so far!

Raistmer:
Can you please do some testing on your end, using 341.44, to verify it's working for you?

Richard:
Let's go get some drinks, bud! PS: Can you please confirm some testing on your 9800GT?
ID: 1646180
jason_gee
Volunteer developer
Volunteer tester
Joined: 24 Nov 06
Posts: 7489
Credit: 91,093,184
RAC: 0
Australia
Message 1646232 - Posted: 25 Feb 2015, 5:55:49 UTC - in response to Message 1646180.  
Last modified: 25 Feb 2015, 6:38:26 UTC

Congrats on persisting through tough times, and coming through :D

It's no secret there are a lot of changes to long-term support, ranging from internal nVidia reasons to global MS and Google decisions/changes. Keeping the widest range of backward support open is IMO a win for users and for the companies involved.

Now can you get Google to support OpenCL and/or CUDA on suitably equipped [stock] Nexus devices? ( :P, porting to RenderScript is going to be a pain I could have done without )
"Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions.
ID: 1646232
Richard Haselgrove Project Donor
Volunteer tester
Joined: 4 Jul 99
Posts: 14649
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1646325 - Posted: 25 Feb 2015, 9:34:48 UTC - in response to Message 1646180.  

Jacob Klein wrote:
Richard:
Let's go get some drinks, bud! PS: Can you please confirm some testing on your 9800GT?

Yay! Yes to both. Timezone clash means I've only just seen this - I'll load up as soon as the coffee kicks in.
ID: 1646325
Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 6325
Credit: 106,370,077
RAC: 121
Russia
Message 1646331 - Posted: 25 Feb 2015, 10:05:44 UTC - in response to Message 1646180.  
Last modified: 25 Feb 2015, 10:06:15 UTC

Jacob Klein wrote:
Raistmer:
Can you please do some testing on your end, using 341.44, to verify it's working for you?

Build with lifted ban for 341.xx drivers: https://www.dropbox.com/s/uedoe9qoxnqy6qu/AP7_win_x86_SSE2_OpenCL_NV_r2745.7z?dl=0
ID: 1646331
Jacob Klein
Volunteer tester
Joined: 15 Apr 11
Posts: 149
Credit: 9,783,406
RAC: 9
United States
Message 1646389 - Posted: 25 Feb 2015, 11:29:40 UTC - in response to Message 1646331.  

Jacob Klein wrote:
Raistmer:
Can you please do some testing on your end, using 341.44, to verify it's working for you?

Raistmer wrote:
Build with lifted ban for 341.xx drivers: https://www.dropbox.com/s/uedoe9qoxnqy6qu/AP7_win_x86_SSE2_OpenCL_NV_r2745.7z?dl=0


Raistmer:
You cannot lift the ban for all 341.xx drivers.
You still need to ban any driver newer than 337.88 and older than 341.44.
For example, 341.21 will give bad results.
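
The rule above can be sketched as a simple version check. This is only a minimal Python illustration, not the actual AstroPulse app code; the names `parse_driver_version` and `is_banned` are hypothetical, and the bounds come solely from the versions mentioned in this thread (337.88 and 341.44 good, 341.21 bad).

```python
def parse_driver_version(text):
    """Turn a driver string such as '341.44' into a comparable (major, minor) tuple."""
    major, minor = text.strip().split(".")
    return int(major), int(minor)

# Bounds taken from this thread; both endpoints themselves are allowed.
LAST_GOOD_R337 = parse_driver_version("337.88")
FIRST_FIXED_R340 = parse_driver_version("341.44")

def is_banned(text):
    """True for drivers strictly between 337.88 and 341.44."""
    return LAST_GOOD_R337 < parse_driver_version(text) < FIRST_FIXED_R340

for version in ("337.88", "341.21", "341.44"):
    print(version, "banned" if is_banned(version) else "allowed")
```

Comparing tuples of integers avoids the trap of comparing the version strings as text or as floating-point numbers.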
ID: 1646389
Richard Haselgrove Project Donor
Volunteer tester
Joined: 4 Jul 99
Posts: 14649
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1646392 - Posted: 25 Feb 2015, 11:34:14 UTC

Confirming that tests complete with 9800GT, using Windows 7/64 and matching 64-bit WDDM version of the 341.44 driver.

All six problematic NV OpenCL test/demo apps passed.
Original Astropulse index case (r2690, single_pulses WU) passed and correct pulses found:

With 341.44:
Single pulse: peak_power=38.55 dm=-5720 fft_num=13991936 peak_bin=13999216 scale=2
Single pulse: peak_power=653.6 dm=-5749 fft_num=25804800 peak_bin=25807872 scale=9
Single pulse: peak_power=91.79 dm=-5800 fft_num=3063808 peak_bin=3071328 scale=5
Single pulse: peak_power=63.35 dm=-5803 fft_num=3063808 peak_bin=3071344 scale=4

With 337.88:
Single pulse: peak_power=38.55 dm=-5720 fft_num=13991936 peak_bin=13999216 scale=2
Single pulse: peak_power=653.6 dm=-5749 fft_num=25804800 peak_bin=25807872 scale=9
Single pulse: peak_power=91.79 dm=-5800 fft_num=3063808 peak_bin=3071328 scale=5
Single pulse: peak_power=63.35 dm=-5803 fft_num=3063808 peak_bin=3071344 scale=4

I'd call that a match.
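
For anyone repeating this comparison on longer result files, a by-eye check like the one above could be automated roughly as follows. This is a hedged sketch in Python: the helper names `parse_pulses` and `same_pulses` are made up for illustration, and it assumes each pulse is reported on one line in the `Single pulse: key=value ...` form shown above. The sample text reuses two of the lines from this post.

```python
def parse_pulses(log_text):
    """Collect each 'Single pulse:' line as a dict of its key=value fields (values kept as text)."""
    prefix = "Single pulse:"
    pulses = []
    for line in log_text.splitlines():
        line = line.strip()
        if not line.startswith(prefix):
            continue
        fields = line[len(prefix):].split()
        pulses.append(dict(field.split("=", 1) for field in fields))
    return pulses

def same_pulses(log_a, log_b):
    """True when both logs report the same pulses, in the same order, with identical field text."""
    return parse_pulses(log_a) == parse_pulses(log_b)

sample = """With 341.44:
Single pulse: peak_power=38.55 dm=-5720 fft_num=13991936 peak_bin=13999216 scale=2
Single pulse: peak_power=653.6 dm=-5749 fft_num=25804800 peak_bin=25807872 scale=9
"""

print(len(parse_pulses(sample)), "pulses parsed")
print("match" if same_pulses(sample, sample) else "MISMATCH")
```

This compares the printed text exactly, so it is stricter than a numerical tolerance check would be.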
ID: 1646392
Jacob Klein
Volunteer tester
Joined: 15 Apr 11
Posts: 149
Credit: 9,783,406
RAC: 9
United States
Message 1646395 - Posted: 25 Feb 2015, 11:37:21 UTC
Last modified: 25 Feb 2015, 11:43:28 UTC

AWESOME!

Richard: Also, OpenCL SDK tests 2 and 7 were failing even on R337 drivers, and for me even those tests now pass on R340 341.44. So I believe this means that multiple bugs were likely fixed.
ID: 1646395
Wedge009
Volunteer tester
Joined: 3 Apr 99
Posts: 451
Credit: 431,396,357
RAC: 553
Australia
Message 1646424 - Posted: 25 Feb 2015, 13:01:20 UTC

Great news indeed. Nice to see that pre-Fermi GPUs can still be used constructively.
Soli Deo Gloria
ID: 1646424
Jacob Klein
Volunteer tester
Joined: 15 Apr 11
Posts: 149
Credit: 9,783,406
RAC: 9
United States
Message 1646856 - Posted: 26 Feb 2015, 12:51:03 UTC

My bug ticket has been updated (pasted below).
Richard: Would it be possible for you to verify that my findings in the 3rd message, also apply to your 9800GT? Thanks.



-----------------------------------------
19 February 2015 8:28 pm
JacobKlein
The last driver for my FX 3800M, was 12/5/2014, and did not include the fix. When can we expect a fixed driver to be released?
Thanks,
Jacob Klein
-----------------------------------------
25 February 2015 10:54 pm
Kevin Kang
Hi Jacob
The latest R340 driver version - 341.44 was posted on February 24, 2015, it should contain this fix, please let us know if it works for your FX 3800M GPU. Thanks!
http://www.nvidia.com/download/driverResults.aspx/82388/en-us
Thanks,
Kevin
-----------------------------------------
26 February 2015 4:47 am
JacobKlein
Hi Kevin/Team,
Yes, 341.44 does solve the problems I was having with SDK examples 19, 21, 26, 34, 36, 37 on R340 drivers. And it looks like it also solves the issue with SDK examples 2, 7, which affected R337 and R340. And it also appears to solve the issue in Bug 1554016, though that bug reporter will need to do final verification.
Questions:
1) SDK example 9 still fails with a TDR and output "Out of Memory? - Error # -5 (CL_OUT_OF_RESOURCES) at line 248 , in file .\oclMatVecMul.cpp" --- Is that expected? And can you reproduce that error?
2) SDK example 13 crashes on exit - is that expected?
3) On several examples, my FX3800M output value for "CL_DEVICE_LOCAL_MEM_SIZE" reports "15 KByte", yet in R337 it reported "16 KByte"; which value is correct?
Thank you,
Jacob Klein
ID: 1646856
Richard Haselgrove Project Donor
Volunteer tester
Joined: 4 Jul 99
Posts: 14649
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1646860 - Posted: 26 Feb 2015, 13:38:19 UTC - in response to Message 1646856.  

Jacob Klein wrote:
Richard: Would it be possible for you to verify that my findings in the 3rd message, also apply to your 9800GT? Thanks.

Sure. Could you identify the remaining failure samples by name, please? The download page I got mine from was un-numbered, so #9 and #13 are ambiguous.
ID: 1646860
Jacob Klein
Volunteer tester
Joined: 15 Apr 11
Posts: 149
Credit: 9,783,406
RAC: 9
United States
Message 1646861 - Posted: 26 Feb 2015, 13:47:58 UTC - in response to Message 1646860.  
Last modified: 26 Feb 2015, 13:48:43 UTC

Jacob Klein wrote:
Questions: 1) SDK example 9 still fails with a TDR and output "Out of Memory? - Error # -5 (CL_OUT_OF_RESOURCES) at line 248 , in file .\oclMatVecMul.cpp" --- Is that expected? And can you reproduce that error? 2) SDK example 13 crashes on exit - is that expected? 3) On several examples, my FX3800M output value for "CL_DEVICE_LOCAL_MEM_SIZE" reports "15 KByte", yet in R337 it reported "16 KByte"; which value is correct?

Jacob Klein wrote:
Richard: Would it be possible for you to verify that my findings in the 3rd message, also apply to your 9800GT? Thanks.

Richard Haselgrove wrote:
Sure. Could you identify the remaining failure samples by name, please? The download page I got mine from was un-numbered, so #9 and #13 are ambiguous.


Example that produces the "CL_DEVICE_LOCAL_MEM_SIZE" discrepancy for me: example 22.

So...
SDK example 9: oclMatVecMul
SDK example 13: oclSimpleD3D10Texture
SDK example 22: oclRadixSort
ID: 1646861
Richard Haselgrove Project Donor
Volunteer tester
Joined: 4 Jul 99
Posts: 14649
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1646879 - Posted: 26 Feb 2015, 14:48:44 UTC - in response to Message 1646861.  

SDK example 9: oclMatVecMul

C:\OpenCL tests\oclMatVecMul\NVIDIA GPU Computing SDK\OpenCL\bin\win64\Release\oclMatVecMul.exe Starting...

Determining Matrix height from available GPU mem...
oclGetPlatformID...
clGetDeviceIDs clCreateContext...
clGetDeviceInfo...
Matrix width = 1100
Matrix height = 30504

Allocate and Init Host Mem...

Get the Platform ID...

Get the Device info and select Device...
# of Devices Available = 1
Using Device 0: GeForce 9800 GT
# of Compute Units = 14

clCreateContext...
clCreateCommandQueue...
clCreateBuffer (M, V and W in device global memory, mem_size_m = 134217600)...
oclLoadProgSource (oclMatVecMul.cl)...
clCreateProgramWithSource...
clBuildProgram...
clEnqueueWriteBuffer (M and V)...

Running with Kernel MatVecMulUncoalesced0...

Clear result with clEnqueueWriteBuffer (W)...
clCreateKernel...
Global Work Size = 30720
Local Work Size = 256
# of Work Groups = 120
clSetKernelArg...

clEnqueueNDRangeKernel (MatVecMulUncoalesced0)...
clEnqueueReadBuffer (W)...
Comparing against Host/C++ computation...

GPU Result MATCHES CPU Result within allowable tolerance

Running with Kernel MatVecMulUncoalesced1...

Clear result with clEnqueueWriteBuffer (W)...
clCreateKernel...
Global Work Size = 7168
Local Work Size = 256
# of Work Groups = 28
clSetKernelArg...

clEnqueueNDRangeKernel (MatVecMulUncoalesced1)...
clEnqueueReadBuffer (W)...
Comparing against Host/C++ computation...

GPU Result MATCHES CPU Result within allowable tolerance

Running with Kernel MatVecMulCoalesced0...

Clear result with clEnqueueWriteBuffer (W)...
clCreateKernel...
Global Work Size = 7168
Local Work Size = 256
# of Work Groups = 28
clSetKernelArg...

clEnqueueNDRangeKernel (MatVecMulCoalesced0)...
clEnqueueReadBuffer (W)...
Comparing against Host/C++ computation...

GPU Result MATCHES CPU Result within allowable tolerance

Running with Kernel MatVecMulCoalesced1...

Clear result with clEnqueueWriteBuffer (W)...
clCreateKernel...
Global Work Size = 7168
Local Work Size = 256
# of Work Groups = 28
clSetKernelArg...

clEnqueueNDRangeKernel (MatVecMulCoalesced1)...
clEnqueueReadBuffer (W)...
Comparing against Host/C++ computation...

GPU Result MATCHES CPU Result within allowable tolerance

Running with Kernel MatVecMulCoalesced2...

Clear result with clEnqueueWriteBuffer (W)...
clCreateKernel...
Global Work Size = 7168
Local Work Size = 256
# of Work Groups = 28
clSetKernelArg...

clEnqueueNDRangeKernel (MatVecMulCoalesced2)...
clEnqueueReadBuffer (W)...
Comparing against Host/C++ computation...

GPU Result MATCHES CPU Result within allowable tolerance

Running with Kernel MatVecMulCoalesced3...

Clear result with clEnqueueWriteBuffer (W)...
clCreateKernel...
Global Work Size = 7168
Local Work Size = 256
# of Work Groups = 28
clSetKernelArg...

clEnqueueNDRangeKernel (MatVecMulCoalesced3)...
clEnqueueReadBuffer (W)...
Comparing against Host/C++ computation...

GPU Result MATCHES CPU Result within allowable tolerance

Starting Cleanup...

C:\OpenCL tests\oclMatVecMul\NVIDIA GPU Computing SDK\OpenCL\bin\win64\Release\oclMatVecMul.exe Exiting...
-----------------------------------------------------------
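
A note on the work sizes in the log above: the first kernel's global work size (30720) is the matrix height (30504) rounded up to the next multiple of the local work size (256), which gives 120 work groups. The short Python sketch below only reproduces that arithmetic from the logged numbers; `round_up` is a hypothetical helper, and how the SDK sample computes the value internally is not shown in this thread.

```python
def round_up(value, multiple):
    """Smallest multiple of `multiple` that is greater than or equal to `value`."""
    return ((value + multiple - 1) // multiple) * multiple

matrix_height = 30504    # "Matrix height" from the log
local_work_size = 256    # "Local Work Size" from the log

global_work_size = round_up(matrix_height, local_work_size)
work_groups = global_work_size // local_work_size

print(global_work_size, work_groups)  # 30720 120
```

The later kernels log a global work size of 7168, which at the same local size is 7168 / 256 = 28 work groups.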

SDK example 13: oclSimpleD3D10Texture

oclSimpleD3D10Texture Starting...

simpleD3D10Texture did not detect a D3D10 device, exiting...

Starting Cleanup...

[oclSimpleD3D10Texture] test results...
PASSED
> exiting in 3 seconds: 3...2...1...done!

SDK example 22: oclRadixSort

clGetPlatformID...
clGetDeviceIDs...
clCreateContext...
Create command queue...

CL_DEVICE_NAME: GeForce 9800 GT
CL_DEVICE_VENDOR: NVIDIA Corporation
CL_DRIVER_VERSION: 341.44
CL_DEVICE_VERSION: OpenCL 1.0 CUDA
CL_DEVICE_TYPE: CL_DEVICE_TYPE_GPU
CL_DEVICE_MAX_COMPUTE_UNITS: 14
CL_DEVICE_MAX_WORK_ITEM_DIMENSIONS: 3
CL_DEVICE_MAX_WORK_ITEM_SIZES: 512 / 512 / 64
CL_DEVICE_MAX_WORK_GROUP_SIZE: 512
CL_DEVICE_MAX_CLOCK_FREQUENCY: 1500 MHz
CL_DEVICE_ADDRESS_BITS: 32
CL_DEVICE_MAX_MEM_ALLOC_SIZE: 128 MByte
CL_DEVICE_GLOBAL_MEM_SIZE: 512 MByte
CL_DEVICE_ERROR_CORRECTION_SUPPORT: no
CL_DEVICE_LOCAL_MEM_TYPE: local
CL_DEVICE_LOCAL_MEM_SIZE: 15 KByte
CL_DEVICE_MAX_CONSTANT_BUFFER_SIZE: 64 KByte
CL_DEVICE_QUEUE_PROPERTIES: CL_QUEUE_OUT_OF_ORDER_EXEC_MODE_ENABLE
CL_DEVICE_QUEUE_PROPERTIES: CL_QUEUE_PROFILING_ENABLE
CL_DEVICE_IMAGE_SUPPORT: 1
CL_DEVICE_MAX_READ_IMAGE_ARGS: 128
CL_DEVICE_MAX_WRITE_IMAGE_ARGS: 8
CL_DEVICE_SINGLE_FP_CONFIG: INF-quietNaNs round-to-nearest round-to-zero round-to-inf fma

CL_DEVICE_IMAGE <dim> 2D_MAX_WIDTH 4096
2D_MAX_HEIGHT 16383
3D_MAX_WIDTH 2048
3D_MAX_HEIGHT 2048
3D_MAX_DEPTH 2048

CL_DEVICE_EXTENSIONS: cl_khr_byte_addressable_store
cl_khr_icd
cl_khr_gl_sharing
cl_nv_compiler_options
cl_nv_device_attribute_query
cl_nv_pragma_unroll
cl_nv_d3d9_sharing
cl_nv_d3d10_sharing
cl_khr_d3d10_sharing
cl_nv_d3d11_sharing
cl_nv_copy_opts
cl_khr_global_int32_base_atomics
cl_khr_global_int32_extended_atomics


CL_DEVICE_COMPUTE_CAPABILITY_NV: 1.1
NUMBER OF MULTIPROCESSORS: 14
NUMBER OF CUDA CORES: 112
CL_DEVICE_REGISTERS_PER_BLOCK_NV: 8192
CL_DEVICE_WARP_SIZE_NV: 32
CL_DEVICE_GPU_OVERLAP_NV: CL_TRUE
CL_DEVICE_KERNEL_EXEC_TIMEOUT_NV: CL_TRUE
CL_DEVICE_INTEGRATED_MEMORY_NV: CL_FALSE
CL_DEVICE_PREFERRED_VECTOR_WIDTH_<t> CHAR 1, SHORT 1, INT 1, LONG 1, FLOAT 1, DOUBLE 0


Running Radix Sort on 1 GPU(s) ...
[PASSED]
ID: 1646879
Jacob Klein
Volunteer tester
Joined: 15 Apr 11
Posts: 149
Credit: 9,783,406
RAC: 9
United States
Message 1646881 - Posted: 26 Feb 2015, 14:54:52 UTC - in response to Message 1646879.  
Last modified: 26 Feb 2015, 14:58:18 UTC

So...

Did your SDK example 9 fail with a TDR at all, with output "Out of Memory? - Error # -5 (CL_OUT_OF_RESOURCES) at line 248 , in file .\oclMatVecMul.cpp"? It looks like you had no problem, right?

Did your SDK example 13 crash on exit? Output will not indicate the crash. Actually, it looks like yours did not detect a DirectX 10 device, and exited before it could crash. Mine attempted and crashed and burned.

I see your SDK example 22 shows "15 KByte" for "CL_DEVICE_LOCAL_MEM_SIZE", on 341.44. But did it show "16 KByte" on 337.88?
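
One possible explanation for a "15 KByte" versus "16 KByte" reading, offered purely as an assumption and not confirmed anywhere in this thread: if the sample prints the size in whole KBytes using integer division, then any byte count even slightly below 16384 is displayed as 15. The byte values in this Python sketch are hypothetical, as is the helper name `to_kbyte_label`.

```python
def to_kbyte_label(size_bytes):
    """Format a byte count as whole KBytes, truncating any remainder."""
    return f"{size_bytes // 1024} KByte"

full_size = 16 * 1024          # exactly 16384 bytes
slightly_less = 16 * 1024 - 16 # hypothetical: a few bytes held back by the driver

print(to_kbyte_label(full_size))      # 16 KByte
print(to_kbyte_label(slightly_less))  # 15 KByte
```

Querying the raw byte value of CL_DEVICE_LOCAL_MEM_SIZE through clGetDeviceInfo on both drivers would show whether the limit really changed or only the rounding did.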
ID: 1646881
Richard Haselgrove Project Donor
Volunteer tester
Joined: 4 Jul 99
Posts: 14649
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1646893 - Posted: 26 Feb 2015, 15:46:20 UTC - in response to Message 1646881.  

ex 9: clean result, no errors
ex 13: no crash. Agree non-capable device
ex 22: didn't run under old driver, no comparison readily to hand
ID: 1646893
Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 6325
Credit: 106,370,077
RAC: 121
Russia
Message 1646894 - Posted: 26 Feb 2015, 15:46:40 UTC

ID: 1646894
Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 6325
Credit: 106,370,077
RAC: 121
Russia
Message 1648520 - Posted: 2 Mar 2015, 17:55:48 UTC

Any confirmation that new build works OK with 341.44 driver on pre-FERMI cards?
ID: 1648520
Richard Haselgrove Project Donor
Volunteer tester
Joined: 4 Jul 99
Posts: 14649
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1648526 - Posted: 2 Mar 2015, 18:19:51 UTC - in response to Message 1648520.  

Raistmer wrote:
Any confirmation that new build works OK with 341.44 driver on pre-FERMI cards?

Would you consider this a release candidate? If so, I'll dig the ancient 9800GT out of storage yet again and run the bench. If we're not near a release, I'll leave it a bit later to test any other tinkerings as well - I have to re-rig the supplementary power supply too, so it's all a bit of a hassle.
ID: 1648526
Claggy
Volunteer tester
Joined: 5 Jul 99
Posts: 4654
Credit: 47,537,079
RAC: 4
United Kingdom
Message 1648536 - Posted: 2 Mar 2015, 18:41:03 UTC - in response to Message 1648520.  

Raistmer wrote:
Any confirmation that new build works OK with 341.44 driver on pre-FERMI cards?

My 9800GTX+ is 200 miles away and powered down; the earliest I can try it is Saturday.

Claggy
ID: 1648536
Jacob Klein
Volunteer tester
Joined: 15 Apr 11
Posts: 149
Credit: 9,783,406
RAC: 9
United States
Message 1648641 - Posted: 3 Mar 2015, 1:16:36 UTC

I will gladly test with my FX3800M, if someone provides me with complete instructions on how/what to test.
ID: 1648641
Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 6325
Credit: 106,370,077
RAC: 121
Russia
Message 1648885 - Posted: 3 Mar 2015, 22:54:54 UTC - in response to Message 1648526.  

Raistmer wrote:
Any confirmation that new build works OK with 341.44 driver on pre-FERMI cards?

Richard Haselgrove wrote:
Would you consider this a release candidate? If so, I'll dig the ancient 9800GT out of storage yet again and run the bench. If we're not near a release, I'll leave it a bit later to test any other tinkerings as well - I have to re-rig the supplementary power supply too, so it's all a bit of a hassle.


There have been no updates in the AP tree since the last beta. So yes, if this app works OK, it will be supplied to Eric for upload to beta and to you for inclusion in the installer update.
ID: 1648885
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.