Message boards :
Number crunching :
Astropulse fails on Nvidia Titan
Chris Oliver Send message Joined: 4 Jul 99 Posts: 72 Credit: 134,288,250 RAC: 15 |
Is anybody successfully running any version of Astropulse on their Titan cards? I have tried both the Lunatics and SETI standard versions, but all workunits fail. http://setiathome.berkeley.edu/show_host_detail.php?hostid=6758290 |
Raistmer Send message Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121 |
Is anybody successfully running any version of Astropulse on their Titan cards? Is this specific to the driver (314.14) or to the GPU (TITAN)? That is, does the app work OK with this driver (314.14) on another NV GPU, and does a different driver work on the TITAN GPU? SETI apps news We're not gonna fight them. We're gonna transcend them. |
Claggy Send message Joined: 5 Jul 99 Posts: 4654 Credit: 47,537,079 RAC: 4 |
Can you post the output from CLinfo please? Claggy |
Chris Oliver Send message Joined: 4 Jul 99 Posts: 72 Credit: 134,288,250 RAC: 15 |
Number of platforms: 1
Platform Profile: FULL_PROFILE
Platform Version: OpenCL 1.1 CUDA 4.2.1
Platform Name: NVIDIA CUDA
Platform Vendor: NVIDIA Corporation
Platform Extensions: cl_khr_byte_addressable_store cl_khr_icd cl_khr_gl_sharing cl_nv_d3d9_sharing cl_nv_d3d10_sharing cl_khr_d3d10_sharing cl_nv_d3d11_sharing cl_nv_compiler_options cl_nv_device_attribute_query cl_nv_pragma_unroll
Platform Name: NVIDIA CUDA
Number of devices: 1
Device Type: CL_DEVICE_TYPE_GPU
Device ID: 4318
Max compute units: 14
Max work items dimensions: 3
Max work items[0]: 1024
Max work items[1]: 1024
Max work items[2]: 64
Max work group size: 1024
Preferred vector width char: 1
Preferred vector width short: 1
Preferred vector width int: 1
Preferred vector width long: 1
Preferred vector width float: 1
Preferred vector width double: 1
Max clock frequency: 875Mhz
Address bits: 14757395255531667488
Max memory allocation: 1610530816
Image support: Yes
Max number of images read arguments: 256
Max number of images write arguments: 16
Max image 2D width: 32768
Max image 2D height: 32768
Max image 3D width: 4096
Max image 3D height: 4096
Max image 3D depth: 4096
Max samplers within kernel: 32
Max size of kernel argument: 4352
Alignment (bits) of base address: 4096
Minimum alignment (bytes) for any datatype: 128
Single precision floating point capability
Denorms: Yes
Quiet NaNs: Yes
Round to nearest even: Yes
Round to zero: Yes
Round to +ve and infinity: Yes
IEEE754-2008 fused multiply-add: Yes
Cache type: Read/Write
Cache line size: 128
Cache size: 229376
Global memory size: 6442123264
Constant buffer size: 65536
Max number of constant args: 9
Local memory type: Scratchpad
Local memory size: 49152
Error correction support: 0
Profiling timer resolution: 1000
Device endianess: Little
Available: Yes
Compiler available: Yes
Execution capabilities:
Execute OpenCL kernels: Yes
Execute native function: No
Queue properties:
Out-of-Order: Yes
Profiling: Yes
Platform ID: 02720EC8
Name: GeForce GTX TITAN
Vendor: NVIDIA Corporation
Driver version: 314.14
Profile: FULL_PROFILE
Version: OpenCL 1.1 CUDA
Extensions: cl_khr_byte_addressable_store cl_khr_icd cl_khr_gl_sharing cl_nv_d3d9_sharing cl_nv_d3d10_sharing cl_khr_d3d10_sharing cl_nv_d3d11_sharing cl_nv_compiler_options cl_nv_device_attribute_query cl_nv_pragma_unroll cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_fp64
C:\> |
SonicAgamemnon Send message Joined: 8 Apr 06 Posts: 33 Credit: 30,435,904 RAC: 7 |
This may be related: Could the issue be incorrect processor detection: Fermi versus Kepler? For example, I have a Quadro K5000 (Kepler chip) but I've noticed all the WUs are labeled Fermi. I think the Titan uses a Kepler chip as well. Shouldn't the WUs be formatted for Kepler instead of Fermi, or since the processing uses an open API (OpenCL) perhaps this doesn't matter at all? It's a bit confusing. Here is my system start-up log:
3/13/2013 5:56:23 AM | | Starting BOINC client version 7.0.28 for windows_x86_64
3/13/2013 5:56:23 AM | | log flags: file_xfer, sched_ops, task
3/13/2013 5:56:23 AM | | Libraries: libcurl/7.25.0 OpenSSL/1.0.1 zlib/1.2.6
3/13/2013 5:56:23 AM | | Data directory: C:\ProgramData\BOINC
3/13/2013 5:56:23 AM | | Running under account SonicAgamemnon
3/13/2013 5:56:23 AM | | Processor: 32 GenuineIntel Intel(R) Xeon(R) CPU E5-2670 0 @ 2.60GHz [Family 6 Model 45 Stepping 7]
3/13/2013 5:56:23 AM | | Processor: 256.00 KB cache
3/13/2013 5:56:23 AM | | Processor features: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss htt tm pni ssse3 cx16 sse4_1 sse4_2 syscall nx lm vmx smx tm2 dca popcnt aes pbe
3/13/2013 5:56:23 AM | | OS: Microsoft Windows 7: Professional x64 Edition, Service Pack 1, (06.01.7601.00)
3/13/2013 5:56:23 AM | | Memory: 63.96 GB physical, 127.92 GB virtual
3/13/2013 5:56:23 AM | | Disk: 463.79 GB total, 305.50 GB free
3/13/2013 5:56:23 AM | | Local time is UTC -7 hours
3/13/2013 5:56:23 AM | | VirtualBox version: 4.2.6
3/13/2013 5:56:23 AM | | NVIDIA GPU 0: Quadro K5000 (driver version 311.15, CUDA version 5.0, compute capability 3.0, 4096MB, 8384396MB available, 2167 GFLOPS peak)
3/13/2013 5:56:23 AM | | OpenCL: NVIDIA GPU 0: Quadro K5000 (driver version 311.15, device version OpenCL 1.1 CUDA, 4096MB, 8384396MB available)
3/13/2013 5:56:23 AM | | Config: report completed tasks immediately
3/13/2013 5:56:23 AM | SETI@home | URL http://setiathome.berkeley.edu/; Computer ID 6907411; resource share 100
3/13/2013 5:56:23 AM | SETI@home | General prefs: from SETI@home (last modified 03-Mar-2013 06:11:06)
3/13/2013 5:56:23 AM | SETI@home | Computer location: home
3/13/2013 5:56:23 AM | SETI@home | General prefs: no separate prefs for home; using your defaults
3/13/2013 5:56:23 AM | | Reading preferences override file
3/13/2013 5:56:23 AM | | Preferences:
3/13/2013 5:56:23 AM | | max memory usage when active: 32747.49MB
3/13/2013 5:56:23 AM | | max memory usage when idle: 58945.49MB
3/13/2013 5:56:23 AM | | max disk usage: 25.00GB
3/13/2013 5:56:23 AM | | max download rate: 1022976 bytes/sec
3/13/2013 5:56:23 AM | | max upload rate: 1022976 bytes/sec
3/13/2013 5:56:23 AM | | (to change preferences, visit the web site of an attached project, or select Preferences in the Manager)
3/13/2013 5:56:23 AM | | Not using a proxy
3/13/2013 5:56:23 AM | SETI@home | Restarting task 29no12af.24496.72404.206158430216.10.136_0 using setiathome_enhanced version 603 in slot 13
3/13/2013 5:56:23 AM | SETI@home | Restarting task 29no12af.2280.70359.206158430217.10.54_0 using setiathome_enhanced version 603 in slot 6
3/13/2013 5:56:23 AM | SETI@home | Restarting task 29oc12ab.19785.9474.206158430214.10.192.vlar_1 using setiathome_enhanced version 603 in slot 26
3/13/2013 5:56:23 AM | SETI@home | Restarting task 20oc12ad.7745.93187.206158430219.10.166_2 using setiathome_enhanced version 603 in slot 3
3/13/2013 5:56:23 AM | SETI@home | Restarting task 19se12ae.26599.41934.206158430219.10.250_2 using setiathome_enhanced version 603 in slot 8
3/13/2013 5:56:23 AM | SETI@home | Restarting task 29oc12ab.19785.9474.206158430214.10.212.vlar_1 using setiathome_enhanced version 603 in slot 25
3/13/2013 5:56:23 AM | SETI@home | Restarting task 29oc12ab.19785.9474.206158430214.10.232.vlar_0 using setiathome_enhanced version 603 in slot 22
3/13/2013 5:56:23 AM | SETI@home | Restarting task 29oc12ab.19785.9474.206158430214.10.89.vlar_1 using setiathome_enhanced version 603 in slot 19
3/13/2013 5:56:23 AM | SETI@home | Restarting task 29oc12ab.19785.9474.206158430214.10.206.vlar_1 using setiathome_enhanced version 603 in slot 24
3/13/2013 5:56:23 AM | SETI@home | Restarting task 29oc12ab.19785.9474.206158430214.10.73.vlar_0 using setiathome_enhanced version 603 in slot 11
3/13/2013 5:56:23 AM | SETI@home | Restarting task 29oc12ab.19785.9474.206158430214.10.83.vlar_1 using setiathome_enhanced version 603 in slot 29
3/13/2013 5:56:23 AM | SETI@home | Restarting task 29no12af.9714.70768.206158430218.10.78_1 using setiathome_enhanced version 603 in slot 32
3/13/2013 5:56:23 AM | SETI@home | Restarting task 29no12af.9714.70768.206158430218.10.84_0 using setiathome_enhanced version 603 in slot 4
3/13/2013 5:56:23 AM | SETI@home | Restarting task 29oc12ab.19785.13155.206158430214.10.235.vlar_1 using setiathome_enhanced version 603 in slot 30
3/13/2013 5:56:23 AM | SETI@home | Restarting task 29no12af.9714.70768.206158430218.10.12_0 using setiathome_enhanced version 603 in slot 27
3/13/2013 5:56:23 AM | SETI@home | Restarting task 29no12af.9714.70768.206158430218.10.72_0 using setiathome_enhanced version 603 in slot 9
3/13/2013 5:56:23 AM | SETI@home | Restarting task 29no12af.18827.480.206158430219.10.57_0 using setiathome_enhanced version 603 in slot 15
3/13/2013 5:56:23 AM | SETI@home | Restarting task 29oc12ab.19785.19290.206158430214.10.148.vlar_1 using setiathome_enhanced version 603 in slot 16
3/13/2013 5:56:23 AM | SETI@home | Restarting task 29oc12ab.19785.24198.206158430214.10.40_0 using setiathome_enhanced version 603 in slot 17
3/13/2013 5:56:23 AM | SETI@home | Restarting task 29oc12ab.19785.24198.206158430214.10.28_0 using setiathome_enhanced version 603 in slot 20
3/13/2013 5:56:23 AM | SETI@home | Restarting task 29oc12ab.19785.24198.206158430214.10.22_1 using setiathome_enhanced version 603 in slot 12
3/13/2013 5:56:23 AM | SETI@home | Restarting task 29oc12ab.19785.24198.206158430214.10.34_0 using setiathome_enhanced version 603 in slot 18
3/13/2013 5:56:23 AM | SETI@home | Restarting task 29oc12ab.19785.24198.206158430214.10.81_1 using setiathome_enhanced version 603 in slot 1
3/13/2013 5:56:23 AM | SETI@home | Restarting task 29no12af.1851.889.206158430221.10.26.vlar_1 using setiathome_enhanced version 603 in slot 10
3/13/2013 5:56:23 AM | SETI@home | Restarting task 29oc12ab.6357.67.206158430217.10.126.vlar_1 using setiathome_enhanced version 603 in slot 23
3/13/2013 5:56:23 AM | SETI@home | Restarting task 29oc12ab.6357.67.206158430217.10.157.vlar_1 using setiathome_enhanced version 603 in slot 21
3/13/2013 5:56:23 AM | SETI@home | Restarting task 29no12af.1851.889.206158430221.10.39.vlar_0 using setiathome_enhanced version 603 in slot 0
3/13/2013 5:56:23 AM | SETI@home | Restarting task 29oc12ab.6357.67.206158430217.10.145.vlar_1 using setiathome_enhanced version 603 in slot 5
3/13/2013 5:56:23 AM | SETI@home | Restarting task 29oc12ab.6357.67.206158430217.10.57.vlar_1 using setiathome_enhanced version 603 in slot 14
3/13/2013 5:56:23 AM | SETI@home | Restarting task 29oc12ab.6357.67.206158430217.10.151.vlar_0 using setiathome_enhanced version 603 in slot 28
3/13/2013 5:56:23 AM | SETI@home | Restarting task 29no12af.1851.889.206158430221.10.41.vlar_1 using setiathome_enhanced version 603 in slot 2
3/13/2013 5:56:23 AM | SETI@home | Restarting task 29oc12ab.6357.67.206158430217.10.163.vlar_0 using setiathome_enhanced version 603 in slot 7
3/13/2013 5:56:23 AM | SETI@home | Restarting task 20oc12ad.7745.93187.206158430219.10.140_2 using setiathome_enhanced version 610 (cuda_fermi) in slot 31
"History is a pack of lies about events that never happened told by people who weren't there." - Santayana |
Claggy Send message Joined: 5 Jul 99 Posts: 4654 Credit: 47,537,079 RAC: 4 |
This may be related: Could the issue be incorrect processor detection: Fermi versus Kepler? That doesn't matter; it's just a plan_class, and the same app will still be used. Claggy |
Richard Haselgrove Send message Joined: 4 Jul 99 Posts: 14650 Credit: 200,643,578 RAC: 874 |
This may be related: Could the issue be incorrect processor detection: Fermi versus Kepler? And you are strongly advised never to change the plan class label unilaterally. It won't help processing in the slightest, and it is likely to cause you unexpected problems (like the loss of cached work) when you come to upgrade applications in the future. |
Cliff Harding Send message Joined: 18 Aug 99 Posts: 1432 Credit: 110,967,840 RAC: 67 |
Have you added the CUDA_GRID_SIZE_COMPAT to your system environment yet? http://setiathome.berkeley.edu/forum_thread.php?id=69735#1296126 I don't buy computers, I build them!! |
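For anyone who hasn't tried that workaround yet: CUDA_GRID_SIZE_COMPAT is an ordinary system environment variable, so on Windows it can be set from the command line. The value 1 below is an assumption based on the setting suggested in the linked thread; check that thread for the exact value, and restart BOINC afterwards so the apps pick it up.

```bat
:: From an elevated Command Prompt; /M makes the variable system-wide
:: rather than per-user. Restart BOINC after setting it.
setx CUDA_GRID_SIZE_COMPAT 1 /M
```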
Claggy Send message Joined: 5 Jul 99 Posts: 4654 Credit: 47,537,079 RAC: 4 |
Have you added the CUDA_GRID_SIZE_COMPAT to you system environment yet? Yes, he has: http://lunatics.kwsn.net/12-gpu-crunching/nvidia-titan-error-on-astropulse.msg51553.html#msg51553 Claggy |
ExchangeMan Send message Joined: 9 Jan 00 Posts: 115 Credit: 157,719,104 RAC: 0 |
Is anybody successfully running any version of Astropulse on their Titan cards? Hi Chris, I also have a Titan and am experiencing lots of problems running AP on it. I keep getting a -4 allocation error. The app just crashes and restarts with the same error again; you either have to suspend the job or abort it to stop it. You're using a newer driver than I am: I'm using 314.09, which is the first driver released with the Titan. I've been working with Raistmer and trying different things, but so far to no avail. I've tried a fresh rebuild of Windows, reverting to 7.0.28 of BOINC, changing various parameters in the command line file, various versions of the AP application, etc. I thought it was just me or my configuration, but apparently I'm not alone. I was going to try driver 314.14, but it looks like that's what you're running. The interesting thing is, I actually had a successful run using r1761 but can't get it to run anymore no matter what I do. I believe it just happened once, maybe twice. It kept getting the -4 error over and over, then magically, for no explainable reason, something happened and the job ran. Here's a link to the work unit: http://setiathome.berkeley.edu/result.php?resultid=2855921645 |
jason_gee Send message Joined: 24 Nov 06 Posts: 7489 Credit: 91,093,184 RAC: 0 |
Looking through various error tasks to find an AP OpenCL one, I found this example: http://setiathome.berkeley.edu/result.php?resultid=2877559084 which I notice has restarted many times, choking on: ERROR: clEnqueueNDRangeKernel: GPU_coadd_kernel_cl: -4 which is: #define CL_MEM_OBJECT_ALLOCATION_FAILURE -4 The only thing that comes to mind there is that perhaps Raistmer is using a lot of shared memory (in CUDA terminology), which might make sense for coadds, so the extra SMX clusters per multiprocessor put some size out of bounds. I'm not familiar with the existing options for these applications, but if there is one to reduce threads per block (or whatever the equivalent OpenCL terminology is), or to switch the Kepler cache mode to 'preferShared' from the default preferNone, then that may help (for debugging/test builds etc). "Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to Live By: The Computer Science of Human Decisions. |
Chris Oliver Send message Joined: 4 Jul 99 Posts: 72 Credit: 134,288,250 RAC: 15 |
Hi ExchangeMan, It sounds like we are experiencing exactly the same issue. It would be interesting to know whether this issue occurs with the Teslas, as the Titan shares a lot of the internals. |
ExchangeMan Send message Joined: 9 Jan 00 Posts: 115 Credit: 157,719,104 RAC: 0 |
Hi ExchangeMan, Ya, that question about AP and Tesla would be interesting. I would love to have a Tesla card to play with, but from the little I've read about it, MB runs as well or better on a 690 for a lot less money. Perhaps the source code for the SETI apps would need to be tweaked somehow to take advantage of Tesla, and maybe that's true a little bit here on the Titan - I'm not a SETI developer, so I might just be talking out of you know what. |
Paul Bowyer Send message Joined: 15 Aug 99 Posts: 11 Credit: 137,603,890 RAC: 0 |
I installed the latest Nvidia beta driver - 314.21 - and just completed 4 APs. Using r1761 and app_info set to:
<avg_ncpus>1</avg_ncpus>
<count>1</count>
processpriority = abovenormal
pfblockspersm = 16
pfperiodsperlaunch = 250
GPU load was at 9% running just 1 WU, but I'm going to try running 2. |
Raistmer Send message Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121 |
I installed the latest Nvidia beta driver - 314.21 - and just completed 4 AP's. I'm not sure about the listed params; it looks like you listed params for the CUDA MB app. You used these params for the AP app:
DATA_CHUNK_UNROLL set to: 12
FFA thread block override value: 8192
FFA thread fetchblock override value: 4096
Perhaps it is indeed a driver issue, since your host has a few tasks that finished OK and look good with 314.21. Could others update to this driver version? Regarding shared memory overflow: AP was developed to be HD4xxx compatible, so no shared memory. I need to check the code, but as far as I can recall I did not add any "HD5" kernels to it since initial development. There is some additional potential for further optimization here, of course, but all in its own time. SETI apps news We're not gonna fight them. We're gonna transcend them. |
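For anyone wanting to reproduce that AP tuning: on the Lunatics OpenCL AP builds, values like these are normally supplied through the AP command-line file that sits next to the app in the BOINC project directory. The file name and switch spellings below are my assumption from the r1761 package conventions; check the ReadMe shipped with your build before copying them.

```
-unroll 12 -ffa_block 8192 -ffa_block_fetch 4096
```

With -unroll controlling the data chunk unroll factor and the two -ffa_* switches setting the FFA thread block and fetch-block overrides reported in the stderr output above.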
Paul Bowyer Send message Joined: 15 Aug 99 Posts: 11 Credit: 137,603,890 RAC: 0 |
I'm not sure about listed params. Looks like you listed params for CUDA MB app. Oops. I was all over the place today, trying to tweak MBs as well. |
Chris Oliver Send message Joined: 4 Jul 99 Posts: 72 Credit: 134,288,250 RAC: 15 |
Indeed, 314.21 (beta) is allowing Astropulse to run. All looks good so far. |
Claggy Send message Joined: 5 Jul 99 Posts: 4654 Credit: 47,537,079 RAC: 4 |
Indeed 314.21(beta) is allowing Astropulse. All looks good so far. What does the CLinfo output look like on this driver? Claggy |
Richard Haselgrove Send message Joined: 4 Jul 99 Posts: 14650 Credit: 200,643,578 RAC: 874 |
There is mention of a specific OpenCL fix for TITAN in the 314.21 beta driver release notes: [GeForce GTX TITAN]: Application crashes or a TDR occurs when running [...] |
ExchangeMan Send message Joined: 9 Jan 00 Posts: 115 Credit: 157,719,104 RAC: 0 |
I've upgraded to the beta driver and am seeing success also. My first job aborted, but that was because I was fiddling around in the slots directory. Other than that, so far so good. BTW, I increased unroll to 14, ffa_block to 16384 and ffa_block_fetch to 8192. Don't know if that will make things run faster, but according to V4 Precision X the GPU usage is running over 90% most of the time - and that's with only a single job running. |
©2024 University of California
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.