Astropulse fails on Nvidia Titan


Message boards : Number crunching : Astropulse fails on Nvidia Titan

Chris Oliver
Joined: 4 Jul 99
Posts: 39
Credit: 53,263,809
RAC: 31,759
United Kingdom
Message 1347187 - Posted: 16 Mar 2013, 6:24:31 UTC

Is anybody successfully running any version of Astropulse on their Titan cards?

I have tried both the Lunatics build and SETI's standard version, but all workunits fail.

http://setiathome.berkeley.edu/show_host_detail.php?hostid=6758290
____________

Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 3504
Credit: 47,772,147
RAC: 46,739
Russia
Message 1347279 - Posted: 16 Mar 2013, 12:18:38 UTC - in response to Message 1347187.



Is this specific to the driver (314.14) or to the GPU (TITAN)?
That is, does the app work OK with this driver (314.14) on another NVIDIA GPU, and what happens with a different driver on the TITAN?

____________

Claggy
Project donor
Volunteer tester
Joined: 5 Jul 99
Posts: 4141
Credit: 33,634,155
RAC: 28,283
United Kingdom
Message 1347283 - Posted: 16 Mar 2013, 12:24:38 UTC - in response to Message 1347187.

Can you post the output from CLinfo, please?

http://boinc.berkeley.edu/dl/clinfo.zip

Claggy

Chris Oliver
Joined: 4 Jul 99
Posts: 39
Credit: 53,263,809
RAC: 31,759
United Kingdom
Message 1347291 - Posted: 16 Mar 2013, 12:52:04 UTC - in response to Message 1347283.


Number of platforms: 1
Platform Profile: FULL_PROFILE
Platform Version: OpenCL 1.1 CUDA 4.2.1
Platform Name: NVIDIA CUDA
Platform Vendor: NVIDIA Corporation
Platform Extensions: cl_khr_byte_addressable_store cl_khr_icd cl_khr_gl_sharing cl_nv_d3d9_sharing cl_nv_d3d10_sharing cl_khr_d3d10_sharing cl_nv_d3d11_sharing cl_nv_compiler_options cl_nv_device_attribute_query cl_nv_pragma_unroll


Platform Name: NVIDIA CUDA
Number of devices: 1
Device Type: CL_DEVICE_TYPE_GPU
Device ID: 4318
Max compute units: 14
Max work items dimensions: 3
Max work items[0]: 1024
Max work items[1]: 1024
Max work items[2]: 64
Max work group size: 1024
Preferred vector width char: 1
Preferred vector width short: 1
Preferred vector width int: 1
Preferred vector width long: 1
Preferred vector width float: 1
Preferred vector width double: 1
Max clock frequency: 875Mhz
Address bits: 14757395255531667488
Max memory allocation: 1610530816
Image support: Yes
Max number of images read arguments: 256
Max number of images write arguments: 16
Max image 2D width: 32768
Max image 2D height: 32768
Max image 3D width: 4096
Max image 3D height: 4096
Max image 3D depth: 4096
Max samplers within kernel: 32
Max size of kernel argument: 4352
Alignment (bits) of base address: 4096
Minimum alignment (bytes) for any datatype: 128
Single precision floating point capability
Denorms: Yes
Quiet NaNs: Yes
Round to nearest even: Yes
Round to zero: Yes
Round to +ve and infinity: Yes
IEEE754-2008 fused multiply-add: Yes
Cache type: Read/Write
Cache line size: 128
Cache size: 229376
Global memory size: 6442123264
Constant buffer size: 65536
Max number of constant args: 9
Local memory type: Scratchpad
Local memory size: 49152
Error correction support: 0
Profiling timer resolution: 1000
Device endianess: Little
Available: Yes
Compiler available: Yes
Execution capabilities:
Execute OpenCL kernels: Yes
Execute native function: No
Queue properties:
Out-of-Order: Yes
Profiling : Yes
Platform ID: 02720EC8
Name: GeForce GTX TITAN
Vendor: NVIDIA Corporation
Driver version: 314.14
Profile: FULL_PROFILE
Version: OpenCL 1.1 CUDA
Extensions: cl_khr_byte_addressable_store cl_khr_icd cl_khr_gl_sharing cl_nv_d3d9_sharing cl_nv_d3d10_sharing cl_khr_d3d10_sharing cl_nv_d3d11_sharing cl_nv_compiler_options cl_nv_device_attribute_query cl_nv_pragma_unroll cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_fp64



C:\>
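
For anyone comparing CLinfo dumps between driver versions, a small script can pull out the memory-related fields. This is only a sketch: it assumes the plain "Key: Value" layout shown above, and the names `parse_clinfo` and `to_mib` are made up for illustration; they are not part of CLinfo or BOINC.

```python
def parse_clinfo(text):
    """Collect 'Key: Value' pairs from CLinfo-style output into a dict.

    Later occurrences of a key overwrite earlier ones, so with a single
    device the device-level values win over the platform-level ones.
    """
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep and value.strip():
            fields[key.strip()] = value.strip()
    return fields


def to_mib(byte_count):
    """Convert a byte count (string or int) to mebibytes."""
    return int(byte_count) / (1024 * 1024)


# A few lines copied from the dump above.
sample = """\
Max compute units: 14
Max memory allocation: 1610530816
Global memory size: 6442123264
Local memory size: 49152
Name: GeForce GTX TITAN
Driver version: 314.14
"""

info = parse_clinfo(sample)
print(info["Name"], "driver", info["Driver version"])
print("Global memory (MiB):", round(to_mib(info["Global memory size"])))
print("Max single allocation (MiB):", round(to_mib(info["Max memory allocation"])))
print("Local memory (KiB):", int(info["Local memory size"]) // 1024)
```

With the values above, this reports roughly 6144 MiB of global memory, a 1536 MiB maximum single allocation (a quarter of global memory), and 48 KiB of local memory.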






____________

SonicAgamemnon
Joined: 8 Apr 06
Posts: 31
Credit: 11,893,945
RAC: 0
United States
Message 1347295 - Posted: 16 Mar 2013, 12:59:58 UTC
Last modified: 16 Mar 2013, 13:00:20 UTC

This may be related: Could the issue be incorrect processor detection-- Fermi versus Kepler?

For example, I have a Quadro K5000 (Kepler chip) but I've noticed all the WUs are labeled Fermi. I think the Titan uses a Kepler chip as well. Shouldn't the WUs be formatted for Kepler instead of Fermi, or since the processing uses an open API (OpenCL) perhaps this doesn't matter at all? It's a bit confusing.

Here is my system start-up log:

3/13/2013 5:56:23 AM | | Starting BOINC client version 7.0.28 for windows_x86_64
3/13/2013 5:56:23 AM | | log flags: file_xfer, sched_ops, task
3/13/2013 5:56:23 AM | | Libraries: libcurl/7.25.0 OpenSSL/1.0.1 zlib/1.2.6
3/13/2013 5:56:23 AM | | Data directory: C:\ProgramData\BOINC
3/13/2013 5:56:23 AM | | Running under account SonicAgamemnon
3/13/2013 5:56:23 AM | | Processor: 32 GenuineIntel Intel(R) Xeon(R) CPU E5-2670 0 @ 2.60GHz [Family 6 Model 45 Stepping 7]
3/13/2013 5:56:23 AM | | Processor: 256.00 KB cache
3/13/2013 5:56:23 AM | | Processor features: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss htt tm pni ssse3 cx16 sse4_1 sse4_2 syscall nx lm vmx smx tm2 dca popcnt aes pbe
3/13/2013 5:56:23 AM | | OS: Microsoft Windows 7: Professional x64 Edition, Service Pack 1, (06.01.7601.00)
3/13/2013 5:56:23 AM | | Memory: 63.96 GB physical, 127.92 GB virtual
3/13/2013 5:56:23 AM | | Disk: 463.79 GB total, 305.50 GB free
3/13/2013 5:56:23 AM | | Local time is UTC -7 hours
3/13/2013 5:56:23 AM | | VirtualBox version: 4.2.6
3/13/2013 5:56:23 AM | | NVIDIA GPU 0: Quadro K5000 (driver version 311.15, CUDA version 5.0, compute capability 3.0, 4096MB, 8384396MB available, 2167 GFLOPS peak)
3/13/2013 5:56:23 AM | | OpenCL: NVIDIA GPU 0: Quadro K5000 (driver version 311.15, device version OpenCL 1.1 CUDA, 4096MB, 8384396MB available)
3/13/2013 5:56:23 AM | | Config: report completed tasks immediately
3/13/2013 5:56:23 AM | SETI@home | URL http://setiathome.berkeley.edu/; Computer ID 6907411; resource share 100
3/13/2013 5:56:23 AM | SETI@home | General prefs: from SETI@home (last modified 03-Mar-2013 06:11:06)
3/13/2013 5:56:23 AM | SETI@home | Computer location: home
3/13/2013 5:56:23 AM | SETI@home | General prefs: no separate prefs for home; using your defaults
3/13/2013 5:56:23 AM | | Reading preferences override file
3/13/2013 5:56:23 AM | | Preferences:
3/13/2013 5:56:23 AM | | max memory usage when active: 32747.49MB
3/13/2013 5:56:23 AM | | max memory usage when idle: 58945.49MB
3/13/2013 5:56:23 AM | | max disk usage: 25.00GB
3/13/2013 5:56:23 AM | | max download rate: 1022976 bytes/sec
3/13/2013 5:56:23 AM | | max upload rate: 1022976 bytes/sec
3/13/2013 5:56:23 AM | | (to change preferences, visit the web site of an attached project, or select Preferences in the Manager)
3/13/2013 5:56:23 AM | | Not using a proxy
3/13/2013 5:56:23 AM | SETI@home | Restarting task 29no12af.24496.72404.206158430216.10.136_0 using setiathome_enhanced version 603 in slot 13
3/13/2013 5:56:23 AM | SETI@home | Restarting task 29no12af.2280.70359.206158430217.10.54_0 using setiathome_enhanced version 603 in slot 6
3/13/2013 5:56:23 AM | SETI@home | Restarting task 29oc12ab.19785.9474.206158430214.10.192.vlar_1 using setiathome_enhanced version 603 in slot 26
3/13/2013 5:56:23 AM | SETI@home | Restarting task 20oc12ad.7745.93187.206158430219.10.166_2 using setiathome_enhanced version 603 in slot 3
3/13/2013 5:56:23 AM | SETI@home | Restarting task 19se12ae.26599.41934.206158430219.10.250_2 using setiathome_enhanced version 603 in slot 8
3/13/2013 5:56:23 AM | SETI@home | Restarting task 29oc12ab.19785.9474.206158430214.10.212.vlar_1 using setiathome_enhanced version 603 in slot 25
3/13/2013 5:56:23 AM | SETI@home | Restarting task 29oc12ab.19785.9474.206158430214.10.232.vlar_0 using setiathome_enhanced version 603 in slot 22
3/13/2013 5:56:23 AM | SETI@home | Restarting task 29oc12ab.19785.9474.206158430214.10.89.vlar_1 using setiathome_enhanced version 603 in slot 19
3/13/2013 5:56:23 AM | SETI@home | Restarting task 29oc12ab.19785.9474.206158430214.10.206.vlar_1 using setiathome_enhanced version 603 in slot 24
3/13/2013 5:56:23 AM | SETI@home | Restarting task 29oc12ab.19785.9474.206158430214.10.73.vlar_0 using setiathome_enhanced version 603 in slot 11
3/13/2013 5:56:23 AM | SETI@home | Restarting task 29oc12ab.19785.9474.206158430214.10.83.vlar_1 using setiathome_enhanced version 603 in slot 29
3/13/2013 5:56:23 AM | SETI@home | Restarting task 29no12af.9714.70768.206158430218.10.78_1 using setiathome_enhanced version 603 in slot 32
3/13/2013 5:56:23 AM | SETI@home | Restarting task 29no12af.9714.70768.206158430218.10.84_0 using setiathome_enhanced version 603 in slot 4
3/13/2013 5:56:23 AM | SETI@home | Restarting task 29oc12ab.19785.13155.206158430214.10.235.vlar_1 using setiathome_enhanced version 603 in slot 30
3/13/2013 5:56:23 AM | SETI@home | Restarting task 29no12af.9714.70768.206158430218.10.12_0 using setiathome_enhanced version 603 in slot 27
3/13/2013 5:56:23 AM | SETI@home | Restarting task 29no12af.9714.70768.206158430218.10.72_0 using setiathome_enhanced version 603 in slot 9
3/13/2013 5:56:23 AM | SETI@home | Restarting task 29no12af.18827.480.206158430219.10.57_0 using setiathome_enhanced version 603 in slot 15
3/13/2013 5:56:23 AM | SETI@home | Restarting task 29oc12ab.19785.19290.206158430214.10.148.vlar_1 using setiathome_enhanced version 603 in slot 16
3/13/2013 5:56:23 AM | SETI@home | Restarting task 29oc12ab.19785.24198.206158430214.10.40_0 using setiathome_enhanced version 603 in slot 17
3/13/2013 5:56:23 AM | SETI@home | Restarting task 29oc12ab.19785.24198.206158430214.10.28_0 using setiathome_enhanced version 603 in slot 20
3/13/2013 5:56:23 AM | SETI@home | Restarting task 29oc12ab.19785.24198.206158430214.10.22_1 using setiathome_enhanced version 603 in slot 12
3/13/2013 5:56:23 AM | SETI@home | Restarting task 29oc12ab.19785.24198.206158430214.10.34_0 using setiathome_enhanced version 603 in slot 18
3/13/2013 5:56:23 AM | SETI@home | Restarting task 29oc12ab.19785.24198.206158430214.10.81_1 using setiathome_enhanced version 603 in slot 1
3/13/2013 5:56:23 AM | SETI@home | Restarting task 29no12af.1851.889.206158430221.10.26.vlar_1 using setiathome_enhanced version 603 in slot 10
3/13/2013 5:56:23 AM | SETI@home | Restarting task 29oc12ab.6357.67.206158430217.10.126.vlar_1 using setiathome_enhanced version 603 in slot 23
3/13/2013 5:56:23 AM | SETI@home | Restarting task 29oc12ab.6357.67.206158430217.10.157.vlar_1 using setiathome_enhanced version 603 in slot 21
3/13/2013 5:56:23 AM | SETI@home | Restarting task 29no12af.1851.889.206158430221.10.39.vlar_0 using setiathome_enhanced version 603 in slot 0
3/13/2013 5:56:23 AM | SETI@home | Restarting task 29oc12ab.6357.67.206158430217.10.145.vlar_1 using setiathome_enhanced version 603 in slot 5
3/13/2013 5:56:23 AM | SETI@home | Restarting task 29oc12ab.6357.67.206158430217.10.57.vlar_1 using setiathome_enhanced version 603 in slot 14
3/13/2013 5:56:23 AM | SETI@home | Restarting task 29oc12ab.6357.67.206158430217.10.151.vlar_0 using setiathome_enhanced version 603 in slot 28
3/13/2013 5:56:23 AM | SETI@home | Restarting task 29no12af.1851.889.206158430221.10.41.vlar_1 using setiathome_enhanced version 603 in slot 2
3/13/2013 5:56:23 AM | SETI@home | Restarting task 29oc12ab.6357.67.206158430217.10.163.vlar_0 using setiathome_enhanced version 603 in slot 7
3/13/2013 5:56:23 AM | SETI@home | Restarting task 20oc12ad.7745.93187.206158430219.10.140_2 using setiathome_enhanced version 610 (cuda_fermi) in slot 31
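
To see at a glance which application versions and plan classes a log like this contains, the "Restarting task" lines can be tallied with a short script. This is a rough sketch that assumes the exact message wording shown above; `TASK_RE` and `count_by_plan_class` are illustrative names, not anything provided by BOINC.

```python
import re
from collections import Counter

# Matches messages such as:
#   Restarting task NAME using APP version 610 (cuda_fermi) in slot 31
# The "(plan_class)" part is optional, since some lines above carry none.
TASK_RE = re.compile(
    r"Restarting task (?P<task>\S+) using (?P<app>\S+) "
    r"version (?P<version>\d+)(?: \((?P<plan_class>[^)]+)\))? in slot (?P<slot>\d+)"
)


def count_by_plan_class(log_text):
    """Count restarted tasks per (app, version, plan_class)."""
    counts = Counter()
    for line in log_text.splitlines():
        match = TASK_RE.search(line)
        if match:
            plan_class = match.group("plan_class") or "(no plan class)"
            key = (match.group("app"), match.group("version"), plan_class)
            counts[key] += 1
    return counts


# Two lines copied from the log above.
sample = "\n".join([
    "3/13/2013 5:56:23 AM | SETI@home | Restarting task 29oc12ab.6357.67.206158430217.10.163.vlar_0 using setiathome_enhanced version 603 in slot 7",
    "3/13/2013 5:56:23 AM | SETI@home | Restarting task 20oc12ad.7745.93187.206158430219.10.140_2 using setiathome_enhanced version 610 (cuda_fermi) in slot 31",
])

counts = count_by_plan_class(sample)
for (app, version, plan_class), n in sorted(counts.items()):
    print(f"{app} v{version} {plan_class}: {n}")
```

In the full log above, only the last task (slot 31) carries the cuda_fermi plan class; the others are version 603 with no plan class shown.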
____________
"History is a pack of lies about events that never happened told by people who weren't there." - Santayana

Claggy
Project donor
Volunteer tester
Joined: 5 Jul 99
Posts: 4141
Credit: 33,634,155
RAC: 28,283
United Kingdom
Message 1347297 - Posted: 16 Mar 2013, 13:06:04 UTC - in response to Message 1347295.


That doesn't matter; it's just a plan_class, and the same app will still be used.

Claggy

Richard Haselgrove
Project donor
Volunteer tester
Joined: 4 Jul 99
Posts: 8634
Credit: 51,617,393
RAC: 48,807
United Kingdom
Message 1347313 - Posted: 16 Mar 2013, 14:00:31 UTC - in response to Message 1347297.


And you are strongly advised never to change the plan class label unilaterally. It won't help processing in the slightest, and it is likely to cause you unexpected problems (like, the loss of cached work) when you come to upgrade applications in the future.

Cliff Harding
Project donor
Volunteer tester
Joined: 18 Aug 99
Posts: 1009
Credit: 53,050,513
RAC: 43,107
United States
Message 1347353 - Posted: 16 Mar 2013, 16:47:49 UTC

Have you added CUDA_GRID_SIZE_COMPAT to your system environment yet?

http://setiathome.berkeley.edu/forum_thread.php?id=69735#1296126
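
A quick way to confirm whether the variable is visible to a process is to read it back. The sketch below only reports what the Python process itself sees; `grid_compat_status` is a made-up helper name, and the value to use is whatever the linked thread recommends (none is assumed here).

```python
import os


def grid_compat_status(env=None):
    """Report whether CUDA_GRID_SIZE_COMPAT is present in an environment mapping."""
    env = os.environ if env is None else env
    value = env.get("CUDA_GRID_SIZE_COMPAT")
    if value is None:
        return "CUDA_GRID_SIZE_COMPAT is not set"
    return f"CUDA_GRID_SIZE_COMPAT={value}"


print(grid_compat_status())
```

Note that a process inherits its environment when it starts, so after changing a system variable BOINC has to be restarted before the science apps can see it.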
____________


I don't buy computers, I build them!!

Claggy
Project donor
Volunteer tester
Joined: 5 Jul 99
Posts: 4141
Credit: 33,634,155
RAC: 28,283
United Kingdom
Message 1347355 - Posted: 16 Mar 2013, 16:51:05 UTC - in response to Message 1347353.


Yes, he has:

http://lunatics.kwsn.net/12-gpu-crunching/nvidia-titan-error-on-astropulse.msg51553.html#msg51553

Claggy

ExchangeMan
Volunteer tester
Joined: 9 Jan 00
Posts: 113
Credit: 143,449,717
RAC: 204,927
United States
Message 1347358 - Posted: 16 Mar 2013, 16:59:51 UTC - in response to Message 1347187.


Hi Chris,

I also have a Titan and am experiencing lots of problems running AP on it. I keep getting a -4 allocation error. The app just crashes and restarts with the same error again. You either have to suspend the job or abort it to stop it. You're using a newer driver than I am. I'm using 314.09 which is the first driver released with Titan.

I've been working with Raistmer and trying different things, but so far to no avail.

I've tried a fresh rebuild of Windows, reverting to BOINC 7.0.28, changing various parameters in the command line file, trying various versions of the AP application, etc.

I thought it was just me or my configuration, but apparently I'm not alone. I was going to try driver 314.14, but it looks like that's what you're running.

The interesting thing is, I actually had a successful run using R1761 but can't get it to run anymore no matter what I do. I believe it just happened once, maybe twice. It kept getting the -4 error over and over, then magically for no explainable reason something happened and the job ran. Here's a link to the work unit.

http://setiathome.berkeley.edu/result.php?resultid=2855921645

____________

jason_gee
Volunteer developer
Volunteer tester
Joined: 24 Nov 06
Posts: 5051
Credit: 73,860,851
RAC: 12,145
Australia
Message 1347364 - Posted: 16 Mar 2013, 17:12:05 UTC
Last modified: 16 Mar 2013, 17:14:51 UTC

Looking through various error tasks to find an AP OpenCL one, I found this example:

http://setiathome.berkeley.edu/result.php?resultid=2877559084

Which I notice has restarted many times choking on:

ERROR: clEnqueueNDRangeKernel: GPU_coadd_kernel_cl: -4


which is:
#define CL_MEM_OBJECT_ALLOCATION_FAILURE -4
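
For reading other stderr lines of this kind, the numeric code can be mapped back to its name. The table below is a small excerpt of the OpenCL 1.1 error codes from CL/cl.h; the regular expression assumes the "ERROR: call: kernel: code" layout of the line above, and `decode_cl_error` is just an illustrative helper name.

```python
import re

# Excerpt of OpenCL 1.1 runtime error codes as defined in CL/cl.h.
CL_ERRORS = {
    0: "CL_SUCCESS",
    -1: "CL_DEVICE_NOT_FOUND",
    -2: "CL_DEVICE_NOT_AVAILABLE",
    -3: "CL_COMPILER_NOT_AVAILABLE",
    -4: "CL_MEM_OBJECT_ALLOCATION_FAILURE",
    -5: "CL_OUT_OF_RESOURCES",
    -6: "CL_OUT_OF_HOST_MEMORY",
}

# Assumed layout: "ERROR: <API call>: <kernel name>: <numeric code>"
ERROR_RE = re.compile(r"ERROR: (?P<call>\w+): (?P<kernel>\w+): (?P<code>-?\d+)")


def decode_cl_error(line):
    """Return (api_call, kernel, error_name), or None if the line does not match."""
    match = ERROR_RE.search(line)
    if match is None:
        return None
    code = int(match.group("code"))
    name = CL_ERRORS.get(code, f"unknown ({code})")
    return match.group("call"), match.group("kernel"), name


print(decode_cl_error("ERROR: clEnqueueNDRangeKernel: GPU_coadd_kernel_cl: -4"))
```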


The only thing that comes to mind there is that Raistmer is perhaps using a lot of shared memory (in CUDA terminology), which might make sense for coadds, so the extra SMX clusters per multiprocessor put some size out of bounds.

I'm not familiar with the existing options for these applications, but if there is one to reduce threads per block (work-group size, in OpenCL terminology), or to switch the Kepler cache mode from the default preferNone to preferShared, then that may help (for debugging/test builds, etc.).
____________
"It is not the strongest of the species that survives, nor the most intelligent that survives. It is the one that is the most adaptable to change."
Charles Darwin

Chris Oliver
Joined: 4 Jul 99
Posts: 39
Credit: 53,263,809
RAC: 31,759
United Kingdom
Message 1347378 - Posted: 16 Mar 2013, 17:44:57 UTC - in response to Message 1347358.

Hi ExchangeMan,

It sounds like we are experiencing exactly the same issue. It would be interesting to know whether it also occurs on the Teslas, as the Titan shares a lot of their internals.




____________

ExchangeMan
Volunteer tester
Joined: 9 Jan 00
Posts: 113
Credit: 143,449,717
RAC: 204,927
United States
Message 1347383 - Posted: 16 Mar 2013, 17:59:29 UTC - in response to Message 1347378.



Yeah, that question about AP and Tesla would be interesting. I would love to have a Tesla card to play with, but from the little I've read, MB runs as well or better on a 690 for far fewer dollars. Perhaps the source code for the SETI apps would need to be tweaked somehow to take advantage of Tesla, and maybe that's a little bit true here on Titan too. I'm not a SETI developer, so I might just be talking out of you know what.


____________

Paul Bowyer
Project donor
Joined: 15 Aug 99
Posts: 9
Credit: 73,500,195
RAC: 133,234
United States
Message 1347426 - Posted: 16 Mar 2013, 20:42:04 UTC

I installed the latest Nvidia beta driver, 314.21, and just completed 4 APs.
Using 1761 and app_info set to:
<avg_ncpus>1</avg_ncpus>
<count>1</count>
processpriority = abovenormal
pfblockspersm = 16
pfperiodsperlaunch = 250

GPU load was at 9% running just 1 wu, but I'm going to try running 2.

____________

Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 3504
Credit: 47,772,147
RAC: 46,739
Russia
Message 1347429 - Posted: 16 Mar 2013, 20:53:40 UTC - in response to Message 1347426.
Last modified: 16 Mar 2013, 20:55:17 UTC



I'm not sure about the listed params. It looks like you listed the params for the CUDA MB app.
For the AP app you actually used these params:

DATA_CHUNK_UNROLL set to:12
FFA thread block override value:8192
FFA thread fetchblock override value:4096

Perhaps it is a driver issue after all, because your host has a few tasks that finished OK and look good with 314.21.

Could others update to this driver version?

Regarding the shared memory overflow: AP was developed to be HD4xxx-compatible, so it uses no shared memory. I need to check the code, but as far as I can recall I have not added any "HD5" kernels to it since the initial development.
There is some potential for further optimization here, of course, but all in good time.
____________

Paul Bowyer
Project donor
Joined: 15 Aug 99
Posts: 9
Credit: 73,500,195
RAC: 133,234
United States
Message 1347433 - Posted: 16 Mar 2013, 21:12:45 UTC



Oops. I was all over the place today, trying to tweak MBs as well.
____________

Chris Oliver
Joined: 4 Jul 99
Posts: 39
Credit: 53,263,809
RAC: 31,759
United Kingdom
Message 1347516 - Posted: 17 Mar 2013, 3:20:07 UTC - in response to Message 1347429.

Indeed, 314.21 (beta) lets Astropulse run. All looks good so far.








____________

Claggy
Project donor
Volunteer tester
Joined: 5 Jul 99
Posts: 4141
Credit: 33,634,155
RAC: 28,283
United Kingdom
Message 1347621 - Posted: 17 Mar 2013, 10:36:35 UTC - in response to Message 1347516.



What does the CLinfo output look like on this driver?

Claggy

Richard Haselgrove
Project donor
Volunteer tester
Joined: 4 Jul 99
Posts: 8634
Credit: 51,617,393
RAC: 48,807
United Kingdom
Message 1347624 - Posted: 17 Mar 2013, 10:56:04 UTC

There is mention of a specific OpenCL fix for TITAN in the 314.21 beta driver release notes:

[GeForce GTX TITAN]: Application crashes or a TDR occurs when running
applications that use OpenCL. [1237211]
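
Going by the reports in this thread (314.09 and 314.14 failing with the -4 error, 314.21 beta working), a host's driver string can be checked against that threshold. This is a minimal sketch: `parse_driver` and `has_titan_opencl_fix` are illustrative names, and it assumes that releases after 314.21 keep the fix, which nobody here has confirmed.

```python
def parse_driver(version):
    """Turn a driver string such as '314.21' into (314, 21) for numeric comparison."""
    major, minor = version.split(".")
    return int(major), int(minor)


# First driver reported in this thread to run Astropulse on the TITAN.
FIRST_WORKING = parse_driver("314.21")


def has_titan_opencl_fix(version):
    """True if the driver is at least 314.21 (assumes later drivers keep the fix)."""
    return parse_driver(version) >= FIRST_WORKING


for version in ("314.09", "314.14", "314.21"):
    verdict = "should be OK" if has_titan_opencl_fix(version) else "expect -4 errors"
    print(version, "->", verdict)
```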

ExchangeMan
Volunteer tester
Joined: 9 Jan 00
Posts: 113
Credit: 143,449,717
RAC: 204,927
United States
Message 1347821 - Posted: 17 Mar 2013, 18:02:21 UTC

I've upgraded to the beta driver and am seeing success also. My first job aborted, but that was because I was fiddling around in the slots directory. Other than that, so far so good.

BTW, I increased unroll to 14, ffa_block to 16384 and ffa_block_fetch to 8192. Don't know if that will make things run faster, but according to V4 Precision X the GPU usage is running over 90% most of the time - and that's with only a single job running.
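
For anyone wanting to try the same values, here is a rough sketch of assembling them into one line for the command line file. The dash-prefixed flag spelling is an assumption based on the parameter names used in this thread, so check the readme shipped with your AP build. The divisibility check is also only a guess at a sensible constraint: in both settings quoted in this thread, ffa_block is exactly twice ffa_block_fetch. `build_ap_cmdline` is a made-up helper name.

```python
def build_ap_cmdline(unroll, ffa_block, ffa_block_fetch):
    """Build a parameter string for the AP command line file (flag syntax assumed)."""
    # Assumed constraint: ffa_block should be a whole multiple of ffa_block_fetch.
    if ffa_block % ffa_block_fetch != 0:
        raise ValueError("ffa_block should be a multiple of ffa_block_fetch")
    return f"-unroll {unroll} -ffa_block {ffa_block} -ffa_block_fetch {ffa_block_fetch}"


# Values from the post above.
print(build_ap_cmdline(14, 16384, 8192))
# Values reported earlier in the thread.
print(build_ap_cmdline(12, 8192, 4096))
```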
____________


Copyright © 2014 University of California