I've Built a Couple OSX CUDA Apps...

Grant (SSSF)
Volunteer tester
Joined: 19 Aug 99
Posts: 13727
Credit: 208,696,464
RAC: 304
Australia
Message 1919160 - Posted: 16 Feb 2018, 3:27:09 UTC - in response to Message 1919157.  

What about App_Config.xml??? (I'm currently using one with the CUDA75 App to run 2 Units at a time...) I'm assuming that I want to ONLY run one Unit at a time...??? (Unless the OpenCL App will allow me to run two Units at a time, like CUDA...???)

Running 2 GPU WUs at a time using SoG with anything other than high end hardware will result in even less work being done. And each WU being processed requires 1 CPU core to support it.
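For anyone who still wants to try two tasks per GPU, here is a minimal sketch of generating the app_config.xml. It assumes the app name setiathome_v8 and BOINC's standard gpu_usage and cpu_usage elements. The helper build_app_config is hypothetical, and cpu_usage is set to 1.0 to reflect the one-core-per-task requirement.

```python
import xml.etree.ElementTree as ET


def build_app_config(app_name: str, tasks_per_gpu: int, cpu_per_task: float = 1.0) -> str:
    """Return app_config.xml text that lets `tasks_per_gpu` tasks share one GPU.

    Hypothetical helper for illustration; only the XML element names come from BOINC.
    """
    if tasks_per_gpu < 1:
        raise ValueError("tasks_per_gpu must be at least 1")
    root = ET.Element("app_config")
    app = ET.SubElement(root, "app")
    ET.SubElement(app, "name").text = app_name
    gpu = ET.SubElement(app, "gpu_versions")
    # Each task claims this fraction of a GPU, so 0.50 means two tasks per GPU.
    ET.SubElement(gpu, "gpu_usage").text = f"{1.0 / tasks_per_gpu:.2f}"
    # CPU cores reserved per GPU task.
    ET.SubElement(gpu, "cpu_usage").text = f"{cpu_per_task:.2f}"
    return ET.tostring(root, encoding="unicode")


print(build_app_config("setiathome_v8", tasks_per_gpu=2))
```

The output is a single unindented line; BOINC does not need the whitespace, but indent it by hand if you want to edit it later.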
Grant
Darwin NT
ID: 1919160
TimeLord04
Volunteer tester
Joined: 9 Mar 06
Posts: 21140
Credit: 33,933,039
RAC: 23
United States
Message 1919163 - Posted: 16 Feb 2018, 3:33:04 UTC - in response to Message 1919160.  

What about App_Config.xml??? (I'm currently using one with the CUDA75 App to run 2 Units at a time...) I'm assuming that I want to ONLY run one Unit at a time...??? (Unless the OpenCL App will allow me to run two Units at a time, like CUDA...???)

Running 2 GPU WUs at a time using SoG with anything other than high end hardware will result in even less work being done. And each WU being processed requires 1 CPU core to support it.

OK; so, (for me), the Max_Concurrent Line would be "2". What other changes would need to be made to the App_Config that I've pasted here? I'm assuming that the two CommandLine Parameters need to be changed to work with my two 750TI SC cards, right?


TL
TimeLord04
Have TARDIS, will travel...
Come along K-9!
Join Calm Chaos
ID: 1919163
TBar
Volunteer tester
Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1919191 - Posted: 16 Feb 2018, 4:16:28 UTC - in response to Message 1919163.  
Last modified: 16 Feb 2018, 4:32:08 UTC

Ah, you should be running One OpenCL task at a time, you are Not running any CPU tasks. There isn't any need for an App_Config. None. Just use the files in the download with the changes I posted. Either rename the App_Config so it isn't used or remove it from the folder.
If you want to try a speedup, just lower the cmd -period_iterations_num 16 to something lower say;
-period_iterations_num 10
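
A minimal sketch of changing that value in the command-line text without retyping the rest of the line. The helper set_option is hypothetical and works on a plain string; reading and writing the actual cmdline file is left out.

```python
import re


def set_option(cmdline: str, flag: str, value: int) -> str:
    """Replace the integer that follows `flag`, or append the flag if it is absent."""
    # The flag must start at the beginning of the text or after whitespace.
    pattern = re.compile(rf"(?<!\S){re.escape(flag)}\s+\d+")
    replacement = f"{flag} {value}"
    if pattern.search(cmdline):
        return pattern.sub(lambda _match: replacement, cmdline, count=1)
    return f"{cmdline.strip()} {replacement}".strip()


print(set_option("-period_iterations_num 16", "-period_iterations_num", 10))
```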

BTW, behold...the fastest Mac App at SETI, https://setiathome.berkeley.edu/results.php?hostid=8424399&offset=220
Now that is a Fast Mac.
ID: 1919191
TimeLord04
Volunteer tester
Joined: 9 Mar 06
Posts: 21140
Credit: 33,933,039
RAC: 23
United States
Message 1919200 - Posted: 16 Feb 2018, 5:47:11 UTC - in response to Message 1919191.  

Ah, you should be running One OpenCL task at a time, you are Not running any CPU tasks. There isn't any need for an App_Config. None. Just use the files in the download with the changes I posted. Either rename the App_Config so it isn't used or remove it from the folder.
If you want to try a speedup, just lower the cmd -period_iterations_num 16 to something lower say;
-period_iterations_num 10

BTW, behold...the fastest Mac App at SETI, https://setiathome.berkeley.edu/results.php?hostid=8424399&offset=220
Now that is a Fast Mac.

OK - no App_Config... Once my CUDA75 Units complete, I'll remove the file.

Now, from the setiathome.berkeley.edu Folder; I need to remove the x41zi-CUDA75 File, the App_Config - and then, what??? Copy and paste the five files from the r3709 Extracted Folder...??? (Three "exec" Files, one .txt, (Command Line File), and the App_Info.xml File.) Let me know if this is right. Thanks. :-)


TL
TimeLord04
Have TARDIS, will travel...
Come along K-9!
Join Calm Chaos
ID: 1919200
petri33
Volunteer tester
Joined: 6 Jun 02
Posts: 1668
Credit: 623,086,772
RAC: 156
Finland
Message 1919451 - Posted: 17 Feb 2018, 14:47:15 UTC

Hi,

I found this http://setiathome.berkeley.edu/result.php?resultid=6412085953

An NVIDIA quadro 4000 is experiencing difficulties when launching a kernel.

Looks like it is an official cuda 7.5 version (not an anonymous platform).
"Too many resources requested" indicates too much shared mem, blocks, registers or threads requested at launch.
First guess: If it is not asking for too much shared mem, then there may be a __launch_bounds__(threads, blocks) directive in the source code for the kernel requesting too many simultaneous blocks.
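
To make the resource check concrete, here is a rough sketch of the per-block arithmetic behind that error. It assumes the 32768 regsPerBlock figure reported for the Quadro 4000 elsewhere in this thread, plus the documented compute capability 2.0 limits of 1024 threads and 48 KB shared memory per block. The names DeviceLimits and launch_problems are hypothetical, and real hardware allocates registers in coarser units, so treat it as an approximation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DeviceLimits:
    max_threads_per_block: int
    regs_per_block: int
    shared_bytes_per_block: int


# Assumed limits for a compute capability 2.0 (Fermi) device.
FERMI_CC20 = DeviceLimits(
    max_threads_per_block=1024,
    regs_per_block=32768,
    shared_bytes_per_block=48 * 1024,
)


def launch_problems(threads, regs_per_thread, shared_bytes, limits):
    """Return the resources a single block would oversubscribe (empty list if none)."""
    problems = []
    if threads > limits.max_threads_per_block:
        problems.append("threads")
    if threads * regs_per_thread > limits.regs_per_block:
        problems.append("registers")
    if shared_bytes > limits.shared_bytes_per_block:
        problems.append("shared memory")
    return problems


# Hypothetical kernel: 40 registers per thread, 16 KB of shared memory.
print(launch_problems(1024, 40, 16 * 1024, FERMI_CC20))
print(launch_problems(256, 40, 16 * 1024, FERMI_CC20))
```

With these made-up kernel figures, the 1024-thread launch fails on registers alone (1024 x 40 = 40960 > 32768), while the 256-thread launch fits.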

Petri
To overcome Heisenbergs:
"You can't always get what you want / but if you try sometimes you just might find / you get what you need." -- Rolling Stones
ID: 1919451
TBar
Volunteer tester
Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1919485 - Posted: 17 Feb 2018, 18:40:00 UTC - in response to Message 1919451.  

There's too much wrong with that scenario to even bother with;
1) The Quadro 4000 is a Fermi GPU, which we know doesn't work with CUDA 7.5;
The CUDA 6.0 Special App is for the older Kepler CC 3.5 GPUs that might not work well with CUDA 7.5...
2) Fermi GPUs are supposed to be Blocked from the 7.5 App;
I added a compute capability >= 3.0 to the plan class.
3) It might still work on that Fermi GPU if He were using close to the Correct Driver;
a) He is using Driver 4600.61
b) He should be using Driver 5243.59
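
The compute capability rule in point 2 amounts to a simple comparison. This is only an illustration of the rule, not the BOINC scheduler's actual plan class code; the function names are hypothetical, and the 2.0 and 3.5 values are the Fermi and Kepler figures mentioned in this thread.

```python
from typing import Tuple

MIN_COMPUTE_CAPABILITY = (3, 0)  # the "compute capability >= 3.0" rule


def parse_compute_capability(text: str) -> Tuple[int, int]:
    """Turn a string such as "3.5" into a comparable (major, minor) tuple."""
    major, minor = text.strip().split(".")
    return int(major), int(minor)


def passes_plan_class(cc_text: str, minimum=MIN_COMPUTE_CAPABILITY) -> bool:
    """True if the GPU meets the minimum compute capability."""
    return parse_compute_capability(cc_text) >= minimum


for name, cc in [("Quadro 4000 (Fermi)", "2.0"), ("Kepler", "3.5")]:
    print(name, passes_plan_class(cc))
```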

I gave up on that machine a while back. It's similar to the ATI HD4 GPUs that keep being sent the HD5 Apps. There doesn't seem to be a way to Stop it.

Kinda like how All the versions of the CUDA Special zi3x Apps I've built keep producing Invalids.... ;-)
ID: 1919485
TimeLord04
Volunteer tester
Joined: 9 Mar 06
Posts: 21140
Credit: 33,933,039
RAC: 23
United States
Message 1919604 - Posted: 18 Feb 2018, 0:37:19 UTC
Last modified: 18 Feb 2018, 1:07:50 UTC

@TBar,

Just finished crunching the last of the CUDA75 Units. I still need to know about the five extracted files... (Three "Exec" Files, one App_Info.xml, and the CommandLine-".txt" file.) Do I place all five of these files into the "setiathome.berkeley.edu" Project Folder???

I'd like to start crunching with the new OpenCL App as soon as possible. :-)


Thanks. :-)


TL

[EDIT:]

Out of 200 Units assigned to Andromeda/Hackintosh, 150 are now Valid, 48 are Pending, and 2 are marked as Inconclusive. (CUDA75 App.) Even though this is a slower App, these numbers are an improvement over what I had when I was on 10.11.4.

I'm looking forward to even greater improvement with the new OpenCL App.


TL

[EDIT 2:]

[Modified App_Info.xml]

<app_info>
  <app>
    <name>setiathome_v8</name>
  </app>
  <file_info>
    <name>MBv8_8.22r3709_NV_ssse3_x86_64-apple-darwin</name>
    <executable/>
  </file_info>
  <file_info>
    <name>MultiBeam_Kernels_r3709.cl</name>
  </file_info>
  <file_info>
    <name>mb_cmdline_mac_OpenCL_NV_sah.txt</name>
  </file_info>
  <app_version>
    <app_name>setiathome_v8</app_name>
    <platform>x86_64-apple-darwin</platform>
    <version_num>819</version_num>
    <plan_class>opencl_nvidia_mac</plan_class>
    <avg_ncpus>0.1</avg_ncpus>
    <max_ncpus>0.1</max_ncpus>
    <coproc>
      <type>CUDA</type>
      <count>1</count>
    </coproc>
    <file_ref>
      <file_name>MBv8_8.22r3709_NV_ssse3_x86_64-apple-darwin</file_name>
      <main_program/>
    </file_ref>
    <file_ref>
      <file_name>MultiBeam_Kernels_r3709.cl</file_name>
    </file_ref>
    <file_ref>
      <file_name>mb_cmdline_mac_OpenCL_NV_sah.txt</file_name>
      <open_name>mb_cmdline.txt</open_name>
    </file_ref>
  </app_version>
</app_info>


Is this right, now??? I think I've taken out the CPU Section properly; but, would like you to double check what I've done. (This is my first "hack" at an App_Info.xml file. (I'm more proficient with App_Config.xml...))


TL
TimeLord04
Have TARDIS, will travel...
Come along K-9!
Join Calm Chaos
ID: 1919604
TimeLord04
Volunteer tester
Joined: 9 Mar 06
Posts: 21140
Credit: 33,933,039
RAC: 23
United States
Message 1919625 - Posted: 18 Feb 2018, 2:44:42 UTC

[Update:]

2-17-2018 at 6:45 PM - PST

I took a chance, copied and pasted the modified App_Info.xml, the CommandLine.txt File, and two of the three "Exec" Files. (I did NOT copy over the CPU "Exec" File.)

I Resumed SETI processing, but Event Log states No Tasks Available... No errors, though, in the Event Log; I think everything is OK...


TL
TimeLord04
Have TARDIS, will travel...
Come along K-9!
Join Calm Chaos
ID: 1919625
Grant (SSSF)
Volunteer tester
Joined: 19 Aug 99
Posts: 13727
Credit: 208,696,464
RAC: 304
Australia
Message 1919628 - Posted: 18 Feb 2018, 2:57:08 UTC - in response to Message 1919625.  

I Resumed SETI processing, but Event Log states No Tasks Available...

Seti is still having problems getting data from Green Bank & Arecibo, hence no data to split, and so no work at the moment.
Grant
Darwin NT
ID: 1919628
Wiggo
Joined: 24 Jan 00
Posts: 34744
Credit: 261,360,520
RAC: 489
Australia
Message 1919645 - Posted: 18 Feb 2018, 5:11:20 UTC - in response to Message 1919628.  

I Resumed SETI processing, but Event Log states No Tasks Available...
T.L. you should know to check the current Panic Mode On thread 1st before posting. :-O

Anyhow there's more fodder to chew on now. :-D

Cheers.
ID: 1919645
TimeLord04
Volunteer tester
Joined: 9 Mar 06
Posts: 21140
Credit: 33,933,039
RAC: 23
United States
Message 1919691 - Posted: 18 Feb 2018, 10:52:40 UTC

[Update:]

StdErr Report on a recent OpenCL Unit:

[Task ID: 6414974466

Name blc12_2bit_guppi_58137_27028_HIP45688_0013.4244.0.22.45.197.vlar_1
Workunit 2865978324
Created 18 Feb 2018, 8:25:53 UTC
Sent 18 Feb 2018, 8:25:54 UTC
Report deadline 12 Apr 2018, 13:25:36 UTC
Received 18 Feb 2018, 10:14:15 UTC
Server state Over
Outcome Success
Client state Done
Exit status 0 (0x00000000)
Computer ID 7952666
Run time 13 min 44 sec
CPU time 5 min 4 sec
Validate state Valid
Credit 62.68
Device peak FLOPS 1,605.76 GFLOPS
Application version SETI@home v8
Anonymous platform (NVIDIA GPU)
Peak working set size 64.48 MB
Peak swap size 3,205.29 MB
Peak disk usage 0.04 MB


<core_client_version>7.6.22</core_client_version>
<![CDATA[
<stderr_txt>
Running on device number: 0
Maximum single buffer size set to:256MB
SpikeFind FFT size threshold override set to:2048
TUNE: kernel 1 now has workgroup size of (64,1,4)
Number of period iterations for PulseFind set to 16
OpenCL platform detected: Apple
Number of OpenCL devices found : 2
BOINC assigns slot on device #1 of 2 devices.
Info: BOINC provided OpenCL device ID used

Build features: SETI8 Non-graphics OpenCL USE_OPENCL_INTEL OCL_ZERO_COPY OCL_CHIRP3 ASYNC_SPIKE FFTW SSSE3 64bit
System: Darwin x86_64 Kernel: 15.6.0
CPU : Intel(R) Core(TM)2 Extreme CPU X9650 @ 3.00GHz
GenuineIntel x86, Family 6 Model 23 Stepping 6
Features : FPU TSC PAE APIC MTRR MMX SSE SSE2 HT SSE3 SSSE3 SSE4.1

OpenCL-kernels filename : MultiBeam_Kernels_r3709.cl
ar=0.007971 NumCfft=94343 NumGauss=0 NumPulse=24008612992 NumTriplet=36950910368
Currently allocated 313 MB for GPU buffers
In v_BaseLineSmooth: NumDataPoints=1048576, BoxCarLength=8192, NumPointsInChunk=32768
OS X optimized setiathome_v8 application
Version info: SSSE3x (Intel, Core 2-optimized v8-nographics) V5.13 by Alex Kan
SSSE3x OS X 64bit Build 3709 , Ported by : Raistmer, JDWhale, Urs Echternacht
OpenCL version by Raistmer, r3709

Number of OpenCL platforms: 1


OpenCL Platform Name: Apple
Number of devices: 2
Max compute units: 5
Max work group size: 1024
Max clock frequency: 1254Mhz
Max memory allocation: 536870912
Cache type: None
Cache line size: 0
Cache size: 0
Global memory size: 2147483648
Constant buffer size: 65536
Max number of constant args: 9
Local memory type: Scratchpad
Local memory size: 49152
Queue properties:
Out-of-Order: No
Name: GeForce GTX 750 Ti
Vendor: NVIDIA
Driver version: 10.11.14 346.03.15f12
Version: OpenCL 1.2
Extensions: cl_APPLE_SetMemObjectDestructor cl_APPLE_ContextLoggingFunctions cl_APPLE_clut cl_APPLE_query_kernel_names cl_APPLE_gl_sharing cl_khr_gl_event cl_khr_byte_addressable_store cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_APPLE_fp64_basic_ops cl_khr_fp64 cl_khr_3d_image_writes cl_khr_depth_images cl_khr_gl_depth_images cl_khr_gl_msaa_sharing cl_khr_image2d_from_buffer cl_APPLE_ycbcr_422 cl_APPLE_rgb_422
Max compute units: 5
Max work group size: 1024
Max clock frequency: 1254Mhz
Max memory allocation: 536870912
Cache type: None
Cache line size: 0
Cache size: 0
Global memory size: 2147483648
Constant buffer size: 65536
Max number of constant args: 9
Local memory type: Scratchpad
Local memory size: 49152
Queue properties:
Out-of-Order: No
Name: GeForce GTX 750 Ti
Vendor: NVIDIA
Driver version: 10.11.14 346.03.15f12
Version: OpenCL 1.2
Extensions: cl_APPLE_SetMemObjectDestructor cl_APPLE_ContextLoggingFunctions cl_APPLE_clut cl_APPLE_query_kernel_names cl_APPLE_gl_sharing cl_khr_gl_event cl_khr_byte_addressable_store cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_APPLE_fp64_basic_ops cl_khr_fp64 cl_khr_3d_image_writes cl_khr_depth_images cl_khr_gl_depth_images cl_khr_gl_msaa_sharing cl_khr_image2d_from_buffer cl_APPLE_ycbcr_422 cl_APPLE_rgb_422


Work Unit Info:
...............
Credit multiplier is : 2.85
WU true angle range is : 0.007971
Used GPU device parameters are:
Number of compute units: 5
Single buffer allocation size: 256MB
Total device global memory: 2048MB
max WG size: 1024
local mem type: Real
LotOfMem path: no
LowPerformanceGPU path: no
HighPerformanceGPU path: no
period_iterations_num=16
Triplet: peak=11.30171, time=30.11, period=17.81, d_freq=6904594590.93, chirp=-28.995, fft_len=128
Pulse: peak=5.341069, time=45.99, period=12.35, d_freq=6904598221.83, score=1.009, chirp=-34.258, fft_len=4k
Pulse: peak=7.414871, time=45.99, period=21.36, d_freq=6904602743.84, score=1, chirp=-42.308, fft_len=4k
Pulse: peak=3.600743, time=45.99, period=9.037, d_freq=6904595177.28, score=1.006, chirp=-47.361, fft_len=4k
Triplet: peak=13.23665, time=49.21, period=31.9, d_freq=6904602060.89, chirp=-53.529, fft_len=1024
Triplet: peak=11.6078, time=58.25, period=8.73, d_freq=6904600604.13, chirp=-59.105, fft_len=128
Pulse: peak=0.3184023, time=45.82, period=0.1789, d_freq=6904601057.76, score=1.025, chirp=-84.755, fft_len=128
Pulse: peak=3.855636, time=45.9, period=7.807, d_freq=6904602425.63, score=1.058, chirp=87.264, fft_len=2k

Best spike: peak=22.21977, time=42.95, d_freq=6904600030.2, chirp=17.943, fft_len=64k
Best autocorr: peak=17.60487, time=85.9, delay=4.9566, d_freq=6904600619.42, chirp=27.752, fft_len=128k
Best gaussian: peak=0, mean=0, ChiSq=0, time=-2.124e+11, d_freq=0,
score=-12, null_hyp=0, chirp=0, fft_len=0
Best pulse: peak=3.855636, time=45.9, period=7.807, d_freq=6904602425.63, score=1.058, chirp=87.264, fft_len=2k
Best triplet: peak=13.23665, time=49.21, period=31.9, d_freq=6904602060.89, chirp=-53.529, fft_len=1024

Spike count: 0
Autocorr count: 0
Pulse count: 5
Triplet count: 3
Gaussian count: 0
Time cpu in use since last restart: 304.5 seconds

GPU device sync requested... ...GPU device synched
02:07:25 (4231): called boinc_finish(0)

</stderr_txt>
]]>
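
For comparing results between apps, signal lines such as the Pulse and Triplet entries in the stderr above can be pulled into a dictionary. This is a minimal sketch; parse_signal_line is a hypothetical helper and assumes the "Kind: key=value, key=value" layout shown in this log.

```python
def parse_signal_line(line: str):
    """Split a stderr signal line into its kind and a dict of its fields."""
    kind, _, rest = line.partition(":")
    fields = {}
    for item in rest.split(","):
        item = item.strip()
        if not item:
            continue  # tolerate a trailing comma
        key, _, raw = item.partition("=")
        try:
            fields[key] = float(raw)
        except ValueError:
            fields[key] = raw  # non-numeric values such as fft_len=4k stay as text
    return kind.strip(), fields


line = ("Pulse: peak=5.341069, time=45.99, period=12.35, d_freq=6904598221.83, "
        "score=1.009, chirp=-34.258, fft_len=4k")
kind, fields = parse_signal_line(line)
print(kind, fields["peak"], fields["fft_len"])
```

Note that a purely numeric fft_len (for example 128) comes back as a float, while 4k stays a string, so normalize that field before comparing.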


------------------------------------------------- BREAK ------------------------------------------------

I guess this is good...??? It is MUCH faster than the old CUDA75.

Units seem to be finishing at 12.5 to 13.5 Min each. :-) :-D

Now have 200 active in queue. One Inconclusive still listed; but, this is from the old CUDA75 App. Also, something like 84+ Units in Pending.

I still have 20+ Tabs open in FF Quantum 58.0.2. :-)

I'm glad to be contributing more, now. :-)


TL
TimeLord04
Have TARDIS, will travel...
Come along K-9!
Join Calm Chaos
ID: 1919691
TimeLord04
Volunteer tester
Joined: 9 Mar 06
Posts: 21140
Credit: 33,933,039
RAC: 23
United States
Message 1920226 - Posted: 21 Feb 2018, 15:08:18 UTC
Last modified: 21 Feb 2018, 15:09:55 UTC

@TBar,

A friend and I just built another Hackintosh, (for him), and it has an integrated HD630 on an Intel i7 7770 CPU. Is there an Intel OpenCL App, yet??? He could make use of it, if there is one...

Thanks in advance.


TL
TimeLord04
Have TARDIS, will travel...
Come along K-9!
Join Calm Chaos
ID: 1920226
TBar
Volunteer tester
Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1920244 - Posted: 21 Feb 2018, 16:19:57 UTC - in response to Message 1920226.  

Ah, it's Right below the App you downloaded earlier;
* ATi5r3710&CPU-AVX.7z (1897.05 kB - downloaded 5 times.)
* ATi5r3710&CPU-AVX2.7z (2398.86 kB - downloaded 9 times.)
* nVidia_r3709&CPUr3711.7z (1269.87 kB - downloaded 8 times.)
* Intel_r3708&CPUr3711.7z (1269.68 kB - downloaded 3 times.)


If you look above that, you will also find;
The Intel iGPU App is the same as the nVidia GPU App and is completely Untested. Please Report if it is better than the App on SETI Main.

If you look at earlier posts in this thread, for over a Year now, You will see that the OSX 10.11.4 Update Broke OpenCL on nVidia, I think you should remember that?
Since 10.11.4 the only OpenCL build that works decently on the nVidia cards is the iGPU build. For over a Year now, the Mac nVidia App has been the Intel iGPU build...it's in this thread.
Whatever...

Besides that, Most of the iGPUs work Badly using the iGPU App even though the App works very well on the nVidia GPUs.
In Most cases you Should Not Use the Intel iGPU as it not only produces around 50% Inconclusive tasks, it Slows down the CPU tasks by 100%, i.e. the CPU tasks take Twice as long when using the iGPU. About Half of my Inconclusive Results are from People using the Mac Intel iGPU App.
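
A back-of-the-envelope sketch of why that trade is usually a loss. The task rates are made-up placeholders; the 2x CPU slowdown and 50% Inconclusive share are the rough figures above. Counting Inconclusive results as not immediately useful is a simplification, since they still go on to further validation.

```python
def useful_tasks_per_hour(cpu_tasks_per_hour, igpu_tasks_per_hour,
                          cpu_slowdown=2.0, igpu_inconclusive=0.5):
    """Estimate clean results per hour when the iGPU runs alongside the CPU tasks."""
    # CPU tasks take `cpu_slowdown` times as long while the iGPU is busy.
    cpu_with_igpu = cpu_tasks_per_hour / cpu_slowdown
    # Only the share of iGPU results that is not Inconclusive is counted.
    igpu_clean = igpu_tasks_per_hour * (1.0 - igpu_inconclusive)
    return cpu_with_igpu + igpu_clean


cpu_only = 4.0  # hypothetical: CPU alone finishes 4 tasks per hour
with_igpu = useful_tasks_per_hour(cpu_only, igpu_tasks_per_hour=1.0)
print(cpu_only, with_igpu)
```

With these placeholder rates the machine drops from 4.0 to 2.5 clean results per hour; the iGPU would have to out-produce the CPU by a wide margin to break even.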

If you still insist on trying the New 3708 version, I suggest you stop using it if it produces 50% Inconclusive Results. A 7770 should have AVX2, you would be much better off just running the AVX2 App on the CPU. One of the New iMac Pros switched from the Stock Mac CPU App to the AVX2 App and so far his RAC has increased by 10,000 and is still rising, it's a good thing he can't use an iGPU, https://setiathome.berkeley.edu/show_host_detail.php?hostid=8427868
ID: 1920244
TimeLord04
Volunteer tester
Joined: 9 Mar 06
Posts: 21140
Credit: 33,933,039
RAC: 23
United States
Message 1920335 - Posted: 21 Feb 2018, 23:34:57 UTC

TBar,

Sorry, I typed in too many "7's"... His CPU is a 7700, not a 7770... I assume the 7700 is still capable of AVX2 support...

I've e-Mailed him Links to obtain Keka, and the AVX2 App. If he's still interested, I'll help him get set up on his Hackintosh.


Thanks,


TL
TimeLord04
Have TARDIS, will travel...
Come along K-9!
Join Calm Chaos
ID: 1920335
Chris Adamek
Volunteer tester
Joined: 15 May 99
Posts: 251
Credit: 434,772,072
RAC: 236
United States
Message 1920445 - Posted: 22 Feb 2018, 14:43:29 UTC - in response to Message 1920244.  

They need to run multiple WU’s if they can. That’s the only way I get better output from the ATI cards on the Mac. Right now I’m running roughly the same average times they are getting by running 3 at a time and my cards only have 32 compute units.
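
The comparison being made is throughput, not per-task time. A minimal sketch, with a hypothetical 15-minute wall time standing in for real numbers:

```python
def tasks_per_hour(wall_minutes_per_task: float, concurrent: int = 1) -> float:
    """Tasks finished per hour when `concurrent` tasks run side by side on one GPU."""
    return concurrent * 60.0 / wall_minutes_per_task


# Hypothetical: same 15-minute wall time, one task at a time versus three.
print(tasks_per_hour(15, concurrent=1))
print(tasks_per_hour(15, concurrent=3))
```

Running N at a time only pays off if the wall time per task grows by less than a factor of N, which is why it helps on some cards and hurts on others.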
ID: 1920445
TBar
Volunteer tester
Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1920450 - Posted: 22 Feb 2018, 15:48:36 UTC - in response to Message 1920445.  
Last modified: 22 Feb 2018, 16:22:14 UTC

I was going to suggest using different cmdline settings as some people are getting much better results with the AMD GPUs using detailed cmdlines. The problem is, some settings work on some platforms but not on others, and can cause the App to not start or create false overflows with the wrong settings. Then there is the problem with having to reset the file permissions after editing the cmdline file when you restart BOINC. It would be best to make all the changes to the cmdline file before restarting BOINC so you only have to reset the file permissions once. This machine running the AMD Vega seems to be getting nice run-times but has a very long list of settings and it only takes One bad setting to cause problems;
https://setiathome.berkeley.edu/result.php?resultid=6427152128
It's much more complicated than the CUDA Special App where about the only setting is to either use a full CPU or not.

BTW, the first change I would make is to change the 'Number of period iterations for PulseFind set to 16' to 'Number of period iterations for PulseFind set to 1' and see how that works. It's possible that One setting will result in most of the gains possible, https://setiathome.berkeley.edu/result.php?resultid=6426684584
ID: 1920450
TBar
Volunteer tester
Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1920499 - Posted: 22 Feb 2018, 20:11:20 UTC - in response to Message 1919451.  

Hi,

I found this http://setiathome.berkeley.edu/result.php?resultid=6412085953

An NVIDIA quadro 4000 is experiencing difficulties when launching a kernel.

Looks like it is an official cuda 7.5 version (not an anonymous platform).
"Too many resources requested" indicates too much shared mem, blocks, registers or threads requested at launch.
First guess: If it is not asking for too much shared mem, then there may be a __launch_bounds__(threads, blocks) directive in the source code for the kernel requesting too many simultaneous blocks.

Petri

It seems the Server Finally sent him another CUDA 42 task, and it seems he has updated to a suitable driver. The results are the same as the One other CUDA 42 task he has been sent, it errored out immediately with;
CUFFT error in file 'cuda/cudaAcc_fft.cu' in line 37.
Now, that Fermi Quadro should work with the CUDA 42 App. The CUDA 42 App works with All my CUDA cards, everything from an 8800 GT to a GTX 1060. The closest I have to a Fermi card is a GTS 250, and the GTS 250 works fine with the CUDA 42 App in every OS that supports CUDA on the GTS 250. Recently even a GTX 1070 has worked with the CUDA 42 App, https://setiathome.berkeley.edu/result.php?resultid=6399215767, So, I really don't think it's the App, that has Passed through even SETI Beta. The full stderr is here, just so it doesn't disappear again, because so far the Server has only sent the CUDA 42 task to the Quadro Twice;
<core_client_version>7.8.3</core_client_version>
<![CDATA[
<message>
process exited with code 1 (0x1, -255)</message>
<stderr_txt>
v8 task detected
setiathome_CUDA: Found 1 CUDA device(s):
  Device 1: Quadro 4000, 2047 MiB, regsPerBlock 32768
     computeCap 2.0, multiProcs 8 
     pciBusID = 5, pciSlotID = 0
     clockRate = 950 MHz
In cudaAcc_initializeDevice(): Boinc passed DevPref 1
setiathome_CUDA: CUDA Device 1 specified, checking...
   Device 1: Quadro 4000 is okay
SETI@home using CUDA accelerated device Quadro 4000

setiathome enhanced x41zi (baseline v8), Cuda 4.20

setiathome_v8 task detected
Detected Autocorrelations as enabled, size 128k elements.
Work Unit Info:
...............
WU true angle range is :  0.010184
re-using dev_GaussFitResults array for dev_AutoCorrIn, 4194304 bytes
re-using dev_GaussFitResults+524288x8 array for dev_AutoCorrOut, 4194304 bytes
Thread call stack limit is: 1k
CUFFT error in file 'cuda/cudaAcc_fft.cu' in line 37.

</stderr_txt>
]]>

So, why is this Quadro failing with the CUDA 42 App when all the other GPUs that I know of don't?
The line 37 code area reads;
}

void cudaAcc_execute_dfts(int FftNum) {
	CUFFT_SAFE_CALL( (cufftExecC2C(fft_analysis_plans[FftNum][0], dev_cx_ChirpDataArray, dev_WorkData, CUFFT_INVERSE)) );
}
ID: 1920499
TimeLord04
Volunteer tester
Joined: 9 Mar 06
Posts: 21140
Credit: 33,933,039
RAC: 23
United States
Message 1920828 - Posted: 24 Feb 2018, 5:09:14 UTC

[Update:]

Been crunching with the new OpenCL App since the 18th. :-) :-D

<<------- RAC is CLIMBING!!! :-) Have broken 4K in 5 Days and a few hours. (Still ONLY crunching from 6 PM to 9 AM - PST.)

Things are looking MUCH brighter, now. :-)

Thanks TBar!!! :-)


TL
TimeLord04
Have TARDIS, will travel...
Come along K-9!
Join Calm Chaos
ID: 1920828
TBar
Volunteer tester
Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1920876 - Posted: 24 Feb 2018, 15:19:49 UTC - in response to Message 1920445.  

Hey Chris,

CreditFew is going to give you a hard time with that setup. Since CF works from an App APR average it gives the faster card fewer credits. You would think it would give the slower cards more credit, but what usually happens is the slower GPUs get the same as usual and the faster gets less. I don't know if the Tahiti would work in the Mac, but that would probably be better due to the AMD card using a different App with a different APR. Of course, it would be better with Two 1080 Ti by themselves.

Anyway, let's see how far it will go.
ID: 1920876
Chris Adamek
Volunteer tester
Joined: 15 May 99
Posts: 251
Credit: 434,772,072
RAC: 236
United States
Message 1920907 - Posted: 24 Feb 2018, 20:38:19 UTC - in response to Message 1920876.  

Hey Chris,

CreditFew is going to give you a hard time with that setup. Since CF works from an App APR average it gives the faster card fewer credits. You would think it would give the slower cards more credit, but what usually happens is the slower GPUs get the same as usual and the faster gets less. I don't know if the Tahiti would work in the Mac, but that would probably be better due to the AMD card using a different App with a different APR. Of course, it would be better with Two 1080 Ti by themselves.

Anyway, let's see how far it will go.


Yeah, was noticing that going on. I was half hoping the lower APR due to the 2 750ti's in there would give me higher credit on the 1080 Ti’s work. I also thought about getting an external Thunderbolt enclosure for the 2013 Mac Pro but it is on High Sierra (required for some of my software) and the last time I tried the special app on High Sierra it barfed... I’ll see how this goes and see if I’m “forced” to get another 1080ti in there. =)
ID: 1920907


 
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.