I've Built a Couple OSX CUDA Apps...

TBar
Volunteer tester

Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1920982 - Posted: 25 Feb 2018, 2:04:53 UTC - in response to Message 1920907.  

All you have to remember is the faster you complete the task, the less credit you get ;-(
That's why SETI keeps giving fewer and fewer credits: the hardware keeps getting faster, so the credits keep shrinking.
Jason thought it might have something to do with the CPU Benchmark they use being so far from reality. That's why the estimates are so high when you start a new App: the Benchmark is many times lower than reality, so the time estimates are many times higher. Take for instance this machine and look at what it says for the CPU FLOPS: Measured floating point speed = 3.63 billion ops/sec. Now look at what it actually does: SETI@home v8 (anonymous platform, CPU), Average processing rate = 41.37 GFLOPS.
It would be interesting to see what would happen if CreditFew used the 41.37 number instead of 3.6.
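
Just as a rough sketch, assuming the first estimates simply scale inversely with the Whetstone benchmark (my assumption here, not something pulled from the BOINC source), the size of the gap works out like this;

// rough_estimate.cpp - back-of-the-envelope only; assumes the initial runtime
// estimates scale inversely with the Whetstone benchmark, which is an
// assumption about the client, not taken from the BOINC source.
#include <cstdio>

int main() {
    const double benchmark_gflops = 3.63;   // "Measured floating point speed" on the host page
    const double measured_apr     = 41.37;  // Average processing rate of the CPU app
    printf("estimate inflation: about %.1fx\n", measured_apr / benchmark_gflops);
    // prints roughly 11.4x, which is about how far off the estimates are
    // when a new App version starts out on that host
    return 0;
}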

You don't really need another 1080 Ti, but it would be nice. Anything would be better than matching a 750 Ti with a 1080 Ti, as the 750 Ti is about as slow as it gets with the Special App. Even just a single 1060 would be better; two would be even better. Or perhaps a single 1070. You can't get worse than what you have. Matching Two 1060s with a 1050 Ti isn't that bad because the cards aren't that far apart. Of course, I'd really like to have another 1060 in there at some point.
ID: 1920982
TimeLord04
Volunteer tester

Joined: 9 Mar 06
Posts: 21140
Credit: 33,933,039
RAC: 23
United States
Message 1922056 - Posted: 2 Mar 2018, 9:46:09 UTC
Last modified: 2 Mar 2018, 9:46:49 UTC

[Update:]

3-2-2018 at 1:46 AM - PST

I'm concerned about the ultimate Validation of this Unit. It is up against an "Intel - Darwin", and an "Intel GPU_sah, Windows_Intelx86"... I fear that the two Intels will Validate against each other and cause my Unit to be listed as an Error. :-( (Though, I'm convinced that, in actuality, my Unit is truly correct.)


wuid=2875306058


[Andromeda: ID - 7952666:]

Task 6434202091

Name blc13_2bit_guppi_58137_31263_HIP46580_0025.11682.409.21.44.7.vlar_1
Workunit 2875306058
Created 24 Feb 2018, 11:38:23 UTC
Sent 24 Feb 2018, 16:11:18 UTC
Report deadline 18 Apr 2018, 21:11:00 UTC
Received 26 Feb 2018, 9:31:44 UTC
Server state Over
Outcome Success
Client state Done
Exit status 0 (0x00000000)
Computer ID 7952666
Run time 13 min 38 sec
CPU time 4 min 59 sec
Validate state Checked, but no consensus yet
Credit 0.00
Device peak FLOPS 1,605.76 GFLOPS
Application version SETI@home v8
Anonymous platform (NVIDIA GPU)
Peak working set size 64.28 MB
Peak swap size 3,206.82 MB
Peak disk usage 0.03 MB

----------------------------------------------------------------------

Stderr output

<core_client_version>7.6.22</core_client_version>
<![CDATA[
<stderr_txt>
Running on device number: 0
Maximum single buffer size set to:256MB
SpikeFind FFT size threshold override set to:2048
TUNE: kernel 1 now has workgroup size of (64,1,4)
Number of period iterations for PulseFind set to 16
OpenCL platform detected: Apple
Number of OpenCL devices found : 2
BOINC assigns slot on device #1 of 2 devices.
Info: BOINC provided OpenCL device ID used

Build features: SETI8 Non-graphics OpenCL USE_OPENCL_INTEL OCL_ZERO_COPY OCL_CHIRP3 ASYNC_SPIKE FFTW SSSE3 64bit
System: Darwin x86_64 Kernel: 15.6.0
CPU : Intel(R) Core(TM)2 Extreme CPU X9650 @ 3.00GHz
GenuineIntel x86, Family 6 Model 23 Stepping 6
Features : FPU TSC PAE APIC MTRR MMX SSE SSE2 HT SSE3 SSSE3 SSE4.1

OpenCL-kernels filename : MultiBeam_Kernels_r3709.cl
ar=0.010653 NumCfft=95521 NumGauss=0 NumPulse=25243136128 NumTriplet=38185433504
Currently allocated 313 MB for GPU buffers
In v_BaseLineSmooth: NumDataPoints=1048576, BoxCarLength=8192, NumPointsInChunk=32768
OS X optimized setiathome_v8 application
Version info: SSSE3x (Intel, Core 2-optimized v8-nographics) V5.13 by Alex Kan
SSSE3x OS X 64bit Build 3709 , Ported by : Raistmer, JDWhale, Urs Echternacht
OpenCL version by Raistmer, r3709

Number of OpenCL platforms: 1


OpenCL Platform Name: Apple
Number of devices: 2
Max compute units: 5
Max work group size: 1024
Max clock frequency: 1254Mhz
Max memory allocation: 536870912
Cache type: None
Cache line size: 0
Cache size: 0
Global memory size: 2147483648
Constant buffer size: 65536
Max number of constant args: 9
Local memory type: Scratchpad
Local memory size: 49152
Queue properties:
Out-of-Order: No
Name: GeForce GTX 750 Ti
Vendor: NVIDIA
Driver version: 10.11.14 346.03.15f12
Version: OpenCL 1.2
Extensions: cl_APPLE_SetMemObjectDestructor cl_APPLE_ContextLoggingFunctions cl_APPLE_clut cl_APPLE_query_kernel_names cl_APPLE_gl_sharing cl_khr_gl_event cl_khr_byte_addressable_store cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_APPLE_fp64_basic_ops cl_khr_fp64 cl_khr_3d_image_writes cl_khr_depth_images cl_khr_gl_depth_images cl_khr_gl_msaa_sharing cl_khr_image2d_from_buffer cl_APPLE_ycbcr_422 cl_APPLE_rgb_422
Max compute units: 5
Max work group size: 1024
Max clock frequency: 1254Mhz
Max memory allocation: 536870912
Cache type: None
Cache line size: 0
Cache size: 0
Global memory size: 2147483648
Constant buffer size: 65536
Max number of constant args: 9
Local memory type: Scratchpad
Local memory size: 49152
Queue properties:
Out-of-Order: No
Name: GeForce GTX 750 Ti
Vendor: NVIDIA
Driver version: 10.11.14 346.03.15f12
Version: OpenCL 1.2
Extensions: cl_APPLE_SetMemObjectDestructor cl_APPLE_ContextLoggingFunctions cl_APPLE_clut cl_APPLE_query_kernel_names cl_APPLE_gl_sharing cl_khr_gl_event cl_khr_byte_addressable_store cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_APPLE_fp64_basic_ops cl_khr_fp64 cl_khr_3d_image_writes cl_khr_depth_images cl_khr_gl_depth_images cl_khr_gl_msaa_sharing cl_khr_image2d_from_buffer cl_APPLE_ycbcr_422 cl_APPLE_rgb_422


Work Unit Info:
...............
Credit multiplier is : 2.85
WU true angle range is : 0.010653
Used GPU device parameters are:
Number of compute units: 5
Single buffer allocation size: 256MB
Total device global memory: 2048MB
max WG size: 1024
local mem type: Real
LotOfMem path: no
LowPerformanceGPU path: no
HighPerformanceGPU path: no
period_iterations_num=16
Pulse: peak=7.617371, time=45.9, period=19.92, d_freq=6568433865.87, score=1.004, chirp=-10.145, fft_len=2k
Spike: peak=24.02623, time=54.4, d_freq=6568437293.28, chirp=11.403, fft_len=64k
Spike: peak=24.42993, time=54.4, d_freq=6568437293.26, chirp=11.429, fft_len=64k
Spike: peak=25.09982, time=54.4, d_freq=6568437293.29, chirp=11.439, fft_len=64k
Pulse: peak=1.926689, time=45.9, period=3.121, d_freq=6568441942.25, score=1.003, chirp=47.475, fft_len=2k
Pulse: peak=10.35832, time=45.86, period=28.36, d_freq=6568440510.49, score=1.041, chirp=49.2, fft_len=1024
Pulse: peak=6.323389, time=45.99, period=15.39, d_freq=6568440537.72, score=1.029, chirp=62.527, fft_len=4k

Best spike: peak=25.09982, time=54.4, d_freq=6568437293.29, chirp=11.439, fft_len=64k
Best autocorr: peak=16.19209, time=51.54, delay=4.6888, d_freq=6568439084.66, chirp=-7.7413, fft_len=128k
Best gaussian: peak=0, mean=0, ChiSq=0, time=-2.124e+11, d_freq=0,
score=-12, null_hyp=0, chirp=0, fft_len=0
Best pulse: peak=10.35832, time=45.86, period=28.36, d_freq=6568440510.49, score=1.041, chirp=49.2, fft_len=1024
Best triplet: peak=0, time=-2.124e+11, period=0, d_freq=0, chirp=0, fft_len=0

Spike count: 3
Autocorr count: 0
Pulse count: 4
Triplet count: 0
Gaussian count: 0
Time cpu in use since last restart: 298.5 seconds

GPU device sync requested... ...GPU device synched
01:26:08 (7112): called boinc_finish(0)

</stderr_txt>
]]>
TimeLord04
Have TARDIS, will travel...
Come along K-9!
Join Calm Chaos
ID: 1922056
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 13722
Credit: 208,696,464
RAC: 304
Australia
Message 1922063 - Posted: 2 Mar 2018, 10:02:46 UTC - in response to Message 1922056.  

I'm concerned about the ultimate Validation of this Unit. It is up against an "Intel - Darwin", and an "Intel GPU_sah, Windows_Intelx86"... I fear that the two Intels will Validate against each other and cause my Unit to be listed as an Error. :-( (Though, I'm convinced that, in actuality, my Unit is truly correct.)

Yeah, it sucks when that happens.
Grant
Darwin NT
ID: 1922063
TBar
Volunteer tester

Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1922179 - Posted: 2 Mar 2018, 19:49:12 UTC - in response to Message 1922056.  

The reported signals equal 7 on that WU. The difference between the 2 Hosts is only 1 signal. Since you are allowed up to a 50% difference in reported signals, there isn't any danger of the Hosts not validating. For 7 signals, there would have to be a difference of 4 signals for one result to be invalid. Some of the iGPUs actually give the correct results; the Windows iGPU has few inconclusives, indicating it is working. Even though the iGPU is giving the correct results, it is still slowing down the CPU tasks, and since most CPUs have more cores than iGPUs, it is better to run the CPUs at Full speed than at Half speed with the iGPU working.
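
If you want to see the arithmetic, here's a little sketch of that 50% rule as I read it (the real validator compares the individual signals; this only looks at the counts);

// signal_count_check.cpp - hypothetical sketch of the 50%-difference rule
// described above; the actual SETI@home validator compares individual
// signals, not just the totals.
#include <algorithm>
#include <cstdio>
#include <cstdlib>

bool counts_close_enough(int a, int b) {
    int larger = std::max(a, b);
    return std::abs(a - b) <= larger * 0.5;   // within 50% of the larger count
}

int main() {
    printf("7 vs 6 signals: %s\n", counts_close_enough(7, 6) ? "fine" : "too far apart");
    printf("7 vs 3 signals: %s\n", counts_close_enough(7, 3) ? "fine" : "too far apart"); // difference of 4
    return 0;
}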

An update on the New iMac Pro. Since installing the package, ATi5r3710&CPU-AVX2, the Host is now up to a RAC of 36700 even without tweaking the AMD cmdline. If he did tweak the cmdline, he could probably get it over 40000. Not bad for an iMac, it puts the old MacPro 6,1 to shame, tweaked or not. Of course to get the Best score you will need to use one of the Old MacPros, with the CUDA Special App. Then you would be up to where these 2 Macs are, https://setiathome.berkeley.edu/top_hosts.php?sort_by=expavg_credit&offset=20

BTW, I just tried one of the Mining x1 to x16 Powered PCIe extenders on my MacPro...it worked. Last time I tried an Unpowered x4 to x16 ribbon extender on a 750 Ti and it didn't work. Neither card I tried has an external power connector, so apparently it takes a powered extender to work. Next I think I'll try one of the PCIe switches that can run 4 cards off of One slot and see how that works. I think I read somewhere that you can run up to 5 GPUs in OSX without any trouble.
ID: 1922179
petri33
Volunteer tester

Joined: 6 Jun 02
Posts: 1668
Credit: 623,086,772
RAC: 156
Finland
Message 1922182 - Posted: 2 Mar 2018, 20:09:55 UTC - in response to Message 1922179.  
Last modified: 2 Mar 2018, 20:10:14 UTC

The reported signals equal 7 on that WU. The difference between the 2 Hosts is only 1 signal. Since you are allowed up to a 50% difference in reported signals, there isn't any danger of the Hosts not validating. For 7 signals, there would have to be a difference of 4 signals for one result to be invalid. Some of the iGPUs actually give the correct results; the Windows iGPU has few inconclusives, indicating it is working. Even though the iGPU is giving the correct results, it is still slowing down the CPU tasks, and since most CPUs have more cores than iGPUs, it is better to run the CPUs at Full speed than at Half speed with the iGPU working.

An update on the New iMac Pro. Since installing the package, ATi5r3710&CPU-AVX2, the Host is now up to a RAC of 36700 even without tweaking the AMD cmdline. If he did tweak the cmdline, he could probably get it over 40000. Not bad for an iMac, it puts the old MacPro 6,1 to shame, tweaked or not. Of course to get the Best score you will need to use one of the Old MacPros, with the CUDA Special App. Then you would be up to where these 2 Macs are, https://setiathome.berkeley.edu/top_hosts.php?sort_by=expavg_credit&offset=20

BTW, I just tried one of the Mining x1 to x16 Powered PCIe extenders on my MacPro...it worked. Last time I tried an Unpowered x4 to x16 ribbon extender on a 750 Ti and it didn't work. Neither card I tried has an external power connector, so apparently it takes a powered extender to work. Next I think I'll try one of the PCIe switches that can run 4 cards off of One slot and see how that works. I think I read somewhere that you can run up to 5 GPUs in OSX without any trouble.


72 000 with 3 x 1050's (or less) is I M P R E S S I V E ! ! !
To overcome Heisenbergs:
"You can't always get what you want / but if you try sometimes you just might find / you get what you need." -- Rolling Stones
ID: 1922182
TBar
Volunteer tester

Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1922188 - Posted: 2 Mar 2018, 20:38:43 UTC - in response to Message 1922182.  

72 000 with 3 x 1050's (or less) is I M P R E S S I V E ! ! !
Actually, the machine has Two 1060s and a 1050Ti. I was going to install another 1060 but the prices went up first. Imagine if it had one of your 1080s and Two 1060s ;-)
Now I'm down to installing more of the GPUs I already have. I have Two 1050 non-Ti in a Linux machine that would probably work well. That would give it Two 1060s, a 1050Ti, and Two 1050s, making it 5 GPUs. I can get a 3 way switcher and an El Cheapo case to mount the Three GPUs much cheaper than another 1060, and I already have a spare 750 watt Power Supply. We'll see what shakes out.
ID: 1922188
petri33
Volunteer tester

Joined: 6 Jun 02
Posts: 1668
Credit: 623,086,772
RAC: 156
Finland
Message 1922203 - Posted: 2 Mar 2018, 22:00:00 UTC - in response to Message 1922188.  

72 000 with 3 x 1050's (or less) is I M P R E S S I V E ! ! !
Actually, the machine has Two 1060s and a 1050Ti. I was going to install another 1060 but the prices went up first. Imagine if it had one of your 1080s and Two 1060s ;-)
Now I'm down to installing more of the GPUs I already have. I have Two 1050 non-Ti in a Linux machine that would probably work well. That would give it Two 1060s, a 1050Ti, and Two 1050s, making it 5 GPUs. I can get a 3 way switcher and an El Cheapo case to mount the Three GPUs much cheaper than another 1060, and I already have a spare 750 watt Power Supply. We'll see what shakes out.


That would definitely make a WOW on its own. :)
I really like people pushing their hardware limits, as well as whatever else is the limiting factor.

"Explore more! Share the rare!"

--
To overcome Heisenbergs:
"You can't always get what you want / but if you try sometimes you just might find / you get what you need." -- Rolling Stones
ID: 1922203
Chris Adamek
Volunteer tester

Joined: 15 May 99
Posts: 251
Credit: 434,772,072
RAC: 236
United States
Message 1922431 - Posted: 3 Mar 2018, 14:45:06 UTC - in response to Message 1922188.  

72 000 with 3 x 1050's (or less) is I M P R E S S I V E ! ! !
Actually, the machine has Two 1060s and a 1050Ti. I was going to install another 1060 but the prices went up first. Imagine if it had one of your 1080s and Two 1060s ;-)
Now I'm down to installing more of the GPUs I already have. I have Two 1050 non-Ti in a Linux machine that would probably work well. That would give it Two 1060s, a 1050Ti, and Two 1050s, making it 5 GPUs. I can get a 3 way switcher and an El Cheapo case to mount the Three GPUs much cheaper than another 1060, and I already have a spare 750 watt Power Supply. We'll see what shakes out.


Yeah, the way the older MacPros do their power connectors is kinda annoying. I scavenged my optical drive connector for my PCIe riser. Mind you, my machine is on its side, so one of my 750tis is just sitting on top of the drive bay lol.

Don’t hate on the poor old MacPro 6,1. Ha. If I swapped out my CPU with the 12 core variety to match his 10 core, I’d be up in the mid-40k to 50k range. However, if he ever runs more than one wu at a time, I’d be toast. I think my next project though is a thunderbolt enclosure for it to see if I can get the special app/boinc to run on High Sierra. If all else fails I can always just put another ATI card in it, maybe one of those Vega64’s to see how far it can be pushed.
ID: 1922431
TBar
Volunteer tester

Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1922738 - Posted: 4 Mar 2018, 17:42:59 UTC - in response to Message 1922431.  
Last modified: 4 Mar 2018, 18:00:30 UTC

Yeah, the way the older MacPros do their power connectors is kinda annoying. I scavenged my optical drive connector for my PCIe riser. Mind you, my machine is on its side, so one of my 750tis is just sitting on top of the drive bay lol.

Don’t hate on the poor old MacPro 6,1. Ha. If I swapped out my CPU with the 12 core variety to match his 10 core, I’d be up in the mid-40k to 50k range. However, if he ever runs more than one wu at a time, I’d be toast. I think my next project though is a thunderbolt enclosure for it to see if I can get the special app/boinc to run on High Sierra. If all else fails I can always just put another ATI card in it, maybe one of those Vega64’s to see how far it can be pushed.
I noticed Jeyl & Ted have a new Cluster going on, and I've been trying to sort out what's going on without any feedback; apparently he's not responding to PMs anymore. So, do you get the same BOINC Cluster I get when I have two different types of GPUs installed? For me, I can use the Anonymous platform CUDA Special App without any problem; it's the OpenCL that's totally borked and doesn't work on the named GPUs. This is what I get;
Sun Mar 4 12:06:59 2018 | | Starting BOINC client version 7.8.6 for x86_64-apple-darwin
Sun Mar 4 12:07:00 2018 | | NVIDIA GPU 1: GeForce GTX 1060 3GB cannot be used for CUDA or OpenCL computation with CUDA driver 6.5 or later
Sun Mar 4 12:07:00 2018 | | NVIDIA GPU 2: GeForce GTX 1060 3GB cannot be used for CUDA or OpenCL computation with CUDA driver 6.5 or later
Sun Mar 4 12:07:00 2018 | | CUDA: NVIDIA GPU 0: GeForce GTX 1060 3GB (driver version 9.0.214, CUDA version 9.0, compute capability 6.1, 3072MB, 2994MB available, 4228 GFLOPS peak)
Sun Mar 4 12:07:00 2018 | | CUDA: NVIDIA GPU 1: GeForce GTX 1060 3GB (driver version 9.0.214, CUDA version 9.0, compute capability 6.1, 3072MB, 2994MB available, 4082 GFLOPS peak)
Sun Mar 4 12:07:00 2018 | | CUDA: NVIDIA GPU 2: GeForce GTX 1050 Ti (driver version 9.0.214, CUDA version 9.0, compute capability 6.1, 4096MB, 3766MB available, 2255 GFLOPS peak)
Sun Mar 4 12:07:00 2018 | | OpenCL: NVIDIA GPU 1: GeForce GTX 1060 3GB (driver version 10.18.5 378.05.05.25f06, device version OpenCL 1.2, 3072MB, 3072MB available, 1071 GFLOPS peak)
Sun Mar 4 12:07:00 2018 | | OpenCL: NVIDIA GPU 2: GeForce GTX 1050 Ti (driver version 10.18.5 378.05.05.25f06, device version OpenCL 1.2, 4096MB, 3766MB available, 2255 GFLOPS peak)
Sun Mar 4 12:07:00 2018 | | OpenCL: NVIDIA GPU 2: GeForce GTX 1060 3GB (driver version 10.18.5 378.05.05.25f06, device version OpenCL 1.2, 3072MB, 3072MB available, 1159 GFLOPS peak)
Sun Mar 4 12:07:00 2018 | | OpenCL CPU: Intel(R) Xeon(R) CPU E5472 @ 3.00GHz (OpenCL driver vendor: Apple, driver version 1.1, device version OpenCL 1.2)
Sun Mar 4 12:07:06 2018 | | OS: Mac OS X 10.12.6 (Darwin 16.7.0)
Actually, I don't have any trouble running the CUDA Special App even with BOINC crying so loud. OpenCL only works on the 1050Ti though; if I remove the 1050Ti, then the 1060s work with OpenCL. I'm pretty sure that's what Jeyl has going on, and he has installed a couple of GPUs that can't run the Special App, so he's forced to run OpenCL...which doesn't work with his currently installed GPUs.

Oh well.
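
If anyone wants to see what the Apple OpenCL runtime itself reports, independent of BOINC's mapping, a short listing like this does it (just a sketch built against the OpenCL framework, not anything from the BOINC source);

// list_cl_gpus.cpp - minimal OpenCL GPU listing for OSX, to compare the raw
// enumeration order against what BOINC prints at startup.
// Build with:  clang++ list_cl_gpus.cpp -framework OpenCL -o list_cl_gpus
#include <OpenCL/opencl.h>
#include <cstdio>
#include <vector>

int main() {
    cl_uint nplat = 0;
    clGetPlatformIDs(0, NULL, &nplat);
    std::vector<cl_platform_id> plats(nplat);
    clGetPlatformIDs(nplat, plats.data(), NULL);

    for (cl_uint p = 0; p < nplat; ++p) {
        cl_uint ndev = 0;
        clGetDeviceIDs(plats[p], CL_DEVICE_TYPE_GPU, 0, NULL, &ndev);
        if (ndev == 0) continue;
        std::vector<cl_device_id> devs(ndev);
        clGetDeviceIDs(plats[p], CL_DEVICE_TYPE_GPU, ndev, devs.data(), NULL);

        for (cl_uint d = 0; d < ndev; ++d) {
            char name[256] = {0};
            cl_ulong mem = 0;
            clGetDeviceInfo(devs[d], CL_DEVICE_NAME, sizeof(name), name, NULL);
            clGetDeviceInfo(devs[d], CL_DEVICE_GLOBAL_MEM_SIZE, sizeof(mem), &mem, NULL);
            printf("Platform %u, GPU %u: %s (%llu MB)\n",
                   p, d, name, (unsigned long long)(mem >> 20));
        }
    }
    return 0;
}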

Along with telling him to remove those GT 120s so his 680s will work, I did suggest he try the advanced CMDlines on his Vega & 6,1s and then try 2 tasks at once. I'm not getting any response though.
In other news, I decided to try the Cheap 1 to 4 Splitter first, and the one exactly like the extender that works for me is in China. So, it's on a slow boat from China and will take weeks to get here. I can't wait to see what BOINC says when I try running GPUs off a 4-Way Splitter...that should be fun.
ID: 1922738
Chris Adamek
Volunteer tester

Joined: 15 May 99
Posts: 251
Credit: 434,772,072
RAC: 236
United States
Message 1922814 - Posted: 5 Mar 2018, 0:23:55 UTC - in response to Message 1922738.  
Last modified: 5 Mar 2018, 0:25:45 UTC

Hmm, I’ve never tried running OpenCL on different versions of cards on a Mac... Guess I could do a trial run and see what happens.

What does the 1 to 4 splitter need for power input? I’ve got a couple more 750ti’s I’d throw into the mix if that works out for you.

Of the risers I got, I think from somewhere on Amazon, one worked and the other didn’t. Luckily I only needed one at the moment...

Oh, and I’ve been cracking 100k daily RAC the last couple of days on the 5,1. Dunno how long this mix of WUs will stick around though, and of course Tuesday is almost here and I’ll run dry within a couple of hours, sigh...
ID: 1922814
TBar
Volunteer tester

Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1922851 - Posted: 5 Mar 2018, 2:23:40 UTC - in response to Message 1922814.  

Well, I think you'd know if you are having the problem. Besides the startup warning in the Event Log, you get a constant nagware message in the Notices tab; mine reads:
Notice from BOINC
NVIDIA GPU 2: GeForce GTX 1060 3GB cannot be used for CUDA or OpenCL computation with CUDA driver 6.5 or later
Sunday, March 04, 2018, 12:07:00 PM
BOINC is convinced my 1060s are Pre-Fermi GPUs...for some reason.
Sometimes you can change the slots and have it go away; sometimes the number of different GPUs makes the difference. I have this habit of placing the faster GPUs in the fastest slots, meaning the Two 1060s are in the bottom two x16 Slots and the 1050Ti is in the Top x4 Slot. Apparently BOINC hates that kind of Logic. If you aren't seeing the Warning at startup, or the Nag in Notices, you probably don't have the problem.

The Splitter I ordered just has 4 USB data cables that run to the boards with the slots on them. The boards each have a 6-pin power connector on them. My plan, if it works, is to use an empty case and mount the boards as you would a Motherboard so the Slots match the normal GPU mounts. Mount the Power Supply in the normal place with the Board cable connected to a test block so it powers on with the PS switch. I'm going to use an old case I already have; I'll just drill a few more holes to mount the 4 boards on the new standoffs. Then it's a matter of placing the case close enough so the USB cables reach, which shouldn't be a problem. It all depends on whether the switch works.



There are much more expensive switches available, but I'm reluctant to spend much on something that might not work. Those boards work on my Mac, when used one at a time.
ID: 1922851
TBar
Volunteer tester

Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1922877 - Posted: 5 Mar 2018, 8:04:28 UTC
Last modified: 5 Mar 2018, 8:44:15 UTC

So, I bit the bullet and took the dremel to the faceplate of the Short, single-fan 1060 so it would fit in the Top slot. This freed up the Two middle slots, and I connected the single PCIe extender to the 1050Ti. It ran fine for a while and there didn't seem to be much difference in the times. So, I ordered another single extender to use until the Switcher gets here. That should allow 4 GPUs on the Mac, something I haven't done before. I also decided I really didn't need that DVD drive sitting up there; I have a portable anyway if I ever need one. Amazing how much room there is up there without that DVD drive; in fact, there's more than enough room for a 1050Ti, and a power cable is sitting right there waiting for it...hmmm. Interesting.

I moved the 1050Ti back to the middle slot for now and looked at BOINC to see how it liked that;
Mon Mar 5 02:35:34 2018 | | Starting BOINC client version 7.8.6 for x86_64-apple-darwin
Mon Mar 5 02:35:35 2018 | | NVIDIA GPU 2: GeForce GTX 1050 Ti cannot be used for CUDA or OpenCL computation with CUDA driver 6.5 or later
Mon Mar 5 02:35:35 2018 | | CUDA: NVIDIA GPU 0: GeForce GTX 1060 3GB (driver version 9.1.128, CUDA version 9.1, compute capability 6.1, 3072MB, 2994MB available, 4082 GFLOPS peak)
Mon Mar 5 02:35:35 2018 | | CUDA: NVIDIA GPU 1: GeForce GTX 1050 Ti (driver version 9.1.128, CUDA version 9.1, compute capability 6.1, 4096MB, 3456MB available, 2255 GFLOPS peak)
Mon Mar 5 02:35:35 2018 | | CUDA: NVIDIA GPU 2: GeForce GTX 1060 3GB (driver version 9.1.128, CUDA version 9.1, compute capability 6.1, 3072MB, 2994MB available, 4228 GFLOPS peak)
Mon Mar 5 02:35:35 2018 | | OpenCL: NVIDIA GPU 0: GeForce GTX 1060 3GB (driver version 10.18.5 378.05.05.25f06, device version OpenCL 1.2, 3072MB, 2994MB available, 4082 GFLOPS peak)
Mon Mar 5 02:35:35 2018 | | OpenCL: NVIDIA GPU 2: GeForce GTX 1060 3GB (driver version 10.18.5 378.05.05.25f06, device version OpenCL 1.2, 3072MB, 2994MB available, 4228 GFLOPS peak)
Mon Mar 5 02:35:35 2018 | | OpenCL: NVIDIA GPU 2: GeForce GTX 1050 Ti (driver version 10.18.5 378.05.05.25f06, device version OpenCL 1.2, 4096MB, 4096MB available, 729 GFLOPS peak)
Mon Mar 5 02:35:35 2018 | | OpenCL CPU: Intel(R) Xeon(R) CPU E5472 @ 3.00GHz (OpenCL driver vendor: Apple, driver version 1.1, device version OpenCL 1.2)
Mon Mar 5 02:35:40 2018 | | OS: Mac OS X 10.12.6 (Darwin 16.7.0)
Stupid BOINC....

Oh, this is the single Extender that works on My Mac: Mining Card, Riser Card, PCIe (PCI Express) 16x to 1x Riser Adapter
ID: 1922877
TBar
Volunteer tester

Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1923386 - Posted: 8 Mar 2018, 17:35:21 UTC

I've been trying a few different arrangements. I'd say BOINC might work correctly with OpenCL if I place the 1050 Ti in Slot #1, which is the bottom slot. But the current 1060 in the bottom slot has a built-in backplate and can't be used above any other card, as it totally blocks the other card's fan. So, it appears the best I can do is have BOINC Not Work with OpenCL on the 1050 Ti, as that's somewhat better than having BOINC Not Work with the Two 1060s using OpenCL. Fortunately, I Don't Use OpenCL on My Mac, so it's really not a Problem...for Me. I don't even use the AP GPU App, as running OpenCL and CUDA at the same time makes Both GPUs 2 to 3 times slower than when Not mixing OpenCL & CUDA. It's strictly CUDA MBv8 on my machine. You can see the 1050 Ti isn't showing OpenCL on the BOINC list: Coprocessors : [3] NVIDIA GeForce GTX 1050 Ti (4095MB) driver: 5902.08
No OpenCL. It would be nice if this were fixed at some point. It's possible Apple will make another Mac with PCIe slots in the Future.

If you want to see what a Cluster this can create, just look at the Mac, https://setiathome.berkeley.edu/show_host_detail.php?hostid=8460900
That machine has the Original GT 120 and a GTX 680. You can see the 120 isn't even listed, but it's in the stderr.txt. You can also see the 680 is listed as OpenCL 1.0 when it is clearly OpenCL 1.2 in the stderr.txt; the 120 is listed as 1.0. The machine is running Stock, and with a little detailed observation you can see the GT 120 is only running OpenCL, as it should, and the GTX 680 is only running CUDA, as BOINC Apparently thinks the 680 Doesn't have OpenCL. Of course the 680 would work better with the VLARs IF it used OpenCL, but BOINC is BORKED in this machine. The simplest fix would be to try changing slots with the cards, or just Remove the GT 120 so BOINC will use OpenCL on the 680.

Are we having Fun yet?

BTW, if we do go back to only Arecibo tasks for a while, my Mac will probably shoot back to around 100k RAC. That would be nice, for a little while ;-)
ID: 1923386
TBar
Volunteer tester

Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1923809 - Posted: 10 Mar 2018, 19:36:57 UTC
Last modified: 10 Mar 2018, 19:38:19 UTC

It looks as though the recent OSX Systems have been changed to support only 3 GPUs, no matter what arrangements you try. Apparently this changed somewhere around Yosemite. I have a second single Powered Extender, and the Max number you can connect and boot the machine is 3. I found a few others with the same problem, one here: OSX 10.13 BIOS can run a maximum 3 GPUs - Is there a workaround?. I looked at the BIOS and I still have the same one from 2008, which appears to be the current version for the Mac Pro 3,1. Back when this machine was new it was touted as being able to run 4 GPUs out of the box; apparently that has been changed with the recent systems. The next chance I get I'll try booting with 4 Maxwell cards in Yosemite; I'm not holding my breath, though. I suppose if I wanted, I could build a Linux system to boot 5, but I'm not in a hurry with that either. So, for the Macs it appears the best setup is 2 or 3 higher end GPUs, perhaps Three 150 watt GTX 1070s in Sierra for a self-contained setup. I'm pretty sure I have tried 4 GPUs in El Capitan with modified GTX 750s and that didn't work either.
Oh well.
ID: 1923809
TBar
Volunteer tester

Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1924656 - Posted: 15 Mar 2018, 0:40:31 UTC

Well, it seems the problem started with the Maxwell GPUs, which just happened to Appear with Yosemite. I did find a few posts that say the higher end switches work if all the GPUs are the same, and that you can't have more than 2 different types of GPUs total. I dunno; I've got a few different pairs of GPUs and none of them work in the machine's built-in slots. As soon as you connect the 4th GPU it won't boot. The switch I ordered will be here in about a week. Meanwhile, I finally broke down and installed one of those Mac Themes on one of my Ubuntu machines. The thing really looks like a Mac now...kinda runs the same Apps too. I wonder if you can run more than 4 GPUs in my 'new' Mac. As soon as I recover I'll see how the 'new' machine works, but it only has a total of 3 PCIe slots, meaning I need a switch for it. It also only has 4 CPU cores, so no -nobs for it. The Server has seen fit to match it with someone running 6 GPUs with a quad-core CPU. The Windows machine isn't doing so well, https://setiathome.berkeley.edu/show_host_detail.php?hostid=8377417
ID: 1924656
Chris Adamek
Volunteer tester

Joined: 15 May 99
Posts: 251
Credit: 434,772,072
RAC: 236
United States
Message 1924780 - Posted: 15 Mar 2018, 22:08:38 UTC

Just cracked the top 20. Still rising a couple thousand a day, but it is starting to slow down a little, I think.

https://setiathome.berkeley.edu/show_host_detail.php?hostid=8424399
ID: 1924780
TBar
Volunteer tester

Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1924787 - Posted: 15 Mar 2018, 22:43:01 UTC - in response to Message 1924780.  

It usually ends up a little lower than the Top peaks,



Imagine if it really did have Three 1080 Ti instead of just One.
ID: 1924787
Al

Joined: 3 Apr 99
Posts: 1682
Credit: 477,343,364
RAC: 482
United States
Message 1924820 - Posted: 16 Mar 2018, 2:38:41 UTC - in response to Message 1924787.  

Imagine if it really did have Three 1080 Ti instead of just One.

I really need to get off my keister and get that RedHat box finished. Then I could see what 3 properly tuned 1080Ti's could actually do... *sigh*

Sadly, right now my business has to be the top priority, but one of these days (soon!)...

ID: 1924820
Chris Adamek
Volunteer tester

Send message
Joined: 15 May 99
Posts: 251
Credit: 434,772,072
RAC: 236
United States
Message 1924839 - Posted: 16 Mar 2018, 4:46:39 UTC - in response to Message 1924787.  

Found a 1070ti for a good price, and it will be replacing one of the 750ti's sometime next week.

It’s a shame the Titan V’s are more than 3x the cost of the 1080ti’s but only about 2x as fast. Maybe once Petri gets done optimizing it he’ll wring out more power. The space in the old Mac is so limited for these huge cards. I lose a PCIe slot to the 1080ti just because it is so wide... Would highly recommend to anyone else: get the “mini” 1080ti variants for these Macs... That would have saved me a slot and allowed 3 of the big cards in the machine instead of just 2...
ID: 1924839
TBar
Volunteer tester

Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1925335 - Posted: 19 Mar 2018, 14:28:14 UTC

It seems you still have the BOINC 'Feature' with a Non-Apple machine. This is another older Intel board running Yosemite;

17-Mar-2018 23:00:20 [---] Starting BOINC client version 7.8.6 for x86_64-apple-darwin
17-Mar-2018 23:00:22 [---] NVIDIA GPU 1: GeForce GTX 960 cannot be used for CUDA or OpenCL computation with CUDA driver 6.5 or later
17-Mar-2018 23:00:22 [---] CUDA: NVIDIA GPU 0: GeForce GTX 960 (driver version 7.5.30, CUDA version 7.5, compute capability 5.2, 2048MB, 1978MB available, 2748 GFLOPS peak)
17-Mar-2018 23:00:22 [---] CUDA: NVIDIA GPU 1: Graphics Device (driver version 7.5.30, CUDA version 7.5, compute capability 5.2, 2048MB, 1846MB available, 2022 GFLOPS peak)
17-Mar-2018 23:00:22 [---] CUDA: NVIDIA GPU 2: Graphics Device (driver version 7.5.30, CUDA version 7.5, compute capability 5.2, 2048MB, 1998MB available, 2022 GFLOPS peak)
17-Mar-2018 23:00:22 [---] OpenCL: NVIDIA GPU 1: Graphics Device (driver version 10.5.3 346.02.03f14, device version OpenCL 1.2, 2048MB, 1846MB available, 2022 GFLOPS peak)
17-Mar-2018 23:00:22 [---] OpenCL: NVIDIA GPU 1: GeForce GTX 960 (driver version 10.5.3 346.02.03f14, device version OpenCL 1.2, 2048MB, 2048MB available, 1031 GFLOPS peak)
17-Mar-2018 23:00:22 [---] OpenCL: NVIDIA GPU 2: Graphics Device (driver version 10.5.3 346.02.03f14, device version OpenCL 1.2, 2048MB, 1998MB available, 2022 GFLOPS peak)
17-Mar-2018 23:00:22 [---] OpenCL CPU: Intel(R) Core(TM)2 Quad CPU Q6700 @ 2.66GHz (OpenCL driver vendor: Apple, driver version 1.1, device version OpenCL 1.2)
17-Mar-2018 23:00:22 [---] OS: Mac OS X 10.10.5 (Darwin 14.5.0)
17-Mar-2018 23:00:22 [---] Config: run apps at regular priority
17-Mar-2018 23:00:22 [---] Config: report completed tasks immediately
17-Mar-2018 23:00:22 [---] Config: use all coprocessors

Graphics Device is what Yosemite calls a GTX 950. The CUDA Device numbers are correct.
Stupid BOINC, a GTX 960 is NOT a Pre-Fermi GPU. This happens in ALL OS versions that I'm aware of.
If I change the slots, BOINC then insists the GTX 950 is a Pre-Fermi GPU.
Otherwise the 'New' Mac seems to be working OK in Yosemite, which is the Oldest OS that can run the CUDA Special App.
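
For what it's worth, the CUDA runtime itself has no such confusion. A quick check like this (just a sketch, built with the normal CUDA toolkit) prints the real compute capability, and anything 2.0 or higher is Fermi or newer;

// cc_check.cu - quick compute-capability listing via the CUDA runtime, to show
// these cards are Maxwell (compute 5.x) and nowhere near Pre-Fermi (compute 1.x).
// Build with:  nvcc cc_check.cu -o cc_check
#include <cstdio>
#include <cuda_runtime.h>

int main() {
    int count = 0;
    if (cudaGetDeviceCount(&count) != cudaSuccess || count == 0) {
        printf("No CUDA devices found\n");
        return 1;
    }
    for (int i = 0; i < count; ++i) {
        cudaDeviceProp prop;
        cudaGetDeviceProperties(&prop, i);
        printf("GPU %d: %s, compute capability %d.%d, %zu MB\n",
               i, prop.name, prop.major, prop.minor, prop.totalGlobalMem >> 20);
    }
    return 0;
}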

The PCIe switch will be here by the end of the week...
ID: 1925335