I've Built a Couple OSX CUDA Apps...

Message boards : Number crunching : I've Built a Couple OSX CUDA Apps...

Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 6325
Credit: 106,370,077
RAC: 121
Russia
Message 1829288 - Posted: 9 Nov 2016, 10:27:36 UTC
Last modified: 9 Nov 2016, 10:28:40 UTC

Eric deployed new OpenCL builds for OS X on beta:
Mac OS X/64-bit Intel 8.10 (opencl_ati_mac) 7 Apr 2016, 1:01:54 UTC 17 GigaFLOPS
Mac OS X/64-bit Intel 8.10 (opencl_nvidia_SoG_mac) 7 Apr 2016, 1:01:54 UTC 2 GigaFLOPS
Mac OS X/64-bit Intel 8.11 (cuda42_mac) 6 Aug 2016, 4:12:08 UTC 33 GigaFLOPS
Mac OS X/64-bit Intel 8.11 (cuda75_mac) 6 Aug 2016, 4:12:08 UTC 44 GigaFLOPS
Mac OS X/64-bit Intel 8.19 (opencl_ati5_mac) 8 Nov 2016, 23:03:25 UTC 0 GigaFLOPS
Mac OS X/64-bit Intel 8.19 (opencl_intel_gpu_sah) 8 Nov 2016, 23:03:25 UTC 0 GigaFLOPS
Mac OS X/64-bit Intel 8.19 (opencl_nvidia_mac) 8 Nov 2016, 23:03:25 UTC 0 GigaFLOPS
Mac OS X/64-bit Intel 8.20 (opencl_ati5_SoG_mac) 8 Nov 2016, 23:03:25 UTC 6 GigaFLOPS

But I'm not sure whether the older ones (8.10) should remain or be deprecated. Any comments on that?

BTW, the anticipated date of the CUDA75/42 release is this week. Fingers crossed...
SETI apps news
We're not gonna fight them. We're gonna transcend them.
ID: 1829288
TBar
Volunteer tester
Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1829345 - Posted: 9 Nov 2016, 16:21:37 UTC - in response to Message 1829288.  

The new Apps are experiencing Download errors. I was able to supply the needed DL myself to get the SoG App to work on my machine. It seems All the Apps are having problems with at least one needed file, and, I was never sent the nVidia App at all. If the 8.19 (opencl_ati5_mac) & 8.19 (opencl_nvidia_mac) Apps work with Darwin 11.4.2 there shouldn't be any need for the older Apps.

There is a major problem with the CUDA App as people are Not using the correct CUDA Drivers. It's pretty simple, just go to the Dock, Open System Preferences, then use the CUDA Preference Pane to update to the latest CUDA Driver. It Will Not Work using the Out Dated Driver with a Newer version of OSX. The people using the correct Drivers aren't having any trouble.
ID: 1829345
Richard Haselgrove Project Donor
Volunteer tester
Joined: 4 Jul 99
Posts: 14650
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1829352 - Posted: 9 Nov 2016, 16:44:08 UTC - in response to Message 1829345.  

The new Apps are experiencing Download errors. I was able to supply the needed DL myself ...

Please identify exactly which file is needed but not being downloaded: look at the <app_version> file declarations to identify what the problem is: and tell Eric your findings so that he can correct the problem at source and enable others to test the app as intended. Simply posting here without analysis isn't going to get anything solved.
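One way to run that check, as a minimal sketch only: it assumes the client's state file lists each app version's files as `<file_ref><file_name>` entries inside `<app_version>`, and it works on a made-up sample rather than a real state file. The app name, plan class, file names and the helper `missing_files` are all hypothetical.

```python
import os
import tempfile
import xml.etree.ElementTree as ET

# Hypothetical excerpt; a real state file carries many more fields.
SAMPLE = """
<client_state>
  <app_version>
    <app_name>example_app</app_name>
    <plan_class>example_plan</plan_class>
    <file_ref><file_name>example_app_binary</file_name></file_ref>
    <file_ref><file_name>example_kernels.cl</file_name></file_ref>
  </app_version>
</client_state>
"""

def missing_files(xml_text, project_dir):
    """Return (plan_class, file_name) pairs that are declared but not on disk."""
    root = ET.fromstring(xml_text)
    missing = []
    for app_version in root.iter("app_version"):
        plan = app_version.findtext("plan_class", default="")
        for ref in app_version.iter("file_ref"):
            name = ref.findtext("file_name")
            if name and not os.path.exists(os.path.join(project_dir, name)):
                missing.append((plan, name))
    return missing

# Demo: a temporary "project directory" holding only one of the two files.
with tempfile.TemporaryDirectory() as project_dir:
    open(os.path.join(project_dir, "example_app_binary"), "w").close()
    for plan, name in missing_files(SAMPLE, project_dir):
        print(f"{plan}: declared but not downloaded: {name}")
```

Pointing the same function at the real state file and project directory would give a list of names that could be passed on to Eric, provided the file layout matches the assumption above.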
ID: 1829352
TBar
Volunteer tester
Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1829368 - Posted: 9 Nov 2016, 17:24:02 UTC - in response to Message 1829352.  

It seems Beta is down. The missing file(s) show up in the Tasks labeled as Errors.
For setiathome_8.20_x86_64-apple-darwin__opencl_ati5_SoG_mac the missing file is setiathome-8.20-opencl_ati5_SoG_mac_darwin_README_OPENCL
The setiathome_8.19_x86_64-apple-darwin__opencl_ati5_mac App is missing MultiBeam_Kernels_r3552.cl
The MBv8_8.19r3553_Intel_ssse3_x86_64-apple-darwin App is missing MultiBeam_Kernels_r3553.cl.
I couldn't convince the Server to send MBv8_8.19r3551_NV_ssse3_x86_64-apple-darwin.
ID: 1829368
Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 6325
Credit: 106,370,077
RAC: 121
Russia
Message 1829371 - Posted: 9 Nov 2016, 17:30:32 UTC - in response to Message 1829345.  

If the 8.19 (opencl_ati5_mac) & 8.19 (opencl_nvidia_mac) Apps work with Darwin 11.4.2 there shouldn't be any need for the older Apps.

I see. So before any deprecation we should ensure the new builds work on Darwin 11.4.2.

I'll mail Eric about the download issues.
SETI apps news
We're not gonna fight them. We're gonna transcend them.
ID: 1829371
Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 6325
Credit: 106,370,077
RAC: 121
Russia
Message 1829376 - Posted: 9 Nov 2016, 17:42:30 UTC - in response to Message 1829345.  



There is a major problem with the CUDA App as people are Not using the correct CUDA Drivers. It's pretty simple, just go to the Dock, Open System Preferences, then use the CUDA Preference Pane to update to the latest CUDA Driver. It Will Not Work using the Out Dated Driver with a Newer version of OSX. The people using the correct Drivers aren't having any trouble.


That's why the NV OpenCL build is still needed for OS X hosts, even though it is slower. Otherwise we could lose that computing power altogether.
SETI apps news
We're not gonna fight them. We're gonna transcend them.
ID: 1829376
TBar
Volunteer tester
Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1830561 - Posted: 14 Nov 2016, 17:47:25 UTC

So far it seems 5 out of 6 Apps at Beta are working well. The only exception is the notorious Intel iGPU App running on an iGPU; the same App works quite well on an nVidia GPU. Even the people who can't seem to update the CUDA driver are finding success with the OpenCL App: https://setiweb.ssl.berkeley.edu/beta/results.php?hostid=69829 The nVidia GPUs are working nicely with the iGPU OpenCL App; check these times: https://setiweb.ssl.berkeley.edu/beta/results.php?hostid=80723&offset=60

The ATI/AMD Apps are working well, the nVidia App is working well, it looks promising.
ID: 1830561
jason_gee
Volunteer developer
Volunteer tester
Joined: 24 Nov 06
Posts: 7489
Credit: 91,093,184
RAC: 0
Australia
Message 1831069 - Posted: 17 Nov 2016, 22:50:47 UTC
Last modified: 17 Nov 2016, 22:59:35 UTC

Prompted by Petri's queries about efficiency/GFlops, I requested some numbers from Eric which he graciously supplied. That's to try to make sense of some missing fpops.

It would help to backtrack/correlate with some observations. Would any or all of the following statements about stock CPU apps ring (fully or partially) true?
- The 8.00 and 8.05 (Windows x86, and Linux both 32-bit and 64-bit) apps appear 'reasonably good'
- The 8.03 (darwin/OSX Intel) app is half to two-thirds the performance it 'should be'
- Out of those apps, the x86_64-linux build seems considerably more efficient than the others (for whatever reasons)
- The PPC (8.03) and ARM (8.02) builds aren't particularly slow for the devices they are running on (compared to what these devices are capable of anyway)
"Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions.
ID: 1831069
JLDun
Volunteer tester
Joined: 21 Apr 06
Posts: 573
Credit: 196,101
RAC: 0
United States
Message 1831078 - Posted: 17 Nov 2016, 23:14:50 UTC - in response to Message 1831069.  

{snip}& Arm(8.02) builds aren't particularly slow for the devices they are running on (compared to what these devices are capable of anyway)

I've "gathered", based on personal experience, that the ARM app will run for xx hours, and that CPU time will be within an hour of that (xx minus at most 1 hour) while the device is in use, provided there aren't a lot of restarts. I don't know about slow, but it's not especially inefficient. As for xx, the WUs lately have been in the 25-35 hour range (at most), whereas an Android/x86_64 device will finish in under 13 hours; the CPU seems to be a big influence in this case.
ID: 1831078
jason_gee
Volunteer developer
Volunteer tester
Joined: 24 Nov 06
Posts: 7489
Credit: 91,093,184
RAC: 0
Australia
Message 1831081 - Posted: 17 Nov 2016, 23:24:06 UTC - in response to Message 1831078.  
Last modified: 17 Nov 2016, 23:44:48 UTC

{snip}& Arm(8.02) builds aren't particularly slow for the devices they are running on (compared to what these devices are capable of anyway)

I've "gathered", based on personal experience, that the ARM app will run for xx hours, and that CPU time will be within an hour of that (xx minus at most 1 hour) while the device is in use, provided there aren't a lot of restarts. I don't know about slow, but it's not especially inefficient. As for xx, the WUs lately have been in the 25-35 hour range (at most), whereas an Android/x86_64 device will finish in under 13 hours; the CPU seems to be a big influence in this case.


Good to know, thanks. Never had much luck with the Android variants myself.
If you have fairly steady APRs for these apps, would you say the ARM app's GFlops is roughly 50% of its Boinc Whetstone, while the x86_64 variant's is <20% of its Boinc Whetstone, despite being noticeably more efficient? [Or the other way around, perhaps?]

[Edit:] Looking at some of yours seems to suggest it is indeed the other way around, with APRs of 2*Boinc_Whetstone [for the ARM], if I'm looking right. Will have to look for where APR is calculated (i.e. that's not supposed to happen).
"Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions.
ID: 1831081
JLDun
Volunteer tester
Joined: 21 Apr 06
Posts: 573
Credit: 196,101
RAC: 0
United States
Message 1831093 - Posted: 18 Nov 2016, 0:17:43 UTC - in response to Message 1831081.  
Last modified: 18 Nov 2016, 0:32:27 UTC

Not technically minded, so 'lost in the lingo', but... to point out some specific details:

Current Android Handset:
Host 8100533. What I'm currently using as a phone.

Android/x86_64
Host 8038053. Regardless of usage, (Run Time)-(CPU Usage) is (almost always) under 60 minutes.

[Edit]
Mixed use; used to be my phone handset
Host 7915058.
(Sorry. Had problems editing. I've been sending feedback to Google/Chrome about it.)

The other two Android entries USED to be phone handsets, but I 'ignore' them now.
ID: 1831093
jason_gee
Volunteer developer
Volunteer tester
Joined: 24 Nov 06
Posts: 7489
Credit: 91,093,184
RAC: 0
Australia
Message 1831096 - Posted: 18 Nov 2016, 0:24:56 UTC - in response to Message 1831093.  
Last modified: 18 Nov 2016, 0:50:34 UTC

Not technically minded, so 'lost in the lingo', but... to point out some specific details:

Current Android Handset:
Host 8100533. What I'm currently using as a phone.


Yeah, very strange numbers. Will see what TBar says on the Mac side of things.

your phone:
Measured floating point speed 803.53 million ops/sec (~0.8 GFlops)
Application details, Average processing rates (~1.5-1.6 GFlops)

I'm wondering if they took the neon/vfp Boinc Whetstone code out of the arm/android clients (quite possible).

[Edit:] Looks like no vectorised form of the Whetstone was completed. Though the Android client code comments imply vfp/neon, there isn't any vfp/neon code in it. It just means the GFlops numbers will be 'upside down', like the Intel ones.
"Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions.
ID: 1831096
TBar
Volunteer tester
Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1831153 - Posted: 18 Nov 2016, 6:20:41 UTC - in response to Message 1831096.  
Last modified: 18 Nov 2016, 6:26:29 UTC

I'm not sure what you're looking for. From my experience the current stock CPU App works differently on different CPUs. On the Older CPUs, such as mine, it can be almost half as fast as the optimized App compiled from the AKv8 folder. Seems to be a lot of variance with different CPUs. Mine says 3633.84 million ops/sec and the App section says 25.68 GFLOPS using the SSE41 App. A machine running stock with CPU W3580 @ 3.33GHz is showing 3879.41 million ops/sec and 12.58 GFLOPS.
ID: 1831153
jason_gee
Volunteer developer
Volunteer tester
Joined: 24 Nov 06
Posts: 7489
Credit: 91,093,184
RAC: 0
Australia
Message 1831155 - Posted: 18 Nov 2016, 6:49:51 UTC - in response to Message 1831153.  
Last modified: 18 Nov 2016, 6:55:43 UTC

I'm not sure what you're looking for. From my experience the current stock CPU App works differently on different CPUs. On the Older CPUs, such as mine, it can be almost half as fast as the optimized App compiled from the AKv8 folder. Seems to be a lot of variance with different CPUs. Mine says 3633.84 million ops/sec and the App section says 25.68 GFLOPS using the SSE41 App. A machine running stock with CPU W3580 @ 3.33GHz is showing 3879.41 million ops/sec and 12.58 GFLOPS.


Exactly that ratio thanks. ~3.9 'Device Peak GFlops' (supposed), running with APR 12.58 GFlops Actual. (A discrepancy of >3x, which more or less fills a missing piece of a 3 year old puzzle. 2x in the ARM case.)
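To put numbers on those ratios, here is a quick sketch using only figures quoted in this thread. The 1.55 GFlops value for the ARM phone is an assumed midpoint of the "~1.5-1.6" range given earlier, and the helper name `apr_ratio` is purely illustrative.

```python
# (Whetstone "measured floating point speed" in GFlops, APR in GFlops)
hosts = {
    "W3580 @ 3.33GHz, stock CPU app": (3.87941, 12.58),
    "TBar's host, SSE41 app": (3.63384, 25.68),
    "ARM phone (APR assumed 1.55)": (0.80353, 1.55),
}

def apr_ratio(whetstone_gflops, apr_gflops):
    """How many times larger the APR is than the Whetstone figure."""
    return apr_gflops / whetstone_gflops

for name, (whetstone, apr) in hosts.items():
    print(f"{name}: APR / Whetstone = {apr_ratio(whetstone, apr):.2f}")
```

The stock W3580 case comes out a little above 3x and the ARM case close to 2x, matching the figures above; the optimized SSE41 app sits much higher again.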

Many of the questions I had are now fairly moot, as probing since I posted turned up some things. I've been able to verify from the source (the php of the application details page) that the higher APR number is the 'truer' value, and that the 'Device Peak Flops' is derived from Boinc Whetstone (CPU) and fudge factors.

It has ramifications for scheduling new apps or hosts coming online that have been problematic in specific situations, and I've passed on a suggestion to Eric and Richard, should either consider it worthwhile looking at any deeper.
"Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions.
ID: 1831155
Richard Haselgrove Project Donor
Volunteer tester
Joined: 4 Jul 99
Posts: 14650
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1831175 - Posted: 18 Nov 2016, 11:25:16 UTC - in response to Message 1831155.  

It has ramifications for scheduling new apps or hosts coming online that have been problematic in specific situations, and I've passed on a suggestion to Eric and Richard, should either consider it worthwhile looking at any deeper.

Ack receipt of that email, but it's going to take a while (and much coffee) to get my head back to where we were two years ago. We're only going to get one chance at this, so let's make sure we get it right first time (and get it right for all the other projects, too).
ID: 1831175
jason_gee
Volunteer developer
Volunteer tester
Joined: 24 Nov 06
Posts: 7489
Credit: 91,093,184
RAC: 0
Australia
Message 1831185 - Posted: 18 Nov 2016, 12:26:20 UTC - in response to Message 1831175.  

It has ramifications for scheduling new apps or hosts coming online that have been problematic in specific situations, and I've passed on a suggestion to Eric and Richard, should either consider it worthwhile looking at any deeper.

Ack receipt of that email, but it's going to take a while (and much coffee) to get my head back to where we were two years ago. We're only going to get one chance at this, so let's make sure we get it right first time (and get it right for all the other projects, too).


Yeah, slow, steady and carefully acknowledged. It isn't about credit here. This is correctness of estimates first (which controls the whole scheduling chain). A little odd that the right numbers seem to be there for visual display and not propagated to function. There has been a schism, most likely at a similar point to where past attempts stopped.
"Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions.
ID: 1831185
Richard Haselgrove Project Donor
Volunteer tester
Joined: 4 Jul 99
Posts: 14650
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1831187 - Posted: 18 Nov 2016, 12:49:59 UTC - in response to Message 1831185.  

It has ramifications for scheduling new apps or hosts coming online that have been problematic in specific situations, and I've passed on a suggestion to Eric and Richard, should either consider it worthwhile looking at any deeper.

Ack receipt of that email, but it's going to take a while (and much coffee) to get my head back to where we were two years ago. We're only going to get one chance at this, so let's make sure we get it right first time (and get it right for all the other projects, too).

Yeah, slow, steady and carefully acknowledged. It isn't about credit here. This is correctness of estimates first (which controls the whole scheduling chain). A little odd that the right numbers seem to be there for visual display and not propagated to function. There has been a schism, most likely at a similar point to where past attempts stopped.

In broad terms, I've not seen any problem with runtime estimates, once the two separate onramp stages have been successfully negotiated (the initial conditions stages are complete cobblers, of course).

And that's paying fairly close attention to runtime estimates, both under Anonymous Platform here, and under stock running at other projects - except at projects where, despite protestations, the administrators have acknowledged that "our automated work submission tools" are incapable of adjusting rsc_fpops_est to the {known a priori, deterministic} task performance.
ID: 1831187
jason_gee
Volunteer developer
Volunteer tester
Joined: 24 Nov 06
Posts: 7489
Credit: 91,093,184
RAC: 0
Australia
Message 1831205 - Posted: 18 Nov 2016, 14:10:08 UTC - in response to Message 1831187.  
Last modified: 18 Nov 2016, 14:14:00 UTC

Yeah, you won't see the problem since normalisation fixes that. The discrepancy is purely the two different GFlops Numbers [In Plain sight]. One connected to what you see, and the other connected to the actual backend drive scheduling.
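As a rough illustration of why the two figures matter before normalisation kicks in, the sketch below uses a deliberately simplified model in which an initial runtime estimate is just the task's declared operation count divided by the assumed device speed. The real scheduler applies further corrections that are not modelled here. The `rsc_fpops_est` value is hypothetical; the two speeds are the W3580 figures quoted earlier in the thread.

```python
def estimated_runtime_s(rsc_fpops_est, device_flops):
    """Simplified estimate: declared operations / assumed speed (seconds)."""
    if device_flops <= 0:
        raise ValueError("device_flops must be positive")
    return rsc_fpops_est / device_flops

TASK_FPOPS_EST = 3.0e13      # hypothetical rsc_fpops_est for one task
WHETSTONE_FLOPS = 3.87941e9  # Whetstone-derived figure quoted for the W3580
APR_FLOPS = 12.58e9          # APR shown on the application details page

from_whetstone = estimated_runtime_s(TASK_FPOPS_EST, WHETSTONE_FLOPS)
from_apr = estimated_runtime_s(TASK_FPOPS_EST, APR_FLOPS)

print(f"estimate from Whetstone-derived speed: {from_whetstone:.0f} s")
print(f"estimate from APR:                     {from_apr:.0f} s")
print(f"initial over-estimate factor:          {from_whetstone / from_apr:.2f}x")
```

Under this simplified model the over-estimate factor is just the ratio of the two speeds, so it does not depend on the hypothetical operation count.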
"Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions.
ID: 1831205
Richard Haselgrove Project Donor
Volunteer tester
Joined: 4 Jul 99
Posts: 14650
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1831208 - Posted: 18 Nov 2016, 14:25:53 UTC - in response to Message 1831205.  

Yeah, you won't see the problem since normalisation fixes that. The discrepancy is purely the two different GFlops Numbers [In Plain sight]. One connected to what you see, and the other connected to the actual backend drive scheduling.

OK, I'll finish lunch and head downstairs to code-walk the line numbers in your email. That may take some time...
ID: 1831208
jason_gee
Volunteer developer
Volunteer tester
Joined: 24 Nov 06
Posts: 7489
Credit: 91,093,184
RAC: 0
Australia
Message 1831214 - Posted: 18 Nov 2016, 14:38:09 UTC - in response to Message 1831208.  

Yeah, you won't see the problem since normalisation fixes that. The discrepancy is purely the two different GFlops Numbers [In Plain sight]. One connected to what you see, and the other connected to the actual backend drive scheduling.

OK, I'll finish lunch and head downstairs to code-walk the line numbers in your email. That may take some time...


If you can explain two different GFlops estimates for the same device as anything better than "WTF", then I will owe you even more respect than I already grant you. If you can explain to me why we should deliberately underestimate by a factor of four or more, then that's bonus points.
"Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions.
ID: 1831214
 
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.