I've Built a Couple OSX CUDA Apps...

Message boards : Number crunching : I've Built a Couple OSX CUDA Apps...

Profile jason_gee
Volunteer developer
Volunteer tester
Joined: 24 Nov 06
Posts: 7489
Credit: 91,093,184
RAC: 0
Australia
Message 1757189 - Posted: 17 Jan 2016, 20:27:49 UTC - in response to Message 1757171.  
Last modified: 17 Jan 2016, 20:31:10 UTC

Does the nametool command on the end of my flat makefile script work on that version of OSX ?

[Edit:] example for customisation:
install_name_tool -change @rpath/libcufft.7.5.dylib @executable_path/libcufft.7.5.dylib seti_cudamb

[seti_cudamb being the executable name before renaming to something proper]

Older CUDA versions will need a similar command for libcudart.
"Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions.
ID: 1757189 · Report as offensive
TBar
Volunteer tester

Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1757202 - Posted: 17 Jan 2016, 21:09:09 UTC - in response to Message 1757189.  

Hmmm, it might be best if you looked at the file.
It's over here in the Alpha thread, http://www.arkayn.us/forum/index.php?topic=192.msg4382#msg4382
"nametool command on the end of my flat makefile script", not sure about that one.
ID: 1757202 · Report as offensive
Profile jason_gee
Volunteer developer
Volunteer tester
Joined: 24 Nov 06
Posts: 7489
Credit: 91,093,184
RAC: 0
Australia
Message 1757207 - Posted: 17 Jan 2016, 21:59:46 UTC - in response to Message 1757202.  
Last modified: 17 Jan 2016, 22:10:40 UTC

Hmmm, it might be best if you looked at the file.
It's over here in the Alpha thread, http://www.arkayn.us/forum/index.php?topic=192.msg4382#msg4382
"nametool command on the end of my flat makefile script", not sure about that one.


The install_name_tool command is the one you need. It modifies the executable so that its own location (the origin) is included in the search paths for those libraries. The old Linux-style origin doesn't work: it's OSX's attempt at making the equivalent of DLL injection hacks difficult, i.e. security we don't need/want under BOINC.

Adapting my example:
install_name_tool -change @rpath/libcufft.7.5.dylib @executable_path/libcufft.7.5.dylib seti_cudamb



An older CUDA version (say CUDA 5) that needs the CUDA runtime as well would simply be:
install_name_tool -change @rpath/libcufft.5.0.dylib @executable_path/libcufft.5.0.dylib seti_cudamb
install_name_tool -change @rpath/libcudart.5.0.dylib @executable_path/libcudart.5.0.dylib seti_cudamb


replacing seti_cudamb with your exe name, and checking dependency names are correct.
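
A minimal sketch of the same commands wrapped in a loop, for anyone who has to repeat them across CUDA versions. The function name relink_cuda_libs and the DRY_RUN switch are made up for illustration; only the install_name_tool -change form is taken from the examples above. By default it only prints the commands so they can be checked first.

```shell
#!/bin/sh
# Sketch only: print (or run) the install_name_tool commands that repoint
# CUDA libraries from @rpath to @executable_path.
# relink_cuda_libs and DRY_RUN are hypothetical names, not part of any tool.

DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands (default), 0 = run them

relink_cuda_libs() {
    exe="$1"; ver="$2"; shift 2
    for lib in "$@"; do
        name="lib${lib}.${ver}.dylib"
        if [ "$DRY_RUN" = "1" ]; then
            echo "install_name_tool -change @rpath/$name @executable_path/$name $exe"
        else
            install_name_tool -change "@rpath/$name" "@executable_path/$name" "$exe"
        fi
    done
}

# Example: CUDA 5.0 needs both cufft and cudart rewritten.
relink_cuda_libs seti_cudamb 5.0 cufft cudart
```

Set DRY_RUN=0 on the Mac itself to actually apply the changes, and check the dependency names with otool -L first.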
"Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions.
ID: 1757207 · Report as offensive
TBar
Volunteer tester

Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1757229 - Posted: 18 Jan 2016, 0:44:31 UTC - in response to Message 1757207.  

It doesn't appear to have worked. I ran the cmds in Yosemite with the files in my home folder. It didn't give any errors, and I really have no way of knowing what transpired. I then booted to the last untried OS, El Capitan, and placed the files in the Beta folder. It's using the same BOINC Folder Yosemite was using. The first task to run gave:
<stderr_out>
<![CDATA[
<message>
process got signal 5
</message>
<stderr_txt>
dyld: Library not loaded: @rpath/libcudart.dylib
  Referenced from: /Volumes/Mov1/BOINC/Yosemite/BOINC Data/slots/4/../../projects/setiweb.ssl.berkeley.edu_beta/setiathome_x41zi_x86_64-apple-darwin_cuda42
  Reason: Incompatible library version: setiathome_x41zi_x86_64-apple-darwin_cuda42 requires version 1.1.0 or later, but libcudart.dylib provides version 0.0.0
</stderr_txt>

That's what happens when the App follows the Path to the ToolKit and finds 6.5 libraries there instead of 4.2 libraries. IF I would have had ToolKit 4.2 installed, such as I have in Mountain Lion, everything would have been fine. Now I'm going to have to either replace the libraries in the ToolKit with the 4.2 version Or Remove the ToolKit altogether. Then, it will just say it can't find the libraries even though they are sitting right next to the App.
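
One way to see whether an App still carries @rpath references is to filter the otool -L listing. This is only a sketch: rpath_deps is a made-up helper name, and the canned input line below is an illustrative sample, not output from the actual cuda42 App.

```shell
# Sketch: filter `otool -L` output for dependencies still recorded as @rpath.
# rpath_deps is a hypothetical helper name.
rpath_deps() {
    # Each dependency line starts with the install name, which awk sees as field 1.
    awk '$1 ~ /^@rpath\// { print $1 }'
}

# On a Mac: otool -L ./your_app | rpath_deps
# Demonstration with an illustrative (made-up) listing:
printf '\t@rpath/libcudart.dylib (compatibility version 1.1.0, current version 4.2.0)\n\t/usr/lib/libz.1.dylib (compatibility version 1.0.0, current version 1.2.3)\n' | rpath_deps
```

Any name it prints is a library dyld will still look up through @rpath, which is where the wrong ToolKit version can get picked up.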
ID: 1757229 · Report as offensive
Profile jason_gee
Volunteer developer
Volunteer tester
Joined: 24 Nov 06
Posts: 7489
Credit: 91,093,184
RAC: 0
Australia
Message 1757247 - Posted: 18 Jan 2016, 3:02:49 UTC - in response to Message 1757229.  
Last modified: 18 Jan 2016, 3:03:29 UTC

Then you may have some other entry (possibly internal, or DYLD_PATH or whatever, environment variables no longer being used) misdirecting it. This is controlled via the system plist entries (IIRC). I'm grabbing some rest at the moment after a marathon cleanup session, but will be back on the Mac version after that. I can dig out better detail when I'm back behind that host tonight, as to what I did to get the development environment out of its default confused state.
"Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions.
ID: 1757247 · Report as offensive
TBar
Volunteer tester

Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1757310 - Posted: 18 Jan 2016, 12:35:09 UTC - in response to Message 1757247.  
Last modified: 18 Jan 2016, 12:41:24 UTC

There isn't a problem with the Apps compiled with ToolKit 6.5, I've compiled and tested over a dozen. The problem is with compiling with the older ToolKit and has been noted in the past. If it were an internal problem you would expect the same with the recent repository and Petri builds. I loaded the last Petri build and it fired up without a whimper. Seems his last build does produce far fewer inconclusives and might be a few seconds faster. There's an easy way to distinguish Petri's Mac results: his build uses nearly 100% CPU whereas the repository build uses around 30%.
ID: 1757310 · Report as offensive
Profile jason_gee
Volunteer developer
Volunteer tester
Joined: 24 Nov 06
Posts: 7489
Credit: 91,093,184
RAC: 0
Australia
Message 1757323 - Posted: 18 Jan 2016, 16:35:28 UTC - in response to Message 1757310.  
Last modified: 18 Jan 2016, 16:35:41 UTC

Posting a tweaked copy of your 4.2 exe on CA to try there, to see if it's just your dev environment + internal paths (the tweaked one runs at least as far as looking for a task here, whereas the exe as provided doesn't find the library images).
"Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions.
ID: 1757323 · Report as offensive
TBar
Volunteer tester

Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1757910 - Posted: 21 Jan 2016, 18:49:02 UTC

It all looks good now. All 3 CUDA Apps looking in the setiathome.berkeley.edu folders for the numbered Libraries. Everything is a GO.

TomsMacPro:setiathome.berkeley.edu Tom$ otool -L setiathome_x41zi_x86_64-apple-darwin_cuda42
setiathome_x41zi_x86_64-apple-darwin_cuda42:
	/usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 125.2.11)
	/usr/lib/libz.1.dylib (compatibility version 1.0.0, current version 1.2.3)
	/System/Library/Frameworks/Carbon.framework/Versions/A/Carbon (compatibility version 2.0.0, current version 152.0.0)
	@executable_path/libcudart.4.2.dylib (compatibility version 1.1.0, current version 4.2.0)
	@executable_path/libcufft.4.2.dylib (compatibility version 1.1.0, current version 4.2.0)
	/usr/lib/libstdc++.6.dylib (compatibility version 7.0.0, current version 7.9.0)
TomsMacPro:setiathome.berkeley.edu Tom$ otool -L setiathome_x41zi_x86_64-apple-darwin_cuda65
setiathome_x41zi_x86_64-apple-darwin_cuda65:
	/usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 169.3.0)
	/usr/lib/libz.1.dylib (compatibility version 1.0.0, current version 1.2.5)
	/System/Library/Frameworks/Carbon.framework/Versions/A/Carbon (compatibility version 2.0.0, current version 155.0.0)
	@executable_path/libcudart.6.5.dylib (compatibility version 0.0.0, current version 6.5.14)
	@executable_path/libcufft.6.5.dylib (compatibility version 0.0.0, current version 6.5.14)
	/usr/lib/libstdc++.6.dylib (compatibility version 7.0.0, current version 56.0.0)
TomsMacPro:setiathome.berkeley.edu Tom$ otool -L setiathome_x41p_zi_x86_64-apple-darwin_cuda65
setiathome_x41p_zi_x86_64-apple-darwin_cuda65:
	/usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 169.3.0)
	/usr/lib/libz.1.dylib (compatibility version 1.0.0, current version 1.2.5)
	/System/Library/Frameworks/Carbon.framework/Versions/A/Carbon (compatibility version 2.0.0, current version 155.0.0)
	@executable_path/libcudart.6.5.dylib (compatibility version 0.0.0, current version 6.5.14)
	@executable_path/libcufft.6.5.dylib (compatibility version 0.0.0, current version 6.5.14)
	/usr/lib/libstdc++.6.dylib (compatibility version 7.0.0, current version 56.0.0)

setiathome_x41p_zi_x86_64-apple-darwin_cuda65 is remarkably Fast...
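
For anyone repeating this check, here is a rough sketch that reads an otool -L listing like the ones above and reports any @executable_path library that is not actually sitting in the App's folder. missing_local_libs is a made-up helper name, not something shipped with the Apps.

```shell
# Sketch: report @executable_path libraries that are not present in app_dir.
# missing_local_libs is a hypothetical helper; it reads `otool -L` output on stdin.
missing_local_libs() {
    app_dir="$1"
    # Keep only @executable_path entries and strip the prefix to get the file name.
    awk '$1 ~ /^@executable_path\// { sub(/^@executable_path\//, "", $1); print $1 }' |
    while IFS= read -r lib; do
        [ -e "$app_dir/$lib" ] || echo "missing: $lib"
    done
}

# On a Mac, from the project folder:
#   otool -L setiathome_x41zi_x86_64-apple-darwin_cuda65 | missing_local_libs .
```

No output means every numbered library the App asks for is next to it.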
ID: 1757910 · Report as offensive
Profile jason_gee
Volunteer developer
Volunteer tester
Joined: 24 Nov 06
Posts: 7489
Credit: 91,093,184
RAC: 0
Australia
Message 1757917 - Posted: 21 Jan 2016, 19:15:36 UTC - in response to Message 1757910.  
Last modified: 21 Jan 2016, 19:17:40 UTC

Great! The alpha area is set up as well for staged integration into the generic branch. If baseline v8 can look after itself for a little while (barring I have to get some generic builds to Eric for OSX + Linux), then it'll be fielding petri based alphas/beta/possible-special-releases while I continue on x42 infrastructure. Win+Mac+Linux releases should gradually become more synchronised (especially when I bring gradle in for automated build/test/deployment).
"Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions.
ID: 1757917 · Report as offensive
Profile Gianfranco Lizzio
Volunteer tester
Joined: 5 May 99
Posts: 39
Credit: 28,049,113
RAC: 87
Italy
Message 1758153 - Posted: 22 Jan 2016, 11:12:54 UTC - in response to Message 1757910.  

TBar, the setiathome_x41zi_x86_64-apple-darwin_cuda65 app is 50% faster than the MBv8_8.05r3346_nvidia_ssse3_x86_64-apple-darwin and uses 25% CPU against 33% for the OpenCL app. It's a very good result.
I don't want to believe, I want to know!
ID: 1758153 · Report as offensive
TBar
Volunteer tester

Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1758219 - Posted: 22 Jan 2016, 15:49:13 UTC - in response to Message 1758153.  

TBar, the setiathome_x41zi_x86_64-apple-darwin_cuda65 app is 50% faster than the MBv8_8.05r3346_nvidia_ssse3_x86_64-apple-darwin and uses 25% CPU against 33% for the OpenCL app. It's a very good result.

On my GTX750Ti the shorties are about the same as with OpenCL but the longer tasks are significantly faster with cuda65. Tasks taking around 1100 secs in OpenCL finish in a little under 800 secs in cuda65. The cuda42 App is a little slower but is still faster than OpenCL on the longer tasks with my 750Ti. Testing the cuda42 App in Mountain Lion with my GTS250 gives about the same times as it was receiving in Windows 8.1. The cuda42 App should work with the Pre-Fermi GPUs in Snow Leopard to Mavericks. CUDA 6.5 required by Yosemite Will Not work with the Pre-Fermi GPUs.
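
To put the rough numbers above in the same terms as a "percent faster" figure, here is a small sketch. speedup is a made-up helper name, and 1100 and 800 are the approximate run times quoted above, not measured values.

```shell
# Sketch: turn two run times (seconds) into a speedup factor and time saved.
# speedup is a hypothetical helper name.
speedup() {
    awk -v old="$1" -v new="$2" 'BEGIN {
        printf "speedup %.3fx, %.1f%% less time\n", old / new, (old - new) / old * 100
    }'
}

speedup 1100 800   # prints: speedup 1.375x, 27.3% less time
```

So on the longer tasks this works out to roughly 37% more throughput, or a bit over a quarter less time per task.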
ID: 1758219 · Report as offensive
Profile Gianfranco Lizzio
Volunteer tester
Joined: 5 May 99
Posts: 39
Credit: 28,049,113
RAC: 87
Italy
Message 1758222 - Posted: 22 Jan 2016, 15:56:21 UTC - in response to Message 1758219.  

TBar, the setiathome_x41zi_x86_64-apple-darwin_cuda65 app is 50% faster than the MBv8_8.05r3346_nvidia_ssse3_x86_64-apple-darwin and uses 25% CPU against 33% for the OpenCL app. It's a very good result.

On my GTX750Ti the shorties are about the same as with OpenCL but the longer tasks are significantly faster with cuda65. Tasks taking around 1100 secs in OpenCL finish in a little under 800 secs in cuda65. The cuda42 App is a little slower but is still faster than OpenCL on the longer tasks with my 750Ti. Testing the cuda42 App in Mountain Lion with my GTS250 gives about the same times as it was receiving in Windows 8.1. The cuda42 App should work with the Pre-Fermi GPUs in Snow Leopard to Mavericks. CUDA 6.5 required by Yosemite Will Not work with the Pre-Fermi GPUs.


OK it's the same for me. For high angle range the computing time is the same as the OpenCL app.
I don't want to believe, I want to know!
ID: 1758222 · Report as offensive
Profile Gianfranco Lizzio
Volunteer tester
Joined: 5 May 99
Posts: 39
Credit: 28,049,113
RAC: 87
Italy
Message 1758271 - Posted: 22 Jan 2016, 18:16:31 UTC - in response to Message 1758222.  

Using the computer with the CUDA app is much more fluid, unlike with OpenCL, where I experience slowdowns in the animations on the screen.
Has anyone else noticed these slowdowns using OpenCL?
I don't want to believe, I want to know!
ID: 1758271 · Report as offensive
Profile jason_gee
Volunteer developer
Volunteer tester
Joined: 24 Nov 06
Posts: 7489
Credit: 91,093,184
RAC: 0
Australia
Message 1758276 - Posted: 22 Jan 2016, 18:36:11 UTC - in response to Message 1758271.  
Last modified: 22 Jan 2016, 18:39:10 UTC

Using the computer with the CUDA app is much more fluid, unlike with OpenCL, where I experience slowdowns in the animations on the screen.
Has anyone else noticed these slowdowns using OpenCL?

It's partly by design/engineering, and partly a serendipitous reflection of the different goals of the different developers involved.

The current baseline Cuda app is engineered to tread pretty lightly (support old GPUs, run with multiple instances if needed). Other higher performance builds exist in alpha testing and development, that I'm told start to manifest similar characteristics to the OpenCL builds.

I've been researching ways to combine minimal user impact with high performance for some time (like 5 years). The next generation of Cuda enabled app (x42) will probably self scale and/or allow you to configure it how you want. That's more or less a complete redesign and implementation, being prepared for as soon as the v8 transition settles.
"Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions.
ID: 1758276 · Report as offensive
TBar
Volunteer tester

Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1758663 - Posted: 23 Jan 2016, 14:45:35 UTC

Any other Success stories? The Apps have been downloaded 17 times now, http://www.arkayn.us/forum/index.php?topic=191.msg4411#msg4411
It would be nice to know how it's working, especially in Snow Leopard as I wasn't able to test the 4.2 App there. A reminder about the drivers: if you are using a Pre-Fermi GPU, don't update past Driver 6.0.51. For Fermi and above you can use the latest driver in Mavericks and above, either by going to System Preferences > Other > CUDA or, if you don't have a driver installed, by downloading from http://www.nvidia.com/object/mac-driver-archive.html. You will need at least Driver 6.5.14 if you are running a Fermi in Yosemite.
ID: 1758663 · Report as offensive
Profile jason_gee
Volunteer developer
Volunteer tester
Joined: 24 Nov 06
Posts: 7489
Credit: 91,093,184
RAC: 0
Australia
Message 1758682 - Posted: 23 Jan 2016, 15:22:42 UTC - in response to Message 1758663.  

Haven't had the chance to bring my Mac up for main yet. Should be able to see what they all do sometime today, gather all the notes for a Mac readme, and the other docs, then make Sure Eric gets everything needed. Probably fielding to a few hundred Macs will get more feedback (assuming Eric has a chance to put up either on Beta or here, maybe Tuesday's outage)
"Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions.
ID: 1758682 · Report as offensive
Tom Rinehart
Volunteer tester

Joined: 12 Dec 01
Posts: 113
Credit: 13,255,975
RAC: 6
United States
Message 1758787 - Posted: 23 Jan 2016, 20:47:48 UTC

TBar -

I've been running your new SSE4.1 CPU app (MBv8_8.05r3344_sse41_x86_64-apple-darwin) for a few days. It has been working well. It looks like it is a version 8.05 app, so I suggest updating the app_info.xml file to change <version_num>800</version_num> to <version_num>805</version_num>. It then shows the proper version in Boinc.

I've also been running the ATI GPU app (MBv8_8.4r3323_clGPU_ssse3_x86_64-apple-darwin) and it continues to work well. All WUs validate.

I'm still watching the results of the tests I did last week on the AVX and Intel GPU. The CPU WUs all validated. So far 4 of the 6 GPU WUs have validated. Of these, one was initially validation inconclusive, but it ended up being validated. It must have been that the other computer had an issue. I still have one that is validation inconclusive (waiting to see if it validates) and one that hasn't been validated yet.

- Tom
ID: 1758787 · Report as offensive
Richard Haselgrove Project Donor
Volunteer tester

Joined: 4 Jul 99
Posts: 14650
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1758808 - Posted: 23 Jan 2016, 21:32:11 UTC - in response to Message 1758787.  

It looks like it is a version 8.05 app, so I suggest updating the app_info.xml file to change <version_num>800</version_num> to <version_num>805</version_num>. It then shows the proper version in Boinc.

Please don't. The version numbers (apart from v7, v8) are not significant: Eric's policy is to deploy all new applications at v8.00 (or equivalent) on first promotion from Beta to the Main project - in other words, the version counter is re-set when the application is formally released.

It will greatly help future users transitioning from stock to Anonymous Platform if you follow that convention.

(something I've been telling Lunatics developers for years, but Ageless still got caught out by a Beta version number yesterday)
ID: 1758808 · Report as offensive
TBar
Volunteer tester

Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1759023 - Posted: 24 Jan 2016, 15:15:24 UTC

Yes, I was trying to avoid people creating ghosts as a result of changing the version number. If you Change the Version Number Or Plan Class in the app_info All your tasks in the client_state file that were assigned under the old numbers Will become Ghosts. It's best to just leave the numbers the same under Anonymous platform as the numbers really don't mean much under Anonymous platform. Those numbers are used by the Server to decide which Stock Applications to send the host. Under Anonymous platform you have already made the decision for the Server on which Application will be used. If you change those numbers in your app_info, you had better go through the client_state file and change all the result entries for the existing tasks to the new numbers or you will become haunted. Best to just leave the app_info version numbers/plan_class the same.
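
A quick way to confirm the numbers are unchanged after swapping in a new App is to list them before and after. This is only a sketch: show_app_versions is a made-up helper name, and it assumes each <version_num> and <plan_class> element sits on its own line in app_info.xml. It only reads the file and changes nothing.

```shell
# Sketch: list the <version_num> and <plan_class> values in an app_info.xml
# so they can be compared before and after swapping in a new app.
# show_app_versions is a hypothetical helper name; it does not modify the file.
show_app_versions() {
    sed -n -e 's/.*<version_num>\([^<]*\)<\/version_num>.*/version_num=\1/p' \
           -e 's/.*<plan_class>\([^<]*\)<\/plan_class>.*/plan_class=\1/p' "$1"
}

# Usage (from the project folder): show_app_versions app_info.xml
```

If the two listings differ, the existing tasks in client_state are the ones at risk of becoming Ghosts.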

As for the Intel iGPU, it appears hopeless. Just as with the version 7 version it appears to work fine on some machines while not so fine on others. I'm seeing a number of inconclusives against the Intel iGPU from both platforms. The clue was when the same generic OpenCL App that worked well on AMDs & NVs didn't work so well on the Intel iGPU. Only time and better drivers will solve that problem.

The CUDA situation looks much better, even Petri's Super App is giving far fewer inconclusives. When that App makes it to Main the nVidias will have a large advantage in MB tasks. I would suggest people start acquiring NV cards with Compute Capabilities of 3.2 or higher and at least 2 GB of memory so they can use the App. It seems to work well on my 750Ti with CC 5.0 (see the Compute Capability Table).
ID: 1759023 · Report as offensive
TBar
Volunteer tester

Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1759875 - Posted: 28 Jan 2016, 3:24:16 UTC

It appears we have another Success story in the making. A laptop(?) with a NV GT650m going from over 1 hour on an AR 0.44 task using OpenCL to a mere ~35 minutes with the cuda65 App. Seems the cuda App is also going straight to Valid and not stopping at Inconclusive as well, http://setiathome.berkeley.edu/results.php?hostid=7366840&offset=100
Nice.
ID: 1759875 · Report as offensive
 
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.