I've Built a Couple OSX CUDA Apps...

Profile jason_gee
Volunteer developer
Volunteer tester
Avatar

Send message
Joined: 24 Nov 06
Posts: 7489
Credit: 91,093,184
RAC: 0
Australia
Message 1791866 - Posted: 29 May 2016, 22:20:54 UTC - in response to Message 1791854.  
Last modified: 29 May 2016, 22:23:09 UTC

Well those are questions for Petri and Eric (not my decisions).
Sadly I know little about the current other Mac apps, other than mine seems to work fine here, which is a (not very useful) sample size of one. Presuming the GPL requirements are met I don't see the obstacles you're perhaps seeing. Asking them would be IMO polite though.

Side Notes: I appreciate that not everyone works the same way I do, and that's a good thing, but doesn't pressure me to do other people's documentation & packaging or rush my own system or timetable for builds, testing, packages and releases. Thankfully there are some kind and helpful people at Lunatics and CA that help with a lot of that when the time is right for my own stuff. It's that collaborative (and time consuming) effort that makes a 'Release' as opposed to just posting builds on Forums.

Although I have become able to compile Apps that function, I know nothing about producing 'documentation' on code someone else wrote. I only compiled the App, I didn't write the code. My stance is if you want to know about the App you should ask the person that wrote it. I don't see how someone that didn't create the code can be asked about 'documentation'. So if someone asks me about an App, you can find the answer here, https://setisvn.ssl.berkeley.edu/trac/browser/branches/sah_v7_opt
*nods head*


For 'Release', as opposed to test/private builds, for GPL compliance, you would need to do the minimum of update & supply a docs folder in packages similar to:
https://setisvn.ssl.berkeley.edu/trac/browser/branches/sah_v7_opt/Xbranch#docs, adding links to any code changes you had to make to get the build functional (e.g. for petri's code, indicate in the alpha folder where his modified code is available), adding or adding to a file saying "Builds compiled by TBar", would be a reasonable way to brand it (though not required afaik). Doing so in the Stderr would make differentiation easier for people looking at the results, should there be some need to trace the build (many possible scenarios).
"Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions.
ID: 1791866 · Report as offensive
TBar
Volunteer tester

Send message
Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1791881 - Posted: 29 May 2016, 22:49:41 UTC - in response to Message 1791865.  

Also, any similarity with AMD issue on OS X that Chris follows very good? Gaussians missing there - what pattern here (NV) if any?

My guess would be yes. It seems the problems started around the same time on the desktops, the change from Darwin 15.3 to 15.4. Some nVidia machines were working fine with the OpenCL App with 15.3 and then began producing mostly Inconclusives with 15.4. On my machine the ATI 6850 takes almost an hour to complete a blc5 in Yosemite 14.5. After booting into 15.4 the exact same arrangement completes the blc5 in around 42 minutes. Obviously a major change with the OS. I was hoping you might come up with a fix before blocking a number of machines.
ID: 1791881 · Report as offensive
Chris Adamek
Volunteer tester

Send message
Joined: 15 May 99
Posts: 251
Credit: 434,772,072
RAC: 236
United States
Message 1791922 - Posted: 30 May 2016, 1:18:46 UTC - in response to Message 1791881.  

Kinda unrelated, but is documentation all that's missing for releasing the optimized Mac apps you put together Tbar? I'm not sure what all I'd need to do but I'd be happy to help getting those pushed to main. There are lots of 12,16 and 24 core Mac Pros out there that could really help process some guppies through with some better apps. Yours seems to be about twice as fast as the current stock apps. Just curious.

Thanks,

Chris
ID: 1791922 · Report as offensive
Profile jason_gee
Volunteer developer
Volunteer tester
Avatar

Send message
Joined: 24 Nov 06
Posts: 7489
Credit: 91,093,184
RAC: 0
Australia
Message 1791946 - Posted: 30 May 2016, 3:31:53 UTC - in response to Message 1791922.  

AFAICT, that'd be about all that's missing for CPU. Eric's usually particularly careful about the completion of that for Stock CPU, because he apparently gets a call from FFTW library's Matteo Frigo if something's out of order there. Fortunately if there aren't any stock codebase changes, then duplicates of the text files distributed with existing stock CPU should be fine.

Caveat would be that having not provided a stock CPU application directly myself (just small source patches), I'm not clear if Eric/Berkeley prefer to control those builds, performance secondary, for other reasons.

GPUs, reliant on third party development from the start, and decoupled from FFTW, are less reliant on layers of compliance/bureaucracy, and the user-base + combined throughput considerably lower. That isn't to say there isn't still a process for those - making Eric's Job as easy as possible like Raistmer's description of reminders and prodding helps.
"Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions.
ID: 1791946 · Report as offensive
TBar
Volunteer tester

Send message
Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1791971 - Posted: 30 May 2016, 4:21:35 UTC - in response to Message 1791946.  

The CPU apps were compiled from the AKv8 folder, which also has its own Copyright file:
https://setisvn.ssl.berkeley.edu/trac/browser/branches/sah_v7_opt/AKv8/COPYRIGHT
FFTW: Copyright (c) 2003,2006-2011 Matteo Frigo
//       Copyright (c) 2003,2006-2011 Massachusets Institute of Technology
//
// fft8g.[cpp,h]: Copyright (c) 1995-2001 Takya Ooura

// Brook+ and OpenCL parts of code: Copyright (c) 2010-2015 Raistmer

// This program is free software; you can redistribute it and/or modify it
// under the terms of the GNU General Public License as published by the
// Free Software Foundation; either version 2, or (at your option) any later
// version.

None of my Apps have any stock codebase changes, only custom Configure lines. In fact, Chris helped track down the configure flag that helped the AVX App perform better. Considering the Stock Windows & Linux CPU App has AVX I see No reason for it to be left out of the Stock Mac CPU App.
I believe I mentioned that to Eric the last time I recommended the Apps to Beta a few months ago.
Any documentation required for the Apps I've compiled can be found in the Berkeley Repository as that is the location of the code.
ID: 1791971 · Report as offensive
Profile jason_gee
Volunteer developer
Volunteer tester
Avatar

Send message
Joined: 24 Nov 06
Posts: 7489
Credit: 91,093,184
RAC: 0
Australia
Message 1791973 - Posted: 30 May 2016, 4:26:37 UTC - in response to Message 1791971.  
Last modified: 30 May 2016, 4:37:18 UTC

Has Eric released any direct AKv8 derivatives as stock CPU? Historically they may have hardware specific limitations built in that make them less suitable for generic stock. Boinc server doesn't do hardware dispatch unless Eric creates a custom scheduler path for it, which could be one part of explaining why such a build isn't already stock.

[Edit:] That's why Lunatics and Joe Segur, before my time, ported Alex Kan's and others' code into stock CPU with internal dispatch, and Joe later added hand AVX kernels there.
"Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions.
ID: 1791973 · Report as offensive
TBar
Volunteer tester

Send message
Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1791978 - Posted: 30 May 2016, 4:45:25 UTC - in response to Message 1791973.  

The Apps here, http://setiathome.berkeley.edu/apps.php
7.07 (avx) 7.07 (sse41) 7.07 (ssse3)
Were All compiled from the AKv8 folder and performed Very well with the exception of those few AVX LapTops running Lion. Those Machines are Still choking on the Current Stock CPU App and producing Thousands of Invalids a Day. They did work with the ssse3 App though.

The only problem was with those handful of people who seem to only run SETI to watch the ScreenSaver. It appears that way from their RAC anyway. For those few I propose SETI assemble a 50 frame GIF they can watch the few times a year they actually run SETI.
That should solve that.
ID: 1791978 · Report as offensive
Profile jason_gee
Volunteer developer
Volunteer tester
Avatar

Send message
Joined: 24 Nov 06
Posts: 7489
Credit: 91,093,184
RAC: 0
Australia
Message 1791979 - Posted: 30 May 2016, 4:51:23 UTC - in response to Message 1791978.  

Very well, seems like Ducks in a row then :D To my mind if you're providing drop-in-equivalent builds that work better, have the legals in order, and make it easy for Eric, then probably the only blockages are Eric's timetable & priorities. As Raistmer pointed out the last part can be sometimes tough. Depends on how hard they're turning the screws on him at Berkeley.
"Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions.
ID: 1791979 · Report as offensive
TBar
Volunteer tester

Send message
Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1791984 - Posted: 30 May 2016, 5:22:10 UTC - in response to Message 1791979.  

Hmmm, after reporting another 56 tasks I just realized I haven't received a single SIGBUS error since placing the 3rd CUDA card back in the machine early this morning.
Interesting.
ID: 1791984 · Report as offensive
Profile jason_gee
Volunteer developer
Volunteer tester
Avatar

Send message
Joined: 24 Nov 06
Posts: 7489
Credit: 91,093,184
RAC: 0
Australia
Message 1791986 - Posted: 30 May 2016, 5:29:07 UTC - in response to Message 1791984.  

Hmmm, after reporting another 56 tasks I just realized I haven't received a single SIGBUS error since placing the 3rd CUDA card back in the machine early this morning.
Interesting.


Haven't been game to click the update button on the Mac Pro until I have more time. Simply reseating cards can do things interrupt related.
"Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions.
ID: 1791986 · Report as offensive
TBar
Volunteer tester

Send message
Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1791992 - Posted: 30 May 2016, 6:06:03 UTC - in response to Message 1791986.  
Last modified: 30 May 2016, 6:09:20 UTC

I dunno, but I'd suspect it doesn't help with BOINC claiming you have GPUs 1,2,2 instead of GPUs 0,1,2.
Sat May 21 17:25:37 2016 | | OpenCL: NVIDIA GPU 1: GeForce GTX 950 (driver version 10.11.10 346.03.10f01, device version OpenCL 1.2, 2048MB, 2048MB available, 758 GFLOPS peak)
Sat May 21 17:25:37 2016 | | OpenCL: NVIDIA GPU 2: GeForce GTX 750 Ti (driver version 10.11.10 346.03.10f01, device version OpenCL 1.2, 2048MB, 1911MB available, 1421 GFLOPS peak)
Sat May 21 17:25:37 2016 | | OpenCL: NVIDIA GPU 2: GeForce GTX 950 (driver version 10.11.10 346.03.10f01, device version OpenCL 1.2, 2048MB, 2048MB available, 758 GFLOPS peak)

I'll wait another day before making any conclusions.
ID: 1791992 · Report as offensive
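Editorial aside: the duplicated device number in the startup lines quoted above can be spotted mechanically. The following is a minimal sketch only, assuming the log format is exactly as quoted in the post (driver details shortened here); the helper names `reported_devices` and `duplicate_indices` are hypothetical and are not part of BOINC.

```python
import re
from collections import Counter

# Startup lines copied from the post above, with the trailing details shortened.
LOG_LINES = [
    "OpenCL: NVIDIA GPU 1: GeForce GTX 950 (driver version 10.11.10 346.03.10f01)",
    "OpenCL: NVIDIA GPU 2: GeForce GTX 750 Ti (driver version 10.11.10 346.03.10f01)",
    "OpenCL: NVIDIA GPU 2: GeForce GTX 950 (driver version 10.11.10 346.03.10f01)",
]

# Captures vendor, device number, and model from "OpenCL: <vendor> GPU <n>: <model> (".
GPU_LINE = re.compile(r"OpenCL: (\w+) GPU (\d+): (.+?) \(")


def reported_devices(lines):
    """Return a (vendor, index, model) tuple for every GPU line found."""
    devices = []
    for line in lines:
        match = GPU_LINE.search(line)
        if match:
            vendor, index, model = match.groups()
            devices.append((vendor, int(index), model))
    return devices


def duplicate_indices(devices):
    """Return the device numbers that were reported more than once per vendor."""
    counts = Counter((vendor, index) for vendor, index, _ in devices)
    return sorted(index for (_, index), seen in counts.items() if seen > 1)


devices = reported_devices(LOG_LINES)
print(duplicate_indices(devices))
```

A non-empty result means two physical cards were given the same number, so any per-device setting keyed on that number would be ambiguous.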
Profile jason_gee
Volunteer developer
Volunteer tester
Avatar

Send message
Joined: 24 Nov 06
Posts: 7489
Credit: 91,093,184
RAC: 0
Australia
Message 1791996 - Posted: 30 May 2016, 6:23:05 UTC - in response to Message 1791992.  
Last modified: 30 May 2016, 6:23:39 UTC

Agreed, that's a Boinc device enumeration issue. It's why, on Windows MB Cuda, I chose to use PCI bus and slot number for device-specific identification and control; afaik that is the only recommended way that is guaranteed consistent. Bit of a case of someone somewhere not having read something, and making up their own 'best practices'.
"Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions.
ID: 1791996 · Report as offensive
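Editorial aside: the idea of identifying cards by PCI location rather than by enumeration order can be illustrated as follows. This is a sketch under stated assumptions, not the code used in any SETI@home application: the device records and bus ID strings are invented for illustration, and `parse_pci_bus_id` and `stable_order` are hypothetical helpers.

```python
def parse_pci_bus_id(bus_id):
    """Split 'domain:bus:device.function' (hexadecimal fields) into a sortable tuple."""
    domain, bus, dev_fn = bus_id.split(":")
    device, function = dev_fn.split(".")
    return tuple(int(field, 16) for field in (domain, bus, device, function))


def stable_order(devices):
    """Order devices by PCI location, ignoring the runtime's enumeration index."""
    return sorted(devices, key=lambda dev: parse_pci_bus_id(dev["pci_bus_id"]))


# Hypothetical data: the same three cards seen on two boots, where the runtime
# handed out enumeration indices in a different order each time.
BOOT_A = [
    {"index": 0, "pci_bus_id": "0000:05:00.0", "model": "GeForce GTX 950"},
    {"index": 1, "pci_bus_id": "0000:02:00.0", "model": "GeForce GTX 750 Ti"},
    {"index": 2, "pci_bus_id": "0000:0a:00.0", "model": "GeForce GTX 950"},
]
BOOT_B = [
    {"index": 0, "pci_bus_id": "0000:0a:00.0", "model": "GeForce GTX 950"},
    {"index": 1, "pci_bus_id": "0000:05:00.0", "model": "GeForce GTX 950"},
    {"index": 2, "pci_bus_id": "0000:02:00.0", "model": "GeForce GTX 750 Ti"},
]

order_a = [dev["pci_bus_id"] for dev in stable_order(BOOT_A)]
order_b = [dev["pci_bus_id"] for dev in stable_order(BOOT_B)]
print(order_a == order_b)
```

Because the key is the slot a card sits in, two identical models (here the two GTX 950s) stay distinguishable even when their enumeration indices swap.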
Profile Raistmer
Volunteer developer
Volunteer tester
Avatar

Send message
Joined: 16 Jun 01
Posts: 6325
Credit: 106,370,077
RAC: 121
Russia
Message 1792031 - Posted: 30 May 2016, 8:41:44 UTC - in response to Message 1791881.  

Obviously a major change with the OS. I was hoping you might come up with a fix before blocking a number of machines.

Fix to... OS X? If I read it correctly, only a certain range of OS X versions produces invalids, and the latest version works OK again. So excluding that range will help clean out the invalids and inconclusives. I'm not sure anyone will try to produce a workaround for that range. It's quite time-consuming to develop a workaround for semi-broken system code (I suspect broken precision for some trigonometry or the like), and hardly anyone has free time for that.

So it would be good to have a definitive range for exclusion; then the plan class could be corrected.
ID: 1792031 · Report as offensive
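Editorial aside: the range exclusion discussed above amounts to a version comparison. The sketch below is only an illustration and does not show how a BOINC plan class is actually configured; the function names are hypothetical, and the boundary used in the example (Darwin 15.4 onward, with no known upper bound) is an assumption taken from the reports in this thread, not a confirmed range.

```python
def parse_version(text, width=3):
    """Turn '15.4' into (15, 4, 0) so versions compare numerically, not as text."""
    parts = [int(part) for part in text.split(".")]
    parts += [0] * (width - len(parts))  # pad short versions with zeros
    return tuple(parts)


def is_excluded(darwin_version, first_bad, last_bad=None):
    """Return True if darwin_version lies in [first_bad, last_bad].

    last_bad=None means the upper end of the range is not yet known,
    so every version from first_bad onward is excluded.
    """
    version = parse_version(darwin_version)
    if version < parse_version(first_bad):
        return False
    if last_bad is None:
        return True
    return version <= parse_version(last_bad)


# Assumed boundary for illustration: problems reported from Darwin 15.4 onward.
for candidate in ("14.5", "15.3", "15.4", "15.5"):
    print(candidate, is_excluded(candidate, first_bad="15.4"))
```

Comparing integer tuples rather than strings matters once minor versions reach two digits: as text, "15.10" would sort before "15.4".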
TBar
Volunteer tester

Send message
Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1792130 - Posted: 30 May 2016, 15:42:25 UTC - in response to Message 1792031.  
Last modified: 30 May 2016, 15:59:14 UTC

As usual, the solution is not that simple. Since going back to the AMD App r3347 Chris's Mac Pro is again finding Gaussians and has moved up in the standings, http://setiathome.berkeley.edu/top_hosts.php?sort_by=expavg_credit&offset=20
It would seem there was a code change somewhere between r3347 and r3430 that results in the Mac Pros having problems using the newer App in Darwin 15.4 and above. The Current OS is 15.5 and it still has the problem with r3430. Right now Darwin 16.0 is somewhere in the future and hasn't even reached Beta stage yet. There is no guarantee 16.0 will work with r3430 in its final release, which is a long way off.

The nVidia LapTops are a different story, they were having problems long before Darwin 15.4. Most of the Mac nVidia GPUs are in LapTops.
ID: 1792130 · Report as offensive
Profile jason_gee
Volunteer developer
Volunteer tester
Avatar

Send message
Joined: 24 Nov 06
Posts: 7489
Credit: 91,093,184
RAC: 0
Australia
Message 1792136 - Posted: 30 May 2016, 16:04:29 UTC - in response to Message 1792130.  
Last modified: 30 May 2016, 16:06:28 UTC

The nVidia LapTops are a different story, they were having problems long before Darwin 15.4. Most of the Mac nVidia GPUs are in LapTops.


Minor point/question of order. Do these laptops 'have problems'? or do the apps 'have problems on them' ?

My 2009 Mac pro has problems, because it's obnoxiously shiny and ugly looking all fancy in the rubble, but OS and hardware seem to work.
"Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions.
ID: 1792136 · Report as offensive
Profile William
Volunteer tester
Avatar

Send message
Joined: 14 Feb 13
Posts: 2037
Credit: 17,689,662
RAC: 0
Message 1792137 - Posted: 30 May 2016, 16:09:43 UTC

From my POV the problem starts with having to get a driver...
A person who won't read has no advantage over one who can't read. (Mark Twain)
ID: 1792137 · Report as offensive
Profile jason_gee
Volunteer developer
Volunteer tester
Avatar

Send message
Joined: 24 Nov 06
Posts: 7489
Credit: 91,093,184
RAC: 0
Australia
Message 1792141 - Posted: 30 May 2016, 16:17:57 UTC - in response to Message 1792137.  
Last modified: 30 May 2016, 16:19:43 UTC

From my POV the problem starts with having to get a driver...


To some extent I agree. People definitely need to consider driver/OS support [, and compatibility of course,] when purchasing/installing hardware, more than they often do.

A friend of mine paid a fortune for his old creative labs sound card, and he won't let it go. They stopped making drivers for it with Vista (because the driver+hardware model changed), but there is always a hack/workaround.

Depends on how much you value your time, as to how much you're willing to flog a dead horse I suppose.
"Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions.
ID: 1792141 · Report as offensive
TBar
Volunteer tester

Send message
Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1792151 - Posted: 30 May 2016, 16:44:27 UTC - in response to Message 1792137.  

The OpenCL drivers are built into the OS. You can't 'get a driver' in OSX. nVidia offers their own Video driver that is again Custom Built for a particular OS version. Each driver will only work with that OS version. To change the driver you must change the OS version. You can look at the nVidia Video driver yourself and see if there is an OpenCL driver in there, I couldn't find one, http://www.nvidia.com/download/driverResults.aspx/103826/en-us

Now nVidia does offer a CUDA driver, and you must use the correct version of that as well when using CUDA. The CUDA Apps work with those nVidia LapTops rather nicely even though the OpenCL App doesn't. That's sorta why I posted the Mac CUDA Apps over here, http://www.arkayn.us/forum/index.php?topic=191.msg4411#msg4411
Unfortunately SETI has changed to VLARs, which take almost as long with the CUDA App as they do with the OpenCL App on those LapTops. But, the CUDA App produces Valid results on those LapTops whereas the OpenCL App produces mainly Inconclusives. The only drawback is you must manually update the CUDA driver when you update the OS, or else CUDA may stop working.
ID: 1792151 · Report as offensive
Profile jason_gee
Volunteer developer
Volunteer tester
Avatar

Send message
Joined: 24 Nov 06
Posts: 7489
Credit: 91,093,184
RAC: 0
Australia
Message 1792153 - Posted: 30 May 2016, 16:49:00 UTC - in response to Message 1792151.  

Those don't sound like serious serious issues to a Cuda developer intent on producing valid results (i.e. me)
"Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions.
ID: 1792153 · Report as offensive
Profile William
Volunteer tester
Avatar

Send message
Joined: 14 Feb 13
Posts: 2037
Credit: 17,689,662
RAC: 0
Message 1792155 - Posted: 30 May 2016, 16:51:28 UTC

sorry, NV for me is still synonymous with CUDA.
I forget they can do OpenCL as well.
A person who won't read has no advantage over one who can't read. (Mark Twain)
ID: 1792155 · Report as offensive
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.