I've Built a Couple OSX CUDA Apps...

Profile jason_gee
Volunteer developer
Volunteer tester
Joined: 24 Nov 06
Posts: 7489
Credit: 91,093,184
RAC: 0
Australia
Message 1792159 - Posted: 30 May 2016, 16:57:10 UTC - in response to Message 1792155.  
Last modified: 30 May 2016, 16:58:38 UTC

sorry, NV for me is still synonymous with CUDA.
I forget they can do OpenCL as well.


Having followed the Vulkan initiative over the last 6 months, I've been constantly surprised by how invested NV is in the open field (despite continuing with the closed platforms it's known for). Don't know if AMD has a working Vulkan driver yet, but I've been doing all sorts of cubes & triangle things. Not sure when the OpenCL to SPIR-V frontend will make an appearance, though I have been impressed with the results since NV moved from their own custom compilers to LLVM (the open source Low Level Virtual Machine) as of Cuda 3.0. Cuda to SPIR-V would be just as easy AFAICT.
"Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions.
ID: 1792159
Profile Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 6325
Credit: 106,370,077
RAC: 121
Russia
Message 1792195 - Posted: 30 May 2016, 18:22:11 UTC - in response to Message 1792130.  


The nVidia LapTops are a different story, they were having problems long before Darwin 15.4. Most of the Mac nVidia GPUs are in LapTops.

And we discuss range limiting for NV, not AMD. So? What affected range?
ID: 1792195
Profile jason_gee
Volunteer developer
Volunteer tester
Joined: 24 Nov 06
Posts: 7489
Credit: 91,093,184
RAC: 0
Australia
Message 1792208 - Posted: 30 May 2016, 18:42:10 UTC - in response to Message 1792195.  


The nVidia LapTops are a different story, they were having problems long before Darwin 15.4. Most of the Mac nVidia GPUs are in LapTops.

And we discuss range limiting for NV, not AMD. So? What affected range?


I think with Apple, range means whether you can warm your hamburger in a (Mac Pro) slot, and laptop means you are awesome.
ID: 1792208
TBar
Volunteer tester

Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1792222 - Posted: 30 May 2016, 19:01:41 UTC - in response to Message 1792195.  
Last modified: 30 May 2016, 19:11:12 UTC


The nVidia LapTops are a different story, they were having problems long before Darwin 15.4. Most of the Mac nVidia GPUs are in LapTops.

And we discuss range limiting for NV, not AMD. So? What affected range?

Isn't there some way you can pull those stats off the server? That way you can see which OS is giving problems.
Sort of like this: http://setiweb.ssl.berkeley.edu/beta/setiathome_v8_x86_64-apple-darwin__opencl_nvidia_SoG_mac.html
Is that helpful? There should be some way to look more closely at the results.

BTW, I just got another SIGBUS Error. The First since yesterday morning;
http://setiathome.berkeley.edu/result.php?resultid=4959511605
WU true angle range is :  2.365899
Sigma 1
Thread call stack limit is: 1k
cudaAcc_free() called...
cudaAcc_free() running...
cudaAcc_free() PulseFind freed...
cudaAcc_free() Gaussfit freed...
cudaAcc_free() AutoCorrelation freed...
1,2,3,4,5,6,7,8,9,10,10,11,12,cudaAcc_free() DONE.
13
Flopcounter: 16198768654084.421875

Spike count:    12
Autocorr count: 0
Pulse count:    0
Triplet count:  1
Gaussian count: 0
SIGBUS: bus error

Crashed executable name: setiathome_x41p_zj_x86_64-apple-darwin_cuda75
Machine type Intel 80486 (64-bit executable)
System version: Macintosh OS 10.11.4 build 15E65

The task is Finished, the Correct results Printed, then it says it crashed.
The last couple crashes have been on Shorties...
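
The stderr excerpt above shows the pattern under discussion: the result counts are printed and `cudaAcc_free() DONE.` appears before the `SIGBUS: bus error` line. A minimal Python sketch for flagging that pattern in saved stderr text might look like this. The function name and the choice of marker strings are assumptions made for illustration; they are not part of BOINC or the SETI@home apps.

```python
def is_crash_after_finish(stderr_text: str) -> bool:
    """Return True if a SIGBUS is reported after the results were printed."""
    lines = [line.strip() for line in stderr_text.splitlines()]
    # Position of the first crash marker, if there is one.
    crash_at = next(
        (i for i, line in enumerate(lines) if line.startswith("SIGBUS")), None
    )
    if crash_at is None:
        return False
    before_crash = lines[:crash_at]
    # The task counts as "finished" if the summary and the cleanup marker
    # both appear before the crash line.
    summary_seen = any(line.startswith("Spike count:") for line in before_crash)
    cleanup_done = any("cudaAcc_free() DONE." in line for line in before_crash)
    return summary_seen and cleanup_done


sample = """cudaAcc_free() called...
1,2,3,4,5,6,7,8,9,10,10,11,12,cudaAcc_free() DONE.
13
Spike count:    12
Gaussian count: 0
SIGBUS: bus error
"""
print(is_crash_after_finish(sample))
```

Run over a folder of saved task outputs, a check like this would separate "crash after finish" tasks from tasks that crashed mid-run.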
ID: 1792222
Profile jason_gee
Volunteer developer
Volunteer tester
Joined: 24 Nov 06
Posts: 7489
Credit: 91,093,184
RAC: 0
Australia
Message 1792224 - Posted: 30 May 2016, 19:09:07 UTC - in response to Message 1792222.  

Wow, a testing pool of 28 hosts? Now I need to escalate my station from pleb to Peon or something.
ID: 1792224
Profile Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 6325
Credit: 106,370,077
RAC: 121
Russia
Message 1792229 - Posted: 30 May 2016, 19:18:38 UTC - in response to Message 1792222.  
Last modified: 30 May 2016, 19:18:47 UTC


Isn't there some way you can pull those stats off the server? That way you can see which OS is giving problems.
Sort of like this: http://setiweb.ssl.berkeley.edu/beta/setiathome_v8_x86_64-apple-darwin__opencl_nvidia_SoG_mac.html
Is that helpful? There should be some way to look more closely at the results.

Those tables Eric made available always puzzle me. I don't see how they could be put to use, but will look again.
And of course, there is another way: ask Eric... or ask volunteers with Macs and Mac knowledge. Which would be the preferred way?
ID: 1792229
Profile jason_gee
Volunteer developer
Volunteer tester
Joined: 24 Nov 06
Posts: 7489
Credit: 91,093,184
RAC: 0
Australia
Message 1792231 - Posted: 30 May 2016, 19:21:36 UTC - in response to Message 1792229.  


Isn't there some way you can pull those stats off the server? That way you can see which OS is giving problems.
Sort of like this: http://setiweb.ssl.berkeley.edu/beta/setiathome_v8_x86_64-apple-darwin__opencl_nvidia_SoG_mac.html
Is that helpful? There should be some way to look more closely at the results.

Those tables Eric made available always puzzle me. I don't see how they could be put to use, but will look again.
And of course, there is another way: ask Eric... or ask volunteers with Macs and Mac knowledge. Which would be the preferred way?


They are there for people like me, who by nature spot odd patterns.
ID: 1792231
Profile Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 6325
Credit: 106,370,077
RAC: 121
Russia
Message 1792234 - Posted: 30 May 2016, 19:22:27 UTC - in response to Message 1792224.  

Wow, a testing pool of 28 hosts ? Now I need to escalate my station from pleb to Peon or something.

LoL, I looked closer...

os_version
15.5.0
15.4.0
14.5.0
13.4.0
12.6.0

So, we are talking about 5 variants here that require server statistics? Don't make me laugh...

Maybe someone with basic Mac knowledge could say which of those 5 variants work OK and which don't? Not too many variants after all...
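
For readers without a Mac background, the `os_version` values listed above are Darwin kernel versions, not OS X marketing versions. Elsewhere in this thread the BOINC log reports "Mac OS X 10.10.5 (Darwin 14.5.0)" and the crash report pairs OS 10.11.4 with the 15.4 kernel. A small Python sketch of the major-version mapping for the five entries follows; the table covers only the releases that appear here and the helper name is hypothetical.

```python
# Darwin major version -> (OS X version, release name), limited to the
# kernel versions listed in this thread.
DARWIN_TO_OSX = {
    12: ("10.8", "Mountain Lion"),
    13: ("10.9", "Mavericks"),
    14: ("10.10", "Yosemite"),
    15: ("10.11", "El Capitan"),
}


def osx_release(darwin_version: str) -> str:
    """Translate a Darwin version string such as '15.4.0' to an OS X release."""
    major = int(darwin_version.split(".")[0])
    if major not in DARWIN_TO_OSX:
        return f"unknown (Darwin {darwin_version})"
    number, name = DARWIN_TO_OSX[major]
    return f"OS X {number} {name}"


for version in ["15.5.0", "15.4.0", "14.5.0", "13.4.0", "12.6.0"]:
    print(version, "->", osx_release(version))
```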
ID: 1792234
Profile jason_gee
Volunteer developer
Volunteer tester
Joined: 24 Nov 06
Posts: 7489
Credit: 91,093,184
RAC: 0
Australia
Message 1792241 - Posted: 30 May 2016, 19:28:40 UTC - in response to Message 1792234.  

Wow, a testing pool of 28 hosts ? Now I need to escalate my station from pleb to Peon or something.

LoL, I looked closer...

os_version
15.5.0
15.4.0
14.5.0
13.4.0
12.6.0

So, we are talking about 5 variants here that require server statistics? Don't make me laugh...

Maybe someone with basic Mac knowledge could say which of those 5 variants work OK and which don't? Not too many variants after all...


Exactly. Seems pretty tightly controlled, doesn't it? Maybe they have something there :D
ID: 1792241
TBar
Volunteer tester

Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1792246 - Posted: 30 May 2016, 19:35:08 UTC - in response to Message 1792229.  
Last modified: 30 May 2016, 19:42:40 UTC


Isn't there some way you can pull those stats off the server? That way you can see which OS is giving problems.
Sort of like this: http://setiweb.ssl.berkeley.edu/beta/setiathome_v8_x86_64-apple-darwin__opencl_nvidia_SoG_mac.html
Is that helpful? There should be some way to look more closely at the results.

Those tables Eric made available always puzzle me. I don't see how they could be put to use, but will look again.
And of course, there is another way: ask Eric... or ask volunteers with Macs and Mac knowledge. Which would be the preferred way?

I would say the preferred way would be to find out how to list the Mac hosts in one area then look at the Inconclusive rates. Otherwise you are reduced to making a list from the Wingpeople you run across. I just ran across a couple more;
http://setiathome.berkeley.edu/results.php?hostid=7841876
http://setiathome.berkeley.edu/results.php?hostid=7173489

Just look in your Inconclusive list, they are usually there.

Here's one that seems to be working;
Darwin 14.5.0, http://setiathome.berkeley.edu/results.php?hostid=7452292
ID: 1792246
TBar
Volunteer tester

Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1792598 - Posted: 1 Jun 2016, 15:55:23 UTC - in response to Message 1791996.  

Agreed, that's a Boinc device enumeration issue which is why on Windows MB Cuda for device specific identification and control, I chose to use PCI Bus and slot number, which at least is afaik the only recommended way that is guaranteed consistent. Bit of a case of someone somewhere not having read something, and making up their own 'best practices'

After the first 'Crash After Finish' a few more came quickly. They seem to be happening with Shorties with that arrangement. I decided to change back to One 950 and Two 750s, BOINC doesn't seem to mind that;

01-Jun-2016 01:44:28 [---] Starting BOINC client version 7.4.36 for x86_64-apple-darwin
01-Jun-2016 01:44:28 [---] CUDA: NVIDIA GPU 0: Graphics Device (driver version 7.5.29, CUDA version 7.5, compute capability 5.2, 2048MB, 1578MB available, 2022 GFLOPS peak)
01-Jun-2016 01:44:28 [---] CUDA: NVIDIA GPU 1: GeForce GTX 750 Ti (driver version 7.5.29, CUDA version 7.5, compute capability 5.0, 2048MB, 1733MB available, 1421 GFLOPS peak)
01-Jun-2016 01:44:28 [---] CUDA: NVIDIA GPU 2: GeForce GTX 750 Ti (driver version 7.5.29, CUDA version 7.5, compute capability 5.0, 2048MB, 1743MB available, 1388 GFLOPS peak)
01-Jun-2016 01:44:28 [---] OpenCL: NVIDIA GPU 0: Graphics Device (driver version 10.5.2 346.02.03f07, device version OpenCL 1.2, 2048MB, 1578MB available, 2022 GFLOPS peak)
01-Jun-2016 01:44:28 [---] OpenCL: NVIDIA GPU 1: GeForce GTX 750 Ti (driver version 10.5.2 346.02.03f07, device version OpenCL 1.2, 2048MB, 1733MB available, 1421 GFLOPS peak)
01-Jun-2016 01:44:28 [---] OpenCL: NVIDIA GPU 2: GeForce GTX 750 Ti (driver version 10.5.2 346.02.03f07, device version OpenCL 1.2, 2048MB, 1743MB available, 1388 GFLOPS peak)
01-Jun-2016 01:44:28 [---] OpenCL CPU: Intel(R) Xeon(R) CPU E5472 @ 3.00GHz (OpenCL driver vendor: Apple, driver version 1.1, device version OpenCL 1.2)
01-Jun-2016 01:44:28 [---] Processor: 8 GenuineIntel Intel(R) Xeon(R) CPU E5472 @ 3.00GHz [x86 Family 6 Model 23 Stepping 6]
01-Jun-2016 01:44:28 [---] OS: Mac OS X 10.10.5 (Darwin 14.5.0)

It doesn't seem to matter, I still get the 'Crash after Finish' with the Apps compiled with the 7.5 ToolKit. The Apps compiled with the 6.5 ToolKit don't produce the Error. Also, the OpenCL Apps Don't produce the 'Crash after Finish', seems it's strictly related to the 7.5 ToolKit.

I built an nVidia OpenCL App with the r3306 files and found that version still doesn't work with El Capitan 15.4; every task is Inconclusive when run in Darwin 15.4. The tasks don't have that problem when run with the same App in Yosemite 14.5, most of them Validate on the first check. Whatever is bothering the nVidia build and Darwin 15.4 happened before r3306. How far back can you go with the version 8 files, 3200?
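
When swapping cards around as described above, it can help to turn the BOINC startup log into a table so device order can be compared between restarts. The Python sketch below parses only the `CUDA: NVIDIA GPU` lines in the format quoted in this post. The regular expression and the field names are assumptions based on those sample lines, and may not cover other BOINC client versions.

```python
import re

# Pattern written against the "CUDA: NVIDIA GPU n: ..." lines quoted above.
GPU_LINE = re.compile(
    r"CUDA: NVIDIA GPU (?P<index>\d+): (?P<name>.+?) "
    r"\(driver version (?P<driver>[\d.]+), CUDA version (?P<cuda>[\d.]+), "
    r"compute capability (?P<cc>\d+\.\d+), (?P<total_mb>\d+)MB, "
    r"(?P<free_mb>\d+)MB available, (?P<gflops>\d+) GFLOPS peak\)"
)


def parse_cuda_gpus(log_text):
    """Return one dict per CUDA device line found in the log text."""
    gpus = []
    for line in log_text.splitlines():
        match = GPU_LINE.search(line)
        if match:
            gpus.append({
                "index": int(match["index"]),
                "name": match["name"],
                "compute_capability": match["cc"],
                "free_mb": int(match["free_mb"]),
                "gflops": int(match["gflops"]),
            })
    return gpus


log = """\
01-Jun-2016 01:44:28 [---] CUDA: NVIDIA GPU 0: Graphics Device (driver version 7.5.29, CUDA version 7.5, compute capability 5.2, 2048MB, 1578MB available, 2022 GFLOPS peak)
01-Jun-2016 01:44:28 [---] CUDA: NVIDIA GPU 1: GeForce GTX 750 Ti (driver version 7.5.29, CUDA version 7.5, compute capability 5.0, 2048MB, 1733MB available, 1421 GFLOPS peak)
01-Jun-2016 01:44:28 [---] OpenCL: NVIDIA GPU 0: Graphics Device (driver version 10.5.2 346.02.03f07, device version OpenCL 1.2, 2048MB, 1578MB available, 2022 GFLOPS peak)
"""
for gpu in parse_cuda_gpus(log):
    print(gpu)
```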
ID: 1792598
Profile Gianfranco Lizzio
Volunteer tester
Joined: 5 May 99
Posts: 39
Credit: 28,049,113
RAC: 87
Italy
Message 1792605 - Posted: 1 Jun 2016, 16:30:31 UTC

I have found another host that generates a lot of inconclusive results running Darwin 15.5.0 and the OpenCL App

http://setiathome.berkeley.edu/show_host_detail.php?hostid=7552348
I don't want to believe, I want to know!
ID: 1792605
TBar
Volunteer tester

Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1792659 - Posted: 1 Jun 2016, 18:59:30 UTC - in response to Message 1792234.  

Wow, a testing pool of 28 hosts ? Now I need to escalate my station from pleb to Peon or something.

LoL, I looked closer...

os_version
15.5.0
15.4.0
14.5.0
13.4.0
12.6.0

So, we are talking about 5 variants here that require server statistics? Don't make me laugh...

Maybe someone with basic Mac knowledge could say which of those 5 variants work OK and which don't? Not too many variants after all...

Here you go.
None of the nVidia OpenCL builds back to r3257 work correctly in Darwin 15.4. They all seem to be missing at least One Spike count. There are probably other problems as well.
I even compiled r3185 which is a Version 7 App. I tested that in 15.4 and it also gave the Wrong results. The correct results for version 7 on this task should be;
Spike count: 11
Autocorr count: 2
Pulse count: 3
Triplet count: 2
Gaussian count: 7

Now in Darwin 15.4 the r3185 results are;
Spike count: 10
Autocorr count: 2
Pulse count: 3
Triplet count: 2
Gaussian count: 5
Now in 14.5 the r3185 results are;
Spike count: 11
Autocorr count: 2
Pulse count: 3
Triplet count: 2
Gaussian count: 7
So, even the older Version 7 App Works in 14.5 But Not 15.4.
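
The comparison above can be sketched in Python as a quick sanity check: parse the summary lines from two runs of the same task and report which signal types differ. The helper names are hypothetical, and matching counts alone do not guarantee matching results, so this is only a first-pass filter.

```python
import re

COUNT_LINE = re.compile(
    r"^(Spike|Autocorr|Pulse|Triplet|Gaussian) count:\s*(\d+)\s*$"
)


def parse_counts(text):
    """Collect '<Type> count: N' lines into a dict such as {'Spike': 11}."""
    counts = {}
    for line in text.splitlines():
        match = COUNT_LINE.match(line.strip())
        if match:
            counts[match.group(1)] = int(match.group(2))
    return counts


def diff_counts(reference, candidate):
    """Return {type: (reference, candidate)} for every type that differs."""
    return {
        key: (reference[key], candidate.get(key))
        for key in reference
        if candidate.get(key) != reference[key]
    }


# Counts quoted in the post: the expected v7 result and the Darwin 15.4 run.
expected = parse_counts(
    "Spike count: 11\nAutocorr count: 2\nPulse count: 3\n"
    "Triplet count: 2\nGaussian count: 7"
)
darwin_15_4 = parse_counts(
    "Spike count: 10\nAutocorr count: 2\nPulse count: 3\n"
    "Triplet count: 2\nGaussian count: 5"
)
print(diff_counts(expected, darwin_15_4))
```

With the numbers from the post, the Darwin 15.4 run differs from the expected result in the Spike and Gaussian counts, while the 14.5 run matches on every line.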
ID: 1792659
Profile Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 6325
Credit: 106,370,077
RAC: 121
Russia
Message 1792660 - Posted: 1 Jun 2016, 19:14:51 UTC - in response to Message 1792659.  

So, Darwin 15.4, 15.5.
OK, this matches perfectly with what Urs supplied to me yesterday.
Will try to get these OS versions excluded.
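
As an illustration of the kind of exclusion being proposed, the Python sketch below flags hosts whose Darwin version is one of the two reported as affected. It is only a model of the rule. The set of versions comes from this thread, the function name is hypothetical, and it does not show how the project's scheduler actually applies OS restrictions.

```python
# Darwin (major, minor) versions reported in this thread as giving
# inconclusive results with the nVidia OpenCL builds.
AFFECTED_DARWIN = {(15, 4), (15, 5)}


def is_excluded(os_version: str) -> bool:
    """Return True if a Darwin version string like '15.4.0' is affected."""
    parts = os_version.split(".")
    major, minor = int(parts[0]), int(parts[1])
    return (major, minor) in AFFECTED_DARWIN


for version in ["15.5.0", "15.4.0", "14.5.0", "13.4.0", "12.6.0"]:
    print(version, "excluded" if is_excluded(version) else "allowed")
```

Listing the affected versions explicitly, instead of excluding everything from 15.4 upward, keeps the rule limited to what has actually been observed.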
ID: 1792660
TBar
Volunteer tester

Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1793190 - Posted: 3 Jun 2016, 19:39:47 UTC - in response to Message 1791785.  
Last modified: 3 Jun 2016, 20:07:21 UTC

IMO Definitely needs some amount of wider testing (amongst people that know what they're doing at least). Probably best to check with Petri with respect to last emailed code how he'd feel about that. He PM'd that he's digging into the pulsefinding (which is what's needed for those Guppis), so not sure if that's still work in progress. I'd imagine he might be fine with marking it 'special' still, or perhaps beta or alpha.

The problem with making it look generic is that people will run it on pre-cc3.2 hardware no matter how much you warn, so my part of folding the code into baseline with compute capability based dispatch becomes a bit more pressing (along with updating what he's already given me in Berkeley's repo first).

Most likely the final form of next generic stock will be clearer once more is known about the Pascal generation, and Cuda 8's only just out, so just warning that a lot could change very quickly if either of us stumble on some gotchas.

Well, the CUDA 8.0 App isn't any better than the CUDA 7.5 App. The run-times are similar on my GTX 950s & 750Ti and it seems to produce just as many, if not More, SIGBUS Errors. I even tried using the 'Baseline' seti.cpp with 8.0 & 7.5 and didn't see any change. The Baseline Apps and the Special Apps compiled with Toolkit 6.5 don't have this problem with SIGBUS Errors After the Results have been printed. The Special CUDA 6.5 app is about 3 to 4 minutes slower than the other 2 on a 30 minute BLC VLAR. This is using the x41p_zi code from the Repository folder; the newer code seems to produce the same SIGBUS Errors with more Inconclusive results.
ID: 1793190
Profile jason_gee
Volunteer developer
Volunteer tester
Joined: 24 Nov 06
Posts: 7489
Credit: 91,093,184
RAC: 0
Australia
Message 1793245 - Posted: 4 Jun 2016, 2:32:27 UTC - in response to Message 1793190.  

IMO Definitely needs some amount of wider testing (amongst people that know what they're doing at least). Probably best to check with Petri with respect to last emailed code how he'd feel about that. He PM'd that he's digging into the pulsefinding (which is what's needed for those Guppis), so not sure if that's still work in progress. I'd imagine he might be fine with marking it 'special' still, or perhaps beta or alpha.

The problem with making it look generic is that people will run it on pre-cc3.2 hardware no matter how much you warn, so my part of folding the code into baseline with compute capability based dispatch becomes a bit more pressing (along with updating what he's already given me in Berkeley's repo first).

Most likely the final form of next generic stock will be clearer once more is known about the Pascal generation, and Cuda 8's only just out, so just warning that a lot could change very quickly if either of us stumble on some gotchas.

Well, the CUDA 8.0 App isn't any better than the CUDA 7.5 App. The run-times are similar on my GTX 950s & 750Ti and it seems to produce just as many, if not More, SIGBUS Errors. I even tried using the 'Baseline' seti.cpp with 8.0 & 7.5 and didn't see any change. The Baseline Apps and the Special Apps compiled with Toolkit 6.5 don't have this problem with SIGBUS Errors After the Results have been printed. The Special CUDA 6.5 app is about 3 to 4 minutes slower than the other 2 on a 30 minute BLC VLAR. This is using the x41p_zi code from the Repository folder; the newer code seems to produce the same SIGBUS Errors with more Inconclusive results.


Yes, Petri definitely indicated to me there was work in progress on the pulsefinds, so some care is needed (alpha after all). Because multiple crash/abort-after-finish issues are appearing on Linux and Mac, where I still use the standard boincapi, I'm convinced that, since the OSes are changing to be more like Windows underneath, I need to make similar tweaks to boincapi to minimise the issues while they figure out the client side.

My question is: if I allow my Mac to update tonight/tomorrow, should my baseline build run as is? Would I need to update the web driver? Or would it just break in other ways that demand a rebuild? Obviously demands like that would require a lot of thinking on how/when deployment of Cuda apps could occur. Likely a Lunatics-installer-like mechanism with a range of detection would be needed, and stock deployment possibilities limited.
ID: 1793245
Chris Adamek
Volunteer tester

Joined: 15 May 99
Posts: 251
Credit: 434,772,072
RAC: 236
United States
Message 1793264 - Posted: 4 Jun 2016, 4:58:07 UTC - in response to Message 1793245.  

Every OS X update thus far requires a web driver update to use the new nvidia cards. But I believe that only applies to Maxwell and newer cards, as they have never been in a stock Mac computer. Anything older than that works fine without the web driver, I believe. Aside, of course, from Cuda, which always has to be downloaded initially.

Chris
ID: 1793264
Profile jason_gee
Volunteer developer
Volunteer tester
Joined: 24 Nov 06
Posts: 7489
Credit: 91,093,184
RAC: 0
Australia
Message 1793267 - Posted: 4 Jun 2016, 5:20:26 UTC - in response to Message 1793264.  

Every OS X update thus far requires a web driver update to use the new nvidia cards. But I believe that only applies to Maxwell and newer cards, as they have never been in a stock Mac computer. Anything older than that works fine without the web driver, I believe. Aside, of course, from Cuda, which always has to be downloaded initially.

Chris


Cheers! Ugh: bring on Vulkan over Metal
ID: 1793267
Chris Adamek
Volunteer tester

Joined: 15 May 99
Posts: 251
Credit: 434,772,072
RAC: 236
United States
Message 1793409 - Posted: 4 Jun 2016, 16:30:04 UTC - in response to Message 1793267.  

Yeah, it will be interesting to see what, if anything, they have to say about that at the developers conference in a couple of weeks... Not holding my breath though.

Chris
ID: 1793409
TBar
Volunteer tester

Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1793429 - Posted: 4 Jun 2016, 17:33:26 UTC - in response to Message 1793245.  
Last modified: 4 Jun 2016, 17:34:57 UTC

...My question is: if I allow my Mac to update tonight/tomorrow, should my baseline build run as is ? Would I need to update the web driver ? or would it just break in other ways that demand a rebuild ? Obviously demands like that would require a lot of thinking on how/when deployment of Cuda apps could occur. Likely a Lunatics installer like mechanism with a range of detection would be needed, and stock deployment possibilities limited.

Well, all my CUDA Apps have worked the same whether I was running 14.5 or 15.5. If you are using the nVidia Video Driver you have to use a different driver Every time the System Build changes and it will change going from 15.2 to 15.5. Heck, just a Security Update will usually change the System Build number. The CUDA driver is different. The CUDA Driver will Not automatically disable itself after a Build change the way the Video driver will. You will have to go to the System Preferences and update Both the Video Driver and the CUDA Driver after updating to 15.5. If your only display is connected to a card that won't work with the OSX Video driver, you will have to use a different card until you can install the nVidia Video driver. If you have the display connected to a card that works with the OSX driver there isn't a problem. Right now the Maxwell cards will Not work with the OSX video driver, after the Update the Maxwell screen will be Black until you install the nVidia driver and reboot. Cards older than the Maxwells shouldn't have any trouble.

I recommend using the Combined Update whenever updating OSX. The COMBO update will install All the Updates since the first release, which should reduce the chances of a SNAFU. You can find the Update here: https://support.apple.com/kb/DL1876

I was just pondering all the wasted time these Crash After Finish Errors are causing. By my calculations I'd be better Off using the OpenCL App on these GUPPIs. The OpenCL App is almost as fast as the CUDA 7.5 Apps and Doesn't waste time with Crashes after the task is finished. Of course the Non-GUPPIs are much faster with the CUDA App. It would have solved a few problems if the GUPPIs would have been classified as a different App, that way you could have used different Apps with the VLARs.
ID: 1793429
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.