I've Built a Couple OSX CUDA Apps...
Author | Message |
---|---|
jason_gee Joined: 24 Nov 06 Posts: 7489 Credit: 91,093,184 RAC: 0 |
Sorry, NV for me is still synonymous with CUDA. Having followed the Vulkan initiative over the last 6 months, I've been constantly surprised at how invested NV is in the open field (despite continuing with the closed platforms it's familiar with). Don't know if AMD has a working Vulkan driver yet, but I've been doing all sorts of cubes & triangles things. Not sure when the OpenCL to SPIR-V frontend will make an appearance, though I have been impressed with the results since NV moved from their own custom compilers to LLVM (the open source Low Level Virtual Machine) since Cuda 3.0. Cuda to SPIR-V would be just as easy AFAICT. "Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to Live By: The Computer Science of Human Decisions. |
Raistmer Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121 |
And we are discussing range limiting for NV, not AMD. So? What is the affected range? |
jason_gee Joined: 24 Nov 06 Posts: 7489 Credit: 91,093,184 RAC: 0 |
I think with Apple, range means whether you can warm your hamburger in a (Mac Pro) slot, and laptop means you are awesome. |
TBar Joined: 22 May 99 Posts: 5204 Credit: 840,779,836 RAC: 2,768 |
Isn't there some way you can pull those stats off the server? That way you can see which OS is giving problems. Sort of like this; http://setiweb.ssl.berkeley.edu/beta/setiathome_v8_x86_64-apple-darwin__opencl_nvidia_SoG_mac.html Is that helpful? There should be some way to look more closely at the results. BTW, I just got another SIGBUS Error, the first since yesterday morning; http://setiathome.berkeley.edu/result.php?resultid=4959511605 WU true angle range is : 2.365899 Sigma 1 Thread call stack limit is: 1k cudaAcc_free() called... cudaAcc_free() running... cudaAcc_free() PulseFind freed... cudaAcc_free() Gaussfit freed... cudaAcc_free() AutoCorrelation freed... 1,2,3,4,5,6,7,8,9,10,10,11,12,cudaAcc_free() DONE. 13 Flopcounter: 16198768654084.421875 Spike count: 12 Autocorr count: 0 Pulse count: 0 Triplet count: 1 Gaussian count: 0 SIGBUS: bus error Crashed executable name: setiathome_x41p_zj_x86_64-apple-darwin_cuda75 Machine type Intel 80486 (64-bit executable) System version: Macintosh OS 10.11.4 build 15E65 The task is Finished and the Correct results are Printed, then it says it crashed. The last couple of crashes have been on Shorties... |
jason_gee Joined: 24 Nov 06 Posts: 7489 Credit: 91,093,184 RAC: 0 |
Wow, a testing pool of 28 hosts ? Now I need to escalate my station from pleb to Peon or something. |
Raistmer Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121 |
Those tables Eric made available always puzzle me. I don't see how they could be put to use, but will look again. And of course, there is another way: ask Eric... or ask volunteers with Macs and Mac knowledge. What would be the preferred way? |
jason_gee Joined: 24 Nov 06 Posts: 7489 Credit: 91,093,184 RAC: 0 |
They are there for people like me, who by nature spot odd patterns. |
Raistmer Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121 |
"Wow, a testing pool of 28 hosts ? Now I need to escalate my station from pleb to Peon or something." LoL, I looked closer... os_version: 15.5.0, 15.4.0, 14.5.0, 13.4.0, 12.6.0. So, we are talking about 5 variants here that require server statistics? Don't make me laugh... Maybe someone with basic Mac knowledge could say which of those 5 variants work OK and which don't? Not too many variants after all... |
jason_gee Joined: 24 Nov 06 Posts: 7489 Credit: 91,093,184 RAC: 0 |
"Wow, a testing pool of 28 hosts ? Now I need to escalate my station from pleb to Peon or something." Exactly. Seems pretty tightly controlled, doesn't it? Maybe they have something there :D |
TBar Joined: 22 May 99 Posts: 5204 Credit: 840,779,836 RAC: 2,768 |
I would say the preferred way would be to find out how to list the Mac hosts in one area then look at the Inconclusive rates. Otherwise you are reduced to making a list from the Wingpeople you run across. I just ran across a couple more; http://setiathome.berkeley.edu/results.php?hostid=7841876 http://setiathome.berkeley.edu/results.php?hostid=7173489 Just look in your Inconclusive list, they are usually there. Here's one that seems to be working; Darwin 14.5.0, http://setiathome.berkeley.edu/results.php?hostid=7452292 |
TBar Joined: 22 May 99 Posts: 5204 Credit: 840,779,836 RAC: 2,768 |
"Agreed, that's a Boinc device enumeration issue which is why on Windows MB Cuda for device specific identification and control, I chose to use PCI Bus and slot number, which at least is afaik the only recommended way that is guaranteed consistent. Bit of a case of someone somewhere not having read something, and making up their own 'best practices'." After the first 'Crash After Finish' a few more came quickly. They seem to be happening with Shorties with that arrangement. I decided to change back to One 950 and Two 750s, BOINC doesn't seem to mind that; 01-Jun-2016 01:44:28 [---] Starting BOINC client version 7.4.36 for x86_64-apple-darwin It doesn't seem to matter, I still get the 'Crash after Finish' with the Apps compiled with the 7.5 ToolKit. The Apps compiled with the 6.5 ToolKit don't produce the Error. Also, the OpenCL Apps Don't produce the 'Crash after Finish', so it seems it's strictly related to the 7.5 ToolKit. I built an nVidia OpenCL App with the r3306 files and found that version still doesn't work with El Capitan 15.4, every task is Inconclusive when run in Darwin 15.4. The tasks don't have that problem when run with the same App in Yosemite 14.5, most of them Validate on the first check. Whatever is bothering the nVidia build and Darwin 15.4 happened before r3306. How far back can you go with the version 8 files, 3200? |
Gianfranco Lizzio Joined: 5 May 99 Posts: 39 Credit: 28,049,113 RAC: 87 |
I have found another host that generates a lot of inconclusive results running Darwin 15.5.0 and the OpenCL App: http://setiathome.berkeley.edu/show_host_detail.php?hostid=7552348 I don't want to believe, I want to know! |
TBar Joined: 22 May 99 Posts: 5204 Credit: 840,779,836 RAC: 2,768 |
"Wow, a testing pool of 28 hosts ? Now I need to escalate my station from pleb to Peon or something." Here you go. None of the nVidia OpenCL builds back to r3257 work correctly in Darwin 15.4. They all seem to be missing at least One Spike count. There are probably other problems as well. I even compiled r3185 which is a Version 7 App. I tested that in 15.4 and it also gave the Wrong results. The correct results for version 7 on this task should be; Spike count: 11, Autocorr count: 2, Pulse count: 3, Triplet count: 2, Gaussian count: 7. Now in Darwin 15.4 the r3185 results are; Spike count: 10, Autocorr count: 2, Pulse count: 3, Triplet count: 2, Gaussian count: 5. Now in 14.5 the r3185 results are; Spike count: 11, Autocorr count: 2, Pulse count: 3, Triplet count: 2, Gaussian count: 7. So, even the older Version 7 App Works in 14.5 But Not 15.4. |
Raistmer Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121 |
So, Darwin 15.4, 15.5. OK, this matches perfectly with what Urs supplied to me yesterday. Will try to get these OS versions excluded. |
TBar Joined: 22 May 99 Posts: 5204 Credit: 840,779,836 RAC: 2,768 |
"IMO Definitely needs some amount of wider testing (amongst people that know what they're doing at least). Probably best to check with Petri with respect to last emailed code how he'd feel about that. He PM'd that he's digging into the pulsefinding (which is what's needed for those Guppis), so not sure if that's still work in progress. I'd imagine he might be fine with marking it 'special' still, or perhaps beta or alpha." Well, the CUDA 8.0 App isn't any better than the CUDA 7.5 App. The run-times are similar on my GTX 950s & 750Ti and it seems to produce just as many, if not More, SIGBUS Errors. I even tried using the 'Baseline' seti.cpp with 8.0 & 7.5 and didn't see any change. The Baseline Apps and the Special Apps compiled with Toolkit 6.5 don't have this problem with SIGBUS Errors After the Results have been printed. The Special CUDA 6.5 app is about 3 to 4 minutes slower than the other 2 on a 30 minute BLC VLAR. This is using the x41p_zi code from the Repository folder; the newer code seems to produce the same SIGBUS Errors with more Inconclusive results. |
jason_gee Joined: 24 Nov 06 Posts: 7489 Credit: 91,093,184 RAC: 0 |
"IMO Definitely needs some amount of wider testing (amongst people that know what they're doing at least). Probably best to check with Petri with respect to last emailed code how he'd feel about that. He PM'd that he's digging into the pulsefinding (which is what's needed for those Guppis), so not sure if that's still work in progress. I'd imagine he might be fine with marking it 'special' still, or perhaps beta or alpha." Yes, Petri definitely indicated to me there was work in progress on the pulsefinds, so some care is needed (alpha after all). Because multiple crash/abort after finish issues are appearing on Linux and Mac, where I still use the standard boincapi, I'm convinced that since the OSes are changing to be more like Windows underneath, I need to make similar tweaks to boincapi to minimise the issues while they figure out the client side. My question is: if I allow my Mac to update tonight/tomorrow, should my baseline build run as is? Would I need to update the web driver? Or would it just break in other ways that demand a rebuild? Obviously demands like that would require a lot of thinking on how/when deployment of Cuda apps could occur. Likely a Lunatics installer like mechanism with a range of detection would be needed, and stock deployment possibilities limited. |
Chris Adamek Joined: 15 May 99 Posts: 251 Credit: 434,772,072 RAC: 236 |
Every OS X update thus far requires a web driver update to use the new nvidia cards. But I believe that only applies to Maxwell and newer cards, as they have never been in a stock Mac computer. Anything older than that works fine without the web driver, I believe. Aside of course from Cuda, which always has to be downloaded initially. Chris |
jason_gee Joined: 24 Nov 06 Posts: 7489 Credit: 91,093,184 RAC: 0 |
"Every OS X update thus far requires a web driver update to use the new nvidia cards. But I believe that only applies to Maxwell and newer cards as they have never been in a stock Mac computer. Anything older than that works fine without the web driver I believe. Aside of course from Cuda which always has to be downloaded initially." Cheers! Ugh: bring on Vulkan over Metal. |
Chris Adamek Joined: 15 May 99 Posts: 251 Credit: 434,772,072 RAC: 236 |
Yeah, it will be interesting to see what, if anything, they have to say about that at the developers conference in a couple of weeks... Not holding my breath though. Chris |
TBar Joined: 22 May 99 Posts: 5204 Credit: 840,779,836 RAC: 2,768 |
"...My question is: if I allow my Mac to update tonight/tomorrow, should my baseline build run as is ? Would I need to update the web driver ? or would it just break in other ways that demand a rebuild ? Obviously demands like that would require a lot of thinking on how/when deployment of Cuda apps could occur. Likely a Lunatics installer like mechanism with a range of detection would be needed, and stock deployment possibilities limited." Well, all my CUDA Apps have worked the same whether I was running 14.5 or 15.5. If you are using the nVidia Video Driver you have to use a different driver Every time the System Build changes, and it will change going from 15.2 to 15.5. Heck, just a Security Update will usually change the System Build number. The CUDA driver is different. The CUDA Driver will Not automatically disable itself after a Build change the way the Video driver will. You will have to go to the System Preferences and update Both the Video Driver and the CUDA Driver after updating to 15.5. If your only display is connected to a card that won't work with the OSX Video driver, you will have to use a different card until you can install the nVidia Video driver. If you have the display connected to a card that works with the OSX driver there isn't a problem. Right now the Maxwell cards will Not work with the OSX video driver; after the Update the Maxwell screen will be Black until you install the nVidia driver and reboot. Cards older than the Maxwells shouldn't have any trouble. I recommend using the Combined Update whenever updating OSX. The COMBO update will install All the Updates since the first release, which should reduce the chances of a SNAFU. You can find the Update here; https://support.apple.com/kb/DL1876 I was just pondering all the wasted time these Crash After Finish Errors are causing. By my calculations I'd be better Off using the OpenCL App on these GUPPIs. 
The OpenCL App is almost as fast as the CUDA 7.5 Apps and Doesn't waste time with Crashes after the task is finished. Of course the Non-GUPPIs are much faster with the CUDA App. It would have solved a few problems if the GUPPIs had been classified as a different App, that way you could have used different Apps with the VLARs. |
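Two of the checks done by eye in this thread could be scripted: comparing the signal counts of the same task between two runs (the r3185 results under Darwin 14.5 and 15.4), and spotting a 'Crash After Finish', where SIGBUS is reported only after the counts have been printed. What follows is a minimal sketch, not an existing SETI@home or BOINC tool. It assumes the counts appear in the task's stderr text as `Name count: N`, as in the results quoted above, and the function names `parse_counts`, `diff_counts` and `crashed_after_finish` are hypothetical.

```python
import re

# Matches fragments such as "Spike count: 12" in a task's stderr text.
COUNT_RE = re.compile(r"(Spike|Autocorr|Pulse|Triplet|Gaussian) count:\s*(\d+)")


def parse_counts(stderr_text):
    """Return {signal_type: count} for every 'X count: N' found in the text."""
    return {name: int(value) for name, value in COUNT_RE.findall(stderr_text)}


def diff_counts(reference, candidate):
    """Return {signal_type: (reference, candidate)} for the counts that differ."""
    return {
        name: (reference[name], candidate.get(name))
        for name in reference
        if candidate.get(name) != reference[name]
    }


def crashed_after_finish(stderr_text):
    """True if SIGBUS is reported after the result counts were already printed."""
    last_count = None
    for match in COUNT_RE.finditer(stderr_text):
        last_count = match
    crash_pos = stderr_text.find("SIGBUS")  # -1 when no crash is reported
    return last_count is not None and crash_pos > last_count.end()


# Counts quoted in the thread for the same task run with r3185.
darwin_14_5 = ("Spike count: 11 Autocorr count: 2 Pulse count: 3 "
               "Triplet count: 2 Gaussian count: 7")
darwin_15_4 = ("Spike count: 10 Autocorr count: 2 Pulse count: 3 "
               "Triplet count: 2 Gaussian count: 5")

mismatches = diff_counts(parse_counts(darwin_14_5), parse_counts(darwin_15_4))
print(mismatches)  # {'Spike': (11, 10), 'Gaussian': (7, 5)}
```

Run over the stderr text of a host's Inconclusive results, something like this would show whether the missing Spike and Gaussian counts line up with a particular Darwin version, which is the pattern reported above for 15.4.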