I've Built a Couple OSX CUDA Apps...

TBar
Volunteer tester
Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1857109 - Posted: 22 Mar 2017, 23:40:43 UTC

Well... I do have a nicely working copy of x41p_zi3k+ for OSX. In fact, it seems to be working slightly better than the Linux version. I'm just finishing testing the CUDA 75 version so it will work with Yosemite. The problem is that it will need the CUDA 7.5 Libraries; there is a noticeable difference between the CUDA 6.5 and 7.5 Libraries with the App. So, unless someone can host the 7.5 Libraries, everyone is going to have to download the 7.5 Toolkit and extract the Libraries. Right now setiathome_x41p_zi3k+_x86_64-apple-darwin_cuda75 is working well in El Capitan with the CUDA 8.0 driver.

jason_gee
Volunteer developer
Volunteer tester
Joined: 24 Nov 06
Posts: 7489
Credit: 91,093,184
RAC: 0
Australia
Message 1857112 - Posted: 22 Mar 2017, 23:51:00 UTC - in response to Message 1857109.  

Can double check the legals and look at hosting the large files on the weekend (compressed or otherwise). Wrestling with work, though I will likely want to compare Win/Linux/Mac performance with the 980 myself in due course.
"Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions.

TBar
Message 1857118 - Posted: 23 Mar 2017, 0:11:02 UTC - in response to Message 1857112.  

I just checked at C.A. and there is a copy of libcufft.7.5.dylib in the Alpha section bundled with the old "Baseline App". I think it has the RT file included. I haven't tested it yet; perhaps that file can be used even if it does have the old App included. The way I remember it, the only problem I had running the Special App in Yosemite with the 7.5 driver was a few Overflows with the 750Ti. I don't have the 750Ti in the Mac now, so I don't expect any problems in Yosemite. I was going to swap over to Yosemite later tonight.

jason_gee
Message 1857127 - Posted: 23 Mar 2017, 0:26:53 UTC - in response to Message 1857118.  

Yes, will be a juggle here as well. At least I preserved my El Capitan install alongside Sierra. Unfortunately I did not preserve the Yosemite install that came on the machine, and don't have installation media for that. Are media available for older versions via Apple? Boot Camp on El Capitan and Sierra seems not to work or be supported on this older Mac Pro, so have been using rEFInd instead (fine). Would not mind having some physical media in case of emergency anyhow (whichever versions).

TBar
Message 1857130 - Posted: 23 Mar 2017, 0:50:40 UTC - in response to Message 1857127.  

The last time I dealt with that situation, it was easy to download the older OS Installer if you were registered as having "purchased" your free copy earlier. Just go to the App Store App and download it from your Purchased tab. If you never downloaded the older Installer back when it was current, it's worse than pulling teeth to get a copy now. That was well over a year ago. I also found they have time limits on the Installers; some of my older downloaded OS Installers don't work anymore, and I had to download a new copy.

jason_gee
Message 1857133 - Posted: 23 Mar 2017, 1:13:56 UTC - in response to Message 1857130.  

Oh well, yeah will cross that bridge if and when I need to (probably won't need to bother). For now I'm surprised that I got the IOMMU cooperating under Linux, so enough juggling for the weekend.

TimeLord04
Volunteer tester
Joined: 9 Mar 06
Posts: 21140
Credit: 33,933,039
RAC: 23
United States
Message 1857418 - Posted: 24 Mar 2017, 13:13:00 UTC

TBar and Jason,

On TonyMacx86.com there were questions about not being able to reinstall older OS X versions that had been successfully downloaded earlier. The answer (for the Hackintosh community) was to set the BIOS date back on the computer to "trick" the OS into installing again. Once the OS is successfully reinstalled, the Hack user just changes the date back to the current, correct date and runs the system as normal. This works for all Hackintoshes, but I don't know if you can do this on a real Mac, since Macs don't have a BIOS...


TL
TimeLord04
Have TARDIS, will travel...
Come along K-9!
Join Calm Chaos

jason_gee
Message 1857438 - Posted: 24 Mar 2017, 15:00:37 UTC - in response to Message 1857418.  

Hmm, interesting theory. Sure, it has EFI rather than BIOS, though setting the system clock should conceivably do the same thing. Most likely I'll end up virtualising both Win10 and OSX (some version), and passing through alternate GPUs at leisure anyway. A lot will depend on what happens with this weekend's round of fiddling. I quite like Sierra and El Capitan (for what they are) and will preserve their native installs. It's really only the development automation where it would be nice to be able to use near-identical setups on each OS simultaneously. That'll hopefully work passing through the 980, we'll see. Worst case on failure will just mean running 3 separate machines with different quirky hardware. Either way, nothing insurmountable.

TBar
Message 1857446 - Posted: 24 Mar 2017, 15:47:08 UTC

That's strange. The CUDA 75 zi3k+ App I built a couple days ago worked fine in Yosemite, but the Inconclusive results started increasing. I switched back to El Capitan and now they are up to 70 instead of the high 40s. I used the same files to build the zi3k+ 75 version as I did the CUDA 80 version, and the CUDA 80 version is set to use the same 7.5 Libraries. I just switched back to the CUDA 80 zi3k+ version to see if the Inconclusive number will go back down. The CUDA 80 version will only work in El Capitan and higher. It'd be nice when all these False overflows it racked up with zi3t1b finish working through the system, I think it still has a couple to go.

jason_gee
Message 1857858 - Posted: 27 Mar 2017, 8:48:32 UTC - in response to Message 1857446.  

The usual delays at my end, though I am running down the Linux/Mac host at the moment for the GPU switchover and passthrough experiments. Will likely end up updating the alpha in svn to Petri's latest supplied sources, at least once the 980 is operational natively in Ubuntu and Sierra and/or El Capitan. Platform cross-comparison and migration to Gradle automation will occur gradually, with a pulsefind reduction patch the likely priority before integration (depending on the inner changes).

TBar
Message 1857891 - Posted: 27 Mar 2017, 11:17:44 UTC - in response to Message 1857858.  

Well, it appears the increase in Inconclusives was a result of the return of the BLCs. Seems everyone's numbers went up, including my 2 Linux machines. So, I decided to give zi3t1e another try even though it produced False Overflows on the Linux machines the last time. I never did get False overflows on the Mac with zi3t1e; it just didn't seem any better than zi3k+ at the time, and I expected False overflows just as with the Linux version. I did make a change in confsettings.cpp this time with the new build of zi3t1e: since the autotune seems to be setting -pfp to the unroll number, I changed the line down near the bottom to pfPeriodsPerLaunch = 8; instead of 128. I don't have a clue if it will make a difference, I just felt I needed to change something from the build that produced the Linux False overflows.

The two 3 GPU machines are running the new build of zi3t1e. It will probably take a couple days to see any real changes, but it does seem most of the Inconclusives are the Normal Unmatched Overflows. I haven't seen any False overflows...yet.

jason_gee
Message 1857898 - Posted: 27 Mar 2017, 11:28:21 UTC - in response to Message 1857891.  

Hmmm, will need to remember to check if the pulsefind launches have a tail on the unroll, in case the number of periods is not divisible by pfPeriodsPerLaunch * unroll. If it's missing a tail, or there is a bug in it, then yes, a lower value would be less likely to cause an issue. Ultimately we'll need to stick some timers around those launches, and aim for < ~20ms or so between synchs. Probably Linux/Mac will adopt the Windows-style driver optimisations (such as kernel fusion in the streams), which means automatic scaling there may become necessary sooner rather than later.
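
To make the tail concern concrete, here is a minimal host-side C++ sketch, not the app's actual launch code. It assumes each launch covers pfPeriodsPerLaunch * unroll periods; the names Launch, planLaunches and coveredPeriods are hypothetical and only illustrate how a leftover "tail" launch keeps every period covered when the total does not divide evenly.

```cpp
#include <cstddef>
#include <vector>

// One simulated launch: the first period it covers and how many it handles.
// (Hypothetical type, for illustration only.)
struct Launch {
    std::size_t firstPeriod;
    std::size_t periodCount;
};

// Split numPeriods into full launches of (periodsPerLaunch * unroll) periods,
// followed by one shorter tail launch when the total is not evenly divisible.
std::vector<Launch> planLaunches(std::size_t numPeriods,
                                 std::size_t periodsPerLaunch,
                                 std::size_t unroll) {
    std::vector<Launch> plan;
    const std::size_t chunk = periodsPerLaunch * unroll;
    if (chunk == 0) {
        return plan;  // nothing sensible to plan
    }
    std::size_t done = 0;
    while (numPeriods - done >= chunk) {
        plan.push_back({done, chunk});
        done += chunk;
    }
    if (done < numPeriods) {
        // The tail: without this branch the last periods are never processed.
        plan.push_back({done, numPeriods - done});
    }
    return plan;
}

// Total periods covered by a plan; equals numPeriods only if the tail is handled.
std::size_t coveredPeriods(const std::vector<Launch>& plan) {
    std::size_t total = 0;
    for (const Launch& l : plan) {
        total += l.periodCount;
    }
    return total;
}
```

With 1000 periods, 8 periods per launch and an unroll of 4, the plan is 31 full launches of 32 periods plus one tail launch of 8. Dropping the tail branch would leave those last 8 periods unprocessed.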
"Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions.

TBar
Message 1857904 - Posted: 27 Mar 2017, 12:33:45 UTC - in response to Message 1857898.  
Last modified: 27 Mar 2017, 13:28:56 UTC

Oh well, first False overflow on the Linux machine, https://setiathome.berkeley.edu/workunit.php?wuid=2482791582
Another Linux machine is having the same trouble with zi3t1e, yesterday it had a few False overflows that turned invalid and disappeared, https://setiathome.berkeley.edu/workunit.php?wuid=2480586800.

My other Linux machine is just getting over the zi3t1e False overflows from a few days ago, it wasn't using the autotune, http://setiathome.berkeley.edu/results.php?hostid=6906726&state=5

Still No False overflows on the Mac with zi3t1e, although this is a strange one, http://setiathome.berkeley.edu/workunit.php?wuid=2482791515 The other machine did Overflow....eventually.

Oops, Finally a False Overflow on the Mac, http://setiathome.berkeley.edu/workunit.php?wuid=2482383846
I suppose it's back to zi3k+ which doesn't produce False Overflows, just lottsa Inconclusives.

jason_gee
Message 1857917 - Posted: 27 Mar 2017, 14:42:39 UTC - in response to Message 1857904.  

Heh, spike overflows at 0 chirp. That's weird, since that's just a copy, FFT and powerspectrum (normally). Perhaps there's a silent failure or incomplete kernel launch. Will give something to poke at early in the piece.

TBar
Message 1857924 - Posted: 27 Mar 2017, 15:19:24 UTC - in response to Message 1857917.  
Last modified: 27 Mar 2017, 15:40:59 UTC

Yes, it seems all of the zi3t1e False overflows are at 0 chirp. This task mentioned earlier has them at differing chirps where zi3t1e had them all at chirp=0, http://setiathome.berkeley.edu/result.php?resultid=5616173398
So, how do we fix that? It does appear zi3t1e produces fewer inconclusives otherwise.

Starting with zi3l I noticed the BLC tasks would run normally as long as they weren't the First task run at BOINC Startup. If the First task at Startup was an Arecibo task, the following BLC would run normally. If the First task at startup was a BLC it would immediately Overflow, and keep overflowing all the BLCs until it found an Arecibo task. After the Arecibo task ran, the following BLC tasks would run normally. Yes, Very strange. It would appear something wasn't starting correctly. That was when I built zi3k+, because, it didn't/doesn't have that problem.

petri33
Volunteer tester
Joined: 6 Jun 02
Posts: 1668
Credit: 623,086,772
RAC: 156
Finland
Message 1857931 - Posted: 27 Mar 2017, 16:05:37 UTC - in response to Message 1857917.  

That is one probable cause. Another is a buffer overflow before the copy.
To overcome Heisenbergs:
"You can't always get what you want / but if you try sometimes you just might find / you get what you need." -- Rolling Stones

petri33
Message 1857932 - Posted: 27 Mar 2017, 16:07:02 UTC - in response to Message 1857924.  

Sounds like an uninitialized buffer/buffer overflow.
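
As a rough illustration of how an uncleared buffer could show up as a false overflow, here is a generic host-side C++ sketch. It is not code from the app: the one-flag-per-bin layout, the 0xCD fill pattern standing in for leftover memory contents, and the names overflows, makeUnclearedBuffer and clearBuffer are all assumptions made for the example.

```cpp
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical result buffer: one flag per bin, nonzero meaning "signal reported".
// A task "overflows" when more than maxSignals bins are flagged.
bool overflows(const std::vector<unsigned char>& flags, std::size_t maxSignals) {
    std::size_t count = 0;
    for (unsigned char f : flags) {
        if (f != 0) {
            ++count;
        }
    }
    return count > maxSignals;
}

// Simulate a buffer that was allocated but never cleared by filling it with
// an arbitrary nonzero pattern (a stand-in for whatever was left in memory).
std::vector<unsigned char> makeUnclearedBuffer(std::size_t bins) {
    return std::vector<unsigned char>(bins, 0xCD);
}

// Zero the buffer before any results are written into it.
void clearBuffer(std::vector<unsigned char>& flags) {
    if (!flags.empty()) {
        std::memset(flags.data(), 0, flags.size());
    }
}
```

In this toy model, an uncleared 64-entry buffer reports far more than an example limit of 30 "signals" before any real work is done, while the same buffer passes once it has been cleared.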

petri33
Message 1857934 - Posted: 27 Mar 2017, 16:08:33 UTC
Last modified: 27 Mar 2017, 16:12:25 UTC

Thank you guys,

You can read the code, and I definitely will over the next couple of days. I'll reserve a stack of paper and a pencil and do some calculations ... and test runs.
You could insert a cudaMemsetAsync(SomeParams_and, streamThatProcessesTheTask) to zero out the result buffers way ahead (like before chirp) on your platforms, since I cannot reproduce the error.

Petri

TBar
Message 1857948 - Posted: 27 Mar 2017, 16:46:14 UTC - in response to Message 1857934.  

Can you give me an exact line and location to paste it? I'll see how it goes. Right now I don't have much else to do, zi3k+ looked good back with mostly Arecibo tasks and running around 1000 pendings to less than 50 Inconclusives. Now it doesn't look so good. Might as well try something different.

petri33
Message 1857957 - Posted: 27 Mar 2017, 17:28:16 UTC - in response to Message 1857948.  

Yes,
you'll just have to wait. This is family night. Tomorrow is the outage (starting at 18:00 here), so then I'll have time to dust my computer and GPUs and the code.

However if you are feeling impatient you can try to find the first call to cudaAcc...dfts() in analyzeFuncs.cpp. Right before or after that.
The parameters to cudaMemsetAsync are in the CUDA documentation, and the size of the reserved memory buffer can be found in cudaAcceleration.cu, where the buffer is allocated using a CUDA device memory allocation function. The size is in bytes, and one float (the short, fast form of a decimal number) takes four bytes (4 chunks of 8 bits, totalling 32 bits).
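
The byte arithmetic above can be sketched on the host in C++. This is a stand-in under stated assumptions: std::memset on a std::vector<float> plays the role of the device buffer, and bytesForFloats and zeroAndCount are hypothetical helper names. The point carries over because cudaMemsetAsync(devPtr, value, count, stream) also takes its count in bytes, not in floats.

```cpp
#include <cstddef>
#include <cstring>
#include <vector>

// Byte count needed to cover numFloats floats (4 bytes each on common platforms).
std::size_t bytesForFloats(std::size_t numFloats) {
    return numFloats * sizeof(float);
}

// Zero the first byteCount bytes of buf (clamped to the buffer size) and
// return how many floats compare equal to 0.0f afterwards.
std::size_t zeroAndCount(std::vector<float>& buf, std::size_t byteCount) {
    const std::size_t total = bytesForFloats(buf.size());
    if (byteCount > total) {
        byteCount = total;
    }
    if (byteCount > 0) {
        std::memset(buf.data(), 0, byteCount);
    }
    std::size_t zeros = 0;
    for (float v : buf) {
        if (v == 0.0f) {
            ++zeros;
        }
    }
    return zeros;
}
```

For a buffer of 1024 floats initialised to 1.0f, passing the element count (1024) instead of the byte count (4096) clears only the first 256 floats, leaving three quarters of the buffer untouched.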

There is a possibility of an error either in allocation size or fetching of the result or another stage of signal finding overwriting. I know that one can get blind to his own errors. That's why I need a third eye or seventh sense.

Petri
 
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.