I've Built a Couple OSX CUDA Apps...

TBar
Volunteer tester
Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1770557 - Posted: 9 Mar 2016, 15:13:44 UTC - in response to Message 1770538.  

The only way I could compile with Xcode 7.2.1 was to remove everything between lines 164 & 304. Not sure about the effects that would have, except it stopped the errors.
https://setisvn.ssl.berkeley.edu/trac/browser/branches/sah_v7_opt/Xbranch/client/alpha/PetriR_raw/cuda/cudaAcceleration.h

I suppose I could let it run for a while and see if it helps with the inconclusives. In testing it uses just as much CPU as the other version, nearly 100% on my Mac.
ID: 1770557
TBar
Volunteer tester
Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1770569 - Posted: 9 Mar 2016, 17:37:20 UTC

It looks as though the same 'crash after finish' error is still there. This is about the 6th one I've seen. It seems to happen with all the builds. The task is finished, the results match the wingpeople's, and the runtime is normal for that AR. Yet it reports a crash:
http://setiathome.berkeley.edu/result.php?resultid=4779898044
Spike count:    5
Autocorr count: 0
Pulse count:    0
Triplet count:  2
Gaussian count: 0
SIGBUS: bus error

Crashed executable name: setiathome_x41p_zi_x86_64-apple-darwin_cuda75
Machine type Intel 80486 (64-bit executable)
System version: Macintosh OS 10.10.5 build 14F1605
Wed Mar  9 12:19:46 2016

atos cannot load symbols for the file setiathome_x41p_zi_x86_64-apple-darwin_cuda75 for architecture x86_64.
0   setiathome_x41p_zi_x86_64-apple-darwin_cuda75 0x0000000108cf8e88  
SIGPIPE: write on a pipe with no reader
1   setiathome_x41p_zi_x86_64-apple-darwin_cuda75 0x0000000108ce7e66  
SIGPIPE: write on a pipe with no reader
2   libsystem_platform.dylib            0x00007fff9746ff1a  
SIGPIPE: write on a pipe with no reader
3   libsystem_malloc.dylib              0x00007fff8fcd9b1d  
SIGPIPE: write on a pipe with no reader
4   setiathome_x41p_zi_x86_64-apple-darwin_cuda75 0x0000000108b735b7  
SIGPIPE: write on a pipe with no reader
5   setiathome_x41p_zi_x86_64-apple-darwin_cuda75 0x0000000108b833f2  
SIGPIPE: write on a pipe with no reader
6   setiathome_x41p_zi_x86_64-apple-darwin_cuda75 0x0000000108b89f46  
SIGPIPE: write on a pipe with no reader
7   setiathome_x41p_zi_x86_64-apple-darwin_cuda75 0x0000000108b65dc9  
SIGPIPE: write on a pipe with no reader
8   libdyld.dylib                       0x00007fff98f7e5c9  
SIGPIPE: write on a pipe with no reader
9   ???                                 0x0000000000000003  

Thread 0 crashed with X86 Thread State (64-bit):
  rax: 0x0100001f  rbx: 0x00000000  rcx: 0x7fff57098fc8  rdx: 0x00000028
  rdi: 0x7fff57099030  rsi: 0x00000003  rbp: 0x7fff57099010  rsp: 0x7fff57098fc8
   r8: 0x00000a0f   r9: 0x00000000  r10: 0x000003b0  r11: 0x00000206
  r12: 0x000003b0  r13: 0x00000028  r14: 0x7fff57099030  r15: 0x00000a0f
  rip: 0x7fff98e834de  rfl: 0x00000206
...

It usually happens with a shorty...
???
ID: 1770569
TBar
Volunteer tester
Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1770643 - Posted: 9 Mar 2016, 23:56:07 UTC

Oh well, that seemed to make the 'SIGBUS: bus error' even worse...

I found App Cleaner does a very good job cleaning out Xcode. If you have a local copy you can have Xcode 6.1.1 up and running in just a few minutes. Seems using it with the 10.10 SDK brings back the Object.h error, however, the ASM Errors are gone. If you use Xcode 6.1.1 with the 10.9 SDK All the Errors are Gone...nice.

The resulting 6.1.1 App even worked on the same shorty that had failed twice with the Xcode 7.2.1 generated App. This could be interesting. All I need is a few more shorties to test; there don't appear to be many around though.
Bring on the Shorties please.
ID: 1770643
Gary Charpentier
Volunteer tester
Joined: 25 Dec 00
Posts: 31009
Credit: 53,134,872
RAC: 32
United States
Message 1770673 - Posted: 10 Mar 2016, 4:24:18 UTC - in response to Message 1770569.  

A SIGPIPE error (write on a pipe with no reader) means that the science app is trying to tell BOINC that it is done, but BOINC already knows it is done and has closed the pipe. That should not happen. Possibly an error return on a pipe call was missed, setting up a race. I doubt that the issue is related to different libraries or language standards, e.g. POSIX vs. GNU, but it's possible.
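For anyone who wants to see the condition the signal name describes, here is a minimal generic POSIX sketch. It makes no claim about how BOINC and the science app actually talk to each other; the function name write_to_closed_pipe is made up for illustration. It closes the read end of a pipe and then writes to it, with SIGPIPE ignored so the failure comes back as an error code instead of killing the process.

```cpp
#include <cerrno>
#include <csignal>
#include <unistd.h>

// Hypothetical helper, not BOINC code: writes one byte into a pipe whose
// read end has already been closed.
// Returns the errno set by write(), 0 if the write unexpectedly succeeded,
// or -1 if the pipe could not be created.
int write_to_closed_pipe() {
    // Ignore SIGPIPE (process-wide) so the failed write reports EPIPE
    // instead of terminating the process.
    std::signal(SIGPIPE, SIG_IGN);

    int fds[2];
    if (pipe(fds) != 0) return -1;

    close(fds[0]);                        // the reader goes away first
    const char byte = 'x';
    ssize_t n = write(fds[1], &byte, 1);  // the writer reports in too late
    int err = (n < 0) ? errno : 0;

    close(fds[1]);
    return err;
}
```

Calling write_to_closed_pipe() should hand back EPIPE. Without the SIG_IGN line, the default action for SIGPIPE would end the process at the write.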
ID: 1770673
TBar
Volunteer tester
Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1770684 - Posted: 10 Mar 2016, 5:11:25 UTC - in response to Message 1770673.  

So far I've received one SIGBUS error with the new build. After the 2nd try I turned the thing into a ghost; that would be this one: http://setiathome.berkeley.edu/workunit.php?wuid=2088328123
I'll deal with that one later, maybe bring it back as a cpu task.

For now, forget about the Error list and just watch the Inconclusive list. I don't think you're going to see many of those anymore. The last build matches the results from a known good cpu app very closely. Usually that translates into very few Inconclusives.
Now to fix the SIGBUS problem, if possible. This sounds similar to an ancient problem with BOINC a few of us are familiar with.
ID: 1770684
TBar
Volunteer tester
Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1770818 - Posted: 10 Mar 2016, 21:38:10 UTC

Well, I've tried everything I can, and I'm still getting these blasted SIGBUS Errors on shorties. I tried compiling it with boinc-master 7.5 & 7.7 and switched to the newest version of BOINC, with no luck. It appears the Inconclusives have been tamed; about the only inconclusive results now are from Immediate Overflows and unreliable wingpeople.

So, we fix the SIGBUS Errors and it's ready for Prime Time, right?
Right???
ID: 1770818
jason_gee
Volunteer developer
Volunteer tester
Joined: 24 Nov 06
Posts: 7489
Credit: 91,093,184
RAC: 0
Australia
Message 1770825 - Posted: 10 Mar 2016, 22:40:32 UTC - in response to Message 1770818.  

So, we fix the SIGBUS Errors and it's ready for Prime Time, right?
Right???


'Prime Time' as in advanced-user wide-scale testing. There is considerably more work to do for stock integration.
"Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions.
ID: 1770825
TBar
Volunteer tester
Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1771418 - Posted: 13 Mar 2016, 19:38:44 UTC - in response to Message 1770825.  

I still haven't been able to compile an App using Petri's code in Yosemite or El Capitan that doesn't have the SIGBUS problem. I don't have that problem with the Apps compiled in Mountain Lion. The highest CUDA Toolkit you can use in ML is 6.5. All the earlier Apps were compiled in Mountain Lion, I went back to one of those Apps for now.

The latest problem is trying to use Xcode 6.1.1 in El Capitan with Toolkit 7.5. The build is stopping with two Errors:
analyzeFuncs.cpp:1691:9: error: use of undeclared identifier 'cdft'
cdft(NumPointsInChunk*2, 1, DataOutChunk, BitRevTab, CoeffTab);
analyzeFuncs.cpp:1762:30: error: use of undeclared identifier 'cdft'
cdft(NumPointsInChunk*2, -1, DataOutChunk, BitRevTab, CoeffTab);

If I try adding either of the fft8g files it gives another error and just keeps snowballing.
Any idea how to stop that Error without creating new ones?
ID: 1771418
petri33
Volunteer tester
Joined: 6 Jun 02
Posts: 1668
Credit: 623,086,772
RAC: 156
Finland
Message 1771428 - Posted: 13 Mar 2016, 21:04:35 UTC - in response to Message 1771418.  

Gianfranco Lizzio had the same problem with cdft. He might be able to help.
To overcome Heisenbergs:
"You can't always get what you want / but if you try sometimes you just might find / you get what you need." -- Rolling Stones
ID: 1771428
jason_gee
Volunteer developer
Volunteer tester
Joined: 24 Nov 06
Posts: 7489
Credit: 91,093,184
RAC: 0
Australia
Message 1771469 - Posted: 14 Mar 2016, 0:21:39 UTC
Last modified: 14 Mar 2016, 0:22:11 UTC

If using the gnutools mechanism, you may need to go through and take out the compiler define to use FFTW. Probably in analyzeFuncs.cpp a #define is clobbering the Ooura FFT's include and other parts. That's because Xbranch hasn't used FFTW for a very long time, as the setup time for the baseline smoothing with FFTW exceeds the time to just baseline smooth with Ooura. Naturally my flat Makefile just doesn't define USE_FFTW at all.

In the next major version all the FFTs become plugins anyway, so the awkward preprocessor directive blocks disappear from most of the code, being one of the least nice ways of having separate codepaths.
"Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions.
ID: 1771469
TBar
Volunteer tester
Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1771475 - Posted: 14 Mar 2016, 1:50:49 UTC - in response to Message 1771469.  

There seem to be a large number of lines dealing with FFTW in analyzeFuncs.cpp.
Apparently I'm not choosing the correct ones to delete.
Would you please post the part(s) I should remove?
ID: 1771475
jason_gee
Volunteer developer
Volunteer tester
Joined: 24 Nov 06
Posts: 7489
Credit: 91,093,184
RAC: 0
Australia
Message 1771486 - Posted: 14 Mar 2016, 3:32:14 UTC - in response to Message 1771475.  

Simply everywhere it says #ifdef USE_FFTW, or #if defined(USE_FFTW), just change it to #if 0, and the remaining spaghetti code will activate what's needed, most importantly the fft8g include near the top.
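A minimal sketch of that edit, assuming the build system still passes USE_FFTW on the command line. The string constants and the fft_backend function are stand-ins invented for illustration; in the real file the branches hold includes and FFT calls.

```cpp
#include <string>

// Assumption for illustration: the build system passes -DUSE_FFTW.
#define USE_FFTW 1

// The original guard would read: #ifdef USE_FFTW
#if 0
static const char* const kFftBackend = "fftw";   // compiled out by the edit
#else
static const char* const kFftBackend = "ooura";  // fallback path stays active
#endif

// Hypothetical accessor so the selected branch can be inspected at run time.
std::string fft_backend() { return kFftBackend; }
```

Even though USE_FFTW is defined, fft_backend() reports "ooura", because #if 0 never looks at the macro. Each guarded block has to be changed for this to work, which is why a missed one can leave the file half on each path.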
"Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions.
ID: 1771486
TBar
Volunteer tester
Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1771492 - Posted: 14 Mar 2016, 4:15:11 UTC - in response to Message 1771486.  
Last modified: 14 Mar 2016, 5:05:52 UTC

Unfortunately that doesn't seem to work either. It just turns a couple errors into many. Each attempt to correct a new error merely results in more...they're replicators...

Hmmmm, if you put #include "fft8g.cpp" right under the "----CUFFT----" at the top, the errors go away. I suppose it matters where you put it.
ID: 1771492
Gianfranco Lizzio
Volunteer tester
Joined: 5 May 99
Posts: 39
Credit: 28,049,113
RAC: 87
Italy
Message 1771505 - Posted: 14 Mar 2016, 7:32:14 UTC - in response to Message 1771418.  

Tom, after ./compile ... the problem is in analyzeFuncs.cpp in the client folder. You have to add the following line of code

#include <fft8g.h>

after #endif // USE_IPP

It works successfully for me.
I don't want to believe, I want to know!
ID: 1771505
jason_gee
Volunteer developer
Volunteer tester
Joined: 24 Nov 06
Posts: 7489
Credit: 91,093,184
RAC: 0
Australia
Message 1771508 - Posted: 14 Mar 2016, 8:17:19 UTC - in response to Message 1771505.  

should do :), who knows what gnutools is doing with the #defines, lol.
"Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions.
ID: 1771508
TBar
Volunteer tester
Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1771600 - Posted: 14 Mar 2016, 17:23:39 UTC

I ended up adding #include "fft8g.h" under CUFFT instead of "fft8g.cpp". So the change is now:
#pragma message ("-----mmx-----")
#include <ipp_px.h>
#endif // T7
#include <ipp.h>
#elif defined(USE_FFTWF)
#pragma message ("----FFTW----")
#include "fftw3.h"
#elif defined(USE_CUDA)
#pragma message ("----CUFFT----")
#include "fft8g.h"
#else
#pragma message ("----ooura----")
#include "fft8g.h"
#endif // USE_IPP
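Why the include has to sit in the CUDA branch can be sketched with a stripped-down copy of that chain. This assumes a CUDA build defines USE_CUDA and none of the other macros; the DEMO_ names and the two functions are invented for illustration and stand in for the real includes.

```cpp
#include <string>

// Assumption for illustration: a CUDA build defines USE_CUDA and nothing else.
#define USE_CUDA 1

// Same branch order as the chain above; DEMO_HAVE_OOURA stands in for the
// effect of #include "fft8g.h" (the header that would declare cdft).
#if defined(USE_IPP)
  #define DEMO_BRANCH "ipp"
#elif defined(USE_FFTWF)
  #define DEMO_BRANCH "fftw"
#elif defined(USE_CUDA)
  #define DEMO_BRANCH "cufft"
  #define DEMO_HAVE_OOURA 1   // the added include lands here
#else
  #define DEMO_BRANCH "ooura"
  #define DEMO_HAVE_OOURA 1   // never reached when USE_CUDA is defined
#endif

// Hypothetical helpers to inspect what the preprocessor picked.
std::string selected_branch() { return DEMO_BRANCH; }

bool ooura_declared() {
#ifdef DEMO_HAVE_OOURA
    return true;
#else
    return false;
#endif
}
```

With USE_CUDA defined the chain stops at the CUFFT branch and never reaches the final #else, so anything only that branch provides is missing unless it is repeated in the CUDA branch. That would be consistent with the 'undeclared identifier cdft' errors going away once the header is added there.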

Even with cleaning up the Rube work it still failed in Yosemite with a linker error.
After once again removing the 750s and booting into Mountain Lion the same code compiled without any trouble.
So, I added a couple frameworks and built a new CUDA 6.5 App in ML.
Seems to be working very well, the only inconclusives are from Immediate Overflows and Suspect wingpeople. Unfortunately it still uses a full CPU, unlike the Apps compiled with the older code.
Since it uses a full CPU the -poll command doesn't have much effect. The -poll command does help the older App though, at the expense of using a full CPU. Take a look at the CUDA 42 App using the -poll command; the last part is with the command on: http://setiweb.ssl.berkeley.edu/beta/results.php?hostid=72013
It speeds up the lower angle range tasks as well.
ID: 1771600
TBar
Volunteer tester
Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1771923 - Posted: 16 Mar 2016, 16:35:22 UTC
Last modified: 16 Mar 2016, 16:50:53 UTC

Two days later and the New Special Cuda 65 App, compiled in Mountain Lion, appears to be working well. Seems the SIGBUS Errors have disappeared. It's possible there may be more Inconclusives in El Capitan than Yosemite, but it's too early to determine. The Inconclusives in Yosemite are Very Low considering the amount of Valid results. I ran the app on Beta last night and the results were very consistent; http://setiweb.ssl.berkeley.edu/beta/results.php?hostid=63959

Speaking of Inconclusive results, I was looking at one of the recent returns; http://setiathome.berkeley.edu/workunit.php?wuid=2095009935
I looked at the nVidia tasks on that host and as best I can tell EVERY task was Inconclusive, except the Overflows, taking at least three results to Validate. That would be around a 100% Inconclusive rate. Now compare that to the Standard CUDA 65 App which has been available since Jan 21st, http://setiathome.berkeley.edu/results.php?hostid=7366840&offset=180. The CUDA 65 App has ZERO Inconclusives and performs much faster on similar Hardware. That would be a 100% Validation rate as opposed to around a 100% Inconclusive rate. Heck, even the Special CUDA 65 App beats the current 'Official' App.
Whatever...

In other news I'm about to switch one of the Linux machines from ATI cards to two old nVidia cards and run the CUDA 42 App for a while. I plan on testing what happens when I use the opencl_ati5_sah plan class on the CUDA 42 App. Any predictions on the outcome? Of course I plan on disabling networking and having a backup BOINC folder...I remember the last time I tried that.
ID: 1771923
petri33
Volunteer tester
Joined: 6 Jun 02
Posts: 1668
Credit: 623,086,772
RAC: 156
Finland
Message 1771982 - Posted: 16 Mar 2016, 20:32:17 UTC - in response to Message 1771923.  

I use <plan_class>opencl_nvidia_100</plan_class> with my cuda 6.5 special.
You might get some more VLAR tasks and depending on your hardware it might be OK or not. I'd leave an entry for the old plan class too if the cache is not empty.
To overcome Heisenbergs:
"You can't always get what you want / but if you try sometimes you just might find / you get what you need." -- Rolling Stones
ID: 1771982
petri33
Volunteer tester
Joined: 6 Jun 02
Posts: 1668
Credit: 623,086,772
RAC: 156
Finland
Message 1771985 - Posted: 16 Mar 2016, 20:39:44 UTC - in response to Message 1771600.  

Hi TBar,

Do you have a CUDA app that does not use a full core on your mac?

If yes, I'd compare the piece of code that sets the cuda driver to yield, poll or whatever the third word is (a temporary dementia/amnesia has hit me).

Another place to look at is the lines in my code that call nanosleep in a loop. They could be replaced with a normal cuda synchronization code for a CPU thread. You can look at the code in the cuda part of the pulsefind when the CPU is waiting for a stream to finish its work.
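The trade-off between spinning and sleeping while waiting can be sketched on the host alone. This makes no CUDA calls and is not the code from the app: a deadline on the steady clock stands in for "the stream has finished", and the names work_finished and wait_for_work are made up for illustration. For reference, the CUDA runtime's own scheduling choices are set through cudaSetDeviceFlags with cudaDeviceScheduleSpin, cudaDeviceScheduleYield or cudaDeviceScheduleBlockingSync.

```cpp
#include <chrono>
#include <cstdint>
#include <thread>

using Clock = std::chrono::steady_clock;

// Hypothetical stand-in for "has the GPU stream finished?":
// true once the deadline has passed.
bool work_finished(Clock::time_point deadline) {
    return Clock::now() >= deadline;
}

// Polls until the simulated work is done and returns how many polls it took.
// A zero pause is a pure busy-wait; a non-zero pause sleeps between polls.
std::uint64_t wait_for_work(std::chrono::milliseconds work_time,
                            std::chrono::microseconds pause) {
    const Clock::time_point deadline = Clock::now() + work_time;
    std::uint64_t polls = 1;
    while (!work_finished(deadline)) {
        if (pause.count() > 0) std::this_thread::sleep_for(pause);
        ++polls;
    }
    return polls;
}
```

Waiting on 20 ms of simulated work with a zero pause racks up a very large poll count and keeps a core busy the whole time, while a 1000 microsecond pause needs only a couple of dozen polls at the cost of noticing completion up to one pause late. Whether that latency matters for the pulsefind loop is something only measurement on the real app can answer.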
To overcome Heisenbergs:
"You can't always get what you want / but if you try sometimes you just might find / you get what you need." -- Rolling Stones
ID: 1771985
jason_gee
Volunteer developer
Volunteer tester
Joined: 24 Nov 06
Posts: 7489
Credit: 91,093,184
RAC: 0
Australia
Message 1772000 - Posted: 16 Mar 2016, 21:51:17 UTC - in response to Message 1771985.  

Sidenote: I might have the facilities in place to parametrise those settings soon, if you locate any that help during experiments. I've been adding .cfg file parsing for Mac+Linux as a test for Gradle build automation.
"Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions.
ID: 1772000


 