I've Built a Couple OSX CUDA Apps...

Message boards : Number crunching : I've Built a Couple OSX CUDA Apps...

Profile jason_gee
Volunteer developer
Volunteer tester
Avatar

Send message
Joined: 24 Nov 06
Posts: 7489
Credit: 91,093,184
RAC: 0
Australia
Message 1749384 - Posted: 15 Dec 2015, 5:45:05 UTC - in response to Message 1749383.  
Last modified: 15 Dec 2015, 5:48:35 UTC

I'm just wondering what would happen if the App was compiled as 32-bit; would it even run?


On Windows, sure ;) (provided an old enough toolkit was used).
A 32-bit exe on x64 Linux, even with the 32-bit compat libraries, always resulted in a build failure at best.

While examining the make documentation and recommendations, and comparing against the quite functional oceanFFT sample on my Mac Pro (which has some striking similarities to what we really need), I came across this research paper that says 'don't do recursive makefiles', which is exactly what this codebase does through inheriting from stock SaH:
http://aegis.sourceforge.net/auug97.pdf

IOW, I believe I'll ditch what is probably the cause of most of the build issues and go with a clean, more-or-less best-practices build. I'll cross the Boincapi/lib bridge when I come to it again, since the approach described in the paper clearly demands not using Xcode either.

Who knows, it could end up being easier to use a similar approach on Linux and Windows as well, and it would be easy to slot in Gradle automation in place of make if/when desired.
"Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions.
ID: 1749384 · Report as offensive
Profile Gary Charpentier Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Avatar

Send message
Joined: 25 Dec 00
Posts: 31347
Credit: 53,134,872
RAC: 32
United States
Message 1749391 - Posted: 15 Dec 2015, 6:26:11 UTC - in response to Message 1749354.  

Obviously the number is real, as it is running the system out of VM space. This problem doesn't exist with the other OSX Apps, which use around 3 GB or less. It is strictly a CUDA problem. I've run 6 tasks at once on my 3 ATI cards without a problem. It's a joke that there is a problem running just 2 tasks on two nVidia cards. Someone is going to have to read the manual and then try to apply it to the code. Seeing as how there are some around here that seem to know where to start looking, I would suggest they give hints. Just remember, the other Apps don't have this problem.

Can you give the results of
$ ulimit -a
for your system?
I get
virtual memory          (kbytes, -v) unlimited
on my Mac O/S.
IIRC that will support the entire 2^64 address space; of course, it can't actually use more than the free disk space (the maximum swap file size) at once, and 'use' means written to. And note this is a different number than the one top reports: IIRC top reports the amount of disk space available to be used, but calls it VM.

Also, the results of
$ ls -al /private/var/vm
might be interesting.
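For reference, the same address-space limit can also be read from inside a program; a minimal sketch, assuming a POSIX system (this is only an illustration, not part of any SETI app):

#include <stdio.h>
#include <sys/resource.h>

/* Print the address-space (virtual memory) limit the same way
   "ulimit -v" reports it: a value in kbytes, or "unlimited". */
int main(void)
{
    struct rlimit rl;
    if (getrlimit(RLIMIT_AS, &rl) != 0) {
        perror("getrlimit");
        return 1;
    }
    if (rl.rlim_cur == RLIM_INFINITY)
        printf("virtual memory (kbytes, -v) unlimited\n");
    else
        printf("virtual memory (kbytes, -v) %llu\n",
               (unsigned long long)(rl.rlim_cur / 1024));
    return 0;
}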
ID: 1749391 · Report as offensive
Profile petri33
Volunteer tester

Send message
Joined: 6 Jun 02
Posts: 1668
Credit: 623,086,772
RAC: 156
Finland
Message 1749396 - Posted: 15 Dec 2015, 6:32:33 UTC

I don't think it is the program (AP or MB); or else it is something common to both.
Look at the listing: both MB and AP are reporting having allocated huge amounts of VM.
  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND                                                                                           
 2087 root      39  19   52972  49400   4236 R 100,3  0,6  13:55.25 ../../projects/setiathome.berkeley.edu/MBv7_7.05r2549_avx_linux32                                 
 2086 root      39  19   52460  48624   4244 R 100,0  0,6  13:55.58 ../../projects/setiathome.berkeley.edu/MBv7_7.05r2549_avx_linux32                                 
 2088 root      39  19   52972  45724   4232 R 100,0  0,6  13:56.27 ../../projects/setiathome.berkeley.edu/MBv7_7.05r2549_avx_linux32                                 
 2089 root      39  19   52972  49408   4236 R 100,0  0,6  13:56.26 ../../projects/setiathome.berkeley.edu/MBv7_7.05r2549_avx_linux32                                 
 2092 root      39  19   52972  49464   4236 R 100,0  0,6  13:55.77 ../../projects/setiathome.berkeley.edu/MBv7_7.05r2549_avx_linux32                                 
 2423 root      39  19   52944  43088   4272 R 100,0  0,5  11:37.75 ../../projects/setiathome.berkeley.edu/MBv7_7.05r2549_avx_linux32                                 
 3111 root      30  10 32,744g 713484 584612 R  29,2  8,8   0:09.04 ../../projects/setiathome.berkeley.edu/setiathome_x41zc_x86_64-pc-linux-gnu_cuda65 -pfb 16 -pfp + 
 3156 root      30  10 32,603g 566796 451332 R  26,9  7,0   0:02.08 ../../projects/setiathome.berkeley.edu/setiathome_x41zc_x86_64-pc-linux-gnu_cuda65 -pfb 16 -pfp + 
 3020 root      30  10 32,603g 567252 451780 S  25,9  7,0   0:29.68 ../../projects/setiathome.berkeley.edu/setiathome_x41zc_x86_64-pc-linux-gnu_cuda65 -pfb 16 -pfp + 
 2564 root      30  10 32,708g 492200 454484 S  12,6  6,1   1:09.35 ../../projects/setiathome.berkeley.edu/ap_7.01r2793_sse3_clGPU_x86_64 -unroll 18 -sbs 512 -oclFF+ 
 1972 root      20   0 2231888 103916  66796 S   1,3  1,3   0:12.03 ./boincmgr                                                                                        
  833 root      20   0  243880  75288  47448 S   1,0  0,9   0:18.53 /usr/bin/X -core :0 -seat seat0 -auth /var/run/l

To overcome Heisenbergs:
"You can't always get what you want / but if you try sometimes you just might find / you get what you need." -- Rolling Stones
ID: 1749396 · Report as offensive
Profile petri33
Volunteer tester

Send message
Joined: 6 Jun 02
Posts: 1668
Credit: 623,086,772
RAC: 156
Finland
Message 1749416 - Posted: 15 Dec 2015, 8:38:13 UTC - in response to Message 1749396.  
Last modified: 15 Dec 2015, 8:41:52 UTC

This is the first line of code that calls into the NVIDIA CUDA library.

cerr = cudaGetDeviceCount(&numCudaDevices);

It is in cudaAcceleration.cpp.

Before that call, the VIRT mem allocation is low (104 kB). After that call, the VIRT mem allocation is 32.xxx GB.

I guess it is nothing to worry about. It has been that way as long as I can remember.
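For anyone who wants to reproduce the observation, a minimal sketch, assuming only the standard CUDA runtime API (on some driver versions the large reservation may not appear until the first context-creating call, so a cudaFree(0) is included as well):

#include <cstdio>
#include <cuda_runtime.h>

int main()
{
    std::printf("before first runtime call; check VIRT in top, then press Enter\n");
    std::getchar();

    int numCudaDevices = 0;
    cudaError_t cerr = cudaGetDeviceCount(&numCudaDevices);
    cudaFree(0);  // force context creation in case the device count alone doesn't

    std::printf("cudaGetDeviceCount -> %s, %d device(s); check VIRT again\n",
                cudaGetErrorString(cerr), numCudaDevices);
    std::getchar();
    return 0;
}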
To overcome Heisenbergs:
"You can't always get what you want / but if you try sometimes you just might find / you get what you need." -- Rolling Stones
ID: 1749416 · Report as offensive
Profile jason_gee
Volunteer developer
Volunteer tester
Avatar

Send message
Joined: 24 Nov 06
Posts: 7489
Credit: 91,093,184
RAC: 0
Australia
Message 1749424 - Posted: 15 Dec 2015, 8:56:20 UTC - in response to Message 1749416.  
Last modified: 15 Dec 2015, 8:57:28 UTC

This is the first line of code that calls into the NVIDIA CUDA library.

cerr = cudaGetDeviceCount(&numCudaDevices);

It is in cudaAcceleration.cpp.

Before that call, the VIRT mem allocation is low (104 kB). After that call, the VIRT mem allocation is 32.xxx GB.

I guess it is nothing to worry about. It has been that way as long as I can remember.


I concur, as far as my computer science background goes. That background just says that once you're running a (Cuda) virtual machine, you promise it the earth and starve it by necessity... so the virtual numbers are not relevant to the host.
"Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions.
ID: 1749424 · Report as offensive
Profile Raistmer
Volunteer developer
Volunteer tester
Avatar

Send message
Joined: 16 Jun 01
Posts: 6325
Credit: 106,370,077
RAC: 121
Russia
Message 1749450 - Posted: 15 Dec 2015, 13:14:47 UTC - in response to Message 1749383.  


I found a possible answer to large VM, but it doesn't explain why in my case 6 + 2 + 2 = 23
http://stackoverflow.com/questions/11631191/why-does-the-cuda-runtime-reserve-80-gib-virtual-memory-upon-initialization

I'm just wondering what would happen if the App was compiled as 32-bit; would it even run?


Actually it gives the same explanation you were already given yesterday.
What about the BOINC/SETI specifics? Any progress?
ID: 1749450 · Report as offensive
Profile Raistmer
Volunteer developer
Volunteer tester
Avatar

Send message
Joined: 16 Jun 01
Posts: 6325
Credit: 106,370,077
RAC: 121
Russia
Message 1749462 - Posted: 15 Dec 2015, 14:17:32 UTC

And while you are at the Einstein and CUDA samples testing, there is another suggestion: try to trace where the "out of memory" message comes from. Is it the result of some BOINC lib call, or a direct OS memory allocation failure? Whether it's a BOINC-level or an OS/runtime-level issue depends on that.
ID: 1749462 · Report as offensive
TBar
Volunteer tester

Send message
Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1749496 - Posted: 15 Dec 2015, 21:49:38 UTC - in response to Message 1749391.  

Obviously the number is real, as it is running the system out of VM space. This problem doesn't exist with the other OSX Apps, which use around 3 GB or less. It is strictly a CUDA problem. I've run 6 tasks at once on my 3 ATI cards without a problem. It's a joke that there is a problem running just 2 tasks on two nVidia cards. Someone is going to have to read the manual and then try to apply it to the code. Seeing as how there are some around here that seem to know where to start looking, I would suggest they give hints. Just remember, the other Apps don't have this problem.

Can you give the results of
$ ulimit -a
for your system?
I get
virtual memory          (kbytes, -v) unlimited
on my Mac O/S.
IIRC that will support the entire 2^64 address space; of course, it can't actually use more than the free disk space (the maximum swap file size) at once, and 'use' means written to. And note this is a different number than the one top reports: IIRC top reports the amount of disk space available to be used, but calls it VM.

Also, the results of
$ ls -al /private/var/vm
might be interesting.

I was reading here, https://developer.apple.com/library/mac/documentation/Performance/Conceptual/ManagingMemory/Articles/AboutMemory.html
From that, it would seem the hard limit only applies if it actually writes to disk, something I don't think I've experienced. That's not as I remember it, but if that's the case then the Out of Memory errors are coming from somewhere else. They only happen after both cards have been running APs and then one tries to start a CUDA task.

The results;
Tom$ ulimit -a
core file size (blocks, -c) 0
data seg size (kbytes, -d) unlimited
file size (blocks, -f) unlimited
max locked memory (kbytes, -l) unlimited
max memory size (kbytes, -m) unlimited
open files (-n) 256
pipe size (512 bytes, -p) 1
stack size (kbytes, -s) 8192
cpu time (seconds, -t) unlimited
max user processes (-u) 709
virtual memory (kbytes, -v) unlimited

Tom$ ls -al /private/var/vm
total 0
drwxr-xr-x 2 root wheel 68 Dec 15 05:10 .
drwxr-xr-x 26 root wheel 884 Dec 9 01:42 ..
ID: 1749496 · Report as offensive
TBar
Volunteer tester

Send message
Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1749505 - Posted: 15 Dec 2015, 22:08:19 UTC - in response to Message 1749462.  

And while you are at the Einstein and CUDA samples testing, there is another suggestion: try to trace where the "out of memory" message comes from. Is it the result of some BOINC lib call, or a direct OS memory allocation failure? Whether it's a BOINC-level or an OS/runtime-level issue depends on that.

I don't do Einstein. I still have no idea why both the older CUDA app and the newer one use a fixed 22.6 GB per-task VM setting when the total system/NV GPU memory is 10 GB. Any idea how to compile the App as 32-bit? I tried changing the configure line to --enable-bitness=32, but I don't see any major difference except a possible slight change in run-times. I used that setting on both the CUDA App and the NV AP App. I'm about to run a few APs on both cards and see if there is any change when going back to CUDA.

There is one change in the CUDA App. Now it seems the tasks with an AR of around 1.08 error out in a couple of seconds instead of ~5 minutes. The old error was:
CUFFT error in file 'cuda/cudaAcc_fft.cu' in line 64
The new error is:
Cuda error 'cudaAcc_transpose' in file 'cuda/cudaAcc_transpose.cu' in line 74 : invalid argument

So far that is the only noticeable change after compiling with --enable-bitness=32
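For context, error lines in that "Cuda error ... in file ... in line N : invalid argument" form typically come from a wrapper around each CUDA call. A rough sketch of such a wrapper (the CUDA_CHECK macro and the dummy kernel here are made up for illustration, not the project's own code); the point is that 'invalid argument' means a bad launch configuration or parameter reached the runtime, not an out-of-memory state:

#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

#define CUDA_CHECK(what, call)                                            \
    do {                                                                  \
        cudaError_t err__ = (call);                                       \
        if (err__ != cudaSuccess) {                                       \
            std::fprintf(stderr,                                          \
                "Cuda error '%s' in file '%s' in line %d : %s\n",         \
                (what), __FILE__, __LINE__, cudaGetErrorString(err__));   \
            std::exit(1);                                                 \
        }                                                                 \
    } while (0)

__global__ void dummy_kernel(float* p) { p[threadIdx.x] = 0.0f; }

int main()
{
    float* d = nullptr;
    CUDA_CHECK("cudaMalloc d", cudaMalloc(&d, 256 * sizeof(float)));

    // An "invalid argument" usually means a bad launch configuration or a bad
    // pointer/size handed to the runtime, rather than exhausted memory.
    dummy_kernel<<<1, 256>>>(d);
    CUDA_CHECK("dummy_kernel launch", cudaGetLastError());
    CUDA_CHECK("dummy_kernel sync", cudaDeviceSynchronize());

    CUDA_CHECK("cudaFree d", cudaFree(d));
    return 0;
}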
ID: 1749505 · Report as offensive
Profile petri33
Volunteer tester

Send message
Joined: 6 Jun 02
Posts: 1668
Credit: 623,086,772
RAC: 156
Finland
Message 1749534 - Posted: 15 Dec 2015, 23:57:58 UTC - in response to Message 1749505.  

And while you are at the Einstein and CUDA samples testing, there is another suggestion: try to trace where the "out of memory" message comes from. Is it the result of some BOINC lib call, or a direct OS memory allocation failure? Whether it's a BOINC-level or an OS/runtime-level issue depends on that.

I don't do Einstein. I still have no idea why both the older CUDA app and the newer one use a fixed 22.6 GB per-task VM setting when the total system/NV GPU memory is 10 GB. Any idea how to compile the App as 32-bit? I tried changing the configure line to --enable-bitness=32, but I don't see any major difference except a possible slight change in run-times. I used that setting on both the CUDA App and the NV AP App. I'm about to run a few APs on both cards and see if there is any change when going back to CUDA.

There is one change in the CUDA App. Now it seems the tasks with an AR of around 1.08 error out in a couple of seconds instead of ~5 minutes. The old error was:
CUFFT error in file 'cuda/cudaAcc_fft.cu' in line 64
The new error is:
Cuda error 'cudaAcc_transpose' in file 'cuda/cudaAcc_transpose.cu' in line 74 : invalid argument

So far that is the only noticeable change after compiling with --enable-bitness=32


That is what my exe used to do. Have you checked your email to see if there is a fix you did not notice (3 emails)?
To overcome Heisenbergs:
"You can't always get what you want / but if you try sometimes you just might find / you get what you need." -- Rolling Stones
ID: 1749534 · Report as offensive
TBar
Volunteer tester

Send message
Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1749560 - Posted: 16 Dec 2015, 1:30:42 UTC - in response to Message 1749534.  

That is what my exe used to do. Have you checked your email to see if there is a fix you did not notice (3 emails)?

The last emails are dated Dec 7th; I'm pretty sure I used two of them but may have missed the one about '#ifdefined (__APPLE__) in analyzeFuncs.cpp'. The shorties did speed up afterwards, but with the last compile they slowed back down. I suppose I could combine the files from all three and make sure they are used the next time.

New error when trying to start a CUDA task after running APs on both cards. The new one reads:
<message>
process exited with code 1 (0x1, -255)
</message>
<stderr_txt>
setiathome_CUDA: Found 2 CUDA device(s):
  Device 1: GeForce GTX 750 Ti, 2047 MiB, regsPerBlock 65536
     computeCap 5.0, multiProcs 5 
     pciBusID = 1, pciSlotID = 0
  Device 2: GeForce GTX 750 Ti, 2047 MiB, regsPerBlock 65536
     computeCap 5.0, multiProcs 5 
     pciBusID = 5, pciSlotID = 0
In cudaAcc_initializeDevice(): Boinc passed DevPref 1
setiathome_CUDA: CUDA Device 1 specified, checking...
   Device 1: GeForce GTX 750 Ti is okay
SETI@home using CUDA accelerated device GeForce GTX 750 Ti

setiathome enhanced x41zc, Cuda 6.50 special
Compiled with NVCC 7.5, using 6.5 libraries. Modifications done by petri33.

Detected setiathome_enhanced_v7 task. Autocorrelations enabled, size 128k elements.
Work Unit Info:
...............
WU true angle range is :  0.375063
Sigma 4
cudaMalloc errorNot enough VRAM for Autocorrelations...
setiathome_CUDA: CUDA runtime ERROR in device memory allocation, attempt 1 of 6
cudaAcc_free() called...
cudaAcc_free() running...
cudaAcc_free() PulseFind freed...
cudaAcc_free() Gaussfit freed...
cudaAcc_free() AutoCorrelation freed...
1,2,3,4,5,6,7,8,9,10,10,11,12,cudaAcc_free() DONE.
13 waiting 5 seconds...
 Reinitialising Cuda Device...
Cuda error 'Couldn't get cuda device count
' in file 'cuda/cudaAcceleration.cu' in line 151 : invalid resource handle.

</stderr_txt>

This error actually says 'CUDA runtime ERROR in device memory allocation' instead of the old error:
Cuda error 'cudaMalloc((void**) &dev_WorkData' in file 'cuda/cudaAcceleration.cu' in line 433 : out of memory

So... why is it having trouble with 'VRAM for Autocorrelations' after running APs? It only happens after both cards have been running APs.
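For reference, the "attempt 1 of 6 ... waiting 5 seconds" sequence in that log corresponds to an allocate, free, wait, retry loop. A simplified sketch of that pattern (my illustration only, with made-up names, not the app's actual code):

#include <cstdio>
#include <cuda_runtime.h>
#include <unistd.h>   // sleep()

static float* try_alloc_with_retry(size_t bytes, int max_attempts)
{
    for (int attempt = 1; attempt <= max_attempts; ++attempt) {
        float* p = nullptr;
        cudaError_t err = cudaMalloc(&p, bytes);
        if (err == cudaSuccess)
            return p;

        std::fprintf(stderr,
            "CUDA runtime ERROR in device memory allocation, attempt %d of %d (%s)\n",
            attempt, max_attempts, cudaGetErrorString(err));

        cudaGetLastError();   // clear the recorded error before retrying
        sleep(5);             // give other apps a chance to release VRAM
    }
    return nullptr;           // caller decides whether to exit or postpone
}

int main()
{
    float* buf = try_alloc_with_retry((size_t)512 << 20, 6);  // 512 MiB, 6 attempts
    if (!buf) {
        std::fprintf(stderr, "giving up: not enough VRAM\n");
        return 1;
    }
    cudaFree(buf);
    return 0;
}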
ID: 1749560 · Report as offensive
Profile jason_gee
Volunteer developer
Volunteer tester
Avatar

Send message
Joined: 24 Nov 06
Posts: 7489
Credit: 91,093,184
RAC: 0
Australia
Message 1749563 - Posted: 16 Dec 2015, 1:44:51 UTC - in response to Message 1749560.  
Last modified: 16 Dec 2015, 1:45:14 UTC

The messages after initial failure:
Reinitialising Cuda Device...
Cuda error 'Couldn't get cuda device count

indicate the driver had somehow crashed and hasn't had time to reset yet (hence the retries after delays coded in)

The sequence *looks* like some external program chewed up all the VRAM, causing the autocorrelation allocation failure and a reset.

Superficially it looks like the logic on the Cuda app is doing the best it can with the GPU in a wacky state. How it got to that state is another question.
"Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions.
ID: 1749563 · Report as offensive
TBar
Volunteer tester

Send message
Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1749565 - Posted: 16 Dec 2015, 2:04:36 UTC - in response to Message 1749563.  
Last modified: 16 Dec 2015, 2:32:03 UTC

Well, the ATI card is being used for the main display. The two 750s are connected to an unused display that stays turned off. I first had the problem after the stock APr2750 and tried to build another AP App, trying APr2709 in the meantime. Then I was able to get the 64-bit APr2935 to work, but the same thing happened, and now I'm on the 32-bit(?) APr2935. Any AP after 2935 fails to compile, and I think that was the last MB CPU App I was able to compile. The only Apps I can get to compile with the newer Berkeley code are the MB ATI App and the CUDA app.

I dunno...
Maybe it's related to whatever causes both cards to slow down when one is running an AP while the other card runs a CUDA task?

Well, after the last AP finished I was able to start a CUDA task on both cards without having to reboot the machine. That's an improvement.
ID: 1749565 · Report as offensive
Profile jason_gee
Volunteer developer
Volunteer tester
Avatar

Send message
Joined: 24 Nov 06
Posts: 7489
Credit: 91,093,184
RAC: 0
Australia
Message 1749572 - Posted: 16 Dec 2015, 2:49:50 UTC - in response to Message 1749565.  

Maybe it's related to whatever causes both cards to slow down when one is running an AP while the other card runs a CUDA task?


Highly likely. Whatever queues are buried in the Mac OS + drivers, something there is filling up, most likely from excessive synchronisation in the app, or a combined effect. Since you have petri's mods for CudaMB, that's probably slightly less sync-dense than stock, but faster, so it probably evens out to about the same. Since faster and faster GPUs will keep hitting those driver restriction walls, especially in multiples, we'll be looking at alternative synchronisation methods, though some infrastructure is needed first.
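As a rough illustration of what one alternative could look like (this sketch is mine, assuming plain CUDA runtime calls, and is not the actual planned change): set the device to blocking-sync scheduling and wait on per-stream events instead of spinning on the whole device.

#include <cuda_runtime.h>

__global__ void work(float* p, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) p[i] = p[i] * 2.0f + 1.0f;
}

int main()
{
    // Must be set before the context is created.
    cudaSetDeviceFlags(cudaDeviceScheduleBlockingSync);

    const int n = 1 << 20;
    float* d = nullptr;
    cudaMalloc(&d, n * sizeof(float));
    cudaMemset(d, 0, n * sizeof(float));

    cudaStream_t stream;
    cudaEvent_t done;
    cudaStreamCreate(&stream);
    cudaEventCreateWithFlags(&done, cudaEventBlockingSync | cudaEventDisableTiming);

    work<<<(n + 255) / 256, 256, 0, stream>>>(d, n);
    cudaEventRecord(done, stream);

    // Sleeps the CPU thread until the event fires, instead of burning a core
    // polling the device (the kind of thing heavy multi-GPU setups notice).
    cudaEventSynchronize(done);

    cudaEventDestroy(done);
    cudaStreamDestroy(stream);
    cudaFree(d);
    return 0;
}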

Since the Mac-specific makefile approach appears to be playing ball here, it's probably something I'll play with straight after the v8 mods are up (weaving it in as an option amongst Petri's changes).
"Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions.
ID: 1749572 · Report as offensive
Profile Gary Charpentier Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Avatar

Send message
Joined: 25 Dec 00
Posts: 31347
Credit: 53,134,872
RAC: 32
United States
Message 1749578 - Posted: 16 Dec 2015, 3:47:52 UTC - in response to Message 1749496.  

Also, the results of
$ ls -al /private/var/vm
might be interesting.

I was reading here, https://developer.apple.com/library/mac/documentation/Performance/Conceptual/ManagingMemory/Articles/AboutMemory.html
From that, it would seem the hard limit only applies if it actually writes to disk, something I don't think I've experienced. That's not as I remember it, but if that's the case then the Out of Memory errors are coming from somewhere else. They only happen after both cards have been running APs and then one tries to start a CUDA task.

The results;
Tom$ ulimit -a
core file size (blocks, -c) 0
data seg size (kbytes, -d) unlimited
file size (blocks, -f) unlimited
max locked memory (kbytes, -l) unlimited
max memory size (kbytes, -m) unlimited
open files (-n) 256
pipe size (512 bytes, -p) 1
stack size (kbytes, -s) 8192
cpu time (seconds, -t) unlimited
max user processes (-u) 709
virtual memory (kbytes, -v) unlimited

As expected, unlimited or the 2^64 chip limit.

Tom$ ls -al /private/var/vm
total 0
drwxr-xr-x 2 root wheel 68 Dec 15 05:10 .
drwxr-xr-x 26 root wheel 884 Dec 9 01:42 ..

As that directory does not have an entry for a swapfile, your system has never run out of available real memory.

I had an idea about the out-of-memory error. I'm wondering if, for some reason, the previous job failed to release its GPU buffer space on exit: essentially a memory leak. Unfortunately I don't know of a tool to test for that, but one may exist.
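One crude check, assuming nothing more than the CUDA runtime (this probe is just a suggestion, not an existing SETI tool): sample cudaMemGetInfo() before and after the suspect app runs and compare the free figures, keeping in mind that the probe's own context also takes a slice of VRAM.

#include <cstdio>
#include <cuda_runtime.h>

int main()
{
    int count = 0;
    if (cudaGetDeviceCount(&count) != cudaSuccess || count == 0) {
        std::fprintf(stderr, "no CUDA devices\n");
        return 1;
    }
    for (int dev = 0; dev < count; ++dev) {
        cudaSetDevice(dev);
        size_t free_b = 0, total_b = 0;
        // Reports current free/total device memory; run before and after a
        // suspect task and compare the free figures to spot a leaked buffer.
        if (cudaMemGetInfo(&free_b, &total_b) == cudaSuccess)
            std::printf("device %d: %zu MiB free of %zu MiB\n",
                        dev, free_b >> 20, total_b >> 20);
    }
    return 0;
}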
ID: 1749578 · Report as offensive
Profile jason_gee
Volunteer developer
Volunteer tester
Avatar

Send message
Joined: 24 Nov 06
Posts: 7489
Credit: 91,093,184
RAC: 0
Australia
Message 1749583 - Posted: 16 Dec 2015, 4:32:30 UTC - in response to Message 1749578.  
Last modified: 16 Dec 2015, 4:33:13 UTC

I had an idea about the out-of-memory error. I'm wondering if, for some reason, the previous job failed to release its GPU buffer space on exit: essentially a memory leak. Unfortunately I don't know of a tool to test for that, but one may exist.


Thought about that, and the corresponding Cuda app chokes/retries amount to a test of the current state. Running a repeated bench on shortened test tasks with the Cuda app, while trying suspect apps alongside, could identify whether the failure mode is triggered by a suspect at startup, during the run, or on completion (and after some number of runs). The shutdown behaviour of a suspect app under Boinc is a bit different than standalone, so we'd probably need to try the suspect standalone first, then under Boinc.
"Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions.
ID: 1749583 · Report as offensive
TBar
Volunteer tester

Send message
Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1749699 - Posted: 16 Dec 2015, 16:52:59 UTC - in response to Message 1749534.  
Last modified: 16 Dec 2015, 17:18:09 UTC

That is what my exe used to do. Have you checked your email to see if there is a fix you did not notice (3 emails)?

I compiled another app using the sources from the "#ifdefined (__APPLE__) in analyzeFuncs.cpp" email, and the errors with the AR ~1.06 tasks are gone. The 'Memory' errors have also changed and now say postponed, with something about a CUFFT error; I have the exact error copied around here somewhere but can't find it. After the second AP finishes, the cards will start the cuda tasks. It seems the number of inconclusives has increased since the last build.

I'll see about compiling a cuda app from the stock sources and try that with the APs the next time I have some APs.

###############

Here is the task that failed to start after running the APs;
A cuFFT plan FAILED, Initiating Boinc temporary exit (180 secs)

http://setiathome.berkeley.edu/result.php?resultid=4600492291
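For context, that message corresponds to checking the result of the cuFFT plan creation and postponing the task through the BOINC API instead of erroring it out. A hedged sketch of the pattern (the helper name and exact call site are made up; cufftPlan1d() and boinc_temporary_exit() are the real library calls):

#include <cstdio>
#include <cufft.h>
#include "boinc_api.h"

cufftHandle make_fft_plan_or_postpone(int fft_len, int batch)
{
    cufftHandle plan;
    cufftResult r = cufftPlan1d(&plan, fft_len, CUFFT_C2C, batch);
    if (r != CUFFT_SUCCESS) {
        std::fprintf(stderr,
            "A cuFFT plan FAILED (code %d), Initiating Boinc temporary exit (180 secs)\n",
            (int)r);
        boinc_temporary_exit(180);   // task is postponed, not errored out
    }
    return plan;
}

int main()
{
    // 128k-point plan, similar in size to the autocorrelation FFT mentioned above.
    cufftHandle p = make_fft_plan_or_postpone(128 * 1024, 1);
    cufftDestroy(p);
    return 0;
}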
ID: 1749699 · Report as offensive
Profile petri33
Volunteer tester

Send message
Joined: 6 Jun 02
Posts: 1668
Credit: 623,086,772
RAC: 156
Finland
Message 1749737 - Posted: 16 Dec 2015, 20:05:57 UTC - in response to Message 1749699.  

That is what my exe used to do. Have you checked your email to see if there is a fix you did not notice (3 emails)?

I compiled another app using the sources from the "#ifdefined (__APPLE__) in analyzeFuncs.cpp" email, and the errors with the AR ~1.06 tasks are gone. The 'Memory' errors have also changed and now say postponed, with something about a CUFFT error; I have the exact error copied around here somewhere but can't find it. After the second AP finishes, the cards will start the cuda tasks. It seems the number of inconclusives has increased since the last build.

I'll see about compiling a cuda app from the stock sources and try that with the APs the next time I have some APs.

###############

Here is the task that failed to start after running the APs;
A cuFFT plan FAILED, Initiating Boinc temporary exit (180 secs)

http://setiathome.berkeley.edu/result.php?resultid=4600492291


That validated.
To overcome Heisenbergs:
"You can't always get what you want / but if you try sometimes you just might find / you get what you need." -- Rolling Stones
ID: 1749737 · Report as offensive
TBar
Volunteer tester

Send message
Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1749755 - Posted: 16 Dec 2015, 22:17:42 UTC - in response to Message 1749737.  
Last modified: 16 Dec 2015, 22:44:50 UTC

Yes, so far they've all validated except the one that had triplets after a restart. The question is why it had to wait for the other card to finish the AP before it would start the CUDA task.
I just switched over to the Stock App I compiled about a week ago to see how it handles the APs. The only change from stock is that all the CC-1.x cards were removed and cards up to CC-5.0 were added. So far it seems to be working, although much slower than the new code. Hopefully it won't time out on any tasks.

Here's the first shorty completed, http://setiathome.berkeley.edu/result.php?resultid=4602894114
setiathome enhanced x41zc (Sanity Check #3), Cuda 6.50 special ????

A few have validated already; let's see what happens when it hits this AP:
http://setiathome.berkeley.edu/result.php?resultid=4603595614
Will it slow everything down or keep going?

I need more APs.
ID: 1749755 · Report as offensive
Profile petri33
Volunteer tester

Send message
Joined: 6 Jun 02
Posts: 1668
Credit: 623,086,772
RAC: 156
Finland
Message 1749760 - Posted: 16 Dec 2015, 22:48:37 UTC - in response to Message 1749755.  

Yes, so far they've all validated except the one that had triplets after a restart. The question is why it had to wait for the other card to finish the AP before it would start the CUDA task.
I just switched over to the Stock App I compiled about a week ago to see how it handles the APs. The only change from stock is that all the CC-1.x cards were removed and cards up to CC-5.0 were added. So far it seems to be working, although much slower than the new code. Hopefully it won't time out on any tasks.

Here's the first shorty completed, http://setiathome.berkeley.edu/result.php?resultid=4602894114
setiathome enhanced x41zc (Sanity Check #3), Cuda 6.50 special ????

A few have validated already; let's see what happens when it hits this AP:
http://setiathome.berkeley.edu/result.php?resultid=4603595614
Will it slow everything down or keep going?

I need more APs.


'special' refers to my code.
To overcome Heisenbergs:
"You can't always get what you want / but if you try sometimes you just might find / you get what you need." -- Rolling Stones
ID: 1749760 · Report as offensive