I've Built a Couple OSX CUDA Apps...

Profile jason_gee
Volunteer developer
Volunteer tester
Joined: 24 Nov 06
Posts: 7489
Credit: 91,093,184
RAC: 0
Australia
Message 1813097 - Posted: 28 Aug 2016, 0:17:49 UTC - in response to Message 1813084.  
Last modified: 28 Aug 2016, 0:18:21 UTC

Nice, I see it now. Hopefully the headbanging is over...


Not quite yet. We're probably on some good leads for the pulse variations, though that's the stickiest area in the whole codebase/algorithm (very few relevant science papers). So headbanging could continue for a bit, depending on if Petri finds something simple while I arrange for the more detailed dissection/analysis/autopsy.
"Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions.
ID: 1813097 · Report as offensive
Profile Gianfranco Lizzio
Volunteer tester
Joined: 5 May 99
Posts: 39
Credit: 28,049,113
RAC: 87
Italy
Message 1813134 - Posted: 28 Aug 2016, 5:34:18 UTC - in response to Message 1813076.  

Hello Gianfranco,

I see you've gotten 41p_zi3e running. I haven't had any success with Toolkit 7.5 in Darwin 15.4 or Ubuntu 14.04. I get the same Errors, and also get the same Error with Toolkit 8.0 with 15.4;
Undefined symbols for architecture x86_64:
"cudaAcc_initialize(float (*) [2], int, int, unsigned long, double, double, double, double, int, double, long, bool)", referenced from:
seti_analyze(ANALYSIS_STATE&) in seti_cuda-analyzeFuncs.o
ld: symbol(s) not found for architecture x86_64


Hi TBar,
I got the same error, but Petri sent me a new cudaAcceleration.cu that works correctly without the error.

Gianfranco
I don't want to believe, I want to know!
ID: 1813134 · Report as offensive
Profile petri33
Volunteer tester

Joined: 6 Jun 02
Posts: 1668
Credit: 623,086,772
RAC: 156
Finland
Message 1813206 - Posted: 28 Aug 2016, 16:00:19 UTC - in response to Message 1813134.  


I've sent that to TBar too now. Jason's already got it.
To overcome Heisenbergs:
"You can't always get what you want / but if you try sometimes you just might find / you get what you need." -- Rolling Stones
ID: 1813206 · Report as offensive
Profile Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 6325
Credit: 106,370,077
RAC: 121
Russia
Message 1813524 - Posted: 29 Aug 2016, 18:32:22 UTC

There seems to be some version number confusion. Current minimum opencl_nvidia_mac requires OSX <min_os_version>110402</min_os_version>. From below it looks like you're asking for a max OSX version below the current minimum OSX version.


Does anyone know how this number translates into a human-readable OS X version, and how "lower than Darwin 15.4" would be written as a <max_os_version> tag value?
SETI apps news
We're not gonna fight them. We're gonna transcend them.
ID: 1813524 · Report as offensive
Profile arkayn
Volunteer tester
Joined: 14 May 99
Posts: 4438
Credit: 55,006,323
RAC: 0
United States
Message 1813544 - Posted: 29 Aug 2016, 19:00:33 UTC - in response to Message 1813524.  

There seems to be some version number confusion. Current minimum opencl_nvidia_mac requires OSX <min_os_version>110402</min_os_version>. From below it looks like you're asking for a max OSX version below the current minimum OSX version.


Anyone knows how this number translates into human-readable OS X version and how "lower than Darwin 15.4" Will be translated into <max_os_version> tag value?


If I read this correctly, 110402 encodes the Darwin kernel version 11.4.2, which equates to OS X 10.7.5 (Lion).
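
To make the arithmetic concrete, here is a minimal Python sketch of the decoding. It assumes, as this thread infers, that the tag value packs the Darwin kernel version as major*10000 + minor*100 + patch, and that Darwin major N corresponds to OS X 10.(N-4). Only the major version is mapped; the point release (for example 10.7.5) does not follow from a simple formula. The function names are hypothetical and not part of BOINC.

```python
def decode_os_version(value):
    """Split a packed version such as 110402 into (major, minor, patch).

    Assumes the packing major*10000 + minor*100 + patch inferred in this
    thread; it is not taken from BOINC source.
    """
    major, rest = divmod(value, 10000)
    minor, patch = divmod(rest, 100)
    return major, minor, patch


def darwin_to_osx(major):
    """Map a Darwin major version to its OS X 10.x release number.

    Assumes the 10.(major - 4) correspondence, which holds for
    Darwin 8 through 19 (OS X 10.4 through macOS 10.15).
    """
    if not 8 <= major <= 19:
        raise ValueError("mapping only assumed for Darwin 8-19")
    return "10.%d" % (major - 4)


print(decode_os_version(110402))  # (11, 4, 2)
print(darwin_to_osx(11))          # 10.7
```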

https://en.wikipedia.org/wiki/Darwin_(operating_system)

ID: 1813544 · Report as offensive
Profile Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 6325
Credit: 106,370,077
RAC: 121
Russia
Message 1813546 - Posted: 29 Aug 2016, 19:08:26 UTC - in response to Message 1813544.  

So, what should the restriction tag look like?
SETI apps news
We're not gonna fight them. We're gonna transcend them.
ID: 1813546 · Report as offensive
Juha
Volunteer tester

Joined: 7 Mar 04
Posts: 388
Credit: 1,857,738
RAC: 0
Finland
Message 1813799 - Posted: 30 Aug 2016, 14:50:22 UTC - in response to Message 1813546.  

So, what restriction tag should looks like?


<max_os_version>150399</max_os_version>, or in other words, one less than 150400 (Darwin 15.4)?
ID: 1813799 · Report as offensive
TBar
Volunteer tester

Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1813801 - Posted: 30 Aug 2016, 15:09:48 UTC
Last modified: 30 Aug 2016, 15:14:44 UTC

Just thought I'd add that it was the desktops that began producing mostly Inconclusive results with Darwin 15.4. Most of the nVidia Macs are laptops, and the laptops began producing Inconclusives with Darwin 15.0. My own observations confirm the problem started with the laptops around October of last year, and it is also noted in the linked article, "We are aware of some issues users of Premiere Pro may experience after upgrading to OS X El Capitan."

Apparently the change in Darwin 15.4 was an attempt to 'correct' the problems that began with Darwin 15.0. The 15.4 update just finished the job that began with 15.0.
ID: 1813801 · Report as offensive
Profile Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 6325
Credit: 106,370,077
RAC: 121
Russia
Message 1813843 - Posted: 30 Aug 2016, 16:43:33 UTC - in response to Message 1813801.  

So, it should be
<max_os_version>149999</max_os_version>, right?
I'll pass it to Eric.
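
The two candidate caps behave differently, which a short Python sketch can show. It assumes the same major*10000 + minor*100 + patch packing inferred earlier in the thread and a simple inclusive min/max comparison; the real scheduler logic may differ, and `encode_darwin` and `allowed` are hypothetical helper names.

```python
def encode_darwin(major, minor, patch=0):
    """Pack a Darwin version as major*10000 + minor*100 + patch (assumed format)."""
    return major * 10000 + minor * 100 + patch


def allowed(host_version, min_os_version=110402, max_os_version=None):
    """Return True if the packed host version lies inside the inclusive range.

    Inclusive bounds are an assumption made for illustration only.
    """
    if host_version < min_os_version:
        return False
    if max_os_version is not None and host_version > max_os_version:
        return False
    return True


hosts = {
    "Darwin 14.5.0": encode_darwin(14, 5, 0),
    "Darwin 15.0.0": encode_darwin(15, 0, 0),
    "Darwin 15.4.0": encode_darwin(15, 4, 0),
}
for name, packed in hosts.items():
    # Columns: host, allowed with cap 150399, allowed with cap 149999
    print(name,
          allowed(packed, max_os_version=150399),
          allowed(packed, max_os_version=149999))
```

Under these assumptions a cap of 150399 still admits Darwin 15.0 through 15.3, while 149999 excludes every 15.x host, which fits the observation above that the laptops already misbehaved at Darwin 15.0.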
SETI apps news
We're not gonna fight them. We're gonna transcend them.
ID: 1813843 · Report as offensive
TBar
Volunteer tester

Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1815062 - Posted: 4 Sep 2016, 19:40:43 UTC
Last modified: 4 Sep 2016, 19:49:23 UTC

We are close to the one-year anniversary. Next month will mark one year since the Apple nVidia laptops began producing mostly incorrect results due to the OpenCL changes in El Capitan (Darwin 15.0). One can only guess how many of those laptops managed to co-validate and send incorrect results into the database. One year, thousands of machines...

The next anniversary will be March 21st, one year since the Apple nVidia desktops also began producing incorrect results with Darwin 15.4. One can only wonder whether the problem will still exist on March 21st; it seems likely it will still exist next month.

People using Apple nVidia laptops might be interested to know that only one result for each task is sent to the database and used to search for ET. That result, called 'canonical', is usually the first listed result that is matched by another host. Since your machine's GPU doesn't produce a matching result, it will never be used as 'canonical'. Basically, this means none of your nVidia GPU work will ever be used in the search for ET, and you've wasted your GPU's time for the last year. Fortunately, your CPU results are usually valid and don't have this problem, unless you have one of those laptops running Darwin 11.4.x (Lion) with an AVX CPU; in that case you are just wasting your AVX CPU's time.
Oh well, Happy Anniversary!
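
As a rough illustration of why a host that never matches can never supply the canonical result, here is a simplified Python sketch. It is not the project's actual validator: the `results_match` tolerance test, the data layout, and the host names and signal values are all assumptions made up for the example.

```python
def results_match(a, b, tolerance=0.01):
    """Hypothetical comparison: same signal count, values within a relative tolerance."""
    if len(a) != len(b):
        return False
    return all(abs(x - y) <= tolerance * max(abs(x), abs(y), 1e-12)
               for x, y in zip(a, b))


def pick_canonical(results):
    """Return the host whose result is the first one matched by another host.

    `results` is a list of (host, signal_values) pairs in listed order.
    Returns None when no two results agree (an inconclusive task).
    """
    for i, (host, signals) in enumerate(results):
        for j, (_, other) in enumerate(results):
            if i != j and results_match(signals, other):
                return host
    return None


workunit = [
    ("mac_laptop_gpu", [24.1, 30.7, 55.0]),  # made-up, mismatching output
    ("linux_cpu",      [24.1, 31.9, 55.0]),
    ("windows_gpu",    [24.1, 31.9, 55.0]),
]
print(pick_canonical(workunit))  # linux_cpu
```

In this toy model the laptop's result is listed first but agrees with nobody, so the first result that does find a partner becomes canonical instead.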
ID: 1815062 · Report as offensive
Profile Gary Charpentier
Volunteer tester
Joined: 25 Dec 00
Posts: 30650
Credit: 53,134,872
RAC: 32
United States
Message 1815082 - Posted: 4 Sep 2016, 22:48:36 UTC - in response to Message 1815062.  

So does someone have a fix? Or is the fix to crunch other projects?
ID: 1815082 · Report as offensive
TBar
Volunteer tester

Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1815086 - Posted: 4 Sep 2016, 23:54:55 UTC - in response to Message 1815082.  
Last modified: 4 Sep 2016, 23:59:48 UTC

So does someone have a fix? Or is the fix crunch other projects?

I suppose that depends on your definition of 'Fix'. For myself, I'd say there has been a fix since January, back when the CUDA Baseline Apps were first posted at Crunchers Anonymous.
Those Apps worked fine under Anonymous platform with the Desktop Macs, but had strange and inconsistent results on the Laptops. Later it was discovered the Codebase was missing a 5 year old change in the BOINC API. That has been fixed and once again the Desktop Macs are working fine with the App at Beta. The Laptops are also working fine...mostly. It seems they are still giving strange and inconsistent messages in the stderr_txt. They seem to be finishing the task with Valid results, but Sometimes produce the line "A cuFFT plan FAILED, Initiating Boinc temporary exit (180 secs)" in the stderr_txt. Other times the messages don't appear at all, http://setiweb.ssl.berkeley.edu/beta/result.php?resultid=24706683...strange. That would be another one for Jason to comment on.

So, All the Machines being tested are finishing the tasks with Valid results in what would be considered normal run times for the Baseline CUDA Apps. Most would call that 'Fixed'. It's definitely much better than the situation that has existed over the last year.

It would be nice if More Machines were running the Apps at Beta. It would Also be nice if the Arecibo VLARs were NOT sent to the Baseline CUDA Apps. It has been known for 7 or 8 years those Arecibo VLARs don't work well with the Baseline App, there isn't any justification to test the App against a task that will never be run. All it does is annoy the tester with screen Lag and waste time. The BLC VLAR tasks don't produce screen Lag on my machines, the Arecibo VLARs do.
ID: 1815086 · Report as offensive
Profile Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 6325
Credit: 106,370,077
RAC: 121
Russia
Message 1815091 - Posted: 5 Sep 2016, 0:14:31 UTC - in response to Message 1815086.  
Last modified: 5 Sep 2016, 0:15:17 UTC


It would Also be nice if the Arecibo VLARs were NOT sent to the Baseline CUDA Apps.

That is probably quite hard to implement.
The server doesn't know about a "baseline app".
It only knows "GPU app". So VLARs can either be sent to GPU apps or not. Because some GPU apps support VLARs and need testing to support them better, we have the situation we have. Ultimately the host will crunch with a capable app.
SETI apps news
We're not gonna fight them. We're gonna transcend them.
ID: 1815091 · Report as offensive
TBar
Volunteer tester

Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1815096 - Posted: 5 Sep 2016, 0:36:09 UTC - in response to Message 1815091.  

Unfortunately the only Mac nVidia Apps at Beta that consistently produce the Correct results are the CUDA Apps. If you try one of the other Apps you will receive results such as this, Validation inconclusive (53). I don't see any advantage in having a tester waste a day, or more, on running tasks that produce the wrong results and won't be used on Main anyway.

Aren't we supposed to lose Arecibo tasks anyway? Then why are we testing Arecibo VLARs that are an endangered species when all they do is annoy the tester? Again, the BLC VLARs don't annoy the tester, and are not an endangered species.
ID: 1815096 · Report as offensive
JLDun
Volunteer tester
Joined: 21 Apr 06
Posts: 573
Credit: 196,101
RAC: 0
United States
Message 1815126 - Posted: 5 Sep 2016, 5:09:39 UTC - in response to Message 1815096.  

Aren't we supposed to lose Arecibo tasks anyway?

The last I remember hearing is Arecibo was "just" not running as often and was under consideration for shutdown, unless funding was found.

Either way, we'd end up in a situation similar to the end of v7: waiting on the last tasks to run and validate before being able to ignore it....
ID: 1815126 · Report as offensive
Profile Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 6325
Credit: 106,370,077
RAC: 121
Russia
Message 1815137 - Posted: 5 Sep 2016, 6:36:27 UTC - in response to Message 1815096.  

Unfortunately the only Mac nVidia Apps at Beta that consistently produce the Correct results are the CUDA Apps.

The same situation: too hard to separate. You are looking only at OS X, but VLAR is either enabled or disabled for GPU apps as a whole, not on a per-OS basis.
SETI apps news
We're not gonna fight them. We're gonna transcend them.
ID: 1815137 · Report as offensive
TBar
Volunteer tester

Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1815143 - Posted: 5 Sep 2016, 7:18:21 UTC - in response to Message 1815137.  

Never mind. I just solved my 14-hour Arecibo VLAR problem: I set them to use the x41p_zi3f App. It was either that or Abort them. For anyone else attempting to test Mac Apps for Main who finds themselves with up to 50 annoying tasks that take an hour apiece, and that will never be run on Main or anywhere else soon, I'd just Abort most of them. There is absolutely no good reason to waste a day or two on tasks that will never be run anywhere soon.

I'd like to remind certain people that most people at SETI don't have the latest $500 GPUs that don't have problems with these tasks. The Arecibo VLARs were banned from Main for Good reason. Hopefully they will be removed from the face of the Earth soon, strange how their replacements don't have this problem.
ID: 1815143 · Report as offensive
Profile Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 6325
Credit: 106,370,077
RAC: 121
Russia
Message 1815204 - Posted: 5 Sep 2016, 16:50:04 UTC - in response to Message 1815143.  

strange how their replacements don't have this problem.

There are no "VLAR replacements". GBT targeted observations have a different configuration, hence different longest PoT arrays, and hence different performance with the existing code. But they are in no way an "Arecibo VLAR replacement"; they are just other data for processing.
SETI apps news
We're not gonna fight them. We're gonna transcend them.
ID: 1815204 · Report as offensive
Profile Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 6325
Credit: 106,370,077
RAC: 121
Russia
Message 1815385 - Posted: 6 Sep 2016, 19:05:13 UTC

Eric implemented plan class restrictions. Please check whether affected Macs have now stopped getting OpenCL NV work.
SETI apps news
We're not gonna fight them. We're gonna transcend them.
ID: 1815385 · Report as offensive
Profile TimeLord04
Volunteer tester
Joined: 9 Mar 06
Posts: 21140
Credit: 33,933,039
RAC: 23
United States
Message 1815393 - Posted: 6 Sep 2016, 19:44:46 UTC
Last modified: 6 Sep 2016, 20:09:41 UTC

@TBar,

I just Upgraded SETI Main on Andromeda to your CUDA75 application. (Upgraded from CUDA65.) I followed the directions to copy the three files into the Projects' Folder, then reinstalled BOINC. Changed BOINC to Allow New Work for SETI, picked up a few new Units; but, they still show as CUDA65... Is this normal??? Shouldn't they show as CUDA75?

I opened app_info.xml and checked, and the CUDA75 info does show... But the Units still show as CUDA65.


TL

[EDIT:]

They seem to be running faster, though. Just completed four Units, (2 per card), and roughly 36.5 min per Unit.


TL
TimeLord04
Have TARDIS, will travel...
Come along K-9!
Join Calm Chaos
ID: 1815393 · Report as offensive