I've Built a Couple OSX CUDA Apps...
Author | Message |
---|---|
Tom Rinehart Send message Joined: 12 Dec 01 Posts: 113 Credit: 13,255,975 RAC: 6 |
I added -no_caching to mb_cmdline_mac_OpenCL_sah.txt. It is running on both iMacs with no files generated at run time. Actually, I forgot to add it on the 21.5" iMac and it did generate the file. I stopped Boinc, deleted the file, added -no_caching, and restarted Boinc. It seems to be working with no files generated. The one strange thing happening is that on my 27" iMac with a quad core i7, it will only run 7 CPU tasks and 1 GPU task. With the v7 apps, it would run 8 CPU tasks and 1 GPU task. All the CPU tasks in the queue that aren't running say "Waiting for shared memory" instead of "Ready to start." Some of the GPU tasks say "Waiting to run." Is this an issue with the MBv8_8.0r3304_sse41_x86_64-apple-darwin app? On the 21.5" iMac with a dual core i3, it runs 4 CPU tasks and 1 GPU task as expected. |
Tom Rinehart Send message Joined: 12 Dec 01 Posts: 113 Credit: 13,255,975 RAC: 6 |
My 27" iMac completed one WU and started another without any issues. This is the completed one: http://setiathome.berkeley.edu/result.php?resultid=4653876425 |
TBar Send message Joined: 22 May 99 Posts: 5204 Credit: 840,779,836 RAC: 2,768 |
I added -no_caching to mb_cmdline_mac_OpenCL_sah.txt. It is running on both iMacs with no files generated at run time. Actually, I forgot to add it on the 21.5" iMac and it did generate the file. I stopped Boinc, deleted the file, added -no_caching, and restarted Boinc. It seems to be working with no files generated. It sounds as though, when you started without the -no_caching option, it tried to start the GPU tasks one right after the other; I'm not sure. "Waiting for shared memory" and "Waiting to run" mean the task has been started at least once. Try looking in the Slots folder inside the BOINC Data folder and see how many slot folders you have. If you have many, it means some tasks have tried to start. You can open the stderr.txt in the higher-numbered folders and see what it says near the bottom. When running OpenCL tasks you should reserve one CPU core per running GPU task. If you don't, the GPU task will not be fed properly and will run much slower than it should. The easiest way to reserve a core is to open Tools/Computing preferences in the menu bar, go to the processor usage tab, and set "On multiprocessor systems, use at most" to 99% instead of 100%. |
Tom Rinehart Send message Joined: 12 Dec 01 Posts: 113 Credit: 13,255,975 RAC: 6 |
My 21.5" iMac also completed a WU and successfully moved on to the next one with no issues. http://setiathome.berkeley.edu/result.php?resultid=4653913023 |
Tom Rinehart Send message Joined: 12 Dec 01 Posts: 113 Credit: 13,255,975 RAC: 6 |
I'm living dangerously now. I got my wife to let me test the OpenCL app on her MacBook Pro overnight. The computer is a MacBook Pro (13-inch, Mid 2012): Processor: 2.9 GHz Intel Core i7; Memory: 8 GB 1600 MHz DDR3; Graphics: Intel HD Graphics 4000 1536 MB. I edited app_info.xml and made two changes: (1) changed <plan_class>opencl_ati_sah</plan_class> to <plan_class>opencl_intel_gpu_sah</plan_class>; (2) changed <type>ATI</type> to <type>intel_gpu</type>. It is working. I did not add -no_caching, and it did create the files. I'll see what happens when it finishes the WU. |
TBar Send message Joined: 22 May 99 Posts: 5204 Credit: 840,779,836 RAC: 2,768 |
Nice. I was just mentioning that to someone else. That One App should run on any GPU that has OpenCL. If it works as expected I'll send it to Eric as a place holder until dedicated Apps can be developed and tested. Thanks for giving it a try. |
Tom Rinehart Send message Joined: 12 Dec 01 Posts: 113 Credit: 13,255,975 RAC: 6 |
It seemed to work: http://setiathome.berkeley.edu/result.php?resultid=4654074407 It is now working on the next WU with no issues. Boinc reports the Intel GPU as OpenCL 1.2. Is the issue with the ATI HD 4XXX series GPUs that they are OpenCL 1.0 and it doesn't deal with the files correctly? |
TBar Send message Joined: 22 May 99 Posts: 5204 Credit: 840,779,836 RAC: 2,768 |
It seemed to work: http://setiathome.berkeley.edu/result.php?resultid=4654074407 Since that GPU is much more modern, and has more vram, you will probably see better performance if you change the settings to: -sbs 128 -oclfft_tune_gr 256 -oclfft_tune_wg 128 -period_iterations_num 32 or even: -sbs 256 -oclfft_tune_gr 256 -oclfft_tune_wg 128 -period_iterations_num 32 The problem with the HD4 cards is that they were the first with any type of OpenCL, and even that was listed as beta support. The HD4 cards were never listed as having full OpenCL support, and the 4670 and below have never worked with SETI@Home MB tasks on any platform. |
Raistmer Send message Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121 |
Nice. I was just mentioning that to someone else. That One App should run on any GPU that has OpenCL. If it works as expected I'll send it to Eric as a place holder until dedicated Apps can be developed and tested. Yep, in the theory of an "ideal spherical OpenCL implementation in a vacuum" it should. In reality there are so many vendor-specific bugs and tweaks in the runtimes that having one build work at least somehow on at least 2 vendors is almost a miracle ;) |
Raistmer Send message Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121 |
Is the issue with the ATI HD 4XXX series GPUs that they are OpenCL 1.0 and it doesn't deal with the files correctly? The app is still OpenCL 1.0 compliant. The issue is with the OpenCL implementation for HD4xxx cards. On the Windows side (Linux too, perhaps) AMD never released a final OpenCL runtime for HD4xxx. All were beta only, and AMD stressed that every time. On Apple's side I don't know which of them developed the runtime, but it seems they didn't get much further... Maybe the issue is with the reduced WG size for the low-end part of the HD4xxx family (I had an HD4870 that worked well until it burnt out :) ) and some bug in re-loading kernels that need re-configuration for a lower WG size (and for some reason are not saved properly with the reduced WG size)... but I don't feel it's worth digging further than the working -no_caching workaround for now. |
TBar Send message Joined: 22 May 99 Posts: 5204 Credit: 840,779,836 RAC: 2,768 |
Nice. I was just mentioning that to someone else. That One App should run on any GPU that has OpenCL. If it works as expected I'll send it to Eric as a place holder until dedicated Apps can be developed and tested. Seems we're in that mystical area then, or maybe the twilight zone. I ran the App on my 750Ti at Beta and didn't receive a single inconclusive, http://setiweb.ssl.berkeley.edu/beta/results.php?hostid=63959&offset=20. It also ran on my ATI 6870 in standalone. It's running on Tom's ATI HD4 cards and his Intel GPU. 3 for 3. *nods head* |
Gary Charpentier Send message Joined: 25 Dec 00 Posts: 30914 Credit: 53,134,872 RAC: 32 |
I added -no_caching to mb_cmdline_mac_OpenCL_sah.txt. It is running on both iMacs with no files generated at run time. Actually, I forgot to add it on the 21.5" iMac and it did generate the file. I stopped Boinc, deleted the file, added -no_caching, and restarted Boinc. It seems to be working with no files generated. Let me point you over here http://setiathome.berkeley.edu/forum_thread.php?id=40722 for the waiting for shared memory issue. It is a Mac OS default issue. |
Richard Haselgrove Send message Joined: 4 Jul 99 Posts: 14672 Credit: 200,643,578 RAC: 874 |
The one strange thing happening is that on my 27" iMac with a quad core i7, it will only run 7 CPU tasks and 1 GPU task. With the v7 apps, it would run 8 CPU tasks and 1 GPU task. All the CPU tasks in the queue that aren't running say "Waiting for shared memory" instead of "Ready to start." Some of the GPU tasks say "Waiting to run." Is this an issue with the MBv8_8.0r3304_sse41_x86_64-apple-darwin app? On the 21.5" iMac with a dual core i3, it runs 4 CPU tasks and 1 GPU task as expected. The BOINC developers addressed that issue several years ago, by switching to a different form of inter-process communication which doesn't rely on using so much shared memory. If you look at my sample app_info file for the Mac at the top of this forum (sticky thread), you'll see an extra line. <api_version>7.7.0</api_version> If you add that line to your app_info.xml file, BOINC should use the 'new' method and not over-tax the shared memory segment. [That's my theoretical understanding, at any rate. Not driving a Mac myself, I can't test it. The actual API version number itself is not important - anything from 6.1.0 (dating from 2008) should do. Just in case anything changes in the future, public releases should use the API version string embedded in the application via the linked BOINC library] |
Tom Rinehart Send message Joined: 12 Dec 01 Posts: 113 Credit: 13,255,975 RAC: 6 |
<api_version>7.7.0</api_version> Adding this fixed it. Thanks! |
Gary Charpentier Send message Joined: 25 Dec 00 Posts: 30914 Credit: 53,134,872 RAC: 32 |
<api_version>7.7.0</api_version> @Richard, which means it is not the default, so I wouldn't call it fixed yet, just an available option. |
TBar Send message Joined: 22 May 99 Posts: 5204 Credit: 840,779,836 RAC: 2,768 |
As mentioned in another thread, http://setiathome.berkeley.edu/forum_thread.php?id=78704&postid=1753958#1753958, trying to compile on a Mac with boinc-master 7.7 simply doesn't work. It will fail with the "clang: error: linker command failed with exit code 1" every time. It Still doesn't work even with boinc-master 7.5 when trying to compile from the seti_boinc repository folder. After giving up on trying to get around the 'Graphics' Errors, setting seti_boinc to 'no graphics' runs all the way to the end without any Errors then Fails with "clang: error: linker command failed with exit code 1" So, All my OSX Apps have been compiled with boinc-master 7.5. It seems Urs was finally able to compile an OSX App, I wonder which version he is using. I posted a couple AVX Apps at Crunchers Anonymous for anyone wanting to try them on their Mac, and I'm thinking about posting the latest CUDA build as it appears to be working very well. |
Richard Haselgrove Send message Joined: 4 Jul 99 Posts: 14672 Credit: 200,643,578 RAC: 874 |
<api_version>7.7.0</api_version> Actually, I disagree. Look in BOINC's Version History file, and scroll all the way down to version 6.2.6: client: copy api_version of APP_VERSIONs in scheduler reply, even if we already have the APP_VERSION. Otherwise, when upgrading from 5.10 to 6.2, we won't have the api_version, and we won't learn about it until project releases new version. That was the final step in the fix, and it dates back to 29 May 2008. Note that you had to go back to July 2007 to find a reference to the spy-hill.net workround. The difference arises from the two different ways of supplying science applications for BOINC to use: stock, or 'Anonymous Platform'. The stock pathway has been automated for many years. A Project Scientist writes a new science application for the project, and hands it over to the Project Administrator (who may be the same person, as here). The Project Administrator runs the update_versions script, which, inter alia, updates api_version: The version number of the BOINC API used by the app. Notes: And the job is done - no shared memory complaints from users of Mac applications delivered as stock from project servers. For Anonymous Platform deployment, none of that automation applies: you have to roll your own app_info.xml file, or wait for somebody like me to come along and write one for you. The Anonymous Platform documentation has indicated the use of <api_version> since v6.1.0, but to be honest, I ignored it - I couldn't see the point. It makes no visible difference on my platform, Windows. It was only a couple of years ago, when a special extension to <api_version> was added at the request of Bitcoin Utopia (!), that I started looking into the processes in any detail, and worked out what I've written above. And decided that it still didn't really matter for Windows - though there are indications that using <api_version> might make BOINC slightly more reliable [it mandates the use of PID monitoring to replace heartbeats]. 
But it does make an important difference for Mac, as Tom has confirmed. The moral of this story is that as and when TBar succeeds in building an OSX CUDA App, and is ready to offer it for wider distribution, he needs to package it with an app_info.xml file including an <api_version> marker (as documented), to implement the 2008 fix. And now, if you'll excuse me, I've got about 14 more app_info files to write this afternoon, and about 20 old ones to modify. |
jason_gee Send message Joined: 24 Nov 06 Posts: 7489 Credit: 91,093,184 RAC: 0 |
That's weird (both TBar and Richard), because firstly I built the V7 app against the Boinc 7.7 libs/api using the XCode project supplied, and secondly that app crunched flawlessly without any api version entry in the app_info.xml, as did the CPU app (wherever I got it from). Differences due to El Capitan and/or more Boinc master changes, maybe? Either way, I'm going to try a baseline build myself, so as to get the 780 working on Beta. Not sure whether it'll work on prior OSX at all. Might just ask Eric if he could roll one out with an appropriate Cuda driver restriction, and see what it works on and what it doesn't. Anyway, the beast is fired up, updating code, and will see what happens. "Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions. |
Richard Haselgrove Send message Joined: 4 Jul 99 Posts: 14672 Credit: 200,643,578 RAC: 874 |
Note that Tom said that the problem only happened on his 8 CPU + GPU host. Macs have shared memory configured by default - just not enough of it for big multi-core beasts, as spy-hill.net found all those years ago. Unless you push a big Mac really hard with anonymous platform apps, you'll probably never encounter a problem. (Maybe Apple finally realised they had to make a different configuration for your 16 cores?) |
jason_gee Send message Joined: 24 Nov 06 Posts: 7489 Credit: 91,093,184 RAC: 0 |
No idea, lol. Anyway, off and running on beta, with Cuda 7.5, 2 up on the 780. Will see if the good validation character holds for a bit, do some Linux build or another, get that online at beta too, then probably ask Eric if he has any idea what OS versions and Cuda versions might be needed (given pre-Fermi cards are basically completely defunct on these platforms [due to 64-bitness...]). |
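
For anyone who would rather script the two app_info.xml edits discussed in this thread (Tom's switch from the ATI plan class to the Intel GPU one, and Richard's <api_version> line), here is a minimal Python sketch. The sample XML, the app name, the version number and the function names are made up for illustration; a real app_info.xml has more elements. It also assumes the file parses as strict XML and that the position of <api_version> inside <app_version> does not matter, neither of which is confirmed in the thread. Back up the real file and stop BOINC before trying anything like this.

```python
import xml.etree.ElementTree as ET

# Illustrative sample only; a real app_info.xml has more elements and file names.
SAMPLE = """<app_info>
  <app><name>setiathome_v8</name></app>
  <app_version>
    <app_name>setiathome_v8</app_name>
    <version_num>800</version_num>
    <plan_class>opencl_ati_sah</plan_class>
    <coproc>
      <type>ATI</type>
      <count>1</count>
    </coproc>
  </app_version>
</app_info>"""


def retarget_gpu(root, old_plan, new_plan, old_type, new_type):
    """Switch matching app_version entries from one GPU vendor to another."""
    changed = 0
    for av in root.iter("app_version"):
        plan = av.find("plan_class")
        if plan is None or (plan.text or "").strip() != old_plan:
            continue  # leave CPU entries and other plan classes alone
        plan.text = new_plan
        for gpu_type in av.findall("coproc/type"):
            if (gpu_type.text or "").strip() == old_type:
                gpu_type.text = new_type
        changed += 1
    return changed


def ensure_api_version(root, version="7.7.0"):
    """Add <api_version> to every app_version that lacks one."""
    added = 0
    for av in root.iter("app_version"):
        if av.find("api_version") is None:
            # Appended as the last child; placement is an assumption.
            ET.SubElement(av, "api_version").text = version
            added += 1
    return added


root = ET.fromstring(SAMPLE)
retargeted = retarget_gpu(
    root, "opencl_ati_sah", "opencl_intel_gpu_sah", "ATI", "intel_gpu"
)
added = ensure_api_version(root)
result = ET.tostring(root, encoding="unicode")
print(result)
```

Running ensure_api_version a second time adds nothing, so the sketch is safe to re-run on a file that already carries the line.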
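
TBar's suggestion earlier in the thread (count the slot folders and read the bottom of stderr.txt in the higher-numbered ones) can also be sketched in Python. The function name slot_tails is hypothetical, and the "slots" folder name and the data directory you pass in are assumptions to check against your own install. The demo runs against a throwaway directory so it does not touch a real BOINC Data folder.

```python
import tempfile
from pathlib import Path


def slot_tails(data_dir, lines=5):
    """Return {slot_number: last lines of stderr.txt} for each slot folder."""
    slots_dir = Path(data_dir) / "slots"  # folder name is an assumption
    tails = {}
    if not slots_dir.is_dir():
        return tails
    for slot in slots_dir.iterdir():
        # Only numbered sub-folders count as slots.
        if not (slot.is_dir() and slot.name.isdigit()):
            continue
        log = slot / "stderr.txt"
        if log.is_file():
            text = log.read_text(errors="replace")
            tails[int(slot.name)] = text.splitlines()[-lines:]
        else:
            tails[int(slot.name)] = []
    # Sort by slot number so the higher-numbered folders come last.
    return dict(sorted(tails.items()))


# Demo against a throwaway directory so the sketch runs anywhere.
with tempfile.TemporaryDirectory() as tmp:
    for n, body in [(0, "line a\nline b\n"), (9, "started\nwaiting\n")]:
        slot = Path(tmp) / "slots" / str(n)
        slot.mkdir(parents=True)
        (slot / "stderr.txt").write_text(body)
    demo = slot_tails(tmp, lines=1)

for number, tail in demo.items():
    print(number, tail)
```

On a real host you would call slot_tails with the path to your BOINC Data folder; many entries with high numbers would match TBar's description of tasks that have tried to start.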