OpenCL apps are available for download on Lunatics

Message boards : Number crunching : OpenCL apps are available for download on Lunatics

Previous · 1 · 2 · 3 · 4 · 5 · 6 · 7 · 8 . . . 9 · Next
Author Message
Profile MikeProject donor
Volunteer tester
Avatar
Send message
Joined: 17 Feb 01
Posts: 24062
Credit: 33,035,395
RAC: 24,509
Germany
Message 1334313 - Posted: 3 Feb 2013, 17:30:16 UTC

You can't compare the HD5 app version against the non-HD5 app.
Stay tuned, we will release a new HD5 app soon.

____________

Profile MikeProject donor
Volunteer tester
Avatar
Send message
Joined: 17 Feb 01
Posts: 24062
Credit: 33,035,395
RAC: 24,509
Germany
Message 1334356 - Posted: 3 Feb 2013, 18:52:53 UTC
Last modified: 3 Feb 2013, 18:55:21 UTC

For those having trouble running Cat 13.1:
Just uninstall the APP runtime and use the previous version under the same drivers.

You might want to use a file manager like System Commander.

Shut down BOINC.

Go to the C:/AMD/support/13-1_vista_win7_win8_64_dd_ccc_whql/Packages/apps/OpenCL64 directory and run OpenCL.msi to uninstall the APP runtime.

Now follow these steps carefully.


Restart your computer.

Now unpack Cat 12.8 by clicking on it, but abort the installation and instead run OpenCL.msi from the 12.8 package.

Restart your computer again.

Check with GPU-Z or GPU Caps Viewer that the OpenCL box is checked.

Now you are running Cat 13.1 with the OpenCL runtime from 12.8.
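As a rough cross-check of the GPU-Z / GPU Caps Viewer step, you can also probe for the OpenCL runtime programmatically. This is only a sketch: it confirms that an OpenCL ICD loader library is findable on the system, not that the runtime actually works with your card.

```python
import ctypes
import ctypes.util

def opencl_runtime_present():
    """True if an OpenCL loader library can be found and loaded on
    this system -- a rough stand-in for the GPU-Z 'OpenCL' checkbox.
    On Windows this looks for OpenCL.dll, on Linux for libOpenCL."""
    name = ctypes.util.find_library("OpenCL")
    if name is None:
        return False
    try:
        ctypes.CDLL(name)
        return True
    except OSError:
        return False
```

A `False` here after the 12.8 OpenCL.msi step would suggest the runtime did not install correctly.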

Hope this helps.
____________

Profile cov_route
Avatar
Send message
Joined: 13 Sep 12
Posts: 293
Credit: 6,974,720
RAC: 14,454
Canada
Message 1334446 - Posted: 3 Feb 2013, 22:08:42 UTC - in response to Message 1334356.

For those having trouble running Cat 13.1:
Just uninstall the APP runtime and use the previous version under the same drivers....


I don't suppose it would be as easy as: start with a clean install of 12.8, run the 13.1 installer but don't check the box for "APP SDK Runtime"? I.e., keep the previous APP runtime?

Profile MikeProject donor
Volunteer tester
Avatar
Send message
Joined: 17 Feb 01
Posts: 24062
Credit: 33,035,395
RAC: 24,509
Germany
Message 1334451 - Posted: 3 Feb 2013, 22:23:08 UTC - in response to Message 1334446.
Last modified: 3 Feb 2013, 22:24:40 UTC

For those having trouble running Cat 13.1:
Just uninstall the APP runtime and use the previous version under the same drivers....


I don't suppose it would be as easy as: start with a clean install of 12.8, run the 13.1 installer but don't check the box for "APP SDK Runtime"? I.e., keep the previous APP runtime?


It's not that hard at all.
It took me 20 minutes.

Some might want to keep 13.1 for new games.

Edit: It's for those who already have 13.1 installed.
____________

Profile TRuEQ & TuVaLu
Volunteer tester
Avatar
Send message
Joined: 4 Oct 99
Posts: 470
Credit: 18,234,256
RAC: 22,881
Sweden
Message 1334453 - Posted: 3 Feb 2013, 23:14:36 UTC

First task done with 1761:
http://setiathome.berkeley.edu/result.php?resultid=2821308526
Speed is about the same as 1363, and it works, which is the important thing.


Good Work!!!

Wedge009
Volunteer tester
Avatar
Send message
Joined: 3 Apr 99
Posts: 323
Credit: 142,405,787
RAC: 258,445
Australia
Message 1339804 - Posted: 20 Feb 2013, 20:25:21 UTC
Last modified: 20 Feb 2013, 20:25:57 UTC

I've noticed a warning message that's been bothering me since r1761:

WARNING: can't open binary kernel file for oclFFT plan: C:\Documents and Settings\All Users\Application Data\BOINC/projects/setiathome.berkeley.edu\AP_clFFTplan_ATIRV770_32768_r1766.bin, continue with recompile...

On two of my hosts, I get this message with every single work-unit (and, obviously, I haven't purged any binary caches). This is what I've been able to find, relating to this warning message:

  • Affected hosts both run Windows XP.
  • Affected hosts have this warning message on HD 4000 GPUs. HD 6000 GPUs (only other ATI GPUs I have) are not affected.
  • This is only an issue on ATI AstroPulse. NV AstroPulse and ATI MB are not affected.
  • Issue appears to be present in r1761, r1764, and r1766. It was not present in r1636.

I'm just wondering, is this going to be a problem for me? Are these hosts really going to spend CPU cycles recompiling for every single AP WU?
____________
Soli Deo Gloria

Profile MikeProject donor
Volunteer tester
Avatar
Send message
Joined: 17 Feb 01
Posts: 24062
Credit: 33,035,395
RAC: 24,509
Germany
Message 1339855 - Posted: 20 Feb 2013, 22:03:49 UTC

It's not really a problem.
Nothing has changed regarding binary creation since r1316, AFAIK.

Yes, it will create the binaries on each AP unit.

Maybe you changed something else.
BOINC/driver, etc.?

____________

Profile Raistmer
Volunteer developer
Volunteer tester
Avatar
Send message
Joined: 16 Jun 01
Posts: 3419
Credit: 46,514,731
RAC: 10,087
Russia
Message 1339857 - Posted: 20 Feb 2013, 22:13:39 UTC - in response to Message 1339804.
Last modified: 20 Feb 2013, 22:17:34 UTC

I've noticed a warning message that's been bothering me since r1761:

WARNING: can't open binary kernel file for oclFFT plan: C:\Documents and Settings\All Users\Application Data\BOINC/projects/setiathome.berkeley.edu\AP_clFFTplan_ATIRV770_32768_r1766.bin, continue with recompile...

On two of my hosts, I get this message with every single work-unit (and, obviously, I haven't purged any binary caches). This is what I've been able to find, relating to this warning message:

  • Affected hosts both run Windows XP.
  • Affected hosts have this warning message on HD 4000 GPUs. HD 6000 GPUs (only other ATI GPUs I have) are not affected.
  • This is only an issue on ATI AstroPulse. NV AstroPulse and ATI MB are not affected.
  • Issue appears to be present in r1761, r1764, and r1766. It was not present in r1636.

I'm just wondering, is this going to be a problem for me? Are these hosts really going to spend CPU cycles recompiling for every single AP WU?


Stop BOINC.
Locate that file. Does it exist?
Rename it.
Start BOINC.
Is the file recreated?
Compare the recreated one with the saved one: are they identical?
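The final comparison step can be done by hand, but a small script makes it repeatable. A minimal sketch, assuming you substitute the real paths of the saved and recreated `AP_clFFTplan_*.bin` files:

```python
import hashlib

def file_digest(path):
    """Return the SHA-256 hex digest of a file's contents,
    reading in chunks so large kernel binaries are handled cheaply."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()

def files_identical(path_a, path_b):
    """True if both files have byte-identical contents."""
    return file_digest(path_a) == file_digest(path_b)
```

If the recreated binary differs from the saved one, that points at non-deterministic kernel compilation rather than a simple read failure.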

EDIT:
Is this result from the affected host?
http://setiathome.berkeley.edu/result.php?resultid=2841704346

If yes, it's better to stop running ATI AP on that host; it produces invalids.
Perhaps low-end HD4xxx cards are no longer supported at all.
____________

Wedge009
Volunteer tester
Avatar
Send message
Joined: 3 Apr 99
Posts: 323
Credit: 142,405,787
RAC: 258,445
Australia
Message 1339883 - Posted: 21 Feb 2013, 1:02:17 UTC

I'll double-check when I get home, but I have restarted BOINC and the whole system at least a few times since r1761 was first released. I haven't changed the Catalyst drivers, but that's a good point: the affected hosts are using Catalyst 11.12, and none of my other hosts use that particular version.

As for the invalid results: yes, I'm aware of those. That's the AGP HD 4670 you helped me with some time ago. The funny thing about that card is that its failures (and successes) are inconsistent. With BOINC and the host remaining on continuously and no driver or other system changes, there will be a bunch of valid AP results for a while, then a bunch of invalid results, then a bunch of valid results... I do know that Catalyst 12.1 gives a lot of invalid results on this card, though, but Catalyst 11.12 is okay.
____________
Soli Deo Gloria

Wedge009
Volunteer tester
Avatar
Send message
Joined: 3 Apr 99
Posts: 323
Credit: 142,405,787
RAC: 258,445
Australia
Message 1340084 - Posted: 22 Feb 2013, 4:38:49 UTC

Phew, site is back on-line...

Interestingly, I just realised that the ATI bin cache is not being written to disk at all, which would explain the warning messages. Now the puzzle is why it's not being written to disk. The directory is not write-protected in any way, and the other .bin_V6 and .wisdom cache files are being written as normal.

Any idea why the bin file isn't getting saved with each WU run?
____________
Soli Deo Gloria

Profile Raistmer
Volunteer developer
Volunteer tester
Avatar
Send message
Joined: 16 Jun 01
Posts: 3419
Credit: 46,514,731
RAC: 10,087
Russia
Message 1340129 - Posted: 22 Feb 2013, 7:19:38 UTC - in response to Message 1340084.

Phew, site is back on-line...

Interestingly, I just realised that the ATI bin cache is not being written to disk at all, which would explain the warning messages. Now the puzzle is why it's not being written to disk. The directory is not write-protected in anyway, and the other .bin_V6 and .wisdom cache files are being written as normal.

Any idea why the bin file isn't getting saved with each WU run?


Because "patching" is required for oclFFT.
It means that after kernel formation, the oclFFT setup code finds that it can't execute the prepared kernels on that particular device, and re-creates some of them. There were reports that after the binary cache was introduced, the app stopped working on low-end ATI devices (workgroup size of 128), so I disabled cache creation in that case. It will always throw such a warning, but it should create the "right" kernels each time. Though it looks like that last statement is not the case, at least not because of the binary cache.
The app was compiled with SDK 2.5, so you need the corresponding driver to run without issues.
We can try an SDK 2.4-compiled one instead as a debugging effort, but it will take some time to get.
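The cache policy described above can be sketched roughly as follows. The file naming is modelled on the `AP_clFFTplan_ATIRV770_32768_r1766.bin` name from the warning, and the `patching_required` flag is a stand-in for the real oclFFT setup check; both are simplifications of the actual app.

```python
import os

def cache_filename(device_name, fft_size, revision):
    # Modelled on the name seen in the warning message, e.g.
    # AP_clFFTplan_ATIRV770_32768_r1766.bin
    return "AP_clFFTplan_%s_%d_%s.bin" % (device_name, fft_size, revision)

def maybe_write_cache(path, binary, patching_required):
    """Write the compiled kernel binary to disk, unless the device
    needed its kernels patched after creation. In that case a cached
    binary would not match what actually runs, so caching is skipped
    and the app recompiles (and warns) on every run."""
    if patching_required:
        return False
    with open(path, "wb") as f:
        f.write(binary)
    return True
```

This matches the observed behaviour: on devices that trigger patching, the .bin file never appears and the "can't open binary kernel file" warning recurs every work unit.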

____________

Wedge009
Volunteer tester
Avatar
Send message
Joined: 3 Apr 99
Posts: 323
Credit: 142,405,787
RAC: 258,445
Australia
Message 1340149 - Posted: 22 Feb 2013, 8:25:53 UTC

Ah, that makes sense. The 'patching' warning, that is, as both hosts where the .bin cache is not present also exhibit the patching warning. That's fine; as long as I know there's no major problem, I'm satisfied. Thanks for the answer. APP SDK 2.5 support is supposed to start with Catalyst 11.7, so both hosts should be okay at Catalyst 11.12, anyway.

Just for the record, the HD 4850 has a workgroup size of 256, though it produces the patching warning.

Somewhat related, the invalid results you were observing on my HD 4670 earlier, I have a sneaking suspicion - though no valid proof yet - that the invalid results may be caused by CPU starvation. It's a very old Pentium 4 CPU on this host, so I don't have any CPU tasks running on it, but the AstroPulse NV tasks do cause a lot of CPU usage. I'm aware of the high-CPU usage issue in recent GeForce drivers... what's the most recent set of drivers that did not exhibit this high-CPU issue? After waiting so long for AMD to fix Catalyst's high-CPU usage problem, I'm disappointed Nvidia still hasn't fixed theirs...
____________
Soli Deo Gloria

Profile jason_geeProject donor
Volunteer developer
Volunteer tester
Avatar
Send message
Joined: 24 Nov 06
Posts: 4982
Credit: 73,380,946
RAC: 15,619
Australia
Message 1340152 - Posted: 22 Feb 2013, 8:33:56 UTC - in response to Message 1340149.
Last modified: 22 Feb 2013, 8:49:08 UTC

Somewhat related, the invalid results you were observing on my HD 4670 earlier, I have a sneaking suspicion - though no valid proof yet - that the invalid results may be caused by CPU starvation. It's a very old Pentium 4 CPU on this host, so I don't have any CPU tasks running on it, but the AstroPulse NV tasks do cause a lot of CPU usage. I'm aware of the high-CPU usage issue in recent GeForce drivers... what's the most recent set of drivers that did not exhibit this high-CPU issue? After waiting so long for AMD to fix Catalyst's high-CPU usage problem, I'm disappointed Nvidia still hasn't fixed theirs...


IIRC, around 266.58 or so (maybe earlier, someone else can correct me) was the last NV OpenCL 1.0 driver, from before they went to multithreaded drivers. Those older drivers use the blocking-style synchronisation semantics required for that kind of coding to use low amounts of CPU. All nVidia OpenCL 1.1 drivers expect the developer to implement GPU callback functionality (a la OpenGL/DirectX) for low CPU usage, as blocking synchronisation is no longer used with multithreaded drivers at low levels (a form of 'deadlock prevention', moving to 'non-blocking' APIs).
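The difference between the two synchronisation styles can be illustrated with a plain-Python analogy, using `threading.Event` to stand in for a GPU completion event. This sketches the concept only, not any driver's actual implementation:

```python
import threading
import time

def wait_blocking(done):
    """Blocking-style sync: the waiting thread sleeps in the OS until
    the event fires, costing essentially no CPU (the old OpenCL 1.0
    driver style)."""
    done.wait()
    return True

def wait_spinning(done):
    """Busy-wait polling: the thread repeatedly asks 'done yet?',
    burning a CPU core until completion (the behaviour observed with
    multithreaded drivers lacking a completion callback). Returns the
    number of polls performed."""
    polls = 0
    while not done.is_set():
        polls += 1
    return polls

def run_with(waiter, work_seconds=0.05):
    """Run 'waiter' against a stand-in GPU kernel that completes
    after work_seconds."""
    done = threading.Event()
    worker = threading.Thread(
        target=lambda: (time.sleep(work_seconds), done.set()))
    worker.start()
    result = waiter(done)
    worker.join()
    return result
```

With a callback-based design, the application would instead register a function to be invoked on completion and do useful work in the meantime, getting blocking-style CPU usage without blocking.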
____________
"It is not the strongest of the species that survives, nor the most intelligent that survives. It is the one that is the most adaptable to change."
Charles Darwin

Wedge009
Volunteer tester
Avatar
Send message
Joined: 3 Apr 99
Posts: 323
Credit: 142,405,787
RAC: 258,445
Australia
Message 1340167 - Posted: 22 Feb 2013, 9:29:39 UTC

Thanks for that. I haven't done any OpenCL development before, but I think I understand what's going on there.

After looking around, it seems OpenCL 1.1 was introduced with GeForce 280.26, with the release notes for GeForce 275.33 indicating OpenCL support is at 1.0. Are you saying the CPU issue relates to the multithreaded drivers introduced with OpenCL 1.1, or did the switch to multithreading happen before that?

Also, you say that OpenCL 1.1 is dependent on the developer to implement GPU call-back - is this the case for the current AstroPulse NV application?

I might give 275.33 a try tomorrow.
____________
Soli Deo Gloria

Profile jason_geeProject donor
Volunteer developer
Volunteer tester
Avatar
Send message
Joined: 24 Nov 06
Posts: 4982
Credit: 73,380,946
RAC: 15,619
Australia
Message 1340169 - Posted: 22 Feb 2013, 10:07:49 UTC - in response to Message 1340167.

...After looking around, it seems OpenCL 1.1 was introduced with GeForce 280.26, with the release notes for GeForce 275.33 indicating OpenCL support is at 1.0. You're saying the CPU issue relates to the multi-threading drivers introduced with OpenCL 1.1 or did the switch to multi-threading happen before that?
It's a while back now, but I recall a bit of overlap due to the usual transition bugs, etc.

Also, you say that OpenCL 1.1 is dependent on the developer to implement GPU call-back - is this the case for the current AstroPulse NV application?
Not sure which direction Raistmer went in after OpenCL 1.0, but yes. I'm not aware whether AMD's OpenCL 1.2 implementation works around this with CUDA-like syncs at a higher level, or with a change/amendment to the OpenCL specification (or both*).

[* Including callback functionality derived from pre-existing OpenGL, DirectX and Windows Core Audio technologies]
____________
"It is not the strongest of the species that survives, nor the most intelligent that survives. It is the one that is the most adaptable to change."
Charles Darwin

Wedge009
Volunteer tester
Avatar
Send message
Joined: 3 Apr 99
Posts: 323
Credit: 142,405,787
RAC: 258,445
Australia
Message 1340176 - Posted: 22 Feb 2013, 10:42:25 UTC
Last modified: 22 Feb 2013, 10:43:54 UTC

Thanks for the information. I thought of one more point: I notice that running AP and MB simultaneously on the same NV GPU results in far lower overall CPU usage (on the GPU tasks) than running multiple AP WUs on a single NV card only. I notice this pattern across all hosts, where those hosts also run CPU-only tasks, on recent GeForce drivers.

Does this match your experience of running tasks on NV GPUs?
____________
Soli Deo Gloria

Profile jason_geeProject donor
Volunteer developer
Volunteer tester
Avatar
Send message
Joined: 24 Nov 06
Posts: 4982
Credit: 73,380,946
RAC: 15,619
Australia
Message 1340185 - Posted: 22 Feb 2013, 12:02:03 UTC - in response to Message 1340176.
Last modified: 22 Feb 2013, 12:02:24 UTC

Does this match your experience of running tasks on NV GPUs?


Technically no, because I don't run GPU AP. x41zc is pretty low CPU usage on its own here.
____________
"It is not the strongest of the species that survives, nor the most intelligent that survives. It is the one that is the most adaptable to change."
Charles Darwin

Profile Raistmer
Volunteer developer
Volunteer tester
Avatar
Send message
Joined: 16 Jun 01
Posts: 3419
Credit: 46,514,731
RAC: 10,087
Russia
Message 1340257 - Posted: 22 Feb 2013, 16:08:47 UTC - in response to Message 1340176.

Thanks for the information. I thought of one more point: I notice that running AP and MB simultaneously on the same NV GPU results in far lower overall CPU usage (on the GPU tasks) than running multiple AP WUs on a single NV card only. I notice this pattern across all hosts, where those hosts also run CPU-only tasks, on recent GeForce drivers.

Does this match your experience of running tasks on NV GPUs?


It can be explained if we postulate that OpenCL sync uses a busy-wait loop while CUDA uses interrupt-based signalling (or something more modern that I'm not aware of, but not a busy-wait loop).
Then, if the GPU is busy with both OpenCL AP and CUDA MB kernels, there is time for the AP kernel to complete before the driver thread starts to poll constantly. Hence, lower CPU usage overall.
Also, I see a BIG drop in CPU usage for the same reason between an idle and a busy CPU (studied in detail on an APU, observed on an Intel "APU" (Ivy Bridge) and on "CPU-consuming" NV drivers on a GTX 260). If the CPU is busy with some work, it will take longer to switch to the polling driver thread. Hence, the GPU may report completion on the first poll, saving CPU cycles for useful work.
Unfortunately, in this case the CPU can easily become TOO busy to feed the GPU with new work, so while CPU consumption will be quite low, GPU load and overall host performance will drop considerably.
Hence, freeing a core for GPU feeding is some sort of compromise.
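A toy model of the effect described above. All numbers are hypothetical; this only illustrates why a delayed first poll can mean fewer polls overall, not how any real driver behaves:

```python
import math

def polls_until_done(kernel_ms, first_poll_delay_ms, poll_interval_ms=1.0):
    """How many times the driver thread polls before observing that
    the kernel has finished. A busier CPU delays the first poll; if
    the kernel completes within that delay, a single poll suffices
    and almost no CPU is burned on polling."""
    if first_poll_delay_ms >= kernel_ms:
        return 1
    remaining = kernel_ms - first_poll_delay_ms
    return 1 + math.ceil(remaining / poll_interval_ms)
```

For a hypothetical 10 ms kernel, an idle CPU that starts polling immediately performs 11 polls, while a CPU that is busy for 12 ms before its first poll performs just one. Push the delay too far, though, and the GPU sits idle waiting for new work, which is the compromise described above.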

____________

Richard HaselgroveProject donor
Volunteer tester
Send message
Joined: 4 Jul 99
Posts: 8494
Credit: 49,815,986
RAC: 51,753
United Kingdom
Message 1340269 - Posted: 22 Feb 2013, 16:41:16 UTC - in response to Message 1340257.

Thanks for the information. I thought of one more point: I notice that running AP and MB simultaneously on the same NV GPU results in far lower overall CPU usage (on the GPU tasks) than running multiple AP WUs on a single NV card only. I notice this pattern across all hosts, where those hosts also run CPU-only tasks, on recent GeForce drivers.

Does this match your experience of running tasks on NV GPUs?


It can be explained if we postulate that OpenCL sync uses a busy-wait loop while CUDA uses interrupt-based signalling (or something more modern that I'm not aware of, but not a busy-wait loop).
Then, if the GPU is busy with both OpenCL AP and CUDA MB kernels, there is time for the AP kernel to complete before the driver thread starts to poll constantly. Hence, lower CPU usage overall.
Also, I see a BIG drop in CPU usage for the same reason between an idle and a busy CPU (studied in detail on an APU, observed on an Intel "APU" (Ivy Bridge) and on "CPU-consuming" NV drivers on a GTX 260). If the CPU is busy with some work, it will take longer to switch to the polling driver thread. Hence, the GPU may report completion on the first poll, saving CPU cycles for useful work.
Unfortunately, in this case the CPU can easily become TOO busy to feed the GPU with new work, so while CPU consumption will be quite low, GPU load and overall host performance will drop considerably.
Hence, freeing a core for GPU feeding is some sort of compromise.

Which takes us back to a question which I think has been raised before.

Is it possible, or is it not possible, for a developer to code for "GPU callback functionality ( ala. OpenGL/DirectX )" under the OpenCL 1.1 platform on NVidia hardware, using open-source GPL-compliant development environments?

If so, could somebody put together a recommended toolset and programming guide for study?

Profile jason_geeProject donor
Volunteer developer
Volunteer tester
Avatar
Send message
Joined: 24 Nov 06
Posts: 4982
Credit: 73,380,946
RAC: 15,619
Australia
Message 1340271 - Posted: 22 Feb 2013, 16:49:57 UTC - in response to Message 1340269.
Last modified: 22 Feb 2013, 16:50:35 UTC

If so, could somebody put together a recommended toolset and programming guide for study?


Like the nVidia OpenCL Ocean Demo (from the GPU Computing SDK)?
____________
"It is not the strongest of the species that survives, nor the most intelligent that survives. It is the one that is the most adaptable to change."
Charles Darwin


Copyright © 2014 University of California