OpenCL apps are available for download on Lunatics


Josef W. Segur
Project donor
Volunteer developer
Volunteer tester
Joined: 30 Oct 99
Posts: 4347
Credit: 1,124,182
RAC: 746
United States
Message 1333730 - Posted: 2 Feb 2013, 1:04:43 UTC

A further note on the transition to r1760: the problem is the clFFTplan file ending in _32768_r1761.bin, so it should be deleted unless you're sure it was created by the Astropulse application. The other _r1761.bin files won't be needed, so if you like things neat they can be deleted too.

Joe
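
For anyone who would rather list the files before deleting anything by hand, here is a minimal sketch. It assumes the .bin files all sit in one directory that you pass in; the function name and the example file names are hypothetical, and only the two suffixes come from the post above. It lists files and deletes nothing.

```python
from pathlib import Path


def classify_r1761_bins(directory):
    """Split *_r1761.bin files into the one to check and the merely stale ones.

    Returns (suspect, leftover) as two sorted lists of file names.
    Nothing is deleted; this only reports what is there.
    """
    suspect, leftover = [], []
    for path in sorted(Path(directory).glob("*_r1761.bin")):
        if path.name.endswith("_32768_r1761.bin"):
            # The FFT plan file named in the post: delete unless it is
            # known to have been created by the Astropulse application.
            suspect.append(path.name)
        else:
            # Other r1761 leftovers: harmless, removable for neatness.
            leftover.append(path.name)
    return suspect, leftover
```

Call it with the directory that holds the application's .bin files and review both lists before removing anything.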

Wedge009
Volunteer tester
Joined: 3 Apr 99
Posts: 367
Credit: 155,335,920
RAC: 126,688
Australia
Message 1333793 - Posted: 2 Feb 2013, 3:48:43 UTC

All right, here's my updated test report for MB r1760 only (since there haven't been any significant changes for AP between r1363 and r1761, and they're all working just fine on both ATI and NV).


HD 6970

WinXP, Catalyst 11.2/11.7 kludge (only known stable configuration for my Cayman cards).

I never got to try this one with r1761. In fact, I had to pinch a WU from CPU to test r1760. But it seems to be working okay, CPU/GPU usage looks good. Time taken to complete is a little longer than usual (even considering the extra time used for binary compilation), but I've seen some MB WUs take this long even with r426. So I can't comment on relative performance till I get more WUs processed with MB r1760.

I picked this particular WU because it already has a result awaiting validation on the server, but unfortunately uploads (and downloads) seem to be dead for me at time of writing, so I'll have to wait till the server recovers before I can see if my result is valid. At least the spike/pulse/triplet/gaussian counts are consistent.

Edit: Oops, no, strike that. Uploads working again, albeit at a crawl. But the result was marked as valid!


HD 6950

WinXP, Catalyst 11.2/11.7 kludge (only known stable configuration for my Cayman cards).

I kept getting -9 result overflows with MB r1761 previously. I'm quite sure that I had AP r1761 running previously, and given that MB WUs are working again with r1760, I'd say reusing the AP r1761 bin was the problem. Looks like problem solved! The work-unit was marked as valid, in any case.


AMD C-50

Win7 32-bit, Catalyst 12.8.

Unfortunately, the good news doesn't carry over to the APUs. Result appears to be the same as before - after only a few seconds of GPU use, the driver crashed at about 0.792% progress.


AMD E-450

Win7 64-bit, Catalyst 13.1.

Given that the C-50 failed again and that the underlying GPU is very similar, I didn't feel it was worth the effort of rolling back to Catalyst 12.8 to try this one again.
____________
Soli Deo Gloria

cov_route
Project donor
Joined: 13 Sep 12
Posts: 310
Credit: 7,621,058
RAC: 1,085
Canada
Message 1333836 - Posted: 2 Feb 2013, 5:21:38 UTC

I finally broke down and decided to run a bench test.

Ran r390 and r1761 with Catalyst 13.1 and a short test WU. First thing: r1761 ran without error. The card is a 6670 (Turks).

Second, r1761 was about 30% slower (i.e. run time was about 40% longer).

Quick timetable

WU : 17dc12ad.28185.12907.8.10.253.wu
MB6_win_x86_SSE3_OpenCL_ATi_HD5_r390.exe :
  Elapsed 314.227 secs
  CPU 45.084 secs
MB7_win_x86_SSE_OpenCL_ATi_r1761.exe :
  Elapsed 444.233 secs, speedup: -41.37% ratio: 0.71x
  CPU 45.583 secs, speedup: -1.11% ratio: 0.99x
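
The speedup and ratio columns can be reproduced from the elapsed times alone. The sketch below uses formulas inferred from the numbers in the timetable, not taken from the benchmark tool's source, and the function name is hypothetical. It also shows why "about 30% slower" and "about 40% longer" describe the same result: a ratio of 0.71x corresponds to a run time roughly 41% longer.

```python
def compare(baseline_secs, candidate_secs):
    """Return (speedup_percent, ratio) of the candidate relative to the baseline.

    speedup is negative when the candidate takes longer than the baseline;
    ratio is below 1.0 in the same case.
    """
    speedup = (baseline_secs - candidate_secs) / baseline_secs * 100.0
    ratio = baseline_secs / candidate_secs
    return speedup, ratio


# Elapsed times for r390 (baseline) and r1761 (candidate) from the timetable.
speedup, ratio = compare(314.227, 444.233)
print(f"Elapsed: speedup {speedup:.2f}% ratio {ratio:.2f}x")
# prints "Elapsed: speedup -41.37% ratio 0.71x"
```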


So unless there's a science reason to switch, I'll stay with r390 for the time being. If time allows, I'll bench a few more random WUs in the days to come.

Observation: 1761 produced "WARNING: suboptimal workgroup size for PC_find_pulse_kernel_cl pass x" in stderr. From a quick look at the source I think that's just about wasteful memory alignment, shouldn't have much to do with speed.

Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 3631
Credit: 49,348,662
RAC: 29,643
Russia
Message 1333872 - Posted: 2 Feb 2013, 7:54:15 UTC - in response to Message 1333793.

All right here's my updated test report for MB r1760 only

Thanks for the detailed report.
And what period_iterations_num values have you already tried with the C-60?
____________

Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 3631
Credit: 49,348,662
RAC: 29,643
Russia
Message 1333873 - Posted: 2 Feb 2013, 7:55:28 UTC - in response to Message 1333836.


Observation: 1761 produced "WARNING: suboptimal workgroup size for PC_find_pulse_kernel_cl pass x" in stderr. From a quick look at the source I think that's just about wasteful memory alignment, shouldn't have much to do with speed.

Actually, it can affect performance. A suboptimal workgroup size means underloading the GPU. Could you give a link to a result with that warning?
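
One way to picture "underloading" is a rough sketch under stated assumptions. It assumes the hardware schedules work-items in fixed groups of 64 (a wavefront, typical for many AMD GPUs of this generation but not for all of them) and asks what fraction of those lanes a given workgroup size fills. This is an illustration only, not the check the application performs before printing its warning.

```python
import math

WAVEFRONT_SIZE = 64  # assumption: varies by GPU, query the device in real code


def lane_utilization(workgroup_size, wavefront_size=WAVEFRONT_SIZE):
    """Fraction of hardware lanes doing useful work for one workgroup."""
    # A workgroup occupies whole wavefronts, so a partial one wastes lanes.
    wavefronts = math.ceil(workgroup_size / wavefront_size)
    return workgroup_size / (wavefronts * wavefront_size)


for size in (16, 64, 96, 256):
    print(f"workgroup size {size:3d}: {lane_utilization(size):.0%} of lanes busy")
```

Under this model a workgroup of 16 keeps only a quarter of a wavefront busy, while multiples of 64 fill it completely.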

____________

BilBg
Volunteer tester
Joined: 27 May 07
Posts: 2894
Credit: 6,613,167
RAC: 7,940
Bulgaria
Message 1333879 - Posted: 2 Feb 2013, 8:17:57 UTC

My current app_info.xml has (for both MB and AP, as installed by Lunatics_Win32_v0.40_setup.exe):
<plan_class>ati13ati</plan_class>

... and my client_state.xml has 95 lines with <plan_class>ati13ati</plan_class>

(I don't have <platform>windows_intelx86</platform> in app_info.xml, but client_state.xml has it for every <result>)


And in the new files:
MB7_win_x86_SSE_OpenCL_ATi.aistub
<plan_class>ati_opencl_sah</plan_class>
<plan_class>opencl_ati_sah</plan_class>

AP6_win_x86_SSE2_OpenCL_ATI.aistub
<plan_class>ati_opencl_100</plan_class>
<plan_class>opencl_ati_100</plan_class>


If I (or anybody) just replace the sections for MB and AP in app_info.xml, I think the tasks will be deleted.

Do the .aistub files need to include another section with <plan_class>ati13ati</plan_class>?
And what about 64-bit users? Do they need another 3 sections with <platform>windows_x86_64</platform>?
Another problem is <version_num>601</version_num> for AP (as installed by Lunatics_Win32_v0.40_setup.exe).

So: use the new .aistub files with care (meaning: think before use, duplicate the sections and edit them to fit your current situation).
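
A quick way to spot this kind of mismatch before restarting BOINC is to compare the plan classes your existing tasks reference with the ones the new sections declare. The sketch below is a minimal illustration: the two XML snippets are hypothetical, heavily trimmed stand-ins for client_state.xml and an edited app_info.xml, and it assumes the real files parse as well-formed XML.

```python
import xml.etree.ElementTree as ET


def plan_classes(xml_text):
    """Collect every <plan_class> value found anywhere in an XML document."""
    root = ET.fromstring(xml_text)
    return {el.text.strip() for el in root.iter("plan_class") if el.text}


# Hypothetical stand-in for client_state.xml: tasks tagged with the old class.
CLIENT_STATE = """<client_state>
  <result><plan_class>ati13ati</plan_class></result>
  <result><plan_class>ati13ati</plan_class></result>
</client_state>"""

# Hypothetical stand-in for app_info.xml after pasting in the new sections.
NEW_APP_INFO = """<app_info>
  <app_version><plan_class>ati_opencl_sah</plan_class></app_version>
  <app_version><plan_class>opencl_ati_sah</plan_class></app_version>
</app_info>"""

# Plan classes that existing tasks use but no app version declares any more.
orphaned = plan_classes(CLIENT_STATE) - plan_classes(NEW_APP_INFO)
print("plan classes with no matching app version:", sorted(orphaned))
```

With real files, read each one with open(...).read() and pass the text in; an empty result means every plan class in use still has a matching section.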


____________



- ALF - "Find out what you don't do well ..... then don't do it!" :)

Wedge009
Volunteer tester
Joined: 3 Apr 99
Posts: 367
Credit: 155,335,920
RAC: 126,688
Australia
Message 1333910 - Posted: 2 Feb 2013, 10:33:03 UTC - in response to Message 1333836.

And what period_iterations_num values have you already tried with the C-60?

Mine is a C-50, but essentially the same as your C-60 except for the turbo clock boost. Anyway, I normally use 24, but at 32 I still get a driver crash. Do you suggest going even higher?

Observation: 1761 produced "WARNING: suboptimal workgroup size for PC_find_pulse_kernel_cl pass x" in stderr.

I haven't seen this warning in my WU outputs, nor a consistent slow down in performance, but I still only have a small sample of completed WUs to make observations on.

Regarding the plan classes, I just manually switched them over to opencl_ati_sah, etc, in my client_state.xml before restarting BOINC. No problem... if you remember to do it.
____________
Soli Deo Gloria

Mike
Project donor
Volunteer tester
Joined: 17 Feb 01
Posts: 25183
Credit: 34,749,905
RAC: 19,978
Germany
Message 1333915 - Posted: 2 Feb 2013, 10:46:39 UTC - in response to Message 1333910.
Last modified: 2 Feb 2013, 10:47:22 UTC

Mine is a C-50, but essentially the same as your C-60 except for the turbo clock boost. Anyway, I normally use 24, but at 32 I still get a driver crash. Do you suggest going even higher?


Definitely

Try 50 or more.
The GPU part is not that powerful.
____________

Wedge009
Volunteer tester
Joined: 3 Apr 99
Posts: 367
Credit: 155,335,920
RAC: 126,688
Australia
Message 1333922 - Posted: 2 Feb 2013, 11:36:29 UTC
Last modified: 2 Feb 2013, 11:37:30 UTC

I know very well that it's not very powerful. But I tried values of 64, 128, 256, 512 (I know there's no real reason to use powers of 2, I just did), but all of them resulted in driver crashes within seconds of the GPU being used. If there are any other suggestions, I'll give them a try.

Rolling back to r426 yet again.
____________
Soli Deo Gloria

Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 3631
Credit: 49,348,662
RAC: 29,643
Russia
Message 1333928 - Posted: 2 Feb 2013, 11:55:36 UTC - in response to Message 1333922.

I know very well that it's not very powerful. But I tried values of 64, 128, 256, 512 (I know there's no real reason to use powers of 2, I just did), but all of them resulted in driver crashes within seconds of the GPU being used. If there are any other suggestions, I'll give them a try.

Rolling back to r426 yet again.


Unfortunately, the overheating issue on my C-60 netbook has only gotten worse; now it shuts down even during AP offline testing with an idle CPU. So I can't check how MB7 runs on the C-60 for now. Maybe next week, after repair...
____________

Wedge009
Volunteer tester
Joined: 3 Apr 99
Posts: 367
Credit: 155,335,920
RAC: 126,688
Australia
Message 1333933 - Posted: 2 Feb 2013, 12:16:00 UTC

That's crazy. Do you know how to open it and try cleaning out the heat-sink and fan? It's a bit fiddly, but I learned how to open the case to replace the memory - the pre-installed 1 GiB was paltry.

Hopefully, it's not like what happened to my most powerful laptop - pipe between GPU and heat-sink seems to be no longer effective. I can feel the heat being transferred on the pipe between the heat-sink and CPU, but the GPU one is relatively cold, which tells me something is broken. >.< Well, it's only a Mobility Radeon 2600, so the best it could do was the old CAL-based hybrid AP, anyway.
____________
Soli Deo Gloria

Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 3631
Credit: 49,348,662
RAC: 29,643
Russia
Message 1333937 - Posted: 2 Feb 2013, 12:34:49 UTC - in response to Message 1333933.

That's crazy. Do you know how to open it and try cleaning out the heat-sink and fan?

I tried to oil the fan spindle, but without any success. Complete fan removal requires not only opening the bottom of the case but almost complete disassembly of the netbook. I want to try getting it repaired via a service center (if the warranty works; it was bought in the USA...). If not, I will have no choice but disassembly.
____________

Wedge009
Volunteer tester
Joined: 3 Apr 99
Posts: 367
Credit: 155,335,920
RAC: 126,688
Australia
Message 1333940 - Posted: 2 Feb 2013, 12:40:46 UTC

So sounds like the fan itself is broken or not spinning very well, at best.

Well, never mind, then. Looks like I'll have to be your low-end Fusion tester for the time being. (:
____________
Soli Deo Gloria

Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 3631
Credit: 49,348,662
RAC: 29,643
Russia
Message 1333942 - Posted: 2 Feb 2013, 12:45:29 UTC - in response to Message 1333940.

So sounds like the fan itself is broken or not spinning very well, at best.

Yes, the fan is broken. Sometimes it spins fast and noiselessly, but soon afterwards it starts vibrating strongly with a loud noise and the rotation speed drops (and the APU temperature rises to the point of emergency shutdown, near 90°C).
____________

Claggy
Project donor
Volunteer tester
Joined: 5 Jul 99
Posts: 4239
Credit: 34,937,158
RAC: 23,142
United Kingdom
Message 1333945 - Posted: 2 Feb 2013, 12:58:47 UTC - in response to Message 1333942.
Last modified: 2 Feb 2013, 12:59:03 UTC

So sounds like the fan itself is broken or not spinning very well, at best.

Yes, the fan is broken. Sometimes it spins fast and noiselessly, but soon afterwards it starts vibrating strongly with a loud noise and the rotation speed drops (and the APU temperature rises to the point of emergency shutdown, near 90°C).

My C2D T8100 laptop has been doing that: sometimes noisy, sometimes quiet, with temps up to 90°C too. Complete disassembly of the laptop was required to get at the fan assembly. I cleaned the fan assembly and fan, and lubricated the fan shaft with cycle oil. Now temps are ~65°C, still 5°C too high for my liking, but good enough.

Claggy

cov_route
Project donor
Joined: 13 Sep 12
Posts: 310
Credit: 7,621,058
RAC: 1,085
Canada
Message 1333968 - Posted: 2 Feb 2013, 14:32:09 UTC - in response to Message 1333873.


Observation: 1761 produced "WARNING: suboptimal workgroup size for PC_find_pulse_kernel_cl pass x" in stderr. From a quick look at the source I think that's just about wasteful memory alignment, shouldn't have much to do with speed.

Actually, it can affect performance. A suboptimal workgroup size means underloading the GPU. Could you give a link to a result with that warning?

I got that from a benchmark run. I could send you the result somehow if you like.

In other news, I tried benchmarking another, longer-running WU (r1761 and Catalyst 13.1) and I got a driver restart.

Mike
Project donor
Volunteer tester
Joined: 17 Feb 01
Posts: 25183
Credit: 34,749,905
RAC: 19,978
Germany
Message 1333970 - Posted: 2 Feb 2013, 14:36:48 UTC - in response to Message 1333873.


Observation: 1761 produced "WARNING: suboptimal workgroup size for PC_find_pulse_kernel_cl pass x" in stderr. From a quick look at the source I think that's just about wasteful memory alignment, shouldn't have much to do with speed.

Actually, it can affect performance. A suboptimal workgroup size means underloading the GPU. Could you give a link to a result with that warning?


It appears on VHARs only.
Times are still very good.

____________

JarrettH
Joined: 14 Nov 02
Posts: 72
Credit: 13,432,294
RAC: 7,651
Canada
Message 1334086 - Posted: 2 Feb 2013, 22:04:12 UTC

Core 2 Duo E6600 (2.4 GHz)
GeForce 550 Ti

Astropulse on CPU ~12 hours
Astropulse on GPU 90 minutes for the first unit
____________

Wedge009
Volunteer tester
Joined: 3 Apr 99
Posts: 367
Credit: 155,335,920
RAC: 126,688
Australia
Message 1334172 - Posted: 3 Feb 2013, 5:03:31 UTC
Last modified: 3 Feb 2013, 5:05:26 UTC

One more report from me.

HD 4670

WinXP, Catalyst 11.12.

I did this one earlier, but didn't bother reporting it. I didn't expect MB to work since maximum work-group size is only 128, and I was right: r1760 seemed to get stuck on compiling the FFT plans. The bin_V6 file was compiled okay.

This is the same AGP card that had been plagued with several invalid AP results in the past (you might remember helping me with this one, Raistmer). Interesting that in recent months, it seems to have stabilised enough on r1363 and r1761 is also working okay so far. I noticed that Catalyst 12.1 produces a lot more invalid results than Catalyst 11.12.


HD 4850

WinXP, Catalyst 11.12.

This one I have set-up on the computer I made for my brother. No work has been done on the Radeon card for a long, long time, but so far the stability issues I remembered from previously seem to be no longer present. At least from what we've seen so far. AP works great, MB r1760/r1761 also running without apparent issue. But the work-units will take a while to complete, so can't comment on their correctness just yet.
____________
Soli Deo Gloria

cov_route
Project donor
Joined: 13 Sep 12
Posts: 310
Credit: 7,621,058
RAC: 1,085
Canada
Message 1334291 - Posted: 3 Feb 2013, 15:36:28 UTC

TLDR: Only 2 sample points: r1761 run time is 35-55% longer than r390 on a 6670.

A couple of test MB WUs:

HD6670, Phenom II 945, PCI Express 1.1 x16, Win 7 64, Catalyst 12.8

Quick timetable

WU : 04ja13ag.2565.9474.14.10.95.wu
MB6_win_x86_SSE3_OpenCL_ATi_HD5_r390.exe :
  Elapsed 2260.388 secs
  CPU 159.979 secs
MB7_win_x86_SSE_OpenCL_ATi_r1761.exe :
  Elapsed 3503.245 secs, speedup: -54.98% ratio: 0.65x
  CPU 149.901 secs, speedup: 6.30% ratio: 1.07x

WU : 17dc12ad.28185.12907.8.10.253.wu
MB6_win_x86_SSE3_OpenCL_ATi_HD5_r390.exe :
  Elapsed 349.442 secs
  CPU 45.037 secs
MB7_win_x86_SSE_OpenCL_ATi_r1761.exe :
  Elapsed 472.977 secs, speedup: -35.35% ratio: 0.74x
  CPU 46.301 secs, speedup: -2.81% ratio: 0.97x



Copyright © 2014 University of California