Can I prevent longer 7.03 WUs being downloaded? (GPU problem)

Profile Graham Thomas

Joined: 19 May 99
Posts: 32
Credit: 3,880,875
RAC: 4
United Kingdom
Message 1651592 - Posted: 11 Mar 2015, 9:08:29 UTC
Last modified: 11 Mar 2015, 9:09:53 UTC

I haven't seen any Astropulse units for a long time now, but I do get lots of seti@home v7 7.03 WUs downloaded for my GPU. This in itself isn't a problem, but units scheduled to take a longer time to complete end up stalling. They reach a certain percentage of their worktime (usually around 20%), then stick at that percentage and amount of time completed, while incrementing the time to completion. Also, at the point when the WU stalls, my screen display goes blank and then recovers.

These WUs are typically scheduled to take 2-3 hours on my machine, whereas shorter WUs that are scheduled to complete in just over an hour run without problems. I've been told (in response to an earlier post here) that there are some combinations of GPU hardware and drivers that just don't handle the longer units. (Details of my setup are below.)

I've got used to aborting the longer WUs before they run, but it's tedious. Is there a way to make sure they don't download in the first place? From the spec of the cc_config.xml file, it looks like it might be possible to stop *all* v7.03 WUs from downloading (via the exclude_gpu option, though I'm no expert here), but that would leave my GPU idle until Astropulse units start flowing again. Is there any way to distinguish between longer and shorter WUs?
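For reference, the exclude_gpu option mentioned above might be generated roughly like this. It is only a sketch: the app short name `setiathome_v7` is an assumption (check the name your client actually reports), `build_exclude_gpu` is a hypothetical helper, and the entry blocks *all* GPU work for that app, not just the longer WUs.

```python
import xml.etree.ElementTree as ET


def build_exclude_gpu(url, app=None, device_num=None, gpu_type=None):
    """Return cc_config.xml text containing a single <exclude_gpu> entry."""
    root = ET.Element("cc_config")
    options = ET.SubElement(root, "options")
    excl = ET.SubElement(options, "exclude_gpu")
    ET.SubElement(excl, "url").text = url
    # The remaining elements are optional; leaving one out widens the exclusion.
    if device_num is not None:
        ET.SubElement(excl, "device_num").text = str(device_num)
    if gpu_type is not None:
        ET.SubElement(excl, "type").text = gpu_type
    if app is not None:
        ET.SubElement(excl, "app").text = app
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # "setiathome_v7" is an assumed app short name, not confirmed in this thread.
    print(build_exclude_gpu("http://setiathome.berkeley.edu/",
                            app="setiathome_v7", device_num=0, gpu_type="ATI"))
```

The output would go into cc_config.xml in the BOINC data directory, and the client needs to re-read its config files (or be restarted) before it takes effect.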

Here's my setup:

11/03/2015 08:51:59 | | CAL: ATI GPU 0: AMD Radeon HD 6520G/6530D/6550D/6620G (SuperSumo) (CAL version 1.4.1848, 512MB, 479MB available, 960 GFLOPS peak)
11/03/2015 08:51:59 | | OpenCL: AMD/ATI GPU 0: AMD Radeon HD 6520G/6530D/6550D/6620G (SuperSumo) (driver version 1642.5 (VM), device version OpenCL 1.2 AMD-APP (1642.5), 512MB, 479MB available, 960 GFLOPS peak)
11/03/2015 08:51:59 | | OpenCL CPU: AMD A8-3850 APU with Radeon(tm) HD Graphics (OpenCL driver vendor: Advanced Micro Devices, Inc., driver version 1642.5 (sse2), device version OpenCL 1.2 AMD-APP (1642.5))
11/03/2015 08:51:59 | | Processor: 4 AuthenticAMD AMD A8-3850 APU with Radeon(tm) HD Graphics [Family 18 Model 1 Stepping 0]
11/03/2015 08:51:59 | | Processor features: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 htt pni cx16 popcnt syscall nx lm svm sse4a osvw ibs skinit wdt page1gb rdtscp 3dnowext 3dnow
11/03/2015 08:51:59 | | OS: Microsoft Windows 7: Home Premium x64 Edition, Service Pack 1, (06.01.7601.00)
Graham Thomas
ID: 1651592 · Report as offensive
Profile Mike Special Project $75 donor
Volunteer tester
Joined: 17 Feb 01
Posts: 34255
Credit: 79,922,639
RAC: 80
Germany
Message 1651596 - Posted: 11 Mar 2015, 9:26:24 UTC

That's not possible.

Check your project folder for a file called mb_cmdline_win_x86_SSE_OpenCL_ATi_HD5.txt

Open and add the following.

-period_iterations_num 50 -hp

Save it again.
On next WU startup those params should be in use.


With each crime and every kindness we birth our future.
ID: 1651596 · Report as offensive
TBar
Volunteer tester

Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1651770 - Posted: 11 Mar 2015, 19:14:35 UTC - in response to Message 1651636.  

@ Mike - I'm also getting a lot of 7.03 jobs that are taking 8 hours to get to 50% on an ATI HD4770. Normally they take 2 1/2 hours to complete.

Would the advice to Graham help me?

I was having a few ATI tasks run slow unless they were suspended and then resumed. Raising the SBS setting helped. If you haven't raised it, try using these settings:
-sbs 256 -period_iterations_num 48
ID: 1651770 · Report as offensive
Profile Mike Special Project $75 donor
Volunteer tester
Joined: 17 Feb 01
Posts: 34255
Credit: 79,922,639
RAC: 80
Germany
Message 1651897 - Posted: 11 Mar 2015, 23:39:11 UTC - in response to Message 1651770.  

I was having a few ATI tasks run slow unless they were suspended and then resumed. Raising the SBS setting helped. If you haven't raised it, try using these settings:
-sbs 256 -period_iterations_num 48


On older apps like stock r_1831 it's better to use -sbs 216 instead.


With each crime and every kindness we birth our future.
ID: 1651897 · Report as offensive
Profile Graham Thomas

Joined: 19 May 99
Posts: 32
Credit: 3,880,875
RAC: 4
United Kingdom
Message 1652028 - Posted: 12 Mar 2015, 10:39:21 UTC - in response to Message 1651596.  

Thanks, Mike.

However, I can't see a file called

mb_cmdline_win_x86_SSE_OpenCL_ATi_HD5.txt

in my

C:\ProgramData\BOINC\projects\setiathome.berkeley.edu

folder (which I hope is what you meant by 'projects' folder).

I do, though, have two (empty) files there called:

mb_cmdline_7.03_opencl_ati_sah.txt
mb_cmdline_7.03_opencl_ati5_sah.txt

Would it be useful to add the -period_iterations_num line to either or both of these?
Graham Thomas
ID: 1652028 · Report as offensive
Claggy
Volunteer tester

Joined: 5 Jul 99
Posts: 4654
Credit: 47,537,079
RAC: 4
United Kingdom
Message 1652030 - Posted: 12 Mar 2015, 10:43:45 UTC - in response to Message 1652028.  

Would it be useful to add the -period_iterations_num line to either or both of these?

Yes.

Mike has used the file name that is used for anonymous platform, while you're running Stock, and the filenames for that are different.
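If you'd rather script the edit than open each file by hand, something along these lines might do it. A minimal sketch: the directory and file names are the ones quoted above, `set_cmdline_params` is a hypothetical helper, and it assumes the app just reads the file contents as its command line.

```python
from pathlib import Path

# Directory and file names as quoted in this thread (stock app, Windows).
PROJECT_DIR = Path(r"C:\ProgramData\BOINC\projects\setiathome.berkeley.edu")
CMDLINE_FILES = [
    "mb_cmdline_7.03_opencl_ati_sah.txt",
    "mb_cmdline_7.03_opencl_ati5_sah.txt",
]
PARAMS = "-period_iterations_num 50 -hp"


def set_cmdline_params(project_dir, filenames, params):
    """Add params to each existing command-line file; return the names updated."""
    updated = []
    for name in filenames:
        path = Path(project_dir) / name
        if not path.is_file():
            continue  # only touch files the project already created
        current = path.read_text().strip()
        if params in current:
            continue  # already present, nothing to do
        path.write_text((current + " " + params).strip())
        updated.append(name)
    return updated


if __name__ == "__main__":
    print(set_cmdline_params(PROJECT_DIR, CMDLINE_FILES, PARAMS))
```

As Mike says, the parameters should only be picked up when the next WU starts.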

Claggy
ID: 1652030 · Report as offensive
Claggy
Volunteer tester

Joined: 5 Jul 99
Posts: 4654
Credit: 47,537,079
RAC: 4
United Kingdom
Message 1652031 - Posted: 12 Mar 2015, 10:46:44 UTC - in response to Message 1651592.  
Last modified: 12 Mar 2015, 11:15:36 UTC

You can also try deleting the compilations the app made, and let it generate new ones, see this post for what to delete.
Edit 2: An alternative is to do a Detach/Reattach instead (Remove project/Add project in Boinc 7 speak)

Edit: Actually, looking at your result history, I'd say you must do this immediately, before you try any other changes:

This AstroPulse v6 v6.06 (opencl_ati_100) had you running Cat 11.6, with SDK 2.4
http://setiathome.berkeley.edu/result.php?resultid=3556802831
OpenCL Platform Name: AMD Accelerated Parallel Processing
Number of devices: 1
Max compute units: 5
Max work group size: 256
Max clock frequency: 600Mhz
Max memory allocation: 209715200
Cache type: None
Cache line size: 0
Cache size: 0
Global memory size: 536870912
Constant buffer size: 65536
Max number of constant args: 8
Local memory type: Scratchpad
Local memory size: 32768
Queue properties:
Out-of-Order: No
Name: BeaverCreek
Vendor: Advanced Micro Devices, Inc.
Driver version: CAL 1.4.1417 (VM)
Version: OpenCL 1.1 AMD-APP-SDK-v2.4 (650.9)
Extensions: cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_3d_image_writes cl_khr_byte_addressable_store cl_khr_gl_sharing cl_amd_device_attribute_query cl_amd_vec3 cl_amd_printf cl_amd_media_ops cl_amd_popcnt cl_khr_d3d10_sharing

This AstroPulse v7 v7.04 (opencl_ati_100) wu has the stderr.txt report that your drivers are too low:
http://setiathome.berkeley.edu/result.php?resultid=3777658873
Running on device number: 0
Priority of worker thread raised successfully
Priority of process adjusted successfully, below normal priority class used
OpenCL platform detected: Advanced Micro Devices, Inc.
BOINC assigns device 0
Info: BOINC provided OpenCL device ID used
ERROR: unsupported OpenCL runtime version: OpenCL 1.1 AMD-APP-SDK-v2.4 (650.9). Please update drivers! Exiting...

The Astropulse apps, being fairly recent, will redo their compilations when the driver/APP runtime is changed.
This isn't the case with the ATI/AMD Seti_v7 apps. They're quite old and do their compilations only once, so it's entirely possible you've done the compilations for Seti_v7 on the incompatible SDK 2.4.
The minimum driver/SDK combination is Cat 11.7 with CAL 1.4.1457, SDK 2.5, and OpenCL runtime 2.5.684.213.
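To check a "Version:" line from stderr against that minimum, a comparison like the following could be used. It is a sketch under assumptions: `app_build` and `meets_minimum` are hypothetical helpers, and it assumes the build in parentheses (e.g. 650.9) corresponds to the last two components of the runtime version (684.213 for the minimum above).

```python
import re

# "Version:" strings copied from stderr output quoted in this thread.
OLD = "OpenCL 1.1 AMD-APP-SDK-v2.4 (650.9)"
NEW = "OpenCL 1.2 AMD-APP (1642.5)"


def app_build(version_line):
    """Return the build number in the trailing parentheses as a tuple of ints."""
    m = re.search(r"\((\d+(?:\.\d+)*)\)\s*$", version_line)
    if m is None:
        return None
    return tuple(int(part) for part in m.group(1).split("."))


def meets_minimum(version_line, minimum=(684, 213)):
    """True if the build is at least `minimum` (assumed to be runtime 2.5.684.213)."""
    build = app_build(version_line)
    return build is not None and build >= minimum


if __name__ == "__main__":
    for line in (OLD, NEW):
        print(line, "->", "OK" if meets_minimum(line) else "too old")
```

Comparing tuples of integers avoids the trap of comparing version strings or floats, where "650.9" could sort above "1642.5".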

ATI Driver Version Cheat Sheet

Claggy
ID: 1652031 · Report as offensive
Profile BilBg
Volunteer tester
Joined: 27 May 07
Posts: 3720
Credit: 9,385,827
RAC: 0
Bulgaria
Message 1652410 - Posted: 13 Mar 2015, 8:24:09 UTC - in response to Message 1652031.  

The minimum driver/SDK combination is Cat 11.7 with CAL 1.4.1457, SDK 2.5, and OpenCL runtime 2.5.684.213

ATI Driver Version Cheat Sheet

Are you sure? (are the stock apps so old?)
Better to use Catalyst 11.12 (SDK 2.6 / AMD-APP 831.4) as the minimum, to be prepared if the stock apps are updated to the current 'optimized' ones (or if the user decides later to use the Lunatics installer).
 


- ALF - "Find out what you don't do well ..... then don't do it!" :)
 
ID: 1652410 · Report as offensive
Profile Mike Special Project $75 donor
Volunteer tester
Joined: 17 Feb 01
Posts: 34255
Credit: 79,922,639
RAC: 80
Germany
Message 1652436 - Posted: 13 Mar 2015, 9:54:12 UTC

Well, r_1831 is almost 2 years old.
May 2013.
It would make sense to upgrade to a more recent app if possible.


With each crime and every kindness we birth our future.
ID: 1652436 · Report as offensive
Claggy
Volunteer tester

Joined: 5 Jul 99
Posts: 4654
Credit: 47,537,079
RAC: 4
United Kingdom
Message 1652446 - Posted: 13 Mar 2015, 10:14:33 UTC - in response to Message 1652410.  
Last modified: 13 Mar 2015, 10:16:43 UTC

The minimum driver/SDK combination is Cat 11.7 with CAL 1.4.1457, SDK 2.5, and OpenCL runtime 2.5.684.213

ATI Driver Version Cheat Sheet

Are you sure? (are the stock apps so old?)
Better to use Catalyst 11.12 (SDK 2.6 / AMD-APP 831.4) as the minimum, to be prepared if the stock apps are updated to the current 'optimized' ones (or if the user decides later to use the Lunatics installer).

No, I'm no longer sure about that. I have a memory that when the ATI/AMD Seti_v7 app was released, it was sent to devices with too low a CAL/SDK version.
Whether it was both SDK 2.4 and SDK 2.5 that were too low, or just SDK 2.4, I don't remember.
Eric then put in a restriction so work was only sent to devices with particular CAL/SDK versions, but even that isn't 100% foolproof; we still see hosts with mismatched driver versions that manage to get work.

Claggy
ID: 1652446 · Report as offensive
Profile Graham Thomas

Joined: 19 May 99
Posts: 32
Credit: 3,880,875
RAC: 4
United Kingdom
Message 1652464 - Posted: 13 Mar 2015, 11:46:02 UTC - in response to Message 1652446.  

OK, now I'm a bit confused.

Some of the results reported above (especially the Astropulse 6 result) are from several months ago, before I updated my ATI drivers. They were certainly old then, and I updated after I saw that Astropulse 7 units were exiting (or stalling - can't remember) as soon as they started.

You can see what I'm running now from my post at the top of this thread. I've just checked my AMD Catalyst Control Center and the software update checker tells me I'm up-to-date (I'm running the standard version, not the beta.) The Catalyst software suite version is given as 14.501.1003.etc

I can see a CAL version from my seti@home event log. It's 1.4.1848, which is higher than the minimum given in posts above. My OpenCL version is 1.2 and as far as I know it just comes with the Catalyst suite. I can't see any mention of SDK.

I can certainly try detaching from the project and reattaching. But what would that change?
Graham Thomas
ID: 1652464 · Report as offensive
Claggy
Volunteer tester

Joined: 5 Jul 99
Posts: 4654
Credit: 47,537,079
RAC: 4
United Kingdom
Message 1652474 - Posted: 13 Mar 2015, 12:31:13 UTC - in response to Message 1652464.  
Last modified: 13 Mar 2015, 12:31:38 UTC

I can certainly try detaching from the project and reattaching. But what would that change?

Because the ATI/AMD app is quite old, it only does compilations on its first run. If you were running SDK 2.4 or 2.5 drivers at the time, then it'll have produced broken compilations (the same goes for Catalyst 13.1).
Doing a Project Reset won't remove the compilations, as that only removes files mentioned in client_state.xml, and the compilations aren't. A Detach/Reattach will remove the compilations, as it clears out the whole directory,
allowing the current APP runtime to regenerate them.
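To see which files a Project Reset would leave behind, one could list everything in the project directory that client_state.xml never mentions. This is a rough sketch: `unreferenced_files` is a hypothetical helper, it matches on plain file names in the XML text rather than parsing it, and it only lists files, it doesn't delete anything or tell you which ones are compilations.

```python
from pathlib import Path


def unreferenced_files(project_dir, client_state_path):
    """List files in project_dir whose names never appear in client_state.xml."""
    state_text = Path(client_state_path).read_text(errors="replace")
    return sorted(
        p.name
        for p in Path(project_dir).iterdir()
        if p.is_file() and p.name not in state_text
    )


if __name__ == "__main__":
    # Paths assume the default Windows data directory mentioned in this thread.
    data_dir = Path(r"C:\ProgramData\BOINC")
    project = data_dir / "projects" / "setiathome.berkeley.edu"
    state = data_dir / "client_state.xml"
    if project.is_dir() and state.is_file():
        for name in unreferenced_files(project, state):
            print(name)
```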

Claggy
ID: 1652474 · Report as offensive
Profile HAL9000
Volunteer tester
Joined: 11 Sep 99
Posts: 6534
Credit: 196,805,888
RAC: 57
United States
Message 1652479 - Posted: 13 Mar 2015, 13:09:00 UTC - in response to Message 1652464.  

OK, now I'm a bit confused.

Some of the results reported above (especially the Astropulse 6 result) are from several months ago, before I updated my ATI drivers. They were certainly old then, and I updated after I saw that Astropulse 7 units were exiting (or stalling - can't remember) as soon as they started.

You can see what I'm running now from my post at the top of this thread. I've just checked my AMD Catalyst Control Center and the software update checker tells me I'm up-to-date (I'm running the standard version, not the beta.) The Catalyst software suite version is given as 14.501.1003.etc

I can see a CAL version from my seti@home event log. It's 1.4.1848, which is higher than the minimum given in posts above. My OpenCL version is 1.2 and as far as I know it just comes with the Catalyst suite. I can't see any mention of SDK.

I can certainly try detaching from the project and reattaching. But what would that change?

The CAL version that BOINC reports isn't very useful for SETI@home, as SETI@home doesn't use CAL applications. It does let us know that driver 13.12 or newer is installed for ATI cards up to the HD7000 series. The Rx 2xx cards do not support CAL & report no CAL version.
The OpenCL version reported of 1.2 is the version of OpenCL the hardware/driver supports. This information also isn't super useful.
The information that IS useful is located in the stderr_txt for the completed tasks.
Looking at a recent AP v7 task you did, we see:
Driver version:				 1642.5 (VM)
Version:				 OpenCL 1.2 AMD-APP (1642.5)

If we compare that to the ATI Driver Version Cheat Sheet I try to keep up to date, we see version 1642.5, which is really 10.10.1642.5, & that tells us you are using the most recent SDK/OpenCL.

As Claggy is saying, there are 3 parts to watch for with ATI drivers/apps: the driver, the SDK/OCL component (which sometimes doesn't get updated when installing a new driver), & the compiled files that the SETI@home applications generate. Compiled files from older apps/driver versions should be removed so that the current configuration can be compiled.
SETI@home classic workunits: 93,865 CPU time: 863,447 hours
Join the BP6/VP6 User Group: http://tinyurl.com/8y46zvu
ID: 1652479 · Report as offensive
Admiral Gloval
Joined: 31 Mar 13
Posts: 20238
Credit: 5,308,449
RAC: 0
United States
Message 1652486 - Posted: 13 Mar 2015, 13:35:43 UTC

I am running Boinc 7.4.36 stock / Cat 14.12 / AMD FX 4170 4.2 GHz / AMD XFX 260X

Running 4 MB on CPU / 1 MB on GPU
Are these decent running times for my GPU?


ID: 1652486 · Report as offensive
Profile Graham Thomas

Joined: 19 May 99
Posts: 32
Credit: 3,880,875
RAC: 4
United Kingdom
Message 1653136 - Posted: 15 Mar 2015, 10:43:48 UTC - in response to Message 1652486.  

Thanks for all that extra info. Very useful.

I've now detached from and reattached to the project. I noticed that the apps were downloaded again, so I guess that the recompilations have been done.

I tried leaving some of the longer 7.03 units in the list to see if they would complete. The first one that was processed did stall, but it got further along (ca 60%) than was the case previously. Now I'll try leaving just the 'shorter longer' units (if that makes sense - those scheduled to take, say, 2 hrs 15 mins to finish rather than the 3 hrs or more scheduled for the longest ones) and see what happens.

I'll then try adding the line suggested above to the two files.

(Oh, and I also tried letting Device Manager update my display driver, just in case my AMD Catalyst suite wasn't up-to-date. Big mistake. It 'updated' to a 2011 driver and totally knocked out the GPU. Thank goodness for the roll-back feature.)

Admiral Gloval: it looks like your machine is processing the 7.03 units about twice as fast as mine. How good that is depends, of course, on how capable your processor is supposed to be (there are comparison tables you can search for), but from here it looks OK.
Graham Thomas
ID: 1653136 · Report as offensive
Profile HAL9000
Volunteer tester
Joined: 11 Sep 99
Posts: 6534
Credit: 196,805,888
RAC: 57
United States
Message 1653548 - Posted: 16 Mar 2015, 14:38:38 UTC - in response to Message 1653136.  


(Oh, and I also tried letting Device Manager update my display driver, just in case my AMD Catalyst suite wasn't up-to-date. Big mistake. It 'updated' to a 2011 driver and totally knocked out the GPU. Thank goodness for the roll-back feature.)

Yeah, it is best to NEVER let MS do anything concerning your video driver. If you want to know which driver is the latest, it will be on http://support.amd.com/en-us/download, but for BOINC projects the latest driver is not always, or even often, the best driver to use.
The drivers distributed by MS don't include the OpenCL component, so it wouldn't do you any good, or you might end up with a driver and OpenCL version mismatch, which would generally just cause your GPU to trash all of the work it tries to do.
SETI@home classic workunits: 93,865 CPU time: 863,447 hours
Join the BP6/VP6 User Group: http://tinyurl.com/8y46zvu
ID: 1653548 · Report as offensive

©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.