AMD OpenCL AstroPulse: can work fine w/o free cores

Message boards : Number crunching : AMD OpenCL AstroPulse: can work fine w/o free cores

Profile Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 6325
Credit: 106,370,077
RAC: 121
Russia
Message 1451790 - Posted: 9 Dec 2013, 11:43:19 UTC
Last modified: 9 Dec 2013, 11:44:14 UTC

Use the -cpu_lock option for that.
If you run 2 tasks per GPU, also add -instances_per_device 2 to the command line.
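
As a minimal sketch, the rule above can be written as a small helper that assembles the option string. The function name `build_ap_cmdline` is hypothetical; only the two switches `-cpu_lock` and `-instances_per_device` come from this post, and where the resulting string goes depends on how your installation passes a command line to the app.

```python
def build_ap_cmdline(tasks_per_gpu: int = 1) -> str:
    """Assemble the option string described above (hypothetical helper)."""
    if tasks_per_gpu < 1:
        raise ValueError("tasks_per_gpu must be at least 1")
    options = ["-cpu_lock"]
    # The post only covers the case of 2 tasks per GPU; passing the actual
    # count for other values is an assumption.
    if tasks_per_gpu > 1:
        options += ["-instances_per_device", str(tasks_per_gpu)]
    return " ".join(options)


print(build_ap_cmdline(1))
print(build_ap_cmdline(2))
```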

Here is a picture from a Trinity APU running AstroPulse with _ALL_ 4 CPU cores loaded with CPU MB. Times can still be improved a little by leaving a CPU core free, but there are no more GPU stalls with drastic drops in GPU load when affinity is set to a single core.

Many thanks to cov_route who noticed this feature first and encouraged me to change -cpu_lock option behavior.
SETI apps news
We're not gonna fight them. We're gonna transcend them.
Profile Mike Special Project $75 donor
Volunteer tester
Joined: 17 Feb 01
Posts: 34257
Credit: 79,922,639
RAC: 80
Germany
Message 1451807 - Posted: 9 Dec 2013, 12:25:26 UTC

The problem still remains if one gets 2 high-blanked units.
My FX still causes the GPU to stall.

I made a test with 50 APs last week.

With each crime and every kindness we birth our future.
Profile cov_route
Joined: 13 Sep 12
Posts: 342
Credit: 10,270,618
RAC: 0
Canada
Message 1452566 - Posted: 11 Dec 2013, 2:40:40 UTC - in response to Message 1451807.  

Mike wrote:
> The problem still remains if one gets 2 high-blanked units.
> My FX still causes the GPU to stall.
>
> I made a test with 50 APs last week.

I don't think that's the low-utilization bug, it's just that the blanked units run on the CPU so there is nothing for the GPU to do.

In this case the GPU job shares a core with a running CPU job and will be slower than if the core were free. On balance though, high-blanked wu's are rare enough that you still come out ahead. Also, the default priority of GPU AP tasks is "below normal" while for CPU MB tasks it is "idle", so the GPU job should get most of the core when it needs it.

As far as the GPU job is concerned it shouldn't be *that* much slower than running on an idle core (talking about high-blanked wu's).
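
To make the priority argument concrete, here is a toy model, not the real OS scheduler: when two tasks are pinned to the same core, the runnable one in the higher priority class gets the core. The class names "idle" and "below normal" are taken from the post; the `Task` type, the `runnable` flag, and `pick_runner` are hypothetical and only illustrate the reasoning.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import List, Optional


class PriorityClass(IntEnum):
    # Relative order of the priority classes named in the post.
    IDLE = 0
    BELOW_NORMAL = 1
    NORMAL = 2


@dataclass
class Task:
    name: str
    priority: PriorityClass
    runnable: bool  # False while the task is blocked, e.g. waiting on the GPU


def pick_runner(tasks_on_core: List[Task]) -> Optional[Task]:
    """Return the highest-priority runnable task, or None if all are blocked."""
    ready = [t for t in tasks_on_core if t.runnable]
    return max(ready, key=lambda t: t.priority, default=None)


cpu_mb = Task("CPU MB", PriorityClass.IDLE, runnable=True)
gpu_ap = Task("GPU AP", PriorityClass.BELOW_NORMAL, runnable=True)

# Both want the core: the GPU AP task wins.
print(pick_runner([cpu_mb, gpu_ap]).name)

# The GPU AP task is waiting on the GPU: the CPU MB task uses the core.
gpu_ap.runnable = False
print(pick_runner([cpu_mb, gpu_ap]).name)
```

In this simplified picture the CPU MB task only soaks up time the GPU AP task is not asking for, which is why the shared core should cost the GPU job relatively little.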
Profile Mike Special Project $75 donor
Volunteer tester
Joined: 17 Feb 01
Posts: 34257
Credit: 79,922,639
RAC: 80
Germany
Message 1452756 - Posted: 11 Dec 2013, 13:25:48 UTC - in response to Message 1452566.  

cov_route wrote:
> Mike wrote:
>> The problem still remains if one gets 2 high-blanked units.
>> My FX still causes the GPU to stall.
>>
>> I made a test with 50 APs last week.
>
> I don't think that's the low-utilization bug, it's just that the blanked units run on the CPU so there is nothing for the GPU to do.
>
> In this case the GPU job shares a core with a running CPU job and will be slower than if the core were free. On balance though, high-blanked wu's are rare enough that you still come out ahead. Also, the default priority of GPU AP tasks is "below normal" while for CPU MB tasks it is "idle", so the GPU job should get most of the core when it needs it.
>
> As far as the GPU job is concerned it shouldn't be *that* much slower than running on an idle core (talking about high-blanked wu's).


This might apply to slow GPUs like yours. (sorry)

Example

http://setiathome.berkeley.edu/result.php?resultid=3263100044

Unit with 68% blanking using CPU lock.
It took ~10,000 seconds; without cpu_lock, approx. 5,000.

Three other units stalled for up to 15,000 seconds without any progress at all while I was at work.
The GPU simply didn't start processing.

I have done approx. 10,000 AstroPulses so far and do a lot of testing with them.

With each crime and every kindness we birth our future.
Profile cov_route
Joined: 13 Sep 12
Posts: 342
Credit: 10,270,618
RAC: 0
Canada
Message 1453094 - Posted: 12 Dec 2013, 3:51:09 UTC - in response to Message 1452756.  

Mike wrote:
> This might apply to slow GPUs like yours. (sorry)
>
> Example
>
> http://setiathome.berkeley.edu/result.php?resultid=3263100044
>
> Unit with 68% blanking using CPU lock.
> It took ~10,000 seconds; without cpu_lock, approx. 5,000.
>
> Three other units stalled for up to 15,000 seconds without any progress at all while I was at work.
> The GPU simply didn't start processing.
>
> I have done approx. 10,000 AstroPulses so far and do a lot of testing with them.

You mention the problem crops up when you have 2 high-blanked wu's at once.

If I had a Family 15h processor, I'd look into assigning the tasks to separate modules instead of separate cores. I think the cpu_lock mechanism always assigns tasks to the next highest core, so with two tasks you will always get both jobs running on the same module.

You could use Process Lasso to try that out.
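
A rough sketch of the difference, in terms of single-core affinity masks (the kind of value you would set by hand in an affinity tool). It assumes cores 2k and 2k+1 share a module, that cpu_lock really does pin task i to core i (which is only my guess above), and an 8-core, 4-module CPU for the example; the function names are made up for illustration.

```python
from typing import List


def next_core_masks(n_tasks: int) -> List[int]:
    """Masks if task i is pinned to core i (the suspected cpu_lock behavior)."""
    return [1 << i for i in range(n_tasks)]


def one_per_module_masks(n_tasks: int, n_cores: int,
                         cores_per_module: int = 2) -> List[int]:
    """Masks that pin task i to the first core of module i."""
    n_modules = n_cores // cores_per_module
    if n_tasks > n_modules:
        raise ValueError("more tasks than modules")
    return [1 << (i * cores_per_module) for i in range(n_tasks)]


def module_of(mask: int, cores_per_module: int = 2) -> int:
    """Module index for a mask with exactly one bit set."""
    core = mask.bit_length() - 1
    return core // cores_per_module


# Two GPU tasks on an assumed 8-core (4-module) CPU.
for masks in (next_core_masks(2), one_per_module_masks(2, n_cores=8)):
    print([bin(m) for m in masks], [module_of(m) for m in masks])
```

Under these assumptions the first scheme puts both tasks on cores 0 and 1 (module 0), while the second uses cores 0 and 2, one in each of the first two modules.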

I also use the -gpu_lock option, but I think Raistmer said it doesn't do anything.

It's also possible that your 32 compute units overwhelm the "fix" that we see with less powerful GPUs. Nobody really understands the underlying problem let alone why this trick seems to help.



 