Script for Affinity & Priority Management


Message boards : Number crunching : Script for Affinity & Priority Management

cov_route
Joined: 13 Sep 12
Posts: 295
Credit: 7,391,151
RAC: 12,252
Canada
Message 1345007 - Posted: 10 Mar 2013, 16:53:01 UTC

I wrote a Windows PowerShell script that manages affinities and priorities for the SETI tasks. Previously I used Process Lasso for this, but it's limited.

If anyone is interested in this kind of thing I'm happy to share it. Tested on one machine only.

bill
Joined: 16 Jun 99
Posts: 861
Credit: 23,961,314
RAC: 14,048
United States
Message 1345091 - Posted: 10 Mar 2013, 21:31:01 UTC - in response to Message 1345007.

What are the advantages of managing affinities and priorities, for those of us who don't know?

Any disadvantages?

cov_route
Joined: 13 Sep 12
Posts: 295
Credit: 7,391,151
RAC: 12,252
Canada
Message 1345102 - Posted: 10 Mar 2013, 21:53:57 UTC

I find that by setting the GPU process to a single-core affinity I don't have to idle any cores. That's the biggest motivation for controlling affinity.

The other one is theoretical: by tying each process to a core, its data is more likely to remain in L2 cache (on my processor each core has its own L2).

Also, since my machine is the family computer, it's nice to be able to set different priorities/affinities when the machine is idle or active, which I do.
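
For readers who have not worked with affinity before, here is a minimal sketch in Python (not the PowerShell script discussed in this thread). Windows represents a process's affinity as a bitmask with one bit per logical CPU, and .NET exposes it as the `ProcessorAffinity` property of a process object. The helper names `affinity_mask` and `next_gpu_core` are hypothetical, and the rotation policy is only one possible way of varying the core the GPU process gets.

```python
def affinity_mask(cores):
    """Build an affinity bitmask: bit n is set when logical CPU n is allowed."""
    mask = 0
    for core in cores:
        if core < 0:
            raise ValueError("core index must be non-negative")
        mask |= 1 << core
    return mask


def next_gpu_core(previous_core, core_count):
    """Pick the next core for the GPU process, wrapping around (illustrative policy)."""
    return (previous_core + 1) % core_count


if __name__ == "__main__":
    core_count = 4  # assumed machine size for the demo
    gpu_core = next_gpu_core(3, core_count)
    print(f"GPU task pinned to core {gpu_core}: mask {affinity_mask([gpu_core]):#06b}")
    print(f"CPU tasks on all cores: mask {affinity_mask(range(core_count)):#06b}")
```

A mask with a single bit set pins a process to one core, which is the single-core affinity described above.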

Mike
Project donor
Volunteer tester
Joined: 17 Feb 01
Posts: 24500
Credit: 33,830,438
RAC: 24,160
Germany
Message 1345111 - Posted: 10 Mar 2013, 22:05:43 UTC

How many instances are you running on your 6850?

I see at least your APs suffering.

Horacio
Joined: 14 Jan 00
Posts: 536
Credit: 75,048,098
RAC: 41,187
Argentina
Message 1345114 - Posted: 10 Mar 2013, 22:11:51 UTC - in response to Message 1345091.

> What are the advantages of managing affinities and priorities, for those of us who don't know?
>
> Any disadvantages?

I don't know of any advantage to micromanaging affinities... Some people have said they get better performance on Intel HT CPUs by using affinity to force the use of the "real" cores over the "virtual" ones... But that means you will be using half the cores, and in my tests any performance gain from using half the cores was not enough to compensate for the loss of the other cores. (I have no AMD CPUs, so I can't say anything about them.)

About priorities I agree with Mark. Giving a higher priority to tasks might give a certain rise in their performance (more noticeable for GPU tasks), but it could lead to a less responsive computer for everyday use. I've not noticed any ill effect on my main host with the GPU apps raised to "normal" priority instead of the default "below normal" value, and I use it the whole day for my work...

As always when talking about optimizations and performance, YMMV...

bill
Joined: 16 Jun 99
Posts: 861
Credit: 23,961,314
RAC: 14,048
United States
Message 1345122 - Posted: 10 Mar 2013, 22:23:29 UTC - in response to Message 1345114.

Ok, thanks all for the information.

cov_route
Joined: 13 Sep 12
Posts: 295
Credit: 7,391,151
RAC: 12,252
Canada
Message 1345123 - Posted: 10 Mar 2013, 22:24:13 UTC - in response to Message 1345111.

> How many instances are you running on your 6850?
>
> I see at least your APs suffering.

It's a 6670, one instance.

Mike
Project donor
Volunteer tester
Joined: 17 Feb 01
Posts: 24500
Credit: 33,830,438
RAC: 24,160
Germany
Message 1345125 - Posted: 10 Mar 2013, 22:30:36 UTC - in response to Message 1345123.

>> How many instances are you running on your 6850?
>>
>> I see at least your APs suffering.
>
> It's a 6670, one instance.

Then your times are fine.

Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 3491
Credit: 47,654,056
RAC: 47,033
Russia
Message 1345480 - Posted: 11 Mar 2013, 19:07:24 UTC - in response to Message 1345102.
Last modified: 11 Mar 2013, 19:07:39 UTC

> I find that by setting the GPU process to a single-core affinity I don't have to idle any cores. That's the biggest motivation for controlling affinity.
>
> The other one is theoretical: by tying each process to a core, its data is more likely to remain in L2 cache (on my processor each core has its own L2).
>
> Also, since my machine is the family computer, it's nice to be able to set different priorities/affinities when the machine is idle or active, which I do.

FYI, such observations were made before on some Catalyst drivers too; that's why both OpenCL apps have a built-in ability to lock their affinity.
Some time ago this option was even enabled by default, but 2 logical CPUs were reserved per app. You now report an improvement in CPU handling when only one logical CPU is available per app instance. Did I understand that correctly? If so, that's new information.

cov_route
Joined: 13 Sep 12
Posts: 295
Credit: 7,391,151
RAC: 12,252
Canada
Message 1345619 - Posted: 11 Mar 2013, 23:06:18 UTC - in response to Message 1345480.

> You now report an improvement in CPU handling when only one logical CPU is available per app instance. Did I understand that correctly? If so, that's new information.

That is exactly what I have found. It is consistent and repeatable. I've been running with all cores loaded with CPU jobs and full GPU utilization for months using that technique.

No cores idled. It works even if the CPU processes are at higher priority than the GPU process.

The initial motivation for writing the script was so I could vary the core that the GPU process is assigned to, instead of it always being forced to the same core with Process Lasso. Eventually I added other stuff.

Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 3491
Credit: 47,654,056
RAC: 47,033
Russia
Message 1345622 - Posted: 11 Mar 2013, 23:20:42 UTC - in response to Message 1345619.
Last modified: 11 Mar 2013, 23:21:51 UTC

Thanks. I will try to reproduce this on my own host and then consider changing the app's CPUlock behavior to allow only a single core instead of 2 on multicore CPUs.

EDIT: And I think the script is worth posting anyway.

cov_route
Joined: 13 Sep 12
Posts: 295
Credit: 7,391,151
RAC: 12,252
Canada
Message 1345634 - Posted: 12 Mar 2013, 0:08:33 UTC

Link: https://www.box.com/s/fahbx7qali3rnwrnmtz8

Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 3491
Credit: 47,654,056
RAC: 47,033
Russia
Message 1345770 - Posted: 12 Mar 2013, 10:41:36 UTC - in response to Message 1345634.
Last modified: 12 Mar 2013, 10:42:33 UTC

Regarding Task Scheduler: you propose putting a job into it.
I did the same before with a retry script and now can't get rid of it. The network issue is fixed and I deleted the job in Task Scheduler... but it tries to run anyway (!).
Then I renamed the script file, but from time to time a black window opens for a fraction of a second (and closes, not finding the script it tries to run). But no corresponding job is listed in Task Scheduler. Maybe you know where in the registry they are stored, so I can delete the orphaned job from there?

Richard Haselgrove
Project donor
Volunteer tester
Joined: 4 Jul 99
Posts: 8631
Credit: 51,488,576
RAC: 48,398
United Kingdom
Message 1345772 - Posted: 12 Mar 2013, 10:52:25 UTC - in response to Message 1345770.

> Regarding Task Scheduler: you propose putting a job into it.
> I did the same before with a retry script and now can't get rid of it. The network issue is fixed and I deleted the job in Task Scheduler... but it tries to run anyway (!).
> Then I renamed the script file, but from time to time a black window opens for a fraction of a second (and closes, not finding the script it tries to run). But no corresponding job is listed in Task Scheduler. Maybe you know where in the registry they are stored, so I can delete the orphaned job from there?

You can manage them from the Task Scheduler GUI, but it's just a matter of finding where it's stored. If you didn't specify a group when you created the task, expand the tree in the left-hand navigation pane - it'll probably be in 'Microsoft', perhaps down one level to 'Windows'.

cov_route
Joined: 13 Sep 12
Posts: 295
Credit: 7,391,151
RAC: 12,252
Canada
Message 1345841 - Posted: 12 Mar 2013, 15:29:04 UTC

You can see Task Scheduler do its work in Event Viewer. Look for Task Scheduler under Software/Microsoft in EV (Windows 7).

You can see whether it really is the scheduler firing off those jobs and, if so, find the name of the scheduler entry that does it.

HAL9000
Volunteer tester
Joined: 11 Sep 99
Posts: 4429
Credit: 118,814,411
RAC: 139,108
United States
Message 1345853 - Posted: 12 Mar 2013, 20:07:59 UTC - in response to Message 1345770.
Last modified: 12 Mar 2013, 20:29:58 UTC

> Regarding Task Scheduler: you propose putting a job into it.
> I did the same before with a retry script and now can't get rid of it. The network issue is fixed and I deleted the job in Task Scheduler... but it tries to run anyway (!).
> Then I renamed the script file, but from time to time a black window opens for a fraction of a second (and closes, not finding the script it tries to run). But no corresponding job is listed in Task Scheduler. Maybe you know where in the registry they are stored, so I can delete the orphaned job from there?

You will find the tasks stored in the registry here:
HKLM\SOFTWARE\Microsoft\Windows NT\CurrentVersion\Schedule\TaskCache\
Under it there are two subkeys.
\Tasks
There is a subkey for each task with a name like "{061F6307-2074-4401-96BD-D360147E7375}", but each will contain a string value named "Path" with the task name as its data.
\Tree
This is much like the tree in the GUI, so the task will be in the corresponding location here. I put my scripts in the folder "Task Scheduler Library", so they are directly under \Tree.
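
To look for an orphaned entry without clicking through every GUID, a read-only sketch along these lines might help. It uses Python's standard `winreg` module and only reads the \Tasks subkeys and their "Path" values as described above; the function names and the search word "retry" are hypothetical, and reading this key may require an elevated prompt. It deletes nothing.

```python
import sys

TASK_CACHE = r"SOFTWARE\Microsoft\Windows NT\CurrentVersion\Schedule\TaskCache"


def find_tasks(needle, entries):
    """Return (guid, path) pairs whose task path contains needle, ignoring case."""
    needle = needle.lower()
    return [(guid, path) for guid, path in entries if needle in path.lower()]


def read_task_entries():
    """Yield (guid, path) for each subkey under TaskCache\\Tasks (Windows only)."""
    import winreg  # imported here so the helper above also works on other systems

    with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, TASK_CACHE + r"\Tasks") as tasks:
        subkey_count = winreg.QueryInfoKey(tasks)[0]
        for index in range(subkey_count):
            guid = winreg.EnumKey(tasks, index)
            try:
                with winreg.OpenKey(tasks, guid) as entry:
                    path, _value_type = winreg.QueryValueEx(entry, "Path")
            except OSError:
                continue  # entry without a readable "Path" value
            yield guid, path


if __name__ == "__main__" and sys.platform == "win32":
    # "retry" is a placeholder; use part of the task name you are hunting for.
    for guid, path in find_tasks("retry", read_task_entries()):
        print(guid, path)
```

Any GUID it prints can then be checked by hand under \Tasks and \Tree in regedit.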

Also you could enable Task history and then look to see if anything shows up.
http://www.hal6000.com/seti/images/task_hist.png
____________
SETI@home classic workunits: 93,865 CPU time: 863,447 hours

Join the BP6/VP6 User Group today!

Karsten Vinding
Volunteer tester
Joined: 18 May 99
Posts: 140
Credit: 16,679,355
RAC: 3,087
Denmark
Message 1345929 - Posted: 12 Mar 2013, 22:19:47 UTC - in response to Message 1345619.
Last modified: 12 Mar 2013, 22:21:21 UTC


> That is exactly what I have found. It is consistent and repeatable. I've been running with all cores loaded with CPU jobs and full GPU utilization for months using that technique.
>
> No cores idled. It works even if the CPU processes are at higher priority than the GPU process.
>
> The initial motivation for writing the script was so I could vary the core that the GPU process is assigned to, instead of it always being forced to the same core with Process Lasso. Eventually I added other stuff.


Not to invalidate your findings, but I don't see this on my system.

On my system, when I set the GPU tasks' affinity to one single core (I run 2 GPU tasks at a time), I immediately get an 8-10% drop in GPU utilization as seen by GPU-Z.

I tried many settings: GPU app affinity set to 2 cores and to 1 core, with CPU crunching on 6, 7 and 8 cores (affinity set according to this number, and set to avoid the cores assigned to the GPU, if possible). Only when the CPU is set to crunch on only 6 cores do I get a steady 100% GPU utilization. With any more than 6 cores, I get the slowdown no matter how I set up the GPU tasks' affinity.

As the GPU crunches _much_ faster than my CPU cores, these 8-10% are worth sacrificing 2 cores on the CPU; the end RAC is still higher.

It may be down to something in my setup, I don't know, but I cannot confirm that your method works.
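
A rough way to reason about this trade-off, as a sketch under simplifying assumptions: output is taken to scale linearly with GPU utilization and with the number of CPU cores crunching. The function names are hypothetical and the GPU shares in the usage note are made-up inputs, not measurements from this thread.

```python
def idling_pays_off(gpu_share, gpu_loss, cores_total, cores_idled):
    """Return True when idling cores costs less output than the GPU slowdown would.

    gpu_share: GPU's fraction of total potential output (GPU at full
               utilization plus all cores crunching), between 0 and 1
    gpu_loss:  fractional GPU slowdown when no cores are idled (e.g. 0.09)
    """
    cpu_share = 1.0 - gpu_share
    cost_of_idling = cpu_share * cores_idled / cores_total
    cost_of_slowdown = gpu_share * gpu_loss
    return cost_of_slowdown > cost_of_idling


def break_even_gpu_share(gpu_loss, cores_total, cores_idled):
    """GPU share above which idling the cores gives more total output."""
    idle_fraction = cores_idled / cores_total
    return idle_fraction / (idle_fraction + gpu_loss)


if __name__ == "__main__":
    print(break_even_gpu_share(gpu_loss=0.09, cores_total=8, cores_idled=2))
    print(idling_pays_off(gpu_share=0.9, gpu_loss=0.09, cores_total=8, cores_idled=2))
    print(idling_pays_off(gpu_share=0.5, gpu_loss=0.09, cores_total=8, cores_idled=2))
```

With 2 of 8 cores idled and a 9% GPU loss, the break-even GPU share under these assumptions is about 0.74, so a host where the GPU dominates gains from idling cores, while a host where CPU and GPU contribute about equally does not.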

cov_route
Joined: 13 Sep 12
Posts: 295
Credit: 7,391,151
RAC: 12,252
Canada
Message 1345984 - Posted: 13 Mar 2013, 2:28:53 UTC

What do you see if you take the default GPU affinity (all cores) and don't idle any cores? Is the slowdown the same 8-10% or more?

On my system I can see a 70% reduction in GPU utilization without idle cores or 1-core affinity. One difference between us is that my ratio of CPU/GPU speed is probably a lot higher than yours... CPU gives me about 1/2 of my RAC.

Also, on my board the MB GPU task *never* reaches 100%. It is normally 90-92%. There could be a key difference there, because my system doesn't have the PCI-E bandwidth that yours does (I have PCI-E 1.1). So the 1-core trick allows my system to reach its full potential, but on yours it does not.

I would say that, in general, if the 1-core method allows most GPUs to get into the 90% range then it is pretty good, because I have read many reports that without some action (that is, with the default setup) GPU utilization is sometimes almost zero.

It would be well worth setting as the default, even if it could be improved by some users if they so desired.

It will be interesting to see what Raistmer finds out in his testing.

Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 3491
Credit: 47,654,056
RAC: 47,033
Russia
Message 1346099 - Posted: 13 Mar 2013, 10:15:27 UTC
Last modified: 13 Mar 2013, 11:14:08 UTC

OK, I haven't fully understood how to configure the script so far, so I'm testing via Process Lasso.
MB7_win_x86_SSE_OpenCL_ATi_HD5_r1764.exe -verb -nog :
1) APU idle:

WU : PG0009.wu Elapsed 288.054 secs CPU 35.724 secs
WU : PG0395.wu Elapsed 156.374 secs CPU 35.350 secs
WU : PG0444.wu Elapsed 142.111 secs CPU 34.539 secs
WU : PG1327.wu Elapsed 100.519 secs CPU 27.643 secs

2) all 4 CPUs are busy, GPU running test task, default affinity:
WU : PG0009.wu Elapsed 376.513 secs CPU 41.605 secs
WU : PG0395.wu Elapsed 202.957 secs CPU 38.049 secs
WU : PG0444.wu Elapsed 184.707 secs CPU 38.376 secs
WU : PG1327.wu Elapsed 206.770 secs CPU 28.221 secs

3) same as 2) but affinity for the GPU task set to CPU0 in Process Lasso:
WU : PG0009.wu Elapsed 333.362 secs CPU 40.841 secs
WU : PG0395.wu Elapsed 192.010 secs CPU 38.361 secs
WU : PG0444.wu Elapsed 166.940 secs CPU 38.064 secs
WU : PG1327.wu Elapsed 117.420 secs CPU 29.905 secs

So, not as good as a completely free device, but indeed better than the default.
AFAIK CPU0 is responsible for interrupt processing, so I'm not sure whether it is good or bad to stick the GPU app on CPU0.
On one hand it's good, because interrupts from the GPU don't need to be rerouted (actually, their data is in the core's cache). On the other hand, this core has to handle other interrupts, so its average availability is lower...

4) same as 3 but CPU1 used instead of CPU0
WU : PG0009.wu Elapsed 347.569 secs CPU 38.953 secs
WU : PG0395.wu Elapsed 183.694 secs CPU 36.114 secs
WU : PG0444.wu Elapsed 160.770 secs CPU 36.645 secs
WU : PG1327.wu Elapsed 113.711 secs CPU 29.063 secs

5) and the same again with CPU2 used (Trinity's second Piledriver module)
WU : PG0009.wu Elapsed 350.538 secs CPU 41.995 secs
WU : PG0395.wu Elapsed 179.995 secs CPU 39.000 secs
WU : PG0444.wu Elapsed 162.808 secs CPU 39.921 secs
WU : PG1327.wu Elapsed 119.178 secs CPU 31.013 secs
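
To compare the five runs at a glance, the elapsed times above can be totalled and divided by the idle baseline. This is only a small sketch: the dictionary labels and the function name `slowdown_vs_idle` are made up for the example, and the numbers are copied from the listings above.

```python
# Elapsed seconds per work unit, copied from the five runs above
# (order: PG0009, PG0395, PG0444, PG1327).
ELAPSED = {
    "1) idle": [288.054, 156.374, 142.111, 100.519],
    "2) busy, default affinity": [376.513, 202.957, 184.707, 206.770],
    "3) busy, GPU app on CPU0": [333.362, 192.010, 166.940, 117.420],
    "4) busy, GPU app on CPU1": [347.569, 183.694, 160.770, 113.711],
    "5) busy, GPU app on CPU2": [350.538, 179.995, 162.808, 119.178],
}


def slowdown_vs_idle(elapsed):
    """Return total elapsed time of each run divided by the idle run's total."""
    baseline = sum(elapsed["1) idle"])
    return {name: sum(times) / baseline for name, times in elapsed.items()}


if __name__ == "__main__":
    for name, ratio in slowdown_vs_idle(ELAPSED).items():
        total = sum(ELAPSED[name])
        print(f"{name:28s} total {total:8.3f} s  x{ratio:.2f} vs idle")
```

By total elapsed time, the default-affinity run takes roughly 1.41 times as long as the idle baseline, while the three pinned runs land close together at roughly 1.17 to 1.18 times.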

Copyright © 2014 University of California