"Laggy" GUI and Processor Scheduling: Win32PrioritySeparation


Profile cov_route
Joined: 13 Sep 12
Posts: 290
Credit: 6,734,383
RAC: 17,308
Canada
Message 1396273 - Posted: 29 Jul 2013, 17:53:02 UTC
Last modified: 29 Jul 2013, 18:07:51 UTC

Win7/64, Phenom II 945, Radeon HD 6670 1GB, running MB GPU 2-up

From time to time my machine gets "sticky": the GUI becomes poorly responsive.

It isn't a function of CPU utilization. The machine can often run at 100% utilization and still be responsive.

During periods of stickiness, the Resource Monitor shows the "modified" memory as somewhat larger than at other times, and varying. Definitely a different behaviour than normal. It makes me suspect some sort of soft-fault cache-thrashing issue.

There is plenty of memory in the system. It's not hard faulting.

What I tried is changing the processor scheduling from "Programs" to "Background Services". The theory is that if the background (seti) jobs get hold of the caches for longer, there will be less thrashing, less coherency traffic and better overall memory functionality.

At the registry level that changes Win32PrioritySeparation to decimal 28 or 011101. Technet.

It seems to have worked for now, but this is the sort of thing that's notoriously hard to nail down. Sometimes seemingly random changes can fix it for a while only to re-emerge later.

I realize it seems backwards, giving higher priority to background jobs to make the GUI run smoother but I can't argue with the results. For now.

Edit: fixed the link.

Profile Gundolf Jahn
Joined: 19 Sep 00
Posts: 3184
Credit: 357,953
RAC: 37
Germany
Message 1396326 - Posted: 29 Jul 2013, 20:16:34 UTC - in response to Message 1396273.

At the registry level that changes Win32PrioritySeparation to decimal 28 or 011101. Technet.

28 is an even number and 011101 is not (its last digit is non-zero). If my memories of my octal era (Pr1mos) don't fail me, 011 101 (yes, I take groups of 3 :-) is 29 (24 + 5).

Regards,
Gundolf
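
The arithmetic in this exchange is easy to check mechanically. A minimal Python sketch, using only the values quoted in the thread, converts the binary string and reproduces the groups-of-three (octal) reading:

```python
# Check the binary/decimal claims quoted in the thread.
bits = "011101"

value = int(bits, 2)  # interpret the string as base 2

# Split into groups of three bits, i.e. octal digits: 011 -> 3, 101 -> 5
octal_digits = [int(bits[i:i + 3], 2) for i in range(0, len(bits), 3)]

print(value)                                   # decimal value of 011101
print(octal_digits)                            # the two octal digits
print(octal_digits[0] * 8 + octal_digits[1])   # 3*8 + 5, i.e. 24 + 5
print(format(28, "06b"))                       # decimal 28 written as six bits
```

The last line shows what decimal 28 looks like in binary, which makes the mismatch with 011101 visible at a glance.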

Ulrich Metzner
Volunteer tester
Joined: 3 Jul 02
Posts: 976
Credit: 8,430,292
RAC: 7,669
Germany
Message 1396335 - Posted: 29 Jul 2013, 21:03:57 UTC

I think this is application-dependent.
See here: http://setiathome.berkeley.edu/forum_thread.php?id=72379
I'm also not happy with ATI and *some* MB tasks...
____________
Aloha, Uli

Profile MikeProject donor
Volunteer tester
Joined: 17 Feb 01
Posts: 23803
Credit: 32,621,034
RAC: 23,771
Germany
Message 1396339 - Posted: 29 Jul 2013, 21:09:39 UTC

@Uli

In your case it's simply a weak GPU.

cov_route, increase period_iterations_num to 50.
This should help.

____________

Ulrich Metzner
Volunteer tester
Joined: 3 Jul 02
Posts: 976
Credit: 8,430,292
RAC: 7,669
Germany
Message 1396341 - Posted: 29 Jul 2013, 21:17:20 UTC - in response to Message 1396339.

@Uli

In your case it's simply a weak GPU.

cov_route, increase period_iterations_num to 50.
This should help.

I don't think it's *only* related to the weak GPU. He also describes times when the computer is very, very laggy, while at other times it's just fine. It was the same in my case.

But OK, let's see if increasing the iterations will help in his case. :?
Good luck! *thumbsup*
____________
Aloha, Uli

Profile cov_route
Joined: 13 Sep 12
Posts: 290
Credit: 6,734,383
RAC: 17,308
Canada
Message 1396354 - Posted: 29 Jul 2013, 21:57:55 UTC

I am not sure about period_iterations_num because this problem can come and go even with the same work unit.

It definitely appears to be a condition that arises in the processor related to memory access.

Profile MikeProject donor
Volunteer tester
Joined: 17 Feb 01
Posts: 23803
Credit: 32,621,034
RAC: 23,771
Germany
Message 1396357 - Posted: 29 Jul 2013, 22:11:42 UTC - in response to Message 1396354.

I am not sure about period_iterations_num because this problem can come and go even with the same work unit.

It definitely appears to be a condition that arises in the processor related to memory access.


I had to increase it on my HD 5850 running V7, so it doesn't hurt to try.

____________

TBar
Volunteer tester
Joined: 22 May 99
Posts: 1221
Credit: 45,514,955
RAC: 118,995
United States
Message 1396384 - Posted: 29 Jul 2013, 23:15:14 UTC - in response to Message 1396354.
Last modified: 29 Jul 2013, 23:38:08 UTC

I am not sure about period_iterations_num because this problem can come and go even with the same work unit.

It definitely appears to be a condition that arises in the processor related to memory access.

I'm still getting 'stuttering' every so often with my 6850. I've raised period_iterations_num to 64. Everything is fine for a while, then the stutter...

It does seem to have increased after I went from using 2 CPU cores to using 3 cores for v7 CPU MBs.

Profile cov_route
Joined: 13 Sep 12
Posts: 290
Credit: 6,734,383
RAC: 17,308
Canada
Message 1396826 - Posted: 31 Jul 2013, 3:00:55 UTC

@Gundolf

Yes, you are correct. As it says in the Technet entry, the registry value for the background case is decimal 24 or 011000.
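
To see what the corrected value means, the six bits can be split into three 2-bit fields. The sketch below is only illustrative: the field meanings are an assumption recalled from the Technet description (interval length, variable or fixed quanta, foreground boost), and the function name is hypothetical, so check it against the documentation before relying on it.

```python
# Decode a Win32PrioritySeparation value into its three 2-bit fields (AABBCC).
# ASSUMPTION: the label tables below follow the Technet description as
# recalled; they are not taken from this thread.
INTERVAL = {0b00: "default", 0b01: "long", 0b10: "short", 0b11: "default"}
LENGTH = {0b00: "default", 0b01: "variable", 0b10: "fixed", 0b11: "default"}
BOOST = {0b00: "1:1 (no foreground boost)", 0b01: "2:1", 0b10: "3:1", 0b11: "3:1"}


def decode_priority_separation(value):
    """Split a 6-bit value AABBCC into (interval, length, boost) labels."""
    aa = (value >> 4) & 0b11  # top two bits
    bb = (value >> 2) & 0b11  # middle two bits
    cc = value & 0b11         # bottom two bits
    return INTERVAL[aa], LENGTH[bb], BOOST[cc]


for v in (24, 29):
    print(v, format(v, "06b"), decode_priority_separation(v))
```

Under that assumed layout, 24 decodes to long, fixed-length quanta with no foreground boost, which matches the "Background Services" setting discussed here.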

Profile Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 3397
Credit: 46,362,185
RAC: 10,012
Russia
Message 1397850 - Posted: 2 Aug 2013, 8:19:55 UTC

In general this change can't be the ultimate solution: my host is configured for background services and still has GUI lags from time to time. (Also, it can be done via the usual Windows settings GUI, so there is no need to edit registry keys directly.) But it can add something to our understanding of the roots of the issue.
Could you please post a screenshot of the counter showing the strange behavior you observed at the moments of GUI lag?

____________

Profile cov_route
Joined: 13 Sep 12
Posts: 290
Credit: 6,734,383
RAC: 17,308
Canada
Message 1397999 - Posted: 2 Aug 2013, 17:07:30 UTC - in response to Message 1397850.

I'll do that next time I see it, Raistmer. It doesn't happen that often and since I made the change to background priority I haven't seen it at all.

On another matter, I started using -cpu_lock with -instances_per_device and -gpu_lock to run with single assigned cores. It works, so I have been able to stop using my custom script.

To remind you, that enables me to run with no cores idled.

Some time ago you mentioned that -gpu_lock shouldn't be needed, but I find it doesn't work without it.

Profile Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 3397
Credit: 46,362,185
RAC: 10,012
Russia
Message 1398025 - Posted: 2 Aug 2013, 18:37:51 UTC - in response to Message 1397999.


Some time ago you mentioned that -gpu_lock shouldn't be needed, but I find it doesn't work without it.

Thanks, I will check the logic of the interaction between the switches once more.

____________

Profile cov_route
Joined: 13 Sep 12
Posts: 290
Credit: 6,734,383
RAC: 17,308
Canada
Message 1400215 - Posted: 7 Aug 2013, 16:41:59 UTC

Raistmer:

The first graph is the system running "sticky". The second graph is after a logoff/logon when it is running "smoothly".

You can see the faults are almost all transition (soft) faults. I don't know if transition faults include cache misses or just page repositioning. I suspect they don't.




Profile Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 3397
Credit: 46,362,185
RAC: 10,012
Russia
Message 1400838 - Posted: 8 Aug 2013, 23:19:53 UTC
Last modified: 8 Aug 2013, 23:26:03 UTC

Thanks. I suspect that the driver swaps data between system and device memory for some reason in the first graph...

EDIT: while you are making these observations, could you please record which tasks experienced laggy behavior and look into their stderr output after completion? The thing to check is whether they also show excessive "misses" (of any kind) in the counter fields compared with non-laggy tasks.

EDIT2: examples of such fields in bold:

class Gaussian_transfer_not_needed: total=96555, N=96555, <>=1, min=1 max=1
class Gaussian_transfer_needed: total=19, N=19, <>=1, min=1 max=1


class Gaussian_skip1_no_peak: total=0, N=0, <>=0, min=0 max=0
class Gaussian_skip2_bad_group_peak: total=0, N=0, <>=0, min=0 max=0
class Gaussian_skip3_too_weak_peak: total=0, N=0, <>=0, min=0 max=0
class Gaussian_skip4_too_big_ChiSq: total=0, N=50, <>=0, min=0 max=0
class Gaussian_skip6_low_power: total=25, N=50, <>=0.5, min=0 max=1


class Gaussian_new_best: total=32, N=32, <>=1, min=1 max=1
class Gaussian_report: total=0, N=0, <>=0, min=0 max=0
class Gaussian_miss: total=18, N=18, <>=1, min=1 max=1


class PC_triplet_find_hit: total=23938, N=23938, <>=1, min=1 max=1
class PC_triplet_find_miss: total=187, N=187, <>=1, min=1 max=1


class PC_pulse_find_hit: total=12050, N=12050, <>=1, min=1 max=1
class PC_pulse_find_miss: total=11, N=11, <>=1, min=1 max=1
class PC_pulse_find_early_miss: total=1, N=1, <>=1, min=1 max=1
class PC_pulse_find_2CPU: total=1, N=1, <>=1, min=1 max=1


class PoT_transfer_not_needed: total=23927, N=23927, <>=1, min=1 max=1
class PoT_transfer_needed: total=199, N=199, <>=1, min=1 max=1
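
Comparing these counters across tasks by eye is tedious, so a small parser may help. This is a minimal sketch that assumes every counter line has exactly the shape shown above; `parse_counters` and the regular expression are hypothetical helpers written for this thread, not part of the application.

```python
import re

# Matches lines such as:
#   class PoT_transfer_needed: total=199, N=199, <>=1, min=1 max=1
COUNTER_RE = re.compile(
    r"class\s+(?P<name>\w+):\s*total=(?P<total>[\d.]+),\s*N=(?P<n>[\d.]+),"
    r"\s*<>=(?P<mean>[\d.]+),\s*min=(?P<min>[\d.]+)\s+max=(?P<max>[\d.]+)"
)


def parse_counters(stderr_text):
    """Return {counter_name: total} for every counter line found in the text."""
    counters = {}
    for match in COUNTER_RE.finditer(stderr_text):
        counters[match.group("name")] = float(match.group("total"))
    return counters


# A few lines copied from the example above.
sample = """\
class PC_triplet_find_hit: total=23938, N=23938, <>=1, min=1 max=1
class PC_triplet_find_miss: total=187, N=187, <>=1, min=1 max=1
class PoT_transfer_not_needed: total=23927, N=23927, <>=1, min=1 max=1
class PoT_transfer_needed: total=199, N=199, <>=1, min=1 max=1
"""

counters = parse_counters(sample)
# Keep only the counters whose name marks them as a "miss".
misses = {name: total for name, total in counters.items() if name.endswith("_miss")}
print(counters)
print(misses)
```

Pasting the stderr of a laggy and a non-laggy task into `parse_counters` gives two dictionaries that can be compared key by key.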
____________

Profile cov_route
Joined: 13 Sep 12
Posts: 290
Credit: 6,734,383
RAC: 17,308
Canada
Message 1403092 - Posted: 14 Aug 2013, 15:37:49 UTC

The slowdown happened again, here is the performance monitor:



Heavy lines are overall page faults; fine lines are for the two MB jobs running. Conclusion: all the faults are coming from the two jobs, and they are all transition faults. You can see where the faulting behaviour spontaneously stops by itself at the 7:21PM mark.

The two results:

http://setiathome.berkeley.edu/result.php?resultid=3110887969
http://setiathome.berkeley.edu/result.php?resultid=3110887697

***

class Gaussian_transfer_not_needed: total=32966, N=32966, <>=1, min=1 max=1
class Gaussian_transfer_needed: total=2, N=2, <>=1, min=1 max=1

class Gaussian_skip1_no_peak: total=0, N=0, <>=0, min=0 max=0
class Gaussian_skip2_bad_group_peak: total=0, N=0, <>=0, min=0 max=0
class Gaussian_skip3_too_weak_peak: total=0, N=0, <>=0, min=0 max=0
class Gaussian_skip4_too_big_ChiSq: total=0, N=2, <>=0, min=0 max=0
class Gaussian_skip6_low_power: total=0, N=2, <>=0, min=0 max=0

class Gaussian_new_best: total=2, N=2, <>=1, min=1 max=1
class Gaussian_report: total=0, N=0, <>=0, min=0 max=0
class Gaussian_miss: total=0, N=0, <>=0, min=0 max=0

class PC_triplet_find_hit: total=8150, N=8150, <>=1, min=1 max=1
class PC_triplet_find_miss: total=84, N=84, <>=1, min=1 max=1

class PC_pulse_find_hit: total=4114, N=4114, <>=1, min=1 max=1
class PC_pulse_find_miss: total=2, N=2, <>=1, min=1 max=1
class PC_pulse_find_early_miss: total=0, N=0, <>=0, min=0 max=0
class PC_pulse_find_2CPU: total=0, N=0, <>=0, min=0 max=0

class PoT_transfer_not_needed: total=8149, N=8149, <>=1, min=1 max=1
class PoT_transfer_needed: total=85, N=85, <>=1, min=1 max=1

*****

class Gaussian_transfer_not_needed: total=84782, N=84782, <>=1, min=1 max=1
class Gaussian_transfer_needed: total=1, N=1, <>=1, min=1 max=1

class Gaussian_skip1_no_peak: total=0, N=0, <>=0, min=0 max=0
class Gaussian_skip2_bad_group_peak: total=0, N=0, <>=0, min=0 max=0
class Gaussian_skip3_too_weak_peak: total=0, N=0, <>=0, min=0 max=0
class Gaussian_skip4_too_big_ChiSq: total=0, N=2, <>=0, min=0 max=0
class Gaussian_skip6_low_power: total=0, N=2, <>=0, min=0 max=0

class Gaussian_new_best: total=1, N=1, <>=1, min=1 max=1
class Gaussian_report: total=0, N=0, <>=0, min=0 max=0
class Gaussian_miss: total=1, N=1, <>=1, min=1 max=1

class PC_triplet_find_hit: total=21043, N=21043, <>=1, min=1 max=1
class PC_triplet_find_miss: total=139, N=139, <>=1, min=1 max=1

class PC_pulse_find_hit: total=10587, N=10587, <>=1, min=1 max=1
class PC_pulse_find_miss: total=5, N=5, <>=1, min=1 max=1
class PC_pulse_find_early_miss: total=3, N=3, <>=1, min=1 max=1
class PC_pulse_find_2CPU: total=0, N=0, <>=0, min=0 max=0


class PoT_transfer_not_needed: total=21039, N=21039, <>=1, min=1 max=1
class PoT_transfer_needed: total=143, N=143, <>=1, min=1 max=1
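
Because the tasks differ in size, raw totals are hard to compare. One possible normalization (my choice, not something proposed in the thread) is the fraction of PoT checks that needed a transfer. The sketch below uses the PoT_* totals posted in this thread; the labels and the function name are hypothetical.

```python
# (transfer_needed, transfer_not_needed) pairs copied from the PoT_* lines
# posted in the thread. The labels are only for illustration.
pot_counts = {
    "earlier example": (199, 23927),
    "first block": (85, 8149),
    "second block": (143, 21039),
}


def needed_fraction(needed, not_needed):
    """Fraction of PoT checks that required a transfer."""
    return needed / (needed + not_needed)


for label, (needed, not_needed) in pot_counts.items():
    print(f"{label}: {needed_fraction(needed, not_needed):.2%}")
```

All three fractions come out at roughly one percent or below, so this measure alone does not separate the laggy tasks from the earlier example.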

Profile Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 3397
Credit: 46,362,185
RAC: 10,012
Russia
Message 1403367 - Posted: 15 Aug 2013, 8:24:09 UTC - in response to Message 1403092.
Last modified: 15 Aug 2013, 8:24:47 UTC

I would not say there is a big overall rise in those counters.
The only way to link those events is to assume that all those PoT transfers happened very locally within the WU processing. Unfortunately, that can't be proved with the counters. Can you monitor data transfers over the PCIe bus somehow? Some NV cards (in GPU-Z) have a "memory controller load" counter, but I haven't seen anything like that for ATi GPUs. It would be interesting to correlate those increases in the page fault counter with increases in PCIe transfers, if any. Any ideas how to monitor PCIe transfers? Any tools?
____________

Profile cov_route
Joined: 13 Sep 12
Posts: 290
Credit: 6,734,383
RAC: 17,308
Canada
Message 1405353 - Posted: 20 Aug 2013, 4:03:04 UTC - in response to Message 1403367.

Any ideas how to monitor PCIe transfers? Any tools ?

No ideas. I downloaded the NV sys tools (NV chipset) but they show no PCIe info except frequency. I need more instrumentation!!!

TimK
Joined: 19 Sep 08
Posts: 10
Credit: 6,075,678
RAC: 1
United Kingdom
Message 1407029 - Posted: 23 Aug 2013, 18:59:55 UTC

For the past couple of weeks my machine has also been "laggy". I checked a few things in performance monitor and found about 20,000 average page faults per second, hardly any hard faults. I thought this was high, until I saw the graph below showing nearly 500,000 per second.
What is the consensus on any possible fix? As I said, it only started in the past few weeks and nothing has changed in my PC (as far as I know).

Profile cov_route
Joined: 13 Sep 12
Posts: 290
Credit: 6,734,383
RAC: 17,308
Canada
Message 1407059 - Posted: 23 Aug 2013, 20:11:48 UTC - in response to Message 1407029.

I don't know that there is a surefire fix. It is something that has been observed intermittently for a long time by many people.

One thing that is suggested is to increase the argument -period_iterations_num from the default value of 20. I've tried very high numbers like 100 without seeing a difference but it might work for some.

The other thing to do, if you are running more than one instance per gpu, is to decrease that to 1. I'm pretty sure that does help if it applies.

If all else fails and it's really interfering with computer use, you can always set BOINC to not allow computing when the computer is active.

Profile MikeProject donor
Volunteer tester
Joined: 17 Feb 01
Posts: 23803
Credit: 32,621,034
RAC: 23,771
Germany
Message 1407082 - Posted: 23 Aug 2013, 21:21:11 UTC

Add the following to your app_info.xml or mb_cmdline*.txt:

-period_iterations_num 40 -sbs 256

This should fix it for you.

In my experience the 7700 sometimes needs lower values for the single buffer size.
So you can try -sbs 192 or -sbs 156.
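
When trying several values in turn, editing the line by hand is error-prone (note how easily a switch name gets mistyped). A minimal sketch of a helper that rewrites one switch in such a line follows. `set_switch` is a hypothetical function, and it assumes switches are written as `-name value`, separated by whitespace, with non-negative values.

```python
def set_switch(cmdline, name, value):
    """Return cmdline with switch `name` set to `value`, adding it if absent.

    Hypothetical helper: assumes switches look like `-name value`, are
    separated by whitespace, and take non-negative values.
    """
    tokens = cmdline.split()
    flag = "-" + name
    if flag in tokens:
        idx = tokens.index(flag)
        if idx + 1 < len(tokens) and not tokens[idx + 1].startswith("-"):
            tokens[idx + 1] = str(value)        # replace the existing value
        else:
            tokens.insert(idx + 1, str(value))  # switch was present without a value
    else:
        tokens += [flag, str(value)]            # switch was absent: append it
    return " ".join(tokens)


line = "-period_iterations_num 40 -sbs 256"
print(set_switch(line, "sbs", 192))
print(set_switch("", "period_iterations_num", 40))
```

The result would still have to be written back to the mb_cmdline*.txt file (or the command-line entry in app_info.xml) by hand or by a script of your own.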


____________


Copyright © 2014 University of California