"Laggy" GUI and Processor Scheduling: Win32PrioritySeparation

cov_route Send message Joined: 13 Sep 12 Posts: 342 Credit: 10,270,618 RAC: 0	Message 1396273 - Posted: 29 Jul 2013, 17:53:02 UTC Last modified: 29 Jul 2013, 18:07:51 UTC Win7/64, Phenom II 945, Radeon HD 6670 1GB, running MB GPU 2-up. From time to time my machine gets "sticky": the GUI becomes poorly responsive. It isn't a function of CPU utilization; the machine can often run at 100% utilization and stay responsive. During periods of stickiness, the Resource Monitor shows the "modified" memory as somewhat larger than at other times, and varying. Definitely a different behaviour than normal. It makes me suspect some sort of soft-fault cache-thrashing issue. There is plenty of memory in the system. It's not hard faulting. What I tried is changing the processor scheduling from "Programs" to "Background Services". The theory is that if the background (seti) jobs get hold of the caches for longer, there will be less thrashing, less coherency traffic and better overall memory functionality. At the registry level that changes Win32PrioritySeparation to decimal 28 or 011101. Technet. It seems to have worked for now, but this is the sort of thing that's notoriously hard to nail down. Sometimes seemingly random changes can fix it for a while, only for it to re-emerge later. I realize it seems backwards, giving higher priority to background jobs to make the GUI run smoother, but I can't argue with the results. For now. Edit: fixed the link. ID: 1396273 ·

Gundolf Jahn Send message Joined: 19 Sep 00 Posts: 3184 Credit: 446,358 RAC: 0	Message 1396326 - Posted: 29 Jul 2013, 20:16:34 UTC - in response to Message 1396273. At the registry level that changes Win32PrioritySeparation to decimal 28 or 011101. Technet. 28 is an even number and 011101 is not (last digit non-zero). If my memories of my octal era (Pr1mos) don't fail me, 011 101 (yes, I take groups of 3 :-) is 29 (24 + 5). Gruß, Gundolf ID: 1396326 ·
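Gundolf's arithmetic is easy to check. The following is a minimal Python sketch, not something posted in the thread; the helper name `octal_groups` is made up for illustration. It converts the quoted bit string to decimal, regroups it into octal digits the way Gundolf describes, and shows what decimal 28 looks like in binary.

```python
def octal_groups(bits: str) -> list[int]:
    """Split a bit string into groups of three (aligned from the right)
    and return the value of each group, i.e. its octal digits."""
    width = -(-len(bits) // 3) * 3          # round length up to a multiple of 3
    padded = bits.zfill(width)              # left-pad with zeros
    return [int(padded[i:i + 3], 2) for i in range(0, width, 3)]


quoted = "011101"                            # bit string from the first post
print(int(quoted, 2))                        # decimal value of the bit string
print(octal_groups(quoted))                  # octal digits, high digit first
print(format(28, "06b"))                     # decimal 28 as a 6-bit string
```

The two octal digits combine as 3 × 8 + 5, which is the "24 + 5" in the post, so the bit string and the decimal value quoted in the first post cannot both be right.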

Ulrich Metzner Volunteer tester Send message Joined: 3 Jul 02 Posts: 1256 Credit: 13,565,513 RAC: 13	Message 1396335 - Posted: 29 Jul 2013, 21:03:57 UTC I think, this is application dependent. See here: http://setiathome.berkeley.edu/forum_thread.php?id=72379 I'm also not happy with ATI and some MB tasks... Aloha, Uli ID: 1396335 ·

Mike Volunteer tester Send message Joined: 17 Feb 01 Posts: 34255 Credit: 79,922,639 RAC: 80	Message 1396339 - Posted: 29 Jul 2013, 21:09:39 UTC @Uli In your case it's simply a weak GPU. cov_route, increase period_iterations_num to 50. This should help. With each crime and every kindness we birth our future. ID: 1396339 ·

Ulrich Metzner Volunteer tester Send message Joined: 3 Jul 02 Posts: 1256 Credit: 13,565,513 RAC: 13	Message 1396341 - Posted: 29 Jul 2013, 21:17:20 UTC - in response to Message 1396339. @Uli In your case it's simply a weak GPU. cov_route, increase period_iterations_num to 50. This should help. I don't think it's only related to the weak GPU. He also describes periods when the computer is very, very laggy, while at other times it's just fine. It was the same in my case. But OK, let's see if increasing the iterations will help in his case. Good luck! Aloha, Uli ID: 1396341 ·

cov_route Send message Joined: 13 Sep 12 Posts: 342 Credit: 10,270,618 RAC: 0	Message 1396354 - Posted: 29 Jul 2013, 21:57:55 UTC I am not sure about period_iteration_num because this problem can come and go even with the same work unit. It definitely appears to be a condition that arises in the processor related to memory access. ID: 1396354 ·

Mike Volunteer tester Send message Joined: 17 Feb 01 Posts: 34255 Credit: 79,922,639 RAC: 80	Message 1396357 - Posted: 29 Jul 2013, 22:11:42 UTC - in response to Message 1396354. I am not sure about period_iteration_num because this problem can come and go even with the same work unit. It definitely appears to be a condition that arises in the processor related to memory access. I had to increase it on my HD 5850 running V7, so it doesn't hurt to try. With each crime and every kindness we birth our future. ID: 1396357 ·

TBar Volunteer tester Send message Joined: 22 May 99 Posts: 5204 Credit: 840,779,836 RAC: 2,768	Message 1396384 - Posted: 29 Jul 2013, 23:15:14 UTC - in response to Message 1396354. Last modified: 29 Jul 2013, 23:38:08 UTC I am not sure about period_iteration_num because this problem can come and go even with the same work unit. It definitely appears to be a condition that arises in the processor related to memory access. I'm still getting 'stuttering' every so often with my 6850. I've raised the period_iteration_num to 64. Everything is fine for a while, then the stutter returns. It does seem to have increased after I went from using 2 CPU cores to using 3 cores for v7 CPU MBs. ID: 1396384 ·

cov_route Send message Joined: 13 Sep 12 Posts: 342 Credit: 10,270,618 RAC: 0	Message 1396826 - Posted: 31 Jul 2013, 3:00:55 UTC @Gundolf Yes you are correct. As it says in the Technet entry the registry value for the background case is dec 24 or 011000. ID: 1396826 ·
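For readers wondering what the individual bits mean, here is a rough Python sketch of a decoder. It is not from the thread. The field layout (three 2-bit fields: quantum length, quantum type, foreground boost) follows my reading of Microsoft's documentation for `Win32PrioritySeparation` and should be treated as an assumption to verify against the TechNet entry; the function and table names are invented for this example.

```python
# Assumed layout of the low six bits: LLTTBB
#   LL = quantum length, TT = quantum type, BB = foreground boost ratio
QUANTUM_LENGTH = {0b00: "default", 0b01: "long", 0b10: "short", 0b11: "default"}
QUANTUM_TYPE = {0b00: "default", 0b01: "variable", 0b10: "fixed", 0b11: "default"}
FOREGROUND_BOOST = {0b00: "1:1 (no boost)", 0b01: "2:1", 0b10: "3:1", 0b11: "3:1"}


def decode_priority_separation(value: int) -> dict[str, str]:
    """Decode the low six bits of a Win32PrioritySeparation value."""
    value &= 0b111111                        # only six bits are meaningful here
    return {
        "bits": format(value, "06b"),
        "quantum_length": QUANTUM_LENGTH[(value >> 4) & 0b11],
        "quantum_type": QUANTUM_TYPE[(value >> 2) & 0b11],
        "foreground_boost": FOREGROUND_BOOST[value & 0b11],
    }


for candidate in (24, 29):                   # the two values discussed above
    print(candidate, decode_priority_separation(candidate))
```

Under that assumed layout, 24 (011000) decodes to long, fixed quanta with no foreground boost, which is consistent with what "Background Services" is meant to do.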

Raistmer Volunteer developer Volunteer tester Send message Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121	Message 1397850 - Posted: 2 Aug 2013, 8:19:55 UTC In general this change can't be the ultimate solution (my host is configured for background services and still has GUI lags from time to time; also, it can be done via the usual Windows settings GUI, with no need to edit registry keys directly), but it can add something to understanding the roots of the issue. Could you please post a screenshot of the counter showing the strange behavior you observed at the moments of GUI lag? SETI apps news We're not gonna fight them. We're gonna transcend them. ID: 1397850 ·

cov_route Send message Joined: 13 Sep 12 Posts: 342 Credit: 10,270,618 RAC: 0	Message 1397999 - Posted: 2 Aug 2013, 17:07:30 UTC - in response to Message 1397850. I'll do that next time I see it, Raistmer. It doesn't happen that often and since I made the change to background priority I haven't seen it at all. On another matter, I started using -cpu_lock with -instances_per_device and -gpu_lock to run with single assigned cores. It works, so I have been able to stop using my custom script. To remind you, that enables me to run with no cores idled. Some time ago you mentioned that -gpu_lock shouldn't be needed, but I find it doesn't work without it. ID: 1397999 ·

Raistmer Volunteer developer Volunteer tester Send message Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121	Message 1398025 - Posted: 2 Aug 2013, 18:37:51 UTC - in response to Message 1397999. Some time ago you mentioned that -gpu_lock shouldn't be needed, but I find it doesn't work without it. Thanks, will check logic of interaction between switches once more. SETI apps news We're not gonna fight them. We're gonna transcend them. ID: 1398025 ·

cov_route Send message Joined: 13 Sep 12 Posts: 342 Credit: 10,270,618 RAC: 0	Message 1400215 - Posted: 7 Aug 2013, 16:41:59 UTC Raistmer: The first graph is the system running "sticky". The second graph is after a logoff/logon when it is running "smoothly". You can see the faults are almost all transition (soft) faults. I don't know if transition faults include cache misses or just page repositioning. I suspect they don't. ID: 1400215 ·

Raistmer Volunteer developer Volunteer tester Send message Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121	Message 1400838 - Posted: 8 Aug 2013, 23:19:53 UTC Last modified: 8 Aug 2013, 23:26:03 UTC Thanks. I suspect that the driver swaps data between system and device memory for some reason in the first graph...
EDIT: while you are making these observations, could you please record which tasks experienced laggy behavior and look into their stderrs after completion? The thing to check is whether they also show excessive "misses" (of any kind) in the counter fields compared with non-laggy tasks.
EDIT2: examples of such fields in bold:
class Gaussian_transfer_not_needed: total=96555, N=96555, <>=1, min=1 max=1
class Gaussian_transfer_needed: total=19, N=19, <>=1, min=1 max=1
class Gaussian_skip1_no_peak: total=0, N=0, <>=0, min=0 max=0
class Gaussian_skip2_bad_group_peak: total=0, N=0, <>=0, min=0 max=0
class Gaussian_skip3_too_weak_peak: total=0, N=0, <>=0, min=0 max=0
class Gaussian_skip4_too_big_ChiSq: total=0, N=50, <>=0, min=0 max=0
class Gaussian_skip6_low_power: total=25, N=50, <>=0.5, min=0 max=1
class Gaussian_new_best: total=32, N=32, <>=1, min=1 max=1
class Gaussian_report: total=0, N=0, <>=0, min=0 max=0
class Gaussian_miss: total=18, N=18, <>=1, min=1 max=1
class PC_triplet_find_hit: total=23938, N=23938, <>=1, min=1 max=1
class PC_triplet_find_miss: total=187, N=187, <>=1, min=1 max=1
class PC_pulse_find_hit: total=12050, N=12050, <>=1, min=1 max=1
class PC_pulse_find_miss: total=11, N=11, <>=1, min=1 max=1
class PC_pulse_find_early_miss: total=1, N=1, <>=1, min=1 max=1
class PC_pulse_find_2CPU: total=1, N=1, <>=1, min=1 max=1
class PoT_transfer_not_needed: total=23927, N=23927, <>=1, min=1 max=1
class PoT_transfer_needed: total=199, N=199, <>=1, min=1 max=1
SETI apps news We're not gonna fight them. We're gonna transcend them. ID: 1400838 ·

cov_route Send message Joined: 13 Sep 12 Posts: 342 Credit: 10,270,618 RAC: 0	Message 1403092 - Posted: 14 Aug 2013, 15:37:49 UTC The slowdown happened again, here is the performance monitor: Heavy lines are overall page faults, fine lines are for the two MB jobs running. Conclusion, all the faults are coming from the two jobs, and they are all transition faults. You can see where the default behaviour spontaneously stops by itself at the 7:21PM mark. The two results:
http://setiathome.berkeley.edu/result.php?resultid=3110887969
http://setiathome.berkeley.edu/result.php?resultid=3110887697
*
class Gaussian_transfer_not_needed: total=32966, N=32966, <>=1, min=1 max=1
class Gaussian_transfer_needed: total=2, N=2, <>=1, min=1 max=1
class Gaussian_skip1_no_peak: total=0, N=0, <>=0, min=0 max=0
class Gaussian_skip2_bad_group_peak: total=0, N=0, <>=0, min=0 max=0
class Gaussian_skip3_too_weak_peak: total=0, N=0, <>=0, min=0 max=0
class Gaussian_skip4_too_big_ChiSq: total=0, N=2, <>=0, min=0 max=0
class Gaussian_skip6_low_power: total=0, N=2, <>=0, min=0 max=0
class Gaussian_new_best: total=2, N=2, <>=1, min=1 max=1
class Gaussian_report: total=0, N=0, <>=0, min=0 max=0
class Gaussian_miss: total=0, N=0, <>=0, min=0 max=0
class PC_triplet_find_hit: total=8150, N=8150, <>=1, min=1 max=1
class PC_triplet_find_miss: total=84, N=84, <>=1, min=1 max=1
class PC_pulse_find_hit: total=4114, N=4114, <>=1, min=1 max=1
class PC_pulse_find_miss: total=2, N=2, <>=1, min=1 max=1
class PC_pulse_find_early_miss: total=0, N=0, <>=0, min=0 max=0
class PC_pulse_find_2CPU: total=0, N=0, <>=0, min=0 max=0
class PoT_transfer_not_needed: total=8149, N=8149, <>=1, min=1 max=1
class PoT_transfer_needed: total=85, N=85, <>=1, min=1 max=1
*
class Gaussian_transfer_not_needed: total=84782, N=84782, <>=1, min=1 max=1
class Gaussian_transfer_needed: total=1, N=1, <>=1, min=1 max=1
class Gaussian_skip1_no_peak: total=0, N=0, <>=0, min=0 max=0
class Gaussian_skip2_bad_group_peak: total=0, N=0, <>=0, min=0 max=0
class Gaussian_skip3_too_weak_peak: total=0, N=0, <>=0, min=0 max=0
class Gaussian_skip4_too_big_ChiSq: total=0, N=2, <>=0, min=0 max=0
class Gaussian_skip6_low_power: total=0, N=2, <>=0, min=0 max=0
class Gaussian_new_best: total=1, N=1, <>=1, min=1 max=1
class Gaussian_report: total=0, N=0, <>=0, min=0 max=0
class Gaussian_miss: total=1, N=1, <>=1, min=1 max=1
class PC_triplet_find_hit: total=21043, N=21043, <>=1, min=1 max=1
class PC_triplet_find_miss: total=139, N=139, <>=1, min=1 max=1
class PC_pulse_find_hit: total=10587, N=10587, <>=1, min=1 max=1
class PC_pulse_find_miss: total=5, N=5, <>=1, min=1 max=1
class PC_pulse_find_early_miss: total=3, N=3, <>=1, min=1 max=1
class PC_pulse_find_2CPU: total=0, N=0, <>=0, min=0 max=0
class PoT_transfer_not_needed: total=21039, N=21039, <>=1, min=1 max=1
class PoT_transfer_needed: total=143, N=143, <>=1, min=1 max=1
ID: 1403092 ·

Raistmer Volunteer developer Volunteer tester Send message Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121	Message 1403367 - Posted: 15 Aug 2013, 8:24:09 UTC - in response to Message 1403092. Last modified: 15 Aug 2013, 8:24:47 UTC I would not say there is a big overall rise in those counters. The only way to link those events is to assume that all those PoT transfers happened very locally in WU processing. Unfortunately, that can't be proved with the counters. Can you monitor data transfers over the PCIe bus somehow? Some NV cards (in GPU-Z) have a "memory controller load" counter, but I haven't seen anything like that for ATi GPUs. It would be interesting to correlate those increases in the page fault counter with increases in PCIe transfers, if any. Any ideas how to monitor PCIe transfers? Any tools? SETI apps news We're not gonna fight them. We're gonna transcend them. ID: 1403367 ·
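One way to put a number on "no big overall rise" is to compare the share of PoT transfers that were actually needed in each task. The sketch below is only an illustration in Python, not a tool used in the thread: it parses counter lines in the format posted above and computes that share. The names `parse_counters`, `transfer_fraction` and `SAMPLES` are invented here, and the sample text contains only the two PoT lines copied from each posted stderr excerpt.

```python
import re

# Matches lines such as "class PoT_transfer_needed: total=199, N=199, ..."
COUNTER_RE = re.compile(r"class (\w+):\s*total=(\d+)")


def parse_counters(stderr_text: str) -> dict[str, int]:
    """Return {counter_name: total} for every 'class ...: total=...' entry."""
    return {name: int(total) for name, total in COUNTER_RE.findall(stderr_text)}


def transfer_fraction(counters: dict[str, int], prefix: str = "PoT") -> float:
    """Fraction of transfers that were needed for the given counter family."""
    needed = counters[f"{prefix}_transfer_needed"]
    not_needed = counters[f"{prefix}_transfer_not_needed"]
    return needed / (needed + not_needed)


# PoT lines copied from the three counter listings posted in this thread.
SAMPLES = {
    "Raistmer's example": (
        "class PoT_transfer_not_needed: total=23927, N=23927, <>=1, min=1 max=1 "
        "class PoT_transfer_needed: total=199, N=199, <>=1, min=1 max=1"
    ),
    "laggy task, first listing": (
        "class PoT_transfer_not_needed: total=8149, N=8149, <>=1, min=1 max=1 "
        "class PoT_transfer_needed: total=85, N=85, <>=1, min=1 max=1"
    ),
    "laggy task, second listing": (
        "class PoT_transfer_not_needed: total=21039, N=21039, <>=1, min=1 max=1 "
        "class PoT_transfer_needed: total=143, N=143, <>=1, min=1 max=1"
    ),
}

for label, text in SAMPLES.items():
    fraction = transfer_fraction(parse_counters(text))
    print(f"{label}: {fraction:.2%} of PoT transfers needed")
```

All three listings come out at roughly 0.7% to 1.0%, which fits Raistmer's reading that the totals alone do not separate the laggy tasks from his example. The counters say nothing about when during the work unit the transfers happened, so this does not rule out a local burst.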

cov_route Send message Joined: 13 Sep 12 Posts: 342 Credit: 10,270,618 RAC: 0	Message 1405353 - Posted: 20 Aug 2013, 4:03:04 UTC - in response to Message 1403367. Any ideas how to monitor PCIe transfers? Any tools ? No ideas. I downloaded the NV sys tools (NV chipset) but no PCI-E info except frequency. I need more instrumentation!!! ID: 1405353 ·

TimK Send message Joined: 19 Sep 08 Posts: 10 Credit: 6,079,103 RAC: 0	Message 1407029 - Posted: 23 Aug 2013, 18:59:55 UTC For the past couple of weeks my machine has also been "laggy". I checked a few things in performance monitor and found about 20,000 average page faults per second, hardly any hard faults. I thought this was high, until I saw the graph below showing nearly 500,000 per second. What is the consensus on any possible fix? As I said, it's only started in the past few weeks and nothing has changed in my pc (as far as I know). ID: 1407029 ·

cov_route Send message Joined: 13 Sep 12 Posts: 342 Credit: 10,270,618 RAC: 0	Message 1407059 - Posted: 23 Aug 2013, 20:11:48 UTC - in response to Message 1407029. I don't know that there is a surefire fix. It is something that has been observed intermittently for a long time by many people. One thing that is suggested is to increase the argument -period_iterations_num from the default value of 20. I've tried very high numbers like 100 without seeing a difference but it might work for some. The other thing to do, if you are running more than one instance per gpu, is to decrease that to 1. I'm pretty sure that does help if it applies. If all else fails and it's really interfering with computer use, you can always set BOINC to not allow computing when the computer is active. ID: 1407059 ·

Mike Volunteer tester Send message Joined: 17 Feb 01 Posts: 34255 Credit: 79,922,639 RAC: 80	Message 1407082 - Posted: 23 Aug 2013, 21:21:11 UTC Add the following to your app_info.xml or mb_cmdline*.txt: -period_iterations_num 40 -sbs 256 This should fix it for you. In my experience the 7700 sometimes needs lower values for the single buffer size, so you can try -sbs 192 or -sbs 156. With each crime and every kindness we birth our future. ID: 1407082 ·
