Stability problems in Windows 8
Author | Message |
---|---|
Mark Lybeck Joined: 9 Aug 99 Posts: 245 Credit: 216,677,290 RAC: 173 |
Status report from Win7 system with HW from Win8:

CONCLUSION
Your system appears to be having trouble handling real-time audio and other tasks. You are likely to experience buffer underruns appearing as drop outs, clicks or pops. One or more DPC routines that belong to a driver running in your system appear to be executing for too long. Also one or more ISR routines that belong to a driver running in your system appear to be executing for too long. One problem may be related to power management, disable CPU throttling settings in Control Panel and BIOS setup. Check for BIOS updates.
LatencyMon has been analyzing your system for 1:41:40 (h:mm:ss) on all processors.

SYSTEM INFORMATION
Computer name: PEGASUS
OS version: Windows 7 Service Pack 1, 6.1, build: 7601 (x64)
Hardware: ASRock, Z77 WS
CPU: GenuineIntel Intel(R) Core(TM) i7-3770K CPU @ 3.50GHz
Logical processors: 8
Processor groups: 1
RAM: 16341 MB total

CPU SPEED
Reported CPU speed: 3500,0 MHz
Measured CPU speed: 2087,0 MHz (approx.)
Note: reported execution times may be calculated based on a fixed reported CPU speed. Disable variable speed settings like Intel Speed Step and AMD Cool N Quiet in the BIOS setup for more accurate results.

MEASURED INTERRUPT TO USER PROCESS LATENCIES
The interrupt to process latency reflects the measured interval that a usermode process needed to respond to a hardware request from the moment the interrupt service routine started execution. This includes the scheduling and execution of a DPC routine, the signaling of an event and the waking up of a usermode thread from an idle wait state in response to that event.
Highest measured interrupt to process latency (µs): 1956231,310545
Average measured interrupt to process latency (µs): 22,645260
Highest measured interrupt to DPC latency (µs): 1956229,555151
Average measured interrupt to DPC latency (µs): 21,434103

REPORTED ISRs
Interrupt service routines are routines installed by the OS and device drivers that execute in response to a hardware interrupt signal.
Highest ISR routine execution time (µs): 299674,124857
Driver with highest ISR routine execution time: dxgkrnl.sys - DirectX Graphics Kernel, Microsoft Corporation
Highest reported total ISR routine time (%): 1,301748
Driver with highest ISR total time: dxgkrnl.sys - DirectX Graphics Kernel, Microsoft Corporation
Total time spent in ISRs (%) 1,564753
ISR count (execution time <250 µs): 114274910
ISR count (execution time 250-500 µs): 0
ISR count (execution time 500-999 µs): 39
ISR count (execution time 1000-1999 µs): 112
ISR count (execution time 2000-3999 µs): 63
ISR count (execution time >=4000 µs): 0

REPORTED DPCs
DPC routines are part of the interrupt servicing dispatch mechanism and disable the possibility for a process to utilize the CPU while it is interrupted until the DPC has finished execution.
Highest DPC routine execution time (µs): 49397,993143
Driver with highest DPC routine execution time: ntoskrnl.exe - NT Kernel & System, Microsoft Corporation
Highest reported total DPC routine time (%): 1,230810
Driver with highest DPC total execution time: dxgkrnl.sys - DirectX Graphics Kernel, Microsoft Corporation
Total time spent in DPCs (%) 0,970995
DPC count (execution time <250 µs): 76921706
DPC count (execution time 250-500 µs): 0
DPC count (execution time 500-999 µs): 394
DPC count (execution time 1000-1999 µs): 14
DPC count (execution time 2000-3999 µs): 45
DPC count (execution time >=4000 µs): 0

REPORTED HARD PAGEFAULTS
Hard pagefaults are events that get triggered by making use of virtual memory that is not resident in RAM but backed by a memory mapped file on disk. The process of resolving the hard pagefault requires reading in the memory from disk while the process is interrupted and blocked from execution.
NOTE: some processes were hit by hard pagefaults. If these were programs producing audio, they are likely to interrupt the audio stream resulting in dropouts, clicks and pops. Check the Processes tab to see which programs were hit.
Process with highest pagefault count: avgrsa.exe
Total number of hard pagefaults 16518
Hard pagefault count of hardest hit process: 7990
Highest hard pagefault resolution time (µs): 6757056,1080
Total time spent in hard pagefaults (%): 1,054746
Number of processes hit: 44

PER CPU DATA
CPU 0 Interrupt cycle time (s): 1687,328820
CPU 0 ISR highest execution time (µs): 299674,124857
CPU 0 ISR total execution time (s): 763,720475
CPU 0 ISR count: 114275166
CPU 0 DPC highest execution time (µs): 49397,993143
CPU 0 DPC total execution time (s): 457,564790
CPU 0 DPC count: 71975428

CPU 1 Interrupt cycle time (s): 20,775048
CPU 1 ISR highest execution time (µs): 0,0
CPU 1 ISR total execution time (s): 0,0
CPU 1 ISR count: 0
CPU 1 DPC highest execution time (µs): 2014,664571
CPU 1 DPC total execution time (s): 1,005917
CPU 1 DPC count: 440774

CPU 2 Interrupt cycle time (s): 48,155371
CPU 2 ISR highest execution time (µs): 0,0
CPU 2 ISR total execution time (s): 0,0
CPU 2 ISR count: 0
CPU 2 DPC highest execution time (µs): 414,8980
CPU 2 DPC total execution time (s): 2,745514
CPU 2 DPC count: 489936

CPU 3 Interrupt cycle time (s): 27,237191
CPU 3 ISR highest execution time (µs): 0,0
CPU 3 ISR total execution time (s): 0,0
CPU 3 ISR count: 0
CPU 3 DPC highest execution time (µs): 47216,4620
CPU 3 DPC total execution time (s): 7,125037
CPU 3 DPC count: 1932254

CPU 4 Interrupt cycle time (s): 12,828998
CPU 4 ISR highest execution time (µs): 0,0
CPU 4 ISR total execution time (s): 0,0
CPU 4 ISR count: 0
CPU 4 DPC highest execution time (µs): 905,518286
CPU 4 DPC total execution time (s): 0,954342
CPU 4 DPC count: 440332

CPU 5 Interrupt cycle time (s): 19,926407
CPU 5 ISR highest execution time (µs): 0,0
CPU 5 ISR total execution time (s): 0,0
CPU 5 ISR count: 0
CPU 5 DPC highest execution time (µs): 47325,272286
CPU 5 DPC total execution time (s): 0,924708
CPU 5 DPC count: 430346

CPU 6 Interrupt cycle time (s): 22,510830
CPU 6 ISR highest execution time (µs): 0,0
CPU 6 ISR total execution time (s): 0,0
CPU 6 ISR count: 0
CPU 6 DPC highest execution time (µs): 489,357429
CPU 6 DPC total execution time (s): 1,289997
CPU 6 DPC count: 488128

CPU 7 Interrupt cycle time (s): 22,114244
CPU 7 ISR highest execution time (µs): 0,0
CPU 7 ISR total execution time (s): 0,0
CPU 7 ISR count: 0
CPU 7 DPC highest execution time (µs): 185,686857
CPU 7 DPC total execution time (s): 2,310174
CPU 7 DPC count: 724985 |
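For anyone who wants to compare several of these reports, here is a minimal Python sketch for pulling single values out of the text. It assumes the `label: value` layout and the decimal commas seen in the report above; the helper names `parse_decimal_comma` and `extract_metric` are made up for illustration and are not part of LatencyMon.

```python
import re


def parse_decimal_comma(text: str) -> float:
    """Convert a number written with a decimal comma, e.g. '49397,993143'."""
    return float(text.replace(",", "."))


def extract_metric(report: str, label: str) -> float:
    """Return the number that follows 'label:' in the report text."""
    pattern = re.escape(label) + r":\s*(-?\d+(?:,\d+)?)"
    match = re.search(pattern, report)
    if match is None:
        raise KeyError(label)
    return parse_decimal_comma(match.group(1))


# Three lines copied from the report above.
sample = (
    "Highest measured interrupt to process latency (µs): 1956231,310545 "
    "Average measured interrupt to process latency (µs): 22,645260 "
    "Highest DPC routine execution time (µs): 49397,993143"
)

worst_us = extract_metric(sample, "Highest measured interrupt to process latency (µs)")
worst_dpc_us = extract_metric(sample, "Highest DPC routine execution time (µs)")
print(f"worst interrupt-to-process latency: {worst_us / 1e6:.2f} s")
print(f"worst DPC routine: {worst_dpc_us / 1e3:.1f} ms")
```

Lines without a colon, such as "Total time spent in ISRs (%)", would need a slightly different pattern.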
jason_gee Joined: 24 Nov 06 Posts: 7489 Credit: 91,093,184 RAC: 0 |
Hmmm, great sleuthing. Isolating DPC spike issues can be a pain, and the cause can eventually turn out to be something unexpected. So it's followed the hardware to a different OS, and therefore a different set of chipset drivers etc. too? I believe that leaves only firmware/BIOS and then hardware. The problem I found when going through this process, after some improvement from force updating every Intel chipset driver, was a WiFi adapter driver. I ended up having to use a modified driver for that. Fingers crossed in your case it'll narrow down to something simpler like a BIOS update and forcing every Intel chipset driver to update. "Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions. |
Mark Lybeck Joined: 9 Aug 99 Posts: 245 Credit: 216,677,290 RAC: 173 |
Hello, My assumption about the HW was premature. I got the stall in Win 8 with the other MB + CPU unit as well. Using PCI1 + PCI5, where both slots run at x16 lanes, definitely triggers the problem most easily. Also, the optimized Astropulse application triggers it more easily than multibeam. If the GPU is put in a slot that only uses x4 lanes + x8 lanes, the problem does not occur. Could it be that no one has ever verified low-level heavy usage of the PCIe bus without SLI with all x16 lanes? A firmware debug would be nice if ASRock takes up the task. Maybe the PLX bridge chip is playing a role here? What is the way out? Use the Intel 2011 architecture without a bridge chip? Switch from Windows to Linux? Another thing in Windows 7: the host is of course crunching with video off. When I left DPC Latency Checker running and returned to the desktop, the history was full of high >16000 µs spikes, which disappeared when the video came back on. Could it be that some firmware/driver does not cope well with heavy load on the GPU while video is off at the same time? |
Mark Lybeck Joined: 9 Aug 99 Posts: 245 Credit: 216,677,290 RAC: 173 |
"So it's followed the hardware to a different OS, therefore different set of chipset drivers etc too ? I believe that leaves only firmware/BIOS and then hardware. The problem I found when going through this process, after improvement when force updating every Intel chipset driver, was a Wifi adaptor driver. I ended up having to use a modified driver for that." Well, most likely the drivers of Win8 x64 and Win7 x64 are quite similar, aren't they? In this case the motherboards were identical, with the same BIOS, and their serial numbers were only 1 apart. The PCIe slot definitely plays a role. What worries me is that once the problem starts occurring, only a system restart or an Nvidia driver restart (in Win7) puts things back to normal. |
Mark Lybeck Joined: 9 Aug 99 Posts: 245 Credit: 216,677,290 RAC: 173 |
I got the following report (driver file: highest execution in ms, total execution in ms):

dxgkrnl.sys: highest 50,217370, total 46222,026890
HECIx64.sys: highest 49,9968343, total 3073,300975
USBPORT.SYS: highest 49,924763, total 26688,221027
iaStor.sys: highest 47,873097, total 1292,611844
ndis.sys: highest 18,997810, total 132,713971
tcpip.sys: highest 4,967767, total 378,522447
nvlddmkm.sys: highest 2,457429, total 24818,047199

Go figure... the ISRs and DPCs are still spiking even after crunching is stopped. |
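To see which drivers dominate, the figures in the report above can be ranked by total execution time. This is only a sketch: the rows are copied from the post, the column order (driver, highest ms, total ms) is taken from its header, and the helper names are hypothetical.

```python
# Rows copied from the report above: driver, highest execution (ms), total execution (ms).
REPORT = """
dxgkrnl.sys 50,217370 46222,026890
HECIx64.sys 49,9968343 3073,300975
USBPORT.SYS 49,924763 26688,221027
iaStor.sys 47,873097 1292,611844
ndis.sys 18,997810 132,713971
tcpip.sys 4,967767 378,522447
nvlddmkm.sys 2,457429 24818,047199
"""


def to_float(text: str) -> float:
    """Parse a number written with a decimal comma."""
    return float(text.replace(",", "."))


def parse_rows(text: str):
    """Return a list of (driver, highest_ms, total_ms) tuples."""
    rows = []
    for line in text.strip().splitlines():
        name, highest, total = line.split()
        rows.append((name, to_float(highest), to_float(total)))
    return rows


# Sort by total execution time, largest first.
ranked = sorted(parse_rows(REPORT), key=lambda row: row[2], reverse=True)
for name, highest_ms, total_ms in ranked:
    print(f"{name:<14} highest {highest_ms:8.3f} ms   total {total_ms / 1000:7.2f} s")
```

Sorted this way, dxgkrnl.sys, USBPORT.SYS and nvlddmkm.sys account for most of the total time, while HECIx64.sys and iaStor.sys stand out mainly for their single worst spike.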
Mark Lybeck Joined: 9 Aug 99 Posts: 245 Credit: 216,677,290 RAC: 173 |
Could this SW be useful for debugging the PCIe bus? http://www.tssc.de/index.htm |
jason_gee Joined: 24 Nov 06 Posts: 7489 Credit: 91,093,184 RAC: 0 |
"Could this SW be useful for debugging the PCIe bus?" Mmmm, not sure, possibly. I haven't had to debug PCIe at or below firmware level before myself. Given there's a PCIe controller in the CPU these days as well, it could be worth unmounting the CPU and looking at the lands and the socket pins with a magnifying glass. Who knows, even if there is no obvious damage, maybe just the act of reseating could knock some oxidation off. As an aside: future applications will probably use the PCIe bus less and less, though I realise that doesn't help in this situation, where spikes are occurring even without applications running. That reduction is more for performance and scalability down the line. |
Mark Lybeck Joined: 9 Aug 99 Posts: 245 Credit: 216,677,290 RAC: 173 |
Really weird. It seems that the ASRock motherboard has a problem handling 2 x16 cards at the same time. I say this because I popped in a 3rd card, a GTX670, and the system now seems to work at least past the 10-minute mark. |
Mark Lybeck Joined: 9 Aug 99 Posts: 245 Credit: 216,677,290 RAC: 173 |
Hello, Looks like the diagnosis is starting to take shape on this one. I reduced the power budget and set a more aggressive fan curve for the Nvidia cards using Precision. It seems that the factory settings are not always stable when the GPU temperature rises from 70 to 80 C. The card in the topmost slot will run at higher temperatures when another card is inserted below it. I wonder why the manufacturer did not flash the card BIOS with settings that are stable in all conditions. The workaround for now is to run Precision with a modified fan curve, plus a reduced power budget if necessary, to keep the temperature lower. In multi-GPU systems it would be nice to get an indication of which GPU is causing the instability and high DPC spikes. ------------------ I managed to get a new all-time high of 54230 of Event ID 14 nvlddmkm in the Event Viewer within the last hour when I set up a new configuration of GPU in a new slot. |
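On the wish for a per-GPU indication: one rough way to see which card runs hottest is to poll `nvidia-smi` and compare temperatures per index. The sketch below is an assumption-laden illustration, not something used in this thread. It relies on the `--query-gpu` and `--format=csv,noheader,nounits` options of `nvidia-smi`, the function names are hypothetical, and the sample values in `SAMPLE_OUTPUT` are invented to show the parsing. It will not tell you which GPU causes a DPC spike, only which one is running hot.

```python
import csv
import io
import shutil
import subprocess

QUERY = "index,name,temperature.gpu,power.draw"


def _to_float(text):
    """Return a float, or None for fields such as '[Not Supported]'."""
    try:
        return float(text)
    except ValueError:
        return None


def parse_gpu_csv(text):
    """Parse 'index, name, temperature, power' rows into dictionaries."""
    gpus = []
    for row in csv.reader(io.StringIO(text), skipinitialspace=True):
        if len(row) != 4:
            continue  # skip blank or unexpected lines
        index, name, temp, power = (field.strip() for field in row)
        gpus.append({"index": int(index), "name": name,
                     "temp_c": _to_float(temp), "power_w": _to_float(power)})
    return gpus


def hottest(gpus):
    """Return the GPU entry with the highest reported temperature, if any."""
    with_temp = [gpu for gpu in gpus if gpu["temp_c"] is not None]
    return max(with_temp, key=lambda gpu: gpu["temp_c"]) if with_temp else None


def read_gpus():
    """Run nvidia-smi once and return the parsed rows."""
    output = subprocess.run(
        ["nvidia-smi", f"--query-gpu={QUERY}", "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    ).stdout
    return parse_gpu_csv(output)


# Invented sample values, only to illustrate the expected layout.
SAMPLE_OUTPUT = "0, GeForce GTX 680, 81, 170.25\n1, GeForce GTX 670, 68, [Not Supported]\n"

gpus = read_gpus() if shutil.which("nvidia-smi") else parse_gpu_csv(SAMPLE_OUTPUT)
top = hottest(gpus)
if top is not None:
    print(f"hottest GPU: #{top['index']} {top['name']} at {top['temp_c']:.0f} C")
```

Logging this once a minute next to the Event Viewer timestamps would at least show whether the nvlddmkm events line up with one card crossing a temperature threshold.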
Mark Lybeck Joined: 9 Aug 99 Posts: 245 Credit: 216,677,290 RAC: 173 |
Still quite high DPC. Needed to downclock to avoid crash... |
Mark Lybeck Joined: 9 Aug 99 Posts: 245 Credit: 216,677,290 RAC: 173 |
Still problem with 2 cards.. |
spitfire_mk_2 Joined: 14 Apr 00 Posts: 563 Credit: 27,306,885 RAC: 0 |
"Still problem with 2 cards.." Did you open the PC case? |
Mark Lybeck Joined: 9 Aug 99 Posts: 245 Credit: 216,677,290 RAC: 173 |
The case has been opened and closed several times for getting gear in and out. Today I happened to run one computer with only one GTX680. Even without any load there are high DPC spikes. I ran CudaMemTest and guess what: it failed. I reran the test in the same computer and the same PCIe slot with another card, a 670FTW, and it passed. I need to retest the GTX680 in another computer and also in another slot. Maybe the GTX680 has died... which would actually be a relief, given the circumstances, if it is verified over the next days. I recall that with the GTX560 you just got inconsistent results. Having driver crashes and BSODs is something new. It should not happen.

****************** PASS 6 ******************

TEST 1 : OWN address
PASS 6, TEST 1, Own Address, fill and check time: 977.082458 (ms)
No error detected

TEST 2 : MOVING One and Zeros
Thread 0 : 3 error(s). Faulty addresses (up to 4) :0x0,0,0
Thread 1 : 429065504 error(s). Faulty addresses (up to 4) :0x0,0,0,0
Thread 2 : 1632124 error(s). Faulty addresses (up to 4) :0x0,0,0,0
Thread 3 : 268635636 error(s). Faulty addresses (up to 4) :0x0,0,0,0
Thread 4 : 5120 error(s). Faulty addresses (up to 4) :0x0,0,0,0
Thread 5 : 7887384 error(s). Faulty addresses (up to 4) :0x0,0,0,0
Thread 6 : 1 error(s). Faulty addresses (up to 4) :0x0
Thread 7 : 7571496 error(s). Faulty addresses (up to 4) :0x0,0,0,0
Thread 8 : 174 error(s). Faulty addresses (up to 4) :0x0,0,0,0
Thread 9 : 1 error(s). Faulty addresses (up to 4) :0x0
Thread 17 : 828863230 error(s). Faulty addresses (up to 4) :0x0,0,0,0
Thread 18 : 1632104 error(s). Faulty addresses (up to 4) :0x0,0,0,0
Thread 19 : 268560861 error(s). Faulty addresses (up to 4) :0x0,0,0,0
Thread 20 : -529697949 error(s). Faulty addresses (up to 4) :0x0
Thread 21 : 1 error(s). Faulty addresses (up to 4) :0x0
Thread 22 : 3 error(s). Faulty addresses (up to 4) :0x0,0,0
Thread 23 : 1632092 error(s). Faulty addresses (up to 4) :0x0,0,0,0
Thread 24 : -529697949 error(s). Faulty addresses (up to 4) :0x0
Thread 25 : 1 error(s). Faulty addresses (up to 4) :0x0
Thread 28 : 3 error(s). Faulty addresses (up to 4) :0x0,0,0
Thread 29 : 429065504 error(s). Faulty addresses (up to 4) :0x0,0,0,0
Thread 30 : 1632124 error(s). Faulty addresses (up to 4) :0x0,0,0,0
Thread 31 : 268635636 error(s). Faulty addresses (up to 4) :0x0,0,0,0
PASS 6, TEST 2 : Moving inversions, ones&zeros , fill and check time: 1991.151245 (ms)

TEST 3 : MOVING Inversions - 8 bits pattern
Thread 0 : 4199485 error(s). Faulty addresses (up to 4) :0x0,0,0,0
Thread 1 : 32 error(s). Faulty addresses (up to 4) :0x0,0,0,0
Thread 2 : 268635636 error(s). Faulty addresses (up to 4) :0x0,0,0,0
Thread 3 : 3 error(s). Faulty addresses (up to 4) :0x0,0,0
Thread 4 : 429065504 error(s). Faulty addresses (up to 4) :0x0,0,0,0
Thread 5 : 1632124 error(s). Faulty addresses (up to 4) :0x0,0,0,0
Thread 6 : 268635636 error(s). Faulty addresses (up to 4) :0x0,0,0,0
Thread 7 : 5120 error(s). Faulty addresses (up to 4) :0x0,0,0,0
Thread 8 : 7887384 error(s). Faulty addresses (up to 4) :0x0,0,0,0
Thread 9 : 1 error(s). Faulty addresses (up to 4) :0x0
Thread 10 : 7571496 error(s). Faulty addresses (up to 4) :0x0,0,0,0
Thread 11 : 174 error(s). Faulty addresses (up to 4) :0x0,0,0,0
Thread 12 : 1 error(s). Faulty addresses (up to 4) :0x0
Thread 20 : 828863230 error(s). Faulty addresses (up to 4) :0x0,0,0,0
Thread 21 : 1632104 error(s). Faulty addresses (up to 4) :0x0,0,0,0
Thread 22 : 268560861 error(s). Faulty addresses (up to 4) :0x0,0,0,0
Thread 23 : -529697949 error(s). Faulty addresses (up to 4) :0x0
Thread 24 : 1 error(s). Faulty addresses (up to 4) :0x0
Thread 25 : 3 error(s). Faulty addresses (up to 4) :0x0,0,0
Thread 26 : 1632092 error(s). Faulty addresses (up to 4) :0x0,0,0,0
Thread 27 : -529697949 error(s). Faulty addresses (up to 4) :0x0
Thread 28 : 1 error(s). Faulty addresses (up to 4) :0x0
Thread 31 : 3 error(s). Faulty addresses (up to 4) :0x0,0,0
PASS 6, TEST 3 : Moving inversions, 8 bit pattern , fill and check time: 1982.141235 (ms)

TEST 4 : MOVING Inversions - Random Pattern
Thread 0 : 4199485 error(s). Faulty addresses (up to 4) :0x0,0,0,0
Thread 1 : 32 error(s). Faulty addresses (up to 4) :0x0,0,0,0
Thread 2 : 3 error(s). Faulty addresses (up to 4) :0x0,0,0
Thread 3 : 4199485 error(s). Faulty addresses (up to 4) :0x0,0,0,0
Thread 4 : 32 error(s). Faulty addresses (up to 4) :0x0,0,0,0
Thread 5 : 268635636 error(s). Faulty addresses (up to 4) :0x0,0,0,0
Thread 6 : 3 error(s). Faulty addresses (up to 4) :0x0,0,0
Thread 7 : 429065504 error(s). Faulty addresses (up to 4) :0x0,0,0,0
Thread 8 : 1632124 error(s). Faulty addresses (up to 4) :0x0,0,0,0
Thread 9 : 268635636 error(s). Faulty addresses (up to 4) :0x0,0,0,0
Thread 10 : 5120 error(s). Faulty addresses (up to 4) :0x0,0,0,0
Thread 11 : 7887384 error(s). Faulty addresses (up to 4) :0x0,0,0,0
Thread 12 : 1 error(s). Faulty addresses (up to 4) :0x0
Thread 13 : 7571496 error(s). Faulty addresses (up to 4) :0x0,0,0,0
Thread 14 : 174 error(s). Faulty addresses (up to 4) :0x0,0,0,0
Thread 15 : 1 error(s). Faulty addresses (up to 4) :0x0
Thread 23 : 828863230 error(s). Faulty addresses (up to 4) :0x0,0,0,0
Thread 24 : 1632104 error(s). Faulty addresses (up to 4) :0x0,0,0,0
Thread 25 : 268560861 error(s). Faulty addresses (up to 4) :0x0,0,0,0
Thread 26 : -529697949 error(s). Faulty addresses (up to 4) :0x0
Thread 27 : 1 error(s). Faulty addresses (up to 4) :0x0
Thread 28 : 3 error(s). Faulty addresses (up to 4) :0x0,0,0
Thread 29 : 1632092 error(s). Faulty addresses (up to 4) :0x0,0,0,0
Thread 30 : -529697949 error(s). Faulty addresses (up to 4) :0x0
Thread 31 : 1 error(s). Faulty addresses (up to 4) :0x0
PASS 6, TEST 4 : Moving inversions, random pattern , fill and check time: 2093.846924 (ms)

TEST 5 : Moving inversions, 32 bit pattern
Thread 0 : 4199485 error(s). Faulty addresses (up to 4) :0x0,0,0,0
Thread 1 : 32 error(s). Faulty addresses (up to 4) :0x0,0,0,0
Thread 2 : 1 error(s). Faulty addresses (up to 4) :0x0
Thread 3 : 4199485 error(s). Faulty addresses (up to 4) :0x0,0,0,0
Thread 4 : 32 error(s). Faulty addresses (up to 4) :0x0,0,0,0
Thread 5 : 3 error(s). Faulty addresses (up to 4) :0x0,0,0
Thread 6 : 4199485 error(s). Faulty addresses (up to 4) :0x0,0,0,0
Thread 7 : 32 error(s). Faulty addresses (up to 4) :0x0,0,0,0
Thread 8 : 268635636 error(s). Faulty addresses (up to 4) :0x0,0,0,0
Thread 9 : 3 error(s). Faulty addresses (up to 4) :0x0,0,0
Thread 10 : 429065504 error(s). Faulty addresses (up to 4) :0x0,0,0,0
Thread 11 : 1632124 error(s). Faulty addresses (up to 4) :0x0,0,0,0
Thread 12 : 268635636 error(s). Faulty addresses (up to 4) :0x0,0,0,0
Thread 13 : 5120 error(s). Faulty addresses (up to 4) :0x0,0,0,0
Thread 14 : 7887384 error(s). Faulty addresses (up to 4) :0x0,0,0,0
Thread 15 : 1 error(s). Faulty addresses (up to 4) :0x0
Thread 16 : 7571496 error(s). Faulty addresses (up to 4) :0x0,0,0,0
Thread 17 : 174 error(s). Faulty addresses (up to 4) :0x0,0,0,0
Thread 18 : 1 error(s). Faulty addresses (up to 4) :0x0
Thread 26 : 828863230 error(s). Faulty addresses (up to 4) :0x0,0,0,0
Thread 27 : 1632104 error(s). Faulty addresses (up to 4) :0x0,0,0,0
Thread 28 : 268560861 error(s). Faulty addresses (up to 4) :0x0,0,0,0
Thread 29 : -529697949 error(s). Faulty addresses (up to 4) :0x0
Thread 30 : 1 error(s). Faulty addresses (up to 4) :0x0
Thread 31 : 3 error(s). Faulty addresses (up to 4) :0x0,0,0
PASS 6, TEST 5 : Moving inversions, 32 bit pattern , fill and check time: 1981.279907 (ms)

TEST 6 : Modulo 20, random pattern
PASS 6, TEST 6, Modulo 20, Random Pattern, fill and check time: 995.270020 (ms)
No error detected |
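A side note on reading this log: some threads report a negative error count (-529697949), which cannot be a real count. One possible explanation, and it is only an assumption about how the tool stores the value, is a 32-bit counter printed as a signed integer. The Python sketch below parses the per-thread lines and shows what such a value would read as if unsigned. The regular expression and the function names are made up for this illustration.

```python
import re

# Matches entries such as "Thread 20 : -529697949 error(s)."
THREAD_LINE = re.compile(r"Thread\s+(\d+)\s*:\s*(-?\d+) error\(s\)")


def as_unsigned32(value: int) -> int:
    """Reinterpret a signed 32-bit value as unsigned (assumed counter width)."""
    return value & 0xFFFFFFFF


def summarize(log_excerpt: str):
    """Return ({thread: count}, {thread: unsigned count for negative entries})."""
    counts = {int(t): int(n) for t, n in THREAD_LINE.findall(log_excerpt)}
    negatives = {t: as_unsigned32(n) for t, n in counts.items() if n < 0}
    return counts, negatives


# Three entries copied from TEST 2 in the log above.
excerpt = (
    "Thread 19 : 268560861 error(s). Faulty addresses (up to 4) :0x0,0,0,0 "
    "Thread 20 : -529697949 error(s). Faulty addresses (up to 4) :0x0 "
    "Thread 21 : 1 error(s). Faulty addresses (up to 4) :0x0"
)

counts, negatives = summarize(excerpt)
print(len(counts), "threads reported errors")
for thread, unsigned in negatives.items():
    print(f"thread {thread}: {counts[thread]} reads as {unsigned} if unsigned")
```

Either way, with every faulty address reported as 0 and the same counts repeating across tests, the numbers look more like garbage read back from the card than a real tally, which fits the suspicion that this GTX680 is failing.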
©2024 University of California
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.