After 7 days of GPU crunching AMD driver crashes...



Questions and Answers : GPU applications : After 7 days of GPU crunching AMD driver crashes...

Dave
Joined: 30 Oct 12
Posts: 12
Credit: 1,834,278
RAC: 21
Germany
Message 1420510 - Posted: 26 Sep 2013, 12:39:10 UTC
Last modified: 26 Sep 2013, 12:42:40 UTC

Hi,

I had my PC running for 7 days in a row crunching SETI@HOME with Lunatics optimizations installed.

After 7 days however, the AMD driver crashed and Boinc stopped using the GPU from that point on and instead started using the CPU.

Windows event log states the following:

Der Anzeigetreiber "amdkmdap" reagiert nicht mehr und wurde wiederhergestellt

(In English: the display driver "amdkmdap" stopped responding and was recovered, i.e. amdkmdap crashed.)

Temps were all good. Max 59°C GPU core. I also had the fans spinning at quite a high rpm to reduce the risk of overheating.

What can I do? The GPU is running at stock frequency. It's a Gigabyte 7970 Windforce 3x @ 1 GHz (factory overclocked). The AMD driver is 13.4 WHQL.

Profile ignorance is no excuse
Joined: 4 Oct 00
Posts: 9529
Credit: 44,433,321
RAC: 0
Korea, North
Message 1420544 - Posted: 26 Sep 2013, 15:31:03 UTC - in response to Message 1420510.

It appears that you started off getting error 197 (0xc5) EXIT_TIME_LIMIT_EXCEEDED. Then you started just spitting out WUs with 30 signals. Did you free up at least 1 CPU core? IIRC, AMD/ATI cards need at least 1 CPU core to feed the GPU. If you are running 3+ WUs on your GPU, then free up 2 cores for the GPU feed.
____________
In a rich man's house there is no place to spit but his face.
Diogenes Of Sinope

End terrorism by building a school

Dave
Joined: 30 Oct 12
Posts: 12
Credit: 1,834,278
RAC: 21
Germany
Message 1421162 - Posted: 27 Sep 2013, 20:54:37 UTC

This is what my app_config looks like:

<app_config>
  <app>
    <name>setiathome_v7</name>
    <gpu_versions>
      <gpu_usage>.25</gpu_usage>
      <cpu_usage>1</cpu_usage>
    </gpu_versions>
  </app>
  <app>
    <name>astropulse_v6</name>
    <gpu_versions>
      <gpu_usage>.25</gpu_usage>
      <cpu_usage>1</cpu_usage>
    </gpu_versions>
  </app>
  <app>
    <name>setiathome_enhanced</name>
    <gpu_versions>
      <gpu_usage>.25</gpu_usage>
      <cpu_usage>1</cpu_usage>
    </gpu_versions>
  </app>
</app_config>
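
An app_config like this can be checked for well-formedness before it is handed to BOINC. The following is only a minimal sketch using Python's standard library; the helper name `check_app_config` is made up for illustration, and it assumes the file is expected to be well-formed XML with the `app` / `gpu_versions` layout shown in the post.

```python
import xml.etree.ElementTree as ET

def check_app_config(text):
    """Return (True, [(name, gpu_usage, cpu_usage), ...]) or (False, error text)."""
    try:
        root = ET.fromstring(text)
    except ET.ParseError as err:
        # An unclosed or truncated tag ends up here.
        return False, str(err)
    if root.tag != "app_config":
        return False, "root element is <%s>, expected <app_config>" % root.tag
    apps = []
    for app in root.findall("app"):
        name = app.findtext("name")
        gpu = app.findtext("gpu_versions/gpu_usage")
        cpu = app.findtext("gpu_versions/cpu_usage")
        if name is None or gpu is None or cpu is None:
            return False, "incomplete <app> entry"
        apps.append((name, float(gpu), float(cpu)))
    return True, apps
```

Calling `check_app_config(open("app_config.xml").read())` (file name assumed) returns either the parsed per-app settings or the parser's error message, which usually includes the line and column of the problem.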

Profile ignorance is no excuse
Joined: 4 Oct 00
Posts: 9529
Credit: 44,433,321
RAC: 0
Korea, North
Message 1421173 - Posted: 27 Sep 2013, 21:22:27 UTC - in response to Message 1421162.

You need to set the CPU usage percentage either in the BOINC Manager or on the Seti@home website. It can easily be set on your account page: under Computing Preferences, choose "Use at most" and set it to 50%. This will free up 2 cores and allow your GPU to be fed properly. You'll need to update the BOINC Manager for this to take effect.

Also, I'd give your computers separate preference sets (work, home, school), since the other one is working fine as is.
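
The arithmetic behind the 50% suggestion can be sketched as follows. This is only an illustration: the function name `cpus_allowed` is hypothetical, the rounding rule (round down, but never below one CPU) is an assumption rather than BOINC's documented behaviour, and the 4-core count is inferred from "free up 2 cores" rather than stated in the thread.

```python
def cpus_allowed(ncpus, percent):
    """CPUs BOINC may use under a 'use at most N%' preference (assumed rounding)."""
    return max(1, int(ncpus * percent / 100))

ncpus = 4                        # assumed core count
used = cpus_allowed(ncpus, 50)   # CPUs left for CPU tasks
free = ncpus - used              # cores left over to feed the GPU
```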

____________
In a rich man's house there is no place to spit but his face.
Diogenes Of Sinope

End terrorism by building a school

Profile Gundolf Jahn
Joined: 19 Sep 00
Posts: 3184
Credit: 361,286
RAC: 37
Germany
Message 1421203 - Posted: 28 Sep 2013, 0:04:29 UTC - in response to Message 1421173.

In theory, with that app_config, BOINC should keep one core free for each running GPU task, so 4 in total:

<gpu_usage>.25</gpu_usage>
<cpu_usage>1</cpu_usage>
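
The reasoning above can be written out as a small calculation. This is a sketch of the arithmetic only, not of BOINC's scheduler; `gpu_task_budget` is a made-up name, and a single GPU is assumed.

```python
import math

def gpu_task_budget(gpu_usage, cpu_usage, n_gpus=1):
    """Return (concurrent GPU tasks, CPU cores reserved for them)."""
    # Each task claims `gpu_usage` of a GPU, so 1 / 0.25 = 4 tasks fit on one GPU.
    # The small epsilon guards against floating-point results like 3.9999999.
    tasks = math.floor(n_gpus / gpu_usage + 1e-9)
    # Each running GPU task reserves `cpu_usage` CPU cores.
    return tasks, tasks * cpu_usage

tasks, cores = gpu_task_budget(0.25, 1)   # 4 tasks, 4 cores
```

Raising `gpu_usage` to .5 in the same calculation gives 2 concurrent tasks and 2 reserved cores.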


@Dave: Did you reboot after the driver crash?

If a crash happens frequently when using the GPU, you should try running fewer than 4 tasks concurrently.

Regards,
Gundolf

Stefan158
Joined: 26 Aug 05
Posts: 3
Credit: 131,464
RAC: 99
Germany
Message 1426651 - Posted: 10 Oct 2013, 16:21:40 UTC - in response to Message 1421203.
Last modified: 10 Oct 2013, 16:36:20 UTC

Hello everyone.
Sorry for writing in German.
I have Boinc 7.0.64 (x64) 2.8.10 on Win 7 64-bit, with AMD Radeon HD6310 graphics and an AMD E350 processor at 1.6 GHz.
I had already computed many GPU tasks with this machine.
Since the update to 7.0.64 I have also been getting the error message about the display driver having been recovered. A computation would start, and shortly afterwards the error message appeared. Yesterday I updated the graphics driver:
date 30.08.2013, version 13.152.0.0. Now a computation starts, and after a few seconds the message appears: computation error.
Unfortunately I can no longer work out with which older Boinc version the GPU worked without problems. Does anyone have a solution?
Regards,
Stefan

Here are the Boinc messages:
10.10.2013 17:33:35 | | No config file found - using defaults
10.10.2013 17:33:35 | | Starting BOINC client version 7.0.64 for windows_x86_64
10.10.2013 17:33:35 | | log flags: file_xfer, sched_ops, task
10.10.2013 17:33:35 | | Libraries: libcurl/7.25.0 OpenSSL/1.0.1 zlib/1.2.6
10.10.2013 17:33:35 | | Data directory: C:\ProgramData\BOINC
10.10.2013 17:33:35 | | Running under account Stefan
10.10.2013 17:33:35 | | Processor: 2 AuthenticAMD AMD E-350 Processor [Family 20 Model 1 Stepping 0]
10.10.2013 17:33:35 | | Processor features: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 htt pni ssse3 cx16 popcnt syscall nx lm svm sse4a ibs skinit wdt page1gb rdtscp
10.10.2013 17:33:35 | | OS: Microsoft Windows 7: Home Premium x64 Edition, Service Pack 1, (06.01.7601.00)
10.10.2013 17:33:35 | | Memory: 3.60 GB physical, 7.51 GB virtual
10.10.2013 17:33:35 | | Disk: 254.14 GB total, 151.35 GB free
10.10.2013 17:33:35 | | Local time is UTC +2 hours
10.10.2013 17:33:35 | | CAL: ATI GPU 0: AMD Radeon HD 6200/6300/7300 series (Wrestler) (CAL version 1.4.1848, 384MB, 351MB available, 157 GFLOPS peak)
10.10.2013 17:33:35 | | OpenCL: AMD/ATI GPU 0: AMD Radeon HD 6200/6300/7300 series (Wrestler) (driver version 1268.1 (VM), device version OpenCL 1.2 AMD-APP (1268.1), 384MB, 351MB available, 157 GFLOPS peak)
10.10.2013 17:33:35 | Einstein@Home | URL http://einstein.phys.uwm.edu/; Computer ID 4072866; resource share 100
10.10.2013 17:33:35 | SETI@home | URL http://setiathome.berkeley.edu/; Computer ID 5860248; resource share 100
10.10.2013 17:33:35 | SETI@home | General prefs: from SETI@home (last modified 30-Dec-2006 13:01:03)
10.10.2013 17:33:35 | SETI@home | Computer location: home
10.10.2013 17:33:35 | SETI@home | General prefs: no separate prefs for home; using your defaults
10.10.2013 17:33:35 | | Reading preferences override file
10.10.2013 17:33:35 | | Preferences:
10.10.2013 17:33:35 | | max memory usage when active: 1845.45MB
10.10.2013 17:33:35 | | max memory usage when idle: 3321.81MB
10.10.2013 17:33:35 | | max disk usage: 1.00GB
10.10.2013 17:33:35 | | max CPUs used: 1
10.10.2013 17:33:35 | | don't use GPU while active
10.10.2013 17:33:35 | | suspend work if non-BOINC CPU load exceeds 80 %
10.10.2013 17:33:35 | | (to change preferences, visit a project web site or select Preferences in the Manager)
10.10.2013 17:33:35 | | Not using a proxy
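
Startup messages like these are easier to scan once they are split into fields. The sketch below assumes the layout seen in this log (`DD.MM.YYYY HH:MM:SS | project | message`, with an empty project field for client messages); the function names are made up for illustration.

```python
from datetime import datetime

def parse_boinc_line(line):
    """Split one message-log line into (timestamp, project, message), or None."""
    parts = line.split("|", 2)
    if len(parts) != 3:
        return None
    stamp, project, message = (p.strip() for p in parts)
    try:
        when = datetime.strptime(stamp, "%d.%m.%Y %H:%M:%S")
    except ValueError:
        return None  # separator lines, signatures, etc.
    return when, project, message

def gpu_related(lines):
    """Return the message text of every line that mentions the GPU."""
    hits = []
    for line in lines:
        parsed = parse_boinc_line(line)
        if parsed and "GPU" in parsed[2]:
            hits.append(parsed[2])
    return hits
```

Run over the log above, `gpu_related` would return the CAL and OpenCL detection lines together with the `don't use GPU while active` preference.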
____________

Profile Gundolf Jahn
Joined: 19 Sep 00
Posts: 3184
Credit: 361,286
RAC: 37
Germany
Message 1426732 - Posted: 10 Oct 2013, 18:04:25 UTC - in response to Message 1426651.
Last modified: 10 Oct 2013, 18:06:13 UTC

Since the update to 7.0.64 I have also been getting the error message about the display driver having been recovered. A computation would start, and shortly afterwards the error message appeared. Yesterday I updated the graphics driver:
date 30.08.2013, version 13.152.0.0. Now a computation starts, and after a few seconds the message appears: computation error.

Did you restart the computer between the last driver restart and the computation error? [edit] With the power switched off (pull the plug) [/edit]

Regards,
Gundolf

Stefan158
Joined: 26 Aug 05
Posts: 3
Credit: 131,464
RAC: 99
Germany
Message 1427530 - Posted: 12 Oct 2013, 11:13:26 UTC - in response to Message 1426732.

Hello Gundolf.
Yes, I had.
The day before yesterday Win7 installed 25 updates; before that I had reset all projects. And look: it works!!
Have a nice weekend,
Stefan
____________

Stefan158
Joined: 26 Aug 05
Posts: 3
Credit: 131,464
RAC: 99
Germany
Message 1456383 - Posted: 22 Dec 2013, 13:43:42 UTC
Last modified: 22 Dec 2013, 13:56:48 UTC

So, it worked for a few weeks, but now I am getting computation errors again after one to two minutes. I will reset everything once more. If it still doesn't work then, I will try to turn off ogl. Can anyone tell me how to do that? If I don't allow ogl, tasks are still sent anyway; they are just shown as stopped.
Regards,
Stefan

22.12.2013 13:39:28 | SETI@home | Sending scheduler request: To fetch work.
22.12.2013 13:39:28 | SETI@home | Reporting 1 completed tasks
22.12.2013 13:39:28 | SETI@home | Requesting new tasks for ATI
22.12.2013 13:39:32 | SETI@home | Scheduler request completed: got 1 new tasks
22.12.2013 13:39:34 | SETI@home | Started download of 03no13ab.18213.14791.438086664195.12.220
22.12.2013 13:39:38 | SETI@home | Finished download of 03no13ab.18213.14791.438086664195.12.220
22.12.2013 13:39:38 | SETI@home | Starting task 03no13ab.18213.14791.438086664195.12.220_0 using setiathome_v7 version 703 (opencl_ati5_cat132) in slot 1
22.12.2013 13:40:07 | SETI@home | Computation for task 03no13ab.18213.14791.438086664195.12.220_0 finished
22.12.2013 13:40:09 | SETI@home | Started upload of 03no13ab.18213.14791.438086664195.12.220_0_0
22.12.2013 13:40:13 | SETI@home | Finished upload of 03no13ab.18213.14791.438086664195.12.220_0_0
22.12.2013 13:44:37 | SETI@home | Sending scheduler request: To fetch work.
22.12.2013 13:44:37 | SETI@home | Reporting 1 completed tasks
22.12.2013 13:44:37 | SETI@home | Requesting new tasks for ATI
22.12.2013 13:44:40 | SETI@home | Scheduler request completed: got 0 new tasks
22.12.2013 13:54:46 | SETI@home | Sending scheduler request: To fetch work.
22.12.2013 13:54:46 | SETI@home | Requesting new tasks for ATI
22.12.2013 13:54:48 | SETI@home | Scheduler request completed: got 0 new tasks
22.12.2013 14:14:53 | SETI@home | Sending scheduler request: To fetch work.
22.12.2013 14:14:53 | SETI@home | Requesting new tasks for ATI
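
How long a task actually ran can be read off the "Starting task" and "Computation for task ... finished" messages. The sketch below assumes the line layout and message wording seen in this log; `task_runtimes` is a made-up helper, and it measures wall-clock time between the two messages, not GPU time.

```python
import re
from datetime import datetime

START = re.compile(r"Starting task (\S+)")
DONE = re.compile(r"Computation for task (\S+) finished")

def task_runtimes(lines):
    """Map task name -> seconds between its start and finish messages."""
    started, runtimes = {}, {}
    for line in lines:
        parts = line.split("|", 2)
        if len(parts) != 3:
            continue
        try:
            when = datetime.strptime(parts[0].strip(), "%d.%m.%Y %H:%M:%S")
        except ValueError:
            continue
        message = parts[2].strip()
        match = START.search(message)
        if match:
            started[match.group(1)] = when
            continue
        match = DONE.search(message)
        if match and match.group(1) in started:
            delta = when - started[match.group(1)]
            runtimes[match.group(1)] = delta.total_seconds()
    return runtimes
```

For the one task in the log above, this gives 29 seconds between start (13:39:38) and finish (13:40:07).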
____________


Copyright © 2014 University of California