AMD driver failed and recovered

Message boards : Number crunching : AMD driver failed and recovered

John
Joined: 31 Aug 13
Posts: 17
Credit: 2,604,894
RAC: 3,040
United States
Message 1481106 - Posted: 24 Feb 2014, 2:13:25 UTC

I am having trouble with SETI and my GPU. It says "AMD driver failed and recovered." I saw the posts recommending suspending/resuming or rebooting to get it to proceed but that doesn't work. Once I get that error message, it will never get beyond that point on that particular work unit. For example, it is stuck right now at 20.271% and no matter what I do or how long I let it work, it will never get any further. I have to abort the work unit. Some units it does fine, but others it gets stuck. I have the latest AMD drivers and version of BOINC. I've included the first portion of the event log so you can see the specs for my system. Any thoughts on how I can solve this problem?

2/23/2014 12:00:54 PM | | Starting BOINC client version 7.2.39 for windows_x86_64
2/23/2014 12:00:54 PM | | log flags: file_xfer, sched_ops, task
2/23/2014 12:00:54 PM | | Libraries: libcurl/7.25.0 OpenSSL/1.0.1 zlib/1.2.6
2/23/2014 12:00:54 PM | | Data directory: C:\ProgramData\BOINC
2/23/2014 12:00:54 PM | | Running under account John
2/23/2014 12:00:54 PM | | CAL: ATI GPU 0: AMD Radeon HD 6350/6450/7450/7470 series (Caicos) (CAL version 1.4.1848, 1024MB, 991MB available, 400 GFLOPS peak)
2/23/2014 12:00:54 PM | | OpenCL: AMD/ATI GPU 0: AMD Radeon HD 6350/6450/7450/7470 series (Caicos) (driver version 1348.5 (VM), device version OpenCL 1.2 AMD-APP (1348.5), 1024MB, 991MB available, 400 GFLOPS peak)
2/23/2014 12:00:54 PM | | OpenCL CPU: AMD FX(tm)-4300 Quad-Core Processor (OpenCL driver vendor: Advanced Micro Devices, Inc., driver version 1348.5 (sse2,avx,fma4), device version OpenCL 1.2 AMD-APP (1348.5))
2/23/2014 12:00:54 PM | | Host name: JACpc
2/23/2014 12:00:54 PM | | Processor: 4 AuthenticAMD AMD FX(tm)-4300 Quad-Core Processor [Family 21 Model 2 Stepping 0]
2/23/2014 12:00:54 PM | | Processor features: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 htt pni ssse3 fma cx16 sse4_1 sse4_2 popcnt aes syscall nx lm svm sse4a osvw ibs xop skinit wdt lwp fma4 tce tbm topx page1gb rdtscp
2/23/2014 12:00:54 PM | | OS: Microsoft Windows 8.1: Core x64 Edition, (06.03.9600.00)
2/23/2014 12:00:54 PM | | Memory: 8.00 GB physical, 16.00 GB virtual
2/23/2014 12:00:54 PM | | Disk: 931.17 GB total, 719.55 GB free
2/23/2014 12:00:54 PM | | Local time is UTC -5 hours
2/23/2014 12:00:54 PM | | VirtualBox version: 4.3.6
ID: 1481106
bill
Joined: 16 Jun 99
Posts: 861
Credit: 29,350,878
RAC: 579
United States
Message 1481116 - Posted: 24 Feb 2014, 3:11:49 UTC

I can't give you any help except for this:
some GPU work units have parts that have
to be done on the CPU. While this is happening
the GPU just sits there and "looks" hung, but it
isn't. Depending on the circumstances it can
take hours for the CPU to finish its part and
for the GPU to start up again. And this can happen
more than once. Try letting a hung one sit for a day
and make a note of its progress. Patience is necessary
for some of the ways SETI@home works.
ID: 1481116
John
Joined: 31 Aug 13
Posts: 17
Credit: 2,604,894
RAC: 3,040
United States
Message 1481129 - Posted: 24 Feb 2014, 5:08:43 UTC - in response to Message 1481116.  

I've let one work for close to a month. Waiting doesn't do any good. It simply doesn't do any work after the error message. On the ones without the error message, it progresses in a normal linear fashion without pauses until completion.
ID: 1481129
bill
Joined: 16 Jun 99
Posts: 861
Credit: 29,350,878
RAC: 579
United States
Message 1481136 - Posted: 24 Feb 2014, 5:39:03 UTC - in response to Message 1481129.  

Sorry. I don't have any of your hardware so I'm not
of much help there.

There's a guy named TBAR that posts here that might be of more help.

Has this worked previously with other/older BOINC software?
ID: 1481136
Mike Special Project $75 donor
Volunteer tester
Joined: 17 Feb 01
Posts: 30876
Credit: 60,007,113
RAC: 23,988
Germany
Message 1481161 - Posted: 24 Feb 2014, 8:40:08 UTC

We are talking about SETI v7 here, not APs.

What are your CPUs doing?
Do you have idle cores?

Are you sure you uninstalled the old drivers correctly?
With each crime and every kindness we birth our future.
ID: 1481161
BilBg
Volunteer tester
Joined: 27 May 07
Posts: 3717
Credit: 9,138,377
RAC: 2,765
Bulgaria
Message 1481216 - Posted: 24 Feb 2014, 13:46:19 UTC - in response to Message 1481106.  

You can read this (rather LONG) thread with the same problem:
http://setiathome.berkeley.edu/forum_thread.php?id=72035

The 'solution' is posted here (downgrade BOINC to version 7.0.28):
http://setiathome.berkeley.edu/forum_thread.php?id=72035&postid=1385824#1385824

BOINC 7.0.64 already had the problem, so it seems this continues in BOINC 7.2.39.
I think it was caused by a change in the BOINC API (the way BOINC communicates with applications).

All BOINC Versions (Ctrl+F for 7.0.28):
http://boinc.berkeley.edu/dl/?C=M;O=A

I don't remember whether downgrading BOINC from 7.2.39 to 7.0.28 causes any problems.



- ALF - "Find out what you don't do well ..... then don't do it!" :)
ID: 1481216
Richard Haselgrove Project Donor
Volunteer tester
Joined: 4 Jul 99
Posts: 11738
Credit: 111,634,658
RAC: 48,587
United Kingdom
Message 1481219 - Posted: 24 Feb 2014, 14:02:00 UTC - in response to Message 1481216.  

> You can read this (rather LONG) thread with the same problem:
> http://setiathome.berkeley.edu/forum_thread.php?id=72035
>
> The 'solution' is posted here (downgrade BOINC to version 7.0.28):
> http://setiathome.berkeley.edu/forum_thread.php?id=72035&postid=1385824#1385824
>
> BOINC 7.0.64 already had the problem, so it seems this continues in BOINC 7.2.39.
> I think it was caused by a change in the BOINC API (the way BOINC communicates with applications).
>
> All BOINC Versions (Ctrl+F for 7.0.28):
> http://boinc.berkeley.edu/dl/?C=M;O=A
>
> I don't remember whether downgrading BOINC from 7.2.39 to 7.0.28 causes any problems.

You'd end up on the wrong side of the "EXIT_TIME_LIMIT_EXCEEDED error on GPU work" barrier - which wouldn't be a problem going down, but would need care and attention when coming back up.
ID: 1481219
BilBg
Volunteer tester
Joined: 27 May 07
Posts: 3717
Credit: 9,138,377
RAC: 2,765
Bulgaria
Message 1481231 - Posted: 24 Feb 2014, 14:54:28 UTC - in response to Message 1481219.  
Last modified: 24 Feb 2014, 15:07:09 UTC

>> I don't remember whether downgrading BOINC from 7.2.39 to 7.0.28 causes any problems.
>
> You'd end up on the wrong side of the "EXIT_TIME_LIMIT_EXCEEDED error on GPU work" barrier - which wouldn't be a problem going down, but would need care and attention when coming back up.

As I understand this:
Since he uses stock applications, it should be no problem to go back and forth between BOINC 7.0.28 and 7.2.39?

I don't remember whether there was any 'Recommended version' of BOINC between 7.0.28 and 7.0.64.
I also don't know when the BOINC API changed and became a problem for Windows 8 + AMD (ATI) drivers + ATI GPU apps.




- ALF - "Find out what you don't do well ..... then don't do it!" :)
ID: 1481231
Richard Haselgrove Project Donor
Volunteer tester
Joined: 4 Jul 99
Posts: 11738
Credit: 111,634,658
RAC: 48,587
United Kingdom
Message 1481233 - Posted: 24 Feb 2014, 15:03:14 UTC - in response to Message 1481231.  

>>> I don't remember whether downgrading BOINC from 7.2.39 to 7.0.28 causes any problems.
>>
>> You'd end up on the wrong side of the "EXIT_TIME_LIMIT_EXCEEDED error on GPU work" barrier - which wouldn't be a problem going down, but would need care and attention when coming back up.
>
> As I understand this:
> Since he uses stock applications, it should be no problem to go back and forth between BOINC 7.0.28 and 7.2.39?

OK, I'd forgotten the details - it's been a while.

Just so long as he doesn't try to solve his original problem by installing an anonymous platform application - do one or the other, but not both.
ID: 1481233
John
Joined: 31 Aug 13
Posts: 17
Credit: 2,604,894
RAC: 3,040
United States
Message 1481245 - Posted: 24 Feb 2014, 15:51:17 UTC - in response to Message 1481161.  

My CPUs are usually busy with other projects. I suspended them just to see if it would make any difference and the scheduler loaded them up with SETI workunits. The GPU got stuck as usual.

I uninstalled and reinstalled the AMD drivers. It didn't help.
ID: 1481245
TBar
Volunteer tester
Joined: 22 May 99
Posts: 3936
Credit: 206,698,168
RAC: 191,996
United States
Message 1481262 - Posted: 24 Feb 2014, 16:45:50 UTC - in response to Message 1481245.  

Have you tried deleting the App Binaries yet? Go to the project folder setiathome.berkeley.edu and delete all the .bin, .bin_V6, .bin_V7, & .wisdom files so newer ones can be compiled.
You can see some of the files here: [screenshot of the project folder, not preserved]

Delete all the MB_clFFTplan...bin, AP_clFFTplan...bin, MultiBeam...bin_V7, AstroPulse...bin_V6, and xxx.wisdom files.

If that doesn't help, try using the Driver Uninstaller App to completely remove all of the driver and reinstall it. Delete the Binary files again after reinstalling the driver.
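The clean-up described above can be scripted. What follows is only a minimal sketch in Python: the function names are made up for illustration, and treating the four suffixes named in the post as the complete set of cached files is an assumption. It defaults to a dry run so nothing is deleted until you ask for it.

```python
from pathlib import Path

# Suffixes named in the post above; assuming these cover every cached file.
CACHE_SUFFIXES = (".bin", ".bin_V6", ".bin_V7", ".wisdom")


def find_cached_binaries(project_dir):
    """Return the cached kernel / FFT-plan / wisdom files in project_dir."""
    folder = Path(project_dir)
    return sorted(
        p for p in folder.iterdir()
        if p.is_file() and p.name.endswith(CACHE_SUFFIXES)
    )


def delete_cached_binaries(project_dir, dry_run=True):
    """Delete the cached files; with dry_run=True only list what would go."""
    targets = find_cached_binaries(project_dir)
    for path in targets:
        if not dry_run:
            path.unlink()
    return [p.name for p in targets]
```

Calling `delete_cached_binaries(folder)` first and reading the returned names is a cheap way to check that only the expected files match before running it again with `dry_run=False`.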
ID: 1481262
John
Joined: 31 Aug 13
Posts: 17
Credit: 2,604,894
RAC: 3,040
United States
Message 1481307 - Posted: 24 Feb 2014, 18:56:01 UTC - in response to Message 1481136.  

This computer started out with version 7.2.33. It had problems with the GPU which is why I upgraded to 7.2.39

Well, I tried going back to 7.0.28 this morning. It made no difference. Any other ideas?
ID: 1481307
John
Joined: 31 Aug 13
Posts: 17
Credit: 2,604,894
RAC: 3,040
United States
Message 1481310 - Posted: 24 Feb 2014, 19:01:31 UTC - in response to Message 1481262.  

How do I get to the project folders?
ID: 1481310
TBar
Volunteer tester
Joined: 22 May 99
Posts: 3936
Credit: 206,698,168
RAC: 191,996
United States
Message 1481335 - Posted: 24 Feb 2014, 19:25:26 UTC - in response to Message 1481310.  

> How do I get to the project folders?

2/23/2014 12:00:54 PM | | Data directory: C:\ProgramData\BOINC\projects\setiathome.berkeley.edu

ProgramData is a hidden folder. You will have to change the view preferences to see it.
ID: 1481335
OzzFan Special Project $75 donor
Volunteer tester
Joined: 9 Apr 02
Posts: 15640
Credit: 59,104,623
RAC: 42,518
United States
Message 1481353 - Posted: 24 Feb 2014, 19:53:19 UTC - in response to Message 1481335.  

>> How do I get to the project folders?
>
> 2/23/2014 12:00:54 PM | | Data directory: C:\ProgramData\BOINC\projects\setiathome.berkeley.edu
>
> ProgramData is a hidden folder. You will have to change the view preferences to see it.


Or simply type it into the address bar in Windows Explorer, which I prefer to un-hiding the folder when I'm working on someone else's machine.
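The same path can also be built in a script instead of typed by hand. This is a rough sketch, assuming the default layout in which the project folder sits under `projects` inside the data directory reported in the event log; `project_folder` is a made-up helper name, not part of BOINC.

```python
import os
from pathlib import PureWindowsPath

PROJECT_NAME = "setiathome.berkeley.edu"


def project_folder(data_dir=None):
    """Build the project folder path from BOINC's data directory."""
    if data_dir is None:
        # Assumption: the data directory is %ProgramData%\BOINC, as in the
        # event log; fall back to C:\ProgramData if the variable is unset.
        program_data = os.environ.get("ProgramData", r"C:\ProgramData")
        data_dir = PureWindowsPath(program_data) / "BOINC"
    return PureWindowsPath(data_dir) / "projects" / PROJECT_NAME
```

On Windows, passing `str(project_folder())` to `os.startfile` should open the folder in Explorer even while ProgramData stays hidden.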
ID: 1481353
TBar
Volunteer tester
Joined: 22 May 99
Posts: 3936
Credit: 206,698,168
RAC: 191,996
United States
Message 1481384 - Posted: 24 Feb 2014, 22:05:22 UTC
Last modified: 24 Feb 2014, 22:18:45 UTC

I'm sitting here waiting for whatever is wrong with the AstroPulse downloads to be fixed, so I decided to try removing the app_info file to see how stock works on this host; http://setiathome.berkeley.edu/show_host_detail.php?hostid=6796475. It works perfectly fine. Went zooming right on by 20%, http://setiathome.berkeley.edu/result.php?resultid=3405606679. No driver restarts here with Windows 8.1 32-bit.

I really do hope they fix the APs soon. My other ATI host can't run ATI MB7 and it's about out of work.
ID: 1481384
BilBg
Volunteer tester
Joined: 27 May 07
Posts: 3717
Credit: 9,138,377
RAC: 2,765
Bulgaria
Message 1482188 - Posted: 27 Feb 2014, 5:38:54 UTC - in response to Message 1481384.  

> I decided to try removing the app_info file and see how stock works on this host; http://setiathome.berkeley.edu/show_host_detail.php?hostid=6796475.
> It works perfectly fine. Went zooming right on by 20%, http://setiathome.berkeley.edu/result.php?resultid=3405606679.
> No driver restarts here with Windows 8.1 32bit.


WU true angle range is : 2.711553

The original problem (17 Jun 2013) didn't happen on VHAR tasks
http://setiathome.berkeley.edu/forum_thread.php?id=72035&postid=1384767#1384767
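To check whether a given task is a VHAR, the angle range can be pulled out of the task's stderr text. The sketch below is illustrative only: the two cut-off values are figures commonly quoted by volunteers and are assumptions here, not numbers taken from this thread or from project documentation, and the function names are made up.

```python
import re

# Assumed cut-offs (commonly quoted by volunteers, not from this thread).
VLAR_MAX = 0.12    # below this: very low angle range
VHAR_MIN = 1.127   # above this: very high angle range

AR_LINE = re.compile(r"WU true angle range is\s*:\s*([0-9]*\.?[0-9]+)")


def angle_range(stderr_text):
    """Return the angle range reported in a task's stderr, or None."""
    match = AR_LINE.search(stderr_text)
    return float(match.group(1)) if match else None


def classify(ar):
    """Label an angle range using the assumed cut-offs above."""
    if ar < VLAR_MAX:
        return "VLAR"
    if ar > VHAR_MIN:
        return "VHAR"
    return "mid-range"
```

With these assumed cut-offs, the 2.711553 quoted above falls in the VHAR bucket.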



- ALF - "Find out what you don't do well ..... then don't do it!" :)
ID: 1482188
BilBg
Volunteer tester
Joined: 27 May 07
Posts: 3717
Credit: 9,138,377
RAC: 2,765
Bulgaria
Message 1482192 - Posted: 27 Feb 2014, 5:52:44 UTC - in response to Message 1481307.  

> Well, I tried going back to 7.0.28 this morning. It made no difference.

How long did you stay on BOINC 7.0.28?
Are you sure BOINC 7.0.28 was installed properly and that version was really running?
Did you use this file?:
http://boinc.berkeley.edu/dl/boinc_7.0.28_windows_x86_64.exe

Your computer still (or again?) shows 'BOINC version 7.2.39':
http://setiathome.berkeley.edu/show_host_detail.php?hostid=7177463
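One way to confirm which version was really running is to look for the "Starting BOINC client version" line that the client writes at the top of its event log. A small sketch, assuming the log text has been copied into a string; `running_version` is a hypothetical helper name:

```python
import re

START_LINE = re.compile(r"Starting BOINC client version (\d+)\.(\d+)\.(\d+)")


def running_version(event_log_text):
    """Return the most recently announced client version as a tuple,
    or None if the log has no start-up line."""
    matches = START_LINE.findall(event_log_text)
    if not matches:
        return None
    return tuple(int(part) for part in matches[-1])
```

Taking the last match rather than the first matters when the log spans a restart, for example after a downgrade followed by a re-upgrade.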



- ALF - "Find out what you don't do well ..... then don't do it!" :)
ID: 1482192
TBar
Volunteer tester
Joined: 22 May 99
Posts: 3936
Credit: 206,698,168
RAC: 191,996
United States
Message 1482194 - Posted: 27 Feb 2014, 6:00:13 UTC - in response to Message 1482188.  
Last modified: 27 Feb 2014, 6:11:25 UTC

>> I decided to try removing the app_info file and see how stock works on this host; http://setiathome.berkeley.edu/show_host_detail.php?hostid=6796475.
>> It works perfectly fine. Went zooming right on by 20%, http://setiathome.berkeley.edu/result.php?resultid=3405606679.
>> No driver restarts here with Windows 8.1 32bit.
>
> WU true angle range is : 2.711553
>
> The original problem (17 Jun 2013) didn't happen on VHAR tasks
> http://setiathome.berkeley.edu/forum_thread.php?id=72035&postid=1384767#1384767

I ran many more than just one. I posted the first one, showing the binary recompiles.
INFO: can't open binary kernel file: C:\ProgramData\BOINC/projects/setiathome.berkeley.edu\MultiBeam_Kernels_r1831.cl_Capeverde.bin_V7, continue with recompile...
INFO: binary kernel file created
WARNING: can't open binary kernel file for oclFFT plan: C:\ProgramData\BOINC/projects/setiathome.berkeley.edu\MB_clFFTplan_Capeverde_524288_r1831.bin, continue with recompile...
WARNING: can't open binary kernel file for oclFFT plan: C:\ProgramData\BOINC/projects/setiathome.berkeley.edu\MB_clFFTplan_Capeverde_8_r1831.bin, continue with recompile...
WARNING: can't open binary kernel file for oclFFT plan: C:\ProgramData\BOINC/projects/setiathome.berkeley.edu\MB_clFFTplan_Capeverde_16_r1831.bin, continue with recompile...
WARNING: can't open binary kernel file for oclFFT plan: C:\ProgramData\BOINC/projects/setiathome.berkeley.edu\MB_clFFTplan_Capeverde_32_r1831.bin, continue with recompile...
WARNING: can't open binary kernel file for oclFFT plan: C:\ProgramData\BOINC/projects/setiathome.berkeley.edu\MB_clFFTplan_Capeverde_64_r1831.bin, continue with recompile...
WARNING: can't open binary kernel file for oclFFT plan: C:\ProgramData\BOINC/projects/setiathome.berkeley.edu\MB_clFFTplan_Capeverde_128_r1831.bin, continue with recompile...
WARNING: can't open binary kernel file for oclFFT plan: C:\ProgramData\BOINC/projects/setiathome.berkeley.edu\MB_clFFTplan_Capeverde_256_r1831.bin, continue with recompile...
WARNING: can't open binary kernel file for oclFFT plan: C:\ProgramData\BOINC/projects/setiathome.berkeley.edu\MB_clFFTplan_Capeverde_512_r1831.bin, continue with recompile...
WARNING: can't open binary kernel file for oclFFT plan: C:\ProgramData\BOINC/projects/setiathome.berkeley.edu\MB_clFFTplan_Capeverde_1024_r1831.bin, continue with recompile...
WARNING: can't open binary kernel file for oclFFT plan: C:\ProgramData\BOINC/projects/setiathome.berkeley.edu\MB_clFFTplan_Capeverde_2048_r1831.bin, continue with recompile...
WARNING: can't open binary kernel file for oclFFT plan: C:\ProgramData\BOINC/projects/setiathome.berkeley.edu\MB_clFFTplan_Capeverde_4096_r1831.bin, continue with recompile...
WARNING: can't open binary kernel file for oclFFT plan: C:\ProgramData\BOINC/projects/setiathome.berkeley.edu\MB_clFFTplan_Capeverde_8192_r1831.bin, continue with recompile...
WARNING: can't open binary kernel file for oclFFT plan: C:\ProgramData\BOINC/projects/setiathome.berkeley.edu\MB_clFFTplan_Capeverde_16384_r1831.bin, continue with recompile...
WARNING: can't open binary kernel file for oclFFT plan: C:\ProgramData\BOINC/projects/setiathome.berkeley.edu\MB_clFFTplan_Capeverde_32768_r1831.bin, continue with recompile...
WARNING: can't open binary kernel file for oclFFT plan: C:\ProgramData\BOINC/projects/setiathome.berkeley.edu\MB_clFFTplan_Capeverde_65536_r1831.bin, continue with recompile...
WARNING: can't open binary kernel file for oclFFT plan: C:\ProgramData\BOINC/projects/setiathome.berkeley.edu\MB_clFFTplan_Capeverde_131072_r1831.bin, continue with recompile...
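A log like the one above can be summarised rather than read line by line. This sketch assumes only the line format visible in the excerpt; the helper names are hypothetical. It lists the files the app reported as missing and the FFT plan sizes encoded in the file names.

```python
import re
from pathlib import PureWindowsPath

MISSING = re.compile(
    r"can't open binary kernel file(?: for oclFFT plan)?: (.+?), "
    r"continue with recompile"
)
PLAN = re.compile(r"MB_clFFTplan_[A-Za-z]+_(\d+)_r\d+\.bin$")


def recompiled_files(stderr_text):
    """Return the names of the cached files the app could not open."""
    names = []
    for line in stderr_text.splitlines():
        match = MISSING.search(line)
        if match:
            # PureWindowsPath copes with the mixed / and \ separators.
            names.append(PureWindowsPath(match.group(1)).name)
    return names


def fft_plan_sizes(names):
    """Return the sorted FFT sizes encoded in MB_clFFTplan file names."""
    sizes = []
    for name in names:
        match = PLAN.match(name)
        if match:
            sizes.append(int(match.group(1)))
    return sorted(sizes)
```

If the binaries were deleted successfully, every cached file should show up in this list on the first task run afterwards and in none of the later ones.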

This is one of the few Non-VHARs left in the results;
WU true angle range is : 0.432673
http://setiathome.berkeley.edu/result.php?resultid=3405606837
I'm fairly certain I'm not the only one around not having problems with SETI@home v7 v7.03 (opencl_ati5_cat132) or SETI@home v7 v7.03 (opencl_ati_cat132) in Windows 8.1.
ID: 1482194
BilBg
Volunteer tester
Joined: 27 May 07
Posts: 3717
Credit: 9,138,377
RAC: 2,765
Bulgaria
Message 1482201 - Posted: 27 Feb 2014, 6:37:15 UTC - in response to Message 1482194.  

> I'm fairly certain I'm not the only one around Not having problems with SETI@home v7 v7.03 (opencl_ati5_cat132) or SETI@home v7 v7.03 (opencl_ati_cat132) in Windows 8.1.

Except that yours is Windows 8.1 32-bit (the only 2 problem machines I know of or remember are running 'Windows 8.1 Core x64 Edition').
Yes, this problem can't be widespread (otherwise I'd expect more than just 2 reports).


It seems a combination of factors may cause this 'hang':

1) Windows 8.x x64

2) Weak GPU (low 'compute units')

John's 'AMD Radeon HD 6350/6450/7450/7470 series (Caicos)':

Used GPU device parameters are:
Number of compute units: 2
Single buffer allocation size: 64MB
max WG size: 256


TJ13's 'AMD A10-4600M APU with Radeon(tm) HD Graphics'

Used GPU device parameters are:
Number of compute units: 6
Single buffer allocation size: 64MB
max WG size: 256


3) Maybe a combination of BOINC version and ATI app version (TJ13's computer runs BOINC 7.0.28 http://setiathome.berkeley.edu/show_host_detail.php?hostid=6973921 )

4) I'm not sure whether it depends on the ATI driver version



- ALF - "Find out what you don't do well ..... then don't do it!" :)
ID: 1482201

 