GPU task runs but never progresses

Hughsey2-1

Joined: 14 May 99
Posts: 4
Credit: 323,863
RAC: 1
United States
Message 1895964 - Posted: 18 Oct 2017, 15:35:21 UTC

I don't pay much attention to BOINC and the tasks it runs so I have no clue when this issue really started.

I noticed a SETI@home GPU task was more than a month overdue, yet BOINC was still trying to process it. I aborted the task, a new one was downloaded, and it appeared to process normally, though again I did not pay close attention to it. I may also have done a Reset Project at that time, but I don't recall.

The current task was downloaded on Oct 11th. I noticed that it appeared to start over each day and never got past 0.053%. Today I switched the Activity to "Use GPU Always" to see what would happen. For the first hour or so, the Elapsed time increased, but the progress stayed at 0.053%, and the Remaining time dropped initially from 1.5 hours and then held steady. Somewhere in hour two, the progress was still at 0.053%, the Elapsed time was still increasing, and the Remaining time is now over 180 days and climbing. When I started typing this message, it was at 160 days. It is growing by nearly 30 minutes every second.

How do I go about solving this issue so that SETI@home tasks go back to progressing normally? Nothing in the Event Log looks abnormal. This is on a Win 7 laptop.
10/18/2017 7:55:53 AM | | OpenCL: AMD/ATI GPU 0: AMD Radeon HD 8790M (driver version 1124.2 (VM), device version OpenCL 1.2 AMD-APP (1124.2), 2048MB, 2048MB available, 691 GFLOPS peak)
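
A rough sanity check on those numbers, as a minimal sketch. It assumes the Remaining time is a plain linear extrapolation from the elapsed time and the fraction done; BOINC's real estimator is more involved, so treat this as an illustration only. The function name `remaining_seconds` is made up for the example.

```python
def remaining_seconds(elapsed_s: float, fraction_done: float) -> float:
    """Linear extrapolation (an assumption, not BOINC's actual estimator):
    if fraction_done of the work took elapsed_s seconds, the rest takes
    elapsed_s * (1 - fraction_done) / fraction_done seconds."""
    if not 0.0 < fraction_done <= 1.0:
        raise ValueError("fraction_done must be in (0, 1]")
    return elapsed_s * (1.0 - fraction_done) / fraction_done


# Progress reported in the post: stuck at 0.053 %.
FRACTION_DONE = 0.053 / 100

# How much the estimate grows for each extra second of elapsed time.
growth_per_second = remaining_seconds(1.0, FRACTION_DONE)

# Estimate after two hours of elapsed time with no progress.
days_after_2h = remaining_seconds(2 * 3600, FRACTION_DONE) / 86400

print(f"estimate grows by {growth_per_second / 60:.1f} minutes per second")
print(f"estimate after 2 h elapsed: {days_after_2h:.0f} days")
```

Under that assumption, a task stuck at 0.053% adds roughly half an hour to the estimate for every second it runs, and the estimate is already past 150 days at the two-hour mark. That is in line with the numbers reported above, so the runaway Remaining time looks like a side effect of the stalled progress rather than a separate problem.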
Jord
Volunteer tester

Joined: 9 Jun 99
Posts: 15184
Credit: 4,362,181
RAC: 3
Netherlands
Message 1895978 - Posted: 18 Oct 2017, 16:50:58 UTC - in response to Message 1895964.  

Please enable the coproc_debug log flag (Advanced View -> Options -> Event Log options), exit and restart BOINC, and then run that task for a couple of minutes.
Then copy and paste BOINC's messages from the Event Log (CTRL+SHIFT+E) here.
Hughsey2-1

Joined: 14 May 99
Posts: 4
Credit: 323,863
RAC: 1
United States
Message 1896122 - Posted: 19 Oct 2017, 13:27:43 UTC - in response to Message 1895978.  

I let it run for an elapsed time of 2:21. The progress was again at 0.053%. I did notice yesterday, after my post, that the elapsed time and remaining time had reset at some point, even though the task was never swapped out and BOINC did not restart.

10/19/2017 8:20:57 AM | | Starting BOINC client version 7.8.2 for windows_x86_64
10/19/2017 8:20:57 AM | | log flags: file_xfer, sched_ops, task, coproc_debug
10/19/2017 8:20:57 AM | | Libraries: libcurl/7.47.1 OpenSSL/1.0.2g zlib/1.2.8
10/19/2017 8:20:57 AM | | Data directory: C:\ProgramData\BOINC
10/19/2017 8:20:57 AM | | [coproc] launching child process at C:\Program Files\BOINC\boinc.exe
10/19/2017 8:20:57 AM | | [coproc] relative to directory C:\ProgramData\BOINC
10/19/2017 8:20:57 AM | | [coproc] with data directory "C:\ProgramData\BOINC"
10/19/2017 8:20:58 AM | | OpenCL: AMD/ATI GPU 0: AMD Radeon HD 8790M (driver version 1124.2 (VM), device version OpenCL 1.2 AMD-APP (1124.2), 2048MB, 2048MB available, 691 GFLOPS peak)
10/19/2017 8:20:58 AM | | OpenCL CPU: Intel(R) Core(TM) i7-4610M CPU @ 3.00GHz (OpenCL driver vendor: Intel(R) Corporation, driver version 3.0.1.10878, device version OpenCL 1.2 (Build 76413))
10/19/2017 8:20:58 AM | | [coproc] No NVIDIA library found
10/19/2017 8:20:58 AM | | [coproc] No usable CAL devices found
10/19/2017 8:20:58 AM | | [coproc] clGetDeviceInfo failed to get CL_DEVICE_SIMD_PER_COMPUTE_UNIT_AMD for device 0
10/19/2017 8:20:58 AM | | Processor: 4 GenuineIntel Intel(R) Core(TM) i7-4610M CPU @ 3.00GHz [Family 6 Model 60 Stepping 3]
10/19/2017 8:20:58 AM | | Processor features: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss htt tm pni ssse3 fma cx16 sse4_1 sse4_2 movebe popcnt aes f16c rdrandsyscall nx lm avx avx2 vmx smx tm2 pbe fsgsbase bmi1 smep bmi2
10/19/2017 8:20:58 AM | | OS: Microsoft Windows 7: Professional x64 Edition, Service Pack 1, (06.01.7601.00)
10/19/2017 8:20:58 AM | | Memory: 15.91 GB physical, 31.81 GB virtual
10/19/2017 8:20:58 AM | | Disk: 225.14 GB total, 25.29 GB free
10/19/2017 8:20:58 AM | | Local time is UTC -5 hours
10/19/2017 8:20:58 AM | | VirtualBox version: 5.0.12
10/19/2017 8:20:58 AM | | A new version of BOINC is available (7.8.3). <a href=https://boinc.berkeley.edu/download.php>Download</a>
10/19/2017 8:20:58 AM | PrimeGrid | URL http://www.primegrid.com/; Computer ID 156976; resource share 50
10/19/2017 8:20:58 AM | Rosetta@home | URL http://boinc.bakerlab.org/rosetta/; Computer ID 765125; resource share 50
10/19/2017 8:20:58 AM | SETI@home | URL http://setiathome.berkeley.edu/; Computer ID 4256587; resource share 50
10/19/2017 8:20:58 AM | SETI@home | General prefs: from SETI@home (last modified 29-Jul-2010 11:22:31)
10/19/2017 8:20:58 AM | SETI@home | Computer location: work
10/19/2017 8:20:58 AM | SETI@home | General prefs: no separate prefs for work; using your defaults
10/19/2017 8:20:58 AM | | Reading preferences override file
10/19/2017 8:20:58 AM | | Preferences:
10/19/2017 8:20:58 AM | | max memory usage when active: 3257.83 MB
10/19/2017 8:20:58 AM | | max memory usage when idle: 9773.50 MB
10/19/2017 8:20:58 AM | | max disk usage: 3.00 GB
10/19/2017 8:20:58 AM | | max CPUs used: 2
10/19/2017 8:20:58 AM | | don't use GPU while active
10/19/2017 8:20:58 AM | | suspend work if non-BOINC CPU load exceeds 75%
10/19/2017 8:20:58 AM | | (to change preferences, visit a project web site or select Preferences in the Manager)
10/19/2017 8:20:59 AM | | Suspending GPU computation - computer is in use
10/19/2017 8:21:13 AM | | Resuming GPU computation
10/19/2017 8:21:13 AM | SETI@home | [coproc] Assigning ATI instance 0 to blc04_2bit_guppi_57976_07596_HIP74209_0027.18309.0.22.45.133.vlar_0
10/19/2017 8:22:14 AM | SETI@home | [coproc] ATI instance 0; 1.000000 pending for blc04_2bit_guppi_57976_07596_HIP74209_0027.18309.0.22.45.133.vlar_0
10/19/2017 8:22:14 AM | SETI@home | [coproc] ATI instance 0: confirming 1.000000 instance for blc04_2bit_guppi_57976_07596_HIP74209_0027.18309.0.22.45.133.vlar_0
10/19/2017 8:23:15 AM | SETI@home | [coproc] ATI instance 0; 1.000000 pending for blc04_2bit_guppi_57976_07596_HIP74209_0027.18309.0.22.45.133.vlar_0
10/19/2017 8:23:15 AM | SETI@home | [coproc] ATI instance 0: confirming 1.000000 instance for blc04_2bit_guppi_57976_07596_HIP74209_0027.18309.0.22.45.133.vlar_0
10/19/2017 8:23:34 AM | | Suspending GPU computation - computer is in use
Jord
Volunteer tester

Joined: 9 Jun 99
Posts: 15184
Credit: 4,362,181
RAC: 3
Netherlands
Message 1896147 - Posted: 19 Oct 2017, 15:59:18 UTC - in response to Message 1896122.  

I see you're using a very old AMD graphics card driver (Catalyst 13.4), from 2013. Any reason for that?
Can you please try to update to a newer driver? The newest driver compatible with your GPU is ReLive 17.7.2.

What I notice in particular is that the log says no CAL-capable devices were detected, although your GPU is CAL capable. CAL was ATI's own method of doing calculations on the GPU, before they abandoned it in favor of OpenCL. CAL is, however, still used by some projects to detect whether a GPU can be used, and/or to make sure that the right application is sent to it.

You are being sent the opencl_ati5_SoG_nocal application, which is built specifically for GPUs that no longer do hardware CAL detection (RX and Vega series). I feel that is the wrong version for your GPU: BOINC misidentifies the GPU as not being CAL capable, when in fact it is. It's just not using the drivers right.

CAL is still built into the ReLive drivers for backward compatibility with the HD, R5, R7, and R9 series GPUs that still roam this world.
It is best to use the AMD Clean Uninstall Utility first, to get rid of all the extra crud that the Catalyst drivers usually leave behind on an uninstall.
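
For anyone who wants to run the same check on their own Event Log, here is a minimal sketch. It only looks for the two things discussed in this post: the driver version on the `OpenCL: AMD/ATI GPU` line and the `No usable CAL devices found` message. The sample lines are copied from the log posted above; the `summarize` helper and the regular expression are made up for this example and are not part of BOINC.

```python
import re

# Lines copied from the Event Log posted earlier in this thread.
EVENT_LOG = [
    "10/19/2017 8:20:58 AM | | OpenCL: AMD/ATI GPU 0: AMD Radeon HD 8790M "
    "(driver version 1124.2 (VM), device version OpenCL 1.2 AMD-APP (1124.2), "
    "2048MB, 2048MB available, 691 GFLOPS peak)",
    "10/19/2017 8:20:58 AM | | [coproc] No NVIDIA library found",
    "10/19/2017 8:20:58 AM | | [coproc] No usable CAL devices found",
]

# Hypothetical pattern, written against the log format seen in this thread only.
GPU_LINE = re.compile(r"OpenCL: AMD/ATI GPU (\d+): (.+?) \(driver version ([\d.]+)")


def summarize(lines):
    """Return (list of detected AMD GPUs, whether CAL detection failed)."""
    gpus = []
    cal_missing = False
    for line in lines:
        match = GPU_LINE.search(line)
        if match:
            gpus.append({
                "index": int(match.group(1)),
                "name": match.group(2),
                "driver": match.group(3),
            })
        if "No usable CAL devices found" in line:
            cal_missing = True
    return gpus, cal_missing


gpus, cal_missing = summarize(EVENT_LOG)
for gpu in gpus:
    print(f"GPU {gpu['index']}: {gpu['name']}, OpenCL driver version {gpu['driver']}")
if gpus and cal_missing:
    print("OpenCL sees an AMD GPU, but CAL detection failed; the driver install is worth checking.")
```

The same script can be pointed at a fresh log after a driver update to confirm that the reported driver version actually changed.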
Hughsey2-1

Joined: 14 May 99
Posts: 4
Credit: 323,863
RAC: 1
United States
Message 1896167 - Posted: 19 Oct 2017, 18:27:31 UTC - in response to Message 1896147.  

As for the old video drivers: in short, this is a work laptop from Dell, and video drivers are pretty much one of the last things anyone ever updates. The laptop was built by Dell in late 2015, so I'm surprised at the age of the video driver too. I'll start by looking into updating the video drivers.
Hughsey2-1

Joined: 14 May 99
Posts: 4
Credit: 323,863
RAC: 1
United States
Message 1896170 - Posted: 19 Oct 2017, 18:54:22 UTC - in response to Message 1896167.  

I went to Dell's website and downloaded their versions of the drivers for both the AMD GPU and the built-in video. I installed the AMD driver first, rebooted, and checked. The task began showing progress and went past the prior sticking point. I installed the other video driver just because, rebooted, and checked again. The earlier progress was retained and the task still works.

10/19/2017 1:44:22 PM | | OpenCL: AMD/ATI GPU 0: AMD Radeon HD 8790M (driver version 1573.4 (VM), device version OpenCL 1.2 AMD-APP (1573.4), 2048MB, 2048MB available, 691 GFLOPS peak)
10/19/2017 1:44:22 PM | | OpenCL CPU: Intel(R) Core(TM) i7-4610M CPU @ 3.00GHz (OpenCL driver vendor: Intel(R) Corporation, driver version 4.2.0.148, device version OpenCL 1.2 (Build 148))

The two lines above are what I ended up with, for reference purposes.

Thanks for the help with the issue.