Whole series of data blocks failing with SoG
Author | Message |
---|---|
Sixkid Joined: 10 Jan 12 Posts: 17 Credit: 8,248,305 RAC: 21 |
My client is busy with 22ap11ae.21036.6611.6.33.88_0 (opencl_nvidia_SoG) and it's starting to fail. The remaining time says 2 minutes 9 seconds, but once that time has passed nothing has changed, and the remaining time starts to climb until the deadline is reached and the block is aborted. This has been going on for the past few tasks that I've been watching in this series of blocks. |
Grant (SSSF) Joined: 19 Aug 99 Posts: 13870 Credit: 208,696,464 RAC: 304 |
My systems processed all those WUs using SoG with no issues. Grant Darwin NT |
Joined: 29 Apr 01 Posts: 13164 Credit: 1,160,866,277 RAC: 1,873 |
"My client is busy with 22ap11ae.21036.6611.6.33.88_0 (opencl_nvidia_SoG) and it's starting to fail." Timminator2 first reported this problem, then Robert Miles reported the same problem on a different series of tasks. May I ask whether you are running Windows 10 with the latest Nvidia 436 drivers? That is the common factor in these reports. All the problem tasks are Arecibo VHARs with an angle range of 2.7. All of them error out because they exceed the compute time limit, while the host can process other types of task with no problems. See https://setiathome.berkeley.edu/forum_thread.php?id=80859&postid=2009430 I have suggested rolling back to earlier drivers as a fix. Seti@Home classic workunits: 20,676 CPU time: 74,226 hours A proud member of the OFA (Old Farts Association) |
Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121 |
Such symptoms usually mean the video driver restarted. The app is not informed of such an event; the last OpenCL runtime call simply never returns, hence the abort when the deadline is reached. Changing the driver is good advice, provided the app worked OK with another driver version. To narrow down where the issue lies, I would recommend (temporarily) adding the -v 2 option. There was also a special debug build that reports each OpenCL call to stderr, which would be ideal for seeing exactly where the problem occurs. On the OS side, check the system log to see whether any driver restart events took place. SETI apps news We're not gonna fight them. We're gonna transcend them. |
Michael Donikowski Joined: 29 May 99 Posts: 8 Credit: 22,914,826 RAC: 38 |
Same problem here, with Arecibo GPU tasks only. Windows 10 Pro, GTX 1660 Ti, stock apps. The problem occurs with the Nvidia drivers released in August 2019 (436 series). All Arecibo CPU tasks and all Green Bank tasks run fine. All tasks run fine with the Nvidia drivers released in July 2019 (431.36). |
Michael Donikowski Joined: 29 May 99 Posts: 8 Credit: 22,914,826 RAC: 38 |
"Same problem here with Arecibo GPU tasks only." Forgot to mention that it did not matter whether it was a SoG or CUDA application. |
Joined: 4 Mar 10 Posts: 1854 Credit: 2,258,721 RAC: 0 |
I found myself going back to the previous driver, 430.86, as the new driver, 436.15, was stopping and restarting the same Arecibo GPU task (nothing was aborted, it just kept restarting from scratch). Everything is running fine again. |
Sixkid Joined: 10 Jan 12 Posts: 17 Credit: 8,248,305 RAC: 21 |
Update: the 436.15 driver has trouble with SoG; the 436.02 driver has trouble with SoG; the 431.60 driver is OK on my system (Win10 Pro, GTX 1080 Ti). If you have trouble getting 431.60 installed because the system check fails during driver installation, use DDU (google it) to uninstall the GPU drivers without a restart (also turn off the internet). When that has finished, install the 431.60 driver and it will succeed. Turn the internet back on and happy crunching times are back. |
Joined: 25 Sep 00 Posts: 190 Credit: 23,498,825 RAC: 9 |
I'm having the exact same issues. Hoping the next NVIDIA driver release fixes things. SETI@home classic workunits 103,576 SETI@home classic CPU time 655,753 hours |
Sixkid Joined: 10 Jan 12 Posts: 17 Credit: 8,248,305 RAC: 21 |
4 units of the 22ap11ae series are done, so it's working again. By the way, Keith, thanks for the tip :D Happy crunching. |
Jeff Joined: 8 May 99 Posts: 5 Credit: 98,361,983 RAC: 150 |
Going back to an earlier version driver (431.60) appears to have solved the issue. This is the first time that I remember an Nvidia update causing a problem. Thanks to the suggestions here I was able to install the older driver successfully. |
Joined: 29 Apr 01 Posts: 13164 Credit: 1,160,866,277 RAC: 1,873 |
Deductive reasoning indicated the driver was likely the cause of the issues. Simple fix to prove the case. Pretty sure it is the new integer scaling function in the driver. Nvidia likely didn't test the features outside of testing on the old pixel mapped games. Probably didn't even consider people use their cards for numerical crunching too besides just gaming. Anybody report this problem to Nvidia yet? |
Sixkid Joined: 10 Jan 12 Posts: 17 Credit: 8,248,305 RAC: 21 |
"Deductive reasoning indicated the driver was likely the cause of the issues. Simple fix to prove the case. Pretty sure it is the new integer scaling function in the driver. Nvidia likely didn't test the features outside of testing on the old pixel mapped games. Probably didn't even consider people use their cards for numerical crunching too besides just gaming." Yes, I did :) I told them to contact the people from SETI@home and/or BOINC. |
Jeff Joined: 8 May 99 Posts: 5 Credit: 98,361,983 RAC: 150 |
Thanks for the reminder! Nvidia will not do anything if they are unaware of the problem. I did report the issue originally but will follow up with reports on the new driver. I think you are correct that no one at Nvidia thought that anyone would be using their cards for anything other than video games and cryptomining. It is important that everyone who encounters this issue report it to Nvidia through their feedback page. Hopefully, it will be resolved soon. (But after the second flawed release I am not holding my breath.) |
Joined: 25 Sep 00 Posts: 190 Credit: 23,498,825 RAC: 9 |
Version 436.30 is out today. I wonder if nVidia fixed things.... |
Joined: 29 Apr 01 Posts: 13164 Credit: 1,160,866,277 RAC: 1,873 |
"Version 436.30 is out today. I wonder if nVidia fixed things...." Be the guinea pig and report back. |
Grant (SSSF) Joined: 19 Aug 99 Posts: 13870 Credit: 208,696,464 RAC: 304 |
"Version 436.30 is out today. I wonder if nVidia fixed things...." "Be the guinea pig and report back." No mention of it being fixed in the release notes, but there's also no mention of it in the existing issues notes either. Grant Darwin NT |
Joined: 29 Apr 01 Posts: 13164 Credit: 1,160,866,277 RAC: 1,873 |
I doubt seriously if they even consider any use other than graphics. Unless there is a specific note addressing the failure of the driver with compute loads, I would avoid the drivers. |
Joined: 27 May 99 Posts: 5517 Credit: 528,817,460 RAC: 242 |
"I doubt seriously if they even consider any use other than graphics. Unless there is a specific note addressing the failure of the driver with compute loads, I would avoid the drivers." Remind me, weren't they trying to do real-time analysis of data coming in? Hopefully no one updates the drivers.... |
Joined: 29 Apr 01 Posts: 13164 Credit: 1,160,866,277 RAC: 1,873 |
"I doubt seriously if they even consider any use other than graphics. Unless there is a specific note addressing the failure of the driver with compute loads, I would avoid the drivers." Not sure I understand the question. I believe the problem with the 436 drivers is the experimental integer scaling that was implemented. That feature is meant to make the bit-mapped arcade games of the past look good on 4K monitors. It has nothing to do with compute, except that I think it is messing up compute somehow. |
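
The driver reports scattered through this thread can be gathered into one small lookup. What follows is only a sketch in Python, built from nothing but the versions that posters named above (Windows 10, Nvidia GPUs). The function name `classify_driver` and its labels are hypothetical, and marking untested 436.x releases such as 436.30 as "suspect" is an assumption, since nobody in the thread reported a result for them.

```python
# Driver reports collected from the posts in this thread.
# A version that is not listed here is unknown, not proven safe.
REPORTED_BAD = {"436.02", "436.15"}           # Arecibo GPU tasks stall or restart
REPORTED_OK = {"430.86", "431.36", "431.60"}  # posters rolled back to these


def classify_driver(version: str) -> str:
    """Return 'bad', 'ok', 'suspect', or 'unknown' for a driver version string."""
    version = version.strip()
    if version in REPORTED_BAD:
        return "bad"
    if version in REPORTED_OK:
        return "ok"
    # Assumption: other releases in the 436 branch may share the problem,
    # but no poster confirmed this either way.
    if version.split(".")[0] == "436":
        return "suspect"
    return "unknown"


if __name__ == "__main__":
    for v in ("431.60", "436.15", "436.30", "425.31"):
        print(v, classify_driver(v))
```

Keeping "suspect" separate from "bad" avoids claiming more than the thread shows: only 436.02 and 436.15 were actually reported as failing.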
©2025 University of California