SoG custom application stuck on 1660Ti

Message boards : Number crunching : SoG custom application stuck on 1660Ti

SongBird
Volunteer tester
Joined: 23 Oct 01
Posts: 104
Credit: 157,588,707
RAC: 56,538
Bulgaria
Message 2015293 - Posted: 13 Oct 2019, 14:42:58 UTC

Yesterday I installed a new Nvidia 1660Ti GPU and set up the MB8_win_x86_SSE3_OpenCL_NV_SoG_r3557 app using the Lunatics Win64 v0.45 beta6 installer. It all went well and the GPU started crunching. A couple of hours later I opened GPU-Z to look at the temperatures and noticed that the CPU clock was down and GPU utilization was at 0%. When I opened BOINC it looked like it was working OK, but the estimated time left on the current units was in the days: four or five, or something like that.

I killed everything and switched to the r3584 app from Mike's World. I'm not sure whether the work units were flushed. A couple of hours later the GPU was again at 0% and the units were estimated at 95 days.

I then switched to CUDA50 and have been crunching for more than 20 hours without issues.
The drivers are the latest Game Ready drivers from Nvidia.

Is this behavior familiar to anyone? Is there a point in trying to diagnose this? I would not even know where to start.
ID: 2015293

juan BFP (Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor)
Volunteer tester
Joined: 16 Mar 07
Posts: 8229
Credit: 517,285,655
RAC: 405,946
Panama
Message 2015296 - Posted: 13 Oct 2019, 14:56:59 UTC - in response to Message 2015293.  
Last modified: 13 Oct 2019, 15:04:23 UTC

It's hard to say anything with your computers hidden.

But I imagine it is either because you selected the wrong app for your GPU (you need to select SoG, not CUDA, in the Lunatics installer) or because of the driver you use. FYI, there is an incompatibility between the latest NVidia driver and SETI or other projects.

BTW, asking for help with your hosts hidden is like going to the doctor and not allowing an examination: a nearly impossible task.
ID: 2015296

rob smith (Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor)
Volunteer moderator
Volunteer tester
Joined: 7 Mar 03
Posts: 17936
Credit: 408,777,677
RAC: 32,096
United Kingdom
Message 2015299 - Posted: 13 Oct 2019, 15:02:29 UTC

A couple of other things to consider. First, "game ready" drivers may not have everything required to do the computational work, and there is a "known problem" with at least one recent release; see https://setiathome.berkeley.edu/forum_thread.php?id=84694
Second, it is not unusual to see very extended estimated run times when first installing optimised applications, but it is very unusual to see such a large wrong estimate (a couple of hours would be more normal).

However, as your computers are hidden it is very difficult to be more precise.
Bob Smith
Member of Seti PIPPS (Pluto is a Planet Protest Society)
Somewhere in the (un)known Universe?
ID: 2015299

SongBird
Volunteer tester
Joined: 23 Oct 01
Posts: 104
Credit: 157,588,707
RAC: 56,538
Bulgaria
Message 2015306 - Posted: 13 Oct 2019, 15:23:10 UTC
Last modified: 13 Oct 2019, 15:28:31 UTC

This is the computer in question. I should reiterate that I successfully calculated multiple work units with both r3557 (e.g. this task) and r3584 (e.g. this task) SoG apps. It is just that they got stuck after a relatively short while.

[edit]rob smith, I just got into the thread you linked to and this very much feels like what I'm experiencing. I'll try to roll back my drivers and see what happens. Thanks!
ID: 2015306

Linux? Surely not!
Volunteer tester
Joined: 1 Nov 08
Posts: 7681
Credit: 47,934,736
RAC: 3,066
Sweden
Message 2015308 - Posted: 13 Oct 2019, 15:29:57 UTC - in response to Message 2015306.  
Last modified: 13 Oct 2019, 15:40:58 UTC

This is the computer in question. I should reiterate that I successfully calculated multiple work units with both r3557 (e.g. this task) and r3584 (e.g. this task) SoG apps. It is just that they got stuck after a relatively short while.

You're using driver: 436.48

See: NVidia 436.xx drivers can cause very long compute times especially on Arecibo VHAR work units

Edit: If it ain't broke, don't fix it.
I'm still running driver: 352.86 on my GTX980. Why would I upgrade every time there is a new driver, when the old one works perfectly? Newer drivers will not make the GPU crunch faster, or better.
ID: 2015308

Keith Myers (Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor)
Volunteer tester
Joined: 29 Apr 01
Posts: 10281
Credit: 1,003,873,819
RAC: 1,430,226
United States
Message 2015309 - Posted: 13 Oct 2019, 15:30:07 UTC - in response to Message 2015306.  

This is the computer in question. I should reiterate that I successfully calculated multiple work units with both r3557 (e.g. this task) and r3584 (e.g. this task) SoG apps. It is just that they got stuck after a relatively short while.

As Rob stated, you are running incompatible Windows drivers. Your mistake was installing the latest. The last known good drivers are the 431 series. Any later driver series causes the extended run times you are seeing. Backlevel to the 431 series and all will be well.
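
The rule of thumb above (431 series known good on Windows, later series affected) can be sketched as a small check. This is only an illustration: the function names and the threshold constant are hypothetical, taken from what is said in this thread and not from any official compatibility list.

```python
# Hypothetical helper: the names and the threshold are illustrative
# assumptions based only on the driver series mentioned in this thread.
LAST_KNOWN_GOOD_SERIES = 431


def driver_series(version: str) -> int:
    """Return the major series of a driver version string, e.g. '436.48' -> 436."""
    major, _, _ = version.strip().partition(".")
    return int(major)


def likely_affected(version: str) -> bool:
    """True if the driver belongs to a series newer than the last known good one."""
    return driver_series(version) > LAST_KNOWN_GOOD_SERIES


if __name__ == "__main__":
    for v in ("431.60", "436.30", "436.48"):
        status = "likely affected" if likely_affected(v) else "known-good series or older"
        print(v, status)
```

The installed version string can usually be read with `nvidia-smi --query-gpu=driver_version --format=csv,noheader`, assuming the NVIDIA command-line tools are present on the host.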
Seti@Home classic workunits:20,676 CPU time:74,226 hours
ID: 2015309

SongBird
Volunteer tester
Joined: 23 Oct 01
Posts: 104
Credit: 157,588,707
RAC: 56,538
Bulgaria
Message 2015310 - Posted: 13 Oct 2019, 15:35:06 UTC

Will do. Thank you all!
ID: 2015310

juan BFP (Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor)
Volunteer tester
Joined: 16 Mar 07
Posts: 8229
Credit: 517,285,655
RAC: 405,946
Panama
Message 2015312 - Posted: 13 Oct 2019, 15:38:00 UTC - in response to Message 2015309.  
Last modified: 13 Oct 2019, 15:40:05 UTC

This is the computer in question. I should reiterate that I successfully calculated multiple work units with both r3557 (e.g. this task) and r3584 (e.g. this task) SoG apps. It is just that they got stuck after a relatively short while.

As Rob stated, you are running incompatible Windows drivers. Your mistake was installing the latest. The last known good drivers are the 431 series. Any later driver series causes the extended run times you are seeing. Backlevel to the 431 series and all will be well.

Yes, the driver is incompatible, but take a look at his crunched WU. It says:
setiathome enhanced x41zi (baseline v8), Cuda 5.00

It has been a long time since I used the Windows Lunatics apps, but IIRC the fastest one is SoG, not CUDA. Or has that changed?
ID: 2015312

Keith Myers (Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor)
Volunteer tester
Joined: 29 Apr 01
Posts: 10281
Credit: 1,003,873,819
RAC: 1,430,226
United States
Message 2015320 - Posted: 13 Oct 2019, 17:21:30 UTC

He was running the proper SoG app from the Lunatics installer and all was working well until he updated the drivers. He only switched to the old CUDA50 app when troubleshooting. Once he reverts to a 431 driver he can revert back to the proper SoG app.
Seti@Home classic workunits:20,676 CPU time:74,226 hours
ID: 2015320

Bernie Vine
Volunteer moderator
Volunteer tester
Joined: 26 May 99
Posts: 9873
Credit: 89,243,536
RAC: 94,118
United Kingdom
Message 2015322 - Posted: 13 Oct 2019, 17:30:52 UTC
Last modified: 13 Oct 2019, 17:31:45 UTC

Why would I upgrade every time there is a new driver, when the old one works perfectly? Newer drivers will not make the GPU crunch faster, or better.


Well you may need to update due to security issues.

https://nvidia.custhelp.com/app/answers/detail/a_id/4841/~/security-bulletin%3A-nvidia-gpu-display-driver---august-2019

Anything prior to 431.60 is vulnerable.

But it is probably nothing to worry about.

I am a gamer and I do need the latest drivers, so I have stopped crunching on GPUs on my Windows machines.
ID: 2015322

Ian&Steve C.
Joined: 28 Sep 99
Posts: 1995
Credit: 906,131,227
RAC: 2,732,661
United States
Message 2015323 - Posted: 13 Oct 2019, 17:52:28 UTC - in response to Message 2015322.  
Last modified: 13 Oct 2019, 18:04:34 UTC

Nothing about being a gamer means you "need" the latest drivers. Your systems would be perfectly fine running the 431.60 or 435 drivers (sorry, 435 must be a Linux-only version). Most people are not even at risk from the vulnerabilities without the security patches, and the worst of them requires an attacker to have physical access to the system.
Seti@Home classic workunits: 29,492 CPU time: 134,419 hours

ID: 2015323

tullio
Volunteer tester
Joined: 9 Apr 04
Posts: 7790
Credit: 2,920,534
RAC: 1,221
Italy
Message 2015386 - Posted: 14 Oct 2019, 4:54:41 UTC
Last modified: 14 Oct 2019, 4:56:10 UTC

With 436.48 most, but not all, Arecibo tasks take a very long time on my GTX 1060, while all Green Bank tasks work perfectly. I had to abort a number of Arecibo tasks. Einstein@home tasks work perfectly.
Tullio
ID: 2015386

Wiggo "Democratic Socialist"
Joined: 24 Jan 00
Posts: 17180
Credit: 239,614,739
RAC: 176,194
Australia
Message 2015394 - Posted: 14 Oct 2019, 7:49:35 UTC - in response to Message 2015386.  

With 436.48 most, but not all, Arecibo tasks take a very long time on my GTX 1060, while all Green Bank tasks work perfectly. I had to abort a number of Arecibo tasks. Einstein@home tasks work perfectly.
Tullio

Why not just roll back to the 431.60 driver and avoid aborting work altogether?

Cheers.
ID: 2015394

rob smith (Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor)
Volunteer moderator
Volunteer tester
Joined: 7 Mar 03
Posts: 17936
Credit: 408,777,677
RAC: 32,096
United Kingdom
Message 2015401 - Posted: 14 Oct 2019, 10:04:56 UTC

...and when "rolling back" to an earlier driver version do a "clean" installation - the option is in the "advanced" menu.
Bob Smith
Member of Seti PIPPS (Pluto is a Planet Protest Society)
Somewhere in the (un)known Universe?
ID: 2015401

tullio
Volunteer tester
Joined: 9 Apr 04
Posts: 7790
Credit: 2,920,534
RAC: 1,221
Italy
Message 2015403 - Posted: 14 Oct 2019, 10:22:29 UTC

From a post mortem view, most Arecibo tasks fail after a very short time, so I don't have to abort them. Also GPUGRID tasks work perfectly on 436.48 and they are the most demanding on a GPU according to GPU-Z.
Tullio
ID: 2015403

Grant (SSSF)
Volunteer tester
Joined: 19 Aug 99
Posts: 11858
Credit: 183,564,995
RAC: 232,392
Australia
Message 2015405 - Posted: 14 Oct 2019, 10:36:17 UTC - in response to Message 2015403.  
Last modified: 14 Oct 2019, 10:37:37 UTC

From a post mortem view, most Arecibo tasks fail after a very short time, so I don't have to abort them.

Sorry, but that doesn't make any sense.
Looking at your Task List, there are no WUs (Arecibo or otherwise) that have failed after a short time. All of the WUs that failed, failed due to the driver. All other errors are because you aborted the WU.
Grant
Darwin NT
ID: 2015405

Richard Haselgrove (Project Donor)
Volunteer tester
Joined: 4 Jul 99
Posts: 13286
Credit: 165,025,240
RAC: 215,849
United Kingdom
Message 2015406 - Posted: 14 Oct 2019, 10:40:21 UTC - in response to Message 2015403.  

most Arecibo tasks fail after a very short time
Not 'most'. Most tasks run to completion, in a variable amount of time.

We do get occasional runs of 'noisy' tasks (which exit after a very few seconds), or VHAR tasks (which exit after a half or a third of the normal time), but neither of these is a failure.
ID: 2015406

tullio
Volunteer tester
Joined: 9 Apr 04
Posts: 7790
Credit: 2,920,534
RAC: 1,221
Italy
Message 2015500 - Posted: 15 Oct 2019, 9:51:07 UTC - in response to Message 2015405.  
Last modified: 15 Oct 2019, 10:32:07 UTC

I am using the 436.48 driver because it works perfectly on Einstein@home and GPUGRID. Some of my SETI@home tasks are run as a ScienceUnited user, so I cannot see them since I don't have that password, only a general ScienceUnited password. All Green Bank tasks work flawlessly with that driver.
Tullio
PS
I have completed 426 SETI@home tasks on ScienceUnited with zero failures
GPU hours 107.95
ID: 2015500

Ian&Steve C.
Joined: 28 Sep 99
Posts: 1995
Credit: 906,131,227
RAC: 2,732,661
United States
Message 2015503 - Posted: 15 Oct 2019, 11:52:56 UTC - in response to Message 2015500.  
Last modified: 15 Oct 2019, 11:53:21 UTC

I am using the 436.48 driver because it works perfectly on Einstein@home and GPUGRID. Some of my SETI@home tasks are run as a ScienceUnited user, so I cannot see them since I don't have that password, only a general ScienceUnited password. All Green Bank tasks work flawlessly with that driver.
Tullio
PS
I have completed 426 SETI@home tasks on ScienceUnited with zero failures
GPU hours 107.95


You have 44 Errored tasks on that host. All of the ones I could see info for were Arecibo VHAR tasks.

https://setiathome.berkeley.edu/results.php?hostid=8609032&offset=0&show_names=0&state=6&appid=
Seti@Home classic workunits: 29,492 CPU time: 134,419 hours

ID: 2015503

tullio
Volunteer tester
Joined: 9 Apr 04
Posts: 7790
Credit: 2,920,534
RAC: 1,221
Italy
Message 2015506 - Posted: 15 Oct 2019, 12:08:13 UTC - in response to Message 2015503.  

Thanks. The driver on that host is 436.30, not 436.48 as on the Windows 10 PC, which is not enrolled in Science United.
Tullio
ID: 2015506

©2019 University of California
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.