Flakey AMD/ATI GPUs, including RX 5700 XT, Cross Validating, polluting the Database

NRauer
Volunteer tester

Joined: 12 Aug 16
Posts: 3
Credit: 4,288,796
RAC: 0
United States
Message 2031804 - Posted: 10 Feb 2020, 15:38:42 UTC - in response to Message 2027520.  

Hey StrayCat! Completely unrelated to this thread, but I was wondering if you could share your app_config.xml file with me. I am an old-time member starting back up again. I am assuming you use Lunatics as well? I looked through your computers and it looks like you have things dialed in quite well.

Thanks in advance!
NRauer
ID: 2031804
Wiggo
Joined: 24 Jan 00
Posts: 36790
Credit: 261,360,520
RAC: 489
Australia
Message 2031874 - Posted: 10 Feb 2020, 20:54:34 UTC - in response to Message 2031804.  

Hey StrayCat! Completely unrelated to this thread, but I was wondering if you could share your app_config.xml file with me. I am an old-time member starting back up again. I am assuming you use Lunatics as well? I looked through your computers and it looks like you have things dialed in quite well.

Thanks in advance!
NRauer
One thing: update the driver on your 1070 to 442.19 and that will stop the problems with Arecibo VHAR work units. ;-)

Cheers.
ID: 2031874
Mr. Kevvy
Volunteer moderator
Volunteer tester
Joined: 15 May 99
Posts: 3806
Credit: 1,114,826,392
RAC: 3,319
Canada
Message 2032431 - Posted: 14 Feb 2020, 20:33:13 UTC - in response to Message 2032428.  

higemayuge is already in the list. Guess I'll send a reminder. :^)
ID: 2032431
rob smith
Volunteer moderator
Volunteer tester
Joined: 7 Mar 03
Posts: 22529
Credit: 416,307,556
RAC: 380
United Kingdom
Message 2032432 - Posted: 14 Feb 2020, 20:34:33 UTC

That's the bad news - the good news (if it really can be called good news) is that it is now down to 2 tasks in progress.
Bob Smith
Member of Seti PIPPS (Pluto is a Planet Protest Society)
Somewhere in the (un)known Universe?
ID: 2032432
Lazydude
Volunteer tester
Joined: 17 Jan 01
Posts: 45
Credit: 96,158,001
RAC: 136
Sweden
Message 2032538 - Posted: 15 Feb 2020, 17:08:06 UTC

https://setiathome.berkeley.edu/workunit.php?wuid=3885927696
two nvidia got robbed against two error 9 cards
ID: 2032538
Jord
Volunteer tester
Joined: 9 Jun 99
Posts: 15184
Credit: 4,362,181
RAC: 3
Netherlands
Message 2033184 - Posted: 20 Feb 2020, 16:35:50 UTC

I updated AMD drivers to 20.2.1, via the "factory reset" method... it's what their new "clean install" is called.
Noticed in the Win10 reliability history that the previous AMD drivers crashed multiple times a day: while just sitting on the desktop, in games and in BOINC.
So will keep an eye on that now.
ID: 2033184
Keith Myers
Volunteer tester
Joined: 29 Apr 01
Posts: 13164
Credit: 1,160,866,277
RAC: 1,873
United States
Message 2033185 - Posted: 20 Feb 2020, 16:44:30 UTC - in response to Message 2033184.  

Read an article and editorial expounding on the lack of reliability of the current AMD drivers with their latest products. There seems to be a lot of forum traffic about Win10 simply crashing while sitting idle on the desktop. It seems they rushed the product to market and didn't apply the necessary resources to make the drivers work correctly.
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 2033185
Jord
Volunteer tester
Joined: 9 Jun 99
Posts: 15184
Credit: 4,362,181
RAC: 3
Netherlands
Message 2033225 - Posted: 20 Feb 2020, 21:52:05 UTC - in response to Message 2033184.  

Newer drivers are validating normally against other (non-AMD GPU) computers.
ID: 2033225
Keith Myers
Volunteer tester
Joined: 29 Apr 01
Posts: 13164
Credit: 1,160,866,277
RAC: 1,873
United States
Message 2033228 - Posted: 20 Feb 2020, 22:06:09 UTC - in response to Message 2033225.  

Newer drivers are validating normally against other (non-AMD GPU) computers.

That's a good thing. The installed base of AMD users is slowly migrating to the newest drivers, reducing the original issue to a minimum.
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 2033228
jonah
Joined: 4 Mar 18
Posts: 2
Credit: 118,652
RAC: 0
Canada
Message 2033671 - Posted: 23 Feb 2020, 17:01:08 UTC

Even my RX 590 puts out a few errors now and again. Since this is my play machine, I'm not going to switch the drivers to compute within the AMD Radeon software. Maybe the next batch of drivers will fix this issue.
ID: 2033671
Jord
Volunteer tester
Joined: 9 Jun 99
Posts: 15184
Credit: 4,362,181
RAC: 3
Netherlands
Message 2033994 - Posted: 26 Feb 2020, 7:46:00 UTC
Last modified: 26 Feb 2020, 7:46:33 UTC

Just checking, but any driver prior to the 2020.1.1 version is broken, right?
Because of instability problems in Ghost Recon Breakpoint (the game will start a horrible flicker on my monitors after an hour and requires a reboot), I have to return to 19.11.2 for now. I understand it if I can no longer run Seti then. Have no work in progress anyway.
ID: 2033994
Keith Myers
Volunteer tester
Joined: 29 Apr 01
Posts: 13164
Credit: 1,160,866,277
RAC: 1,873
United States
Message 2033995 - Posted: 26 Feb 2020, 7:57:04 UTC - in response to Message 2033994.  
Last modified: 26 Feb 2020, 7:58:07 UTC

I saw a YT video explaining how to install ONLY the required AMD drivers and not the entire AMD official package.

It was emphatically explained that the AMD package is totally borked and causes instability and crashes. If you get rid of everything except the minimal drivers, the drivers and cards run well.

https://www.youtube.com/watch?v=keURVVjApY4
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 2033995
Jord
Volunteer tester
Joined: 9 Jun 99
Posts: 15184
Credit: 4,362,181
RAC: 3
Netherlands
Message 2034000 - Posted: 26 Feb 2020, 8:25:40 UTC - in response to Message 2033995.  
Last modified: 26 Feb 2020, 9:11:03 UTC

I've just been stuck in a perpetual Windows 10 Safe Mode reboot cycle, until I tried out my ancient old Hotmail password... which worked. Needed Safe Mode to get rid of the 2020 drivers as their uninstall had b0rked. How difficult to get places, Microsoft!

Edit: the 'buggy' stuff is what I like so much, ReLive's screen shot making, or screen recording.
This system is for gaming first, BOINC second. Seti runs on my low power Android devices, that's fine with me. Once in a blue weekend I run Seti on my RX 5700 XT, just not now.

19.11.2 gives its own problems (the mouse gets stuck at times, and in GRB there are all kinds of artifacts on screen).
I ran Kombustor for a good 20 minutes, no artifacts there.
ID: 2034000
4-C
Joined: 19 Jun 05
Posts: 6
Credit: 4,069,157
RAC: 6
United States
Message 2034202 - Posted: 27 Feb 2020, 15:53:33 UTC - in response to Message 2033228.  

I have seen various reports of AMD driver issues but don't know how much credit to give them. My system uses both an AMD CPU, a 3900X, and an AMD GPU, a 5700. The system was built just about a week after the hardware was released, and while I did have some initial problems to work through, it has been very stable since. Some people on the internet have strong opinions and ulterior motives. Shocking, I know. When you read through the comment threads on tech sites, they are littered with Nvidia people screaming about AMD's flaws and shortcomings, and vice versa. The only thing I can say is my system has been very stable.
ID: 2034202
4-C
Joined: 19 Jun 05
Posts: 6
Credit: 4,069,157
RAC: 6
United States
Message 2034222 - Posted: 27 Feb 2020, 18:45:17 UTC - in response to Message 2034210.  

Why yes, Grumpy, there are invalid results returned. There are also far more valid results, and the system runs well and doesn't crash. You seem determined to live up to your nom de plume. I wish you well.
ID: 2034222
Ian&Steve C.
Joined: 28 Sep 99
Posts: 4267
Credit: 1,282,604,591
RAC: 6,640
United States
Message 2034238 - Posted: 27 Feb 2020, 19:31:00 UTC - in response to Message 2034222.  
Last modified: 27 Feb 2020, 19:32:44 UTC

Why yes, Grumpy, there are invalid results returned. There are also far more valid results, and the system runs well and doesn't crash. You seem determined to live up to your nom de plume. I wish you well.


looking at your "valid" tasks, it's clear that something isn't right here. your tasks are being validated even when you are returning different/incorrect results from the wingmen.

Examples
https://setiathome.berkeley.edu/workunit.php?wuid=3905271906
you:
Spike count:    6
Autocorr count: 0
Pulse count:    8
Triplet count:  7
Gaussian count: 0
wingmen:
Spike count:    5
Autocorr count: 0
Pulse count:    11
Triplet count:  5
Gaussian count: 0

https://setiathome.berkeley.edu/workunit.php?wuid=3902795339
you:
Spike count:    23
Autocorr count: 0
Pulse count:    6
Triplet count:  1
Gaussian count: 0
wingmen:
Spike count:    22
Autocorr count: 0
Pulse count:    7
Triplet count:  1
Gaussian count: 0

https://setiathome.berkeley.edu/workunit.php?wuid=3905067362
you:
Spike count:    6
Autocorr count: 2
Pulse count:    1
Triplet count:  0
Gaussian count: 0
wingmen:
Spike count:    6
Autocorr count: 2
Pulse count:    1
Triplet count:  2
Gaussian count: 0
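
For anyone who wants to see the mismatches at a glance, here is a minimal Python sketch that diffs the counts quoted above. The helper name count_differences is made up for illustration, and comparing bare counts is only a rough screen: it is an assumption that a count mismatch is a useful red flag, and it is not how the project's validator decides anything.

```python
# Signal counts copied from the three workunits quoted above.
# "you" is the host under discussion, "wingmen" the counts reported for the others.
EXAMPLES = {
    3905271906: {
        "you":     {"Spike": 6, "Autocorr": 0, "Pulse": 8, "Triplet": 7, "Gaussian": 0},
        "wingmen": {"Spike": 5, "Autocorr": 0, "Pulse": 11, "Triplet": 5, "Gaussian": 0},
    },
    3902795339: {
        "you":     {"Spike": 23, "Autocorr": 0, "Pulse": 6, "Triplet": 1, "Gaussian": 0},
        "wingmen": {"Spike": 22, "Autocorr": 0, "Pulse": 7, "Triplet": 1, "Gaussian": 0},
    },
    3905067362: {
        "you":     {"Spike": 6, "Autocorr": 2, "Pulse": 1, "Triplet": 0, "Gaussian": 0},
        "wingmen": {"Spike": 6, "Autocorr": 2, "Pulse": 1, "Triplet": 2, "Gaussian": 0},
    },
}


def count_differences(mine, theirs):
    """Hypothetical helper: return {signal_type: (mine, theirs)} where counts differ."""
    return {k: (mine[k], theirs[k]) for k in mine if mine[k] != theirs[k]}


for wuid, counts in EXAMPLES.items():
    diffs = count_differences(counts["you"], counts["wingmen"])
    print(wuid, diffs if diffs else "counts match")
```

Run as-is, it prints one line per workunit listing only the signal types where the two sets of counts disagree.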



you should update your drivers to the ones that actually work (you need any driver after version 20.1.1 i believe). if you are already using the proper fixed drivers, you need to figure out what is wrong with your system that it's producing incorrect results. maybe overheating or something else. either way you should stop SETI processing again until it's fixed.
Seti@Home classic workunits: 29,492 CPU time: 134,419 hours

ID: 2034238
Richard Haselgrove
Volunteer tester
Joined: 4 Jul 99
Posts: 14679
Credit: 200,643,578
RAC: 874
United Kingdom
Message 2034245 - Posted: 27 Feb 2020, 19:42:06 UTC - in response to Message 2034238.  

All three of those examples have been processed by three computers, not the normal two. Provided two of them agree to the required scientific accuracy ("strongly similar"), the WU as a whole will be accepted. The third partner is then given an easy ride, and allowed to pass at a much lower threshold ("weakly similar").

But in reality, all the electricity and internet bandwidth used by that third computer is utterly wasted: the WU could have been finalised by the two true scientists on their own.

In none of the three example cases was the result produced by computer 8778135 chosen as the canonical scientific result.
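
To make the two-tier rule above concrete, here is a toy Python sketch. Everything specific in it is an assumption for illustration: the similarity score (fraction of signal types with identical counts), the STRONG and WEAK thresholds, the function names, and the idea that both wingmen reported identical counts. The real validator's criteria are not given in this thread, so treat this as a picture of the logic, not the project's code.

```python
from itertools import combinations

# Hypothetical thresholds on a toy similarity score (assumptions, not project values).
STRONG = 1.0   # two results must reach this to form a quorum
WEAK = 0.5     # any remaining result only has to reach this to be granted "valid"


def similarity(a, b):
    """Toy metric: fraction of signal types whose counts are identical."""
    keys = set(a) | set(b)
    if not keys:
        return 1.0
    return sum(a.get(k, 0) == b.get(k, 0) for k in keys) / len(keys)


def validate(results):
    """results maps host -> counts. Returns (canonical_host, verdicts).

    The first strongly similar pair sets the canonical result; every host is
    then judged against it at the weaker threshold. If no pair agrees
    strongly, nothing is decided and (None, {}) is returned.
    """
    for h1, h2 in combinations(results, 2):
        if similarity(results[h1], results[h2]) >= STRONG:
            verdicts = {
                host: "valid" if similarity(results[h1], r) >= WEAK else "invalid"
                for host, r in results.items()
            }
            return h1, verdicts
    return None, {}


# Counts from workunit 3905067362 as quoted earlier in the thread.
wingman = {"Spike": 6, "Autocorr": 2, "Pulse": 1, "Triplet": 2, "Gaussian": 0}
host = {"Spike": 6, "Autocorr": 2, "Pulse": 1, "Triplet": 0, "Gaussian": 0}
results = {"wingman_a": wingman, "wingman_b": dict(wingman), "host_8778135": host}

print(validate(results))
```

With these made-up thresholds the third host passes as "valid" without ever being a candidate for the canonical result, which is the easy ride described above. With only two results that disagree, no quorum forms at all.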
ID: 2034245
Ian&Steve C.
Joined: 28 Sep 99
Posts: 4267
Credit: 1,282,604,591
RAC: 6,640
United States
Message 2034256 - Posted: 27 Feb 2020, 20:00:21 UTC - in response to Message 2034245.  
Last modified: 27 Feb 2020, 20:28:13 UTC

All three of those examples have been processed by three computers, not the normal two. Provided two of them agree to the required scientific accuracy ("strongly similar"), the WU as a whole will be accepted. The third partner is then given an easy ride, and allowed to pass at a much lower threshold ("weakly similar").

But in reality, all the electricity and internet bandwidth used by that third computer is utterly wasted: the WU could have been finalised by the two true scientists on their own.

In none of the three example cases was the result produced by computer 8778135 chosen as the canonical scientific result.


at least the first one, he was actually the second person to submit his result, before my system did. and that result specifically looks worse than "weakly similar" to me. but that's just a first glance at the numbers, not a mathematical comparison.

the main point of my post was to explain why he's getting all these "valid" results that really aren't valid. he needs to fix his system, most likely by updating to the latest drivers.

whatever settings were changed server side to have initial replication of 3 for ati cards needs to be reversed now that there are working drivers. so that hosts like this will get majority invalids and realize they have to fix their system
Seti@Home classic workunits: 29,492 CPU time: 134,419 hours

ID: 2034256
Richard Haselgrove
Volunteer tester
Joined: 4 Jul 99
Posts: 14679
Credit: 200,643,578
RAC: 874
United Kingdom
Message 2034264 - Posted: 27 Feb 2020, 20:18:42 UTC - in response to Message 2034256.  
Last modified: 27 Feb 2020, 20:20:42 UTC

whatever settings were changed server side to have initial replication of 3 for ati cards needs to be reversed now that there are working drivers. so that hosts like this will get majority invalids and realize they have to fix their system
I think that "initial replication" is a misnomer on the web page. I think you'll find that the 'initial' replication was indeed 2 (look at some of his 'in progress' tasks for confirmation), but the figure was upped to three when the first validation was inconclusive and a check task was created. It should be labelled 'current replication', or somesuch.
ID: 2034264
Ian&Steve C.
Joined: 28 Sep 99
Posts: 4267
Credit: 1,282,604,591
RAC: 6,640
United States
Message 2034267 - Posted: 27 Feb 2020, 20:27:33 UTC - in response to Message 2034264.  

im still not sure why he got credit for it then. i've seen plenty of WUs invalidate for less.
Seti@Home classic workunits: 29,492 CPU time: 134,419 hours

ID: 2034267

©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.