Flakey AMD/ATI GPUs, including RX 5700 XT, Cross Validating, polluting the Database

Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 13855
Credit: 208,696,464
RAC: 304
Australia
Message 2027289 - Posted: 11 Jan 2020, 11:01:21 UTC
Last modified: 11 Jan 2020, 11:02:09 UTC

One interesting side effect of the change in the number of results required to validate certain tasks: my Inconclusives have gone from around 3% to 17%.
Grant
Darwin NT
ID: 2027289
Profile Recedham

Joined: 20 May 99
Posts: 6
Credit: 1,577,424
RAC: 1
United Kingdom
Message 2027335 - Posted: 11 Jan 2020, 17:34:29 UTC

I know the beta server has run out of tasks, but can someone point me in the right direction on how to use the beta server?
I have a 5700XT and would like to help out.
I'm a Drunk
ID: 2027335
Profile Keith Myers Special Project $250 donor
Volunteer tester
Joined: 29 Apr 01
Posts: 13164
Credit: 1,160,866,277
RAC: 1,873
United States
Message 2027338 - Posted: 11 Jan 2020, 18:01:06 UTC - in response to Message 2027335.  

You join Beta by opening the Tools menu in the Manager and choosing Add Project. Then enter this URL in the Project URL box:
https://setiweb.ssl.berkeley.edu/beta/
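
For hosts managed from the command line, the same attach step can be sketched with BOINC's `boinccmd` tool. This is a minimal sketch, not project-official guidance: it assumes `boinccmd` is on the PATH and that you already have a Beta account key. The helper names `build_attach_command` and `attach` and the placeholder key are hypothetical. The example only builds and prints the command so it can be inspected before anything is run.

```python
import subprocess

BETA_URL = "https://setiweb.ssl.berkeley.edu/beta/"


def build_attach_command(account_key, url=BETA_URL):
    """Return the boinccmd argument list that attaches this host to a project."""
    if not account_key:
        raise ValueError("an account key is required")
    # boinccmd --project_attach takes the project URL followed by the account key.
    return ["boinccmd", "--project_attach", url, account_key]


def attach(account_key, url=BETA_URL):
    """Run the attach command; raises CalledProcessError if boinccmd fails."""
    subprocess.run(build_attach_command(account_key, url), check=True)


if __name__ == "__main__":
    # Placeholder key for illustration only; nothing is executed here.
    print(" ".join(build_attach_command("YOUR_ACCOUNT_KEY")))
```

Calling `attach(...)` with a real key would perform the attach; the Manager route described above does the same thing through the GUI.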

Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 2027338
Profile Recedham

Joined: 20 May 99
Posts: 6
Credit: 1,577,424
RAC: 1
United Kingdom
Message 2027351 - Posted: 11 Jan 2020, 20:04:57 UTC - in response to Message 2027338.  

Thank you.
Done, that was easy :-)
I'm a Drunk
ID: 2027351
Profile Keith Myers Special Project $250 donor
Volunteer tester
Joined: 29 Apr 01
Posts: 13164
Credit: 1,160,866,277
RAC: 1,873
United States
Message 2027356 - Posted: 11 Jan 2020, 20:14:17 UTC - in response to Message 2027351.  

[quote]Thank you.
Done, that was easy :-)[/quote]

Great. And thank you for being a Beta tester. The sooner we can prove out that the 8.24 app solves the AMD problem with the Navi cards, the sooner the app can be brought to Main for the masses.
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 2027356
Ian&Steve C.
Joined: 28 Sep 99
Posts: 4267
Credit: 1,282,604,591
RAC: 6,640
United States
Message 2027358 - Posted: 11 Jan 2020, 20:19:34 UTC - in response to Message 2027356.  

[quote][quote]Thank you.
Done, that was easy :-)[/quote]
Great. And thank you for being a Beta tester. The sooner we can prove out that the 8.24 app solves the AMD problem with the Navi cards, the sooner the app can be brought to Main for the masses.[/quote]


and hopefully, all of the people who don't monitor forums or private messages will eventually update their drivers to the new version that is supposed to work. I think both are required for the fix to be effective (app+driver).
Seti@Home classic workunits: 29,492 CPU time: 134,419 hours

ID: 2027358
Profile Recedham

Joined: 20 May 99
Posts: 6
Credit: 1,577,424
RAC: 1
United Kingdom
Message 2027366 - Posted: 11 Jan 2020, 21:02:49 UTC - in response to Message 2027356.  

All I need now are Work Units.....
I'm a Drunk
ID: 2027366
Profile Wiggo
Joined: 24 Jan 00
Posts: 36876
Credit: 261,360,520
RAC: 489
Australia
Message 2027393 - Posted: 12 Jan 2020, 0:25:14 UTC

Another 3 have joined the list.

aplrapid 7807183
Leigh Green 123169
Vytautas Liesis 173783

Cheers.
ID: 2027393
Profile Keith Myers Special Project $250 donor
Volunteer tester
Joined: 29 Apr 01
Posts: 13164
Credit: 1,160,866,277
RAC: 1,873
United States
Message 2027400 - Posted: 12 Jan 2020, 0:57:54 UTC - in response to Message 2027358.  

[quote]I think both are required for the fix to be effective (app+driver).[/quote]

A new app wasn't needed at Einstein. Only the new driver was necessary for the 5700XT cards to finally start reporting valid work when paired with a card other than another Navi.

Guessing that the Einstein app didn't implement that iffy compiler flag that Eric had to remove on our app.
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 2027400
Eric Korpela Project Donor
Volunteer moderator
Project administrator
Project developer
Project scientist
Joined: 3 Apr 99
Posts: 1382
Credit: 54,506,847
RAC: 60
United States
Message 2027409 - Posted: 12 Jan 2020, 1:39:41 UTC - in response to Message 2027400.  

Of course, the day I can't be near the computer is the day the beta splitter freezes. I'm copying in new data and unfreezing it.
@SETIEric@qoto.org (Mastodon)

ID: 2027409
Stephen "Heretic" Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 20 Sep 12
Posts: 5557
Credit: 192,787,363
RAC: 628
Australia
Message 2027416 - Posted: 12 Jan 2020, 3:32:01 UTC - in response to Message 2027409.  

[quote]Of course, the day I can't be near the compu[/quote]
. . We all have to learn to live with Murphy's Law ... no matter how much it sucks :(

Stephen

<shrug>
ID: 2027416
Profile Jord
Volunteer tester
Joined: 9 Jun 99
Posts: 15184
Credit: 4,362,181
RAC: 3
Netherlands
Message 2027441 - Posted: 12 Jan 2020, 13:41:47 UTC

The warning is still spreading:
https://www.hardwaretimes.com/opencl-compute-broken-on-amds-navi-cards-since-launch-all-radeon-rx-5700-5700-xt-gpus-affected/
What’s worse is that many Databases aren’t able to detect these invalid results, leading to possible corruption of OpenCL based Databases. In case you’re running an RX 5700 series card, kindly refrain from using it for OpenCL Compute. This includes Online Science Databases such as SETI, Geekbench, etc. We’ll let you know as soon as AMD fixes this issue.

https://www.reddit.com/r/Amd/comments/eblv26/psa_please_remove_your_amd_rx5700xt_from_setihome/
PSA: Please remove your AMD RX5700/XT from SETI@home now.

This issue has been ongoing since the RX5700 and RX5700XT was released. OpenCL compute is broken on these GPUs (unclear at this time if it’s hardware or drivers related) and they produce invalid results.

The problem is that these RX5700s are cross validating their incorrect results with each other on occasion. If left unchecked this has serious implications for the integrity of the science database.


https://www.pcgamesn.com/amd/radeon-rx-5700-xt-seti-home-aliens
AMD’s RX 5700-series graphics cards are turning up incorrect results in SETI@home, a crowd-supported “volunteer computing” experiment searching the vast emptiness of space for signs of extraterrestrial intelligence.

https://www.eteknix.com/seti-may-ban-amd-5700-gpus-from-searching-for-aliens/
In a report via TechPowerUp, it seems that the community is currently debating whether owners of AMD 5700 graphics cards can participate.
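
Why the cross-validation described in the reddit PSA matters can be illustrated with a toy quorum check. This is only a sketch under stated assumptions, not the project's actual validator: the function `validate`, the host labels, and the signal counts are hypothetical, and plain equality stands in for the real, more tolerant comparison of results. It shows how two hosts that share the same fault can agree with each other and form a quorum on a wrong answer.

```python
from collections import Counter


def validate(results, min_quorum=2):
    """Return the value that at least `min_quorum` hosts agree on, or None.

    `results` is a list of (host_label, reported_value) pairs. Equality stands
    in for the real comparison of science results.
    """
    if not results:
        return None
    counts = Counter(value for _, value in results)
    value, votes = counts.most_common(1)[0]
    return value if votes >= min_quorum else None


# Hypothetical values: 7 is the correct signal count, 30 is the faulty one.
mixed_pair = [("navi_a", 30), ("cpu_host", 7)]
faulty_pair = [("navi_a", 30), ("navi_b", 30)]
with_tiebreak = mixed_pair + [("nvidia_host", 7)]

print(validate(mixed_pair))     # None: the results disagree, so inconclusive
print(validate(faulty_pair))    # 30: two faulty hosts validate each other
print(validate(with_tiebreak))  # 7: a third result settles the disagreement
```

The middle case is the one the thread title describes: nothing in a plain agreement check distinguishes a shared wrong answer from a right one.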
ID: 2027441
Ian&Steve C.
Joined: 28 Sep 99
Posts: 4267
Credit: 1,282,604,591
RAC: 6,640
United States
Message 2027447 - Posted: 12 Jan 2020, 14:37:56 UTC - in response to Message 2027441.  

I was the one who made that PSA post on reddit. I knew it got good visibility on the AMD sub, but I had no idea that people wrote articles about it.
Seti@Home classic workunits: 29,492 CPU time: 134,419 hours

ID: 2027447
Profile Bill Special Project $75 donor
Volunteer tester
Joined: 30 Nov 05
Posts: 282
Credit: 6,916,194
RAC: 60
United States
Message 2027454 - Posted: 12 Jan 2020, 15:28:32 UTC - in response to Message 2027447.  

[quote]I was the one who made that PSA post on reddit. I knew it got good visibility on the AMD sub, but I had no idea that people wrote articles about it.[/quote]
It is good that the warning spread, but once beta proves it works, articles need to be posted to say it has been fixed!
Seti@home classic: 1,456 results, 1.613 years CPU time
ID: 2027454
Ian&Steve C.
Joined: 28 Sep 99
Posts: 4267
Credit: 1,282,604,591
RAC: 6,640
United States
Message 2027485 - Posted: 12 Jan 2020, 18:59:31 UTC

I was looking at StrayCat’s results over at beta, to get an idea of relative performance.

Judging by the WU times compared to the cards it’s being stacked up against on the validated tasks, it looks similar in performance to a GTX 1060 or so. That seems a bit disappointing; I think it should be about 2x that.

Can anyone link to other 5700 systems at beta so I can see if others are having the same level of performance, or if it’s just his system?

StrayCat, can you comment on GPU load percentage? I wonder if, since BOINC is only reporting half the CU count, it’s only using half of the card or something. Or maybe it’s the removal of the compiler parameter or the driver itself that’s hampering performance.
Seti@Home classic workunits: 29,492 CPU time: 134,419 hours

ID: 2027485
rob smith Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer moderator
Volunteer tester

Joined: 7 Mar 03
Posts: 22541
Credit: 416,307,556
RAC: 380
United Kingdom
Message 2027486 - Posted: 12 Jan 2020, 19:04:55 UTC

The performance hit may be real, and a direct result of Eric's comment about having had to turn off one of the optimization flags.
Bob Smith
Member of Seti PIPPS (Pluto is a Planet Protest Society)
Somewhere in the (un)known Universe?
ID: 2027486
Eric Korpela Project Donor
Volunteer moderator
Project administrator
Project developer
Project scientist
Joined: 3 Apr 99
Posts: 1382
Credit: 54,506,847
RAC: 60
United States
Message 2027496 - Posted: 12 Jan 2020, 19:27:21 UTC - in response to Message 2027454.  

So far, so good:

1219 results have been through the validator
1158 (95%) have validated
35 (2.9%) are inconclusive and may validate later
26 (2.1%) are errors, but don't appear to be computation errors.

If things continue this way I will probably put the new app versions on the main projects tomorrow or Tuesday.
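
As a quick check of the percentages above, the shares can be recomputed directly from the counts. A minimal sketch: the variable names are illustrative, and only the four counts come from the post.

```python
# Counts reported in the post above.
total = 1219
outcomes = {"validated": 1158, "inconclusive": 35, "error": 26}

# The three outcomes should account for every result.
assert sum(outcomes.values()) == total

shares = {name: 100 * count / total for name, count in outcomes.items()}
for name, share in shares.items():
    print(f"{name}: {share:.1f}%")
```

The three counts add up to the 1219 total, and the computed shares round to the 95%, 2.9% and 2.1% quoted.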
@SETIEric@qoto.org (Mastodon)

ID: 2027496
TBar
Volunteer tester

Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 2027497 - Posted: 12 Jan 2020, 19:28:57 UTC - in response to Message 2027486.  

If you compare the CPU times you will see StrayCat's are Very Low, https://setiweb.ssl.berkeley.edu/beta/results.php?hostid=89082&offset=20
It appears StrayCat has his CPU setting at 100%. Others have a Much Higher CPU Time, and Much better Run times, https://setiweb.ssl.berkeley.edu/beta/results.php?hostid=89092
I'd say 90% of the people at SETI are Strangling their GPUs by Not reducing their CPU settings to below 100% when running GPU tasks.
It certainly doesn't help to have some people going around telling users to set their CPU setting to 100%...and then make an app_config to lower the number of tasks.
What usually happens is the User can do the first part, raise the CPU setting to 100%, BUT, fails to make the app_config because it's more difficult. That leaves the GPU being Strangled for CPU time, and producing poor results.
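
For reference, the app_config step mentioned above could look roughly like the file this sketch generates. It is an illustration under assumptions, not a recommended setting: the app name `setiathome_v8`, the values chosen, and the helper `build_app_config` are assumptions for the example, and the resulting `app_config.xml` would have to be saved in the project's folder inside the BOINC data directory. The elements used (`max_concurrent`, `gpu_versions`, `gpu_usage`, `cpu_usage`) are taken from BOINC's app_config.xml format; check the BOINC documentation before relying on them.

```python
import xml.etree.ElementTree as ET


def build_app_config(app_name, max_concurrent, gpu_usage=1.0, cpu_usage=1.0):
    """Build an app_config.xml tree that caps tasks and budgets CPU per GPU task."""
    root = ET.Element("app_config")
    app = ET.SubElement(root, "app")
    ET.SubElement(app, "name").text = app_name
    # Upper limit on simultaneously running tasks of this app.
    ET.SubElement(app, "max_concurrent").text = str(max_concurrent)
    gpu = ET.SubElement(app, "gpu_versions")
    # Fraction of a GPU each task uses (1.0 means one task per GPU).
    ET.SubElement(gpu, "gpu_usage").text = str(gpu_usage)
    # Fraction of a CPU core budgeted for each GPU task.
    ET.SubElement(gpu, "cpu_usage").text = str(cpu_usage)
    return root


if __name__ == "__main__":
    config = build_app_config("setiathome_v8", max_concurrent=4)
    ET.indent(config)  # pretty-print; needs Python 3.9+
    print(ET.tostring(config, encoding="unicode"))
```

Budgeting a full core per GPU task (`cpu_usage` of 1.0) is one way to keep the GPU from being starved; simply lowering the CPU setting below 100% in the preferences, as described above, avoids the file altogether.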
ID: 2027497
Profile Jord
Volunteer tester
Joined: 9 Jun 99
Posts: 15184
Credit: 4,362,181
RAC: 3
Netherlands
Message 2027498 - Posted: 12 Jan 2020, 19:33:43 UTC
Last modified: 12 Jan 2020, 20:01:56 UTC

Okay, running my RX 5700 XT at Beta. Got an AP.
Keep an eye on it via https://setiweb.ssl.berkeley.edu/beta/results.php?hostid=89104
(freed one core for the GPU)

Edit: an AP in 9m37s? Wow.
Second AP in 9m09s.
Third AP in 9m01s.
ID: 2027498
Ian&Steve C.
Joined: 28 Sep 99
Posts: 4267
Credit: 1,282,604,591
RAC: 6,640
United States
Message 2027502 - Posted: 12 Jan 2020, 20:01:56 UTC - in response to Message 2027498.  
Last modified: 12 Jan 2020, 20:08:21 UTC

“Wow” fast? Or “Wow” Slow?

My 2070s routinely run them about that fast (sometimes faster, sometimes slower), when running 2 WUs at a time.
Seti@Home classic workunits: 29,492 CPU time: 134,419 hours

ID: 2027502


 
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.