Flakey AMD/ATI GPUs, including RX 5700 XT, Cross Validating, polluting the Database
Author | Message |
---|---|
Keith Myers Joined: 29 Apr 01 Posts: 13164 Credit: 1,160,866,277 RAC: 1,873 |
I get burned regularly too by the 5700's being used on the project. https://setiathome.berkeley.edu/workunit.php?wuid=3779924575 https://setiathome.berkeley.edu/workunit.php?wuid=3781287501 https://setiathome.berkeley.edu/workunit.php?wuid=3781287513 https://setiathome.berkeley.edu/workunit.php?wuid=3780346096 Seti@Home classic workunits:20,676 CPU time:74,226 hours A proud member of the OFA (Old Farts Association) |
rob smith Joined: 7 Mar 03 Posts: 22535 Credit: 416,307,556 RAC: 380 |
If one looks at the task summary for any of the computers that "burned" Keith, one would see that they have very high "invalid" scores - yet another case for counting "invalid" as "error", as then such computers would have substantially fewer tasks per day. However, such a step might, in the short term, affect Keith, but given his very high "valid" rate I guess he would escape the trap within the hour. Bob Smith Member of Seti PIPPS (Pluto is a Planet Protest Society) Somewhere in the (un)known Universe? |
Mr. Kevvy Joined: 15 May 99 Posts: 3806 Credit: 1,114,826,392 RAC: 3,319 |
Because all the ones I posted earlier went away, I will repost these in the hope that they do as well. Here are the cross-validated work units so far posted by others that are still in the database: Here are more from me: And here are some I found when I went hunting them: 3784413112, 3784300965, 3784224221, 3784267733 is the only work unit I have seen where the RX 5700 validated against a different platform (NVidia CUDA.) This is because it was a -9 overflow; they merely both agreed it could not complete. And here are the participants I PMed (many thanks to Wiggo for providing a bunch of these names) advising that they have cards producing nothing but bad results (just so they don't get bothered twice I hope): [AfZ]TomServo1 1483720 achimbln 138625 antoi 10856207 aridhol 10288747 Baldarov 9438496 Borktron 10682716 Brandon 8198367 calendir 9663884 Camiron 7449359 Carl 914781 Christopher 9894096 CoffeeSloth 10266313 Crisu 7833612 Derrek 219419 dsharbour 10858679 Dzsozi 8002127 Earendil 146007 egon.sauter 494566 eryndel 10878567 Foaming Mad Cow Industries 219464 fred 1935325 ghostbuster 564989 Haiko_N 9198068 HawkMedic 10838738 higemayuge 10790664 HMZ 9079227 Jeff 10639246 Jeffrey A. Smith 38247 Jerjes 1291426 Jorge Barrera 9650295 Juraxell 10864786 Kekke 46817 lastsworder 10878688 lupaslupas 10002927 Maulwurf 1516335 MaximusPrometheus 10240426 mnelsonx 272885 Niflhuem 113140 No Name@Extraterrestrial Intelligence 8116 Oriah 9838773 Otosan 8547502 PantherJon 9801065 Peter Furlong 7965665 phoenix7477 10773411 rgeens 10740140 Rocky 270621 Saint123 159425 Stephen Diem 36679 stogdan 10865456 StrayCat 177967 Strickland 34273 suhail ahmad 9878177 T66 3336343 toby 9442798 Tomik 8972653 Trezy 10367889 Tristan 9778349 VMS Software Inc 45538 xakei 10823091 Zac 100334866 |
betreger Joined: 29 Jun 99 Posts: 11416 Credit: 29,581,041 RAC: 66 |
My very stable Nvidia host got this invalid. My 2 wingmen were both AMD hosts with mostly invalid results, but they matched up to each other. https://setiathome.berkeley.edu/workunit.php?wuid=3780408798 |
Zalster Joined: 27 May 99 Posts: 5517 Credit: 528,817,460 RAC: 242 |
"My very stable Nvidia host got this invalid" Yup, welcome to the club. All my invalids are matched up with ATI cards. |
Mr. Kevvy Joined: 15 May 99 Posts: 3806 Credit: 1,114,826,392 RAC: 3,319 |
After hunting for bad cross-validations: it's hopeless to manually remove them. There are too many of these bad hosts, they produce invalids every 10-20 seconds so the cross-validation rate is horrific. Here is a computer showing 489 (edit: it went up to 508 in the time it took to write this!) valid tasks as of when I checked. 90% of them are from the RX 5700 and every single one of them cross-validated and is garbage... all from one single card in a few days. And there are probably hundreds of them like this. Thus there are probably tens if not hundreds of thousands of garbage results in the science database now, which will eventually corrupt Nebula. I think that this is the most serious threat to the data integrity of this project it has ever faced, as bad as a successful deliberately malicious attack on it. These cards need to be banned from getting any work ASAP... this was done with other broken platforms in the past so the capability is there. |
Stephen "Heretic" Joined: 20 Sep 12 Posts: 5557 Credit: 192,787,363 RAC: 628 |
After hunting for bad cross-validations: it's hopeless to manually remove them. There are too many of these bad hosts, they produce invalids every 10-20 seconds so the cross-validation rate is horrific. Here is a computer showing 489 (edit: it went up to 508 in the time it took to write this!) valid tasks as of when I checked. 90% of them are from the RX 5700 and every single one of them cross-validated and is garbage... all from one single card in a few days. And there are probably hundreds of them like this. +1 Stephen ! ! ! |
Ian&Steve C. Joined: 28 Sep 99 Posts: 4267 Credit: 1,282,604,591 RAC: 6,640 |
Has anyone emailed Eric about it? He was apparently responsible for the GPU/CPU limits change. So maybe he will act on this too. Seti@Home classic workunits: 29,492 CPU time: 134,419 hours |
Mr. Kevvy Joined: 15 May 99 Posts: 3806 Credit: 1,114,826,392 RAC: 3,319 |
|
Ian&Steve C. Joined: 28 Sep 99 Posts: 4267 Credit: 1,282,604,591 RAC: 6,640 |
Are you sure they are getting removed from the data set? Tasks do have a shelf life of what’s visible on the website right? Seti@Home classic workunits: 29,492 CPU time: 134,419 hours |
Mr. Kevvy Joined: 15 May 99 Posts: 3806 Credit: 1,114,826,392 RAC: 3,319 |
|
Ian&Steve C. Joined: 28 Sep 99 Posts: 4267 Credit: 1,282,604,591 RAC: 6,640 |
Yeah I thought validated tasks only hung around for 1-2 days on the website. But I agree with you that something needs to be done about it. Seti@Home classic workunits: 29,492 CPU time: 134,419 hours |
Mr. Kevvy Joined: 15 May 99 Posts: 3806 Credit: 1,114,826,392 RAC: 3,319 |
The very unfortunate part of the purging is that it would appear the platform info is lost with the purged results, so only the work unit info and stderr would go into the science database. Thus it may be nigh-impossible to get rid of these garbage results after a few days once they are submitted, because it can no longer be determined that two RX 5700s "validated" them. |
Wiggo Joined: 24 Jan 00 Posts: 36819 Credit: 261,360,520 RAC: 489 |
"[AfZ]TomServo1 1483720" I can add several more users to that list. :-( Brandon 8198367 Christopher 9894096 Derrek 219419 fred 1935325 grcpool.com 10434153 HawkMedic 10838738 Jeff 10639246 Jorge Barrera 9650295 Kekke 46817 mnelsonx 272885 Peter Furlong 7965665 rgeens 10740140 Saint123 159425 toby 9442798 Trezy 10367889 No cheers here. |
Mr. Kevvy Joined: 15 May 99 Posts: 3806 Credit: 1,114,826,392 RAC: 3,319 |
|
MagicEye Joined: 19 Sep 99 Posts: 70 Credit: 40,327,877 RAC: 75 |
I wrote to some people a private message - but didn't get any answer. :( |
Wiggo Joined: 24 Jan 00 Posts: 36819 Credit: 261,360,520 RAC: 489 |
"I wrote to some people a private message - but didn't get any answer. :(" On average I only get 1 reply from about every 20 PMs that I've sent out, and I've sent out hundreds over the years, so don't feel disheartened about it. ;-) Cheers. |
Wiggo Joined: 24 Jan 00 Posts: 36819 Credit: 261,360,520 RAC: 489 |
And a few more. "[AfZ]TomServo1 1483720" Brandon 8198367 lastsworder 10878688 MaximusPrometheus 10240426 stogdan 10865456 suhail ahmad 9878177 Still no cheers here. |
Mr. Kevvy Joined: 15 May 99 Posts: 3806 Credit: 1,114,826,392 RAC: 3,319 |
"And a few more." Pestered, and thanks again! And also thanks to dsharbour and VMS Software Inc who have joined StrayCat and rgeens as the four who have confirmed they are turning off GPU computing for now! Edit: Also lastsworder, Derrek and Camiron! |
Ian&Steve C. Joined: 28 Sep 99 Posts: 4267 Credit: 1,282,604,591 RAC: 6,640 |
Apparently AMD released some new drivers today or yesterday. It was supposed to be a big update. I wonder if they helped anything? Seti@Home classic workunits: 29,492 CPU time: 134,419 hours |
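
Rob smith's suggestion earlier in the thread, counting "invalid" results like "error" results so that bad hosts get fewer tasks per day, can be illustrated with a toy model. The rule used here (subtract one from the daily quota on a penalised result, double it on a valid one, capped at a maximum) and every name and number in it are assumptions made for illustration. It is a sketch of the idea, not the project's scheduler code.

```python
def update_quota(quota, outcome, max_quota=100, count_invalid_as_error=False):
    """Return a host's new daily task quota after one reported result.

    Toy rule (assumed): a penalised result lowers the quota by 1 (floor 1),
    a valid result doubles it (capped at max_quota).
    """
    penalised = outcome == "error" or (
        count_invalid_as_error and outcome == "invalid"
    )
    if penalised:
        return max(1, quota - 1)
    if outcome == "valid":
        return min(max_quota, quota * 2)
    # Invalid results leave the quota alone unless the flag is set.
    return quota


def run(outcomes, start=100, **kwargs):
    """Apply update_quota to a sequence of outcomes and return the final quota."""
    quota = start
    for outcome in outcomes:
        quota = update_quota(quota, outcome, **kwargs)
    return quota


# Hypothetical host that returns nothing but invalid results.
bad_host = ["invalid"] * 200
print(run(bad_host))                               # quota stays at 100
print(run(bad_host, count_invalid_as_error=True))  # quota drops to 1

# Hypothetical reliable host that was paired with bad wingmen a few times.
good_host = ["invalid"] * 5 + ["valid"]
print(run(good_host, count_invalid_as_error=True))  # back to 100
```

Under these assumptions, a host that returns only invalid results keeps its full quota unless invalids are penalised, while a mostly valid host is back at the cap after a single valid result. That is consistent with the guess that a reliable host would "escape the trap" quickly.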