Flakey AMD/ATI GPUs, including RX 5700 XT, Cross Validating, polluting the Database
Author | Message |
---|---|
Kissagogo27 Joined: 6 Nov 99 Posts: 716 Credit: 8,032,827 RAC: 62 |
The AP app seems to be working well with the 5700XT! https://setiathome.berkeley.edu/results.php?hostid=8772813&offset=0&show_names=0&state=0&appid=20 |
Wiggo Joined: 24 Jan 00 Posts: 36596 Credit: 261,360,520 RAC: 489 |
"AP app seems working well with the 5700XT !" It's a shame about the rest. :-( https://setiathome.berkeley.edu/results.php?hostid=8772813 Cheers. |
TBar Joined: 22 May 99 Posts: 5204 Credit: 840,779,836 RAC: 2,768 |
"AP app seems working well with the 5700XT !" The RX5700 version works well in a Mac: https://setiathome.berkeley.edu/results.php?hostid=8592369&offset=120 You could try the other MB8 Windows App versions and see if they work with the RX5700XT. Just install Lunatics and then insert the Apps from here: https://setiathome.berkeley.edu/forum_thread.php?id=79765&postid=1801541#1801541 BOINC will run the NV & Intel Apps on the AMD 5700 if you just keep ATI in the <coproc> section instead of the other names. The current Stock Mac NV App is really the Intel_gpu build, because the normal NV builds don't work very well on the NV GPUs. Use the included app_info.xml from the download to set it up and just use ATI in the <coproc> section. Make sure to set your cache very low first; something like 0.001 will download just a couple of tasks at a time in case it doesn't work. Also, disable networking and suspend all but one task so you can test all the App versions. Or just download and use the Benchmark package to test it offline, though the Benchmark test may not run the NV & Intel Apps on the AMD card the way Anonymous platform will. |
Justin Turner Arthur Joined: 20 Oct 03 Posts: 12 Credit: 3,929,052 RAC: 2 |
Just found another corrupt WU quorum from Windows Radeon 5700 XT users: 3734170312. I let Eric know. |
Grant (SSSF) Joined: 19 Aug 99 Posts: 13842 Credit: 208,696,464 RAC: 304 |
blc14_2bit_guppi_58691_76289_HIP40815_0077.5058.818.21.44.40.vlar: mugged by RX 5700XTs, again. *deep sigh* Grant Darwin NT |
Grant (SSSF) Joined: 19 Aug 99 Posts: 13842 Credit: 208,696,464 RAC: 304 |
And now a double mugging: 2 WUs lost to RX 5700XTs. What's next, 4 WUs? Grant Darwin NT |
Grant (SSSF) Joined: 19 Aug 99 Posts: 13842 Credit: 208,696,464 RAC: 304 |
"Thanks for this, TBar. I had seen this mentioned before but didn't read it enough to see that they unfortunately aren't erroring out but instead running to boinc_finish completion." It's been 3 months, and the amount of faulty data going into the database is increasing. Might be worth a follow-up. Grant Darwin NT |
Wiggo Joined: 24 Jan 00 Posts: 36596 Credit: 261,360,520 RAC: 489 |
They're certainly corrupting the science with their false results. :-( Cheers. |
Sean Joined: 10 Aug 00 Posts: 33 Credit: 125,775,158 RAC: 199 |
Yet another instance: https://setiathome.berkeley.edu/workunit.php?wuid=3744949125 |
TBar Joined: 22 May 99 Posts: 5204 Credit: 840,779,836 RAC: 2,768 |
It's a problem with Windows in general. Look at these: in each case it's the same AP WU, with two bad Windows hosts and one good Mac. The good one loses every time: https://setiweb.ssl.berkeley.edu/beta/results.php?hostid=71141&state=5 (columns: task, work unit, sent, reported, status, run time, CPU time, credit, application) 34836695 12179862 10 Nov 2019, 10:09:48 UTC 10 Nov 2019, 13:25:49 UTC Completed, marked as invalid 2,066.68 234.68 0.00 AstroPulse v7 v7.07 (opencl_nvidia_mac); 34836702 12179865 10 Nov 2019, 10:09:48 UTC 10 Nov 2019, 12:28:36 UTC Completed, marked as invalid 796.48 215.72 0.00 AstroPulse v7 v7.07 (opencl_nvidia_mac); 34823813 12171883 8 Nov 2019, 13:30:36 UTC 9 Nov 2019, 1:32:50 UTC Completed, marked as invalid 780.72 206.05 0.00 AstroPulse v7 v7.01 (sse3); 34813010 12171346 7 Nov 2019, 21:28:13 UTC 8 Nov 2019, 7:11:54 UTC Completed, marked as invalid 792.29 205.89 0.00 AstroPulse v7 v7.07 (opencl_ati_mac); 34812940 12171312 7 Nov 2019, 20:50:35 UTC 8 Nov 2019, 6:45:28 UTC Completed, marked as invalid 804.41 222.15 0.00 AstroPulse v7 v7.07 (opencl_ati_mac); 34811733 12170908 7 Nov 2019, 17:05:44 UTC 7 Nov 2019, 22:07:30 UTC Completed, marked as invalid 789.17 207.18 0.00 AstroPulse v7 v7.07 (opencl_ati_mac); 34811714 12170899 7 Nov 2019, 17:05:28 UTC 7 Nov 2019, 20:35:22 UTC Completed, marked as invalid 788.46 207.53 0.00 AstroPulse v7 v7.07 (opencl_ati_mac); 34811724 12170904 7 Nov 2019, 17:05:28 UTC 7 Nov 2019, 18:30:37 UTC Completed, marked as invalid 792.83 206.24 0.00 AstroPulse v7 v7.07 (opencl_ati_mac); 34811726 12170905 7 Nov 2019, 17:05:28 UTC 7 Nov 2019, 21:28:13 UTC Completed, marked as invalid 793.05 208.26 0.00 AstroPulse v7 v7.07 (opencl_ati_mac) |
Keith Myers Joined: 29 Apr 01 Posts: 13164 Credit: 1,160,866,277 RAC: 1,873 |
Just a refresher about the current state of AMD and their OpenCL drivers. https://www.anandtech.com/show/14618/the-amd-radeon-rx-5700-xt-rx-5700-review/13 Seti@Home classic workunits:20,676 CPU time:74,226 hours A proud member of the OFA (Old Farts Association) |
MagicEye Joined: 19 Sep 99 Posts: 70 Credit: 40,327,877 RAC: 75 |
I check my invalid WUs from time to time. Very often the other PCs have, e.g., an RX5700 GPU that returns errors on about 50% of its tasks. When one of them is the wingman for my GPU, there are 2 different results for the WU and it is sent out to a third PC. If the third user also has such a faulty GPU, it validates the wrong result and the good result is rejected. I looked at these PCs and saw about 50% error WUs, but I am sure the other 50% also contain errors. Wouldn't it be better for the correctness of the results and for the science to block PCs with such high error rates? Or to send them no more than 2-3 WUs per day until they send back about 90+% correct results? |
Keith Myers Joined: 29 Apr 01 Posts: 13164 Credit: 1,160,866,277 RAC: 1,873 |
The "bad host detected" check is already in the server code (#3024), but it does not work very well. I'm fairly sure the project scientists are aware of the AMD 5700/XT problem, but the fix probably requires a fair bit of new coding. It should be handled by the project developers, but they have lots of more urgent code updates pending. Seti@Home classic workunits:20,676 CPU time:74,226 hours A proud member of the OFA (Old Farts Association) |
Tomcat雄猫 Joined: 20 Dec 14 Posts: 9 Credit: 391,588 RAC: 19 |
I run an RX5700 and have noticed this issue. The task runs to completion and returns blatantly incorrect results. The only time my RX5700 GPU gets a valid result is when it is validated against another AMD RX5700 series GPU (both get the wrong result). I've currently stopped my computer from accepting GPU work units (it took me way too long to realize something was wrong, sorry). I believe this is an issue with the Navi architecture and not necessarily solely with AMD's OpenCL driver, as I see older AMD GPUs still returning "correct" results. Someone has to redo all the work units where the results came from Navi AMD GPUs (RX5700, RX 5700XT, RX 5500M, RX 5500) and ban all AMD Navi GPUs until a fix is found. Interestingly, my RX5700 has not been causing issues with other projects, like Einstein@home, Milkyway@home, Collatz, etc. Something about Navi and OpenCL really does not like Seti@home. If any of you need any testing or logs on an AMD RX5700, hit me up. Edit: corrected OpenGL to OpenCL, thanks Keith Myers. |
Ian&Steve C. Joined: 28 Sep 99 Posts: 4267 Credit: 1,282,604,591 RAC: 6,640 |
Sounds like the issue might come down to the app needing an update rather than the drivers. Seti@Home classic workunits: 29,492 CPU time: 134,419 hours |
Keith Myers Joined: 29 Apr 01 Posts: 13164 Credit: 1,160,866,277 RAC: 1,873 |
"I believe this is an issue with the Navi architecture and not necessarily with AMD's OpenGL driver," Correction: distributed computing does not use the OpenGL component of the drivers. It uses the OpenCL component. Seti@Home classic workunits:20,676 CPU time:74,226 hours A proud member of the OFA (Old Farts Association) |
Keith Myers Joined: 29 Apr 01 Posts: 13164 Credit: 1,160,866,277 RAC: 1,873 |
"Sounds like the issue might come down to the app needing an update rather than the drivers." Has anybody heard or seen any sign of Raistmer lately? He is the developer of the app. Seti@Home classic workunits:20,676 CPU time:74,226 hours A proud member of the OFA (Old Farts Association) |
Tomcat雄猫 Joined: 20 Dec 14 Posts: 9 Credit: 391,588 RAC: 19 |
"Sounds like the issue might come down to the app needing an update rather than the drivers." He was here. It was in September, though. https://community.amd.com/thread/243179 |
Tomcat雄猫 Joined: 20 Dec 14 Posts: 9 Credit: 391,588 RAC: 19 |
I have posted this issue in the AMD subreddit to hopefully get more attention and get owners of Navi GPUs to remove them from Seti@home. |
catavalon21 Joined: 2 Nov 01 Posts: 13 Credit: 7,238,152 RAC: 48 |
The few 5700XT crunchers I have seen while manually perusing the results of folks who validate the same WUs I do largely error out. However, there are occasionally "valid" results returned, and they are STUPID fast: if correct, they are close to an order of magnitude faster than my 970 on the special sauce, apparently while running the default Windows AMD app. It would really be nice for solid drivers (if that's the issue) to eventually show up. Maybe with Big Navi <sigh> |
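
For anyone following the validation argument in this thread (MagicEye's wingman scenario and Tomcat雄猫's note that two RX5700s validate each other), here is a rough Python sketch of how two hosts that agree on the same wrong result can outvote one good host, plus the "throttle bad hosts" idea. This is not the project's validator or the BOINC server code. The function names, the quota numbers, the 90% threshold wiring, and the assumption that faulty GPUs return matching results are all hypothetical and only meant to show the mechanism.

```python
from collections import Counter

# Hypothetical sketch, NOT the real SETI@home validator or BOINC scheduler.
# Assumption: each host's result is reduced to a comparable signature string,
# and a result becomes canonical once `min_quorum` hosts report the same one.


def canonical_result(results, min_quorum=2):
    """Return the signature reported by at least `min_quorum` hosts, else None."""
    if not results:
        return None
    signature, count = Counter(results.values()).most_common(1)[0]
    return signature if count >= min_quorum else None


def rejected_hosts(results, min_quorum=2):
    """Hosts whose result disagrees with the canonical one (empty if no quorum)."""
    canonical = canonical_result(results, min_quorum)
    if canonical is None:
        return []
    return sorted(host for host, sig in results.items() if sig != canonical)


def daily_quota(valid, total, normal=100, throttled=3, threshold=0.9):
    """Throttle hosts below `threshold` valid fraction (all numbers illustrative)."""
    if total == 0:
        return normal  # no history yet, so no reason to throttle
    return normal if valid / total >= threshold else throttled


if __name__ == "__main__":
    # One good host and one faulty host disagree: no quorum, so a third host is needed.
    wu = {"good_host": "signal_A", "navi_host_1": "garbage_B"}
    print(canonical_result(wu))
    # The third host is also faulty and matches the first faulty one.
    wu["navi_host_2"] = "garbage_B"
    print(canonical_result(wu), rejected_hosts(wu))
    # A host with 50% valid results would be cut to the throttled quota.
    print(daily_quota(valid=50, total=100))
```

In this toy model the good host ends up in the rejected list as soon as a second faulty host joins the quorum, which is the pattern reported above. A quota rule like `daily_quota` would only limit the damage; it would not catch the "valid" results that two faulty hosts award each other.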
©2024 University of California
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.