|
141)
Message boards :
Number crunching :
The Server Issues / Outages Thread - Panic Mode On! (118)
(Message 2028995)
Posted: 24 Jan 2020 by Mr. Kevvy
Post: This morning all hosts are completely empty. So either I have very bad luck in the filling queue or there is still a problem with the RTS buffer and splitting. Same here... only my slower hosts have enough work. Every time that I have checked the result creation rate since the out There are 5M results in the field even with this, so again it seems to fall back on the three-quorum validation being in place due to the bad AMD hosts.
|
142)
Message boards :
Politics :
The Donald Trump Thread (IV)
(Message 2028989)
Posted: 24 Jan 2020 by Mr. Kevvy
Post: "So help me God". So what could possibly go wrong :) Everything. :^)
|
143)
Message boards :
News :
Low available work.
(Message 2028770)
Posted: 23 Jan 2020 by Mr. Kevvy
Post: I've been with no work for just over 24 hrs now on 5 computers and none of them have been able to "report" any results. I've about 300 plus units "ready to report" and it's been that way since yesterday. Expect several hours more like this due to the servers being battered by 100,000+ dry hosts after two days offline.
|
144)
Message boards :
Number crunching :
Flakey AMD/ATI GPUs, including RX 5700 XT, Cross Validating, polluting the Database
(Message 2028540)
Posted: 19 Jan 2020 by Mr. Kevvy
Post: Lol thanks... and a rotating one. ;^) OK, here are the new bad host owners I contacted for the first time today, where required (some were not):
aplrapid 7807183
Leigh Green 123169
Vytautas Liesis 173783
Adam Tadian 9295811
firecrypt 10881832 (already had updated)
Capizzi 10504781
AshlandPony 9004257
cprince1977 10886783 (already had updated)
Illyria 10845292
jcr 3428
teargasm 10886461
BigDaddyDave 265982
killerepicprofurrygamer6969 10885981
dcox 10884993
iinkabob 102908
Simgiov 8796082
Hozer 61288 (no affected GPUs)
unbound 10885610
Thankfully I was able to give them a solution immediately rather than just to disable their GPU and wait. I also re-contacted every single one of the known bad host owners to advise them of the solution as well. And here is the updated list:
李溪伦 9302807
achimbln 138625
Adam Tadian 9295811
Alexandr Galushchenko 9609912
AMD Jesus 70887
antoi 10856207
aplrapid 7807183
Arnab 10093567
AshlandPony 9004257
BigDaddyDave 265982
Bigthor 480399
Borktron 10682716
calendir 9663884
Capizzi 10504781
Christopher 9894096
CoffeeSloth 10266313
Crisu 7833612
dalex 10881818
Daniel Conrad Broom 8059986
Daniel frederikson 9813817
Daniel Penz 91581
Dank 49802
dcox 10884993
Doc_Jebus 10863878
Dzsozi 8002127
Earendil 146007
egon.sauter 494566
Eirikafh 10883218
Eric 9157146
eryndel 10878567
Esta 10624508
Foaming Mad Cow Industries 219464
Ffred 1935325
fredi 7913572
George Ko 639539
ghostbuster 564989
Haiko_N 9198068
HawkMedic 10838738
higemayuge 10790664
HMZ 9079227
iinkabob 102908
Illyria 10845292
jcr 3428
Jeff 10639246
Jerjes 1291426
JohnDoe 9166075
Jorge Barrera 9650295
Juraxell 10864786
Kekke 46817
killerepicprofurrygamer6969 10885981
knutella 9880098
Leigh Green 123169
lupaslupas 10002927
MadMikeDelta 8221690
MaximusPrometheus 10240426
mnelsonx 272885
Niflhuem 113140
No Name@Extraterrestrial Intelligence 8116
Oriah 9838773
Otosan 8547502
PantherJon 9801065
Peter Furlong 7965665
phoenix7477 10773411
Rafael 8249913
rame 10738
Recedham 954834
rgeens 10740140
Richard 8565733
Richard Hartland 9781177
Rocky 270621
Saint123 159425
Simgiov 8796082
Stephen Diem 36679
StrayCat 177967
Strickland 34273
suhail ahmad 9878177
Swagstergo 10882690
T66 3336343
Tanis 10773581
teargasm 10886461
toby 9442798
TomasFraus 8445239
Tomik 8972653
Trezy 10367889
unbound 10885610
vleermuis 1295921
VMS Software Inc 45538
Vytautas Liesis 173783
werewolf_007 10880222
xakei 10823091
Italicized names have replied and indicated they are disabling their affected GPUs.
|
145)
Message boards :
Number crunching :
Flakey AMD/ATI GPUs, including RX 5700 XT, Cross Validating, polluting the Database
(Message 2028537)
Posted: 19 Jan 2020 by Mr. Kevvy
Post: The BOINC platform code itself isn't affected; rather, it's the client executable that BOINC (or WCG) downloads. Each project will have to recompile its client for the new drivers to set the required flag. BOINC will then download the new client automatically (unless they have an "anonymous" manually installed client like the Lunatics one here).
|
146)
Message boards :
Number crunching :
The Server Issues / Outages Thread - Panic Mode On! (118)
(Message 2028528)
Posted: 19 Jan 2020 by Mr. Kevvy
Post: The scheduler is being hammered by all the bone-dry hosts asking for work so yes, there are timeouts aplenty.
|
147)
Message boards :
Politics :
Another example of USA Gun Laws (or lack of...)?
(Message 2028526)
Posted: 19 Jan 2020 by Mr. Kevvy
Post: Send more guns! "Two Shot at the Community Arms Apartment Complex" How appropriate. :^p
|
148)
Message boards :
Politics :
Police and Law Enforcement #5
(Message 2028524)
Posted: 19 Jan 2020 by Mr. Kevvy
Post: I don't know why I haven't seen more lawsuits over this practice, but better late than never: Family Sues DEA and TSA After Elderly Man's Life Savings Were Seized at Airport A class-action lawsuit is now challenging the DEA's habit of seizing large amounts of cash from travelers without evidence of any crime. Er? "but all that money now belongs to the Drug Enforcement Administration (DEA)" Nah... stealing something doesn't make it yours. :^p
|
149)
Message boards :
Number crunching :
Flakey AMD/ATI GPUs, including RX 5700 XT, Cross Validating, polluting the Database
(Message 2028510)
Posted: 19 Jan 2020 by Mr. Kevvy
Post: Thanks, Richard... I'm busily PMing all the known hosts' owners and I just ran into an Anonymous Windows one doubtless on Lunatics so your update was fortuitous. I'll link to this post in the private message to them.
|
150)
Message boards :
Politics :
Computers & Technology 4
(Message 2028497)
Posted: 19 Jan 2020 by Mr. Kevvy
Post: I have an old Core2 Quad Q6600 on its Intel board, which I bought a dozen years ago, happily crunching away with 3xGTX970s stuffed in it, on the latest Mint Tricia 19.3 64-bit with the latest NVidia drivers and all updates. (Edit: disregard that it shows 24 GPUs lol)
|
151)
Message boards :
Number crunching :
The Server Issues / Outages Thread - Panic Mode On! (118)
(Message 2028493)
Posted: 19 Jan 2020 by Mr. Kevvy
Post: The root cause of the issues has been determined to be too many results out in the field. As the two started at the same time, I'm convinced that what actually triggered this was turning on triple-quorum validation for overflow work units due to the bad AMD drivers, which have now been corrected. So, as it's now well-confirmed that the new drivers (and the recompiled 8.24 client to take advantage of them) resolve this problem, I'm making it a priority to re-contact all of the owners of known bad hosts and advise them to update, as well as all of the new owners of bad hosts I hadn't had a chance to contact. The faster we can get these updated, the less chance they have to cross-validate, and we can then go back to normal two-quorum validation, which should resolve the excessive queues. I should have an update in that thread within a few hours once I have contacted all known bad host owners (plus catching up on my private message replies from them...)
|
152)
Message boards :
Number crunching :
The Server Issues / Outages Thread - Panic Mode On! (118)
(Message 2028361)
Posted: 18 Jan 2020 by Mr. Kevvy
Post: It might even be worth declaring a splitter holiday and allowing work allocated before the outage to report, validate, assimilate, delete and purge before starting to fill up the database again. Coincidentally (or not lol) Dr. Korpela has updated the News feed: For a couple of reasons, the result table has grown to the point where it no longer fits in main memory. That has been slowing the validators and assimilators, which is causing the result table to grow further. Just what we were asking for... a simple brief update if there are issues.
|
153)
Message boards :
News :
Low available work.
(Message 2028360)
Posted: 18 Jan 2020 by Mr. Kevvy
Post: Thanks for the update... much appreciated!
|
154)
Message boards :
Number crunching :
The Server Issues / Outages Thread - Panic Mode On! (118)
(Message 2028353)
Posted: 18 Jan 2020 by Mr. Kevvy
Post: These issues have been ongoing since the day-long Tuesday out All I can hope for is that the next Tuesday out
|
155)
Message boards :
Number crunching :
SETI@Home will now cache 150 CPU work units and 150 per installed GPU
(Message 2027996)
Posted: 16 Jan 2020 by Mr. Kevvy
Post: Title updated again. :^)
|
156)
Message boards :
Number crunching :
Flakey AMD/ATI GPUs, including RX 5700 XT, Cross Validating, polluting the Database
(Message 2027994)
Posted: 16 Jan 2020 by Mr. Kevvy
Post: It would appear it's now time to follow up with the people who were producing bad results and let them know that the solution is at hand. As far as I know, as 8.24 has gone to main, it will update automatically, and fixed drivers can be had at https://www.amd.com/en/support . I don't think there's anything else I need to pass on, but please advise if otherwise. I'll give it a while and then start sending messages.
|
157)
Message boards :
Number crunching :
SETI@Home will now cache 150 CPU work units and 150 per installed GPU
(Message 2027928)
Posted: 16 Jan 2020 by Mr. Kevvy
Post: Three good confirmations... I've edited the thread title (again). :^)
|
158)
Message boards :
Number crunching :
The Server Issues / Outages Thread - Panic Mode On! (118)
(Message 2027828)
Posted: 16 Jan 2020 by Mr. Kevvy
Post: I dunno if it's a lottery or the scheduler is drunk again. I refreshed the server status page until I got one where the last update was only two minutes ago, the replica was less than a minute behind the master, and there were 780K work units in the RTS queue. Then I manually initiated a scheduler contact on an empty client:
Wed 15 Jan 2020 07:45:43 PM EST | SETI@home | Scheduler request completed: got 0 new tasks
Wed 15 Jan 2020 07:45:43 PM EST | SETI@home | [sched_op] Server version 709
Wed 15 Jan 2020 07:45:43 PM EST | SETI@home | Project has no tasks available
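For anyone who would rather not eyeball the event log after every manual scheduler contact, here is a rough Python sketch that pulls the task count out of lines shaped like the three quoted above. It assumes only the `timestamp | project | message` layout and the "got N new tasks" wording seen in this post; the names `parse_log_line`, `tasks_received` and `SAMPLE` are made up for illustration and are not part of BOINC.

```python
import re
from typing import Iterable, NamedTuple, Optional


class LogLine(NamedTuple):
    timestamp: str
    project: str
    message: str


def parse_log_line(line: str) -> Optional[LogLine]:
    """Split one event log line on its first two '|' separators."""
    parts = [p.strip() for p in line.split("|", 2)]
    if len(parts) != 3:
        return None  # not in the assumed three-field layout
    return LogLine(*parts)


# Wording taken from the log excerpt in the post; other client
# versions may phrase this differently (assumption).
TASKS_RE = re.compile(r"got (\d+) new tasks")


def tasks_received(lines: Iterable[str]) -> Optional[int]:
    """Return the task count from the first scheduler reply found, else None."""
    for line in lines:
        parsed = parse_log_line(line)
        if parsed is None:
            continue
        match = TASKS_RE.search(parsed.message)
        if match:
            return int(match.group(1))
    return None


SAMPLE = [
    "Wed 15 Jan 2020 07:45:43 PM EST | SETI@home | Scheduler request completed: got 0 new tasks",
    "Wed 15 Jan 2020 07:45:43 PM EST | SETI@home | [sched_op] Server version 709",
    "Wed 15 Jan 2020 07:45:43 PM EST | SETI@home | Project has no tasks available",
]

if __name__ == "__main__":
    print(tasks_received(SAMPLE))
```

Feeding it the saved log text line by line would show whether a host is actually getting work or just being told there is none.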
|
159)
Message boards :
Number crunching :
The Server Issues / Outages Thread - Panic Mode On! (118)
(Message 2027741)
Posted: 15 Jan 2020 by Mr. Kevvy
Post: I also agree that a simple one-liner would only take a minute for an admin. to type and doesn't even have to explain the issue, just acknowledge that there is one, i.e. "We're currently having a work distribution issue and are working to identify and resolve it, so in the interim, our weekly outage may be much longer than normal and it may be advisable to obtain work from a backup project." One minute from a project admin. saves thousands from us fumbling around and checking the threads over and over...
|
160)
Message boards :
Number crunching :
The Server Issues / Outages Thread - Panic Mode On! (118)
(Message 2027694)
Posted: 15 Jan 2020 by Mr. Kevvy
Post: I'm wondering if the out I'll stick to that rather than that the project is just broken. :^)
©2020 University of California
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.