The Server Issues / Outages Thread - Panic Mode On! (117)

Author	Message
Siran d'Vel'nahr Volunteer tester Send message Joined: 23 May 99 Posts: 7379 Credit: 44,181,323 RAC: 238	Message 2012088 - Posted: 15 Sep 2019, 19:12:57 UTC Greetings, Ok, now my other Linux PC is out of GPU WUs. I have 36 CPU WUs left on my Linux/Winders PC. My other Linux PC has twice that many. Minty-Winders will be out of work in the next few hours. :( Have a great day, if'n you can! :) Siran CAPT Siran d'Vel'nahr - L L & P _\\// Winders 11 OS? "What a piece of junk!" - L. Skywalker "Logic is the cement of our civilization with which we ascend from chaos using reason as our guide." - T'Plana-hath ID: 2012088 ·

Cosmic_Ocean Send message Joined: 23 Dec 00 Posts: 3027 Credit: 13,516,867 RAC: 13	Message 2012091 - Posted: 15 Sep 2019, 19:48:45 UTC Haven't been here in a while. What's going on with Sundays now? SSP looks like things should be running.. but I haven't had a good scheduler contact for 10 hours now. Uploads go through just fine though. (times are in UTC -4) 2019-09-15 15:44:59 SETI@home update requested by user 2019-09-15 15:45:00 SETI@home [sched_op_debug] Fetching master file 2019-09-15 15:45:00 SETI@home Fetching scheduler list 2019-09-15 15:45:02 Project communication failed: attempting access to reference site 2019-09-15 15:45:02 SETI@home [sched_op_debug] Deferring communication for 1 days 0 hr 0 min 0 sec 2019-09-15 15:45:02 SETI@home [sched_op_debug] Reason: 5 consecutive failures fetching scheduler list 2019-09-15 15:45:03 Internet access OK - project servers may be temporarily down. I've still got a good bit of cache left, but I don't know what's going on. Linux laptop: record uptime: 1511d 20h 19m (ended due to the power brick giving-up) ID: 2012091 ·

mohavewolfpup Send message Joined: 20 Oct 18 Posts: 32 Credit: 3,666,574 RAC: 24	Message 2012107 - Posted: 15 Sep 2019, 20:59:07 UTC - in response to Message 2012091. I've got lots of "Ready to report" sitting in queue currently and communication keeps getting put off. No reports or unit downloads, so something is up Historian for the Defunct Riviera Hotel and Casino, Former Classic Seti@home user for Team Art Bell. Greetings from the High Desert! ID: 2012107 ·

Siran d'Vel'nahr Volunteer tester Send message Joined: 23 May 99 Posts: 7379 Credit: 44,181,323 RAC: 238	Message 2012112 - Posted: 15 Sep 2019, 21:40:42 UTC Greetings, @Richard: Somehow I don't believe the "goin' to the pub" routine is gonna work this time. ;) Oh well. lol :( Have a great day! :) Siran CAPT Siran d'Vel'nahr - L L & P _\\// Winders 11 OS? "What a piece of junk!" - L. Skywalker "Logic is the cement of our civilization with which we ascend from chaos using reason as our guide." - T'Plana-hath ID: 2012112 ·

betreger Send message Joined: 29 Jun 99 Posts: 11361 Credit: 29,581,041 RAC: 66	Message 2012114 - Posted: 15 Sep 2019, 21:59:33 UTC It is beginning to look like this is the new normal, practicing for the Tuesday outage. ID: 2012114 ·

betreger Send message Joined: 29 Jun 99 Posts: 11361 Credit: 29,581,041 RAC: 66	Message 2012116 - Posted: 15 Sep 2019, 22:31:23 UTC Reporting finished work still doesn't work and there is very little we can do about it. All I can think of is set some hair an fire and howl at the moon. ID: 2012116 ·

betreger Send message Joined: 29 Jun 99 Posts: 11361 Credit: 29,581,041 RAC: 66	Message 2012119 - Posted: 15 Sep 2019, 22:51:47 UTC It worked! ID: 2012119 ·

HAL Send message Joined: 18 May 99 Posts: 535 Credit: 8,246,955 RAC: 3	Message 2012127 - Posted: 15 Sep 2019, 23:30:33 UTC Coming back to life here. :-) I'm putting myself to the fullest possible use, which is all, I think, that any conscious entity can ever hope to do. ID: 2012127 ·

Jimbocous Volunteer tester Send message Joined: 1 Apr 13 Posts: 1853 Credit: 268,616,081 RAC: 1,349	Message 2012130 - Posted: 15 Sep 2019, 23:50:40 UTC Staggering back to life, slowly. Know the feeling ... :\| ID: 2012130 ·

Wiggo Send message Joined: 24 Jan 00 Posts: 34744 Credit: 261,360,520 RAC: 489	Message 2012139 - Posted: 16 Sep 2019, 0:10:01 UTC Caches here are now full and heat is being generated again (it was a cold morning here). Cheers. ID: 2012139 ·

juan BFP Volunteer tester Send message Joined: 16 Mar 07 Posts: 9786 Credit: 572,710,851 RAC: 3,799	Message 2012146 - Posted: 16 Sep 2019, 1:26:52 UTC - in response to Message 2012062. Oh dear. You're not saying I have to go and force down another Sunday roast, are you? The sacrifices I go through for this project... Your technique works perfect. Just arrived from the bar, and the work is flowing normally. I know how hard are the sacrifices you do for the project. ID: 2012146 ·

Cosmic_Ocean Send message Joined: 23 Dec 00 Posts: 3027 Credit: 13,516,867 RAC: 13	Message 2012268 - Posted: 17 Sep 2019, 2:07:16 UTC - in response to Message 2012091. Haven't been here in a while. What's going on with Sundays now? SSP looks like things should be running.. but I haven't had a good scheduler contact for 10 hours now. Uploads go through just fine though. (times are in UTC -4) 2019-09-15 15:44:59 SETI@home update requested by user 2019-09-15 15:45:00 SETI@home [sched_op_debug] Fetching master file 2019-09-15 15:45:00 SETI@home Fetching scheduler list 2019-09-15 15:45:02 Project communication failed: attempting access to reference site 2019-09-15 15:45:02 SETI@home [sched_op_debug] Deferring communication for 1 days 0 hr 0 min 0 sec 2019-09-15 15:45:02 SETI@home [sched_op_debug] Reason: 5 consecutive failures fetching scheduler list 2019-09-15 15:45:03 Internet access OK - project servers may be temporarily down. I've still got a good bit of cache left, but I don't know what's going on. Hmm.. this is still going on for me. I'm not sure what the problem is now. I've flushed the DNS cache, done a release/renew of DHCP. That machine can get to the internet just fine. But for some reason, it fails at fetching the scheduler list. Windows machine gets the scheduler just fine. Might just be time for a restart.. it's only been up for 194 days. Linux laptop: record uptime: 1511d 20h 19m (ended due to the power brick giving-up) ID: 2012268 ·

betreger Send message Joined: 29 Jun 99 Posts: 11361 Credit: 29,581,041 RAC: 66	Message 2012271 - Posted: 17 Sep 2019, 2:26:11 UTC - in response to Message 2012268. Hmm.. this is still going on for me. I'm not sure what the problem is now. Obviously this is a training run for the Tuesday outage. ID: 2012271 ·

Cosmic_Ocean Send message Joined: 23 Dec 00 Posts: 3027 Credit: 13,516,867 RAC: 13	Message 2012276 - Posted: 17 Sep 2019, 3:41:17 UTC Last modified: 17 Sep 2019, 3:59:03 UTC Alright, so I don't know what the problem is now. I really don't. I've restarted, I've done all the updates that were available, i've restarted again. I flushed DNS cache. I even found some guide about disabling the default of using localhost as a DNS server to force it to use outside servers.. and after a restart.. it STILL defaults to using localhost for queries. I think there's something flawed in how it is trying to look the scheduler address up and that's causing it to not be able to find it. That's what I think is going on, is what this write-up explains: https://moss.sh/name-resolution-issue-systemd-resolved/ I was able to upload all the finished results, but I have a full cache of 100 to report, but the scheduler can't be reached for some reason. All other websites work in a browser, or just pinging them in a terminal. I miss when network config was so much simpler in older linux. When you could just put three DNS servers in /etc/resolv.conf and be done with it. Now there's 15 layers of complexity that all try to take control over everything. Urg. I don't know what else to do now. Maybe there's just a bad DNS record or something? The default time-out on those is 7 days. Maybe I just have to let the clock run out on that. edit: Okay, I did manage to disable the localhost being a server.. it is now actually obeying the addresses listed by DHCP. But it still isn't working. Can't fetch scheduler list. edit2: so in linux, nslookup returns 208.68.240.126 for setiboinc.ssl.berkeley.edu, which is what is listed in client_state.xml for scheduler. Can't ping in either linux or windows.. but windows machine connects just fine to it? Linux machine doesn't. And then boinc2.ssl.berkeley.edu behaves the same way. Can we not ping those anymore or something? But I see the placeholder page for Apache if I go to http://setiboinc.ssl.berkeley.edu , so... the site exists. Is that what is going on, or what? Linux laptop: record uptime: 1511d 20h 19m (ended due to the power brick giving-up) ID: 2012276 ·

Cosmic_Ocean Send message Joined: 23 Dec 00 Posts: 3027 Credit: 13,516,867 RAC: 13	Message 2012282 - Posted: 17 Sep 2019, 5:33:04 UTC I think it's a SSL certificate problem. That's what an 'update now' looks like in wireshark. That's the only thing I can think of at this point is that it's a bad certificate. It looks like the server IS being contacted and responding, so it's not a network problem. Linux laptop: record uptime: 1511d 20h 19m (ended due to the power brick giving-up) ID: 2012282 ·

Unixchick Send message Joined: 5 Mar 12 Posts: 815 Credit: 2,361,516 RAC: 22	Message 2012283 - Posted: 17 Sep 2019, 5:37:12 UTC Cosmic Ocean...read this thread in the recent past...someone else just had this problem ID: 2012283 ·

Cosmic_Ocean Send message Joined: 23 Dec 00 Posts: 3027 Credit: 13,516,867 RAC: 13	Message 2012286 - Posted: 17 Sep 2019, 7:13:14 UTC - in response to Message 2012283. Last modified: 17 Sep 2019, 7:46:02 UTC Cosmic Ocean...read this thread in the recent past...someone else just had this problem Yeah, I see that now. Looks like some older clients are starting to act up with expiring certificates. I went and grabbed the current latest one from: http://curl.haxx.se/ and overwrote the old one that was in /etc/ssl/certs, but I'm still getting a/the same problem. 2019-09-17 02:47:49 SETI@home update requested by user 2019-09-17 02:47:54 SETI@home [sched_op_debug] Fetching master file 2019-09-17 02:47:54 [http_debug] HTTP_OP::init_get(): http://setiathome.berkeley.edu/ 2019-09-17 02:47:54 SETI@home Fetching scheduler list 2019-09-17 02:47:54 [http_debug] [ID#1] Info: About to connect() to setiathome.berkeley.edu port 80 (#0) 2019-09-17 02:47:54 [http_debug] [ID#1] Info: Trying 208.68.240.110... 2019-09-17 02:47:54 [http_debug] [ID#1] Info: Connected to setiathome.berkeley.edu (208.68.240.110) port 80 (#0) 2019-09-17 02:47:54 [http_debug] [ID#1] Sent header to server: GET / HTTP/1.1 2019-09-17 02:47:54 [http_debug] [ID#1] Info: Connected to setiathome.berkeley.edu (208.68.240.110) port 443 (#0) 2019-09-17 02:47:54 [http_debug] [ID#1] Info: successfully set certificate verify locations: 2019-09-17 02:47:54 [http_debug] [ID#1] Info: CAfile: ca-bundle.crt 2019-09-17 02:47:54 [http_debug] [ID#1] Info: CApath: none 2019-09-17 02:47:54 [http_debug] [ID#1] Info: SSLv2, Client hello (1): 2019-09-17 02:47:54 [http_debug] [ID#1] Info: SSLv3, TLS handshake, Server hello (2): 2019-09-17 02:47:54 [http_debug] [ID#1] Info: SSLv3, TLS handshake, CERT (11): 2019-09-17 02:47:54 [http_debug] [ID#1] Info: SSLv3, TLS alert, Server hello (2): 2019-09-17 02:47:54 [http_debug] [ID#1] Info: error:0D0C50A1:asn1 encoding routines:ASN1_item_verify:unknown message digest algorithm 2019-09-17 02:47:54 [http_debug] HTTP error: SSL connect error In the boinc data directory, ca-bundle.crt a symbolic link to /etc/ssl/certs/ca-certificates.crt. I changed the link and the CAfile listed in the output there changed to a non-existent path for where it would be if curl was being used.. so it DOES rely on it being in the data dir. And as a test, I downloaded the latest BOINC for x64 linux and pulled ca-bundle.crt out of that, and it still does the same thing. So now what? Linux laptop: record uptime: 1511d 20h 19m (ended due to the power brick giving-up) ID: 2012286 ·

Keith Myers Volunteer tester Send message Joined: 29 Apr 01 Posts: 13164 Credit: 1,160,866,277 RAC: 1,873	Message 2012287 - Posted: 17 Sep 2019, 7:15:45 UTC - in response to Message 2012283. Cosmic Ocean...read this thread in the recent past...someone else just had this problem https://manpages.ubuntu.com/manpages/bionic/man8/update-ca-certificates.8.html Seti@Home classic workunits:20,676 CPU time:74,226 hours A proud member of the OFA (Old Farts Association) ID: 2012287 ·

Cosmic_Ocean Send message Joined: 23 Dec 00 Posts: 3027 Credit: 13,516,867 RAC: 13	Message 2012289 - Posted: 17 Sep 2019, 7:45:28 UTC - in response to Message 2012287. Cosmic Ocean...read this thread in the recent past...someone else just had this problem https://manpages.ubuntu.com/manpages/bionic/man8/update-ca-certificates.8.html root@taurus:~# update-ca-certificates -v Updating certificates in /etc/ssl/certs... 0 added, 0 removed; done. Running hooks in /etc/ca-certificates/update.d... done. root@taurus:~# Don't know what that did, if anything. I see the file size and modified date of /etc/ssl/certs/ca-certificates.crt changed, and I restarted boinc-client, and it still gives the same error: unknown message digest algorithm. I don't know if it means anything, but in my googling, I found a boinc forum post from 2009 where someone was asking about this very thing for WCG, and one of the devs replied and said that the cert WCG relies-on is good until "2019". Maybe we've reached the point in "2019" for that particular cert, perhaps? I know we're not WCG, but it seems to line up. here's that particular thread: https://boinc.berkeley.edu/dev/forum_thread.php?id=4133 Linux laptop: record uptime: 1511d 20h 19m (ended due to the power brick giving-up) ID: 2012289 ·

Oddbjornik Volunteer tester Send message Joined: 15 May 99 Posts: 220 Credit: 349,610,548 RAC: 1,728	Message 2012290 - Posted: 17 Sep 2019, 7:45:30 UTC - in response to Message 2012286. So now what? I had a problem very similar to this on my Mint hosts. That problem went away when I installed the libnsspem library. sudo apt-get install libnsspem As I recall it, the http_debug output in my case complained that it couldn't find libnsspem.so, so it may not have been exactly the same as your problem... ID: 2012290 ·

©2024 University of California

SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.