Network Drano (Mar 27 2007)
Author | Message |
---|---|
Matt Lebofsky Joined: 1 Mar 99 Posts: 1444 Credit: 957,058 RAC: 0 |
Usual database backup outage today, except we took some extra time to do a couple of things. First, we powered sidious down and back up to measure its current draw. Peaks at about 8 amps during drive spin-up. Then Bob and I did a bunch of tests, comparing table sizes and sums/averages of selected fields to confirm the replica is indeed in sync with the master BOINC database. Looks good. Upon coming back up I eventually noticed most of the file uploads were timing out on bruno. Jeff and I battled with this for a bit. We followed several red herrings and tuned various Apache/TCP parameters, but eventually the solution was cleaning up some nested symlinks that contained a mount that fell away sometime recently. We think. Anyway, we cleaned up these links and that immediately fixed the problem. During all that kryten was working fine. It is still getting hit by a small but significant number of BOINC clients, probably due to libcurl DNS caching within the client - something we should probably fix sooner or later. By the way, this might have also been why the validator queue has been growing over the past day or so. That emptied immediately, too, though that forced a backlog in the deleter queues. I had to kick those just now to pick up the new symlink as well. Backing up the science database today and will make changes tomorrow. Will test the changes (re: the splitter, assimilator, and validator) on kryten before implementing on bruno (later in the week or next week). - Matt -- BOINC/SETI@home network/web/science/development person -- "Any idiot can have a good idea. What is hard is to do it." - Jeanne-Claude |
bicyclist Joined: 3 Jun 01 Posts: 3 Credit: 200,695 RAC: 0 |
sweet! yes |
gomeyer Joined: 21 May 99 Posts: 488 Credit: 50,370,425 RAC: 0 |
". . . During all that kryten was working fine. It is still getting hit by a small but significant number of BOINC clients, probably due to libcurl DNS caching within the client - something we should probably fix sooner or later. . . ." Anything we can do to confirm we are "talking" to the correct server? Anything we can do to correct it if not? Tks, Gus Obermeyer EDIT: already tried the /flushdns trick in case that's it. /EDIT |
John McLeod VII Joined: 15 Jul 99 Posts: 24806 Credit: 790,712 RAC: 0 |
". . . During all that kryten was working fine. It is still getting hit by a small but significant number of BOINC clients, probably due to libcurl DNS caching within the client - something we should probably fix sooner or later. . . ."
To correct the problem:
1. Stop BOINC.
2. Flush the DNS at your router (for router appliances, this is usually a power cycle).
3. Run "IPConfig /flushdns" from a command prompt (drop the quotes).
4. Start BOINC.
I have no idea how to tell if you are talking to the correct server. BOINC WIKI |
gomeyer Joined: 21 May 99 Posts: 488 Credit: 50,370,425 RAC: 0 |
". . . During all that kryten was working fine. It is still getting hit by a small but significant number of BOINC clients, probably due to libcurl DNS caching within the client - something we should probably fix sooner or later. . . ." Well, as I said in my edit, I've already tried flushing DNS. The "correct server" is Bruno, or at least NOT Kryten. I will power cycle the router tho, thanks. |
Michael Gmirkin Joined: 18 Dec 03 Posts: 50 Credit: 8,956,363 RAC: 26 |
". . . During all that kryten was working fine. It is still getting hit by a small but significant number of BOINC clients, probably due to libcurl DNS caching within the client - something we should probably fix sooner or later. . . ." For some reason I seem to recall a note a few weeks back (when folks were having issues with a server/IP switch or something) saying to first type "ipconfig /flushdns" and then type "ipconfig /registerdns". Will it kill things if the latter isn't done? Just wondering. Sounded like the prior posts forgot the last step. Didn't know if it was crucial or not...? ~Michael If there were no time, how old would YOU be? ~Me |
Gary Joined: 7 Mar 07 Posts: 1 Credit: 6,815 RAC: 0 |
I just tried those commands - no joy here - I still cannot d/load new WU's. I just switched 'No New Tasks' on, I will leave it a while - maybe there is a backlog now of people trying to d/load WU's. |
Walla Joined: 14 May 06 Posts: 329 Credit: 177,013 RAC: 0 |
I don't think the latter has to be done. I didn't do it. BOINC can still access a reference site but not the SETI servers. From Microsoft: "/flushdns: Flushes and resets the contents of the DNS client resolver cache. During DNS troubleshooting, you can use this procedure to discard negative cache entries from the cache, as well as any other entries that have been added dynamically." |
gomeyer Joined: 21 May 99 Posts: 488 Credit: 50,370,425 RAC: 0 |
I remember that as well. But unless I'm mistaken later on it was generally agreed that the /registerdns command was not really necessary; that would happen by itself. BTW, on 6 of my 8 machines I am not able to upload or download a single unit. Zero. The cricket graph has started to flatten out so things should be moving better than that by now. I've power cycled the modem and router, have flushed dns, and finally tried rebooting. Still nothing. Since I've seldom had any trouble in the past my guess is that something is still wrong at Berkeley. |
zoom3+1=4 Joined: 30 Nov 03 Posts: 65709 Credit: 55,293,173 RAC: 49 |
". . . During all that kryten was working fine. It is still getting hit by a small but significant number of BOINC clients, probably due to libcurl DNS caching within the client - something we should probably fix sooner or later. . . ."
I've tried all those and then some. Yeah, I unplugged the BEFSR81 and that was a waste of time. I get "system connect" and "http error" failures, though I did get through at least once. See below:
Microsoft Windows [Version 5.2.3790]
(C) Copyright 1985-2003 Microsoft Corp.
C:\Documents and Settings\Administrator.BATPC1>ipconfig /registerdns
Windows IP Configuration
Registration of the DNS resource records for all adapters of this computer has been initiated. Any errors will be reported in the Event Viewer in 15 minutes..
C:\Documents and Settings\Administrator.BATPC1>ping setiathome.berkeley.edu
Pinging setiathome.SSL.berkeley.edu [128.32.18.152] with 32 bytes of data:
Reply from 128.32.18.152: bytes=32 time=75ms TTL=241
Reply from 128.32.18.152: bytes=32 time=77ms TTL=241
Reply from 128.32.18.152: bytes=32 time=134ms TTL=241
Reply from 128.32.18.152: bytes=32 time=72ms TTL=241
Ping statistics for 128.32.18.152:
Packets: Sent = 4, Received = 4, Lost = 0 (0% loss),
Approximate round trip times in milli-seconds:
Minimum = 72ms, Maximum = 134ms, Average = 89ms
C:\Documents and Settings\Administrator.BATPC1>
3/27/2007 6:55:10 PM||Access to reference site succeeded - project servers may be temporarily down.
3/27/2007 6:55:22 PM||Project communication failed: attempting access to reference site
3/27/2007 6:55:22 PM|SETI@home|[file_xfer] Temporarily failed upload of 27se04ab.11205.9842.236066.3.255_3_0: http error
3/27/2007 6:55:22 PM|SETI@home|Backing off 2 min 22 sec on upload of file 27se04ab.11205.9842.236066.3.255_3_0
3/27/2007 6:55:22 PM|SETI@home|[file_xfer] Started download of file 03oc03aa.4127.1522.473568.3.42
3/27/2007 6:55:23 PM||Access to reference site succeeded - project servers may be temporarily down.
3/27/2007 6:55:44 PM||Project communication failed: attempting access to reference site
3/27/2007 6:55:44 PM|SETI@home|[file_xfer] Temporarily failed download of 03oc03aa.4127.1522.473568.3.42: system connect
3/27/2007 6:55:44 PM|SETI@home|Backing off 2 hr 3 min 17 sec on download of file 03oc03aa.4127.1522.473568.3.42
3/27/2007 6:55:44 PM|SETI@home|[file_xfer] Started download of file 03no03aa.5034.11282.742326.3.111
3/27/2007 6:55:45 PM||Access to reference site succeeded - project servers may be temporarily down.
3/27/2007 6:56:06 PM||Project communication failed: attempting access to reference site
3/27/2007 6:56:06 PM|SETI@home|[file_xfer] Temporarily failed download of 03no03aa.5034.11282.742326.3.111: system connect
3/27/2007 6:56:06 PM|SETI@home|Backing off 36 min 6 sec on download of file 03no03aa.5034.11282.742326.3.111
3/27/2007 6:56:08 PM||Access to reference site succeeded - project servers may be temporarily down.
3/27/2007 6:56:08 PM|SETI@home|[file_xfer] Started upload of file 27se04ab.11205.9842.236066.3.59_3_0
3/27/2007 6:56:30 PM||Project communication failed: attempting access to reference site
3/27/2007 6:56:30 PM|SETI@home|[file_xfer] Temporarily failed upload of 27se04ab.11205.9842.236066.3.59_3_0: system connect
3/27/2007 6:56:30 PM|SETI@home|Backing off 14 min 30 sec on upload of file 27se04ab.11205.9842.236066.3.59_3_0
3/27/2007 6:56:32 PM||Access to reference site succeeded - project servers may be temporarily down.
3/27/2007 6:57:41 PM|SETI@home|[file_xfer] Finished upload of file 27se04ab.11205.9794.567336.3.148_0_0
3/27/2007 6:57:41 PM|SETI@home|[file_xfer] Throughput 1934 bytes/sec
3/27/2007 6:57:44 PM|SETI@home|[file_xfer] Started upload of file 27se04ab.11205.9842.236066.3.255_3_0
3/27/2007 6:57:44 PM|SETI@home|Sending scheduler request: To report completed tasks
3/27/2007 6:57:44 PM|SETI@home|Reporting 1 tasks
3/27/2007 6:57:55 PM|SETI@home|Scheduler RPC succeeded [server version 509]
3/27/2007 6:57:55 PM|SETI@home|Deferring communication 11 sec, because requested by project
3/27/2007 6:58:06 PM||Project communication failed: attempting access to reference site
3/27/2007 6:58:06 PM|SETI@home|[file_xfer] Temporarily failed upload of 27se04ab.11205.9842.236066.3.255_3_0: system connect
3/27/2007 6:58:06 PM|SETI@home|Backing off 2 min 28 sec on upload of file 27se04ab.11205.9842.236066.3.255_3_0
3/27/2007 6:58:08 PM||Access to reference site succeeded - project servers may be temporarily down.
3/27/2007 6:58:14 PM|SETI@home|[file_xfer] Started download of file 03oc03aa.4127.1522.473568.3.45
The T1 Trust, PRR T1 Class 4-4-4-4 #5550, 1 of America's First HST's |
Walla Joined: 14 May 06 Posts: 329 Credit: 177,013 RAC: 0 |
Everything seems to be flowing nicely now. |
gomeyer Joined: 21 May 99 Posts: 488 Credit: 50,370,425 RAC: 0 |
Yup, mine also. Thanks to whomever kicked the box. |
Labbie Joined: 19 Jun 06 Posts: 4083 Credit: 5,930,102 RAC: 0 |
I just got all mine to go thru too. |
zoom3+1=4 Joined: 30 Nov 03 Posts: 65709 Credit: 55,293,173 RAC: 49 |
My huge backlog is shrinking, Just like Jupiter in 2010. Nope, strike that last statement, I'm done. The T1 Trust, PRR T1 Class 4-4-4-4 #5550, 1 of America's First HST's |
W-K 666 Joined: 18 May 99 Posts: 19012 Credit: 40,757,560 RAC: 67 |
Matt, Good work on clearing problems etc. But looks like the validator problems have re-appeared. Andy |
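Matt's opening post mentions comparing table sizes and sums/averages of selected fields to confirm the replica is in sync with the master BOINC database. The thread does not show how that was done, so the following is only a minimal sketch of the idea. It uses Python's built-in `sqlite3` purely so it runs without a server, and the table `demo_result`, the column `credit`, and the helper names are all hypothetical.

```python
import sqlite3


def table_fingerprint(conn, table, column):
    """Return (row count, sum, average) for one numeric column of one table."""
    # Table and column names cannot be bound as SQL parameters, so they are
    # interpolated here; pass only trusted, hard-coded identifiers.
    query = f"SELECT COUNT(*), SUM({column}), AVG({column}) FROM {table}"
    return conn.execute(query).fetchone()


def find_mismatches(master, replica, checks):
    """Compare fingerprints for each (table, column) pair; return those that differ."""
    mismatches = []
    for table, column in checks:
        m = table_fingerprint(master, table, column)
        r = table_fingerprint(replica, table, column)
        if m != r:
            mismatches.append((table, column, m, r))
    return mismatches


def make_demo_db(credits):
    """Hypothetical stand-in database with one table: demo_result(id, credit)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE demo_result (id INTEGER PRIMARY KEY, credit REAL)")
    conn.executemany(
        "INSERT INTO demo_result (credit) VALUES (?)", [(c,) for c in credits]
    )
    conn.commit()
    return conn
```

Calling `find_mismatches(make_demo_db([1.0, 2.5]), make_demo_db([1.0, 2.5]), [("demo_result", "credit")])` returns an empty list. Matching aggregates are a cheap sanity check rather than proof that two copies are identical, which fits the post's "Looks good" wording.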
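Matt's post traces the upload timeouts on bruno to nested symbolic links that pointed into a mount that had gone away. A hedged sketch of how such links could be spotted is below: it walks a directory tree and reports every link whose final target no longer exists. The function names and the throwaway directory layout in `demo` are hypothetical and say nothing about the real paths on bruno.

```python
import os
import tempfile


def find_dangling_symlinks(root):
    """Yield (link_path, immediate_target) for symlinks under root that resolve to nothing."""
    for dirpath, dirnames, filenames in os.walk(root, followlinks=False):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            # os.path.exists follows the whole chain, so a link to a link to a
            # missing target is reported as well as the innermost link.
            if os.path.islink(path) and not os.path.exists(path):
                yield path, os.readlink(path)


def demo():
    """Build a two-level link chain, remove its target, and list what dangles."""
    with tempfile.TemporaryDirectory() as root:
        target = os.path.join(root, "data")
        os.mkdir(target)
        os.symlink(target, os.path.join(root, "inner"))  # inner -> data
        os.symlink(os.path.join(root, "inner"), os.path.join(root, "outer"))  # outer -> inner
        os.rmdir(target)  # stands in for the mount that fell away
        return sorted(os.path.basename(p) for p, _ in find_dangling_symlinks(root))
```

Both links are reported once the target is removed, which matches the nested case described in the post. Creating symlinks on Windows may need extra privileges, so the demo is best tried on a Unix-like system.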
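Gus asks whether there is any way to confirm a machine is "talking" to the correct server, and nobody in the thread offers one. A partial check, sketched here under stated assumptions, is to ask the operating system's resolver what a hostname currently maps to and compare that with an address you expect. The helper names are hypothetical, and which address is actually the right one for uploads is something only the project staff can confirm.

```python
import socket


def resolve_ipv4(hostname, resolver=socket.getaddrinfo):
    """Return the sorted, de-duplicated IPv4 addresses the resolver gives for hostname."""
    infos = resolver(hostname, 80, socket.AF_INET, socket.SOCK_STREAM)
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr is (ip, port).
    return sorted({info[4][0] for info in infos})


def check_host(hostname, expected_ip, resolver=socket.getaddrinfo):
    """Return (matches, addresses): whether expected_ip is among the resolved addresses."""
    addresses = resolve_ipv4(hostname, resolver)
    return expected_ip in addresses, addresses
```

For example, `check_host("setiathome.berkeley.edu", "128.32.18.152")` compares the current lookup against the address that appears in the ping output posted above. This only reflects the operating system's view. Per Matt's post, the BOINC client may be holding its own cached lookup inside libcurl, so a matching result here does not prove a running client is using the same address; stopping and restarting BOINC, as suggested in the thread, is still the safer step.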
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.