Panic Mode On (109) Server Problems?
Author | Message |
---|---|
Stephen "Heretic" Send message Joined: 20 Sep 12 Posts: 5557 Credit: 192,787,363 RAC: 628 |
. . Exactly the same here. I think Richard is on the right track when he says Vader has decided to take a holiday. Stephen :) |
Grant (SSSF) Send message Joined: 19 Aug 99 Posts: 13854 Credit: 208,696,464 RAC: 304 |
The Scheduler however is still randomly refusing to allocate work, but it looks like someone has been to work this morning (or at least logged in) as the Average-turnaround and Received-last-hour numbers are updating again. And they're back from leave again. We'll see how long they last this time. Grant Darwin NT |
Chris904395093209d Send message Joined: 1 Jan 01 Posts: 112 Credit: 29,923,129 RAC: 6 |
Added 208.68.240.119 to my hosts file 3 hours ago, haven't had a problem downloading since. I just commented it out to see what happens overnight. 123 work units in my cache at the moment. We'll see what happens over the next 9 or so hours. ~Chris |
Grant (SSSF) Send message Joined: 19 Aug 99 Posts: 13854 Credit: 208,696,464 RAC: 304 |
The Scheduler however is still randomly refusing to allocate work, but it looks like someone has been to work this morning (or at least logged in) as the Average-turnaround and Received-last-hour numbers are updating again. About as long as last time. Looks like they start to update again, then stop almost straight away. Grant Darwin NT |
Richard Haselgrove Send message Joined: 4 Jul 99 Posts: 14679 Credit: 200,643,578 RAC: 874 |
Have either of you two actually read the recent posts in this thread? We know what the problem is, we know how to work round it. You can do that too. I've been having problems for the last 2 days with stuck downloads on different machines. I have to manually go into each machine and retry downloads. I get the impression that too many work units try to download at the same time and some get blocked, so they get moved to retry in X minutes. Unfortunately, they get stuck there and never complete the retry. So manually doing it corrects the issue. My 2 cents. +1 Same experience here. |
Chris904395093209d Send message Joined: 1 Jan 01 Posts: 112 Credit: 29,923,129 RAC: 6 |
Added 208.68.240.119 to my hosts file 3 hours ago, haven't had a problem downloading since. I just commented it out to see what happens overnight. 123 work units in my cache at the moment. We'll see what happens over the next 9 or so hours. 1 work unit got stuck overnight and my Windows 10 machine was down to 59 work units. For whatever reason, this machine can't let go of Vader when it gets a stuck work unit. My Linux machines hit Vader too, but within an hour or 2 the auto retry works for them. If I had time today I would look to see if a connection from the Windows 10 machine can be closed sooner (I thought I saw something about a 300 second timeout for connections) or whether some other tweak could be made to the auto retry. ~Chris |
Keith Myers Send message Joined: 29 Apr 01 Posts: 13164 Credit: 1,160,866,277 RAC: 1,873 |
@Chris, as Richard said, there is no need to do any further post-mortem on the problem. We know what the issue is with the download servers. Simple solution is to just comment out Vader server at 208.68.240.127 in the Hosts file and wait for staff to get it working next week. Seti@Home classic workunits:20,676 CPU time:74,226 hours A proud member of the OFA (Old Farts Association) |
Chris904395093209d Send message Joined: 1 Jan 01 Posts: 112 Credit: 29,923,129 RAC: 6 |
@Chris, as Richard said, there is no need to do any further post-mortem on the problem. We know what the issue is with the download servers. Simple solution is to just comment out Vader server at 208.68.240.127 in the Hosts file and wait for staff to get it working next week. Sorry, wasn't really a post about the server side problem. But more of a learning opportunity for me on how the client side works - not just with BOINC or SETI, but with networking, specifically with DNS, load balancing, and proxy servers. All of which I had at home just for fun to see how they work, but it's been a while. ~Chris |
JaundicedEye Send message Joined: 14 Mar 12 Posts: 5375 Credit: 30,870,693 RAC: 1 |
Again, exposing my ignorance, I am not a code writer or analyst. I can however follow simple directions when I know where to look and what to change. Where exactly is the 'hosts' file located, and how do I 'comment out' 208.68.240.127? Some of us are not computer scientists but know enough to 'get around' and really want to contribute to the project, but in posts about workarounds and fixes the explanation sometimes assumes a level of basic knowledge that some of us just don't have........IMHO. Thanks for any assistance. "Sour Grapes make a bitter Whine." <(0)> |
Brent Norman Send message Joined: 1 Dec 99 Posts: 2786 Credit: 685,657,289 RAC: 835 |
I think most of us are using the info Richard provided in the summer about bypassing the DNS servers and going direct. http://setiathome.berkeley.edu/forum_thread.php?id=81638&postid=1875152 EDIT: And this. http://setiathome.berkeley.edu/forum_thread.php?id=81638&postid=1875158 |
Richard Haselgrove Send message Joined: 4 Jul 99 Posts: 14679 Credit: 200,643,578 RAC: 874 |
It's generic, and very basic, TCP/IP networking. Add the hosts file to your list. @Chris, as Richard said, there is no need to do any further post-mortem on the problem. We know what the issue is with the download servers. Simple solution is to just comment out Vader server at 208.68.240.127 in the Hosts file and wait for staff to get it working next week. Sorry, wasn't really a post about the server side problem. But more of a learning opportunity for me on how the client side works - not just with BOINC or SETI, but with networking, specifically with DNS, load balancing, and proxy servers. All of which I had at home just for fun to see how they work, but it's been a while. The unusual thing is the way SETI uses round-robin DNS. The best way to see this is from the command prompt. Type 'ipconfig /flushdns' to clear the DNS lookups cached over the last 24 hours - it makes life easier. Then trigger or retry a SETI download. Now type 'ipconfig /displaydns'. You should, among the chatter, see: Windows IP Configuration boinc2.ssl.berkeley.edu ---------------------------------------- Record Name . . . . . : boinc2.ssl.berkeley.edu Record Type . . . . . : 1 Time To Live . . . . : 120 Data Length . . . . . : 4 Section . . . . . . . : Answer A (Host) Record . . . : 208.68.240.127 Record Name . . . . . : boinc2.ssl.berkeley.edu Record Type . . . . . : 1 Time To Live . . . . : 120 Data Length . . . . . : 4 Section . . . . . . . : Answer A (Host) Record . . . : 208.68.240.119 Two download IP addresses. BOINC (or any other Windows program) will try the first one. In this case, Vader - and the downloads failed. But also note the very short TTL (Time To Live, in seconds). Windows effectively never caches that IP address (although BOINC does). A new download will fetch a new DNS response, and may get the IPs the other way round - and succeed. Edit - Yup. Went back to BOINC after posting that, and clicked 'retry now' for just one file. All 40 downloaded at the first attempt. 
07/01/2018 17:44:22 | SETI@home | Backing off 00:03:39 on download of blc05_2bit_guppi_57976_10703_HIP74981_0036.11026.818.22.45.128.vlar 07/01/2018 17:54:07 | SETI@home | Started download of blc05_2bit_guppi_57976_11502_HIP91971_0038.11545.818.22.45.109.vlar 07/01/2018 17:54:10 | SETI@home | Finished download of blc05_2bit_guppi_57976_11502_HIP91971_0038.11545.818.22.45.109.vlar 10 minutes would seem to be a good time to wait. |
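The round-robin behaviour Richard describes can be sketched in Python. This is only an illustration under stated assumptions: `resolve_ipv4` and `fake_round_robin` are hypothetical helper names, and the stand-in resolver rotates its answers strictly on every call, which a real DNS server and the Windows resolver are not guaranteed to do. Only `socket.getaddrinfo` is a real library call.

```python
import socket


def resolve_ipv4(hostname, resolver=socket.getaddrinfo):
    """Return the IPv4 addresses for hostname, in the order the resolver gave them."""
    infos = resolver(hostname, 80, socket.AF_INET, socket.SOCK_STREAM)
    addresses = []
    for _family, _type, _proto, _canonname, sockaddr in infos:
        ip = sockaddr[0]
        if ip not in addresses:  # drop duplicates, keep first-seen order
            addresses.append(ip)
    return addresses


def fake_round_robin(answers):
    """Hypothetical stand-in resolver: rotates the answer list on every call."""
    state = {"calls": 0}

    def resolver(hostname, port, family=0, type=0, proto=0, flags=0):
        shift = state["calls"] % len(answers)
        state["calls"] += 1
        rotated = answers[shift:] + answers[:shift]
        return [(socket.AF_INET, socket.SOCK_STREAM, 6, "", (ip, port))
                for ip in rotated]

    return resolver


if __name__ == "__main__":
    # The two download addresses from the ipconfig output quoted in the thread.
    resolver = fake_round_robin(["208.68.240.127", "208.68.240.119"])
    for attempt in (1, 2):
        first = resolve_ipv4("boinc2.ssl.berkeley.edu", resolver)[0]
        print(f"attempt {attempt}: a client would try {first} first")
```

Passing no `resolver` argument makes `resolve_ipv4` ask the operating system, so the order you see then depends on your own DNS cache and hosts file.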
Keith Myers Send message Joined: 29 Apr 01 Posts: 13164 Credit: 1,160,866,277 RAC: 1,873 |
This goes back to posts made a couple of years ago (guessing), when we had issues with the two download servers and the way they handle download requests in a round-robin order. The fix was to list the servers in your Hosts file and not rely on DNS discovery mechanisms. The Hosts file is in the C:\Windows\System32\drivers\etc directory. Use Notepad to edit it and add these entries: 208.68.240.118 setiboincdata.ssl.berkeley.edu # upload server Oct 2016 208.68.240.119 boinc2.ssl.berkeley.edu # Georgem download server Oct 2016 208.68.240.126 setiboinc.ssl.berkeley.edu # scheduler Oct 2016 #208.68.240.127 vader.ssl.berkeley.edu # Vader download server Oct 2016 The hash sign (#) in front of the 208.68.240.127 address means the line is commented out and is not read or acted upon. Vader is the server that is causing the stalled downloads. For now, you should only use the Georgem download server. You can do an ipconfig /flushdns from the command line to make sure the Hosts file is read and your old DNS cache is flushed. The downloads will start working again. You can check to see which download server is being used to get work by setting the http_debug flag option in the Event Log options menu. Seti@Home classic workunits:20,676 CPU time:74,226 hours A proud member of the OFA (Old Farts Association) |
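To see which of the entries above are actually in force, here is a minimal Python sketch of how a hosts file is read: anything after a `#` is ignored, so a line that starts with `#` contributes nothing. The function name `active_mappings` is hypothetical, the sample text is Keith's four lines, and a real resolver may handle edge cases this sketch does not.

```python
def active_mappings(hosts_text):
    """Return {hostname: ip} for every hosts-file line that is not commented out."""
    mappings = {}
    for line in hosts_text.splitlines():
        # Text after '#' is a comment; a line starting with '#' becomes empty.
        fields = line.split("#", 1)[0].split()
        if len(fields) >= 2:
            ip, *names = fields
            for name in names:
                mappings[name] = ip
    return mappings


HOSTS_SAMPLE = """\
208.68.240.118 setiboincdata.ssl.berkeley.edu # upload server Oct 2016
208.68.240.119 boinc2.ssl.berkeley.edu # Georgem download server Oct 2016
208.68.240.126 setiboinc.ssl.berkeley.edu # scheduler Oct 2016
#208.68.240.127 vader.ssl.berkeley.edu # Vader download server Oct 2016
"""

if __name__ == "__main__":
    for name, ip in active_mappings(HOSTS_SAMPLE).items():
        print(f"{name} -> {ip}")
```

With the sample as written, the Vader line is skipped and the download hostname boinc2.ssl.berkeley.edu is pinned to the Georgem address only.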
Richard Haselgrove Send message Joined: 4 Jul 99 Posts: 14679 Credit: 200,643,578 RAC: 874 |
Use Notepad to edit it ... Run Notepad 'As Administrator' on Windows 7 or later - otherwise you won't be able to save edits. You can do an ipconfig /flushdns from the command line to make sure the Hosts file is read and your old DNS cache is flushed. The downloads will start working again. You can check to see which download server is being used to get work by setting the http_debug flag option in the Event Log options menu. Very rarely needed. The new values in the hosts file are read automatically whenever you save it, and supersede any previous values - whether from the file or cached. |
JaundicedEye Send message Joined: 14 Mar 12 Posts: 5375 Credit: 30,870,693 RAC: 1 |
OK. I successfully inserted the script Keith provided, flushed the DNS cache and got confirmation of the change to the hosts file from my WinPatrol app. I only changed one machine and will monitor it for hangs. Thanks again for the detailed help. JE. Edit: Strangely, the machine I did not change has not encountered any hung downloads. "Sour Grapes make a bitter Whine." <(0)> |
Keith Myers Send message Joined: 29 Apr 01 Posts: 13164 Credit: 1,160,866,277 RAC: 1,873 |
Glad to hear your downloads are working again. Shoutout to @Richard for pointing out Hosts is a protected access file that needs Administrator privileges to edit. Seti@Home classic workunits:20,676 CPU time:74,226 hours A proud member of the OFA (Old Farts Association) |
kittyman Send message Joined: 9 Jul 00 Posts: 51478 Credit: 1,018,363,574 RAC: 1,004 |
Yes, thank you, Richard, for the reminder. I had never used the hosts file in the past, but I am now, and it seems to have done the trick. "Time is simply the mechanism that keeps everything from happening all at once." |
Jord Send message Joined: 9 Jun 99 Posts: 15184 Credit: 4,362,181 RAC: 3 |
07/01/2018 20:06:53 | SETI@home | Not requesting tasks: some download is stalled 07/01/2018 20:06:57 | SETI@home | Scheduler request completed 07/01/2018 20:07:59 | SETI@home | Started download of blc05_2bit_guppi_57976_10027_HIP74981_0034.26512.818.21.44.135.vlar 07/01/2018 20:08:01 | SETI@home | Temporarily failed download of blc05_2bit_guppi_57976_10027_HIP74981_0034.26512.818.21.44.135.vlar: transient HTTP error 07/01/2018 20:08:01 | SETI@home | Backing off 00:16:07 on download of blc05_2bit_guppi_57976_10027_HIP74981_0034.26512.818.21.44.135.vlar 07/01/2018 20:08:02 | | Project communication failed: attempting access to reference site 07/01/2018 20:08:04 | | Internet access OK - project servers may be temporarily down. |
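Event Log lines like these can be scanned for stalled downloads. The Python sketch below assumes the "Backing off HH:MM:SS on download of FILE" wording seen in this thread; the pattern and the function name `backoffs` are illustrative, not part of BOINC.

```python
import re
from datetime import timedelta

# Pattern based on the log wording quoted in this thread (an assumption).
BACKOFF = re.compile(r"Backing off (\d+):(\d{2}):(\d{2}) on download of (\S+)")


def backoffs(log_text):
    """Return (filename, delay) pairs for every download back-off in the log."""
    results = []
    for match in BACKOFF.finditer(log_text):
        hours, minutes, seconds, name = match.groups()
        delay = timedelta(hours=int(hours), minutes=int(minutes), seconds=int(seconds))
        results.append((name, delay))
    return results


LOG_SAMPLE = (
    "07/01/2018 20:08:01 | SETI@home | Backing off 00:16:07 on download of "
    "blc05_2bit_guppi_57976_10027_HIP74981_0034.26512.818.21.44.135.vlar\n"
)

if __name__ == "__main__":
    for name, delay in backoffs(LOG_SAMPLE):
        print(f"{name}: retry in {delay}")
```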
rob smith Send message Joined: 7 Mar 03 Posts: 22527 Credit: 416,307,556 RAC: 380 |
I too just suffered a "bump" like that on one of my crunchers. It appears to have cleared now, but..... Bob Smith Member of Seti PIPPS (Pluto is a Planet Protest Society) Somewhere in the (un)known Universe? |
Richard Haselgrove Send message Joined: 4 Jul 99 Posts: 14679 Credit: 200,643,578 RAC: 874 |
Yes, thank you, Richard, for the reminder. Yes, it can help. I recommend that you (and every other new user) comment out the active lines as soon as this particular problem is over - they would cause a problem if it's GeorgeM that falls over next time, and prevent Vader taking over. I've let Eric enjoy his weekend in peace - unless he's reading this thread - but I'm going to email him and Jeff to say that one of our servers is missing, before the lab opens for business tomorrow. I'll keep you posted. |
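Commenting out the active lines, as recommended here, is a plain text edit. A minimal Python sketch follows; it works on the text only and does not touch the real hosts file, and the function name `comment_out` and the sample text are hypothetical.

```python
def comment_out(hosts_text, hostnames):
    """Prefix '#' to every active hosts line that maps one of the given hostnames."""
    wanted = set(hostnames)
    out = []
    for line in hosts_text.splitlines():
        # Ignore anything after '#'; already-commented lines yield no fields.
        fields = line.split("#", 1)[0].split()
        # fields[0] is the IP address; the rest are the names mapped to it.
        if len(fields) >= 2 and wanted.intersection(fields[1:]):
            line = "#" + line
        out.append(line)
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    sample = (
        "208.68.240.119 boinc2.ssl.berkeley.edu # Georgem download server\n"
        "127.0.0.1 localhost\n"
    )
    print(comment_out(sample, ["boinc2.ssl.berkeley.edu"]), end="")
```

Running it twice on the same text changes nothing the second time, because a line that already starts with `#` no longer has any active fields.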
kittyman Send message Joined: 9 Jul 00 Posts: 51478 Credit: 1,018,363,574 RAC: 1,004 |
Yes, thank you, Richard, for the reminder. Yes, it can help. Will do, Richard. I shall watch this thread for news that the problem has been fixed, and neuter the hosts file. "Time is simply the mechanism that keeps everything from happening all at once." |
©2024 University of California
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.