Panic Mode On (26) Server problems

Link
Joined: 18 Sep 03
Posts: 834
Credit: 1,807,369
RAC: 0
Germany
Message 950609 - Posted: 28 Nov 2009, 21:43:30 UTC - in response to Message 950592.  

Here is an interesting experiment:

At a command prompt, type:

ping boinc2.ssl.berkeley.edu


Don't worry about the ping times, just look at the address.

If you get .13 half the time, and .18 half the time, everything is fine.

If you do it ten times in a row, and get just one of the two answers, then your operating system is not honoring the fact that there are two "A" records.

What is my system honoring when I still get 83.175.236.201? Well, at least BOINC finds the right IPs. Don't ask me how that is possible...
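
For anyone who would rather script the experiment than retype ping, here is a minimal Python sketch. It assumes the two expected addresses are 208.68.240.13 and 208.68.240.18, as given elsewhere in this thread. The function names are made up for illustration, and socket.gethostbyname_ex only reports what the local resolver hands back, so a cached answer looks the same as a fresh one.

```python
import socket
from collections import Counter


def resolve_ipv4(hostname):
    """Return every IPv4 address the system resolver reports for hostname."""
    _name, _aliases, addresses = socket.gethostbyname_ex(hostname)
    return addresses


def summarize(observed, expected):
    """Classify a series of lookups.

    observed: one address per lookup (the address ping would have used)
    expected: the set of addresses the name is supposed to map to
    """
    counts = Counter(observed)
    unexpected = sorted(set(counts) - set(expected))
    if unexpected:
        return "unexpected address: " + ", ".join(unexpected)
    if len(counts) < len(expected):
        return "not all answers seen: " + ", ".join(sorted(counts))
    return "all answers seen"


def run_experiment(hostname, expected, attempts=10):
    """Look the name up several times and summarize the first address of each reply."""
    firsts = [resolve_ipv4(hostname)[0] for _ in range(attempts)]
    return summarize(firsts, expected)
```

Calling run_experiment("boinc2.ssl.berkeley.edu", {"208.68.240.13", "208.68.240.18"}) repeats the lookup ten times. An "unexpected address" result corresponds to the 83.175.236.201 case described above.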
ID: 950609
1mp0£173
Volunteer tester
Joined: 3 Apr 99
Posts: 8423
Credit: 356,897
RAC: 0
United States
Message 950611 - Posted: 28 Nov 2009, 21:48:38 UTC - in response to Message 950609.  

Here is an interesting experiment:

At a command prompt, type:

ping boinc2.ssl.berkeley.edu


Don't worry about the ping times, just look at the address.

If you get .13 half the time, and .18 half the time, everything is fine.

If you do it ten times in a row, and get just one of the two answers, then your operating system is not honoring the fact that there are two "A" records.

What is my system honoring when I still get 83.175.236.201? Well, at least BOINC finds the right IPs. Don't ask me how that is possible...

Is that a proxy server on your network (or at your provider)?

It's part of a small block that belongs to a book shop in Spain.

My guess: everything is at 83.175.236.201 from your point of view.
ID: 950611
Link
Joined: 18 Sep 03
Posts: 834
Credit: 1,807,369
RAC: 0
Germany
Message 950614 - Posted: 28 Nov 2009, 22:03:48 UTC - in response to Message 950611.  

No, no proxy. And no, everything else is not at 83.175.236.201; I just tested a few other addresses. But setiathome.berkeley.edu and even berkeley.edu also resolve to that IP.

So it's a book shop in Spain? OK... hopefully I don't need to understand that.
ID: 950614
1mp0£173
Volunteer tester
Joined: 3 Apr 99
Posts: 8423
Credit: 356,897
RAC: 0
United States
Message 950618 - Posted: 28 Nov 2009, 22:21:32 UTC
Last modified: 28 Nov 2009, 22:21:42 UTC

I got .18 every time.

So, I let things sit for a while (I have Windows set for a short cache time in the registry, as outlined in http://support.microsoft.com/kb/318803) and I'm now getting .13 consistently.

In five minutes, it'll probably be different.
ID: 950618
Gary Charpentier
Volunteer tester
Joined: 25 Dec 00
Posts: 30684
Credit: 53,134,872
RAC: 32
United States
Message 950623 - Posted: 28 Nov 2009, 22:43:51 UTC - in response to Message 950618.  

I got .18 every time.

So, I let things sit for a while (I have Windows set for a short cache time in the registry, as outlined in http://support.microsoft.com/kb/318803) and I'm now getting .13 consistently.

In five minutes, it'll probably be different.

I get boinc2.ssl.berkeley.edu (208.68.240.18) every time.

If Link is getting that weird IP then I suspect a bad DNS cache at his DNS provider. He might want to try switching DNS servers and see if that is the issue.

ID: 950623
1mp0£173
Volunteer tester
Joined: 3 Apr 99
Posts: 8423
Credit: 356,897
RAC: 0
United States
Message 950626 - Posted: 28 Nov 2009, 23:00:24 UTC - in response to Message 950623.  
Last modified: 28 Nov 2009, 23:00:40 UTC

I got .18 every time.

So, I let things sit for a while (I have Windows set for a short cache time in the registry, as outlined in http://support.microsoft.com/kb/318803) and I'm now getting .13 consistently.

In five minutes, it'll probably be different.

I get boinc2.ssl.berkeley.edu (208.68.240.18) every time.

I can't comment on Darwin, but for your Windows machines if you change the timeout per the knowledge base article, it should at least switch on the maximum timeout you set in the registry.
ID: 950626
Fred W
Volunteer tester
Joined: 13 Jun 99
Posts: 2524
Credit: 11,954,210
RAC: 0
United Kingdom
Message 950642 - Posted: 28 Nov 2009, 23:44:45 UTC

Well, OK, I've now re-tried it and got 10 out of 10 on .13 - with no lost packets.

So it looks like no problems on either .13 or .18 from this side of the pond...

F.
ID: 950642
1mp0£173
Volunteer tester
Joined: 3 Apr 99
Posts: 8423
Credit: 356,897
RAC: 0
United States
Message 950644 - Posted: 28 Nov 2009, 23:53:54 UTC - in response to Message 950642.  

Well, OK, I've now re-tried it and got 10 out of 10 on .13 - with no lost packets.

That just means that the physical server is running. Apache could have crashed, and the physical server would respond to pings.

The trick with the "hosts" file manually selects where the running Apache server is located.

The trick I've suggested just makes sure BOINC can try both without too much interference from the OS.

ID: 950644
jason_gee
Volunteer developer
Volunteer tester
Joined: 24 Nov 06
Posts: 7489
Credit: 91,093,184
RAC: 0
Australia
Message 950646 - Posted: 28 Nov 2009, 23:58:20 UTC - in response to Message 950642.  

...
So it looks like no problems on either .13 or .18 from this side of the pond...
...
.13 has been responding to pings the whole way through, but was handing out zero length files (last night anyway).
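
A ping only shows that the machine answers. To see whether the web server behind each address is actually handing out data, something along these lines could be used. It is only a sketch: the function names are invented for illustration, the path to request is left to the caller, and treating "status 200 with a non-empty body" as healthy is my assumption, not a rule from the project.

```python
import http.client


def fetch_status(ip, hostname, path="/", port=80, timeout=10):
    """Request path from one specific IP, sending hostname in the Host header.

    Returns (status_code, body_length).
    """
    conn = http.client.HTTPConnection(ip, port, timeout=timeout)
    try:
        conn.request("GET", path, headers={"Host": hostname})
        response = conn.getresponse()
        body = response.read()
        return response.status, len(body)
    finally:
        conn.close()


def looks_healthy(status, body_length):
    """Assumed rule of thumb: a good server answers 200 and sends a non-empty body."""
    return status == 200 and body_length > 0
```

For example, looks_healthy(*fetch_status("208.68.240.13", "boinc2.ssl.berkeley.edu")) would flag a server that responds but returns zero-length files.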

Curiously, I don't see a lot of Windows 7 users commenting here... could that mean the Lazy DNS cache issue is addressed in that OS ?

"Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions.
ID: 950646
Link
Joined: 18 Sep 03
Posts: 834
Credit: 1,807,369
RAC: 0
Germany
Message 950648 - Posted: 29 Nov 2009, 0:03:38 UTC

OK, something is really weird here.

1. I changed to the OpenDNS servers (my two other computers have been using them for a long time already).
2. Flushed the DNS cache (ipconfig /flushdns).
3. I still get this IP for anything at "berkeley.edu".

Then I ran some tests in a web browser. Since you can open a page by its name or by its IP, I tested several pages, including the SETI@home page. The result: I get the right IPs for every other page, and I can access them by IP or by name. For example, I get the same page for http://boinc.bakerlab.org/ and for http://140.142.20.103; both lead me to the R@H homepage. That doesn't work for setiathome.berkeley.edu. I can only access the page by name; if I use the IP (83.175.236.201), I get a page consisting of "<html><body><h1>It works!</h1></body></html>". So: if my computer thinks that http://setiathome.ssl.berkeley.edu/ is at 83.175.236.201, and the web browser cannot open the right page when I type in http://83.175.236.201, where does it get the right IP (whatever it is)?
ID: 950648
1mp0£173
Volunteer tester
Joined: 3 Apr 99
Posts: 8423
Credit: 356,897
RAC: 0
United States
Message 950649 - Posted: 29 Nov 2009, 0:11:11 UTC - in response to Message 950646.  

Curiously, I don't see a lot of Windows 7 users commenting here... could that mean the Lazy DNS cache issue is addressed in that OS ?

That would be a refreshing change.
ID: 950649
Fred W
Volunteer tester
Joined: 13 Jun 99
Posts: 2524
Credit: 11,954,210
RAC: 0
United Kingdom
Message 950650 - Posted: 29 Nov 2009, 0:14:29 UTC - in response to Message 950644.  

Well, OK, I've now re-tried it and got 10 out of 10 on .13 - with no lost packets.

That just means that the physical server is running. Apache could have crashed, and the physical server would respond to pings.

The trick with the "hosts" file manually selects where the running Apache server is located.

The trick I've suggested just makes sure BOINC can try both without too much interference from the OS.

Doh!! Chalk that one up as something I should have thought of - but didn't. Checked back through my log and, sure enough, about 1.5 hours ago I got
28/11/2009 22:22:32	SETI@home	Started download of 13mr07ab.2745.8661.9.10.192
28/11/2009 22:22:35		Project communication failed: attempting access to reference site


I have now set the registry not to cache negative responses, but I have no problem with positive responses being cached for the Windoze default of 24 hours.

F.
ID: 950650
1mp0£173
Volunteer tester
Joined: 3 Apr 99
Posts: 8423
Credit: 356,897
RAC: 0
United States
Message 950652 - Posted: 29 Nov 2009, 0:19:06 UTC - in response to Message 950650.  


Doh!! Chalk that one up as something I should have thought of - but didn't. Checked back through my log and, sure enough, about 1.5 hours ago I got
28/11/2009 22:22:32	SETI@home	Started download of 13mr07ab.2745.8661.9.10.192
28/11/2009 22:22:35		Project communication failed: attempting access to reference site


I have now set the registry not to cache negative responses, but I have no problem with positive responses being cached for the Windoze default of 24 hours.

F.

But, as far as DNS is concerned, this is not a negative response. It asked for an IP address, and it got an IP address.

You don't want to disable caching of negative responses: it means that if a name is mistyped, the OS will have to ask again, and again, and again when it could have just found the answer in the cache.

You want to limit the time on responses (both positive and negative) so that superficially valid, but not exactly useful answers don't outlast their welcome.

Five minutes (300 seconds) is probably a good starting place for both.
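
To illustrate why a cap on both kinds of answer helps, here is a toy cache in Python. It is a sketch of the idea only and says nothing about how the Windows resolver is implemented. The class name, the parameter names, and the injected clock are all hypothetical.

```python
import time


class CappedCache:
    """Toy resolver cache that caps how long any answer may be kept."""

    def __init__(self, max_ttl=300, max_negative_ttl=300, clock=time.monotonic):
        self.max_ttl = max_ttl
        self.max_negative_ttl = max_negative_ttl
        self.clock = clock
        self._entries = {}  # name -> (answer or None, expiry time)

    def store(self, name, answer, record_ttl):
        """Cache an answer; answer=None records a negative ("no such name") reply."""
        cap = self.max_negative_ttl if answer is None else self.max_ttl
        lifetime = min(record_ttl, cap)
        self._entries[name] = (answer, self.clock() + lifetime)

    def lookup(self, name):
        """Return (hit, answer); hit is False when the resolver must ask again."""
        entry = self._entries.get(name)
        if entry is None:
            return False, None
        answer, expires = entry
        if self.clock() >= expires:
            del self._entries[name]
            return False, None
        return True, answer
```

With both caps at 300 seconds, an address that was valid but points at a dead server is dropped after five minutes at most, while a mistyped name is still answered from the cache during that window instead of being looked up over and over.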
ID: 950652
Sutaru Tsureku
Volunteer tester
Joined: 6 Apr 07
Posts: 7105
Credit: 147,663,825
RAC: 5
Germany
Message 950653 - Posted: 29 Nov 2009, 0:19:25 UTC
Last modified: 29 Nov 2009, 0:59:10 UTC


Special 'Thank you' to Link:
message 950520 (http://setiathome.berkeley.edu/forum_thread.php?id=56286&nowrap=true#950520)
and Nick:
message 950527 (http://setiathome.berkeley.edu/forum_thread.php?id=56286&nowrap=true#950527)

...for explaining it for 'dummies' like me.
Also, 'Thank you' to everyone else here for their tips and hints!

I followed Nick's instructions and inserted both of Link's IPs, and the 3-day WU cache for my GPU cruncher (~1,800 WUs) has been downloading without problems for hours (~22:00 - 01:20 local time).
Because of my 'DSL light' (384/64 DL/UL), it will take maybe 6 hours or so.


Quick instructions (how I did it):

Exit BOINC completely.

Go to...
C:\Windows\System32\drivers\etc
...find the 'hosts' file and open it with 'Editor' (German; I guess it's 'Notepad' in the US/UK version?).

Insert at the bottom (copy/paste):
208.68.240.18 boinc2.ssl.berkeley.edu
208.68.240.13 boinc2.ssl.berkeley.edu

Save and close the file.

Reboot the PC.

Start BOINC and the downloads will start.
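
To check later which addresses are still pinned in the 'hosts' file (for example, before removing them again), a small Python helper like the following could be used. It is a minimal sketch: the function name is made up, and it only lists the entries in file order without saying which one the operating system will actually use.

```python
def hosts_entries(hosts_text, hostname):
    """Return the addresses listed for hostname, in file order."""
    wanted = hostname.lower()
    found = []
    for line in hosts_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and surrounding spaces
        if not line:
            continue  # skip blank and comment-only lines
        fields = line.split()
        address, names = fields[0], fields[1:]
        if wanted in (name.lower() for name in names):
            found.append(address)
    return found
```

Reading C:\Windows\System32\drivers\etc\hosts into a string and passing it in with "boinc2.ssl.berkeley.edu" returns the pinned addresses, which can then be compared with what DNS currently reports.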




One question, though..
Is it 'safe' to leave these two IPs there?
I mean, I haven't opened a door for bad people out there to reach my PCs?

Will BOINC work with these two IPs just as it did without them, or do I need to delete them in a few days, once the Berkeley crew has looked into the problem?

ID: 950653
1mp0£173
Volunteer tester
Joined: 3 Apr 99
Posts: 8423
Credit: 356,897
RAC: 0
United States
Message 950654 - Posted: 29 Nov 2009, 0:23:00 UTC - in response to Message 950653.  

One question, though..
Is it 'safe' to leave these two IPs there?
I mean, I haven't opened a door for bad people out there to reach my PCs?

Will BOINC work with these two IPs just as it did without them, or do I need to delete them in a few days, once the Berkeley crew has looked into the problem?

There are no security questions -- this does not open your PC to hackers.

There is a reliability issue, because SETI@Home may, without notice, change the server IP addresses.

In fact, the TTL setting on these "A" records says that they are absolutely guaranteed not to change for the next five minutes. After that, your computer should ask again.

ID: 950654
jason_gee
Volunteer developer
Volunteer tester
Joined: 24 Nov 06
Posts: 7489
Credit: 91,093,184
RAC: 0
Australia
Message 950655 - Posted: 29 Nov 2009, 0:29:46 UTC - in response to Message 950654.  
Last modified: 29 Nov 2009, 0:30:25 UTC

...
There is a reliability issue, because SETI@Home may, without notice, change the server IP addresses.
And thousands of Cisco teachers walking around looking annoyed for a day or so.
"Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions.
ID: 950655
Fred W
Volunteer tester
Joined: 13 Jun 99
Posts: 2524
Credit: 11,954,210
RAC: 0
United Kingdom
Message 950657 - Posted: 29 Nov 2009, 0:31:59 UTC - in response to Message 950652.  

You don't want to disable caching of negative responses: it means that if a name is mistyped, the OS will have to ask again, and again, and again when it could have just found the answer in the cache.

You want to limit the time on responses (both positive and negative) so that superficially valid, but not exactly useful answers don't outlast their welcome.

Five minutes (300 seconds) is probably a good starting place for both.

Point taken. And am about to edit my registry accordingly.

F.
ID: 950657
HAL9000
Volunteer tester
Joined: 11 Sep 99
Posts: 6534
Credit: 196,805,888
RAC: 57
United States
Message 950660 - Posted: 29 Nov 2009, 0:33:51 UTC

If all else fails, just a good ole:

net stop dnscache
SETI@home classic workunits: 93,865 CPU time: 863,447 hours
Join the BP6/VP6 User Group: http://tinyurl.com/8y46zvu
ID: 950660
1mp0£173
Volunteer tester
Joined: 3 Apr 99
Posts: 8423
Credit: 356,897
RAC: 0
United States
Message 950661 - Posted: 29 Nov 2009, 0:38:14 UTC - in response to Message 950655.  

...
There is a reliability issue, because SETI@Home may, without notice, change the server IP addresses.
And thousands of Cisco teachers walking around looking annoyed for a day or so.

Not just Cisco teachers.

It makes renumbering networks and moving servers a lot harder when things break the rules.
ID: 950661
Sutaru Tsureku
Volunteer tester
Joined: 6 Apr 07
Posts: 7105
Credit: 147,663,825
RAC: 5
Germany
Message 950662 - Posted: 29 Nov 2009, 0:38:21 UTC - in response to Message 950654.  

[...]
Will BOINC work with these two IPs just as it did without them, or do I need to delete them in a few days, once the Berkeley crew has looked into the problem?

[...]
There is a reliability issue, because SETI@Home may, without notice, change the server IP addresses.

In fact, the TTL setting on these "A" records says that they are absolutely guaranteed not to change for the next five minutes. After that, your computer should ask again.


You mean, if BOINC has a 'DL break' of 5 min., I need to exit BOINC and start it again? Maybe reboot the PC?


ID: 950662
 
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.