Configuring a PORT problem, or firewall

Message boards : Number crunching : Configuring a PORT problem, or firewall

Astro
Volunteer tester
Joined: 16 Apr 02
Posts: 8026
Credit: 600,015
RAC: 0
Message 198034 - Posted: 29 Nov 2005, 0:13:59 UTC

And to throw another wrinkle into this: two of the four affected users have attached to either PPAH or Einstein without issue, but still can't attach to SETI.
ID: 198034 · Report as offensive
Astro
Volunteer tester
Joined: 16 Apr 02
Posts: 8026
Credit: 600,015
RAC: 0
Message 198038 - Posted: 29 Nov 2005, 0:18:29 UTC

27/11/2005 21:41:45||Fetching config info from http://setiathome.berkeley.edu/get_project_config.php
27/11/2005 21:41:47||Missing account key
27/11/2005 21:42:03|http://setiathome.berkeley.edu/|Master file download succeeded

Here it contacts the project server and downloads the master file.

Then it sends a request to the scheduler and gets a 500 error:

27/11/2005 21:42:04|http://setiathome.berkeley.edu/|Sending scheduler request to http://setiboinc.ssl.berkeley.edu/sah_cgi/cgi
27/11/2005 21:42:04|http://setiathome.berkeley.edu/|Reason: Requested by user
27/11/2005 21:42:04|http://setiathome.berkeley.edu/|Requesting 8640 seconds of new work
27/11/2005 21:44:51|http://setiathome.berkeley.edu/|Scheduler request to http://setiboinc.ssl.berkeley.edu/sah_cgi/cgi failed with a return value of 500
27/11/2005 21:44:51|http://setiathome.berkeley.edu/|No schedulers responded
27/11/2005 21:44:51|http://setiathome.berkeley.edu/|Couldn't connect to URL http://setiathome.berkeley.edu/.Please check URL.

Since it can't connect, it detaches and resets.
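
For anyone sifting through longer logs, here is a minimal Python sketch that splits message lines like the ones above into timestamp, project, and text, and pulls out the scheduler return value. The pipe-separated layout is assumed from the excerpt; the helper names are made up for illustration and are not part of BOINC.

```python
import re
from datetime import datetime

# Hypothetical helpers; the "timestamp|project|message" layout is an
# assumption based on the log excerpt quoted in this post.
RETURN_VALUE = re.compile(r"failed with a return value of (\d+)")

def parse_line(line):
    """Split one message-log line into (timestamp, project, text)."""
    stamp, project, text = line.rstrip("\n").split("|", 2)
    return datetime.strptime(stamp, "%d/%m/%Y %H:%M:%S"), project, text

def scheduler_failures(lines):
    """Yield (timestamp, code) for each failed scheduler request."""
    for line in lines:
        when, _project, text = parse_line(line)
        match = RETURN_VALUE.search(text)
        if match:
            yield when, int(match.group(1))

sample = [
    "27/11/2005 21:42:04|http://setiathome.berkeley.edu/|Sending scheduler request to http://setiboinc.ssl.berkeley.edu/sah_cgi/cgi",
    "27/11/2005 21:44:51|http://setiathome.berkeley.edu/|Scheduler request to http://setiboinc.ssl.berkeley.edu/sah_cgi/cgi failed with a return value of 500",
]
failures = list(scheduler_failures(sample))
```

In the excerpt the request goes out at 21:42:04 and fails at 21:44:51, so the client waited 167 seconds before giving up. That would fit a timeout rather than an immediate refusal, though the log alone does not prove it.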
ID: 198038 · Report as offensive
nick
Volunteer tester
Joined: 22 Jul 05
Posts: 284
Credit: 3,902,174
RAC: 0
United States
Message 198087 - Posted: 29 Nov 2005, 1:11:07 UTC - in response to Message 198038.  

That happened to me on Linux when I typed the key in and missed a letter, so have them copy and paste it into BOINC.


ID: 198087 · Report as offensive
1mp0£173
Volunteer tester

Joined: 3 Apr 99
Posts: 8423
Credit: 356,897
RAC: 0
United States
Message 198344 - Posted: 29 Nov 2005, 8:45:18 UTC - in response to Message 198038.  


'500' is a pretty generic "no answer" error. So, the big question is, how is the request to the scheduler CGI bad?
ID: 198344 · Report as offensive
Tigher
Volunteer tester
Joined: 18 Mar 04
Posts: 1547
Credit: 760,577
RAC: 0
United Kingdom
Message 198420 - Posted: 29 Nov 2005, 11:47:05 UTC
Last modified: 29 Nov 2005, 11:52:23 UTC

I mailed the devs on this one. I have someone who can talk to all projects except SETI when using their LAN and router. Take out the router and connect directly, and they can talk to SETI as well. Put it back in and they cannot talk to SETI anymore. Weird!

I asked the devs what changed between 4.4x and 5.x.x in terms of TCP. I have looked at the packet traces for this guy, and TCP checksum errors are being generated somewhere on his PC/LAN. UCB just responds accordingly with duplicate ACKs, TCP retransmissions, and/or lost-segment messages. A 500 error typically follows after the server gets fed up with partial data and times out. Interestingly, only the initial SYN to the servers is not impaired by a checksum problem. I have asked Jeff C for the Apache error log, as it may get us closer to what the server or FastCGI is complaining about.

I had this guy check his hardware top to bottom, but nothing has resolved this. At the moment I theorise that something in BOINC is affecting the TCP stack somehow. A WAG for sure, but I cannot see why the stack should be fine one minute and not the next.

There are two more of these, by the way, on the BOINC forums; they too connected directly and got over the problem, but cannot connect through LAN/router facilities. I see there are a few here at SETI too, 6 or 7 people. That cannot be a coincidence.



ID: 198420 · Report as offensive
1mp0£173
Volunteer tester

Joined: 3 Apr 99
Posts: 8423
Credit: 356,897
RAC: 0
United States
Message 198755 - Posted: 29 Nov 2005, 19:19:11 UTC - in response to Message 198420.  

I don't know FastCGI in particular, but I assume it is similar to standard CGI, and the error message is likely coming from the CGI itself.

That kind of makes me think we have an MTU (or MTU Discovery) problem -- that the packets are getting fragmented in some unfortunate manner, and arriving at Berkeley mangled.

The easiest way to tweak the setting is with Dr. TCP, and I'd probably start with something very small, like an MTU of 576, to see if that fixes the problem.
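
To put numbers on the MTU values mentioned in this thread, here is a small Python sketch of the usual arithmetic: the largest TCP payload per packet (the MSS) is the MTU minus the IPv4 and TCP headers. It assumes minimum-size headers with no options, and `mss_for_mtu` is just an illustrative name.

```python
# Minimum header sizes for IPv4 and TCP (no options), in bytes.
IPV4_HEADER = 20
TCP_HEADER = 20

def mss_for_mtu(mtu):
    """Largest TCP payload that fits in one unfragmented IPv4 packet."""
    return mtu - IPV4_HEADER - TCP_HEADER

# MTU values that come up in this thread.
for mtu in (1500, 1300, 576):
    print(mtu, mss_for_mtu(mtu))
```

With these assumptions an MTU of 1500 leaves 1460 bytes of payload per packet, 1300 leaves 1260, and 576 leaves 536.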
ID: 198755 · Report as offensive
Tigher
Volunteer tester
Joined: 18 Mar 04
Posts: 1547
Credit: 760,577
RAC: 0
United Kingdom
Message 198767 - Posted: 29 Nov 2005, 19:39:00 UTC - in response to Message 198755.  
Last modified: 29 Nov 2005, 20:14:01 UTC



Hi Ned.
Well, I'm a little puzzled by it. TCP is fine without the LAN, fails when the LAN is used, and is fine again without the LAN. Three cases of it. Strange for sure.

Do you know if running IPX on the same wire causes a problem? It's the only suspicious thing I can see in this trace, even though it's only a bit of routing data as far as I can see. Hmmmmmm, routing data - might remember that for when I am clueless. We always avoided it, thinking it was bad to do, but we never proved it.

Edit: Just been reading a bit on mixing IPX and TCP/IP = potentially bad news. I would be interested to know whether anyone with server 500 error problems is running IPX or NetBEUI, please. This may be related, but I need to gather some evidence. Thanks.

ID: 198767 · Report as offensive
1mp0£173
Volunteer tester

Joined: 3 Apr 99
Posts: 8423
Credit: 356,897
RAC: 0
United States
Message 198964 - Posted: 29 Nov 2005, 23:54:43 UTC - in response to Message 198767.  



Hey Ian,

Back in the day (Windows 95 era) the default was to have all three protocols loaded and in use at all times. That often led to confusion because two machines could easily be talking IPX while two others ran NetBEUI and a third talked IP. The network browsers (the thing that collected names and told machines how to talk to each other) are separate for each network, and can have different machines.

... but I wouldn't expect this to be a problem for SETI.

One gripe with Microsoft TCP stacks is that they don't all do MTU discovery correctly, and at least some (most?) also set the "Don't Fragment" flag. That means that short packets (TCP "SYN" packets with no data) go through, but full-size packets of up to 1500 bytes might not make it if some router in between has a smaller MTU.

In effect, Microsoft is enforcing an MTU of 1500 across the entire internet.
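
As a rough illustration of that failure mode, here is a toy Python model, not a real network test. The path and its per-hop MTUs are invented, and `delivered` is a hypothetical helper. It only encodes the rule that a router drops a packet larger than its MTU when "Don't Fragment" is set, and fragments it otherwise.

```python
def delivered(packet_size, path_mtus, dont_fragment=True):
    """Return True if a packet of packet_size bytes crosses every hop.

    path_mtus is a made-up list of per-hop MTUs. With DF set, a hop whose
    MTU is smaller than the packet drops it; without DF the hop fragments
    the packet, so the data still gets through.
    """
    for mtu in path_mtus:
        if dont_fragment and packet_size > mtu:
            return False
    return True

path = [1500, 1492, 1500]  # hypothetical path with one narrower hop
syn = 40                   # bare SYN: 20-byte IP header + 20-byte TCP header
full = 1500                # full-size data packet

print(delivered(syn, path))                        # small packet passes
print(delivered(full, path))                       # DF set: dropped at the 1492 hop
print(delivered(full, path, dont_fragment=False))  # fragmentation allowed
```

In a working setup the narrow hop would send back an ICMP "fragmentation needed" message so the sender can shrink its packets. If that message is filtered somewhere, the sender never learns, which would match a connection that opens fine and then stalls on the first large request.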

Or, I could be entirely wrong.

I suggested Dr. TCP as an easy way to try different MTU values -- because it's easy, and if it works, then we know more about the problem, and if it doesn't, well, it's really easy to try.

-- Ned
ID: 198964 · Report as offensive
Astro
Volunteer tester
Joined: 16 Apr 02
Posts: 8026
Credit: 600,015
RAC: 0
Message 199024 - Posted: 30 Nov 2005, 1:16:36 UTC

The user here found out something interesting. He's one of the affected users I spoke of:

Help with "attach to project"

Especially read the solution found in the last post of the last one.

tony

ID: 199024 · Report as offensive
Astro
Volunteer tester
Joined: 16 Apr 02
Posts: 8026
Credit: 600,015
RAC: 0
Message 199027 - Posted: 30 Nov 2005, 1:19:00 UTC

We now have a fifth user with this problem. Here are links to all of them:

Help me Please......

Boinc connection error 182

Error 500

Help with "attach to project"

Failed to attach to project

ID: 199027 · Report as offensive
Hans Dorn
Volunteer developer
Volunteer tester
Joined: 3 Apr 99
Posts: 2262
Credit: 26,448,570
RAC: 0
Germany
Message 199029 - Posted: 30 Nov 2005, 1:21:05 UTC - in response to Message 199024.  
Last modified: 30 Nov 2005, 1:22:40 UTC

Hmmm. This points to an MTU size problem, as others have suspected too.
(Cable modems add extra data to TCP packets and can cause them to get too large)

Maybe you should suggest using "Dr. TCP" from below to reduce the MTU size to something like 1300 for a start.

Regards Hans
ID: 199029 · Report as offensive
Astro
Volunteer tester
Joined: 16 Apr 02
Posts: 8026
Credit: 600,015
RAC: 0
Message 199032 - Posted: 30 Nov 2005, 1:23:22 UTC

The user from the last thread I listed is still online, if someone who understood MTU could go there, I'd appreciate it.
ID: 199032 · Report as offensive
Hans Dorn
Volunteer developer
Volunteer tester
Joined: 3 Apr 99
Posts: 2262
Credit: 26,448,570
RAC: 0
Germany
Message 199033 - Posted: 30 Nov 2005, 1:25:24 UTC - in response to Message 199032.  

Hmmm, I'm on Linux here and won't be able to help much with debugging WinXP networking problems.

Regards Hans
ID: 199033 · Report as offensive
Astro
Volunteer tester
Joined: 16 Apr 02
Posts: 8026
Credit: 600,015
RAC: 0
Message 199047 - Posted: 30 Nov 2005, 1:38:07 UTC

I just had an epiphany: I pointed them here. LOL, I know they can't post here, but they can read here, can't they?
ID: 199047 · Report as offensive
Hans Dorn
Volunteer developer
Volunteer tester
Joined: 3 Apr 99
Posts: 2262
Credit: 26,448,570
RAC: 0
Germany
Message 199048 - Posted: 30 Nov 2005, 1:39:26 UTC
Last modified: 30 Nov 2005, 1:40:00 UTC

Hi Tony, hope you read this.

I have a user over there that may have to use a proxy to connect.
Since I know zilch about the GUI, maybe you can tell him how to get to the proxy dialog. :o)

Regards Hans
ID: 199048 · Report as offensive
Larin
Joined: 8 Aug 02
Posts: 45
Credit: 11,534,919
RAC: 0
United States
Message 199169 - Posted: 30 Nov 2005, 4:43:04 UTC - in response to Message 196563.  

I did not start this thread, but I do have a similar problem. Is this what you were looking for?

Larin


alg.exe:1244 TCP snape:1028 snape:0 LISTENING
ARSAsync.exe:1768 TCP snape.slithern.hogwartz.edu:1144 207.189.107.204:http CLOSE_WAIT
ARSAsync.exe:1768 TCP snape.slithern.hogwartz.edu:1147 207.189.107.204:http CLOSE_WAIT
boinc.exe:1536 TCP snape:31416 snape:0 LISTENING
boinc.exe:1536 TCP snape:31416 localhost:1140 ESTABLISHED
boincmgr.exe:2696 TCP snape:1140 localhost:31416 ESTABLISHED
lsass.exe:788 UDP snape:isakmp *:*
lsass.exe:788 UDP snape:4500 *:*
svchost.exe:1020 TCP snape:epmap snape:0 LISTENING
svchost.exe:1104 UDP snape:ntp *:*
svchost.exe:1104 UDP snape.slithern.hogwartz.edu:ntp *:*
svchost.exe:1152 UDP snape:1058 *:*
svchost.exe:1152 UDP snape:1029 *:*
svchost.exe:1152 UDP snape:1026 *:*
svchost.exe:1300 UDP snape:1900 *:*
svchost.exe:1300 UDP snape.slithern.hogwartz.edu:1900 *:*
svchost.exe:944 TCP snape:3389 snape:0 LISTENING
System:4 TCP snape:microsoft-ds snape:0 LISTENING
System:4 TCP snape.slithern.hogwartz.edu:netbios-ssn snape:0 LISTENING
System:4 UDP snape:microsoft-ds *:*
System:4 UDP snape.slithern.hogwartz.edu:netbios-dgm *:*
System:4 UDP snape.slithern.hogwartz.edu:netbios-ns *:*
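
To read a listing like this programmatically, here is a minimal Python sketch. The row layout (process:pid, protocol, local endpoint, remote endpoint, optional state) is assumed from the output above, and `parse_row` is a made-up helper, not part of any tool. The sample rows are copied from the listing.

```python
def parse_row(row):
    """Parse 'name:pid PROTO local remote [STATE]' into a dict."""
    fields = row.split()
    name, pid = fields[0].rsplit(":", 1)
    return {
        "process": name,
        "pid": int(pid),
        "proto": fields[1],
        "local": fields[2],
        "remote": fields[3],
        # UDP rows carry no connection state.
        "state": fields[4] if len(fields) > 4 else None,
    }

rows = [
    "boinc.exe:1536 TCP snape:31416 snape:0 LISTENING",
    "boinc.exe:1536 TCP snape:31416 localhost:1140 ESTABLISHED",
    "boincmgr.exe:2696 TCP snape:1140 localhost:31416 ESTABLISHED",
    "lsass.exe:788 UDP snape:isakmp *:*",
]

# Which remote hosts is the BOINC client itself talking to?
boinc = [r for r in map(parse_row, rows) if r["process"] == "boinc.exe"]
remote_hosts = {r["remote"].rsplit(":", 1)[0] for r in boinc}
print(remote_hosts)
```

For these rows the only remote hosts for boinc.exe are snape and localhost, which is the local GUI RPC link on port 31416 between the client and the manager. The listing is only a snapshot, so it says nothing about connections attempted before or after it was taken.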

One should not meddle in the affairs of Dragons.
For you are crunchy and very tasty with ketchup.
ID: 199169 · Report as offensive
Astro
Volunteer tester
Joined: 16 Apr 02
Posts: 8026
Credit: 600,015
RAC: 0
Message 199175 - Posted: 30 Nov 2005, 4:55:01 UTC

Larin, I have your Q&A thread listed below.
ID: 199175 · Report as offensive
Larin
Joined: 8 Aug 02
Posts: 45
Credit: 11,534,919
RAC: 0
United States
Message 199188 - Posted: 30 Nov 2005, 5:17:19 UTC - in response to Message 199175.  
Last modified: 30 Nov 2005, 5:17:56 UTC

Thanks, just trying to help out if I can. I have been trying various MTU settings; no help there.
I will probably bring down someone's wrath for saying this, but I think the problem is in the upload/download server, kryten.ssl.berkeley.edu. I have not been able to reach it with traceroute, but I can get to all of the others. Would you try it?


Larin
One should not meddle in the affairs of Dragons.
For you are crunchy and very tasty with ketchup.
ID: 199188 · Report as offensive
osKar one

Joined: 16 Oct 99
Posts: 26
Credit: 5,671
RAC: 0
France
Message 199254 - Posted: 30 Nov 2005, 9:23:33 UTC - in response to Message 199188.  
Last modified: 30 Nov 2005, 9:47:46 UTC


Yes, I also have a problem with the kryten.ssl.berkeley.edu (128.32.18.154) server. I cannot trace the route to it; the answer is always "Request timed out".
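
One caveat worth checking: a trace or ping that times out does not by itself show the server is down, since hosts and firewalls often drop ICMP while still accepting TCP. Here is a minimal Python sketch of a TCP-level check. It assumes the server speaks HTTP on port 80, which this thread does not confirm, and `tcp_reachable` is a hypothetical helper.

```python
import socket

def tcp_reachable(host, port=80, timeout=5.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name lookup failures.
        return False

# Example call (needs network access, so it is left commented out):
# print(tcp_reachable("kryten.ssl.berkeley.edu", 80))
```

If this returns True while the trace still times out, the server is answering and the timeouts are more likely ICMP filtering along the route.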

Could someone from the SETI@home staff do something about this server, please?

I think this is not a port configuration problem but a SETI@home server problem!





ID: 199254 · Report as offensive
Tigher
Volunteer tester
Joined: 18 Mar 04
Posts: 1547
Credit: 760,577
RAC: 0
United Kingdom
Message 199259 - Posted: 30 Nov 2005, 9:36:50 UTC

Well, I have an email from the sys ops, and Matt L says there are no recent error 500 messages in the server logs. He will track back a little to see what has happened recently. Strange though: if you get a 500, it must come from somewhere!

@Ned
We tried the MTU resize with our Australian BOINCers, if you recall, but to no avail. Worth a go, but I think Larin is saying it does not help him. I appreciate that large segments can and do get fragmented, so it's worth a try with other users who have the problem. This can be a solution to unreliable routes, but Larin is US based and should not be having extended route problems.

I'm still thinking about the trace in front of me, in which I can see checksum errors. I know these calculations can be delegated to hardware for "late" addition, so that could be a reason, but then I don't understand why people can connect when there is no LAN involved and the trace shows no such problem.
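
For reference, the checksum in question is the 16-bit one's-complement sum described in RFC 1071. Below is a minimal Python sketch of that calculation. It covers only the raw byte sum; a real TCP checksum also includes a pseudo-header with the IP addresses, which is left out here.

```python
def internet_checksum(data: bytes) -> int:
    """16-bit one's-complement checksum in the style of RFC 1071."""
    if len(data) % 2:
        data += b"\x00"              # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]   # add each 16-bit word
    while total >> 16:               # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF

payload = bytes.fromhex("0001f203f4f5f6f7")     # example bytes from RFC 1071
checksum = internet_checksum(payload)

# A receiver verifies by summing the data plus the checksum: the result is 0.
verified = internet_checksum(payload + checksum.to_bytes(2, "big")) == 0
```

When the network card fills in the checksum after the capture point, a trace taken on the sending PC shows a wrong value even though the packet on the wire is fine. Comparing with a capture taken on another machine would be one way to tell the two cases apart.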

So I'm still interested in anyone with a 500 problem checking whether they have protocols other than TCP/IP running. The advice I have seen on this is as recent as Win 2000 and says don't mix, so I appreciate what you have said there, Ned, but I don't think we can rule it out just yet.

Matt L says the workload for the servers is climbing "during the "active" periods. We seem to almost double our load every day", and SETI is taking on 3000 new people per day. That may be why your pings are unsuccessful, Larin: either the server or the route to the server is busy.

Anyhow, I've got some more thinking to do and will try to post anything I dream up.

ID: 199259 · Report as offensive


 