Configuring a PORT problem, or firewall

Astro Volunteer tester Send message Joined: 16 Apr 02 Posts: 8026 Credit: 600,015 RAC: 0	Message 198034 - Posted: 29 Nov 2005, 0:13:59 UTC And to throw another wrinkle into this, 2 of the four affected users have attached to either PPAH or Einstein without issue, but still can't attach to seti ID: 198034 ·

Astro Volunteer tester Send message Joined: 16 Apr 02 Posts: 8026 Credit: 600,015 RAC: 0	Message 198038 - Posted: 29 Nov 2005, 0:18:29 UTC
27/11/2005 21:41:45||Fetching config info from http://setiathome.berkeley.edu/get_project_config.php
27/11/2005 21:41:47||Missing account key
27/11/2005 21:42:03|http://setiathome.berkeley.edu/|Master file download succeeded
Here it calls the scheduling server and gets the master page. Then it tries to call the UL/DL server and gets a 500 error:
27/11/2005 21:42:04|http://setiathome.berkeley.edu/|Sending scheduler request to http://setiboinc.ssl.berkeley.edu/sah_cgi/cgi
27/11/2005 21:42:04|http://setiathome.berkeley.edu/|Reason: Requested by user
27/11/2005 21:42:04|http://setiathome.berkeley.edu/|Requesting 8640 seconds of new work
27/11/2005 21:44:51|http://setiathome.berkeley.edu/|Scheduler request to http://setiboinc.ssl.berkeley.edu/sah_cgi/cgi failed with a return value of 500
27/11/2005 21:44:51|http://setiathome.berkeley.edu/|No schedulers responded
27/11/2005 21:44:51|http://setiathome.berkeley.edu/|Couldn't connect to URL http://setiathome.berkeley.edu/.Please check URL.
Since it can't connect, it detaches and resets.
ID: 198038 ·
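For anyone who wants to pull the failing requests out of a longer message log, here is a minimal Python sketch. It assumes the `timestamp|project URL|message` layout seen in the excerpt above; the function and variable names are made up for illustration and are not part of BOINC.

```python
import re

# Assumed line layout, taken from the log excerpt in the post above:
#   DD/MM/YYYY HH:MM:SS|project URL|message
LINE_RE = re.compile(r"^(\d{2}/\d{2}/\d{4} \d{2}:\d{2}:\d{2})\|([^|]*)\|(.*)$")
FAIL_RE = re.compile(r"Scheduler request to (\S+) failed with a return value of (\d+)")


def scheduler_failures(log_text):
    """Return (timestamp, scheduler_url, status) for each failed scheduler request."""
    failures = []
    for raw in log_text.splitlines():
        # Tolerate pipes that were backslash-escaped by a copy/paste step.
        line = raw.strip().replace("\\|", "|")
        m = LINE_RE.match(line)
        if not m:
            continue  # not a log line (for example, a comment typed between lines)
        timestamp, _project, message = m.groups()
        f = FAIL_RE.search(message)
        if f:
            failures.append((timestamp, f.group(1), int(f.group(2))))
    return failures


SAMPLE = """\
27/11/2005 21:42:04|http://setiathome.berkeley.edu/|Sending scheduler request to http://setiboinc.ssl.berkeley.edu/sah_cgi/cgi
27/11/2005 21:44:51|http://setiathome.berkeley.edu/|Scheduler request to http://setiboinc.ssl.berkeley.edu/sah_cgi/cgi failed with a return value of 500
27/11/2005 21:44:51|http://setiathome.berkeley.edu/|No schedulers responded
"""

for ts, url, status in scheduler_failures(SAMPLE):
    print(ts, url, status)
```

Collecting the timestamps this way makes it easier to line up the client-side failures with whatever the server logs show for the same minute.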

nick Volunteer tester Send message Joined: 22 Jul 05 Posts: 284 Credit: 3,902,174 RAC: 0	Message 198087 - Posted: 29 Nov 2005, 1:11:07 UTC - in response to Message 198038.
27/11/2005 21:41:45||Fetching config info from http://setiathome.berkeley.edu/get_project_config.php
27/11/2005 21:41:47||Missing account key
27/11/2005 21:42:03|http://setiathome.berkeley.edu/|Master file download succeeded
Here it calls the scheduling server and gets the master page. Then it tries to call the UL/DL server and gets a 500 error:
27/11/2005 21:42:04|http://setiathome.berkeley.edu/|Sending scheduler request to http://setiboinc.ssl.berkeley.edu/sah_cgi/cgi
27/11/2005 21:42:04|http://setiathome.berkeley.edu/|Reason: Requested by user
27/11/2005 21:42:04|http://setiathome.berkeley.edu/|Requesting 8640 seconds of new work
27/11/2005 21:44:51|http://setiathome.berkeley.edu/|Scheduler request to http://setiboinc.ssl.berkeley.edu/sah_cgi/cgi failed with a return value of 500
27/11/2005 21:44:51|http://setiathome.berkeley.edu/|No schedulers responded
27/11/2005 21:44:51|http://setiathome.berkeley.edu/|Couldn't connect to URL http://setiathome.berkeley.edu/.Please check URL.
Since it can't connect, it detaches and resets.
That happened to me on Linux when I typed the key in and missed a letter, so have them copy and paste it into BOINC.
ID: 198087 ·

1mp0£173 Volunteer tester Send message Joined: 3 Apr 99 Posts: 8423 Credit: 356,897 RAC: 0	Message 198344 - Posted: 29 Nov 2005, 8:45:18 UTC - in response to Message 198038.
27/11/2005 21:44:51|http://setiathome.berkeley.edu/|Scheduler request to http://setiboinc.ssl.berkeley.edu/sah_cgi/cgi failed with a return value of 500
27/11/2005 21:44:51|http://setiathome.berkeley.edu/|No schedulers responded
27/11/2005 21:44:51|http://setiathome.berkeley.edu/|Couldn't connect to URL http://setiathome.berkeley.edu/.Please check URL.
Since it can't connect, it detaches and resets.
'500' is a pretty generic "no answer" error. So, the big question is, how is the request to the scheduler CGI bad?
ID: 198344 ·

Tigher Volunteer tester Send message Joined: 18 Mar 04 Posts: 1547 Credit: 760,577 RAC: 0	Message 198420 - Posted: 29 Nov 2005, 11:47:05 UTC Last modified: 29 Nov 2005, 11:52:23 UTC I mailed the devs on this one. I have someone who can talk to all projects except SETI using their LAN and router. Take out the router and connect direct and they can talk to SETI as well. Put it back in and they cannot talk to SETI anymore. Weird! I asked the devs what changed from 4.4x to 5.x.x in terms of TCP. I have looked at the packet traces for this guy and TCP checksum errors are being generated on his PC/LAN somewhere. UCB just responds accordingly with dup acks and/or TCP retrans and/or lost segment messages. A 500 error typically follows after the server gets fed up with partial data and then times out. Interestingly, only the initial SYN to the servers is not impaired with a checksum problem. I have asked Jeff C for the Apache error list, as we may be able to get closer with that as to what the server or FastCGI is complaining about. I had this guy checking his hardware top to bottom but nothing has resolved this. At the moment I theorise that something in BOINC is affecting the TCP stack somehow - a WAG for sure, but I cannot see why the stack should be fine one minute and not the next. There are two more of these btw on the BOINC forums; they too have connected directly and got over the problem but cannot connect through LAN/router facilities. I see there are a few here at SETI too - 6/7 people - cannot be a coincidence. ID: 198420 ·
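For readers unfamiliar with what a "TCP checksum error" in a packet trace means, here is a minimal Python sketch of the standard Internet checksum (RFC 1071) applied to a TCP segment plus its IPv4 pseudo-header. The addresses, ports and header values in the demo are invented for illustration and are not taken from the trace discussed above.

```python
import socket
import struct


def internet_checksum(data: bytes) -> int:
    """One's-complement sum of 16-bit words, as described in RFC 1071."""
    if len(data) % 2:
        data += b"\x00"  # pad to a whole number of 16-bit words
    total = 0
    for (word,) in struct.iter_unpack("!H", data):
        total += word
    while total >> 16:  # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def tcp_checksum(src_ip: str, dst_ip: str, segment: bytes) -> int:
    """Checksum over the IPv4 pseudo-header followed by the TCP segment."""
    pseudo = struct.pack(
        "!4s4sBBH",
        socket.inet_aton(src_ip),
        socket.inet_aton(dst_ip),
        0,                   # reserved byte
        socket.IPPROTO_TCP,  # protocol number 6
        len(segment),        # TCP length: header plus payload
    )
    return internet_checksum(pseudo + segment)


# Hypothetical 20-byte SYN header with the checksum field (offset 16) zeroed.
SRC, DST = "192.168.0.2", "128.32.18.154"
segment = bytearray(struct.pack("!HHIIBBHHH", 1140, 80, 1, 0, 5 << 4, 0x02, 65535, 0, 0))
segment[16:18] = struct.pack("!H", tcp_checksum(SRC, DST, bytes(segment)))

# A receiver recomputes over the segment as received; an intact one yields 0.
print("intact:", tcp_checksum(SRC, DST, bytes(segment)) == 0)

segment[0] ^= 0x01  # simulate a single bit damaged in transit
print("damaged:", tcp_checksum(SRC, DST, bytes(segment)) == 0)
```

One caveat when reading traces: if the capture is taken on the sending machine and the network card computes checksums in hardware, outgoing packets can appear to have bad checksums in the capture even though they are correct on the wire.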

1mp0£173 Volunteer tester Send message Joined: 3 Apr 99 Posts: 8423 Credit: 356,897 RAC: 0	Message 198755 - Posted: 29 Nov 2005, 19:19:11 UTC - in response to Message 198420. I mailed the devs on this one. I have someone who can talk to all projects except SETI using their LAN and router. Take out the router and connect direct and they can talk to SETI as well. Put it back in and they cannot talk to SETI anymore. Weird! I asked the devs what changed from 4.4x to 5.x.x in terms of TCP. I have looked at the packet traces for this guy and TCP checksum errors are being generated on his PC/LAN somewhere. UCB just responds accordingly with dup acks and/or TCP retrans and/or lost segment messages. A 500 error typically follows after the server gets fed up with partial data and then times out. Interestingly, only the initial SYN to the servers is not impaired with a checksum problem. I have asked Jeff C for the Apache error list, as we may be able to get closer with that as to what the server or FastCGI is complaining about. I had this guy checking his hardware top to bottom but nothing has resolved this. At the moment I theorise that something in BOINC is affecting the TCP stack somehow - a WAG for sure, but I cannot see why the stack should be fine one minute and not the next. There are two more of these btw on the BOINC forums; they too have connected directly and got over the problem but cannot connect through LAN/router facilities. I see there are a few here at SETI too - 6/7 people - cannot be a coincidence. I don't know FastCGI in particular, but I assume it is similar to standard CGI, and the error message is likely coming from the CGI itself. That kind of makes me think we have an MTU (or MTU Discovery) problem -- that the packets are getting fragmented in some unfortunate manner, and arriving at Berkeley mangled. The easiest way to tweak settings is with Dr. TCP, and I'd probably start with something very small, like 576, to see if that fixes the problem. ID: 198755 ·

Tigher Volunteer tester Send message Joined: 18 Mar 04 Posts: 1547 Credit: 760,577 RAC: 0	Message 198767 - Posted: 29 Nov 2005, 19:39:00 UTC - in response to Message 198755. Last modified: 29 Nov 2005, 20:14:01 UTC I don't know FastCGI in particular, but I assume it is similar to standard CGI, and the error message is likely coming from the CGI itself. That kind of makes me think we have an MTU (or MTU Discovery) problem -- that the packets are getting fragmented in some unfortunate manner, and arriving at Berkeley mangled. The easiest way to tweak settings is with Dr. TCP, and I'd probably start with something very small, like 576, to see if that fixes the problem. Hi Ned. Well, I'm a little puzzled by it. TCP is fine non-LAN, fails when the LAN is used, and is fine again when non-LAN. Three cases of it. Strange for sure. Do you know if running IPX on the same wire causes a problem? It's the only suspicious thing I can see in this trace, even though it's only a bit of routing data as far as I can see. Hmmmmmm routing data - might remember that for when I am clueless. We always avoided it thinking it was bad to do, but we never proved it. Edit: Just reading a bit on IPX/TCP/IP mixing = potentially bad news. Would be interested to know if anyone with server 500 error problems is running IPX or NetBEUI, please? This may be related but I need to gather some evidence. Thanks. ID: 198767 ·

1mp0£173 Volunteer tester Send message Joined: 3 Apr 99 Posts: 8423 Credit: 356,897 RAC: 0	Message 198964 - Posted: 29 Nov 2005, 23:54:43 UTC - in response to Message 198767. I don't know FastCGI in particular, but I assume it is similar to standard CGI, and the error message is likely coming from the CGI itself. That kind of makes me think we have an MTU (or MTU Discovery) problem -- that the packets are getting fragmented in some unfortunate manner, and arriving at Berkeley mangled. The easiest way to tweak settings is with Dr. TCP, and I'd probably start with something very small, like 576, to see if that fixes the problem. Hi Ned. Well, I'm a little puzzled by it. TCP is fine non-LAN, fails when the LAN is used, and is fine again when non-LAN. Three cases of it. Strange for sure. Do you know if running IPX on the same wire causes a problem? It's the only suspicious thing I can see in this trace, even though it's only a bit of routing data as far as I can see. Hmmmmmm routing data - might remember that for when I am clueless. We always avoided it thinking it was bad to do, but we never proved it. Edit: Just reading a bit on IPX/TCP/IP mixing = potentially bad news. Would be interested to know if anyone with server 500 error problems is running IPX or NetBEUI, please? This may be related but I need to gather some evidence. Thanks. Hey Ian, Back in the day (Windows 95 era) the default was to have all three protocols loaded and in use at all times. That often led to confusion because two machines could easily be talking IPX while two others ran NetBEUI, and a third talked IP. The network browsers (the thing that collected names and told machines how to talk to each other) are separate for each network, and can have different machines. ... but I wouldn't expect this to be a problem for SETI. One gripe with Microsoft TCP stacks is that they don't all do MTU discovery correctly, and at least some (most?) also set the "Don't Fragment" flag.
That means that short packets (TCP "SYN" packets with no data) go through, but anything over 1500 bytes might not make it if there is some router in between with a smaller MTU. In effect, Microsoft is enforcing an MTU of 1500 across the entire internet. Or, I could be entirely wrong. I suggested Dr. TCP as an easy way to try different MTU values -- because it's easy, and if it works, then we know more about the problem, and if it doesn't, well, it's really easy to try. -- Ned ID: 198964 ·
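To put numbers on the MTU values mentioned in this thread, here is a small Python sketch. It assumes plain IPv4 and TCP headers of 20 bytes each (no options); the path in the demo, including the 1492-byte hop, is a made-up example and not a measurement of anyone's route to Berkeley.

```python
IPV4_HEADER = 20  # bytes, assuming no IP options
TCP_HEADER = 20   # bytes, assuming no TCP options


def tcp_mss(mtu: int) -> int:
    """Largest TCP payload that fits in one unfragmented packet of size `mtu`."""
    return mtu - IPV4_HEADER - TCP_HEADER


def passes_with_df(packet_size: int, path_mtus) -> bool:
    """True if a packet with "Don't Fragment" set fits through every hop."""
    return packet_size <= min(path_mtus)


# MTU values suggested in the thread, plus the Ethernet default.
for mtu in (1500, 1300, 576):
    print(f"MTU {mtu}: up to {tcp_mss(mtu)} bytes of TCP payload per packet")

# Hypothetical path with one hop whose MTU is below 1500.
path = [1500, 1492, 1500]
print("40-byte SYN gets through:", passes_with_df(40, path))
print("1500-byte data packet gets through:", passes_with_df(1500, path))
```

This matches the symptom described above: the small SYN fits through every hop, while a full-size data packet with "Don't Fragment" set does not, which is why lowering the local MTU is a cheap experiment.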

Astro Volunteer tester Send message Joined: 16 Apr 02 Posts: 8026 Credit: 600,015 RAC: 0	Message 199024 - Posted: 30 Nov 2005, 1:16:36 UTC The user here found out something interesting. He's one of the affected users I spoke of: Help with "attach to project" Especially read the solution found in the last post of the last one. tony ID: 199024 ·

Astro Volunteer tester Send message Joined: 16 Apr 02 Posts: 8026 Credit: 600,015 RAC: 0	Message 199027 - Posted: 30 Nov 2005, 1:19:00 UTC we now have a fifth user with this problem. here's a link to all of them: Help me Please...... Boinc connection error 182 Error 500 Help with "attach to project" Failed to attach to project ID: 199027 ·

Hans Dorn Volunteer developer Volunteer tester Send message Joined: 3 Apr 99 Posts: 2262 Credit: 26,448,570 RAC: 0	Message 199029 - Posted: 30 Nov 2005, 1:21:05 UTC - in response to Message 199024. Last modified: 30 Nov 2005, 1:22:40 UTC The user here found out something interesting. He's one of the affected users I spoke of: Help with "attach to project" Especially read the solution found in the last post of the last one. tony Hmmm. This points to an MTU size problem, as others have suspected, too. (Cable modems add extra data to TCP packets and can cause them to get too large.) Maybe you should suggest using "Dr. TCP" from below to reduce the MTU size to something like 1300 for a start. Regards Hans ID: 199029 ·

Astro Volunteer tester Send message Joined: 16 Apr 02 Posts: 8026 Credit: 600,015 RAC: 0	Message 199032 - Posted: 30 Nov 2005, 1:23:22 UTC The user from the last thread I listed is still online, if someone who understood MTU could go there, I'd appreciate it. ID: 199032 ·

Hans Dorn Volunteer developer Volunteer tester Send message Joined: 3 Apr 99 Posts: 2262 Credit: 26,448,570 RAC: 0	Message 199033 - Posted: 30 Nov 2005, 1:25:24 UTC - in response to Message 199032. The user from the last thread I listed is still online, if someone who understood MTU could go there, I'd appreciate it. Hmmm I'm on linux here and won't be able to help much with debugging WinXP networking problems. Regards Hans ID: 199033 ·

Astro Volunteer tester Send message Joined: 16 Apr 02 Posts: 8026 Credit: 600,015 RAC: 0	Message 199047 - Posted: 30 Nov 2005, 1:38:07 UTC I just had an epiphany, I pointed them here. LOL, I know they can't post here, but they can read here, can't they? ID: 199047 ·

Hans Dorn Volunteer developer Volunteer tester Send message Joined: 3 Apr 99 Posts: 2262 Credit: 26,448,570 RAC: 0	Message 199048 - Posted: 30 Nov 2005, 1:39:26 UTC Last modified: 30 Nov 2005, 1:40:00 UTC Hi Tony, hope you read this. I have a user over there that may have to use a proxy to connect. Since I know zilch about the GUI, maybe you can tell him how to get to the proxy dialog. :o) Regards Hans ID: 199048 ·

Larin Send message Joined: 8 Aug 02 Posts: 45 Credit: 11,534,919 RAC: 0	Message 199169 - Posted: 30 Nov 2005, 4:43:04 UTC - in response to Message 196563.
I did not start this thread, but I do have a similar problem. Is this what you were looking for? Larin
alg.exe:1244 TCP snape:1028 snape:0 LISTENING
ARSAsync.exe:1768 TCP snape.slithern.hogwartz.edu:1144 207.189.107.204:http CLOSE_WAIT
ARSAsync.exe:1768 TCP snape.slithern.hogwartz.edu:1147 207.189.107.204:http CLOSE_WAIT
boinc.exe:1536 TCP snape:31416 snape:0 LISTENING
boinc.exe:1536 TCP snape:31416 localhost:1140 ESTABLISHED
boincmgr.exe:2696 TCP snape:1140 localhost:31416 ESTABLISHED
lsass.exe:788 UDP snape:isakmp :
lsass.exe:788 UDP snape:4500 :
svchost.exe:1020 TCP snape:epmap snape:0 LISTENING
svchost.exe:1104 UDP snape:ntp :
svchost.exe:1104 UDP snape.slithern.hogwartz.edu:ntp :
svchost.exe:1152 UDP snape:1058 :
svchost.exe:1152 UDP snape:1029 :
svchost.exe:1152 UDP snape:1026 :
svchost.exe:1300 UDP snape:1900 :
svchost.exe:1300 UDP snape.slithern.hogwartz.edu:1900 :
svchost.exe:944 TCP snape:3389 snape:0 LISTENING
System:4 TCP snape:microsoft-ds snape:0 LISTENING
System:4 TCP snape.slithern.hogwartz.edu:netbios-ssn snape:0 LISTENING
System:4 UDP snape:microsoft-ds :
System:4 UDP snape.slithern.hogwartz.edu:netbios-dgm :
System:4 UDP snape.slithern.hogwartz.edu:netbios-ns :
One should not meddle in the affairs of Dragons. For you are crunchy and very tasty with ketchup.
ID: 199169 ·

Astro Volunteer tester Send message Joined: 16 Apr 02 Posts: 8026 Credit: 600,015 RAC: 0	Message 199175 - Posted: 30 Nov 2005, 4:55:01 UTC Larin, I have your Q&A thread listed below. ID: 199175 ·

Larin Send message Joined: 8 Aug 02 Posts: 45 Credit: 11,534,919 RAC: 0	Message 199188 - Posted: 30 Nov 2005, 5:17:19 UTC - in response to Message 199175. Last modified: 30 Nov 2005, 5:17:56 UTC Larin, I have your Q&A thread listed below. Thanks, just trying to help out if I can. I have been trying varying MTU settings; no help there. I will probably bring down someone's wrath for saying this, but I think it is in the upload/download server kryten.ssl.berkeley.edu. I have not been getting to it with traceroute, but I can get to all of the others. Would you try it? Larin One should not meddle in the affairs of Dragons. For you are crunchy and very tasty with ketchup. ID: 199188 ·

osKar one Send message Joined: 16 Oct 99 Posts: 26 Credit: 5,671 RAC: 0	Message 199254 - Posted: 30 Nov 2005, 9:23:33 UTC - in response to Message 199188. Last modified: 30 Nov 2005, 9:47:46 UTC Thanks, just trying to help out if I can. I have been trying varying MTU settings; no help there. I will probably bring down someone's wrath for saying this, but I think it is in the upload/download server kryten.ssl.berkeley.edu. I have not been getting to it with traceroute, but I can get to all of the others. Would you try it? Yes, there is also a problem for me with the kryten.ssl.berkeley.edu (128.32.18.154) server. I cannot trace it and the answer is always "Request timed out". Could someone from the SETI@home staff do something about this server, please? I think this is not a port configuration problem but a SETI@home server problem! ID: 199254 ·
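A timed-out ping or traceroute does not by itself show that a server is down, since hosts and routers can drop ICMP while still accepting TCP connections. A more direct test is to try opening a TCP connection to the service port. The Python sketch below does that; port 80 is an assumption based on the plain http URLs in the logs earlier in the thread, and the function name is made up for illustration.

```python
import socket


def tcp_reachable(host: str, port: int = 80, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port completes within `timeout` seconds."""
    try:
        # create_connection resolves the name and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, unreachable routes and DNS failures.
        return False
```

Calling `tcp_reachable("kryten.ssl.berkeley.edu")` from an affected machine would show whether the handshake completes even when traceroute reports "Request timed out". Note that a successful handshake only proves the small packets get through, so it would not rule out the MTU theory discussed above.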

Tigher Volunteer tester Send message Joined: 18 Mar 04 Posts: 1547 Credit: 760,577 RAC: 0	Message 199259 - Posted: 30 Nov 2005, 9:36:50 UTC Well, I have an email from the sys ops, and Matt L is saying there are no recent error 500 messages in the server logs. He will track back a little to see what has happened recently. Strange though, as if you get a 500 it must come from somewhere! @Ned We tried the MTU resize with our Australian boincers if you recall, but to no avail. Worth a go, but I think Larin is saying it does not help for him. I appreciate large segments can and do get fragmented, so it's worth a try with other users who have the problem. This can be a solution to unreliable routes, but Larin is US based and should not be having extended route problems. I'm still thinking about the trace in front of me and can see checksum errors. I know these calcs can be delegated to hardware for "late" addition, so that can be a reason, but then I don't understand why, when there is no LAN involved, people can connect and the trace shows no such problem. So I'm still interested in anyone with a 500 problem checking whether they have protocols other than TCP/IP running. The advice I have seen on this is as late as Win 2000, which says don't mix, so I appreciate what you have said there, Ned, but I don't think we can rule it out just yet? Matt L is saying workload for the servers is climbing: "during the "active" periods. We seem to almost double our load every day" and SETI is taking 3000 new people per day. That may be why, Larin, your pings are unsuccessful - either the server or the route to the server is busy. Anyhow, I've got some more thinking to do and will try to post anything I dream up. ID: 199259 ·
