HE connection problems thread

Message boards : Number crunching : HE connection problems thread
Donald L. Johnson
Joined: 5 Aug 02
Posts: 8240
Credit: 14,654,533
RAC: 20
United States
Message 1157664 - Posted: 1 Oct 2011, 1:46:13 UTC - in response to Message 1157562.  

[T]he project is having significant technical problems. The complex collection of servers - some very elderly - the project relies on are suffering intermittent failures: a key component (a router) linking the lab to the outside world is showing signs of impending failure: and the inexorable increase in volunteers' computer power is putting every component under increasing strain.

The human resources in the lab are also under strain. There are too few of them at the best of times: one key member is on extended sabbatical: and they have few - effectively zero - financial resources to replace or purchase necessary equipment.

Under those circumstances, the staff do what they can. The project has suffered extended breakdowns before - you've probably noticed them. They get fixed - eventually - and life goes on.

But nobody gets any special treatment. The scientific results that your 40+ tasks represent are of course important - but so are the other three million that other people are trying to return.

Just at the moment, the best answer that any of us can give is "it'll be fixed when it's fixed". That's not meant to be condescending, but just the honest truth: nobody knows.

If you're interested in the background, spend a bit of time reading about the technicalities on these message boards. If you want a bit of peace of mind, join Chris, Claggy and me down the pub.

Wish you could post that message at the top of the NEWS Forum as a Locked and Sticky thread titled STATUS OF PROJECT - READ ME BEFORE POSTING.

Donald
Infernal Optimist / Submariner, retired
ID: 1157664 · Report as offensive
Scarecrow

Joined: 15 Jul 00
Posts: 4520
Credit: 486,601
RAC: 0
United States
Message 1157679 - Posted: 1 Oct 2011, 2:08:30 UTC

At 21:00 CDT I tried all the same pings, traceroutes, and nslookups that John did, and they are all up and reachable and show no problems. Using Cox Communications in Omaha, NE.
ID: 1157679 · Report as offensive
Bill Walker
Joined: 4 Sep 99
Posts: 3868
Credit: 2,697,267
RAC: 0
Canada
Message 1157856 - Posted: 1 Oct 2011, 13:01:18 UTC

Based on pinging, etc. it appears the HE connection problem comes and goes for me. When the problem is here, my messages say "BOINC can't access the Internet". That tells me it's time to switch to a proxy, clear my uploads and fill my cache. I can't leave it on proxy, since some of the other projects don't work then. That is more babysitting than I normally do with BOINC, but it keeps me crunching.

ID: 1157856 · Report as offensive
kittyman
Volunteer tester
Joined: 9 Jul 00
Posts: 51468
Credit: 1,018,363,574
RAC: 1,004
United States
Message 1157877 - Posted: 1 Oct 2011, 13:55:00 UTC - in response to Message 1157856.  
Last modified: 1 Oct 2011, 13:55:15 UTC

Based on pinging, etc. it appears the HE connection problem comes and goes for me. When the problem is here, my messages say "BOINC can't access the Internet". That tells me it's time to switch to a proxy, clear my uploads and fill my cache. I can't leave it on proxy, since some of the other projects don't work then. That is more babysitting than I normally do with BOINC, but it keeps me crunching.

I believe if you get the response that Boinc can't access the internet, the problem is on your end, as I think that is generated after Boinc tries to get to Amazon or Google as a connection test. I only get that if my router or modem goes flaky and needs a reboot.
"Freedom is just Chaos, with better lighting." Alan Dean Foster

ID: 1157877 · Report as offensive
HAL9000
Volunteer tester
Joined: 11 Sep 99
Posts: 6534
Credit: 196,805,888
RAC: 57
United States
Message 1157887 - Posted: 1 Oct 2011, 14:18:35 UTC - in response to Message 1157877.  

Based on pinging, etc. it appears the HE connection problem comes and goes for me. When the problem is here, my messages say "BOINC can't access the Internet". That tells me it's time to switch to a proxy, clear my uploads and fill my cache. I can't leave it on proxy, since some of the other projects don't work then. That is more babysitting than I normally do with BOINC, but it keeps me crunching.

I believe if you get the response that Boinc can't access the internet, the problem is on your end, as I think that is generated after Boinc tries to get to Amazon or Google as a connection test. I only get that if my router or modem goes flaky and needs a reboot.

When the site is wonky and your internet is OK the message looks a bit like this.
10/1/2011 9:25:52 AM	SETI@home	Scheduler request failed: Failure when receiving data from the peer
10/1/2011 9:25:54 AM			Project communication failed: attempting access to reference site
10/1/2011 9:25:56 AM			Internet access OK - project servers may be temporarily down.
10/1/2011 9:25:57 AM			Stop clicking buttons. Go have breakfast, or pet your kitteh.
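
For anyone who wants to script around this, here is a minimal Python sketch of the distinction being discussed: telling a project-side failure from a local one by looking at the event log text. The function name and the returned labels are made up for illustration, the matched substrings are only the ones quoted in this thread (other BOINC versions may word them differently), and treating "can't access the Internet" as a local problem simply follows kittyman's reading above.

```python
def classify_boinc_message(line: str) -> str:
    """Guess where the fault lies from one BOINC event-log line.

    The substrings matched here are taken from messages quoted in this
    thread; the returned labels are hypothetical, not BOINC terminology.
    """
    text = line.lower()
    if "internet access ok" in text:
        # The reference-site check succeeded, so the local connection works.
        return "project-side"
    if "can't access the internet" in text:
        # The reference-site check failed too: suspect the local router/modem.
        return "local-side"
    if "scheduler request failed" in text or "project communication failed" in text:
        # A failure was seen, but the reference-site check has not reported yet.
        return "undetermined"
    return "unrelated"


log = [
    "10/1/2011 9:25:52 AM\tSETI@home\tScheduler request failed: Failure when receiving data from the peer",
    "10/1/2011 9:25:54 AM\t\t\tProject communication failed: attempting access to reference site",
    "10/1/2011 9:25:56 AM\t\t\tInternet access OK - project servers may be temporarily down.",
]
for entry in log:
    print(classify_boinc_message(entry))
```

The first two lines come back as "undetermined" and the third as "project-side", which matches the sequence above: the client only commits to a verdict after it has tried the reference site.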

SETI@home classic workunits: 93,865 CPU time: 863,447 hours
Join the BP6/VP6 User Group: http://tinyurl.com/8y46zvu
ID: 1157887 · Report as offensive
kittyman
Volunteer tester
Joined: 9 Jul 00
Posts: 51468
Credit: 1,018,363,574
RAC: 1,004
United States
Message 1157888 - Posted: 1 Oct 2011, 14:20:49 UTC - in response to Message 1157887.  

Based on pinging, etc. it appears the HE connection problem comes and goes for me. When the problem is here, my messages say "BOINC can't access the Internet". That tells me it's time to switch to a proxy, clear my uploads and fill my cache. I can't leave it on proxy, since some of the other projects don't work then. That is more babysitting than I normally do with BOINC, but it keeps me crunching.

I believe if you get the response that Boinc can't access the internet, the problem is on your end, as I think that is generated after Boinc tries to get to Amazon or Google as a connection test. I only get that if my router or modem goes flaky and needs a reboot.

When the site is wonky and your internet is OK the message looks a bit like this.
10/1/2011 9:25:52 AM	SETI@home	Scheduler request failed: Failure when receiving data from the peer
10/1/2011 9:25:54 AM			Project communication failed: attempting access to reference site
10/1/2011 9:25:56 AM			Internet access OK - project servers may be temporarily down.
10/1/2011 9:25:57 AM			Stop clicking buttons. Go have breakfast, or pet your kitteh.

LOL......

"Freedom is just Chaos, with better lighting." Alan Dean Foster

ID: 1157888 · Report as offensive
juan BFP
Volunteer tester
Joined: 16 Mar 07
Posts: 9786
Credit: 572,710,851
RAC: 3,799
Panama
Message 1157891 - Posted: 1 Oct 2011, 14:30:14 UTC

The HE Connection Problem RETURNS!!!

8 150 ms 149 ms 149 ms ve446.ar9.NYC1.gblx.net [208.51.198.109]
9 221 ms 221 ms 221 ms po2-20G.ar3.SJC2.gblx.net [67.17.109.102]
10 206 ms 208 ms 222 ms Hurrican-Electric-LLC.Port-channel100.ar3.SJC2.gblx.net [64.214.174.246]
11 * * * Request timed out.

The proxy connection option still works... for the moment....
ID: 1157891 · Report as offensive
Jeff Cobb Project Donor
Volunteer moderator
Project administrator
Project developer
Project scientist

Joined: 1 Mar 99
Posts: 122
Credit: 40,367
RAC: 0
United States
Message 1157929 - Posted: 1 Oct 2011, 16:18:40 UTC
Last modified: 1 Oct 2011, 16:27:19 UTC

We are working with HE to clear up this routing issue. They are being very helpful.

We will need to involve parties beyond HE. Here is a little bit of background. Our PAIX router physically talks to HE on one side and to CENIC on the other, whereupon the physical path proceeds through CENIC, then through UCB, then through SSL before finally landing at the router in our closet. All of this even though, logically, we tunnel directly from our PAIX router to our closet router. Both of our routers were donated by Packet Clearing House (PCH). They also helped us initially set them up and physically host our PAIX router in their rack at the PAIX. PCH really came to our rescue way back when we learned that we could not use (or afford to use) the UCB commodity provider. They also helped us migrate from Cogent to HE.

Back to the problem...
HE guessed that we don't have enough memory in our PAIX router to hold the giant routing table that we need to hold. This is more or less confirmed by looking at the memory stats on that router. One solution (the best) is to add memory, or replace the router if it cannot take more memory.

Another, perhaps interim, solution suggested by HE is to have our router utilize a default route rather than maintain a huge routing table. We tried that. Well, at least HE did their part to make it so. And it flat-lined our traffic. Thus our need to involve the other players. I have emailed them all. But everyone is super busy and, in the case of PCH, all effort is donated.
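
To illustrate the two options in general terms, here is a small Python sketch of how a router picks a next hop by longest-prefix match, first with a table holding many specific prefixes and then with a single default route. It is not the configuration of the PAIX router: the prefixes are documentation-only address ranges, the next-hop names are hypothetical, and a real table would hold a vastly larger number of entries (which is where the memory goes). It also does not explain why the default-route attempt flat-lined the traffic.

```python
import ipaddress


def lookup(table, destination):
    """Return the next hop for the most specific matching prefix, or None."""
    addr = ipaddress.ip_address(destination)
    best = None
    for prefix, next_hop in table.items():
        net = ipaddress.ip_network(prefix)
        # Keep the match with the longest prefix length seen so far.
        if addr in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, next_hop)
    return best[1] if best else None


# Full table: one entry per learned prefix (illustrative values only).
full_table = {
    "198.51.100.0/24": "upstream-a",
    "203.0.113.0/24": "upstream-b",
    "0.0.0.0/0": "upstream-a",
}
# Default-only table: a single catch-all entry, so its size never grows.
default_only = {"0.0.0.0/0": "upstream-a"}

print(lookup(full_table, "203.0.113.7"))    # upstream-b
print(lookup(default_only, "203.0.113.7"))  # upstream-a
```

The trade-off shown is the usual one: the default route costs almost no memory, but every destination goes out the same way, so it only works if that one upstream can actually deliver everything.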

Speaking of being super busy.... I apologize for the lack of communication recently. Both Matt and Bob have been gone for the past month, and Eric for the past week. That leaves one man on deck. So I'm a bit um.. stretched.
ID: 1157929 · Report as offensive
OzzFan
Volunteer tester
Joined: 9 Apr 02
Posts: 15691
Credit: 84,761,841
RAC: 28
United States
Message 1157931 - Posted: 1 Oct 2011, 16:23:56 UTC - in response to Message 1157929.  

Thanks for the update Jeff, it is much appreciated!
ID: 1157931 · Report as offensive
perryjay
Volunteer tester
Joined: 20 Aug 02
Posts: 3377
Credit: 20,676,751
RAC: 0
United States
Message 1157933 - Posted: 1 Oct 2011, 16:39:45 UTC - in response to Message 1157929.  

Thanks for the update Jeff. Should we be thinking about a fund drive for more router memory yet? Or should we wait until we see if you might need a new router?

Thank HE for us and everyone else involved too.

Now, get back to work!!! Only joking. Thanks for all the work you have been doing and for keeping us updated.


PROUD MEMBER OF Team Starfire World BOINC
ID: 1157933 · Report as offensive
HAL9000
Volunteer tester
Joined: 11 Sep 99
Posts: 6534
Credit: 196,805,888
RAC: 57
United States
Message 1157939 - Posted: 1 Oct 2011, 16:47:22 UTC

Thanks for the info Jeff. Hopefully the router isn't maxed out at 1 GB already and just has something like 128 MB or 256 MB in it.
SETI@home classic workunits: 93,865 CPU time: 863,447 hours
Join the BP6/VP6 User Group: http://tinyurl.com/8y46zvu
ID: 1157939 · Report as offensive
Wembley
Volunteer tester
Joined: 16 Sep 09
Posts: 429
Credit: 1,844,293
RAC: 0
United States
Message 1157944 - Posted: 1 Oct 2011, 16:58:02 UTC - in response to Message 1157939.  
Last modified: 1 Oct 2011, 17:00:12 UTC

Thanks for the info Jeff. Hopefully the router isn't maxed out at 1 GB already and just has something like 128 MB or 256 MB in it.


A quick search on eBay shows a bunch of 1 GB kits, the lowest at $136.

cisco 7301 memory 1gb
ID: 1157944 · Report as offensive
Richard Haselgrove Project Donor
Volunteer tester

Joined: 4 Jul 99
Posts: 14650
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1157945 - Posted: 1 Oct 2011, 17:00:00 UTC - in response to Message 1157929.  

And thanks from me, too.

All of which raises another question. What happened back in June to make the routing table, apparently, outgrow the memory which has served us for the last three years or more?

According to http://boinc.netsoft-online.com/e107_plugins/boinc/bp.php?project=19 the numbers of both active users and active hosts - the only ones of interest to a router, surely? - have been declining steadily for the last three months. Even if there was a peak in June, we should have dipped back down below the limit by now. Or have the routing tables themselves grown in size with the preparations for IPv6?
ID: 1157945 · Report as offensive
HAL9000
Volunteer tester
Joined: 11 Sep 99
Posts: 6534
Credit: 196,805,888
RAC: 57
United States
Message 1157949 - Posted: 1 Oct 2011, 17:05:13 UTC - in response to Message 1157945.  

And thanks from me, too.

All of which raises another question. What happened back in June to make the routing table, apparently, outgrow the memory which has served us for the last three years or more?

According to http://boinc.netsoft-online.com/e107_plugins/boinc/bp.php?project=19 the numbers of both active users and active hosts - the only ones of interest to a router, surely? - have been declining steadily for the last three months. Even if there was a peak in June, we should have dipped back down below the limit by now. Or have the routing tables themselves grown in size with the preparations for IPv6?

With the IPv6 stuff starting up in June that is a very good possibility. Maybe just having the router drop IPv6 packets for now could be a solution.
SETI@home classic workunits: 93,865 CPU time: 863,447 hours
Join the BP6/VP6 User Group: http://tinyurl.com/8y46zvu
ID: 1157949 · Report as offensive
Terror Australis
Volunteer tester

Joined: 14 Feb 04
Posts: 1817
Credit: 262,693,308
RAC: 44
Australia
Message 1157957 - Posted: 1 Oct 2011, 17:35:24 UTC - in response to Message 1157929.  

We are working with HE to clear up this routing issue. They are being very helpful.

We will need to involve parties beyond HE. Here is a little bit of background. Our PAIX router physically talks to HE on one side and to CENIC on the other, whereupon the physical path proceeds through CENIC, then through UCB, then through SSL before finally landing at the router in our closet. All of this even though, logically, we tunnel directly from our PAIX router to our closet router. Both of our routers were donated by Packet Clearing House (PCH). They also helped us initially set them up and physically host our PAIX router in their rack at the PAIX. PCH really came to our rescue way back when we learned that we could not use (or afford to use) the UCB commodity provider. They also helped us migrate from Cogent to HE.

Back to the problem...
HE guessed that we don't have enough memory in our PAIX router to hold the giant routing table that we need to hold. This is more or less confirmed by looking at the memory stats on that router. One solution (the best) is to add memory, or replace the router if it cannot take more memory.

Another, perhaps interim, solution suggested by HE is to have our router utilize a default route rather than maintain a huge routing table. We tried that. Well, at least HE did their part to make it so. And it flat-lined our traffic. Thus our need to involve the other players. I have emailed them all. But everyone is super busy and, in the case of PCH, all effort is donated.

Speaking of being super busy.... I apologize for the lack of communication recently. Both Matt and Bob have been gone for the past month, and Eric for the past week. That leaves one man on deck. So I'm a bit um.. stretched.

Thanks Jeff
Can you please post this as a locked thread up the top of this forum? Good info like this should not be buried near the bottom of a thread with 300 odd posts

T.A.
ID: 1157957 · Report as offensive
Claggy
Volunteer tester

Joined: 5 Jul 99
Posts: 4654
Credit: 47,537,079
RAC: 4
United Kingdom
Message 1157992 - Posted: 1 Oct 2011, 19:38:50 UTC - in response to Message 1157929.  

Thanks for the update Jeff,

Any chance of re-enabling the scheduler at SETI Beta so we can have half a chance of reporting our results there, please?

Claggy
ID: 1157992 · Report as offensive
kittyman
Volunteer tester
Joined: 9 Jul 00
Posts: 51468
Credit: 1,018,363,574
RAC: 1,004
United States
Message 1158033 - Posted: 1 Oct 2011, 22:24:08 UTC

Thank you very much Jeff for the informative post.
Now perhaps some can understand how involved the problem is.

Please do let us know if the router memory is upgradable, and if so, what you need. I am sure we can get together and get it in your hands post haste.

Same goes for a different router, if that is what is required to get the problem completely fixed. Just give us the specs on what would be needed for a proper replacement, and most assuredly it shall be a top priority for us.

Meow!!
"Freedom is just Chaos, with better lighting." Alan Dean Foster

ID: 1158033 · Report as offensive
Mad Max
Volunteer tester
Joined: 16 Mar 00
Posts: 475
Credit: 213,231,775
RAC: 407
United States
Message 1158050 - Posted: 1 Oct 2011, 23:24:45 UTC

Not sure if anyone else has tried this, but I flushed the DNS cache on two of my work machines that had not connected in over a week. Both have since communicated with S@H: I did one yesterday and saw results, and the other today. Just throwing this out there in case no one else has tried it yet. You never know.
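
One way to check whether a stale DNS answer is even a candidate before flushing anything is to compare what the local resolver returns against known-good addresses. This is only a sketch: the two A records are the ones posted further down this thread for boinc2.ssl.berkeley.edu and may have changed since, and the function names are made up for illustration. The resolver is passed in as a parameter so the logic can be tried without touching the network.

```python
import socket

# A records for boinc2.ssl.berkeley.edu as posted in this thread (may be outdated).
EXPECTED = {"208.68.240.13", "208.68.240.18"}


def resolve_ipv4(hostname, resolver=socket.getaddrinfo):
    """Return the set of IPv4 addresses the resolver gives for hostname."""
    infos = resolver(hostname, None, socket.AF_INET)
    # Each entry is (family, type, proto, canonname, sockaddr);
    # for IPv4, sockaddr is an (address, port) pair.
    return {info[4][0] for info in infos}


def looks_stale(hostname, expected, resolver=socket.getaddrinfo):
    """True if the resolver returns any address outside the expected set."""
    return not resolve_ipv4(hostname, resolver) <= expected
```

Calling looks_stale("boinc2.ssl.berkeley.edu", EXPECTED) on an affected machine would show whether it is holding an unexpected address. If the addresses match and the host still can't connect, the problem is more likely routing than DNS.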
IAS - Where Space Is Golden!
ID: 1158050 · Report as offensive
Steve
Volunteer tester

Joined: 30 Jan 06
Posts: 3
Credit: 727,695
RAC: 3
United States
Message 1158203 - Posted: 2 Oct 2011, 11:24:26 UTC - in response to Message 1122104.  

Hello there,

The only project I'm having trouble with is the Seti@home project.

The problem only became serious in the past week. Before that I had sporadic problems, but they always worked themselves out.

My ISP is Cincinnati Bell.

I live in Cincinnati, Ohio

All pings to the 4 IP addresses failed, with no replies (100% failure).

I'm not sure what traceroute or nslookup are.

Hope this helps.
ID: 1158203 · Report as offensive
Steve
Volunteer tester

Joined: 30 Jan 06
Posts: 3
Credit: 727,695
RAC: 3
United States
Message 1158443 - Posted: 3 Oct 2011, 4:36:57 UTC - in response to Message 1122104.  

IP address: 208.68.240.13
Host name: boinc2.ssl.berkeley.edu
208.68.240.13 is from United States(US) in region North America

TraceRoute to 208.68.240.13 [boinc2.ssl.berkeley.edu]

Hop  (ms)       (ms)       (ms)       IP Address       Host name
1    1          0          0          206.123.64.154   jbdr2.0.dal.colo4.com
2    0          0          0          173.219.246.92   173-219-246-92-link.sta.suddenlink.net
3    0          0          0          206.223.118.37   10gigabitethernet3-1.core1.dal1.he.net
4    26         27         26         72.52.92.253     10gigabitethernet1-2.core1.phx1.he.net
5    36         42         36         72.52.92.249     10gigabitethernet2-2.core1.lax1.he.net
6    Timed out  Timed out  Timed out  -
7    45         45         45         64.71.140.42     -
8    49         48         49         208.68.243.254   -
9    49         49         49         208.68.240.13    boinc2.ssl.berkeley.edu
Trace complete


Retrieving DNS records for boinc2.ssl.berkeley.edu...
DNS servers
adns2.berkeley.edu [128.32.136.14]
adns1.berkeley.edu [128.32.136.3]

Answer records
boinc2.ssl.berkeley.edu A 208.68.240.13 300s
boinc2.ssl.berkeley.edu A 208.68.240.18 300s

Authority records
ssl.berkeley.edu NS adns2.berkeley.edu 86400s
ssl.berkeley.edu NS adns1.berkeley.edu 86400s

Additional records
adns1.berkeley.edu A 128.32.136.3 172800s
adns1.berkeley.edu 28 [16 bytes] 3600s
adns2.berkeley.edu A 128.32.136.14 172800s
adns2.berkeley.edu 28 [16 bytes] 3600s

For all the IP addresses it times out on hop 6, if that is any use to you.
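
If anyone wants to compare traces like this across several hosts, a minimal Python sketch for pulling out the silent hops follows. It assumes the whitespace-separated layout of the web traceroute tool output above (hop number first, then three probe columns); the function name is made up for illustration, and Windows tracert output, which marks lost probes with "*", would need a different check.

```python
def timed_out_hops(trace_text):
    """Return hop numbers whose three probes all timed out."""
    hops = []
    for line in trace_text.splitlines():
        fields = line.split()
        # Data rows start with the hop number; skip headers and footers.
        if not fields or not fields[0].isdigit():
            continue
        if line.count("Timed out") == 3:
            hops.append(int(fields[0]))
    return hops


trace = """\
Hop (ms) (ms) (ms) IP Address Host name
5 36 42 36 72.52.92.249 10gigabitethernet2-2.core1.lax1.he.net
6 Timed out Timed out Timed out -
7 45 45 45 64.71.140.42 -
9 49 49 49 208.68.240.13 boinc2.ssl.berkeley.edu
Trace complete
"""
print(timed_out_hops(trace))  # [6]
```

Note that a single silent hop followed by hops that do answer, as here, usually just means that one router does not reply to probes. The trace still reaches boinc2.ssl.berkeley.edu at hop 9, so from that vantage point the path was working.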
ID: 1158443 · Report as offensive
 
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.