Panic Mode On (24) Server problems


rob smith
Volunteer tester
Joined: 7 Mar 03
Posts: 8146
Credit: 52,814,016
RAC: 75,794
United Kingdom
Message 932872 - Posted: 12 Sep 2009, 20:51:26 UTC

Looking at the server status, there's something amiss with the AP splitters: none are running and jobs are stalled in progress. MB splitters are working, but it would be nice to get a few APs coming out.....
____________
Bob Smith
Member of Seti PIPPS (Pluto is a Planet Protest Society)
Somewhere in the (un)known Universe?

1mp0£173
Volunteer tester
Joined: 3 Apr 99
Posts: 8423
Credit: 356,897
RAC: 0
United States
Message 932877 - Posted: 12 Sep 2009, 21:39:35 UTC - in response to Message 932844.

Are there still problems with the uploads? Some go through, but my latest one has not gone through since 19:08 UTC. Will wait until it does; when that will be is the question.

Depends a lot on what you mean by "problem."

In my opinion, it is only a problem when work misses the deadline because it couldn't be uploaded in time.
____________

twister@austria-national-team.at
Volunteer tester
Joined: 26 Jan 00
Posts: 30
Credit: 60,419,551
RAC: 0
Austria
Message 933049 - Posted: 13 Sep 2009, 14:41:35 UTC

My pending credit is on its way to 600,000, and I still cannot see it.

And when I look at the "tasks" for my computer, I see that I supposedly last received WUs yesterday, 12 Sept., which is not true either. The PC has been requesting and receiving work all day. The PHP page also shows results that are not correct.


____________

Gundolf Jahn
Joined: 19 Sep 00
Posts: 3184
Credit: 356,897
RAC: 19
Germany
Message 933055 - Posted: 13 Sep 2009, 15:12:39 UTC - in response to Message 933049.
Last modified: 13 Sep 2009, 15:13:27 UTC

My pending credit is on its way to 600,000, and I still cannot see it.

And when I look at the "tasks" for my computer, I see that I supposedly last received WUs yesterday, 12 Sept., which is not true either. The PC has been requesting and receiving work all day. The PHP page also shows results that are not correct.


The replica database is offline.

Regards,
Gundolf

FiveHamlet
Joined: 5 Oct 99
Posts: 783
Credit: 32,638,578
RAC: 0
United Kingdom
Message 933069 - Posted: 13 Sep 2009, 15:56:00 UTC
Last modified: 13 Sep 2009, 15:56:25 UTC

Replica database now back online.

From us all.
Thanks to whoever gave it a kick.
____________

Josef W. Segur
Volunteer developer
Volunteer tester
Joined: 30 Oct 99
Posts: 4206
Credit: 1,030,844
RAC: 270
United States
Message 933134 - Posted: 13 Sep 2009, 20:26:48 UTC

Astropulse splitters producing, download bandwidth heading for saturation....

Joe

HAL
Joined: 28 Mar 03
Posts: 704
Credit: 870,617
RAC: 0
United States
Message 933162 - Posted: 13 Sep 2009, 23:10:27 UTC

replica database retired AGAIN - very short life span, new game of guess your score.

Vistro
Joined: 6 Aug 08
Posts: 233
Credit: 316,549
RAC: 0
United States
Message 933166 - Posted: 13 Sep 2009, 23:19:35 UTC

BOINC replica database jocelyn Running

Replica seconds behind master Offline 0m

Which one is lying?
____________
30+ Computers heading our way! Currently at the "Zomg we need to talk to our tech expert at the co-op about this first!!!" stage. 16 Lab machines and 14+ Staff machines each with 2.2Ghz CPUs and 256MB ram. Think they balance? The RAM certainly is bad

Fred W
Volunteer tester
Joined: 13 Jun 99
Posts: 2524
Credit: 11,954,210
RAC: 0
United Kingdom
Message 933168 - Posted: 13 Sep 2009, 23:28:09 UTC - in response to Message 933166.
Last modified: 13 Sep 2009, 23:33:05 UTC

Which one is lying?

Probably neither. The replica database is probably running as a database but not as a "replica" according to my Tasks page (i.e. it is not being kept up to date).

F.

[Edited]
____________

Grant (SSSF)
Joined: 19 Aug 99
Posts: 5697
Credit: 56,444,074
RAC: 48,995
Australia
Message 933238 - Posted: 14 Sep 2009, 6:34:27 UTC


Someone appears to have done something during the day- uploads are going through without any problems now.
____________
Grant
Darwin NT.

Fred W
Volunteer tester
Joined: 13 Jun 99
Posts: 2524
Credit: 11,954,210
RAC: 0
United Kingdom
Message 933242 - Posted: 14 Sep 2009, 6:59:24 UTC - in response to Message 933238.


Someone appears to have done something during the day- uploads are going through without any problems now.

Not from this side of the globe:

14/09/2009 07:56:20 SETI@home Started upload of 25au09ab.19058.25016.7.10.249_1_0
14/09/2009 07:56:22 Project communication failed: attempting access to reference site
14/09/2009 07:56:22 SETI@home Temporarily failed upload of 25au09ab.19058.25016.7.10.249_1_0: HTTP error
14/09/2009 07:56:22 SETI@home Backing off 1 min 30 sec on upload of 25au09ab.19058.25016.7.10.249_1_0
14/09/2009 07:56:24 Internet access OK - project servers may be temporarily down.
14/09/2009 07:57:01 SETI@home Started upload of 25au09ab.19058.25016.7.10.249_1_0
14/09/2009 07:57:03 Project communication failed: attempting access to reference site
14/09/2009 07:57:03 SETI@home Temporarily failed upload of 25au09ab.19058.25016.7.10.249_1_0: HTTP error
14/09/2009 07:57:03 SETI@home Backing off 5 min 9 sec on upload of 25au09ab.19058.25016.7.10.249_1_0
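
For anyone who wants to tally these backoffs rather than eyeball them, here is a minimal Python sketch. It assumes the client log lines look exactly like the excerpt above ("Backing off N min M sec on upload of NAME"). The function name `backoff_seconds` and the regex are made up for illustration and are not part of BOINC; other duration formats are not handled.

```python
import re

# Matches only the "Backing off N min M sec on upload of NAME" form seen
# in the log excerpt; any other wording is an unknown and is skipped.
BACKOFF_RE = re.compile(r"Backing off (\d+) min (\d+) sec on upload of (\S+)")


def backoff_seconds(log_text):
    """Return a list of (file_name, seconds) for each upload backoff line."""
    results = []
    for line in log_text.splitlines():
        match = BACKOFF_RE.search(line)
        if match is None:
            continue  # not a backoff line
        minutes, seconds, name = match.groups()
        results.append((name, int(minutes) * 60 + int(seconds)))
    return results


# Two lines copied from the log above.
SAMPLE = """\
14/09/2009 07:56:22 SETI@home Backing off 1 min 30 sec on upload of 25au09ab.19058.25016.7.10.249_1_0
14/09/2009 07:57:03 SETI@home Backing off 5 min 9 sec on upload of 25au09ab.19058.25016.7.10.249_1_0
"""

if __name__ == "__main__":
    for name, secs in backoff_seconds(SAMPLE):
        print(name, secs)
```

On the two backoff lines above it gives 90 and 309 seconds for the same file, so the retry delay is growing between attempts.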


F.

____________

Grant (SSSF)
Joined: 19 Aug 99
Posts: 5697
Credit: 56,444,074
RAC: 48,995
Australia
Message 933252 - Posted: 14 Sep 2009, 8:30:15 UTC - in response to Message 933242.


Someone appears to have done something during the day- uploads are going through without any problems now.

Not from this side of the globe:

My fault.
It was working well for a few hours there, until not long after I commented on how well it had been working.
Multiple attempts at uploading before success have resumed here as well.
____________
Grant
Darwin NT.

W5DMG - Dave
Joined: 19 May 99
Posts: 155
Credit: 32,154,377
RAC: 14,003
United States
Message 933284 - Posted: 14 Sep 2009, 13:42:29 UTC - in response to Message 933252.
Last modified: 14 Sep 2009, 13:49:25 UTC

9/14/2009 8:19:43 AM SETI@home Started upload of 29au09aa.20757.19699.10.10.252_0_0
9/14/2009 8:19:44 AM Project communication failed: attempting access to reference site
9/14/2009 8:19:44 AM SETI@home Temporarily failed upload of 29au09aa.20757.19699.10.10.252_0_0: HTTP error
9/14/2009 8:19:44 AM SETI@home Backing off 1 min 0 sec on upload of 29au09aa.20757.19699.10.10.252_0_0
9/14/2009 8:19:46 AM Internet access OK - project servers may be temporarily down.
9/14/2009 8:19:46 AM SETI@home Started upload of 28au09ad.15131.19699.3.10.73_0_0
9/14/2009 8:19:47 AM Project communication failed: attempting access to reference site
9/14/2009 8:19:47 AM SETI@home Temporarily failed upload of 28au09ad.15131.19699.3.10.73_0_0: HTTP error
9/14/2009 8:19:47 AM SETI@home Backing off 18 min 8 sec on upload of 28au09ad.15131.19699.3.10.73_0_0
9/14/2009 8:19:49 AM Internet access OK - project servers may be temporarily down.
9/14/2009 8:19:53 AM SETI@home Started upload of 28au09ad.15131.19699.3.10.169_0_0
9/14/2009 8:19:54 AM Project communication failed: attempting access to reference site


Yes I have been having this same problem now for the past 24 hours.

[seti.international] Dirk Sadowski
Volunteer tester
Joined: 6 Apr 07
Posts: 7026
Credit: 59,289,366
RAC: 20,714
Germany
Message 933287 - Posted: 14 Sep 2009, 13:49:12 UTC
Last modified: 14 Sep 2009, 13:49:59 UTC


Hey - thanks for enabling AstroPulse again..


Uploads don't go through..

____________
BR



>Das Deutsche Cafe. The German Cafe.<

Grant (SSSF)
Joined: 19 Aug 99
Posts: 5697
Credit: 56,444,074
RAC: 48,995
Australia
Message 933450 - Posted: 15 Sep 2009, 4:28:03 UTC - in response to Message 933287.
Last modified: 15 Sep 2009, 4:29:02 UTC

The upload problem is getting more severe.
It used to only take a few attempts for an upload to go through, now it's taking dozens (at a minimum).
____________
Grant
Darwin NT.

PP
Joined: 3 Jul 99
Posts: 42
Credit: 10,012,664
RAC: 0
Sweden
Message 933475 - Posted: 15 Sep 2009, 8:48:40 UTC - in response to Message 933450.
Last modified: 15 Sep 2009, 8:49:27 UTC

Yes, something is really wrong. I have hundreds of finished jobs pending upload. Any connection attempt looks like this:

10:42:44.802571 IP my.ip.42902 > setiboincdata.ssl.berkeley.edu.http: S 2647950799:2647950799(0) win 5840 <mss 1460,sackOK,timestamp 47594046 0,nop,wscale 7>
10:42:44.802727 IP my.ip.42903 > setiboincdata.ssl.berkeley.edu.http: S 1789212786:1789212786(0) win 5840 <mss 1460,sackOK,timestamp 47594046 0,nop,wscale 7>
10:42:45.008852 IP setiboincdata.ssl.berkeley.edu.http > my.ip.42902: S 3884893807:3884893807(0) ack 2647950800 win 5696 <mss 1436,sackOK,timestamp 1175400497 47594046,nop,wscale 10>
10:42:45.009103 IP setiboincdata.ssl.berkeley.edu.http > my.ip.42903: S 3888998256:3888998256(0) ack 1789212787 win 5696 <mss 1436,sackOK,timestamp 1175400498 47594046,nop,wscale 10>
10:42:45.011758 IP my.ip.42902 > setiboincdata.ssl.berkeley.edu.http: P 1:255(254) ack 1 win 46 <nop,nop,timestamp 47594109 1175400497>
10:42:45.012778 IP my.ip.42903 > setiboincdata.ssl.berkeley.edu.http: P 1:255(254) ack 1 win 46 <nop,nop,timestamp 47594109 1175400498>
10:42:45.014820 IP my.ip.42902 > setiboincdata.ssl.berkeley.edu.http: . ack 1 win 46 <nop,nop,timestamp 47594109 1175400497>
10:42:45.014823 IP my.ip.42903 > setiboincdata.ssl.berkeley.edu.http: . ack 1 win 46 <nop,nop,timestamp 47594109 1175400498>
10:42:45.218107 IP setiboincdata.ssl.berkeley.edu.http > my.ip.42902: . ack 255 win 7 <nop,nop,timestamp 1175400707 47594109>
10:42:45.218233 IP setiboincdata.ssl.berkeley.edu.http > my.ip.42902: R 1:1(0) ack 255 win 7 <nop,nop,timestamp 1175400707 47594109>
10:42:45.219858 IP setiboincdata.ssl.berkeley.edu.http > my.ip.42903: . ack 255 win 7 <nop,nop,timestamp 1175400708 47594109>
10:42:45.220233 IP setiboincdata.ssl.berkeley.edu.http > my.ip.42903: R 1:1(0) ack 255 win 7 <nop,nop,timestamp 1175400708 47594109>
10:42:45.220731 IP setiboincdata.ssl.berkeley.edu.http > my.ip.42903: R 3888998257:3888998257(0) win 0
10:42:45.220981 IP setiboincdata.ssl.berkeley.edu.http > my.ip.42902: R 3884893808:3884893808(0) win 0
10:42:45.224115 IP my.ip.42902 > setiboincdata.ssl.berkeley.edu.http: P 255:542(287) ack 1 win 46 <nop,nop,timestamp 47594172 1175400707>
10:42:45.226156 IP my.ip.42903 > setiboincdata.ssl.berkeley.edu.http: P 255:542(287) ack 1 win 46 <nop,nop,timestamp 47594172 1175400708>
10:42:45.430738 IP setiboincdata.ssl.berkeley.edu.http > my.ip.42902: R 3884893808:3884893808(0) win 0
10:42:45.432610 IP setiboincdata.ssl.berkeley.edu.http > my.ip.42903: R 3888998257:3888998257(0) win 0
10:42:46.400563 IP my.ip.42904 > setiboincdata.ssl.berkeley.edu.http: S 2087092463:2087092463(0) win 5840 <mss 1460,sackOK,timestamp 47594525 0,nop,wscale 7>
10:42:46.400567 IP my.ip.42905 > setiboincdata.ssl.berkeley.edu.http: S 807197585:807197585(0) win 5840 <mss 1460,sackOK,timestamp 47594525 0,nop,wscale 7>
10:42:46.605942 IP setiboincdata.ssl.berkeley.edu.http > my.ip.42904: S 3925343597:3925343597(0) ack 2087092464 win 5696 <mss 1436,sackOK,timestamp 1175402095 47594525,nop,wscale 10>
10:42:46.606941 IP setiboincdata.ssl.berkeley.edu.http > my.ip.42905: S 3929144410:3929144410(0) ack 807197586 win 5696 <mss 1436,sackOK,timestamp 1175402095 47594525,nop,wscale 10>
10:42:46.610895 IP my.ip.42904 > setiboincdata.ssl.berkeley.edu.http: P 1:255(254) ack 1 win 46 <nop,nop,timestamp 47594588 1175402095>
10:42:46.612937 IP my.ip.42905 > setiboincdata.ssl.berkeley.edu.http: P 1:255(254) ack 1 win 46 <nop,nop,timestamp 47594588 1175402095>
10:42:46.613956 IP my.ip.42904 > setiboincdata.ssl.berkeley.edu.http: . ack 1 win 46 <nop,nop,timestamp 47594588 1175402095>
10:42:46.613960 IP my.ip.42905 > setiboincdata.ssl.berkeley.edu.http: . ack 1 win 46 <nop,nop,timestamp 47594588 1175402095>
10:42:46.817697 IP setiboincdata.ssl.berkeley.edu.http > my.ip.42904: . ack 255 win 7 <nop,nop,timestamp 1175402306 47594588>
10:42:46.817713 IP setiboincdata.ssl.berkeley.edu.http > my.ip.42904: R 1:1(0) ack 255 win 7 <nop,nop,timestamp 1175402306 47594588>
10:42:46.819167 IP my.ip.42904 > setiboincdata.ssl.berkeley.edu.http: P 255:542(287) ack 1 win 46 <nop,nop,timestamp 47594652 1175402306>
10:42:46.820824 IP setiboincdata.ssl.berkeley.edu.http > my.ip.42904: R 3925343598:3925343598(0) win 0
10:42:46.820837 IP setiboincdata.ssl.berkeley.edu.http > my.ip.42905: . ack 255 win 7 <nop,nop,timestamp 1175402308 47594588>
10:42:46.820944 IP setiboincdata.ssl.berkeley.edu.http > my.ip.42905: R 1:1(0) ack 255 win 7 <nop,nop,timestamp 1175402308 47594588>
10:42:46.822070 IP setiboincdata.ssl.berkeley.edu.http > my.ip.42905: R 3929144411:3929144411(0) win 0
10:42:46.834479 IP my.ip.42905 > setiboincdata.ssl.berkeley.edu.http: P 255:538(283) ack 1 win 46 <nop,nop,timestamp 47594652 1175402308>
10:42:47.026203 IP setiboincdata.ssl.berkeley.edu.http > my.ip.42904: R 3925343598:3925343598(0) win 0
10:42:47.041319 IP setiboincdata.ssl.berkeley.edu.http > my.ip.42905: R 3929144411:3929144411(0) win 0


The TCP handshake appears to go well, but then the server simply resets the connection. No FIN, no graceful disconnect. I think those Berkeley network guys need the motivation of a well-placed blow torch. ;-)
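
A rough way to check a capture for the same pattern, sketched in Python. It assumes the older tcpdump text layout shown above (timestamp, `IP`, source, `>`, destination, then the flags token) and that the client side appears as `my.ip.PORT`, as in the anonymised paste. `summarize` and `reset_without_fin` are hypothetical helper names, not from any library.

```python
import re
from collections import defaultdict

# timestamp, "IP", source, ">", destination followed by ":", then the flags token
LINE_RE = re.compile(r"^(\S+) IP (\S+) > (\S+): (\S+)")


def summarize(capture, client_prefix="my.ip."):
    """Collect the TCP flag tokens per client port, split by direction."""
    flows = defaultdict(lambda: {"client": [], "server": []})
    for line in capture.splitlines():
        match = LINE_RE.match(line.strip())
        if match is None:
            continue  # not a packet line in the assumed layout
        _timestamp, src, dst, flags = match.groups()
        if src.startswith(client_prefix):
            flows[src.rsplit(".", 1)[1]]["client"].append(flags)
        elif dst.startswith(client_prefix):
            flows[dst.rsplit(".", 1)[1]]["server"].append(flags)
    return flows


def reset_without_fin(flow):
    """True if both sides sent SYN, the server sent RST, and nobody sent FIN."""
    handshake = (any("S" in f for f in flow["client"])
                 and any("S" in f for f in flow["server"]))
    server_reset = any("R" in f for f in flow["server"])
    any_fin = any("F" in f for f in flow["client"] + flow["server"])
    return handshake and server_reset and not any_fin


# Lines for client port 42902, copied from the capture above.
SAMPLE = """\
10:42:44.802571 IP my.ip.42902 > setiboincdata.ssl.berkeley.edu.http: S 2647950799:2647950799(0) win 5840 <mss 1460,sackOK,timestamp 47594046 0,nop,wscale 7>
10:42:45.008852 IP setiboincdata.ssl.berkeley.edu.http > my.ip.42902: S 3884893807:3884893807(0) ack 2647950800 win 5696 <mss 1436,sackOK,timestamp 1175400497 47594046,nop,wscale 10>
10:42:45.011758 IP my.ip.42902 > setiboincdata.ssl.berkeley.edu.http: P 1:255(254) ack 1 win 46 <nop,nop,timestamp 47594109 1175400497>
10:42:45.218107 IP setiboincdata.ssl.berkeley.edu.http > my.ip.42902: . ack 255 win 7 <nop,nop,timestamp 1175400707 47594109>
10:42:45.218233 IP setiboincdata.ssl.berkeley.edu.http > my.ip.42902: R 1:1(0) ack 255 win 7 <nop,nop,timestamp 1175400707 47594109>
10:42:45.220981 IP setiboincdata.ssl.berkeley.edu.http > my.ip.42902: R 3884893808:3884893808(0) win 0
"""

if __name__ == "__main__":
    for port, flow in summarize(SAMPLE).items():
        print(port, flow, reset_without_fin(flow))
```

This only counts flag tokens; it does not track sequence numbers or tell you which device on the path generated the RST.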
____________

Terror Australis
Volunteer tester
Joined: 14 Feb 04
Posts: 1668
Credit: 203,571,288
RAC: 24,880
Australia
Message 933476 - Posted: 15 Sep 2009, 8:54:25 UTC
Last modified: 15 Sep 2009, 8:58:15 UTC

Is Astropulse really to blame ??
The Cricket graphs show "bits in" (output to us) averaging 64 Mb/s and currently at 50 Mb/s. This should leave plenty of headroom for our uploads, but "bits out" (our uploads) has been falling steadily since midnight Sunday, Berkeley time, and is currently only 4.29 Mb/s.

This means that there is a lot of unused capacity on the system at the moment. So what is happening?

PP
Joined: 3 Jul 99
Posts: 42
Credit: 10,012,664
RAC: 0
Sweden
Message 933477 - Posted: 15 Sep 2009, 9:05:23 UTC - in response to Message 933476.

Maybe they forgot to upgrade all of their Cisco routers... :-)
http://www.cisco.com/warp/public/707/cisco-sa-20090908-tcp24.shtml
____________

Grant (SSSF)
Joined: 19 Aug 99
Posts: 5697
Credit: 56,444,074
RAC: 48,995
Australia
Message 933478 - Posted: 15 Sep 2009, 9:23:02 UTC - in response to Message 933476.

Is Astropulse really to blame ??

Nope.
It's some sort of issue with the upload server; hopefully a configuration tweak is all that'll be required.
____________
Grant
Darwin NT.

Richard Haselgrove
Volunteer tester
Joined: 4 Jul 99
Posts: 8375
Credit: 46,768,506
RAC: 22,847
United Kingdom
Message 933479 - Posted: 15 Sep 2009, 10:12:36 UTC - in response to Message 933478.

Is Astropulse really to blame ??

Nope.
It's some sort of issue with the upload server; hopefully a configuration tweak is all that'll be required.

Or possibly a router problem: PP's information is interesting.

Cisco's advisory suggests that a router reboot may be helpful: that should be a relatively easy thing to try during the maintenance outage today.


Copyright © 2014 University of California