Dropped packets

Message boards : News : Dropped packets

Eric Korpela Project Donor
Volunteer moderator
Project administrator
Project developer
Project scientist
Joined: 3 Apr 99
Posts: 1382
Credit: 54,506,847
RAC: 60
United States
Message 1932903 - Posted: 30 Apr 2018, 17:23:04 UTC

The UC data center switched over to a new firewall this morning. Since then, packets into and out of the data center have been getting dropped. The data center staff is debugging the problem; we'll probably be dropping packets until it's resolved.
@SETIEric@qoto.org (Mastodon)

ID: 1932903
kittyman Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor
Volunteer tester
Joined: 9 Jul 00
Posts: 51478
Credit: 1,018,363,574
RAC: 1,004
United States
Message 1932906 - Posted: 30 Apr 2018, 17:37:03 UTC

Thank you for the update, Eric.
At least we know they are aware of the problem and I am sure it will be resolved in due course.
"Time is simply the mechanism that keeps everything from happening all at once."

ID: 1932906
betreger Project Donor
Joined: 29 Jun 99
Posts: 11415
Credit: 29,581,041
RAC: 66
United States
Message 1932939 - Posted: 30 Apr 2018, 22:09:49 UTC - in response to Message 1932936.  

Sten, did you drop a packet?
ID: 1932939
Gary Charpentier Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor
Volunteer tester
Joined: 25 Dec 00
Posts: 31006
Credit: 53,134,872
RAC: 32
United States
Message 1932948 - Posted: 1 May 2018, 0:49:08 UTC

FYI, there may be more wonkiness this evening, starting at 10:00 PM Berkeley time.
http://ucbsystems.org/category/active/ <--- save the link for future reference if the system goes down some other time
Description: UPDATE: Monday, April 30, 2018 3:38pm – The firewalls have been stable since 11:40am this morning. Users may need to check/restart services that could have hung during the outage.
IST staff are still working on the root cause of this outage. This evening after business hours at 10:00pm, the network team will troubleshoot further to fully restore network services.
ID: 1932948
Keith Myers Special Project $250 donor
Volunteer tester
Joined: 29 Apr 01
Posts: 13164
Credit: 1,160,866,277
RAC: 1,873
United States
Message 1932952 - Posted: 1 May 2018, 1:52:58 UTC - in response to Message 1932948.  

Thanks for that update Gary. Need to make sure the systems are ready to put to bed before 10PM.
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 1932952
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 13854
Credit: 208,696,464
RAC: 304
Australia
Message 1932955 - Posted: 1 May 2018, 7:51:44 UTC - in response to Message 1932948.  

Description: UPDATE: Monday, April 30, 2018 3:38pm – The firewalls have been stable since 11:40am this morning. Users may need to check/restart services that could have hung during the outage.

If only that were so.

Another IST update-
UPDATE: Monday, April 30, 2018 8:51pm – This evening at 7:06PM the data center firewalls reloaded on their own. The vendor is currently working to restore service.
Grant
Darwin NT
ID: 1932955
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 13854
Credit: 208,696,464
RAC: 304
Australia
Message 1932961 - Posted: 1 May 2018, 8:23:06 UTC
Last modified: 1 May 2018, 8:29:25 UTC

The web site & forums are back & responsive.
The Scheduler is responding quickly & download speeds are OK.

But uploads- taking lots & lots of retries to get them to go through, at 2-4kB/s when they eventually do upload.
Edit- now it's down to 1-2kB/s.
Grant
Darwin NT
ID: 1932961
Wiggo
Joined: 24 Jan 00
Posts: 36774
Credit: 261,360,520
RAC: 489
Australia
Message 1932967 - Posted: 1 May 2018, 8:43:06 UTC

If I can get some uploads done I'll let yous know how the downloads go.

Cheers.
ID: 1932967
Kissagogo27 Special Project $75 donor
Joined: 6 Nov 99
Posts: 716
Credit: 8,032,827
RAC: 62
France
Message 1932969 - Posted: 1 May 2018, 8:49:54 UTC

The beginning of the new failure (times are UTC+2):

01-May-2018 04:07:08 [SETI@home] Sending scheduler request: To fetch work.
01-May-2018 04:07:08 [SETI@home] Requesting new tasks for CPU and AMD/ATI GPU
01-May-2018 04:07:30 [SETI@home] Scheduler request failed: Couldn't connect to server
01-May-2018 04:07:34 [---] Project communication failed: attempting access to reference site
01-May-2018 04:07:36 [---] Internet access OK - project servers may be temporarily down.

ID: 1932969
Richard Haselgrove Project Donor
Volunteer tester

Joined: 4 Jul 99
Posts: 14679
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1932976 - Posted: 1 May 2018, 9:36:45 UTC - in response to Message 1932961.  

But uploads- taking lots & lots of retries to get them to go through, at 2-4kB/s when they eventually do upload.
Edit- now it's down to 1-2kB/s.
The actual upload speed is OK. The speed reported in BOINC Manager is averaged over all the stops, starts, and retries. Some uploads get stuck at the 16K point and have to go through the full restart from the beginning again.
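
As a rough illustration of that averaging, here is a minimal sketch. It is not BOINC's code, and the file size, the timings, and the function name `average_rate` are made-up assumptions; it only shows how time lost to stalled attempts drags down an average taken over the whole transfer.

```python
from typing import Sequence


def average_rate(file_kb: float, attempt_seconds: Sequence[float]) -> float:
    """Return kB/s averaged over every attempt, failed or not.

    attempt_seconds holds the wall-clock time of each try; only the last
    one is assumed to succeed and deliver the whole file.
    """
    total_seconds = sum(attempt_seconds)
    if total_seconds <= 0:
        raise ValueError("total elapsed time must be positive")
    return file_kb / total_seconds


# Hypothetical numbers: a 100 kB result that uploads in 2 s once it gets
# through (50 kB/s), after three attempts that each stalled for 15 s.
clean = average_rate(100, [2])
with_retries = average_rate(100, [15, 15, 15, 2])
print(f"clean: {clean:.1f} kB/s, with retries: {with_retries:.1f} kB/s")
```

With these invented numbers the link itself moves data at 50 kB/s, yet the average over all 47 seconds comes out at roughly 2 kB/s.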
ID: 1932976
Wiggo
Joined: 24 Jan 00
Posts: 36774
Credit: 261,360,520
RAC: 489
Australia
Message 1932982 - Posted: 1 May 2018, 10:16:04 UTC

After laboriously hitting the retry button I finally got all my uploads done and then the downloads just flowed down.

Cheers.
ID: 1932982
Eric Korpela Project Donor
Volunteer moderator
Project administrator
Project developer
Project scientist
Joined: 3 Apr 99
Posts: 1382
Credit: 54,506,847
RAC: 60
United States
Message 1933040 - Posted: 1 May 2018, 15:33:09 UTC - in response to Message 1932982.  

Posted by CSS IT ~ mp
On 4/30/2018 at 7:14 am PST
Modified on 5/1/2018 at 5:06 am PST
Modified by CSS IT ~ mp
Posted in Unscheduled Outage
Outage Type: UNSCHEDULED OUTAGE (AMENDED)
Date Submitted: Monday, April 30, 2018
Outage Start/End Time: 0700 – TBD
Groups Impacted: Campus
Equipment: Campus Network

Description: UPDATE: Tuesday, May 1, 2018 1:59am – The firewall has been stable since 12:42am, services appear to be restored. The vendor will continue monitoring.

Monday, April 30, 2018 8:51pm – This evening at 7:06pm the data center firewalls reloaded on their own. The vendor is currently working to restore service.

Monday, April 30, 2018 3:38pm – The firewalls have been stable since 11:40am this morning. Users may need to check/restart services that could have hung during the outage.

IST staff are still working on the root cause of this outage. This evening after business hours at 10:00pm, the network team will troubleshoot further to fully restore network services.

Monday, April 30, 2018 2:20pm – This continues to be a sporadic ongoing issue and the network team is working to resolve the problem. There is no ETA at this time.

Monday, April 30, 2018 10:17am – IST staff is aware of instability to the Palo Alto Firewall in the Earl Warren Data Center and is troubleshooting to determine the cause and work toward a resolution.

Monday, April 30, 2018 9:17am – The Service Desk continues to receive reports of network issues affecting many services including CAS, VPN, and connectivity to other applications hosted on campus. The network team is working to correct the issue as quickly as possible. All workloads hosted in our environment are up and running and should respond normally as soon as the issue is resolved.

The Service Desk are receiving calls of intermittent network issues.

IST staff is working quickly to identify the source of the problem and to restore services as quickly as possible.

INC:INC0660327
@SETIEric@qoto.org (Mastodon)

ID: 1933040
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 13854
Credit: 208,696,464
RAC: 304
Australia
Message 1933104 - Posted: 2 May 2018, 1:32:29 UTC - in response to Message 1932976.  

But uploads- taking lots & lots of retries to get them to go through, at 2-4kB/s when they eventually do upload.
Edit- now it's down to 1-2kB/s.
The actual upload speed is OK. The speed reported in BOINC Manager is averaged over all the stops, starts, and retries. Some uploads get stuck at the 16K point and have to go through the full restart from the beginning again.

Ok, so the longer it takes to time out, and the more retries it requires to go through, the lower the reported speed.
Grant
Darwin NT
ID: 1933104
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 13854
Credit: 208,696,464
RAC: 304
Australia
Message 1933128 - Posted: 2 May 2018, 3:58:26 UTC

Final IST update.
Outage Type: UNSCHEDULED OUTAGE (AMENDED) (RESOLVED)
Date Submitted: Monday, April 30 – Tuesday, May 1, 2018
Outage Start/End Time: 0700 – 1431
Groups Impacted: Campus
Equipment: Campus Network

Description: UPDATE: Tuesday, May 1, 2018 2:31pm – The firewall and network continue to remain stable. This issue is now resolved at this time.

Grant
Darwin NT
ID: 1933128
JamesJenkins

Joined: 14 Jul 14
Posts: 1
Credit: 11,920,594
RAC: 0
United States
Message 1933504 - Posted: 4 May 2018, 5:22:42 UTC
Last modified: 4 May 2018, 5:30:07 UTC

I once solved issues with network messaging software called JGroups by watching the UDP receive buffers using a Linux command. We had a UDP receive buffer that was overflowing and thus dropping UDP packets, so we increased concurrency (more threads) so that the receive buffer queue would be processed faster and not overflow. That fixed it. Computers can have concurrency limits or buffer limits.
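
The post doesn't say which command was used. One place Linux exposes these drops is the `RcvbufErrors` counter in the `Udp:` rows of `/proc/net/snmp`. The sketch below is an assumption about the kind of check described, not the poster's actual method: it parses that text and compares two snapshots. The sample snapshots and the function names are made up for illustration.

```python
def udp_counters(snmp_text: str) -> dict:
    """Parse the 'Udp:' header and value rows of /proc/net/snmp text."""
    rows = [line.split()[1:] for line in snmp_text.splitlines()
            if line.startswith("Udp:")]
    if len(rows) != 2:
        raise ValueError("expected one Udp header row and one value row")
    names, values = rows
    return dict(zip(names, (int(v) for v in values)))


def new_rcvbuf_drops(before: str, after: str) -> int:
    """Datagrams dropped for lack of receive-buffer space between snapshots."""
    return (udp_counters(after)["RcvbufErrors"]
            - udp_counters(before)["RcvbufErrors"])


# Made-up snapshots in the /proc/net/snmp layout, for illustration only.
BEFORE = """Udp: InDatagrams NoPorts InErrors OutDatagrams RcvbufErrors SndbufErrors
Udp: 1000 2 5 900 5 0
"""
AFTER = """Udp: InDatagrams NoPorts InErrors OutDatagrams RcvbufErrors SndbufErrors
Udp: 1800 2 45 1500 45 0
"""

print(new_rcvbuf_drops(BEFORE, AFTER))  # 40
```

On a live Linux machine the two snapshots would come from reading `/proc/net/snmp` a few seconds apart; a counter that keeps climbing suggests the receiver is not draining its socket buffer fast enough.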
ID: 1933504
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 13854
Credit: 208,696,464
RAC: 304
Australia
Message 1935013 - Posted: 11 May 2018, 5:43:50 UTC

Another update,
Outage Type: UNSCHEDULED OUTAGE (AMENDED)
Date Submitted: Thursday, May 3, 2018
Outage Start/End Time: 0815 – ongoing
Groups Impacted: Datacenter Firewall Administrators
Equipment: Panorama Firewall Management System

Description: UPDATE: Monday, May 7, 2018 11:51am – The vendor is still investigating the equipment failure. Release of the configuration freeze is scheduled for Monday May 14th at 12:00 noon to help ensure no disruption during finals.

So come midday of the 14th, things may get ugly again if they haven't sorted out what went wrong last time.


Copyright 2006 The Regents of the University of California.

It's been a while since they gave that page a good going over.
Grant
Darwin NT
ID: 1935013



 
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.