Dropped packets
Author | Message |
---|---|
Eric Korpela Joined: 3 Apr 99 Posts: 1382 Credit: 54,506,847 RAC: 60 |
The UC data center switched over to a new firewall this morning. Since then, packets into and out of the data center have been suffering drops. The data center staff is debugging the problem; we'll probably be dropping packets until it's resolved. @SETIEric@qoto.org (Mastodon) |
kittyman Joined: 9 Jul 00 Posts: 51478 Credit: 1,018,363,574 RAC: 1,004 |
Thank you for the update, Eric. At least we know they are aware of the problem and I am sure it will be resolved in due course. "Time is simply the mechanism that keeps everything from happening all at once." |
betreger Joined: 29 Jun 99 Posts: 11415 Credit: 29,581,041 RAC: 66 |
Sten, did you drop a packet? |
Gary Charpentier Joined: 25 Dec 00 Posts: 31006 Credit: 53,134,872 RAC: 32 |
FYI, there may be more wonkiness this evening starting at 10:00PM Berkeley time. http://ucbsystems.org/category/active/ <--- save the link for future reference in case the system goes down some other time. "Description: UPDATE: Monday, April 30, 2018 3:38pm – The firewalls have been stable since 11:40am this morning. Users may need to check/restart services that could have hung during the outage." |
Keith Myers Joined: 29 Apr 01 Posts: 13164 Credit: 1,160,866,277 RAC: 1,873 |
Thanks for that update Gary. Need to make sure the systems are ready to put to bed before 10PM. Seti@Home classic workunits:20,676 CPU time:74,226 hours A proud member of the OFA (Old Farts Association) |
Grant (SSSF) Joined: 19 Aug 99 Posts: 13854 Credit: 208,696,464 RAC: 304 |
"Description: UPDATE: Monday, April 30, 2018 3:38pm – The firewalls have been stable since 11:40am this morning. Users may need to check/restart services that could have hung during the outage." If only that were so. Another IST update: "UPDATE: Monday, April 30, 2018 8:51pm – This evening at 7:06PM the data center firewalls reloaded on their own. The vendor is currently working to restore service." Grant Darwin NT |
Grant (SSSF) Joined: 19 Aug 99 Posts: 13854 Credit: 208,696,464 RAC: 304 |
The web site & forums are back & responsive. The Scheduler is responding quickly & download speeds are OK. But uploads- taking lots & lots of retries to get them to go through, at 2-4kB/s when they eventually do upload. Edit- now it's down to 1-2kB/s. Grant Darwin NT |
Wiggo Joined: 24 Jan 00 Posts: 36774 Credit: 261,360,520 RAC: 489 |
If I can get some uploads done I'll let yous know how the downloads go. Cheers. |
Kissagogo27 Joined: 6 Nov 99 Posts: 716 Credit: 8,032,827 RAC: 62 |
the beginning of the new fail .. (UTC+2) 01-May-2018 04:07:08 [SETI@home] Sending scheduler request: To fetch work. 01-May-2018 04:07:08 [SETI@home] Requesting new tasks for CPU and AMD/ATI GPU 01-May-2018 04:07:30 [SETI@home] Scheduler request failed: Couldn't connect to server 01-May-2018 04:07:34 [---] Project communication failed: attempting access to reference site 01-May-2018 04:07:36 [---] Internet access OK - project servers may be temporarily down. |
Richard Haselgrove Joined: 4 Jul 99 Posts: 14679 Credit: 200,643,578 RAC: 874 |
"But uploads- taking lots & lots of retries to get them to go through, at 2-4kB/s when they eventually do upload." The actual upload speed is OK. The speed reported in BOINC Manager is averaged over all the stops, starts, and retries. Some uploads get stuck at the 16K point and have to go through the full restart from the beginning again. |
Wiggo Joined: 24 Jan 00 Posts: 36774 Credit: 261,360,520 RAC: 489 |
After laboriously hitting the retry button I finally got all my uploads done and then the downloads just flowed down. Cheers. |
Eric Korpela Joined: 3 Apr 99 Posts: 1382 Credit: 54,506,847 RAC: 60 |
Posted by CSS IT ~ mp On 4/30/2018 at 7:14 am PST Modified on 5/1/2018 at 5:06 am PST Modified by CSS IT ~ mp Posted in Unscheduled Outage Outage Type: UNSCHEDULED OUTAGE (AMENDED) Date Submitted: Monday, April 30, 2018 Outage Start/End Time: 0700 – TBD Groups Impacted: Campus Equipment: Campus Network Description: UPDATE: Tuesday, May 1, 2018 1:59am – The firewall has been stable since 12:42am, services appear to be restored. The vendor will continue monitoring. Monday, April 30, 2018 8:51pm – This evening at 7:06pm the data center firewalls reloaded on their own. The vendor is currently working to restore service. Monday, April 30, 2018 3:38pm – The firewalls have been stable since 11:40am this morning. Users may need to check/restart services that could have hung during the outage. IST staff are still working on the root cause of this outage. This evening after business hours at 10:00pm, the network team will troubleshoot further to fully restore network services. Monday, April 30, 2018 2:20pm – This continues to be a sporadic ongoing issue and the network team is working to resolve the problem. There is no ETA at this time. Monday, April 30, 2018 10:17am – IST staff is aware of instability in the Palo Alto firewall in the Earl Warren Data Center and is troubleshooting to determine the cause and work toward a resolution. Monday, April 30, 2018 9:17am – The Service Desk continues to receive reports of network issues affecting many services including CAS, VPN, and connectivity to other applications hosted on campus. The network team is working to correct the issue as quickly as possible. All workloads hosted in our environment are up and running and should respond normally as soon as the issue is resolved. The Service Desk is receiving calls about intermittent network issues. IST staff is working quickly to identify the source of the problem and to restore services as quickly as possible. INC:INC0660327 @SETIEric@qoto.org (Mastodon) |
Grant (SSSF) Joined: 19 Aug 99 Posts: 13854 Credit: 208,696,464 RAC: 304 |
"But uploads- taking lots & lots of retries to get them to go through, at 2-4kB/s when they eventually do upload." "The actual upload speed is OK. The speed reported in BOINC Manager is averaged over all the stops, starts, and retries. Some uploads get stuck at the 16K point and have to go through the full restart from the beginning again." Ok, so the longer it takes to time out, and the more retries it requires to go through, the lower the reported speed. Grant Darwin NT |
Grant (SSSF) Joined: 19 Aug 99 Posts: 13854 Credit: 208,696,464 RAC: 304 |
Final IST update. Outage Type: UNSCHEDULED OUTAGE (AMENDED) (RESOLVED) Grant Darwin NT |
JamesJenkins Joined: 14 Jul 14 Posts: 1 Credit: 11,920,594 RAC: 0 |
I once solved issues with network messaging software called JGroups by watching the UDP receive buffers using a Linux command. We had a UDP receive buffer that was overflowing and thus dropping UDP packets, so we increased concurrency (threads) so that the receive buffer queue would be processed more quickly and not overflow. That fixed it. Computers can have concurrency limits or buffer limits. |
Grant (SSSF) Joined: 19 Aug 99 Posts: 13854 Credit: 208,696,464 RAC: 304 |
Another update: Outage Type: UNSCHEDULED OUTAGE (AMENDED). So come midday of the 14th, things may get ugly again if they haven't sorted out what went wrong last time. "Copyright 2006 The Regents of the University of California." It's been a while since they gave that page a good going over. Grant Darwin NT |
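JamesJenkins's post in the thread mentions watching UDP receive buffers on Linux but does not name the command he used. As a rough sketch of one way to spot that kind of overflow, the Python below reads the `RcvbufErrors` counter that Linux exposes on the `Udp:` lines of `/proc/net/snmp`. Using that file is an assumption here, not something taken from the post. The function names and the sample numbers are made up for illustration.

```python
from pathlib import Path
from typing import Dict

# Illustrative sample in the layout of the "Udp:" lines of /proc/net/snmp.
# The numbers are invented for this example.
SAMPLE_SNMP = """\
Udp: InDatagrams NoPorts InErrors OutDatagrams RcvbufErrors SndbufErrors InCsumErrors IgnoredMulti
Udp: 1000 2 15 900 15 0 0 0
"""


def parse_udp_counters(snmp_text: str) -> Dict[str, int]:
    """Return the Udp counters from /proc/net/snmp-style text as a dict."""
    # "UdpLite:" lines do not start with "Udp:", so they are skipped.
    udp_lines = [line.split() for line in snmp_text.splitlines()
                 if line.startswith("Udp:")]
    if len(udp_lines) < 2:
        raise ValueError("no Udp header/value line pair found")
    names, values = udp_lines[0][1:], udp_lines[1][1:]
    return {name: int(value) for name, value in zip(names, values)}


def rcvbuf_drops_between(before: Dict[str, int], after: Dict[str, int]) -> int:
    """Datagrams dropped because a receive buffer was full, between two samples."""
    return after.get("RcvbufErrors", 0) - before.get("RcvbufErrors", 0)


if __name__ == "__main__":
    proc_file = Path("/proc/net/snmp")
    # Fall back to the sample text on systems without /proc/net/snmp.
    text = proc_file.read_text() if proc_file.exists() else SAMPLE_SNMP
    counters = parse_udp_counters(text)
    print("RcvbufErrors so far:", counters.get("RcvbufErrors", 0))
```

Sampling the counter twice, a few seconds apart, and passing both results to `rcvbuf_drops_between` shows whether drops are still accumulating. A rising value suggests the receiver is not draining its buffer fast enough, which is the situation the post describes fixing with more threads.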
©2024 University of California
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.