Panic Mode On (116) Server Problems?

Profile Bernie Vine
Volunteer moderator
Volunteer tester
Joined: 26 May 99
Posts: 9954
Credit: 103,452,613
RAC: 328
United Kingdom
Message 1999807 - Posted: 26 Jun 2019, 20:36:26 UTC
Last modified: 26 Jun 2019, 20:38:05 UTC

I'd love to know the trick to clear the stalled uploads. Nothing I have tried has allowed them to upload. Several thousand uploads pending on each host.


Not sure about "thousands", but I just use the "transfers" tab in Boinc Tasks and a lot of mouse clicks!!

Saying that, all my Linux machines have cleared their backlog; however, both Windows machines, with fewer uploads, take forever.
ID: 1999807 · Report as offensive
Richard Haselgrove Project Donor
Volunteer tester

Joined: 4 Jul 99
Posts: 14650
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1999810 - Posted: 26 Jun 2019, 20:40:25 UTC - in response to Message 1999807.  

Saying that, all my Linux machines have cleared their backlog; however, both Windows machines, with fewer uploads, take forever.
Since the servers also run Linux, is that an indication that - deliberately or not - 'Linux shall speak unto Linux', and foreign dialects are harder to understand? :P
ID: 1999810 · Report as offensive
Stephen "Heretic" Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 20 Sep 12
Posts: 5557
Credit: 192,787,363
RAC: 628
Australia
Message 1999811 - Posted: 26 Jun 2019, 20:42:27 UTC - in response to Message 1999734.  

This is TORTURE... SETI style...

+1

Stephen
ID: 1999811 · Report as offensive
Stephen "Heretic" Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 20 Sep 12
Posts: 5557
Credit: 192,787,363
RAC: 628
Australia
Message 1999813 - Posted: 26 Jun 2019, 20:47:34 UTC - in response to Message 1999754.  

Fun and games on the left coast I see. Upload server having a very difficult time handling the huge volume of uploads from the noise bombs. Out of cpu work again, gpu cache continues to drop because stuck uploads prevent downloading of any new work.


. . Completely out of work here :(

Stephen

<sigh>
ID: 1999813 · Report as offensive
Profile Bernie Vine
Volunteer moderator
Volunteer tester
Joined: 26 May 99
Posts: 9954
Credit: 103,452,613
RAC: 328
United Kingdom
Message 1999814 - Posted: 26 Jun 2019, 20:51:22 UTC - in response to Message 1999813.  
Last modified: 26 Jun 2019, 20:52:17 UTC

Fun and games on the left coast I see. Upload server having a very difficult time handling the huge volume of uploads from the noise bombs. Out of cpu work again, gpu cache continues to drop because stuck uploads prevent downloading of any new work.


. . Completely out of work here :(

Stephen

<sigh>

Hmm. Almost full caches across 6 machines here
ID: 1999814 · Report as offensive
Cosmic_Ocean
Joined: 23 Dec 00
Posts: 3027
Credit: 13,516,867
RAC: 13
United States
Message 1999816 - Posted: 26 Jun 2019, 21:01:13 UTC - in response to Message 1999810.  
Last modified: 26 Jun 2019, 21:03:55 UTC

Saying that, all my Linux machines have cleared their backlog; however, both Windows machines, with fewer uploads, take forever.
Since the servers also run Linux, is that an indication that - deliberately or not - 'Linux shall speak unto Linux', and foreign dialects are harder to understand? :P

Could it be something to do with RFC 1323, the TCP extensions for timestamps and window scaling?

Linux has all of that enabled by default. I remember that before the servers moved to the co-lo, Windows machines had tons of problems with the saturated link up the hill, but once timestamps and window scaling were enabled in the registry, comms got much better.

https://docs.microsoft.com/en-us/previous-versions/windows/it-pro/windows-2000-server/cc938205(v=technet.10)

I also found an old thread from this forum when searching for 1323:
https://setiathome.berkeley.edu/forum_thread.php?id=71002

I had *some* issues with uploads earlier today, but they did clear up and go through. That was on a Linux machine, though, so instead of the connection failing after a few seconds, it would stay open for several minutes, with the data transfer going to 100% and the server taking 60+ seconds to send an ACK.

Timestamps and window scaling certainly help, but they're not a complete and total fix.
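
For anyone who wants to check these two options on their own hosts, here is a minimal sketch. It assumes the Windows registry setting in question is the `Tcp1323Opts` value (0 = both off, 1 = window scaling only, 2 = timestamps only, 3 = both), and that the Linux equivalents are the `tcp_window_scaling` and `tcp_timestamps` entries under `/proc/sys/net/ipv4`. The function names are hypothetical, chosen only for illustration.

```python
from pathlib import Path


def decode_tcp1323opts(value: int) -> dict:
    """Decode a Windows Tcp1323Opts value (0-3) into its two flags.

    Assumption: bit 0 enables window scaling, bit 1 enables timestamps.
    """
    if value not in (0, 1, 2, 3):
        raise ValueError(f"Tcp1323Opts must be 0-3, got {value!r}")
    return {
        "window_scaling": bool(value & 1),
        "timestamps": bool(value & 2),
    }


def read_linux_settings(root: str = "/proc/sys/net/ipv4") -> dict:
    """Read the Linux equivalents; an entry is None if it cannot be read."""
    entries = {
        "window_scaling": "tcp_window_scaling",
        "timestamps": "tcp_timestamps",
    }
    result = {}
    for name, filename in entries.items():
        try:
            # Any non-zero value means the option is enabled.
            result[name] = int(Path(root, filename).read_text().strip()) != 0
        except (OSError, ValueError):
            result[name] = None
    return result


if __name__ == "__main__":
    print("Tcp1323Opts=3 means:", decode_tcp1323opts(3))
    print("This Linux host:", read_linux_settings())
```

On a Windows host the value itself would still have to be read from the registry by hand or with a separate tool; the sketch only interprets it.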
Linux laptop:
record uptime: 1511d 20h 19m (ended due to the power brick giving-up)
ID: 1999816 · Report as offensive
Stephen "Heretic" Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 20 Sep 12
Posts: 5557
Credit: 192,787,363
RAC: 628
Australia
Message 1999822 - Posted: 26 Jun 2019, 21:35:08 UTC - in response to Message 1999814.  

Fun and games on the left coast I see. Upload server having a very difficult time handling the huge volume of uploads from the noise bombs. Out of cpu work again, gpu cache continues to drop because stuck uploads prevent downloading of any new work.


. . Completely out of work here :(

Stephen

<sigh>

Hmm. Almost full caches across 6 machines here


. . My Linux boxes are getting some work now but still having problems with uploads on and off. My Windows boxes are completely stalled and OOW.

Stephen

:(
ID: 1999822 · Report as offensive
Richard Haselgrove Project Donor
Volunteer tester

Joined: 4 Jul 99
Posts: 14650
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1999827 - Posted: 26 Jun 2019, 21:47:56 UTC - in response to Message 1999816.  

Yes, I remember RFC 1323 too - somebody (not me) made a good find there, and I was happy to confirm it - it was exactly what we needed at the time.

But that need was different: it was about downloads, when packets were dropped on an established, but congested, link.

But this problem is about establishing the link in the first place - getting a connection. And there are very few packets to drop in an upload. There may be (almost certainly is) an equivalent RFC defining the connection handshake, but I don't know how to find it offhand.
ID: 1999827 · Report as offensive
Profile Wiggo
Joined: 24 Jan 00
Posts: 34744
Credit: 261,360,520
RAC: 489
Australia
Message 1999828 - Posted: 26 Jun 2019, 21:47:59 UTC

When a beer doesn't work, have a sleep instead. ;-)

Cheers.
ID: 1999828 · Report as offensive
Profile Freewill Project Donor
Joined: 19 May 99
Posts: 766
Credit: 354,398,348
RAC: 11,693
United States
Message 1999829 - Posted: 26 Jun 2019, 21:52:58 UTC

Getting downloads, but the upload rate isn't fast enough. The problem for me seems to be so many of these noise bombs that I'm working through. They finish in 3 seconds and take a lot longer than that to upload.
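
The arithmetic behind that complaint can be sketched roughly as follows. This is only an illustration: the function name and all parameters are hypothetical, and the figures in the usage line (3-second tasks, 30-second uploads, 8 simultaneous transfers) are example values, not measurements from any particular host.

```python
def upload_backlog_growth(task_seconds: float,
                          upload_seconds: float,
                          upload_slots: int = 1,
                          producers: int = 1) -> float:
    """Return the net change in pending uploads per hour.

    producers    -- number of tasks being crunched at the same time
    upload_slots -- number of uploads the client runs at the same time
    A positive result means the upload queue is growing.
    """
    produced_per_second = producers / task_seconds
    uploaded_per_second = upload_slots / upload_seconds
    return (produced_per_second - uploaded_per_second) * 3600


if __name__ == "__main__":
    # One GPU finishing a noise bomb every 3 s, 8 uploads at a time, 30 s each.
    growth = upload_backlog_growth(task_seconds=3, upload_seconds=30, upload_slots=8)
    print(f"Backlog changes by about {growth:.0f} uploads per hour")
```

With those example numbers the host produces 1200 results an hour but can only send 960, so the queue grows by about 240 an hour until the work runs out.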
ID: 1999829 · Report as offensive
Profile Keith Myers Special Project $250 donor
Volunteer tester
Joined: 29 Apr 01
Posts: 13164
Credit: 1,160,866,277
RAC: 1,873
United States
Message 1999830 - Posted: 26 Jun 2019, 21:54:06 UTC - in response to Message 1999807.  
Last modified: 26 Jun 2019, 22:11:04 UTC

I'd love to know the trick to clear the stalled uploads. Nothing I have tried has allowed them to upload. Several thousand uploads pending on each host.


Not sure about "thousands", but I just use the "transfers" tab in Boinc Tasks and a lot of mouse clicks!!

Saying that, all my Linux machines have cleared their backlog; however, both Windows machines, with fewer uploads, take forever.

Clicking Update does nothing since each machine has a constant 8 tasks Active with all the rest pending. When a task does eventually upload, it sits at 100% and then waits from a minute to a half hour before getting an ACK.

I run spoofed clients, so "thousands" is accurate and not some exaggeration. The Threadripper host has 3300 tasks in Upload currently.

[Edit] Finally out of work on the Threadripper host. Off to do work for Einstein and MilkyWay and wait for all the slow Seti uploads to clear so I can ask for work again.
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 1999830 · Report as offensive
Profile zoom3+1=4
Volunteer tester
Joined: 30 Nov 03
Posts: 65738
Credit: 55,293,173
RAC: 49
United States
Message 1999831 - Posted: 26 Jun 2019, 21:54:40 UTC

I have no problems downloading; it's uploads that are stuck.
The T1 Trust, PRR T1 Class 4-4-4-4 #5550, 1 of America's First HST's
ID: 1999831 · Report as offensive
Stephen "Heretic" Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 20 Sep 12
Posts: 5557
Credit: 192,787,363
RAC: 628
Australia
Message 1999833 - Posted: 26 Jun 2019, 22:02:47 UTC - in response to Message 1999830.  

I'd love to know the trick to clear the stalled uploads. Nothing I have tried has allowed them to upload. Several thousand uploads pending on each host.

Not sure about "thousands", but I just use the "transfers" tab in Boinc Tasks and a lot of mouse clicks!!
Saying that, all my Linux machines have cleared their backlog; however, both Windows machines, with fewer uploads, take forever.

Clicking Update does nothing since each machine has a constant 8 tasks Active with all the rest pending. When a task does eventually upload, it sits at 100% and then waits from a minute to a half hour before getting an ACK.
I run spoofed clients, so "thousands" is accurate and not some exaggeration. The Threadripper host has 3300 tasks in Upload currently.


. . OK, so you're the culprit ... :)

Stephen

:)
ID: 1999833 · Report as offensive
Loren Datlof

Joined: 24 Jan 14
Posts: 73
Credit: 19,652,385
RAC: 0
United States
Message 1999855 - Posted: 26 Jun 2019, 23:49:52 UTC - in response to Message 1999835.  

All clear here on my 6 Linux computers with GPUs. One has an upload that hasn't cleared, and its backoff is now 5 hours. That is strange, because when a task finishes, that same computer uploads it in about 20 seconds. This one upload won't go through even when I hit Retry. At least I am back to crunching Seti.
ID: 1999855 · Report as offensive
Profile Keith Myers Special Project $250 donor
Volunteer tester
Joined: 29 Apr 01
Posts: 13164
Credit: 1,160,866,277
RAC: 1,873
United States
Message 1999860 - Posted: 27 Jun 2019, 0:25:19 UTC - in response to Message 1999855.  

Try suspending network communication in the Manager, wait a minute and restart network communication, then hit Retry. That sometimes unsticks a recalcitrant upload.
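
On a host without the Manager GUI, the same sequence can be approximated from the command line. The sketch below assumes the `boinccmd` tool that ships with BOINC is on the PATH and can reach the local client, that "restart network communication" corresponds to `--set_network_mode auto`, and that `--network_available` (retry deferred network communication) is the closest equivalent of hitting Retry. The function names are hypothetical.

```python
import subprocess
import time


def unstick_commands(boinccmd: str = "boinccmd") -> list:
    """Build the three boinccmd invocations: suspend, resume, retry."""
    return [
        [boinccmd, "--set_network_mode", "never"],  # suspend network activity
        [boinccmd, "--set_network_mode", "auto"],   # resume according to preferences
        [boinccmd, "--network_available"],          # retry deferred transfers
    ]


def unstick_uploads(pause_seconds: int = 60,
                    boinccmd: str = "boinccmd",
                    dry_run: bool = True) -> list:
    """Run (or, with dry_run=True, just print) the suspend/wait/resume/retry steps."""
    commands = unstick_commands(boinccmd)
    for index, command in enumerate(commands):
        if dry_run:
            print(" ".join(command))
        else:
            subprocess.run(command, check=True)
            if index == 0:
                # Wait a minute (by default) after suspending, as suggested above.
                time.sleep(pause_seconds)
    return commands


if __name__ == "__main__":
    unstick_uploads(dry_run=True)
```

The default is a dry run that only prints the commands; pass `dry_run=False` to execute them. Depending on the installation, `boinccmd` may need to be run from the BOINC data directory or given `--passwd`.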
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 1999860 · Report as offensive
Profile Brent Norman Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester

Joined: 1 Dec 99
Posts: 2786
Credit: 685,657,289
RAC: 835
Canada
Message 1999864 - Posted: 27 Jun 2019, 0:52:35 UTC

It looks like uploads have started to resume.
ID: 1999864 · Report as offensive
Profile Keith Myers Special Project $250 donor
Volunteer tester
Joined: 29 Apr 01
Posts: 13164
Credit: 1,160,866,277
RAC: 1,873
United States
Message 1999867 - Posted: 27 Jun 2019, 1:19:24 UTC - in response to Message 1999864.  

Yes, I have finally managed to clear my uploads. Now starting to get downloads but some of them are stalling out. So still a struggle to rebuild caches.
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 1999867 · Report as offensive
Profile Wiggo
Joined: 24 Jan 00
Posts: 34744
Credit: 261,360,520
RAC: 489
Australia
Message 1999868 - Posted: 27 Jun 2019, 1:28:00 UTC - in response to Message 1999691.  

Well I've suspended GPU processing on my 2 rigs until I can get a bit of a buffer downloaded, but it's not fun downloading all this work when most of it is just garbage. :-(
Well, all that crap that I downloaded has exceeded my peak internet allowance, so I'm on a slow boat during the day until the end of the month. :-(

Cheers.
ID: 1999868 · Report as offensive
Profile Brent Norman Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester

Joined: 1 Dec 99
Posts: 2786
Credit: 685,657,289
RAC: 835
Canada
Message 1999869 - Posted: 27 Jun 2019, 1:32:40 UTC - in response to Message 1999868.  

That really sucks ...
ID: 1999869 · Report as offensive
Profile Keith Myers Special Project $250 donor
Volunteer tester
Joined: 29 Apr 01
Posts: 13164
Credit: 1,160,866,277
RAC: 1,873
United States
Message 1999870 - Posted: 27 Jun 2019, 1:34:45 UTC

And now the splitters have fallen on their face with the downloads finally working. RTS basically nil.
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 1999870 · Report as offensive
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.