Panic Mode On (80) Server Problems?

Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 13720
Credit: 208,696,464
RAC: 304
Australia
Message 1330648 - Posted: 24 Jan 2013, 6:50:52 UTC - in response to Message 1330458.  


Glad it recovered. There are a few dips in the inbound traffic, but at least my systems have as much work as they are allowed to have.
Of concern is the AP validation: nothing appears to be happening, and the number waiting for validation is growing quickly.
Grant
Darwin NT
ID: 1330648
Cosmic_Ocean

Joined: 23 Dec 00
Posts: 3027
Credit: 13,516,867
RAC: 13
United States
Message 1330670 - Posted: 24 Jan 2013, 8:41:02 UTC - in response to Message 1330648.  

Of concern is the AP validation: nothing appears to be happening, and the number waiting for validation is growing quickly.

The numbers may be growing, but I just reported six tasks and all but one were validated instantly. There may be more to the story, such as specific conditions or tasks from a certain date range that are slow to validate.
Linux laptop:
record uptime: 1511d 20h 19m (ended due to the power brick giving-up)
ID: 1330670
KWSN Ekky Ekky Ekky

Joined: 25 May 99
Posts: 944
Credit: 52,956,491
RAC: 67
United Kingdom
Message 1330681 - Posted: 24 Jan 2013, 10:26:05 UTC
Last modified: 24 Jan 2013, 10:28:53 UTC

Possible reporting blip starting?
Cricket "bits out" definitely heading southwards.

ID: 1330681
ivan
Volunteer tester

Joined: 5 Mar 01
Posts: 783
Credit: 348,560,338
RAC: 223
United Kingdom
Message 1330685 - Posted: 24 Jan 2013, 11:29:30 UTC - in response to Message 1330681.  

Possible reporting blip starting?
Cricket "bits out" definitely heading southwards.

Coming back up a bit now, but I've been having trouble reporting on all my machines since sometime before 0900Z -- timeouts, etc.
ID: 1330685
Cliff Harding
Volunteer tester

Joined: 18 Aug 99
Posts: 1432
Credit: 110,967,840
RAC: 67
United States
Message 1330698 - Posted: 24 Jan 2013, 13:04:07 UTC

Why is it that as soon as APs start generating, everything goes south? Normal u/d loads until then. Has anyone given any thought to compressing these WUs and then having the apps uncompress them once received at the user's machine? Doesn't Einstein use something like this to send their large packets?


I don't buy computers, I build them!!
ID: 1330698
Richard Haselgrove Project Donor
Volunteer tester

Joined: 4 Jul 99
Posts: 14649
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1330707 - Posted: 24 Jan 2013, 14:00:33 UTC - in response to Message 1330698.  

Why is it that as soon as APs start generating, everything goes south? Normal u/d loads until then. Has anyone given any thought to compressing these WUs and then having the apps uncompress them once received at the user's machine? Doesn't Einstein use something like this to send their large packets?

Yes, people have thought about it - and the answer is that even with LZMA compression (7-zip), you only save about 3% of the 8MB file size. Try it for yourself - the WUs are practically incompressible, or to put it another way, they're already compressed almost as far as it's possible to compress them.

Mind you - very rarely you do get a WU that will compress down to half its original size. They (almost?) always come from the B3_P1 channel, and indicate that once again a bit has got stuck at the receiver or in the Arecibo antenna cabling. The data is useless, and the task will overflow in seconds - it's a useful diagnostic test.
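
For anyone who wants to try the compression test, here is a minimal Python sketch. It uses the standard-library lzma module, which implements the same LZMA algorithm family as 7-zip but not the 7z container, so exact savings will differ a little. The helper name lzma_saving and the two synthetic buffers are illustrative assumptions, not real work units: random bytes stand in for a normal noise-like WU, and a constant buffer stands in for stuck-bit data.

```python
import lzma
import os


def lzma_saving(data: bytes) -> float:
    """Return the fraction of the original size saved by LZMA compression.

    A value near 0 (or slightly negative, because of container overhead)
    means the data is practically incompressible.
    """
    if not data:
        return 0.0
    compressed = lzma.compress(data)
    return 1.0 - len(compressed) / len(data)


# Synthetic stand-ins (assumptions for illustration only).
noise_like = os.urandom(1_000_000)  # behaves like already-dense data
stuck_bit = bytes(1_000_000)        # constant data, as if a bit were stuck

print(f"noise-like data: {lzma_saving(noise_like):.1%} saved")
print(f"stuck-bit data:  {lzma_saving(stuck_bit):.1%} saved")
```

To check a real WU, read the file in binary mode and pass its bytes to lzma_saving instead of the synthetic buffers.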
ID: 1330707
Cliff Harding
Volunteer tester

Joined: 18 Aug 99
Posts: 1432
Credit: 110,967,840
RAC: 67
United States
Message 1330708 - Posted: 24 Jan 2013, 14:14:26 UTC - in response to Message 1330707.  

Why is it that as soon as APs start generating, everything goes south? Normal u/d loads until then. Has anyone given any thought to compressing these WUs and then having the apps uncompress them once received at the user's machine? Doesn't Einstein use something like this to send their large packets?

Yes, people have thought about it - and the answer is that even with LZMA compression (7-zip), you only save about 3% of the 8MB file size. Try it for yourself - the WUs are practically incompressible, or to put it another way, they're already compressed almost as far as it's possible to compress them.

Mind you - very rarely you do get a WU that will compress down to half its original size. They (almost?) always come from the B3_P1 channel, and indicate that once again a bit has got stuck at the receiver or in the Arecibo antenna cabling. The data is useless, and the task will overflow in seconds - it's a useful diagnostic test.


Thanks Richard, you've answered a question that's been bugging me for quite some time.


I don't buy computers, I build them!!
ID: 1330708
[B^S] madmac
Volunteer tester

Joined: 9 Feb 04
Posts: 1175
Credit: 4,754,897
RAC: 0
United Kingdom
Message 1330755 - Posted: 24 Jan 2013, 17:18:10 UTC

Since 15:00 UTC one of my machines has been getting scheduler request failures. Is there something wrong, or is my timing for requesting new work wrong?
ID: 1330755
Mark Lybeck

Joined: 9 Aug 99
Posts: 245
Credit: 216,677,290
RAC: 173
Finland
Message 1330756 - Posted: 24 Jan 2013, 17:19:10 UTC

Scheduler connections are not getting through again...
ID: 1330756
BarryAZ

Joined: 1 Apr 01
Posts: 2580
Credit: 16,982,517
RAC: 0
United States
Message 1330757 - Posted: 24 Jan 2013, 17:19:38 UTC - in response to Message 1330708.  

So then the simple answer is that processing APs is a test of bandwidth capabilities for SETI, and that seemingly each time that happens the bandwidth is found inadequate.

I suppose, absent increasing bandwidth, there really is only a single solution to this.

Curious that this conclusion is as yet undetected.

And no, I don't run APs -- don't want to be part of the problem there.
ID: 1330757
rob smith Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer moderator
Volunteer tester

Joined: 7 Mar 03
Posts: 22158
Credit: 416,307,556
RAC: 380
United Kingdom
Message 1330763 - Posted: 24 Jan 2013, 17:38:26 UTC

Or could it be that the servers are tuned for working with 366 KB files and not 8 MB files, never mind a mixture of the two?
Bob Smith
Member of Seti PIPPS (Pluto is a Planet Protest Society)
Somewhere in the (un)known Universe?
ID: 1330763
James Sotherden

Joined: 16 May 99
Posts: 10436
Credit: 110,373,059
RAC: 54
United States
Message 1330765 - Posted: 24 Jan 2013, 17:44:20 UTC - in response to Message 1330757.  

So then the simple answer is that processing APs is a test of bandwidth capabilities for SETI, and that seemingly each time that happens the bandwidth is found inadequate.

I suppose, absent increasing bandwidth, there really is only a single solution to this.

Curious that this conclusion is as yet undetected.

And no, I don't run APs -- don't want to be part of the problem there.


Wait and see what happens when the new work units hit the airwaves. You think APs have slowed the system down? The only thing I can hope for is that there is only one type of work unit, and that it is the new one. If they have 3, we can forget about ever connecting again. (Said tongue in cheek, of course.)

Old James
ID: 1330765
KWSN Ekky Ekky Ekky

Joined: 25 May 99
Posts: 944
Credit: 52,956,491
RAC: 67
United Kingdom
Message 1330766 - Posted: 24 Jan 2013, 17:46:37 UTC

Still everything maxed out but nothing getting there from here :-(

ID: 1330766
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 13720
Credit: 208,696,464
RAC: 304
Australia
Message 1330796 - Posted: 24 Jan 2013, 18:18:04 UTC
Last modified: 24 Jan 2013, 18:24:51 UTC

I've had the odd "Failure when receiving data from the peer" but it's mostly "Couldn't connect to server" for any attempt to contact the Scheduler. Looks like it turned to crap again about 8 hours ago; inbound network traffic dropped off around then.

EDIT- and if I do get a response, it's taking at least 2-3 minutes.
Grant
Darwin NT
ID: 1330796
Link

Joined: 18 Sep 03
Posts: 834
Credit: 1,807,369
RAC: 0
Germany
Message 1330805 - Posted: 24 Jan 2013, 18:33:13 UTC

Just noticed one thing in the log of one of my computers:

24/01/2013 19:21:16 SETI@home Sending scheduler request: To fetch work.
24/01/2013 19:21:16 SETI@home Reporting 4 completed tasks, requesting new tasks
24/01/2013 19:23:17 Project communication failed: attempting access to reference site
24/01/2013 19:23:21 Internet access OK - project servers may be temporarily down.
24/01/2013 19:23:21 SETI@home Scheduler request failed: Timeout was reached

Isn't the timeout supposed to be 5 minutes, not just 2?
ID: 1330805
Swordfish

Joined: 5 Aug 06
Posts: 72
Credit: 3,014,493
RAC: 0
United Kingdom
Message 1330818 - Posted: 24 Jan 2013, 18:43:39 UTC

I can't get anything to report here either.
ID: 1330818
Swordfish

Joined: 5 Aug 06
Posts: 72
Credit: 3,014,493
RAC: 0
United Kingdom
Message 1330824 - Posted: 24 Jan 2013, 18:47:14 UTC

Ironically, no sooner had I posted my previous message than all my tasks reported.
ID: 1330824
BarryAZ

Joined: 1 Apr 01
Posts: 2580
Credit: 16,982,517
RAC: 0
United States
Message 1330836 - Posted: 24 Jan 2013, 19:04:09 UTC - in response to Message 1330824.  

Swordfish, Not to worry:

1/24/2013 11:59:06 AM SETI@home Reporting 3 completed tasks, not requesting new tasks
1/24/2013 12:00:40 PM Project communication failed: attempting access to reference site
1/24/2013 12:00:40 PM SETI@home Scheduler request failed: Server returned nothing (no headers, no data)
1/24/2013 12:00:42 PM Internet access OK - project servers may be temporarily down.

ID: 1330836
Richard Haselgrove Project Donor
Volunteer tester

Joined: 4 Jul 99
Posts: 14649
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1330858 - Posted: 24 Jan 2013, 19:23:08 UTC - in response to Message 1330805.  

Just noticed one thing in the log of one of my computers:

24/01/2013 19:21:16 SETI@home Sending scheduler request: To fetch work.
24/01/2013 19:21:16 SETI@home Reporting 4 completed tasks, requesting new tasks
24/01/2013 19:23:17 Project communication failed: attempting access to reference site
24/01/2013 19:23:21 Internet access OK - project servers may be temporarily down.
24/01/2013 19:23:21 SETI@home Scheduler request failed: Timeout was reached

Isn't the timeout supposed to be 5 minutes, not just 2?

The timeout is whatever you have configured locally in place of "seconds" in:

<http_transfer_timeout>seconds</http_transfer_timeout>
Abort HTTP transfers if idle for this many seconds; default 300.
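
As a rough check on the "2 minutes versus 5" question, the following Python sketch measures how long the failed request in the quoted log actually took and compares it with the 300-second default above. The two log lines are copied from the quoted post; the function names are hypothetical helpers, and the comparison assumes (without confirmation) that this idle timeout is the one that fired, which may not be the case if a different limit such as a connection timeout was hit first.

```python
from datetime import datetime

# First and last lines of the failed request, copied from the quoted log.
LOG_LINES = [
    "24/01/2013 19:21:16 SETI@home Sending scheduler request: To fetch work.",
    "24/01/2013 19:23:21 SETI@home Scheduler request failed: Timeout was reached",
]

# Default from the option description above (seconds).
DEFAULT_HTTP_TRANSFER_TIMEOUT = 300


def parse_timestamp(line: str) -> datetime:
    """Parse the leading 'DD/MM/YYYY HH:MM:SS' timestamp of a log line."""
    date_part, time_part = line.split()[:2]
    return datetime.strptime(f"{date_part} {time_part}", "%d/%m/%Y %H:%M:%S")


def elapsed_seconds(start_line: str, end_line: str) -> int:
    """Return whole seconds between two log lines."""
    delta = parse_timestamp(end_line) - parse_timestamp(start_line)
    return int(delta.total_seconds())


elapsed = elapsed_seconds(LOG_LINES[0], LOG_LINES[1])
print(f"Request failed after {elapsed} s; "
      f"default idle timeout is {DEFAULT_HTTP_TRANSFER_TIMEOUT} s")
```

For this log the request gave up after 125 seconds, well short of the default, which is consistent with either a locally lowered value or a different timeout being the one that triggered.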
ID: 1330858
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 13720
Credit: 208,696,464
RAC: 304
Australia
Message 1330873 - Posted: 24 Jan 2013, 19:36:53 UTC - in response to Message 1330858.  


Inbound traffic has dropped even further.
Grant
Darwin NT
ID: 1330873


 
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.