Panic Mode On (80) Server Problems?


WinterKnight
Volunteer tester
Joined: 18 May 99
Posts: 8737
Credit: 25,596,728
RAC: 12,408
United Kingdom
Message 1330439 - Posted: 23 Jan 2013, 18:19:39 UTC

Cannot see anything wrong on cricket or server status but last three report/requests have
23/01/2013 18:14:39 | SETI@home | Scheduler request failed: HTTP service unavailable

First was at 18:04:03

Grant (SSSF)
Joined: 19 Aug 99
Posts: 5917
Credit: 61,697,533
RAC: 27,513
Australia
Message 1330441 - Posted: 23 Jan 2013, 18:23:05 UTC


Looks like we have Scheduler issues again- for the last 10min every Scheduler request has resulted in "Server returned nothing (No headers, no data)". Network traffic shows downloads have dropped off, and there have been several glitches in inbound traffic. Going back through my log shows several periods of Scheduler returning nothing (no headers, no data) over the last 10 hours or so.
____________
Grant
Darwin NT.

KWSN Ekky Ekky Ekky
Joined: 25 May 99
Posts: 926
Credit: 12,373,956
RAC: 7,492
United Kingdom
Message 1330446 - Posted: 23 Jan 2013, 18:38:41 UTC - in response to Message 1330441.

Someone using insecticide on crickets?
____________

BarryAZ
Joined: 1 Apr 01
Posts: 2580
Credit: 12,382,484
RAC: 2,723
United States
Message 1330447 - Posted: 23 Jan 2013, 18:45:09 UTC - in response to Message 1330446.

Well, at least this outage is during the day so someone can take a look onsite in a timely manner...

Grant (SSSF)
Joined: 19 Aug 99
Posts: 5917
Credit: 61,697,533
RAC: 27,513
Australia
Message 1330451 - Posted: 23 Jan 2013, 18:50:53 UTC


Server status shows green, but inbound & outbound traffic have hit bottom.
____________
Grant
Darwin NT.

Richard Haselgrove
Project donor
Volunteer tester
Joined: 4 Jul 99
Posts: 8759
Credit: 52,709,054
RAC: 26,372
United Kingdom
Message 1330458 - Posted: 23 Jan 2013, 19:19:37 UTC - in response to Message 1330451.


Server status shows green, but inbound & outbound traffic have hit bottom.

And bounced....

Grant (SSSF)
Joined: 19 Aug 99
Posts: 5917
Credit: 61,697,533
RAC: 27,513
Australia
Message 1330648 - Posted: 24 Jan 2013, 6:50:52 UTC - in response to Message 1330458.


Glad it recovered. There are a few dips in the inbound traffic, but at least my systems have as much work as they are allowed to have.
Of concern is the AP validation: nothing appears to be happening, and the number waiting for validation is growing quickly.
____________
Grant
Darwin NT.

Cosmic_Ocean
Joined: 23 Dec 00
Posts: 2326
Credit: 8,867,941
RAC: 957
United States
Message 1330670 - Posted: 24 Jan 2013, 8:41:02 UTC - in response to Message 1330648.

Of concern is the AP validation: nothing appears to be happening, and the number waiting for validation is growing quickly.

The numbers may be growing, but I just reported six tasks and all but one got validated instantly. There may be more to that story, such as specific conditions or tasks from a certain date that are slow to be validated.
____________

Linux laptop uptime: 1484d 22h 42m
Ended due to UPS failure, found 14 hours after the fact

KWSN Ekky Ekky Ekky
Joined: 25 May 99
Posts: 926
Credit: 12,373,956
RAC: 7,492
United Kingdom
Message 1330681 - Posted: 24 Jan 2013, 10:26:05 UTC
Last modified: 24 Jan 2013, 10:28:53 UTC

Possible reporting blip starting?
Cricket "bits out" definitely heading southwards.
____________

ivan
Volunteer tester
Joined: 5 Mar 01
Posts: 639
Credit: 146,876,046
RAC: 84,975
United Kingdom
Message 1330685 - Posted: 24 Jan 2013, 11:29:30 UTC - in response to Message 1330681.

Possible reporting blip starting?
Cricket "bits out" definitely heading southwards.

Coming back up a bit now, but I've been having trouble reporting on all my machines since sometime before 0900Z -- timeouts, etc.
____________

Cliff Harding
Project donor
Volunteer tester
Joined: 18 Aug 99
Posts: 1026
Credit: 53,858,483
RAC: 18,296
United States
Message 1330698 - Posted: 24 Jan 2013, 13:04:07 UTC

Why is it that as soon as APs start generating, everything goes south? Normal u/d loads until then. Has anyone given any thought to compressing these WUs and then having the apps uncompress them once received at the user's machine? Doesn't Einstein use something like this to send their large packets?
____________


I don't buy computers, I build them!!

Richard Haselgrove
Project donor
Volunteer tester
Joined: 4 Jul 99
Posts: 8759
Credit: 52,709,054
RAC: 26,372
United Kingdom
Message 1330707 - Posted: 24 Jan 2013, 14:00:33 UTC - in response to Message 1330698.

Why is it that as soon as APs start generating, everything goes south? Normal u/d loads until then. Has anyone given any thought to compressing these WUs and then having the apps uncompress them once received at the user's machine? Doesn't Einstein use something like this to send their large packets?

Yes, people have thought about it - and the answer is that even with LZMA compression (7-zip), you only save about 3% of the 8MB file size. Try it for yourself - the WUs are practically incompressible, or to put it another way, they're already compressed almost as far as it's possible to compress them.

Mind you - you very rarely get a WU that will compress down to half its original size. They (almost?) always come from the B3_P1 channel, and indicate that once again a bit has got stuck at the receiver or in the Arecibo antenna cabling. The data is useless, and the task will overflow in seconds - it's a useful diagnostic test.
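
For anyone who wants to try the compression test described above, here is a minimal Python sketch. It uses the standard-library lzma module (the same algorithm family as 7-Zip's default method) and reports the fraction of bytes saved. The function name compression_saving and the two stand-in buffers are assumptions for illustration only: they are not real workunit data, so read in the bytes of an actual WU file to check the roughly 3% figure for yourself.

```python
import lzma
import os


def compression_saving(data: bytes) -> float:
    """Return the fraction of bytes saved by LZMA (negative if the output grew)."""
    if not data:
        return 0.0
    compressed = lzma.compress(data, preset=9)
    return 1.0 - len(compressed) / len(data)


if __name__ == "__main__":
    # Hypothetical stand-ins, NOT real workunits:
    # random bytes mimic dense, noise-like data; all-zero bytes mimic a "stuck bit" recording.
    noise_like = os.urandom(200_000)
    stuck_like = bytes(200_000)

    print(f"noise-like data saved: {compression_saving(noise_like):.1%}")
    print(f"stuck-bit data saved:  {compression_saving(stuck_like):.1%}")

    # To test a real file, something like:
    #   with open("path/to/workunit", "rb") as f:
    #       print(f"{compression_saving(f.read()):.1%}")
```

A saving close to zero means the file is already about as dense as it can get; a large saving on a single WU would be the diagnostic case mentioned above.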

Cliff Harding
Project donor
Volunteer tester
Joined: 18 Aug 99
Posts: 1026
Credit: 53,858,483
RAC: 18,296
United States
Message 1330708 - Posted: 24 Jan 2013, 14:14:26 UTC - in response to Message 1330707.

Why is it that as soon as APs start generating, everything goes south? Normal u/d loads until then. Has anyone given any thought to compressing these WUs and then having the apps uncompress them once received at the user's machine? Doesn't Einstein use something like this to send their large packets?

Yes, people have thought about it - and the answer is that even with LZMA compression (7-zip), you only save about 3% of the 8MB file size. Try it for yourself - the WUs are practically incompressible, or to put it another way, they're already compressed almost as far as it's possible to compress them.

Mind you - you very rarely get a WU that will compress down to half its original size. They (almost?) always come from the B3_P1 channel, and indicate that once again a bit has got stuck at the receiver or in the Arecibo antenna cabling. The data is useless, and the task will overflow in seconds - it's a useful diagnostic test.


Thanks Richard, you've answered a question that's been bugging me for quite some time.
____________


I don't buy computers, I build them!!

[B^S] madmac
Volunteer tester
Joined: 9 Feb 04
Posts: 1155
Credit: 3,904,137
RAC: 2,569
United Kingdom
Message 1330755 - Posted: 24 Jan 2013, 17:18:10 UTC

Since 15:00 UTC one of my machines has been getting scheduler request failures. Is there something wrong, or is my timing for new work wrong?
____________

Mark Lybeck
Joined: 9 Aug 99
Posts: 209
Credit: 105,044,539
RAC: 40,955
Finland
Message 1330756 - Posted: 24 Jan 2013, 17:19:10 UTC

Scheduler connections are not getting through again...
____________

BarryAZ
Joined: 1 Apr 01
Posts: 2580
Credit: 12,382,484
RAC: 2,723
United States
Message 1330757 - Posted: 24 Jan 2013, 17:19:38 UTC - in response to Message 1330708.

So then the simple answer is that processing APs is a test of bandwidth capabilities for SETI, and that seemingly each time that happens the bandwidth is found inadequate.


I suppose, absent increasing bandwidth, there really is only a single solution to this.

Curious that this conclusion is as yet undetected.

And no, I don't run AP's -- don't want to be part of the problem there.

rob smith
Project donor
Volunteer tester
Joined: 7 Mar 03
Posts: 8734
Credit: 61,630,530
RAC: 52,380
United Kingdom
Message 1330763 - Posted: 24 Jan 2013, 17:38:26 UTC

Or could it be that the servers are tuned for working with 366 KB files and not 8 MB files, never mind a mixture of the two?
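
To put rough numbers on that size gap, here is a small Python sketch. It assumes the two figures mean 366 kilobytes and 8 megabytes, and the 100 Mbit/s link speed is a hypothetical value chosen only for illustration, not a measured figure for the project's connection. The helper names are made up for this example.

```python
# Assumed sizes: reading the post's figures as kilobytes and megabytes.
SMALL_WU_BYTES = 366 * 1024        # the smaller workunit size quoted above
AP_WU_BYTES = 8 * 1024 * 1024      # the larger (AP) workunit size quoted above


def size_ratio(large_bytes: int, small_bytes: int) -> float:
    """How many small files occupy the same bytes as one large file."""
    return large_bytes / small_bytes


def transfers_per_second(link_bits_per_second: float, file_bytes: int) -> float:
    """Upper bound on whole-file transfers per second, ignoring protocol overhead."""
    return link_bits_per_second / (file_bytes * 8)


if __name__ == "__main__":
    hypothetical_link = 100e6  # 100 Mbit/s, an assumption for illustration

    print(f"one large file = {size_ratio(AP_WU_BYTES, SMALL_WU_BYTES):.1f} small files")
    print(f"small files per second: {transfers_per_second(hypothetical_link, SMALL_WU_BYTES):.1f}")
    print(f"large files per second: {transfers_per_second(hypothetical_link, AP_WU_BYTES):.1f}")
```

The sketch only covers raw bytes on the wire; it says nothing about per-connection tuning on the servers, which is the separate possibility raised above.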
____________
Bob Smith
Member of Seti PIPPS (Pluto is a Planet Protest Society)
Somewhere in the (un)known Universe?

James Sotherden
Joined: 16 May 99
Posts: 9026
Credit: 36,974,348
RAC: 23,824
United States
Message 1330765 - Posted: 24 Jan 2013, 17:44:20 UTC - in response to Message 1330757.

So then the simple answer is that processing APs is a test of bandwidth capabilities for SETI, and that seemingly each time that happens the bandwidth is found inadequate.


I suppose, absent increasing bandwidth, there really is only a single solution to this.

Curious that this conclusion is as yet undetected.

And no, I don't run AP's -- don't want to be part of the problem there.


Wait and see what happens when the new work units hit the airwaves. You think APs have slowed the system down? The only thing I can hope for is that there is only one type of work unit, and that is the new one. If they have 3 we can forget about ever connecting again. (Said tongue in cheek, of course.)
____________

Old James

KWSN Ekky Ekky Ekky
Joined: 25 May 99
Posts: 926
Credit: 12,373,956
RAC: 7,492
United Kingdom
Message 1330766 - Posted: 24 Jan 2013, 17:46:37 UTC

Still everything maxed out but nothing getting there from here :-(
____________

Grant (SSSF)
Joined: 19 Aug 99
Posts: 5917
Credit: 61,697,533
RAC: 27,513
Australia
Message 1330796 - Posted: 24 Jan 2013, 18:18:04 UTC
Last modified: 24 Jan 2013, 18:24:51 UTC

I've had the odd "Failure when receiving data from the peer" but it's mostly "Couldn't connect to server" for any attempt to contact the Scheduler. Looks like it turned to crap again about 8 hours ago- inbound network traffic dropped off around then.

EDIT: and if I do get a response, it's taking at least 2-3 minutes.
____________
Grant
Darwin NT.


Copyright © 2014 University of California