Panic Mode On (80) Server Problems?


Bernie Vine
Volunteer moderator
Volunteer tester
Joined: 26 May 99
Posts: 6892
Credit: 25,526,083
RAC: 37,131
United Kingdom
Message 1326447 - Posted: 10 Jan 2013, 20:13:53 UTC - in response to Message 1326415.

I think I'm going to assume that the fact that you've got nothing better to do than argue semantic points from a post eight messages, two days and a maintenance outage ago is a good thing.

It means that the project has (very quietly) started running so smoothly that you've all got nothing better to panic about :P

Quite correct, I haven't been panicking for ages.
____________


Today is life, the only life we're sure of. Make the most of today.

Rolf
Joined: 16 Jun 09
Posts: 114
Credit: 7,816,984
RAC: 151
Switzerland
Message 1326450 - Posted: 10 Jan 2013, 20:24:45 UTC

Please, no panic!
Panic is when people start losing connection to reality - but we are living in reality.

If "he" doesn't care (or "they" don't care), there are still two possibilities: Either he doesn't want to take care (his own decision) or he does not have the time to take care (somebody else decided).
As an example take Matt's blog: http://setiathome.berkeley.edu/forum_thread.php?id=59157
Although it's nearly 3 years old, some parts could still be valid!

I prefer the last Q-A-block:

Q: What do you mean "you hit the limit?" Are you mad at us?
A: No. No. No. Nothing personal, but I'd really like other project staff to chime in more often, and maybe I need to step away in order for that to happen. Plus being in the spotlight I sometimes find myself having to argue on behalf of policies or practices around here which I don't know much about, or I don't exactly agree with. Meh. I do wish SETI@home had better avenues and resources for public relations. In any case I'll still keep working towards improving that in the future whenever I can, whether or not that's direct contact from me or otherwise. The daily notes were fun, but honestly probably not the best use of my time. I may just report about bigger projects as they happen. We'll see.

So I'll be around - just a lot quieter.

TBar
Volunteer tester
Joined: 22 May 99
Posts: 1209
Credit: 44,881,667
RAC: 117,949
United States
Message 1326452 - Posted: 10 Jan 2013, 20:30:31 UTC

I've been receiving Short CPU tasks that insist they run Immediately. Plus, all my recently downloaded CUDA 23 tasks are...Shorties. I see an approaching Shortie Storm.

Batten down the hatches, no need to panic, it's just a little Short thing. Unless...

Bernie Vine
Volunteer moderator
Volunteer tester
Joined: 26 May 99
Posts: 6892
Credit: 25,526,083
RAC: 37,131
United Kingdom
Message 1326513 - Posted: 11 Jan 2013, 0:15:39 UTC

Matt has posted an update that I believe all serious Setizens should read:

Matt's Post Here
____________


Today is life, the only life we're sure of. Make the most of today.

msattler
Project donor
Volunteer tester
Joined: 9 Jul 00
Posts: 38828
Credit: 576,007,412
RAC: 537,953
United States
Message 1326670 - Posted: 11 Jan 2013, 14:21:55 UTC

Something's gone away....
Everything was still going fine about 5 hours ago when I went to bed. This morning I have a bunch of rigs that have dropped back to Einstein because they can't connect to the server to report and get new work.

Whazzup?
____________
*********************************************
Embrace your inner kitty...ya know ya wanna!

I have met a few friends in my life.
Most were cats.

HAL9000
Volunteer tester
Joined: 11 Sep 99
Posts: 4028
Credit: 110,875,590
RAC: 146,221
United States
Message 1326673 - Posted: 11 Jan 2013, 14:31:16 UTC - in response to Message 1326670.

Something's gone away....
Everything was still going fine about 5 hours ago when I went to bed. This morning I have a bunch of rigs that have dropped back to Einstein because they can't connect to the server to report and get new work.

Whazzup?

Looking over my logs for the night I see spurts of "Scheduler request failed: HTTP gateway timeout", but I often see that when the bandwidth is constantly pegged.
____________
SETI@home classic workunits: 93,865 CPU time: 863,447 hours

Join the BP6/VP6 User Group today!

msattler
Project donor
Volunteer tester
Joined: 9 Jul 00
Posts: 38828
Credit: 576,007,412
RAC: 537,953
United States
Message 1326675 - Posted: 11 Jan 2013, 14:36:45 UTC - in response to Message 1326673.

Something's gone away....
Everything was still going fine about 5 hours ago when I went to bed. This morning I have a bunch of rigs that have dropped back to Einstein because they can't connect to the server to report and get new work.

Whazzup?

Looking over my logs for the night I see spurts of "Scheduler request failed: HTTP gateway timeout", but I often see that when the bandwidth is constantly pegged.

Seems to be not working as smoothly as it has been for a while...
My #1 rig took a bunch of update button pushing and about 20-25 attempts to connect to report. Mostly could not connect to server and a couple of partial connects with no response. Finally just now got through and got 100 GPU tasks in one shot.
But something is tying up scheduler comms badly again.
____________
*********************************************
Embrace your inner kitty...ya know ya wanna!

I have met a few friends in my life.
Most were cats.

Sp@ceNv@der
Project donor
Joined: 10 Jul 05
Posts: 41
Credit: 81,049,530
RAC: 118,828
Belgium
Message 1326679 - Posted: 11 Jan 2013, 14:51:00 UTC - in response to Message 1326675.

Same here in Belgium .... can upload finished tasks, but can't report them nor get new work as long as reporting fails ...

here we go again lol :D
____________
To boldly crunch ...

msattler
Project donor
Volunteer tester
Joined: 9 Jul 00
Posts: 38828
Credit: 576,007,412
RAC: 537,953
United States
Message 1326681 - Posted: 11 Jan 2013, 14:56:05 UTC - in response to Message 1326679.
Last modified: 11 Jan 2013, 15:01:48 UTC

Same here in Belgium .... can upload finished tasks, but can't report them nor get new work as long as reporting fails ...

here we go again lol :D

Yup.
You can't fool the kitties. When in a matter of hours 5 out of 9 rigs here all of a sudden have started running Einstein, something's gone away in the server closet again.
That means that they have run through their 100 task GPU cache, which takes at least a few hours on the fast rigs, have not been able to replenish the cache, and the kitties have gone off sniffing for something else to do.

And if you look at the inbound traffic on the Cricket graph, the flow had been very smooth....now looks a bit ragged.
____________
*********************************************
Embrace your inner kitty...ya know ya wanna!

I have met a few friends in my life.
Most were cats.

Tom*
Joined: 12 Aug 11
Posts: 114
Credit: 4,672,674
RAC: 9,147
United States
Message 1326694 - Posted: 11 Jan 2013, 15:35:45 UTC - in response to Message 1326415.

Not quietly enough :-)

It means that the project has (very quietly) started running so smoothly that you've all got nothing better to panic about :P

LANDO is not running, and there are also errors on the MB channels.

Someone said we started getting a storm of shorties; that, together with APs, is what usually throws us over the Network Performance Knee.

Richard Haselgrove
Project donor
Volunteer tester
Joined: 4 Jul 99
Posts: 8457
Credit: 48,534,387
RAC: 79,266
United Kingdom
Message 1326712 - Posted: 11 Jan 2013, 16:26:31 UTC

I just did an update on a baby host that only does about 1 CPU task a day. Report one, request zero, only three tasks listed as 'other results'. It still took 71 seconds to turn it round (successfully) - that's mighty slow.

msattler
Project donor
Volunteer tester
Joined: 9 Jul 00
Posts: 38828
Credit: 576,007,412
RAC: 537,953
United States
Message 1326718 - Posted: 11 Jan 2013, 16:37:44 UTC - in response to Message 1326712.
Last modified: 11 Jan 2013, 16:52:10 UTC

I just did an update on a baby host that only does about 1 CPU task a day. Report one, request zero, only three tasks listed as 'other results'. It still took 71 seconds to turn it round (successfully) - that's mighty slow.

Well, like I said Richard, you can't fool the kitties.
With 9 rigs running, all with GPUs, some fast, some slow, when I can see a trend in 5 of them going off to another project, nobody can tell me something has not changed. Been at this too long not to see the writing on the wall.

It's loosened up just a little bit in the last half an hour, but still not back to where it was last week.


EDIT...
And, LOL...
I must confess it drives me a bit mad when I see my backup project, Einstein, download MBits worth of files with all speed and see reporting and work requests go through in a second or two. (And I understand why this is, I just wish it could be so for Seti).
____________
*********************************************
Embrace your inner kitty...ya know ya wanna!

I have met a few friends in my life.
Most were cats.

Keith White
Joined: 29 May 99
Posts: 370
Credit: 2,761,316
RAC: 2,075
United States
Message 1326723 - Posted: 11 Jan 2013, 17:00:10 UTC
Last modified: 11 Jan 2013, 17:02:33 UTC

My logs started showing reporting problems starting around 5am EST. Uploads fine just scheduler reports.

Edit: And always as soon as I post here a request goes through...
____________
"Life is just nature's way of keeping meat fresh." - The Doctor

Keith White
Joined: 29 May 99
Posts: 370
Credit: 2,761,316
RAC: 2,075
United States
Message 1326747 - Posted: 11 Jan 2013, 18:31:01 UTC - in response to Message 1326723.

My logs started showing reporting problems starting around 5am EST. Uploads fine just scheduler reports.

Edit: And always as soon as I post here a request goes through...


Well, it worked that one time... now back to not reporting unless I NNT.

____________
"Life is just nature's way of keeping meat fresh." - The Doctor

msattler
Project donor
Volunteer tester
Joined: 9 Jul 00
Posts: 38828
Credit: 576,007,412
RAC: 537,953
United States
Message 1326748 - Posted: 11 Jan 2013, 18:39:39 UTC - in response to Message 1326747.

My logs started showing reporting problems starting around 5am EST. Uploads fine just scheduler reports.

Edit: And always as soon as I post here a request goes through...


Well, it worked that one time... now back to not reporting unless I NNT.

It's still a train wreck.
I have to get some more sleep sooner or later, so the update button shall have to rest. And then the asinine Boincmanager backoffs will take hold again.
As well as the asinine 100 WU GPU limits.

Whatever.

I have my rigs all pledged to 100% Seti service.
If the Seti servers are not up to that pledge, the kitties go elsewhere.
Not that I am happy about it.

I just don't have 24/7 to sit here and punch the buttons to avoid the dang backoffs.
____________
*********************************************
Embrace your inner kitty...ya know ya wanna!

I have met a few friends in my life.
Most were cats.

KWSN Ekky Ekky Ekky
Joined: 25 May 99
Posts: 922
Credit: 11,297,453
RAC: 13,168
United Kingdom
Message 1326776 - Posted: 11 Jan 2013, 19:29:48 UTC

Cricket bits out still spiraling ever so slowly downwards. Hope someone can fix it all before the weekend curtain.
____________

James Sotherden
Project donor
Joined: 16 May 99
Posts: 8671
Credit: 32,879,130
RAC: 56,656
United States
Message 1326781 - Posted: 11 Jan 2013, 19:32:40 UTC

I can't get a thing even with NNT set for all 3 computers.
____________

Old James

msattler
Project donor
Volunteer tester
Joined: 9 Jul 00
Posts: 38828
Credit: 576,007,412
RAC: 537,953
United States
Message 1326789 - Posted: 11 Jan 2013, 19:44:16 UTC - in response to Message 1326776.

Cricket bits out still spiraling ever so slowly downwards. Hope someone can fix it all before the weekend curtain.

No worries here no more.
The kitties have all the rigs set to full time Seti.
When that does not happen for any reason, Einstein sets in on the zero workshare default mode.

I just don't sweat it anymore. I may bitch about it from time to time, but I realize, as one who has been here for over 12 years, it just all works out.

Or doesn't, as the case may be. I know what I want to happen, and when it doesn't, I go to sleep and dream about what might be.

These things are quite apparent to me, and I wish they were to most Seti users. It just DON'T work all the time. Most of the users' computers don't either.
____________
*********************************************
Embrace your inner kitty...ya know ya wanna!

I have met a few friends in my life.
Most were cats.

nairb
Joined: 18 Mar 03
Posts: 193
Credit: 3,809,743
RAC: 114
United Kingdom
Message 1326849 - Posted: 11 Jan 2013, 22:22:36 UTC

Still cannot report/get any w/u here yet. So I guess it's not all fixed yet?

Nairb
____________

Grant (SSSF)
Joined: 19 Aug 99
Posts: 5778
Credit: 57,768,283
RAC: 48,854
Australia
Message 1326854 - Posted: 11 Jan 2013, 22:34:43 UTC - in response to Message 1326849.


Still borked: inbound network traffic has dropped off considerably. Looks like problems started around 12 hours ago, but it became really bad about 2-3 hours ago.
"Scheduler request failed: Couldn't connect to server" is about the only response I'm getting. Occasionally I get some work, but not often enough to keep what little cache I've got. It's not so slowly dwindling.
____________
Grant
Darwin NT.


Copyright © 2014 University of California