Panic Mode On (76) Server Problems?


Gundolf Jahn
Joined: 19 Sep 00
Posts: 3184
Credit: 359,338
RAC: 33
Germany
Message 1273249 - Posted: 20 Aug 2012, 11:09:13 UTC - in response to Message 1273246.

For me uploads are fine but totally unable to report, scheduler time out

How many tasks are you trying to report?

Gruß,
Gundolf

Fred E.
Project donor
Volunteer tester
Joined: 22 Jul 99
Posts: 768
Credit: 24,139,004
RAC: 1
United States
Message 1273251 - Posted: 20 Aug 2012, 11:10:29 UTC

Uploads okay here. Scheduler requests okay when I'm on NNT. When I request work, 3 of 4 time out, then I get lost tasks when I do connect.
____________
Another Fred
Support SETI@home when you search the Web with GoodSearch or shop online with GoodShop.

Wedge009
Volunteer tester
Joined: 3 Apr 99
Posts: 340
Credit: 149,193,673
RAC: 199,026
Australia
Message 1273252 - Posted: 20 Aug 2012, 11:13:37 UTC
Last modified: 20 Aug 2012, 11:14:57 UTC

I understand and expect things to be poor after a multi-day-long outage, but the past few days have been atrocious. Normal experience for me is periods of downloads < 1 KiB/s which often stall after a minute or so, alternating with (lesser) periods of downloads ~15 KiB/s. Uploads and scheduler responses are consistently speedy.

In contrast, the past few days have been long periods where downloads stall after less than one second and even uploads stalling too, plus scheduler requests hitting time-outs regularly. The only counterpart is brief periods where hosts might get downloads at ~30 KiB/s, but with scheduler responses so rare, those downloads don't do enough anyway.

Any ideas what's going on this time around? As I said, I understand that post-outage recovery takes a while, but the few days after the outage were actually better than they have been recently. Judging from the countries of the recent posters, the network issues don't seem to be location-specific.

Edit: I experience what Fred remarked about the lost tasks as well: when scheduler responses do make it back, it's usually with lost tasks that never made it the first time.
____________
Soli Deo Gloria

.clair.
Volunteer moderator
Joined: 4 Nov 04
Posts: 1300
Credit: 23,063,663
RAC: 474
United Kingdom
Message 1273255 - Posted: 20 Aug 2012, 11:20:58 UTC

Uploads are slow or take several tries
Scheduler requests time out and try again and time out etc.....
Reporting work fails for the last six hours (cc_config set at 250)
Three hours of work left
I am not panicking
I am not panicking
I am not panicking
I am not panicking
I_am_not_pan_ic_inn_ggg.

Link
Joined: 18 Sep 03
Posts: 828
Credit: 1,571,842
RAC: 247
Germany
Message 1273256 - Posted: 20 Aug 2012, 11:23:40 UTC

Uploads and scheduler requests go thru from here OK as far as I can tell from the few I had/needed today. But downloads are quite impossible, with or without a proxy between.
____________
.

alephnull
Volunteer tester
Joined: 16 Mar 03
Posts: 119
Credit: 161,986,054
RAC: 386
United States
Message 1273270 - Posted: 20 Aug 2012, 12:24:29 UTC - in response to Message 1273251.

Uploads okay here. Scheduler requests okay when I'm on NNT. When I request work, 3 of 4 time out, then I get lost tasks when I do connect.

+1

although ive had two or three occasions where the scheduler requests hung with nnt as well.

lowered cache settings to 2 days but didnt seem to help.

ive noticed these issues only on the machines with gpu work. all my machines with only cpu work schedule fine.

uploads ok with the occasional timeout/retry.

downloads are pretty good depending on the dl server per request (as already known).

rob
____________

Wiggo
Joined: 24 Jan 00
Posts: 7358
Credit: 96,901,157
RAC: 66,634
Australia
Message 1273273 - Posted: 20 Aug 2012, 12:34:44 UTC - in response to Message 1273270.

Other than the usual giving downloads a nudge it's been plain sailing at this end.

Cheers.
____________

Fred E.
Project donor
Volunteer tester
Joined: 22 Jul 99
Posts: 768
Credit: 24,139,004
RAC: 1
United States
Message 1273277 - Posted: 20 Aug 2012, 12:43:05 UTC

Any ideas what's going on this time around? As I said, I understand that post-outage recovery takes a while, but the few days after the outage were actually better than they have been recently. Judging from the countries of the recent posters, the network issues don't seem to be location-specific.


My uneducated guess is database problems, and Scheduler can't get a fast response when work is requested. The startup after the outage was ragged, and they had to disable the replica. So master is running everything - even forums are sluggish at times - I had a lag just starting this post. If it is the database, expect some downtime - hope they don't wait until Tuesday.

That doesn't explain what looks like intermittent upload/download issues. I haven't seen any, but there are plenty of reports. Any other guesses?

____________
Another Fred
Support SETI@home when you search the Web with GoodSearch or shop online with GoodShop.

Wedge009
Volunteer tester
Joined: 3 Apr 99
Posts: 340
Credit: 149,193,673
RAC: 199,026
Australia
Message 1273281 - Posted: 20 Aug 2012, 12:54:36 UTC - in response to Message 1273273.

Other than the usual giving downloads a nudge it's been plain sailing at this end.

Lucky you! Probably not location-specific, then, just luck of the draw.
____________
Soli Deo Gloria

KWSN Ekky Ekky Ekky
Joined: 25 May 99
Posts: 922
Credit: 12,060,943
RAC: 13,759
United Kingdom
Message 1273324 - Posted: 20 Aug 2012, 15:21:59 UTC - in response to Message 1273255.

Uploads are slow or take several tries
Scheduler requests time out and try again and time out etc.....
Reporting work fails for the last six hours (cc_config set at 250)
Three hours of work left
I am not panicking
I am not panicking
I am not panicking
I am not panicking
I_am_not_pan_ic_inn_ggg.


Is this a transatlantic thing? I am having the same sort of problem. One time work pours in and out at incredible speed, then slows to a crawl, timing out again and again.
____________

Cosmic_Ocean
Joined: 23 Dec 00
Posts: 2290
Credit: 8,813,816
RAC: 4,080
United States
Message 1273328 - Posted: 20 Aug 2012, 15:26:02 UTC

Looks fine now, but I had two failed uploads in the 0730-0800 UTC time period. They both retried about 5 times before making it up to a 2+ hour back-off and then went through around 1100 UTC.
____________

Linux laptop uptime: 1484d 22h 42m
Ended due to UPS failure, found 14 hours after the fact

KWSN Ekky Ekky Ekky
Joined: 25 May 99
Posts: 922
Credit: 12,060,943
RAC: 13,759
United Kingdom
Message 1273338 - Posted: 20 Aug 2012, 15:42:20 UTC

Just had a look at cricket - bits out has dropped to the bottom line, so perhaps there's a problem after all?

____________

Terror Australis
Volunteer tester
Joined: 14 Feb 04
Posts: 1715
Credit: 205,826,362
RAC: 27,765
Australia
Message 1273341 - Posted: 20 Aug 2012, 15:45:03 UTC

Someone could be kicking something. It's 08:40 Monday Berkeley time, the server status page is blank and the crickets have flatlined.

T.A.

meijin
Joined: 24 Mar 12
Posts: 8
Credit: 2,097,274
RAC: 0
United States
Message 1273348 - Posted: 20 Aug 2012, 16:02:05 UTC

Is anyone else having issues reporting/uploading results this morning? All 4 of my boxes can't get anything uploaded/reported right now.

Thanks!

Slavac
Volunteer tester
Joined: 27 Apr 11
Posts: 1932
Credit: 17,952,639
RAC: 0
United States
Message 1273357 - Posted: 20 Aug 2012, 16:12:25 UTC - in response to Message 1259773.

Looking like Bruno barfed.
____________


Executive Director GPU Users Group Inc. -
brad@gpuug.org

Sunny129
Joined: 7 Nov 00
Posts: 190
Credit: 3,163,755
RAC: 0
United States
Message 1273371 - Posted: 20 Aug 2012, 16:46:15 UTC - in response to Message 1273370.

And it looks like we are starting to come back up....
Crickets showing some life again, just completed some uploads and reported.

yeah, an hour ago i had 100 or so tasks waiting to upload...they've all uploaded since then. however i'm still having problems reporting...in fact i probably have several hundred tasks waiting to report.
____________

Sunny129
Joined: 7 Nov 00
Posts: 190
Credit: 3,163,755
RAC: 0
United States
Message 1273387 - Posted: 20 Aug 2012, 17:15:55 UTC - in response to Message 1273373.

And it looks like we are starting to come back up....
Crickets showing some life again, just completed some uploads and reported.

yeah, an hour ago i had 100 or so tasks waiting to upload...they've all uploaded since then. however i'm still having problems reporting...in fact i probably have several hundred tasks waiting to report.

Have you tried <max_tasks_reported>100</max_tasks_reported> in your cc_config.xml file?

thanks...that got my reporting working again.
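
For anyone wanting to apply the same setting without hand-editing the file, here is a minimal sketch in Python. It assumes the option sits under `<cc_config><options>`, which is where the BOINC client configuration docs place it as far as I know. The helper name `set_max_tasks_reported` is made up for illustration and is not part of BOINC. It only rewrites the XML text; locating cc_config.xml in your BOINC data directory and getting the client to re-read it is left to you.

```python
import xml.etree.ElementTree as ET


def set_max_tasks_reported(xml_text: str, limit: int) -> str:
    """Return cc_config.xml text with <max_tasks_reported> set to `limit`.

    Hypothetical helper. Assumes the option lives under
    <cc_config><options>; empty input is treated as "no config yet".
    """
    if xml_text.strip():
        root = ET.fromstring(xml_text)
    else:
        root = ET.Element("cc_config")
    if root.tag != "cc_config":
        raise ValueError("expected <cc_config> as the root element")

    # Create <options> if the existing file does not have one.
    options = root.find("options")
    if options is None:
        options = ET.SubElement(root, "options")

    # Update the existing entry, or add one if it is missing.
    entry = options.find("max_tasks_reported")
    if entry is None:
        entry = ET.SubElement(options, "max_tasks_reported")
    entry.text = str(limit)

    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # The value 100 is the one suggested in the quoted post.
    print(set_max_tasks_reported("<cc_config><options/></cc_config>", 100))
```

Other entries already in the file are left alone, so running it against an existing config should only touch the one option. Back up the original before overwriting it.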
____________


Copyright © 2014 University of California