Panic Mode On (110) Server Problems?

Message boards : Number crunching : Panic Mode On (110) Server Problems?

Profile Dr.Diesel Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor

Joined: 14 May 99
Posts: 41
Credit: 123,695,755
RAC: 139
United States
Message 1914652 - Posted: 22 Jan 2018, 22:25:36 UTC - in response to Message 1914650.  



I approved the account fairly soon after you created the account.


Thank you Sir, I was just able to log in now, don't know what happened before. :)

Onward with Panic Mode, ha ha.
ID: 1914652 · Report as offensive
Speedy
Volunteer tester
Joined: 26 Jun 04
Posts: 1636
Credit: 12,921,799
RAC: 89
New Zealand
Message 1914662 - Posted: 23 Jan 2018, 0:52:48 UTC - in response to Message 1914551.  

Hope they load up some more data soon- we've just about finished off the current batch of work.

More work was added overnight New Zealand time
ID: 1914662 · Report as offensive
Profile Zalster Special Project $250 donor
Volunteer tester
Joined: 27 May 99
Posts: 5516
Credit: 528,817,460
RAC: 242
United States
Message 1914693 - Posted: 23 Jan 2018, 5:11:17 UTC

I often wonder if we are processing and returning so much data so quickly that we are slamming the servers with more than they can handle, thus leading to our never-ending issues with the servers getting stuck or crashing.

Random thoughts in the middle of the night....
ID: 1914693 · Report as offensive
Stephen "Heretic" Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 20 Sep 12
Posts: 5557
Credit: 192,787,363
RAC: 628
Australia
Message 1914702 - Posted: 23 Jan 2018, 8:04:20 UTC - in response to Message 1914693.  

I often wonder if we are processing and returning so much data so quickly that we are slamming the servers with more than they can handle, thus leading to our never-ending issues with the servers getting stuck or crashing.

Random thoughts in the middle of the night....


. . But not an unreasonable one ...

Stephen

ID: 1914702 · Report as offensive
BoMbY

Joined: 3 Apr 99
Posts: 8
Credit: 759,919
RAC: 0
Germany
Message 1914721 - Posted: 23 Jan 2018, 12:31:29 UTC

So, none of the problems solved? Guess it's still not worth the effort to turn the system back on, as it would run out of work all the time?
ID: 1914721 · Report as offensive
Stephen "Heretic" Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 20 Sep 12
Posts: 5557
Credit: 192,787,363
RAC: 628
Australia
Message 1914722 - Posted: 23 Jan 2018, 12:41:44 UTC - in response to Message 1914721.  

So, none of the problems solved? Guess it's still not worth the effort to turn the system back on, as it would run out of work all the time?


. . Partially, things are working better but perhaps still not 100%

Stephen

ID: 1914722 · Report as offensive
Profile Brent Norman Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester

Joined: 1 Dec 99
Posts: 2786
Credit: 685,657,289
RAC: 835
Canada
Message 1914725 - Posted: 23 Jan 2018, 12:45:13 UTC - in response to Message 1914721.  

The servers are keeping up with task requests, but are not able to build up much of a cache.
As it is, there are no problems getting work, but maintenance will be starting soon here ...
ID: 1914725 · Report as offensive
Profile Brent Norman Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester

Joined: 1 Dec 99
Posts: 2786
Credit: 685,657,289
RAC: 835
Canada
Message 1914727 - Posted: 24 Jan 2018, 7:16:14 UTC

Whew, that was a short one ....
ID: 1914727 · Report as offensive
Stephen "Heretic" Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 20 Sep 12
Posts: 5557
Credit: 192,787,363
RAC: 628
Australia
Message 1914729 - Posted: 24 Jan 2018, 7:29:46 UTC - in response to Message 1914727.  

Whew, that was a short one ....


. . Not ! !

. . I got the message notice but still unable to report :(

Stephen

:)
ID: 1914729 · Report as offensive
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 13396
Credit: 208,696,464
RAC: 304
Australia
Message 1914732 - Posted: 24 Jan 2018, 7:34:20 UTC
Last modified: 24 Jan 2018, 7:41:05 UTC

A few problems contacting the Scheduler, then able to report some of the completed work, but unable to get new work.
I think it'll be that way for a while.

WooHoo!
Just picked up 1 WU.

Forums are flaky, many of the Server Statuses haven't updated yet (a couple have). Some Scheduler replies take the usual amount of time, others are taking a minute or two.
Grant
Darwin NT
ID: 1914732 · Report as offensive
Speedy
Volunteer tester
Joined: 26 Jun 04
Posts: 1636
Credit: 12,921,799
RAC: 89
New Zealand
Message 1914733 - Posted: 24 Jan 2018, 7:35:32 UTC - in response to Message 1914727.  

Whew, that was a short one ....

The reason for the short maintenance period, as you put it, is the following, copied from Eric's post on the front page:

The outage ran long today because we needed to run down to the data center to swap some bad drives with new ones and reboot a few of the machines to pick up kernel and mysql updates.

Sorry for the delay.

ID: 1914733 · Report as offensive
Profile Brent Norman Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester

Joined: 1 Dec 99
Posts: 2786
Credit: 685,657,289
RAC: 835
Canada
Message 1914734 - Posted: 24 Jan 2018, 7:40:03 UTC - in response to Message 1914733.  

Bad drives = Good news!
Maybe the source of the file system lag has been found.
ID: 1914734 · Report as offensive
Profile Stargate (SA)
Volunteer tester
Joined: 4 Mar 10
Posts: 1852
Credit: 2,258,721
RAC: 0
Australia
Message 1914735 - Posted: 24 Jan 2018, 7:40:38 UTC

And a 6 pack
ID: 1914735 · Report as offensive
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 13396
Credit: 208,696,464
RAC: 304
Australia
Message 1914740 - Posted: 24 Jan 2018, 7:49:45 UTC - in response to Message 1914732.  

Forums are flaky, many of the Server Statuses haven't updated yet (a couple have). Some Scheduler replies take the usual amount of time, others are taking a minute or two.

And then the next 3 requests failed, then it got through again.

It's looking like a very rocky recovery.
Grant
Darwin NT
ID: 1914740 · Report as offensive
Profile Jimbocous Project Donor
Volunteer tester
Joined: 1 Apr 13
Posts: 1849
Credit: 268,616,081
RAC: 1,349
United States
Message 1914742 - Posted: 24 Jan 2018, 8:04:17 UTC

Longer the party, the worse the hangover ...
Good to see an Arecibo tape from last week here and getting split. In fact, that's pretty awesome, I think.
ID: 1914742 · Report as offensive
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 13396
Credit: 208,696,464
RAC: 304
Australia
Message 1914745 - Posted: 24 Jan 2018, 8:12:03 UTC - in response to Message 1914742.  
Last modified: 24 Jan 2018, 8:25:49 UTC

Longer the party, the worse the hangover ...

Yeah, but even given the length of this last outage, this is a rockier than usual recovery.
If the failed drives were part of a RAID array (highly likely), then that's going to take ages to rebuild, and will really hit performance while it does.


EDIT- AP ready-to-send up to 900, but in-progress is still falling.
Looks like they're being split, but nothing is being sent out yet...
I haven't picked up anything since that one-off WU. Usually that doesn't happen till the ready-to-send buffer is empty & the splitters still haven't fired up.

EDIT- C2D has picked up 2 downloads, but they just sit there for 20sec or so & then go into project backoff.
Grant
Darwin NT
ID: 1914745 · Report as offensive
Profile Stargate (SA)
Volunteer tester
Joined: 4 Mar 10
Posts: 1852
Credit: 2,258,721
RAC: 0
Australia
Message 1914747 - Posted: 24 Jan 2018, 8:24:37 UTC

If any company ran like this in the real world they would be shot (no feedback, no heads-up on issues). Not the Seti I remember!
ID: 1914747 · Report as offensive
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 13396
Credit: 208,696,464
RAC: 304
Australia
Message 1914749 - Posted: 24 Jan 2018, 8:29:16 UTC - in response to Message 1914747.  

If any company ran like this in the real world they would be shot (no feedback, no heads-up on issues). Not the Seti I remember!

If any company in the real world had Seti's funding they wouldn't still be operating.
Eric posted in the News forum about what happened, and that makes it appear on the home page.

Comparing Seti to a commercial operation is just a bit on the ridiculous side of things.
Grant
Darwin NT
ID: 1914749 · Report as offensive
Profile Keith Myers Special Project $250 donor
Volunteer tester
Joined: 29 Apr 01
Posts: 13159
Credit: 1,160,866,277
RAC: 1,873
United States
Message 1914751 - Posted: 24 Jan 2018, 8:32:25 UTC - in response to Message 1914745.  

I have about 50 downloads stalled and going into 4 and 5 hour backoffs
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 1914751 · Report as offensive
Profile Stargate (SA)
Volunteer tester
Joined: 4 Mar 10
Posts: 1852
Credit: 2,258,721
RAC: 0
Australia
Message 1914752 - Posted: 24 Jan 2018, 8:35:38 UTC

I've got 1 GPU and CPU now out of work.
ID: 1914752 · Report as offensive


 
©2022 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.