Panic Mode On (96) Server Problems?

Message boards : Number crunching : Panic Mode On (96) Server Problems?

Jimbocous (Project Donor)
Volunteer tester
Joined: 1 Apr 13
Posts: 1849
Credit: 268,616,081
RAC: 1,349
United States
Message 1662536 - Posted: 8 Apr 2015, 10:58:09 UTC

Not a good trend.
Seems that right after maintenance each week the whole thing tanks and needs 3-4 days to recover to some degree of normalcy.
Knowing that through experience, those responsible for weekly maintenance should by now know to keep an eye on it and act accordingly.
Not an issue of staffing, or resources or whatever, but of individual responsibility. If you do the maintenance, you own the consequences.
Just saying ...
ID: 1662536

Cruncher-American (Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor)
Joined: 25 Mar 02
Posts: 1513
Credit: 370,893,186
RAC: 340
United States
Message 1662568 - Posted: 8 Apr 2015, 13:31:13 UTC - in response to Message 1662536.  

Not a good trend.
Seems that right after maintenance each week the whole thing tanks and needs 3-4 days to recover to some degree of normalcy.
Knowing that through experience, those responsible for weekly maintenance should by now know to keep an eye on it and act accordingly.
Not an issue of staffing, or resources or whatever, but of individual responsibility. If you do the maintenance, you own the consequences.
Just saying ...


Sounds right to me.
ID: 1662568

Brent Norman (Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor)
Volunteer tester
Joined: 1 Dec 99
Posts: 2786
Credit: 685,657,289
RAC: 835
Canada
Message 1662576 - Posted: 8 Apr 2015, 13:51:39 UTC
Last modified: 8 Apr 2015, 13:52:25 UTC

I think for the most part the system is running. My AP WUs are being validated, and being purged out as well. It's just the splitting which isn't happening.

I see there were 3 blocks of data that moved up the hill, so Matt may be splitting the 'old' users out of the database and compressing it, like he mentioned before. Or they are working on the AP database.

However, I never count on any tasks to be available for 24 hours after maintenance - it does always seem to choke.
ID: 1662576

Victor Wedge
Joined: 3 Apr 04
Posts: 28
Credit: 12,569,503
RAC: 0
Message 1662609 - Posted: 8 Apr 2015, 16:23:23 UTC
Last modified: 8 Apr 2015, 16:32:16 UTC

I stumbled into this thread after noting that I was down to a handful of WUs. I've succeeded in downloading some more since I noticed the shortage. Still not up to par though, and NOTHING for the GPU.

So, the GPU is idle.
ID: 1662609

Donald L. Johnson
Joined: 5 Aug 02
Posts: 8240
Credit: 14,654,533
RAC: 20
United States
Message 1662613 - Posted: 8 Apr 2015, 16:28:01 UTC - in response to Message 1662576.  

I think for the most part the system is running. My AP WUs are being validated, and being purged out as well. It's just the splitting which isn't happening.

That jibes with my experience. I can upload and report, but not get new work. But yeah, for the 24-30 hours after the weekly maintenance outage, new work pickin's are slim.....
Donald
Infernal Optimist / Submariner, retired
ID: 1662613

WezH
Volunteer tester
Joined: 19 Aug 99
Posts: 576
Credit: 67,033,957
RAC: 95
Finland
Message 1662617 - Posted: 8 Apr 2015, 16:35:32 UTC

SSP has been updated and is not looking good.

paddym, the SETI@home science database, is disabled. And that's never good news; without it, no new WUs.

And AP has its own problems, maybe no new tapes to split? So no new WUs.
"Please keep Your signature under four lines so Internet traffic doesn't go up too much"

- In 1992 when I had my first e-mail address -
ID: 1662617

Grumpy Swede (Democratic Socialist)
Volunteer tester
Joined: 1 Nov 08
Posts: 8702
Credit: 49,849,242
RAC: 65
Sweden
Message 1662620 - Posted: 8 Apr 2015, 16:38:12 UTC

Ah well, we're used to this now. This situation with problem after problem is now the new normal. Move on, nothing to see here.

Complaining? Nah, that's over.
ID: 1662620

kittyman (Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor)
Volunteer tester
Joined: 9 Jul 00
Posts: 51406
Credit: 1,018,363,574
RAC: 1,004
United States
Message 1662629 - Posted: 8 Apr 2015, 16:59:13 UTC
Last modified: 8 Apr 2015, 17:18:43 UTC

Rather chilly in the kitties' crunching den this morning....
Not much in the way of GPU work left here.

Curious though, what happened to the buttload of MB datasets waiting to be split before the outage...........

I see a blue line spike on the cricket graph that usually heralds new work being sent to the splitters. Hope Matt doesn't have to repair another DB crash before things can get underway again.

Meowbrrrrrrrrr.
Excuse me if I am hard to understand at times.......I've had a difficult few lives.

ID: 1662629

Matt Lebofsky
Volunteer moderator
Project administrator
Project developer
Project scientist
Joined: 1 Mar 99
Posts: 1444
Credit: 957,058
RAC: 0
United States
Message 1662650 - Posted: 8 Apr 2015, 17:34:27 UTC

Hey - just so y'all know it's the science database again. This time enough is enough and I'm doing a comprehensive set of integrity checks on everything in that database before starting it up again. So no MB splitting or assimilating. Eventually some work will show up for AP splitting in the meantime...

Might be back up by the end of the work day, if not shortly after that. I did find one problem which was obscured while checking things out after previous crashes, so there's hope.

- Matt
-- BOINC/SETI@home network/web/science/development person
-- "Any idiot can have a good idea. What is hard is to do it." - Jeanne-Claude
ID: 1662650

Gary Charpentier (Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor)
Volunteer tester
Joined: 25 Dec 00
Posts: 28524
Credit: 53,134,872
RAC: 32
United States
Message 1662651 - Posted: 8 Apr 2015, 17:35:40 UTC

Thanks for the information Matt.
ID: 1662651

Brent Norman (Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor)
Volunteer tester
Joined: 1 Dec 99
Posts: 2786
Credit: 685,657,289
RAC: 835
Canada
Message 1662652 - Posted: 8 Apr 2015, 17:36:24 UTC - in response to Message 1662650.  

Thanks Matt, hope all goes well.
ID: 1662652

kittyman (Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor)
Volunteer tester
Joined: 9 Jul 00
Posts: 51406
Credit: 1,018,363,574
RAC: 1,004
United States
Message 1662653 - Posted: 8 Apr 2015, 17:36:58 UTC - in response to Message 1662650.  

Hey - just so y'all know it's the science database again. This time enough is enough and I'm doing a comprehensive set of integrity checks on everything in that database before starting it up again. So no MB splitting or assimilating. Eventually some work will show up for AP splitting in the meantime...

Might be back up by the end of the work day, if not shortly after that. I did find one problem which was obscured while checking things out after previous crashes, so there's hope.

- Matt

Thank you very much for the quick note, Matt!!
Best of luck with the analysis.

The kitties shall be waiting......

Meow!!!
Excuse me if I am hard to understand at times.......I've had a difficult few lives.

ID: 1662653

Brent Norman (Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor)
Volunteer tester
Joined: 1 Dec 99
Posts: 2786
Credit: 685,657,289
RAC: 835
Canada
Message 1662655 - Posted: 8 Apr 2015, 17:41:07 UTC

On the up-side, all the scan/repair commands will be fresh in his mind from all the AP problems :D
ID: 1662655

WezH
Volunteer tester
Joined: 19 Aug 99
Posts: 576
Credit: 67,033,957
RAC: 95
Finland
Message 1662667 - Posted: 8 Apr 2015, 18:00:42 UTC
Last modified: 8 Apr 2015, 18:01:53 UTC

Thank you, Matt, for the information.

Even those tiny bits of information from staff are really important to us, so we know what's happening.

And we know to be patient :)
"Please keep Your signature under four lines so Internet traffic doesn't go up too much"

- In 1992 when I had my first e-mail address -
ID: 1662667

Admiral Gloval
Joined: 31 Mar 13
Posts: 16807
Credit: 5,308,449
RAC: 0
United States
Message 1662690 - Posted: 8 Apr 2015, 19:12:47 UTC

Bummer. My WUs disappeared many hours ago. Increased my cache numbers from ten and five to ten and ten. Have to see if my backup has any work.

ID: 1662690

Akio
Joined: 18 May 11
Posts: 375
Credit: 32,129,242
RAC: 0
United States
Message 1662724 - Posted: 8 Apr 2015, 20:52:47 UTC
Last modified: 8 Apr 2015, 21:41:06 UTC

Just got 40 some-odd WU's. Hope this means things are looking up for everyone :)

[ Edit: Yeah, never mind. I got a lucky pull, I guess. Back to no new tasks ]
ID: 1662724

TimeLord04
Volunteer tester
Joined: 9 Mar 06
Posts: 20881
Credit: 33,933,039
RAC: 23
United States
Message 1662759 - Posted: 8 Apr 2015, 23:00:57 UTC

Anyone know anything about Beta??? I have work that Uploaded; but, I cannot Report. Communication keeps deferring... :-( I have four WUs due to report by 4-26...
TimeLord04
Have TARDIS, will travel...
Come along K-9!
Join Calm Chaos
ID: 1662759

Wiggo
Joined: 24 Jan 00
Posts: 23332
Credit: 261,360,520
RAC: 489
Australia
Message 1662765 - Posted: 8 Apr 2015, 23:20:26 UTC

I've got 1 rig that has now asked for CPU backup work and in a few more hours the other 1 will follow suit. :-(

Cheers.
ID: 1662765

Brent Norman (Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor)
Volunteer tester
Joined: 1 Dec 99
Posts: 2786
Credit: 685,657,289
RAC: 835
Canada
Message 1662769 - Posted: 8 Apr 2015, 23:31:08 UTC - in response to Message 1662765.  
Last modified: 8 Apr 2015, 23:35:55 UTC

If you set your backup projects' cache to 0.02 days it will only load enough to keep your computer running without filling everything up on you. Then load SETI when available.

EDIT: Just remembered that is a global setting :( So scratch that.
ID: 1662769

Akio
Joined: 18 May 11
Posts: 375
Credit: 32,129,242
RAC: 0
United States
Message 1662770 - Posted: 8 Apr 2015, 23:34:10 UTC - in response to Message 1662769.  

How do you set up a backup project? If I have one set to "no new tasks" in BOINC, but have my S@H project to allow work from other projects when no work for S@H is available, will work for other projects still be downloaded?
ID: 1662770


 
©2021 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.