Panic Mode On (96) Server Problems?

Wiggo
Joined: 24 Jan 00
Posts: 34744
Credit: 261,360,520
RAC: 489
Australia
Message 1662405 - Posted: 8 Apr 2015, 0:54:47 UTC

Well, the SSP hasn't updated in quite some time now and the cricket graph is almost on the deck, plus my main rig's GPUs will be looking for other work soon. :-(

Cheers.
ID: 1662405
cliff
Joined: 16 Dec 07
Posts: 625
Credit: 3,590,440
RAC: 0
United Kingdom
Message 1662406 - Posted: 8 Apr 2015, 0:57:55 UTC - in response to Message 1662405.  

Hi Wiggo,

Well, the SSP hasn't updated in quite some time now and the cricket graph is almost on the deck, plus my main rig's GPUs will be looking for other work soon. :-(

Cheers.

Just tried to get some work, no go :-( Despite the server stats page showing loads of WUs available :-(

Regards,
Cliff,
Been there, Done that, Still no damm T shirt!
ID: 1662406
Bill Butler
Joined: 26 Aug 03
Posts: 101
Credit: 4,270,697
RAC: 0
United States
Message 1662409 - Posted: 8 Apr 2015, 1:12:13 UTC - in response to Message 1662405.  

Well the SSP hasn't updated in quite some time now . . .

I see the SSP update time is 22:10 UTC. Current time on the planet is 01:12. So, something is not cooperating.
"It is often darkest just before it turns completely black."
ID: 1662409
Zalster · Special Project $250 donor
Volunteer tester
Joined: 27 May 99
Posts: 5517
Credit: 528,817,460
RAC: 242
United States
Message 1662411 - Posted: 8 Apr 2015, 1:13:45 UTC - in response to Message 1662406.  

Had 1 batch of SETI work right after the outage ended, but it didn't last long.

Since then I've been on the backup project.
ID: 1662411
cliff
Joined: 16 Dec 07
Posts: 625
Credit: 3,590,440
RAC: 0
United Kingdom
Message 1662418 - Posted: 8 Apr 2015, 1:35:25 UTC - in response to Message 1662411.  

Hi Zalster,

Had 1 batch of SETI work right after the outage ended, but it didn't last long.

Since then I've been on the backup project.



Someone forgot to feed the hamsters:-)

Cheers,
Cliff,
Been there, Done that, Still no damm T shirt!
ID: 1662418
Wiggo
Joined: 24 Jan 00
Posts: 34744
Credit: 261,360,520
RAC: 489
Australia
Message 1662463 - Posted: 8 Apr 2015, 4:50:04 UTC

Other than a little bump in the CG, it doesn't look like much has happened since my last post, as the SSP is still the same. :-(

My main rig's GPUs will be requesting work from elsewhere within 10 mins.

Cheers.
ID: 1662463
Donald L. Johnson
Joined: 5 Aug 02
Posts: 8240
Credit: 14,654,533
RAC: 20
United States
Message 1662483 - Posted: 8 Apr 2015, 6:18:11 UTC

Watching that World Visualization thingy for the past 10 minutes, I saw 3 tasks sent out, as opposed to hundreds of results returned. All my boxes will make it to morning, but one will be empty by 1000 Berkeley time....
Donald
Infernal Optimist / Submariner, retired
ID: 1662483
TBar
Volunteer tester
Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1662490 - Posted: 8 Apr 2015, 6:55:25 UTC

There will be 44 new tasks as soon as the last task is finished here in a little while. I'm going to take this opportunity to clear out all the "lost" tasks, and there appear to be 44 on this host. Too bad they won't be sent back like back in the good old days...

I just noticed the last task sent has 21 Gaussians and then overflowed. I don't ever remember seeing that many Gaussians on one task before. Oh well, it's not like this task has never been run before...
http://setiathome.berkeley.edu/result.php?resultid=4081735035
ID: 1662490
Grant (SSSF)
Volunteer tester
Joined: 19 Aug 99
Posts: 13720
Credit: 208,696,464
RAC: 304
Australia
Message 1662491 - Posted: 8 Apr 2015, 6:59:09 UTC - in response to Message 1662490.  

It appears the system is down.
No work available, and no updates on the server status page; it's all still frozen.
Grant
Darwin NT
ID: 1662491
Grant (SSSF)
Volunteer tester
Joined: 19 Aug 99
Posts: 13720
Credit: 208,696,464
RAC: 304
Australia
Message 1662500 - Posted: 8 Apr 2015, 7:51:47 UTC - in response to Message 1662491.  

Still borked.
Should be out of GPU work in a couple of hours.
Grant
Darwin NT
ID: 1662500
Jimbocous · Project Donor
Volunteer tester
Joined: 1 Apr 13
Posts: 1849
Credit: 268,616,081
RAC: 1,349
United States
Message 1662536 - Posted: 8 Apr 2015, 10:58:09 UTC

Not a good trend.
Seems that right after maintenance each week the whole thing tanks and needs 3-4 days to recover to some degree of normalcy.
Knowing that through experience, those responsible for weekly maintenance should by now know to keep an eye on it and act accordingly.
Not an issue of staffing, or resources or whatever, but of individual responsibility. If you do the maintenance, you own the consequences.
Just saying ...
ID: 1662536
Cruncher-American · Crowdfunding Project Donor* · Special Project $75 donor · Special Project $250 donor
Joined: 25 Mar 02
Posts: 1513
Credit: 370,893,186
RAC: 340
United States
Message 1662568 - Posted: 8 Apr 2015, 13:31:13 UTC - in response to Message 1662536.  

Not a good trend.
Seems that right after maintenance each week the whole thing tanks and needs 3-4 days to recover to some degree of normalcy.
Knowing that through experience, those responsible for weekly maintenance should by now know to keep an eye on it and act accordingly.
Not an issue of staffing, or resources or whatever, but of individual responsibility. If you do the maintenance, you own the consequences.
Just saying ...


Sounds right to me.
ID: 1662568
Brent Norman · Crowdfunding Project Donor* · Special Project $75 donor · Special Project $250 donor
Volunteer tester
Joined: 1 Dec 99
Posts: 2786
Credit: 685,657,289
RAC: 835
Canada
Message 1662576 - Posted: 8 Apr 2015, 13:51:39 UTC
Last modified: 8 Apr 2015, 13:52:25 UTC

I think for the most part the system is running. My AP WUs are being validated, and being purged out as well. It's just the splitting which isn't happening.

I see there were 3 blocks of data that moved up the hill, so Matt may be splitting the 'old' users out of the database and compressing it, like he mentioned before. Or they are working on the AP database.

However, I never count on any tasks to be available for 24 hours after maintenance - it does always seem to choke.
ID: 1662576
Victor Wedge
Joined: 3 Apr 04
Posts: 28
Credit: 12,569,503
RAC: 0
Message 1662609 - Posted: 8 Apr 2015, 16:23:23 UTC
Last modified: 8 Apr 2015, 16:32:16 UTC

I stumbled into this thread after noting that I was down to a handful of WUs. I've succeeded in downloading some more since I noticed the shortage. Still not up to par though, and NOTHING for the GPU.

So, the GPU is idle.
ID: 1662609
Donald L. Johnson
Joined: 5 Aug 02
Posts: 8240
Credit: 14,654,533
RAC: 20
United States
Message 1662613 - Posted: 8 Apr 2015, 16:28:01 UTC - in response to Message 1662576.  

I think for the most part the system is running. My AP WUs are being validated, and being purged out as well. It's just the splitting which isn't happening.

That jibes with my experience. I can upload and report, but not get new work. But yeah, in the 24-30 hours after the weekly maintenance outage, new work pickin's are slim.....
Donald
Infernal Optimist / Submariner, retired
ID: 1662613
WezH
Volunteer tester
Joined: 19 Aug 99
Posts: 576
Credit: 67,033,957
RAC: 95
Finland
Message 1662617 - Posted: 8 Apr 2015, 16:35:32 UTC

SSP has been updated and is not looking good.

paddym, the SETI@home science database, is disabled. And that's never good news; without it, no new WUs.

And AP has its own problems, maybe no new tapes to split? So no new WUs.
"Please keep Your signature under four lines so Internet traffic doesn't go up too much"

- In 1992 when I had my first e-mail address -
ID: 1662617
kittyman · Crowdfunding Project Donor* · Special Project $75 donor · Special Project $250 donor
Volunteer tester
Joined: 9 Jul 00
Posts: 51468
Credit: 1,018,363,574
RAC: 1,004
United States
Message 1662629 - Posted: 8 Apr 2015, 16:59:13 UTC
Last modified: 8 Apr 2015, 17:18:43 UTC

Rather chilly in the kitties' crunching den this morning....
Not much in the way of GPU work left here.

Curious though, what happened to the buttload of MB datasets waiting to be split before the outage...........

I see a blue line spike on the cricket graph that usually heralds new work being sent to the splitters. Hope Matt doesn't have to repair another DB crash before things can get underway again.

Meowbrrrrrrrrr.
"Freedom is just Chaos, with better lighting." Alan Dean Foster

ID: 1662629
Matt Lebofsky
Volunteer moderator
Project administrator
Project developer
Project scientist
Joined: 1 Mar 99
Posts: 1444
Credit: 957,058
RAC: 0
United States
Message 1662650 - Posted: 8 Apr 2015, 17:34:27 UTC

Hey - just so y'all know it's the science database again. This time enough is enough and I'm doing a comprehensive set of integrity checks on everything in that database before starting it up again. So no MB splitting or assimilating. Eventually some work will show up for AP splitting in the meantime...

Might be back up by the end of the work day, if not shortly after that. I did find one problem which was obscured while checking things out after previous crashes, so there's hope.

- Matt
-- BOINC/SETI@home network/web/science/development person
-- "Any idiot can have a good idea. What is hard is to do it." - Jeanne-Claude
ID: 1662650
Gary Charpentier · Crowdfunding Project Donor* · Special Project $75 donor · Special Project $250 donor
Volunteer tester
Joined: 25 Dec 00
Posts: 30608
Credit: 53,134,872
RAC: 32
United States
Message 1662651 - Posted: 8 Apr 2015, 17:35:40 UTC

Thanks for the information Matt.
ID: 1662651
Brent Norman · Crowdfunding Project Donor* · Special Project $75 donor · Special Project $250 donor
Volunteer tester
Joined: 1 Dec 99
Posts: 2786
Credit: 685,657,289
RAC: 835
Canada
Message 1662652 - Posted: 8 Apr 2015, 17:36:24 UTC - in response to Message 1662650.  

Thanks Matt, hope all goes well.
ID: 1662652


 
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.