Panic Mode On (42) Server problems

Donald L. Johnson
Joined: 5 Aug 02
Posts: 8240
Credit: 14,654,533
RAC: 20
United States
Message 1060097 - Posted: 27 Dec 2010, 3:00:03 UTC - in response to Message 1059940.  

Well, the Server Status Page has been dead for 2 hours. It shows 18:40:08 UTC, and that was 2 hours ago.

The upload servers seem dead in the water, though requesting and reporting jobs seems to work. I do not know whether being assigned jobs and downloading them actually works, because my computers do not need any more work right now.


I got home from my Sunday volunteer job about 20 minutes ago. As soon as I got my laptop plugged in and enabled the network, my BOINC manager requested and successfully downloaded one MB workunit. However, I have completed WUs on both boxes that are on a 2+ hour upload backoff. So the Scheduler and Download servers are working, but the Upload server is not.

But I have 1 WU in progress and 1 waiting to start on each box, so I am good until at least Wednesday. Matt and/or Jeff will fix it in the morning.

Even with two new servers and one on the way, around here patience is not just a virtue, it is a necessity.

Donald
Infernal Optimist / Submariner, retired
ID: 1060097

parl
Joined: 22 May 04
Posts: 95
Credit: 4,476,976
RAC: 0
United States
Message 1060131 - Posted: 27 Dec 2010, 6:23:01 UTC

Maybe this has been covered elsewhere, but when the third new server comes online, will the mid-week outage be a thing of the past?

Not that I'm an expert or anything, but it seems to me that if the splitters generate WUs, the d/l server delivers them, and the u/l server(s) receive the completed ones, then the integration of completed and validated results into the database could be deferred for a while, although this might take a lot of storage to hold the WUs waiting for assimilation. It's possible that validation could even take place offline from the main database. Unvalidated WUs likely aren't "fully" in the database anyway (I think).

But perhaps my basis is wrong, and assimilation is not the reason the database is not used for science except mid-week.

Or perhaps what I'm suggesting is already happening and there is no longer a scheduled mid-week shutdown of outside communication. I don't see the mid-week notice on the front page just now, although it would not be needed until Tuesday.

Whatever, hi folks. Keep on crunchin'.
ID: 1060131

Donald L. Johnson
Joined: 5 Aug 02
Posts: 8240
Credit: 14,654,533
RAC: 20
United States
Message 1060134 - Posted: 27 Dec 2010, 6:33:13 UTC - in response to Message 1060131.  

Maybe this has been covered elsewhere, but when the third new server comes online, will the mid-week outage be a thing of the past?

Not that I'm an expert or anything, but it seems to me that if the splitters generate WUs, the d/l server delivers them, and the u/l server(s) receive the completed ones, then the integration of completed and validated results into the database could be deferred for a while, although this might take a lot of storage to hold the WUs waiting for assimilation. It's possible that validation could even take place offline from the main database. Unvalidated WUs likely aren't "fully" in the database anyway (I think).

But perhaps my basis is wrong, and assimilation is not the reason the database is not used for science except mid-week.

Or perhaps what I'm suggesting is already happening and there is no longer a scheduled mid-week shutdown of outside communication. I don't see the mid-week notice on the front page just now, although it would not be needed until Tuesday.

Whatever, hi folks. Keep on crunchin'.


Matt addresses that issue here. There will still be a short weekly outage for Database maintenance, but the 3-day outages should become less frequent and/or shorter once the new servers are all settled in.
Donald
Infernal Optimist / Submariner, retired
ID: 1060134

kittyman (Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor)
Volunteer tester
Joined: 9 Jul 00
Posts: 51477
Credit: 1,018,363,574
RAC: 1,004
United States
Message 1060171 - Posted: 27 Dec 2010, 11:02:21 UTC

I can't believe that anybody could complain about too much here lately.
We just had one of the best server runs in a long while.
Almost made it through the long holiday weekend, and work was flowing well enough that most folks should have been able to fill their caches.

In about 6 hours somebody should be back in the lab and get their kicking boots on to resolve the problem.
It will be interesting to find out what brought things down.
"Time is simply the mechanism that keeps everything from happening all at once."

ID: 1060171

jason_gee
Volunteer developer
Volunteer tester
Joined: 24 Nov 06
Posts: 7489
Credit: 91,093,184
RAC: 0
Australia
Message 1060172 - Posted: 27 Dec 2010, 11:08:29 UTC - in response to Message 1060171.  

It will be interesting to find out what brought things down.


My guess is the campus router power socket was needed for a photocopier during a Christmas party that got out of hand :)

"Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions.
ID: 1060172

Mike (Special Project $75 donor)
Volunteer tester
Joined: 17 Feb 01
Posts: 34365
Credit: 79,922,639
RAC: 80
Germany
Message 1060173 - Posted: 27 Dec 2010, 11:35:18 UTC


I agree with Mark here.
Caches are full of work so nothing to worry about.




With each crime and every kindness we birth our future.
ID: 1060173

platium
Joined: 5 Jul 10
Posts: 212
Credit: 262,426
RAC: 0
United Kingdom
Message 1060178 - Posted: 27 Dec 2010, 12:47:27 UTC

Mark is right. I have never seen my PC with non-stop work before, as I started when all the problems did. I even got my first APs, which I returned.
ID: 1060178

Will Malven
Joined: 2 Jun 99
Posts: 52
Credit: 4,441,977
RAC: 0
United States
Message 1060201 - Posted: 27 Dec 2010, 14:08:54 UTC

Don't see the problem here. I've had pages and pages of stalled transfers before and they always get cleared up. If you keep your cache large enough to bridge these minor inconveniences, then you will eventually get the credit and have plenty of work.

If you can't stand that, then move to another project. If you keep a couple of projects on your list with zero % resource share, then should the unthinkable happen and you run out of SETI units, your BOINC will switch to one of those other projects.

For me, the happy balance is CUDA for SETI and CPU for AQUA, with Einstein on reserve. Plenty of work, plenty of points.

Just take a chill pill. It's all good.
Man's future lies in the stars, not on Earth. It is each successive generation's responsibility to humanity to expand the knowledge and understanding of our Universe so that we may one day venture forth to meet our neighbors.

Houston, Texas
ID: 1060201

lost68er
Volunteer tester
Joined: 6 Jun 04
Posts: 5
Credit: 3,942,589
RAC: 0
Germany
Message 1060202 - Posted: 27 Dec 2010, 14:09:48 UTC

No complaint, just asking what's going on...
Can't download new tasks without uploading the "done" ones...

Greets lost
ID: 1060202

kittyman (Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor)
Volunteer tester
Joined: 9 Jul 00
Posts: 51477
Credit: 1,018,363,574
RAC: 1,004
United States
Message 1060222 - Posted: 27 Dec 2010, 15:28:54 UTC - in response to Message 1060202.  

No complaint, just asking what's going on...
Can't download new tasks without uploading the "done" ones...

Greets lost

Servers went down....no uploads or downloads right now.

Should see some activity in the next couple of hours hopefully.
"Time is simply the mechanism that keeps everything from happening all at once."

ID: 1060222

kittyman (Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor)
Volunteer tester
Joined: 9 Jul 00
Posts: 51477
Credit: 1,018,363,574
RAC: 1,004
United States
Message 1060232 - Posted: 27 Dec 2010, 16:11:02 UTC

Uploads are now working.
"Time is simply the mechanism that keeps everything from happening all at once."

ID: 1060232

lost68er
Volunteer tester
Joined: 6 Jun 04
Posts: 5
Credit: 3,942,589
RAC: 0
Germany
Message 1060234 - Posted: 27 Dec 2010, 16:12:56 UTC

YUP!!!

Greets lost
ID: 1060234

Dave
Joined: 29 Mar 02
Posts: 778
Credit: 25,001,396
RAC: 0
United Kingdom
Message 1060247 - Posted: 27 Dec 2010, 16:57:10 UTC

See, it wasn't a panic after all, was it ;).
ID: 1060247

kittyman (Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor)
Volunteer tester
Joined: 9 Jul 00
Posts: 51477
Credit: 1,018,363,574
RAC: 1,004
United States
Message 1060248 - Posted: 27 Dec 2010, 16:59:44 UTC

No outbound work yet....
Server status shows mostly red. Either they have to reboot servers to put things back in order, or they are just waiting until the backed up download traffic settles down a bit.
"Time is simply the mechanism that keeps everything from happening all at once."

ID: 1060248

kittyman (Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor)
Volunteer tester
Joined: 9 Jul 00
Posts: 51477
Credit: 1,018,363,574
RAC: 1,004
United States
Message 1060251 - Posted: 27 Dec 2010, 17:03:01 UTC - in response to Message 1060249.  

See, it wasn't a panic after all, was it ;).



Look at the Server Status Page now (almost entirely red), and then come back saying the same :-)

I am starting to see some credits granted to my total....

"Time is simply the mechanism that keeps everything from happening all at once."

ID: 1060251

Dave
Joined: 29 Mar 02
Posts: 778
Credit: 25,001,396
RAC: 0
United Kingdom
Message 1060287 - Posted: 27 Dec 2010, 20:39:58 UTC

I'm still refusing to panic...
ID: 1060287

Bernie Vine
Volunteer moderator
Volunteer tester
Joined: 26 May 99
Posts: 9958
Credit: 103,452,613
RAC: 328
United Kingdom
Message 1060290 - Posted: 27 Dec 2010, 20:45:12 UTC
Last modified: 27 Dec 2010, 20:47:20 UTC

Yep 42 tasks just uploaded, who's panicking??

Bernie

PS Server status shows just one red!!
ID: 1060290

soft^spirit
Joined: 18 May 99
Posts: 6497
Credit: 34,134,168
RAC: 0
United States
Message 1060292 - Posted: 27 Dec 2010, 20:47:36 UTC

just a bit seasick.. are we up again?
Janice
ID: 1060292

Bernie Vine
Volunteer moderator
Volunteer tester
Joined: 26 May 99
Posts: 9958
Credit: 103,452,613
RAC: 328
United Kingdom
Message 1060293 - Posted: 27 Dec 2010, 20:48:42 UTC - in response to Message 1060292.  

just a bit seasick.. are we up again?

Seems so.
ID: 1060293

kittyman (Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor)
Volunteer tester
Joined: 9 Jul 00
Posts: 51477
Credit: 1,018,363,574
RAC: 1,004
United States
Message 1060306 - Posted: 27 Dec 2010, 22:04:06 UTC

Matt just posted in tech news about the problems.
Might be a bit hurky-jerky until tomorrow's outage.
"Time is simply the mechanism that keeps everything from happening all at once."

ID: 1060306