Website & RSS: Regular updates when there is an outage



Questions and Answers : Wish list : Website & RSS: Regular updates when there is an outage

Rick Spies
Joined: 11 Jul 99
Posts: 12
Credit: 2,840,936
RAC: 758
United States
Message 877998 - Posted: 21 Mar 2009, 19:49:52 UTC

I received the following post a couple days ago via RSS:

"Our science database crashed. It's recovering but no work will be available for a while."

I have several thoughts which this post has brought to mind again -- things which bug me every time there is a major issue:

1 - This info is not posted on the main page.
2 - What the heck is, "...for a while"? Minutes? Hours? Days? Weeks????
3 - When a major outage occurs, users should receive an update every 8-12 hours via a post on the main page and via an RSS post. Nature abhors a vacuum and that's what we have here. "Need input!"
____________

John McLeod VII
Volunteer developer
Volunteer tester
Joined: 15 Jul 99
Posts: 24095
Credit: 517,832
RAC: 152
United States
Message 878013 - Posted: 21 Mar 2009, 20:27:26 UTC - in response to Message 877998.

"For a while," in the case of a DB recovery, typically means anywhere from an hour to 3 days.
____________


BOINC WIKI

Aurora Borealis
Volunteer tester
Joined: 14 Jan 01
Posts: 2975
Credit: 4,957,667
RAC: 1,495
Canada
Message 878018 - Posted: 21 Mar 2009, 20:38:39 UTC - in response to Message 877998.

I received the following post a couple days ago via RSS:

"Our science database crashed. It's recovering but no work will be available for a while."

I have several thoughts which this post has brought to mind again -- things which bug me every time there is a major issue:

1 - This info is not posted on the main page.
That's because it's a pain to update the main page. It's not like sending a text message. Besides, at least half the people link directly to the message boards and never go through the main page.

2 - What the heck is, "...for a while"? Minutes? Hours? Days? Weeks????
Your guess is as good as theirs. By the time they know how long it's going to take to fix, the problem has already been resolved. Then it's all a matter of how many users' caches need to be replenished before things stabilize.

3 - When a major outage occurs, users should receive an update every 8-12 hours via a post on the main page and via an RSS post. Nature abhors a vacuum and that's what we have here. "Need input!"
If BOINC is still not connecting after 8 hours, your Message tab will tell you. What else do you want to hear? "It's not fixed yet"?

____________
Questions? Answers are in the "Unofficial" BOINC Wiki.

Boinc V7.0.27
Win7 i5 3.33G 4GB, GTX470

Rick Spies
Joined: 11 Jul 99
Posts: 12
Credit: 2,840,936
RAC: 758
United States
Message 878033 - Posted: 21 Mar 2009, 21:09:37 UTC - in response to Message 878018.


If BOINC is still not connecting after 8 hours, your Message tab will tell you. What else do you want to hear? "It's not fixed yet"?


What else do I want to hear?

"The hard drive crashed. The ETA until we are up is 12 hours."
"Update: The HDD has been replaced and restored via backup."

...or...

"A contractor dug up our T3 line. We don't have an ETA for repairs. An update will be posted as more information becomes available."
"Update: The telco has a crew working on it now. We've been told the repair will be completed this afternoon."
"Update: Repairs are completed. Getting work units may be a little slow due to the heavy load following the outage."

All I'm asking for is common sense - common courtesy. Is that too much to ask?
____________

Aurora Borealis
Volunteer tester
Joined: 14 Jan 01
Posts: 2975
Credit: 4,957,667
RAC: 1,495
Canada
Message 878069 - Posted: 21 Mar 2009, 21:49:52 UTC - in response to Message 878033.
Last modified: 21 Mar 2009, 21:51:09 UTC


Anything as big as losing the main internet connection would take the forums offline (meaning you couldn't see them anyway) or would be updated on the front page. Or, as happened Friday, Matt posted from home in the Tech forum...

As for the outage, thumper crashed last night - still unsure why (heavy informix load? bad root disk drive? both?). Eric was the only one at the lab and fought with it for a while. Then around midnight I caught wind of the situation and was able to log in remotely via the network kvm and reboot the thing to a state where the OS finally came up, but all the RAIDs had to resync. Jeff wakes up super early and saw all the messages from me/Eric and took it from there. I'm about to head to a gig in Carmel so it'll be all Jeff today (with help from Eric/Bob perhaps) coaching it back into shape.

- Matt

____________

Rick Spies
Joined: 11 Jul 99
Posts: 12
Credit: 2,840,936
RAC: 758
United States
Message 878101 - Posted: 21 Mar 2009, 23:28:30 UTC - in response to Message 878069.


My samples were impromptu examples, not necessarily real-world situations.

Posting in a forum not visited by 99.9% of users is no substitute for a brief note on the front page which would also be available via RSS.

As you can see, I am not a clueless newbie. I have no idea why you, a moderator and developer, are being so defensive and flame-baiting me about such a simple request. This is my last post on this topic.

End.
____________

OzzFan
Volunteer tester
Joined: 9 Apr 02
Posts: 13541
Credit: 29,380,863
RAC: 15,518
United States
Message 878146 - Posted: 22 Mar 2009, 1:33:33 UTC

Hmmm... the first place I saw the news was on the front page, which is where I always come in.


The more time they spend updating the users, the less time they have for actually fixing the problem. Matt has already stated that putting things on the main page isn't easy.

I can understand why people would want more information. It is a courtesy to provide it. But at the same time, if they start putting up ETAs and then things fall apart, making them miss an ETA, people tend to get really upset. If they don't post a time, then there's nothing people can hold them to in case something else goes wrong.


Rick, try not to think of John and AB as "flame baiting" you, because they are not. They are simply users like yourself who are offering their own opinion. Everyone is entitled to one, even if it is in disagreement. Nothing to get upset about because someone else doesn't like your idea.
____________


Copyright © 2014 University of California