Project Back Online After Overnight Outage


Wiggo
Joined: 24 Jan 00
Posts: 6911
Credit: 94,166,238
RAC: 75,465
Australia
Message 1196783 - Posted: 18 Feb 2012, 10:30:48 UTC - in response to Message 1196760.

"You know, when we have a server crash or an outage at Home Depot, if i took this long to get the system back up, or properly notify the users of staging developments, I'd get fired..."

I'd hope so.
It's a business, that's why they pay all the money for that hardware & support.
This isn't, they rely on donations. If you want them to have 24/7 up time & notifications of what's going on, how about you providing all the hardware & money required to support it?

Plus the people working on the setup only work on it part-time, unless you want to pay for a full-time person to baby-sit it all.

Cheers.
____________

archangel
Joined: 25 Apr 01
Posts: 62
Credit: 1,840,710
RAC: 0
United States
Message 1196821 - Posted: 18 Feb 2012, 14:03:42 UTC - in response to Message 1196760.
Last modified: 18 Feb 2012, 14:21:59 UTC

"If you want them to have 24/7 up time & notifications of what's going on, how about you providing all the hardware & money required to support it?"

That would be a web enabled smartphone, right?

All they need to do is set up a log file, and monitor it with trace32.exe.

Set up ping over time to each server, an outbound and inbound bandwidth ping set for 20k packets, and have the scheduler log delivered WU's to the log as well.

Then, with a smartphone you could monitor the log, see any interruption to any server, any interruption to the outbound WU's, any bandwidth constraints or interruptions to the network, and from home, make a post online detailing which server is down, and/or what the problem is.
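
A rough sketch of the check-and-log idea above, written in Python rather than with a Windows utility. It assumes a plain TCP connection attempt can stand in for the ping (ICMP usually needs elevated privileges), and every host name and file path here is a placeholder, not one of the project's real servers:

```python
import socket
import time
from typing import Callable, Iterable, List


def tcp_probe(host: str, port: int = 80, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port opens within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and failed DNS lookups.
        return False


def check_servers(
    hosts: Iterable[str],
    probe: Callable[[str], bool] = tcp_probe,
    now: Callable[[], float] = time.time,
) -> List[str]:
    """Probe each host once and return one timestamped log line per host."""
    lines = []
    for host in hosts:
        status = "UP" if probe(host) else "DOWN"
        stamp = time.strftime("%Y-%m-%d %H:%M:%S", time.gmtime(now()))
        lines.append(f"{stamp} UTC {host} {status}")
    return lines


def append_to_log(path: str, lines: Iterable[str]) -> None:
    """Append the given lines to a text log that can be read remotely."""
    with open(path, "a", encoding="utf-8") as log:
        for line in lines:
            log.write(line + "\n")
```

Run from a scheduled task every few minutes, something like `append_to_log("status.log", check_servers(["upload.example.org", "scheduler.example.org"]))` would build up the log to watch. Bandwidth and delivered-WU counts are left out because they depend on data only the servers themselves have.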

You could even spring for a laptop if you wanted to spend some real money, and remote the servers or the monitoring PC.

Course, now we are talking nearly $600.

I'll spring for that though, if they are short.

8)
____________

Slavac
Volunteer tester
Joined: 27 Apr 11
Posts: 1932
Credit: 17,952,639
RAC: 0
United States
Message 1196822 - Posted: 18 Feb 2012, 14:13:53 UTC - in response to Message 1196821.

"If you want them to have 24/7 up time & notifications of what's going on, how about you providing all the hardware & money required to support it?"

That would be a web enabled smartphone, right?

K, i'll spring for that 8)


Not quite. Here's the dent our donors have made in the past few months:

http://gpuug.org/purchases
____________


Executive Director GPU Users Group Inc. -
brad@gpuug.org

archangel
Joined: 25 Apr 01
Posts: 62
Credit: 1,840,710
RAC: 0
United States
Message 1196823 - Posted: 18 Feb 2012, 14:28:37 UTC - in response to Message 1196822.
Last modified: 18 Feb 2012, 14:37:21 UTC

Not quite. Here's the dent our donors have made in the past few months:

http://gpuug.org/purchases



I don't see what any of that has to do with providing a timely update on outages.

Sure, a nice new server would be great and all, but i think the effort required to get notices out in and of itself would be minimal and doable even for someone sitting at home on a sofa, just using windows utilities.

After seeing Seti was back up and WU's were getting through, i took the time to go around to my PC's and revert them back over to Seti from E@home...

When you have 9 computers, that takes about half an hour. To set them all back takes another half hour...

A timely update could have saved me that aggravation, that's all I'm saying.

You could even set up an alarm on trace32 monitoring the log to send a high priority email alert to a distribution list, or, if you are *super* lazy, you could set it up to post an alert to the message board itself...

Course, if you had a repeating failure, like an outage, that could result in spam posts, so probably best to have it alert a distribution list.
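
One way to keep a repeating failure from turning into spam is to throttle alerts per server. This is only a sketch under assumptions: the one-hour interval and the server name are made up, and actually delivering the email (for example with Python's `smtplib`) is left out:

```python
from typing import Dict


class AlertThrottle:
    """Allow at most one alert per server within a minimum interval."""

    def __init__(self, min_interval_s: float = 3600.0) -> None:
        self.min_interval_s = min_interval_s
        # Server name -> time (in seconds) the last alert was sent.
        self._last_sent: Dict[str, float] = {}

    def should_alert(self, server: str, is_down: bool, now: float) -> bool:
        """Return True if an alert should go out for this observation."""
        if not is_down:
            # Recovery clears the record so the next outage alerts at once.
            self._last_sent.pop(server, None)
            return False
        last = self._last_sent.get(server)
        if last is not None and now - last < self.min_interval_s:
            # Still inside the quiet period for this outage.
            return False
        self._last_sent[server] = now
        return True
```

Each time the monitor sees a server down it would call `should_alert(...)` and only mail the distribution list when the result is `True`, so a long outage produces one message per interval instead of one per check.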
____________

Michel448a
Volunteer tester
Joined: 27 Oct 00
Posts: 1201
Credit: 2,891,635
RAC: 0
Canada
Message 1196828 - Posted: 18 Feb 2012, 15:06:16 UTC - in response to Message 1196823.
Last modified: 18 Feb 2012, 16:05:20 UTC

oh ya definitely, users could use more ... "care" from the project admins and sub-admins.

we aren't paid to do that, it's even us who are paying :P. everything works with donations, and every user puts money into their systems/networks to crunch harder and faster ^^
____________

Bill Walker
Joined: 4 Sep 99
Posts: 3372
Credit: 2,070,151
RAC: 2,250
Canada
Message 1196888 - Posted: 18 Feb 2012, 17:15:59 UTC

This is a science project. Right now, there are more volunteers than there are WUs, on a regular basis.

The project is short of resources, and has to use the available resources where it helps the science. Timely updates on outages may make us feel good, but it does NOTHING for the science. So, the resources (including our donations) go elsewhere.
____________

archangel
Joined: 25 Apr 01
Posts: 62
Credit: 1,840,710
RAC: 0
United States
Message 1197012 - Posted: 18 Feb 2012, 21:32:05 UTC - in response to Message 1196891.

Ah, well my computers are across 2 sites, i have to drive to work.

I'll look into the configuration you suggested and see if i can figure it out, when i added E@home on mine, it downloaded like 15 E@H tasks and 1 S@H task, so i had to remove the E@H, because i was up against space limitations...

Thought that was why i wasn't getting S@H WU's, because E@H was hogging all the drive space.

But of course, turns out it was just another outage... :)
____________

buzzard7
Joined: 20 Jul 11
Posts: 1
Credit: 419,890
RAC: 0
United States
Message 1197039 - Posted: 18 Feb 2012, 22:11:29 UTC

I'm not getting any work out of the scheduler. Anybody else having trouble?

cliff
Joined: 16 Dec 07
Posts: 322
Credit: 2,509,590
RAC: 0
United Kingdom
Message 1197204 - Posted: 19 Feb 2012, 5:59:27 UTC - in response to Message 1197040.

"Everybody is... check the Panic Mode and other threads in Number Crunching.
That's the first place to check if you are having issues."

I see that the work creation rate has gone back up from 0.nnn to 6.nnn when I had a look at the server status page, but available work is zero..

Either it's all being shoveled out as soon as it's created or whatever's created is going into a black hole..

At any rate I aint getting no satisfaction:-/ Logs show endless 'server has no work available' since yesterday afternoon..

Regards

____________
Cliff,
Been there, Done that, Still no damm T shirt!

cliff
Joined: 16 Dec 07
Posts: 322
Credit: 2,509,590
RAC: 0
United Kingdom
Message 1197206 - Posted: 19 Feb 2012, 6:04:05 UTC

Arrrghhhhh.
7 WU on server. did update.. got 'no work' again..

There is some sort of skynet conspiracy against my rig.. or it's Murphy's 3rd law again:-/

Cheers
____________
Cliff,
Been there, Done that, Still no damm T shirt!

frankysun
Joined: 9 Feb 12
Posts: 1
Credit: 13,190
RAC: 0
China
Message 1197218 - Posted: 19 Feb 2012, 8:44:52 UTC

Thank you. I'm very glad.

rob smith
Project donor
Volunteer tester
Joined: 7 Mar 03
Posts: 8366
Credit: 56,387,041
RAC: 77,922
United Kingdom
Message 1197229 - Posted: 19 Feb 2012, 9:33:26 UTC - in response to Message 1197206.

Arrrghhhhh.
7 WU on server. did update.. got 'no work' again..

There is some sort of skynet conspiracy against my rig.. or it's Murphy's 3rd law again:-/

Cheers


Murphy has a lot to answer for...
____________
Bob Smith
Member of Seti PIPPS (Pluto is a Planet Protest Society)
Somewhere in the (un)known Universe?

ivan
Volunteer tester
Joined: 5 Mar 01
Posts: 608
Credit: 137,763,429
RAC: 148,981
United Kingdom
Message 1197304 - Posted: 19 Feb 2012, 15:08:41 UTC - in response to Message 1197229.

Arrrghhhhh.
7 WU on server. did update.. got 'no work' again..

There is some sort of skynet conspiracy against my rig.. or it's Murphy's 3rd law again:-/

Cheers


Murphy has a lot to answer for...

Don't shoot the messenger! Murphy merely observed that a connector in an aerospace project had no way to prevent its being plugged in the wrong way around -- so it was plugged in the wrong way around. "If anything can go wrong, it will." My understanding is that this was the genesis of the slightly asymmetric Cannon "D" connectors (e.g. the 15-pin VGA connector, and 25-pin parallel printer connector) and derivatives.
____________

wayner11
Joined: 29 Dec 00
Posts: 2
Credit: 312,214
RAC: 344
Canada
Message 1197321 - Posted: 19 Feb 2012, 16:19:42 UTC - in response to Message 1197039.

Haven't been able to get any work in days.
Wayne
____________

cliff
Joined: 16 Dec 07
Posts: 322
Credit: 2,509,590
RAC: 0
United Kingdom
Message 1197323 - Posted: 19 Feb 2012, 16:22:49 UTC - in response to Message 1197304.

Hi Ivan,
When Murphy is working hand in glove with both the Gremlins & Skynet, it's time to send a message.. 1 swift TLAM would be ideal:-)
Gotta get the Gremlins to go back to Dimension 'n'...

Cheers,
____________
Cliff,
Been there, Done that, Still no damm T shirt!

cliff
Joined: 16 Dec 07
Posts: 322
Credit: 2,509,590
RAC: 0
United Kingdom
Message 1197327 - Posted: 19 Feb 2012, 16:30:59 UTC - in response to Message 1197229.

Hi Bob,
Yup, reckon he has:-) Still, on the bright side it's amazing how many WU are getting validated ASAP.

Although I'm sure there are some folk out there who don't even know there's a problem, I have a couple of wingmen who haven't been back to the servers since the 10th of last month.. Ignorance is bliss:-)

Cheers
____________
Cliff,
Been there, Done that, Still no damm T shirt!

JB
Volunteer tester
Joined: 21 Jul 09
Posts: 46
Credit: 7,891,307
RAC: 0
Germany
Message 1197370 - Posted: 19 Feb 2012, 18:00:24 UTC

Hi to some guys,

After dark I reminded myself that this thread was opened by Matt some days ago.

Called: News\Project Back Online After Overnight Outage

Now I have learned in this thread about Gremlins, Murphy's Law, Skynet (greetings to TERMINATOR), outer space... and so on. Great!!!

What do you think about moving the posts above to the Cafe SETI for further small talk?

I'm still waiting for serious new posts in this thread about resolving the missing downloads of new MBs and APs.

JB





Copyright © 2014 University of California