Panic Mode On (26) Server problems


Message boards : Number crunching : Panic Mode On (26) Server problems

arkayn
Project donor
Volunteer tester
Joined: 14 May 99
Posts: 3615
Credit: 48,248,408
RAC: 35,807
United States
Message 948675 - Posted: 20 Nov 2009, 23:57:52 UTC

New thread
____________

perryjay
Volunteer tester
Joined: 20 Aug 02
Posts: 3377
Credit: 15,209,154
RAC: 11,804
United States
Message 948687 - Posted: 21 Nov 2009, 0:45:16 UTC - in response to Message 948675.

New thread?? Did something happen to the old one? Did we hurt it? Oh noes, we didn't kill it did we?
____________


PROUD MEMBER OF Team Starfire World BOINC

arkayn
Project donor
Volunteer tester
Joined: 14 May 99
Posts: 3615
Credit: 48,248,408
RAC: 35,807
United States
Message 948699 - Posted: 21 Nov 2009, 1:34:51 UTC - in response to Message 948687.

Nope, it just went over 200 posts, and that is the point where a thread starts slowing down for those on dial-up.
____________

Bill Walker
Joined: 4 Sep 99
Posts: 3346
Credit: 2,021,688
RAC: 2,097
Canada
Message 948707 - Posted: 21 Nov 2009, 2:22:35 UTC

I've been working offline for a while while travelling, and came back to find this new thread. Everything seems to be working for me.

WHY ARE WE PANICKING? WHY AM I SHOUTING?
____________

rebest
Project donor
Volunteer tester
Joined: 16 Apr 00
Posts: 1296
Credit: 32,187,764
RAC: 9,498
United States
Message 948709 - Posted: 21 Nov 2009, 2:27:05 UTC - in response to Message 948707.

I've been working offline for a while while travelling, and came back to find this new thread. Everything seems to be working for me.

WHY ARE WE PANICKING? WHY AM I SHOUTING?


Take a chill pill, Bill.

____________

Join the PACK!

[seti.international] Dirk Sadowski
Project donor
Volunteer tester
Joined: 6 Apr 07
Posts: 7047
Credit: 59,793,526
RAC: 21,946
Germany
Message 948712 - Posted: 21 Nov 2009, 2:44:05 UTC


Hey guys, keep cool.

Once the 'panic thread' reaches a certain number of posts, a new thread is started.

Nothing more, nothing less.



____________
BR



>Das Deutsche Cafe. The German Cafe.<

[seti.international] Dirk Sadowski
Project donor
Volunteer tester
Joined: 6 Apr 07
Posts: 7047
Credit: 59,793,526
RAC: 21,946
Germany
Message 949388 - Posted: 24 Nov 2009, 3:08:37 UTC


From the front page:

November 23, 2009
We just had a disk failure on our raw data storage device, which we're in the process of fixing, but still we'll probably run out of workunits to send out tonight.


And yes, BOINC is getting this message now: 'no jobs available'.

____________
BR



>Das Deutsche Cafe. The German Cafe.<

hiamps
Volunteer tester
Joined: 23 May 99
Posts: 4292
Credit: 72,971,319
RAC: 0
United States
Message 949524 - Posted: 24 Nov 2009, 22:37:18 UTC

Little slow restarting today...Can't report...
____________
Official Abuser of Boinc Buttons...
And no good credit hound!

msattler
Project donor
Volunteer tester
Joined: 9 Jul 00
Posts: 38685
Credit: 573,751,671
RAC: 544,308
United States
Message 949525 - Posted: 24 Nov 2009, 22:40:40 UTC - in response to Message 949524.
Last modified: 24 Nov 2009, 22:41:01 UTC

Little slow restarting today...Can't report...

Matt indicated in tech news that they were gonna be playing a little hanky panky with the server setup during today's outage........
So I wouldn't be surprised if something bombs......
____________
*********************************************
Embrace your inner kitty...ya know ya wanna!

I have met a few friends in my life.
Most were cats.

msattler
Project donor
Volunteer tester
Joined: 9 Jul 00
Posts: 38685
Credit: 573,751,671
RAC: 544,308
United States
Message 949529 - Posted: 24 Nov 2009, 22:53:38 UTC

And this is part of Matt's tech news post as of a few minutes ago......so it looks like we shall have to be patient....once again...LOL.


"At the end of the day yesterday our raw data file server lost a drive. The bottom line as far as you're concerned is that we had to stop the creation of workunits until we got on top of the RAID resync issues this morning. But by then we were into our normal weekly outage, so you've been unable to get any work for a while, and will continue to not be able to do so until I start splitting up again - probably later this evening.

Meanwhile, every other part of the project is coming back online. We're testing the new mysql commit behavior (mentioned in yesterday's post). It's not looking good right out of the gate, but that may be due to mysql needing to read everything back into memory again after a bounce to pick up the configuration change. I may have to bounce it again if it continues to be a problem. I hope not, but it's no big deal either way."
____________
*********************************************
Embrace your inner kitty...ya know ya wanna!

I have met a few friends in my life.
Most were cats.

hiamps
Volunteer tester
Joined: 23 May 99
Posts: 4292
Credit: 72,971,319
RAC: 0
United States
Message 949531 - Posted: 24 Nov 2009, 23:01:06 UTC

Weird, some of my machines can connect but this one I'm at won't...Must be time for a restart.
____________
Official Abuser of Boinc Buttons...
And no good credit hound!

msattler
Project donor
Volunteer tester
Joined: 9 Jul 00
Posts: 38685
Credit: 573,751,671
RAC: 544,308
United States
Message 949532 - Posted: 24 Nov 2009, 23:04:21 UTC - in response to Message 949531.
Last modified: 24 Nov 2009, 23:05:22 UTC

Weird, some of my machines can connect but this one I'm at won't...Must be time for a restart.

You can reboot, but it might not change anything....
Could just be a chance thing of hitting the servers at just the right moment... 3 of my 8 rigs have reported in, but I just hit the update button on a 4th to try it, and it couldn't connect.

Bandwidth is still pretty low, so either Matt's experimental setup isn't working or isn't up to speed yet, or there are other problems with things resyncing.
____________
*********************************************
Embrace your inner kitty...ya know ya wanna!

I have met a few friends in my life.
Most were cats.

hiamps
Volunteer tester
Joined: 23 May 99
Posts: 4292
Credit: 72,971,319
RAC: 0
United States
Message 949533 - Posted: 24 Nov 2009, 23:06:22 UTC

Did the trick, reported 32 units.
____________
Official Abuser of Boinc Buttons...
And no good credit hound!

James Sotherden
Project donor
Joined: 16 May 99
Posts: 8651
Credit: 32,619,418
RAC: 54,045
United States
Message 949736 - Posted: 25 Nov 2009, 16:39:59 UTC
Last modified: 25 Nov 2009, 16:52:50 UTC

Hi all. I think all these outages are my fault. Every time my RAC gets up to 9k we go down, then RAC falls like a rock, then we come back on. It's this i7 sucking the life out of the servers :) Sorry!!

Edit: one thing I just did, seeing I have no work, is go to 6.10.18 on the i7. Will do that with the old P4 also.
____________

Old James

FiveHamlet
Joined: 5 Oct 99
Posts: 783
Credit: 32,638,578
RAC: 0
United Kingdom
Message 949781 - Posted: 25 Nov 2009, 20:26:47 UTC

If the server shows 241,000 results ready to send then why am I not getting any?
As of 25 Nov 2009 20:10:13 UTC
____________

msattler
Project donor
Volunteer tester
Joined: 9 Jul 00
Posts: 38685
Credit: 573,751,671
RAC: 544,308
United States
Message 949782 - Posted: 25 Nov 2009, 20:31:12 UTC - in response to Message 949781.
Last modified: 25 Nov 2009, 20:31:55 UTC

If the server shows 241,000 results ready to send then why am I not getting any?
As of 25 Nov 2009 20:10:13 UTC

Because many thousands of computers are right now clamoring for those same WUs......
The bandwidth is saturated, and even connecting to the servers is hit and miss at the moment. The fact that the bandwidth is saturated is proof that the work is going out.

Give it time, and work should start to flow normally, barring any more server crashes.
____________
*********************************************
Embrace your inner kitty...ya know ya wanna!

I have met a few friends in my life.
Most were cats.

1mp0£173
Volunteer tester
Joined: 3 Apr 99
Posts: 8423
Credit: 356,897
RAC: 0
United States
Message 949797 - Posted: 25 Nov 2009, 21:43:48 UTC - in response to Message 949781.

If the server shows 241,000 results ready to send then why am I not getting any?
As of 25 Nov 2009 20:10:13 UTC

As Mark says.....

The number is the number "split" as of that time. In a sense, it represents history.

Split work is picked up by the feeder and placed in a 100-work-unit feeder queue. The scheduler can only assign what it has in the feeder queue.

This serves to spread work over time when demand is high (to try to avoid the huge collisions when they all complete quickly).

... and there is always a chance that the number is a bit stale.
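
To make the feeder-queue point concrete, here is a toy sketch in Python. It is not BOINC's actual code: the class and function names, the 50 hosts, and the 5 results per request are all made up for illustration. Only the 100-slot queue and the 241,000 "ready to send" figure come from the posts above. It shows how a big backlog can sit behind a small queue, so some requests come back empty until the feeder refills.

```python
from collections import deque

FEEDER_SLOTS = 100  # queue size quoted in the post above


class Feeder:
    """Toy model: moves split work from a large backlog into a small fixed-size queue."""

    def __init__(self, backlog, slots=FEEDER_SLOTS):
        self.backlog = deque(backlog)  # everything counted as "ready to send"
        self.slots = slots
        self.queue = deque()           # the only work the scheduler can see

    def refill(self):
        """Top the queue up from the backlog; return how many results were moved."""
        moved = 0
        while len(self.queue) < self.slots and self.backlog:
            self.queue.append(self.backlog.popleft())
            moved += 1
        return moved


def schedule(feeder, requested):
    """Assign up to `requested` results, limited to what sits in the feeder queue."""
    granted = []
    while feeder.queue and len(granted) < requested:
        granted.append(feeder.queue.popleft())
    return granted


def simulate(ready_to_send=241_000, hosts=50, per_host=5):
    """Hypothetical burst: `hosts` requests arrive between two feeder refills."""
    feeder = Feeder(range(ready_to_send))
    feeder.refill()
    grants = [len(schedule(feeder, per_host)) for _ in range(hosts)]
    return grants, len(feeder.backlog)


if __name__ == "__main__":
    grants, backlog = simulate()
    print("hosts that got work:", sum(1 for g in grants if g))
    print("hosts that got nothing:", grants.count(0))
    print("still waiting in the backlog:", backlog)
```

In this made-up burst the first hosts drain the queue and the rest get nothing, even though almost the whole backlog is still waiting to be fed in.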
____________

Grant (SSSF)
Joined: 19 Aug 99
Posts: 5774
Credit: 57,558,823
RAC: 48,374
Australia
Message 949879 - Posted: 26 Nov 2009, 8:42:35 UTC - in response to Message 949797.


Looks like something's broken: no inbound traffic. No uploads are going through.
____________
Grant
Darwin NT.

T. Bell
Joined: 27 Mar 06
Posts: 20
Credit: 229,303
RAC: 0
Australia
Message 949882 - Posted: 26 Nov 2009, 9:26:38 UTC
Last modified: 26 Nov 2009, 9:28:21 UTC

It keeps backing off on the upload over here in Australia. Have also restarted BOINC to no avail. Yet Server Status says Upload server is running?? :-(

-- Tom :)
____________

[DPC]NGS~R.Stanneveld
Volunteer tester
Joined: 20 Nov 05
Posts: 9
Credit: 1,225,991
RAC: 0
Netherlands
Message 949886 - Posted: 26 Nov 2009, 11:02:40 UTC
Last modified: 26 Nov 2009, 11:04:35 UTC

Same here, and all my Nvidia cards are out of work :(
____________


Copyright © 2014 University of California