Panic Mode On (26) Server problems

Profile arkayn
Volunteer tester
Joined: 14 May 99
Posts: 4438
Credit: 55,006,323
RAC: 0
United States
Message 948675 - Posted: 20 Nov 2009, 23:57:52 UTC

New thread

ID: 948675 · Report as offensive
Profile perryjay
Volunteer tester
Joined: 20 Aug 02
Posts: 3377
Credit: 20,676,751
RAC: 0
United States
Message 948687 - Posted: 21 Nov 2009, 0:45:16 UTC - in response to Message 948675.  

New thread?? Did something happen to the old one? Did we hurt it? Oh noes, we didn't kill it did we?


PROUD MEMBER OF Team Starfire World BOINC
ID: 948687 · Report as offensive
Profile arkayn
Volunteer tester
Joined: 14 May 99
Posts: 4438
Credit: 55,006,323
RAC: 0
United States
Message 948699 - Posted: 21 Nov 2009, 1:34:51 UTC - in response to Message 948687.  

Nope, it just went over 200 posts, and for those on dial-up that is the point where the thread starts to load slowly.

ID: 948699 · Report as offensive
Profile Bill Walker
Joined: 4 Sep 99
Posts: 3868
Credit: 2,697,267
RAC: 0
Canada
Message 948707 - Posted: 21 Nov 2009, 2:22:35 UTC

I've been working offline for a while while travelling, and came back to find this new thread. Everything seems to be working for me.

WHY ARE WE PANICKING? WHY AM I SHOUTING?

ID: 948707 · Report as offensive
Profile rebest Project Donor
Volunteer tester
Joined: 16 Apr 00
Posts: 1296
Credit: 45,357,093
RAC: 0
United States
Message 948709 - Posted: 21 Nov 2009, 2:27:05 UTC - in response to Message 948707.  

I've been working offline for a while while travelling, and came back to find this new thread. Everything seems to be working for me.

WHY ARE WE PANICKING? WHY AM I SHOUTING?


Take a chill pill, Bill.


Join the PACK!
ID: 948709 · Report as offensive
Profile Sutaru Tsureku
Volunteer tester

Joined: 6 Apr 07
Posts: 7105
Credit: 147,663,825
RAC: 5
Germany
Message 948712 - Posted: 21 Nov 2009, 2:44:05 UTC


Hey guys, keep cool.

Once the 'panic thread' reaches a certain number of posts, a new thread is started.

Nothing more, nothing less.



ID: 948712 · Report as offensive
Profile Sutaru Tsureku
Volunteer tester

Joined: 6 Apr 07
Posts: 7105
Credit: 147,663,825
RAC: 5
Germany
Message 949388 - Posted: 24 Nov 2009, 3:08:37 UTC


From the front page:

November 23, 2009
We just had a disk failure on our raw data storage device, which we're in the process of fixing, but still we'll probably run out of workunits to send out tonight.


And yes, BOINC is getting this message now: 'no jobs available'.

ID: 949388 · Report as offensive
Profile hiamps
Volunteer tester
Joined: 23 May 99
Posts: 4292
Credit: 72,971,319
RAC: 0
United States
Message 949524 - Posted: 24 Nov 2009, 22:37:18 UTC

Little slow restarting today...Can't report...
Official Abuser of Boinc Buttons...
And no good credit hound!
ID: 949524 · Report as offensive
kittyman Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 9 Jul 00
Posts: 51468
Credit: 1,018,363,574
RAC: 1,004
United States
Message 949525 - Posted: 24 Nov 2009, 22:40:40 UTC - in response to Message 949524.  
Last modified: 24 Nov 2009, 22:41:01 UTC

Little slow restarting today...Can't report...

Matt indicated in tech news that they were gonna be playing a little hanky panky with the server setup during today's outage........
So I wouldn't be surprised if something bombs......
"Freedom is just Chaos, with better lighting." Alan Dean Foster

ID: 949525 · Report as offensive
kittyman Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 9 Jul 00
Posts: 51468
Credit: 1,018,363,574
RAC: 1,004
United States
Message 949529 - Posted: 24 Nov 2009, 22:53:38 UTC

And this is part of Matt's tech news post as of a few minutes ago......so it looks like we shall have to be patient....once again...LOL.


"At the end of the day yesterday our raw data file server lost a drive. The bottom line as far as you're concerned is that we had to stop the creation of workunits until we got on top of the RAID resync issues this morning. But by then we were into our normal weekly outage, so you've been unable to get any work for a while, and will continue to not be able to do so until I start splitting up again - probably later this evening.

Meanwhile, every other part of the project is coming back online. We're testing the new mysql commit behavior (mentioned in yesterday's post). It's not looking good right out of the gate, but that may be due to mysql needing to read everything back into memory again after a bounce to pick up the configuration change. I may have to bounce it again if it continues to be a problem. I hope not, but it's no big deal either way."
"Freedom is just Chaos, with better lighting." Alan Dean Foster

ID: 949529 · Report as offensive
Profile hiamps
Volunteer tester
Joined: 23 May 99
Posts: 4292
Credit: 72,971,319
RAC: 0
United States
Message 949531 - Posted: 24 Nov 2009, 23:01:06 UTC

Weird, some of my machines can connect but this one I'm at won't...Must be time for a restart.
Official Abuser of Boinc Buttons...
And no good credit hound!
ID: 949531 · Report as offensive
kittyman Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 9 Jul 00
Posts: 51468
Credit: 1,018,363,574
RAC: 1,004
United States
Message 949532 - Posted: 24 Nov 2009, 23:04:21 UTC - in response to Message 949531.  
Last modified: 24 Nov 2009, 23:05:22 UTC

Weird, some of my machines can connect but this one I'm at won't...Must be time for a restart.

You can reboot, but it might not change anything....
Could just be a chance thing of hitting the servers at just the right moment... 3 of my 8 rigs have reported in, but I just hit the update button on a 4th to try it, and it couldn't connect.

Bandwidth is still pretty low, so it's either Matt's experimental setup not working or up to speed yet, or other problems with things resynching.
"Freedom is just Chaos, with better lighting." Alan Dean Foster

ID: 949532 · Report as offensive
Profile hiamps
Volunteer tester
Joined: 23 May 99
Posts: 4292
Credit: 72,971,319
RAC: 0
United States
Message 949533 - Posted: 24 Nov 2009, 23:06:22 UTC

Did the trick, reported 32 units.
Official Abuser of Boinc Buttons...
And no good credit hound!
ID: 949533 · Report as offensive
Profile James Sotherden
Joined: 16 May 99
Posts: 10436
Credit: 110,373,059
RAC: 54
United States
Message 949736 - Posted: 25 Nov 2009, 16:39:59 UTC
Last modified: 25 Nov 2009, 16:52:50 UTC

Hi all. I think all these outages are my fault. Every time my RAC gets up to 9k we go down, then RAC falls like a rock, then we come back on. It's this i7 sucking the life out of the servers :) Sorry!!

Edit: one thing I just did, seeing I have no work, is go to 6.10.18 on the i7. Will do that with the old P4 also.

Old James
ID: 949736 · Report as offensive
FiveHamlet
Joined: 5 Oct 99
Posts: 783
Credit: 32,638,578
RAC: 0
United Kingdom
Message 949781 - Posted: 25 Nov 2009, 20:26:47 UTC

If the server shows 241,000 results ready to send then why am I not getting any?
As of 25 Nov 2009 20:10:13 UTC
ID: 949781 · Report as offensive
kittyman Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 9 Jul 00
Posts: 51468
Credit: 1,018,363,574
RAC: 1,004
United States
Message 949782 - Posted: 25 Nov 2009, 20:31:12 UTC - in response to Message 949781.  
Last modified: 25 Nov 2009, 20:31:55 UTC

If the server shows 241,000 results ready to send then why am I not getting any?
As of 25 Nov 2009 20:10:13 UTC

Because many thousands of computers are right now clamoring for those same WUs......
The bandwidth is saturated, and even connecting to the servers is hit and miss at the moment. The fact that the bandwidth is saturated is proof that the work is going out.

Give it time, and work should start to flow normally, barring any more server crashes.
"Freedom is just Chaos, with better lighting." Alan Dean Foster

ID: 949782 · Report as offensive
1mp0£173
Volunteer tester

Joined: 3 Apr 99
Posts: 8423
Credit: 356,897
RAC: 0
United States
Message 949797 - Posted: 25 Nov 2009, 21:43:48 UTC - in response to Message 949781.  

If the server shows 241,000 results ready to send then why am I not getting any?
As of 25 Nov 2009 20:10:13 UTC

As Mark says.....

The number shown is the number of results "split" as of that time. In a sense, it represents history.

Split work is picked up by the feeder and placed in a 100-work-unit feeder queue. The scheduler can only assign what is in the feeder queue.

This serves to spread work over time when demand is high (to try to avoid the huge collisions when they all complete quickly).

... and there is always a chance that the number is a bit stale.
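
To make the feeder point concrete, here is a rough toy model, not actual BOINC code. The only figures taken from this thread are the 100-slot feeder queue and the 241,000 results ready to send; the "tick" loop, the function name simulate, and the request rate of 500 per tick are assumptions made up for illustration.

```python
FEEDER_SLOTS = 100  # feeder queue size mentioned in the post above


def simulate(ready_to_send, requests_per_tick, ticks, slots=FEEDER_SLOTS):
    """Toy model (hypothetical): each tick the feeder tops up its slots from
    the split backlog, then the scheduler answers requests from those slots
    only. Returns (served, refused, results_still_unsent)."""
    backlog = ready_to_send   # split results not yet loaded into a slot
    in_slots = 0              # results currently sitting in feeder slots
    served = 0
    refused = 0
    for _ in range(ticks):
        # Feeder: refill empty slots, limited by what is left in the backlog.
        refill = min(slots - in_slots, backlog)
        in_slots += refill
        backlog -= refill
        # Scheduler: can hand out at most what the slots hold right now.
        handed_out = min(requests_per_tick, in_slots)
        in_slots -= handed_out
        served += handed_out
        # Everyone else this tick would see "no jobs available".
        refused += requests_per_tick - handed_out
    return served, refused, backlog + in_slots


# Assumed demand: 500 requests per tick over 10 ticks.
served, refused, remaining = simulate(241_000, 500, 10)
print(served, refused, remaining)
```

In this sketch most requests are refused even though the backlog barely shrinks, which is the situation described: a large "ready to send" count and an empty-handed client can both be true at once.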
ID: 949797 · Report as offensive
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 13715
Credit: 208,696,464
RAC: 304
Australia
Message 949879 - Posted: 26 Nov 2009, 8:42:35 UTC - in response to Message 949797.  


Looks like something's broken: no inbound traffic. No uploads are going through.
Grant
Darwin NT
ID: 949879 · Report as offensive
Profile [DPC]NGS~R.Stanneveld
Volunteer tester
Joined: 20 Nov 05
Posts: 9
Credit: 1,225,991
RAC: 0
Netherlands
Message 949886 - Posted: 26 Nov 2009, 11:02:40 UTC
Last modified: 26 Nov 2009, 11:04:35 UTC

Same here, and all my Nvidia cards are out of work :(
ID: 949886 · Report as offensive
Profile hiamps
Volunteer tester
Joined: 23 May 99
Posts: 4292
Credit: 72,971,319
RAC: 0
United States
Message 949889 - Posted: 26 Nov 2009, 12:40:19 UTC

Yep, looks like a Thanksgiving day server crash, though the server page shows all well. Doubt anyone will be in today. Wonder if they can do anything remotely?
Official Abuser of Boinc Buttons...
And no good credit hound!
ID: 949889 · Report as offensive


 
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.