Panic Mode On (25) Server problems

Richard Haselgrove Project Donor
Volunteer tester
Joined: 4 Jul 99
Posts: 14690
Credit: 200,643,578
RAC: 874
United Kingdom
Message 936675 - Posted: 29 Sep 2009, 13:33:21 UTC - in response to Message 936656.  

I noticed early in the AM that the Cricket graph flatlined, but the pages here still worked. I believe this is because the website is on the campus network, and just the data for the project goes through the Hurricane Electric link?

I did notice that while cricket was flatlined (0.00 kbps for inbound and outbound), I had two uploads that would immediately fail. Problem went away though, and everything is smooth again after a short period of link saturation.

When we were discussing failed uploads a couple of weeks ago, somebody linked to a recent Cisco advisory recommending an operating system upgrade to forestall a possible DoS attack.

SETI uses high-spec Cisco routers on the Hurricane Electric tunnel. Applying the patch to one of those would explain the total flatline that Cosmic_Ocean noticed.
ID: 936675

Lint trap
Joined: 30 May 03
Posts: 871
Credit: 28,092,319
RAC: 0
United States
Message 936686 - Posted: 29 Sep 2009, 14:42:15 UTC - in response to Message 936656.  

The 2_3 modem is still reporting ok here: http://fragment1.berkeley.edu/newcricket/grapher.cgi?target=%2Frouter-interfaces%2Finr-250%2Fgigabitethernet2_3;view=Octets;ranges=d%3Aw

Note the display is reversed compared to the 5_1 modem, so "bits in" is the download side and "bits out" is uploads.

Martin
ID: 936686

kittyman (Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor)
Volunteer tester
Joined: 9 Jul 00
Posts: 51580
Credit: 1,018,363,574
RAC: 1,004
United States
Message 936690 - Posted: 29 Sep 2009, 14:55:51 UTC

The 5.1 Cricket graph just shot back up, so it looks like somebody rebooted whatever was causing the problem.
"Time is simply the mechanism that keeps everything from happening all at once."

ID: 936690

FiveHamlet
Joined: 5 Oct 99
Posts: 783
Credit: 32,638,578
RAC: 0
United Kingdom
Message 936738 - Posted: 29 Sep 2009, 22:58:41 UTC

Anyone else not able to report tasks?
I have about 600 to go.

ID: 936738

Claggy
Volunteer tester
Joined: 5 Jul 99
Posts: 4654
Credit: 47,537,079
RAC: 4
United Kingdom
Message 936740 - Posted: 29 Sep 2009, 23:05:06 UTC - in response to Message 936738.  

Anyone else not able to report tasks?
I have about 600 to go.


My laptop took quite a few attempts, but it only had 9 tasks to report,
so give it a bit more time for things to settle down.

Claggy
ID: 936740

FiveHamlet
Joined: 5 Oct 99
Posts: 783
Credit: 32,638,578
RAC: 0
United Kingdom
Message 936741 - Posted: 29 Sep 2009, 23:07:51 UTC - in response to Message 936740.  

Will do.
This is a bit unusual though, as it is usually uploads that do not work after an outage.
All my tasks have uploaded OK, but I am not able to report them.
ID: 936741

Reuben Gathright
Joined: 8 Mar 01
Posts: 213
Credit: 14,594,579
RAC: 0
United States
Message 936745 - Posted: 29 Sep 2009, 23:19:24 UTC

ID: 936745

Cappy [Team Musketeers]
Joined: 19 Feb 03
Posts: 18
Credit: 1,180,143
RAC: 0
United States
Message 936756 - Posted: 30 Sep 2009, 0:12:46 UTC

Yes, no reporting. Uploading fine, reporting not fine :P :P


ID: 936756

Grant (SSSF)
Volunteer tester
Joined: 19 Aug 99
Posts: 13985
Credit: 208,696,464
RAC: 304
Australia
Message 937635 - Posted: 3 Oct 2009, 23:42:17 UTC - in response to Message 936756.  


Well, it looks like the forums are back, sort of.
Still lots of red on the Server Status page & not much network traffic. Hopefully the beast will awaken fully in the not too distant future.
Grant
Darwin NT
ID: 937635

Richard Haselgrove Project Donor
Volunteer tester
Joined: 4 Jul 99
Posts: 14690
Credit: 200,643,578
RAC: 874
United Kingdom
Message 937636 - Posted: 3 Oct 2009, 23:50:40 UTC
Last modified: 3 Oct 2009, 23:51:15 UTC

Well, that was fun. Some time this morning, I was looking at a screen saying 9,999,821.19 credits (since copied to BOINCstats). Now it says 10,000,025.

So, many happy returns to me. But I fear I may have clicked 'update' once too often this morning....... ;-)
ID: 937636

HAL
Joined: 28 Mar 03
Posts: 704
Credit: 870,617
RAC: 0
United States
Message 937637 - Posted: 4 Oct 2009, 0:07:34 UTC

the reporting seems to work now - BUT requests for work say no jobs available despite what the server status says.

Classic WU= 7,237 Classic Hours= 42,079
ID: 937637

Rasputin
Volunteer tester
Joined: 13 Jun 02
Posts: 1764
Credit: 6,132,221
RAC: 0
Russia
Message 937642 - Posted: 4 Oct 2009, 0:29:48 UTC - in response to Message 937637.  

the reporting seems to work now - BUT requests for work say no jobs available despite what the server status says.


The status says all of the splitters are either disabled or not running and at the same time there's supposed to be a result creation rate of almost 25 per second.

One of them has to be wrong. No splitters equals zero creation rate.
ID: 937642

Larry256
Volunteer tester
Joined: 11 Nov 05
Posts: 25
Credit: 5,715,079
RAC: 8
United States
Message 937644 - Posted: 4 Oct 2009, 0:44:25 UTC - in response to Message 937642.  

the reporting seems to work now - BUT requests for work say no jobs available despite what the server status says.


The status says all of the splitters are either disabled or not running and at the same time there's supposed to be a result creation rate of almost 25 per second.

One of them has to be wrong. No splitters equals zero creation rate.



Look at when the stats were last updated.
ID: 937644

Bernie Vine
Volunteer moderator
Volunteer tester
Joined: 26 May 99
Posts: 9960
Credit: 103,452,613
RAC: 328
United Kingdom
Message 937645 - Posted: 4 Oct 2009, 0:45:25 UTC
Last modified: 4 Oct 2009, 0:59:20 UTC

I still feel that, however busy anybody is, it only takes seconds to post and let us all know there is a problem.

I just feel sometimes that we are an annoyance if we want to know what is happening. I have 5 PCs running SETI@home and nothing else. Please, can I at least hope to know when the problem is fixed? Is that expecting too much?

I don't expect servers to be up 24/7. I work in IT, so I do realise there can always be problems, but we are the customers of this project; without us it would not exist. So could someone other than Matt please post about the problems and solutions?

Thank you.

Bernie
ID: 937645

Berserker
Volunteer tester
Joined: 2 Jun 99
Posts: 105
Credit: 5,440,087
RAC: 0
United Kingdom
Message 937648 - Posted: 4 Oct 2009, 0:50:44 UTC

That depends where they are and what access they have from wherever they are. It's the weekend, remember - everyone deserves a break sometimes.

As for the outage, I suspect they've done the bare minimum needed to get the site functional again and will do the cleanup as/when they can. Judging by the errors and the server status, one of the DB servers may have died horribly. If so, that's going to take quite some cleaning.
Stats site - http://www.teamocuk.co.uk - still alive and (just about) kicking.
ID: 937648

Rasputin
Volunteer tester
Joined: 13 Jun 02
Posts: 1764
Credit: 6,132,221
RAC: 0
Russia
Message 937649 - Posted: 4 Oct 2009, 0:51:41 UTC - in response to Message 937644.  

the reporting seems to work now - BUT requests for work say no jobs available despite what the server status says.


The status says all of the splitters are either disabled or not running and at the same time there's supposed to be a result creation rate of almost 25 per second.

One of them has to be wrong. No splitters equals zero creation rate.



Look at when the stats were last updated.


I see one hour ago. And there were/are no disks to split.

Could this be the beginning of the month-long outage? I've never seen the status page without disks listed for splitting. Even when they've all finished splitting, they still show up on the page.
ID: 937649

Aurora Borealis
Volunteer tester
Joined: 14 Jan 01
Posts: 3075
Credit: 5,631,463
RAC: 0
Canada
Message 937650 - Posted: 4 Oct 2009, 0:59:53 UTC

Just be grateful

Nicolas posted on the Boinc Project Forum at 3 Oct 2009 20:25:19 UTC
Eric Korpela reports:
"Our database machine crashed. I can't get in to campus. Matt's nowhere to be found. Jeff's in Nepal. Dan's in South Africa. Please spread the word that we know that there's a problem."


We're lucky any of it came back up on a Saturday.

Boinc V7.2.42
Win7 i5 3.33G 4GB, GTX470
ID: 937650

Rasputin
Volunteer tester
Joined: 13 Jun 02
Posts: 1764
Credit: 6,132,221
RAC: 0
Russia
Message 937651 - Posted: 4 Oct 2009, 1:01:23 UTC

Just got three new tasks. So, the status page is behind and everything is working...slowly but working.
ID: 937651

Rasputin
Volunteer tester
Joined: 13 Jun 02
Posts: 1764
Credit: 6,132,221
RAC: 0
Russia
Message 937653 - Posted: 4 Oct 2009, 1:05:09 UTC - in response to Message 937650.  
Last modified: 4 Oct 2009, 1:16:52 UTC

Just be grateful

Nicolas posted on the Boinc Project Forum at 3 Oct 2009 20:25:19 UTC
Eric Korpela reports:
"Our database machine crashed. I can't get in to campus. Matt's nowhere to be found. Jeff's in Nepal. Dan's in South Africa. Please spread the word that we know that there's a problem."


We're lucky any of it came back up on a Saturday.


I think they can remotely log in to the servers and fix certain problems. Not positive about that, but I seem to remember Eric mentioning it one time.

Edit: Maybe it was Matt that mentioned it, or maybe it was never mentioned at all. lol. I can't always trust my memory, especially when I'm sick with pneumonia. The x-rays confirmed it on Friday, and this is the third time this year. Getting old is a biotch. Smoking doesn't help either...
ID: 937653

Bernie Vine
Volunteer moderator
Volunteer tester
Joined: 26 May 99
Posts: 9960
Credit: 103,452,613
RAC: 328
United Kingdom
Message 937655 - Posted: 4 Oct 2009, 1:08:48 UTC - in response to Message 937648.  

That depends where they are and what access they have from wherever they are. It's the weekend, remember - everyone deserves a break sometimes.

As for the outage, I suspect they've done the bare minimum needed to get the site functional again and will do the cleanup as/when they can. Judging by the errors and the server status, one of the DB servers may have died horribly. If so, that's going to take quite some cleaning.


Yes, but unless Matt posts, we never find out something was wrong. No one else ever seems to care, which is rather strange, because if we all decided to stop crunching for SETI@home there would be no results.


ID: 937655