Storm (Oct 14 2009)

Message boards : Technical News : Storm (Oct 14 2009)

Matt Lebofsky
Volunteer moderator
Project administrator
Project developer
Project scientist
Joined: 1 Mar 99
Posts: 1444
Credit: 957,058
RAC: 0
United States
Message 939902 - Posted: 14 Oct 2009, 17:51:33 UTC

Got finished kinda late yesterday, hence the lack of a tech news report. So last week we had lots of database cleanup to deal with due to server crashes over the preceding weekend. The MySQL replica database suffered quite a bit, so we planned to recover it using a standard MySQL dump file, except we discovered that the latest version of MySQL is buggy and the dump files sometimes contain syntax errors. Great.

So this week we recovered the replica by copying all the MyISAM and InnoDB data files (and logs) from mork (the master database server). We actually did the first rsync on Monday to help speed things along on Tuesday, but there are so many large files that it took forever even to do just the "delta" rsync. That's why the outage was so long (this final rsync could only happen when the master database was quiescent).
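
For anyone curious, the two-pass copy described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the commands actually run: the data directory `/var/lib/mysql`, the destination `replica-host`, the rsync flags, and the helper names are all hypothetical. The idea is that the first pass moves the bulk of the data while the master is still live, and the second pass, run once the master is quiescent, only has to transfer what changed in between.

```python
"""Sketch of a two-pass rsync plan for seeding a MySQL replica.

Assumptions (not from the original post): the data directory path,
the destination host name, and the exact rsync flags are hypothetical.
"""


def rsync_command(src_dir, dest, dry_run=False):
    """Build an rsync argument list that mirrors src_dir into dest."""
    cmd = ["rsync", "-a", "--delete"]  # archive mode; drop files gone from source
    if dry_run:
        cmd.append("--dry-run")  # report what would be copied without copying
    # Trailing slash: copy the directory's contents, not the directory itself.
    cmd += [src_dir.rstrip("/") + "/", dest]
    return cmd


def two_pass_plan(src_dir, dest):
    """Return (label, command) pairs for the bulk pass and the delta pass."""
    return [
        ("bulk copy while the master is still running", rsync_command(src_dir, dest)),
        ("delta copy after the master is quiescent", rsync_command(src_dir, dest)),
    ]


if __name__ == "__main__":
    # Hypothetical paths; this only prints the plan, nothing is executed.
    for label, cmd in two_pass_plan("/var/lib/mysql", "replica-host:/var/lib/mysql/"):
        print(f"{label}: {' '.join(cmd)}")
```

Both passes use the same command. By default rsync skips files whose size and modification time are unchanged, but any large data file that did change still has to be read and compared on both ends, which is consistent with the delta pass taking so long.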

This morning Bob and I made sure all the right config tweaks were made in /etc/my.cnf and started up the replica server. Only one minor snag at first, which we fixed; now it's running again and catching up! We still have to figure out how to get mysqldump to work 100% of the time, syntax-error-free. That's actually kind of scary.

Meanwhile, the Bay Area was hit with a record-breaking storm yesterday. Yeah, I grew up in New York, so I can say it was only really a "storm" by Bay Area standards, but still we had low temperatures and high humidity. This wreaks havoc on our air conditioner, and the server closet has been hovering a few degrees away from disaster for a couple of days now. In fact, ewen (Eric's hydrogen survey server) just lost a drive in the root RAID. It shouldn't be a big deal to replace, except that when ewen goes down for maintenance everything hangs (as there are lots of Informix libs living on that system; we really need to move them off but have been loath to do so for fear of breaking something else).

- Matt

-- BOINC/SETI@home network/web/science/development person
-- "Any idiot can have a good idea. What is hard is to do it." - Jeanne-Claude
ID: 939902
Neil Blaikie
Volunteer tester
Joined: 17 May 99
Posts: 143
Credit: 6,652,341
RAC: 0
Canada
Message 939918 - Posted: 14 Oct 2009, 19:30:07 UTC - in response to Message 939902.  

Dump some of the weather towards Montreal, Canada please. It is cold, gray, miserable and actually snowed for a few minutes earlier today. Warmish and wet is better than gray and cold!

Hope everything gets back to "normal" soon and you figure out the mysqldump issue; sounds like it is going to be a pain in the proverbial ass though.

As always thank you for taking your precious time to keep us informed and also thank you to Bob as well for your efforts.

Keep up the good work
ID: 939918
.clair.
Joined: 4 Nov 04
Posts: 1300
Credit: 55,390,408
RAC: 69
United Kingdom
Message 939940 - Posted: 14 Oct 2009, 20:39:11 UTC

This is old news for some but I just discovered it.
Over 300 known bugs, some serious. The URL says it all.
I didn't read through all of it; it's a bit long and above my level.
Plenty of things to "tidy up" :)

http://monty-says.blogspot.com/2008/11/oops-we-did-it-again-mysql-51-released.html
ID: 939940
P. J. Crabtree
Joined: 17 Jan 07
Posts: 22
Credit: 1,847,766
RAC: 0
United States
Message 939967 - Posted: 14 Oct 2009, 21:59:43 UTC

So, it was pretty much a typical outage day.
ID: 939967
DJStarfox
Joined: 23 May 01
Posts: 1066
Credit: 1,226,053
RAC: 2
United States
Message 939995 - Posted: 14 Oct 2009, 23:41:01 UTC - in response to Message 939902.  

So, the replica DB is still disabled until it can catch up to the master? You got a pair of fiber cards between those two servers yet?

A reliable, online backup mechanism (binary) for a production database should be mandatory. Maybe MySQL 6.0? Or a commercial product?
http://www.arkeia.com/en/products/arkeia-network-backup/backup-agent/database-agent/mysql-agent
ID: 939995
52 Aces
Joined: 7 Jan 02
Posts: 497
Credit: 14,261,068
RAC: 67
United States
Message 940009 - Posted: 15 Oct 2009, 1:15:22 UTC - in response to Message 939902.  

Will you set Boinc "show_results" back to TRUE any time soon?

Thx so much !
ID: 940009
John McLeod VII
Volunteer developer
Volunteer tester
Joined: 15 Jul 99
Posts: 24806
Credit: 790,712
RAC: 0
United States
Message 940016 - Posted: 15 Oct 2009, 2:02:27 UTC - in response to Message 939995.  

So, the replica DB is still disabled until it can catch up to the master? You got a pair of fiber cards between those two servers yet?

A reliable, online backup mechanism (binary) for a production database should be mandatory. Maybe MySQL 6.0? Or a commercial product?
http://www.arkeia.com/en/products/arkeia-network-backup/backup-agent/database-agent/mysql-agent

Online backup systems cost money - SETI has none of that.


BOINC WIKI
ID: 940016
Acct Closed
Joined: 14 Oct 09
Posts: 3
Credit: 14,617
RAC: 0
Message 940022 - Posted: 15 Oct 2009, 2:22:09 UTC - in response to Message 940009.  

Will you set Boinc "show_results" back to TRUE any time soon?

Thx so much !


With this feature turned off, we have a lot more internet bandwidth for the Seti@Home project to do uploads, downloads, scheduler requests, etc. I'm wondering why we need this feature turned back on since Boinc was designed to be simply set and left alone to run. My long-time Seti friends have not had any network dropouts since this feature was turned off.

What is the value of being able to parse through one's or someone else's tasks?
ID: 940022
Zeus Fab3r
Joined: 17 Jan 01
Posts: 649
Credit: 275,335,635
RAC: 597
Serbia
Message 940025 - Posted: 15 Oct 2009, 2:42:15 UTC - in response to Message 940022.  

What is the value of being able to parse through one's or someone else's tasks?


...high-end MoBo...300$, latest CPU ...974$,
mindblowing watercooled VGA ...700$ ...
Chance to see how, or if it works as it should, compared to others...priceless :)

Who the hell is General Failure and why is he reading my harddisk?¿
ID: 940025
52 Aces
Joined: 7 Jan 02
Posts: 497
Credit: 14,261,068
RAC: 67
United States
Message 940042 - Posted: 15 Oct 2009, 3:20:46 UTC - in response to Message 940022.  

With this feature turned off, we have a lot more internet bandwidth for the Seti@Home project to do uploads, downloads, scheduler requests, etc.


A lot? Not quite ;-) Results traffic (wire bandwidth consumed) is pretty minor --- you'd probably have to look at 250 pages worth of results (in under 5 seconds) to equal 1 potential WU download in bandwidth (I say 'potential', because bandwidth isn't banked -- if it goes unused this second it's gone forever). When people can't download a WU, it's because there are none to begin with or the system is swamped by things OTHER than Result serving (ditto uploads, scheduling, etc).
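
To put rough numbers on that comparison, here is a back-of-the-envelope sketch. The byte sizes are assumptions chosen purely for illustration (only the 250-to-1 ratio comes from the paragraph above, not actual sizes), and the function name is hypothetical.

```python
def pages_per_workunit(workunit_bytes, result_page_bytes):
    """How many results pages add up to the bandwidth of one WU download."""
    if result_page_bytes <= 0:
        raise ValueError("result_page_bytes must be positive")
    return workunit_bytes / result_page_bytes


if __name__ == "__main__":
    # Assumed sizes for illustration only: a 375 kB workunit and 1.5 kB of
    # results data per page on the wire reproduce the 250:1 ratio above.
    ratio = pages_per_workunit(375_000, 1_500)
    print(f"{ratio:.0f} results pages ~ 1 WU download")
```

Plugging in measured sizes instead of the assumed ones would show whether the 250-page estimate holds for a given setup.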

In fact, for those of us reading here, the Forums with those signature images & avatars on practically every message of every thread are a greater overall bandwidth hit. But aesthetics & e-awards have their values too.

Had any of these been genuine culprits in the past, it would be pretty easy to throttle both with a QoS pushback for the sake of production traffic (vs. status and other website traffic such as the streaming videos). It might even already be there, just never seen because it's not an issue: on average the system handles and balances it all (aside from the self-imposed post-maintenance storm every Tuesday). But yes, a bigger newer pipe would be grand for far more than speedier results ;-)

Cheers
ID: 940042
Pappa
Volunteer tester
Joined: 9 Jan 00
Posts: 2562
Credit: 12,301,681
RAC: 0
United States
Message 940043 - Posted: 15 Oct 2009, 3:30:05 UTC - in response to Message 940022.  
Last modified: 15 Oct 2009, 3:32:26 UTC

Will you set Boinc "show_results" back to TRUE any time soon?

Thx so much !


With this feature turned off, we have a lot more internet bandwidth for the Seti@Home project to do uploads, downloads, scheduler requests, etc. I'm wondering why we need this feature turned back on since Boinc was designed to be simply set and left alone to run. My long-time Seti friends have not had any network dropouts since this feature was turned off.

What is the value of being able to parse through one's or someone else's tasks?


Nikos

What people see when they visit the Forums or look at their Account page is on a separate link. It does not interfere with Uploads/Downloads to Seti. The Pending Credit or Tasks information is normally drawn from the Replica Boinc Database.

What is in their Account page, attacks the Boinc Replica Database. Dropouts would "normally" be associated with problems accessing the Boinc Master Database. Most of that has been cleared.

So, what is the value of being able to look at a Workunit that someone had a problem with? It is important for Users Helping Users: the ability to see what happened or did not happen. This also goes for the Lunatics Crew, to see what was broken. It comes down to access to the information.

A Large number of people try to Help. When they do not have access to the information it is very tough.

Regards
Please consider a Donation to the Seti Project.

ID: 940043
Jord
Volunteer tester
Joined: 9 Jun 99
Posts: 15184
Credit: 4,362,181
RAC: 3
Netherlands
Message 940093 - Posted: 15 Oct 2009, 8:03:39 UTC - in response to Message 940043.  

What is in their Account page, attacks the Boinc Replica Database.



Poor server. :-)

ID: 940093
52 Aces
Joined: 7 Jan 02
Posts: 497
Credit: 14,261,068
RAC: 67
United States
Message 940094 - Posted: 15 Oct 2009, 8:09:01 UTC - in response to Message 940093.  

Poor server. :-)


At last, an Apache Server that attacks back !
ID: 940094
thentangler
Joined: 9 Oct 09
Posts: 1
Credit: 21,091
RAC: 0
United States
Message 940110 - Posted: 15 Oct 2009, 11:12:40 UTC - in response to Message 940094.  

I can't seem to get any Graphical results loaded when Seti@home is running.
In screensaver mode, it says that Boinc is currently idle.
Is there a problem at the server?
ID: 940110
HAL
Joined: 28 Mar 03
Posts: 704
Credit: 870,617
RAC: 0
United States
Message 940113 - Posted: 15 Oct 2009, 11:56:40 UTC - in response to Message 940110.  
Last modified: 15 Oct 2009, 11:59:31 UTC

Nobody has been able to display results, tasks, or pending for almost the past 2 weeks. It is completely unknown if or when they will be able to do so again.
ID: 940113
Jord
Volunteer tester
Joined: 9 Jun 99
Posts: 15184
Credit: 4,362,181
RAC: 3
Netherlands
Message 940118 - Posted: 15 Oct 2009, 12:36:08 UTC - in response to Message 939902.  

You know, I just read your old post on why Classic closed again and saw this in there:

Matt L. wrote:
4. The SETI@home Classic backend is a tangled mess. There have been many problems over the years, most of which were invisible to the participants. None of these problems were fatal to the project or its science, but have resulted in an obnoxious web of ridiculous dependencies, confusing configurations, and unwieldy databases. I am practically drooling dreaming of the day when we get to turn all that stuff off and be done with it already. The BOINC backend is sooooo much easier to deal with.

And here we are... That still true then? ;-)
ID: 940118
Pooh Bear 27
Volunteer tester
Joined: 14 Jul 03
Posts: 3224
Credit: 4,603,826
RAC: 0
United States
Message 940120 - Posted: 15 Oct 2009, 12:39:54 UTC - in response to Message 940042.  

In fact, for those of us reading here, the Forums with those signature images & avatars on practically every message of every thread are a greater overall bandwidth hit. But aesthetics & e-awards have their values too.

This does not take up bandwidth as you would think. The web pages use the campus network. The file transfers take the SETI bandwidth. So you are not stealing bandwidth from the SETI pipe when perusing the web pages.



My movie https://vimeo.com/manage/videos/502242
ID: 940120
OzzFan
Volunteer tester
Joined: 9 Apr 02
Posts: 15691
Credit: 84,761,841
RAC: 28
United States
Message 940136 - Posted: 15 Oct 2009, 13:50:19 UTC - in response to Message 940025.  

What is the value of being able to parse through one's or someone else's tasks?


...high-end MoBo...300$, latest CPU ...974$,
mindblowing watercooled VGA ...700$ ...
Chance to see how, or if it works as it should, compared to others...priceless :)


Reminds me of a quote I recently heard in the movie "I Hope They Serve Beer In Hell", and I've heard the line before then as well:

If it lacks a price, then it is probably worthless. :)

PS - This was only meant as humor. Anyone lacking a sense of humor should please disengage themselves from the keyboard before firing off a flame or argument to the contrary. Thank you and have a nice day. :)
ID: 940136
OzzFan
Volunteer tester
Joined: 9 Apr 02
Posts: 15691
Credit: 84,761,841
RAC: 28
United States
Message 940137 - Posted: 15 Oct 2009, 13:51:18 UTC - in response to Message 940118.  

You know, I just read your old post on why Classic closed again and saw this in there:

Matt L. wrote:
4. The SETI@home Classic backend is a tangled mess. There have been many problems over the years, most of which were invisible to the participants. None of these problems were fatal to the project or its science, but have resulted in an obnoxious web of ridiculous dependencies, confusing configurations, and unwieldy databases. I am practically drooling dreaming of the day when we get to turn all that stuff off and be done with it already. The BOINC backend is sooooo much easier to deal with.

And here we are... That still true then? ;-)


LOL
ID: 940137
Andreas
Joined: 21 Jan 02
Posts: 16
Credit: 9,911,789
RAC: 0
Germany
Message 940138 - Posted: 15 Oct 2009, 13:53:36 UTC - in response to Message 940118.  

You know, I just read your old post on why Classic closed again and saw this in there:

Matt L. wrote:
4. The SETI@home Classic backend is a tangled mess. There have been many problems over the years, most of which were invisible to the participants. None of these problems were fatal to the project or its science, but have resulted in an obnoxious web of ridiculous dependencies, confusing configurations, and unwieldy databases. I am practically drooling dreaming of the day when we get to turn all that stuff off and be done with it already. The BOINC backend is sooooo much easier to deal with.

And here we are... That still true then? ;-)


ROFL
ID: 940138


 
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.