Quick Outage Today (Sep 22 2009)
Author | Message |
---|---|
Matt Lebofsky Joined: 1 Mar 99 Posts: 1444 Credit: 957,058 RAC: 0 |
Today was an outage day, with nothing special to report on that front. One interesting note is that our master mysql database server (mork) has 24 processors and 64 GB of memory, and the replica server (jocelyn, which used to be the master) has 4 processors and 28 GB of memory. Eric recently cleaned out really old rows from the beta result table - now the entire database fits better in memory on jocelyn, and in turn its database engine generally performs better than mork's. How could this be? Because despite having far less memory and far fewer processors, jocelyn has more disk spindles (and faster disks, for that matter) than mork. Not really all that surprising, but it's fun to see our suspicions about disk performance confirmed, with memory being less of a bottleneck. In any case, both servers are zippy and today's outage wasn't very long, was it? So the weekend went by with nary a blip, or even a single alert from my web of alert scripts. This pretty much never happens. We always get some kind of warning, severe or otherwise - high load on this server, replica database is falling behind, rising temperatures in the closet... but nope. Everything was just fine. However, yesterday we did have one short traffic dip due to the science database getting locked up on too many internal user queries, so the splitters weren't creating work for a couple of hours there. No biggie - we killed the queries and informix sprang back to life. It is a bit worrisome how locked up the database can get, though, and it's hardly predictable when (or why) it does. I'm actually running my software radar blanker through an entire 50GB test file right now. It processes in roughly twice real time (meaning a file containing n hours of data takes 2n hours to find radar and blank it). Not to worry - we can run many of these in parallel. I could also make several code optimizations if need be. 
Anyway, I'm hoping by the end of the week to trust this suite of software enough to start processing our large backlog of 2007-2008 data by next month. Oh yeah, one more thing - we do know that the "queries/second" field is blank on the server status page. For some reason the exact same informational query returns its output in a different format on one server than on the other, so our general "db stats" script is sorta broken. Bob is fixing it. - Matt -- BOINC/SETI@home network/web/science/development person -- "Any idiot can have a good idea. What is hard is to do it." - Jeanne-Claude |
Francis Noel Joined: 30 Aug 05 Posts: 452 Credit: 142,832,523 RAC: 94 |
Thanks for the update, Matt. Did you get that eerie feeling that things just went "too well"? As a sysadmin, when everything is going smoothly I always get that calm-before-the-storm feeling of impending doom :). mambo |
zpm Joined: 25 Apr 08 Posts: 284 Credit: 1,659,024 RAC: 0 |
wouldn't it be nice to have some SSdrives. I recommend Secunia PSI: http://secunia.com/vulnerability_scanning/personal/ Go Georgia Tech. |
ront Joined: 25 Aug 01 Posts: 77 Credit: 386,336 RAC: 0 |
thank you for the info. Have 14 "pendings" stacked up dating back to the 17th. Please advise. Be Blessed & Be A Blessing, ront |
MarkJ Joined: 17 Feb 08 Posts: 1139 Credit: 80,854,192 RAC: 5 |
zpm wrote: "wouldn't it be nice to have some SSdrives." Actually they do. Apparently they won't work with the new Intel server (Mork). BOINC blog |
1mp0£173 Joined: 3 Apr 99 Posts: 8423 Credit: 356,897 RAC: 0 |
ront wrote: "thank you for the info. Have 14 'pendings' stacked up dating back to the 17th." These questions belong in Number Crunching. Technical News is for general updates from the project. Every work unit has to be processed twice, and the results must match. Your pendings are waiting for the second result. If you need more, ask in Number Crunching. |
KWSN THE Holy Hand Grenade! Joined: 20 Dec 05 Posts: 3187 Credit: 57,163,290 RAC: 0 |
ront wrote: "thank you for the info. Have 14 'pendings' stacked up dating back to the 17th." To find out which of the above is true (i.e., waiting for a matching WU, or that WU didn't jibe with yours...), go to the accounts page and click on "Tasks". Items that say "Completed, waiting for validation" are waiting for the matching WU; items that say "Completed, validation inconclusive" are waiting for a "tie breaker" WU to determine which of two different results is correct. Hello, from Albany, CA!... |
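
The two task states described in the last replies can be illustrated with a rough sketch. This is not BOINC's actual validator: the function name `task_status`, the `QUORUM` constant, the `"Valid"` label, and the use of plain equality to compare results are all assumptions made for illustration. Only the two "Completed, ..." status strings and the two-matching-results rule come from the thread.

```python
from collections import Counter

# Assumption: two matching results are required, per "every work unit
# has to be processed twice, and the results must match".
QUORUM = 2

def task_status(results):
    """Classify one work unit from the results returned so far.

    `results` is a hypothetical list of comparable result values;
    a real validator would compare with tolerances, not equality.
    """
    if len(results) < QUORUM:
        # Not enough results yet: still waiting for the matching one.
        return "Completed, waiting for validation"
    # Count how many times the most common result value appears.
    best_count = Counter(results).most_common(1)[0][1]
    if best_count >= QUORUM:
        return "Valid"  # hypothetical label for a reached quorum
    # Enough results came back, but they disagree: a tie breaker is needed.
    return "Completed, validation inconclusive"

if __name__ == "__main__":
    print(task_status(["spike@1420MHz"]))
    print(task_status(["spike@1420MHz", "nothing"]))
    print(task_status(["spike@1420MHz", "nothing", "spike@1420MHz"]))
```

In this sketch a third "tie breaker" result resolves an inconclusive pair only if it agrees with one of the two earlier results.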