Hocket (Aug 05 2010)
Author | Message |
---|---|
Matt Lebofsky Joined: 1 Mar 99 Posts: 1444 Credit: 957,058 RAC: 0 |
Another catchup post. I'm still trying to page in everything I missed in July - it doesn't help that shortly after the last post I got a nasty summer cold. I'm back in business now. We had another mysql database server crash over the weekend, which Jeff handled remotely without much ado. The upload server also had its directly attached storage array freak out again. This is becoming a common event, resulting in the software RAID getting into some funky state (which has always been reversible thus far). Other than that, the servers are still chugging along. As for the grand server shuffle, progress has been made and a definite plan is in motion. Basically marvin is becoming bambi (the Astropulse database) and bambi is becoming bruno (the upload/BOINC admin server) and bruno is being turned off. Meanwhile some new machine (we'll acquire somehow) will become thumper (the science database) and thumper will become ptolemy (internal file server) and ptolemy will shut off. Getting bruno and ptolemy out of the picture means two of the three servers prone to random crashes/hardware issues will no longer be on line. The third such server is mork, which is the only server remotely close to handling the mysql database load, so no options for fixing that anytime soon. We have our hands full anyway fixing what we got. I also (finally) got a test suite working for all my birdie tests (i.e. putting a fake signal or "birdie" in the raw data, blanking it, splitting it, then running clients on it to see if the birdie still appears). This took me a while as I had to remember all the various bits and pieces of this puzzle, some of which I haven't touched for months. Now it's all in one big script, which is nice. Oh yeah, I also parallelized the software blanking pre-processing, so new data can get on line twice as fast as before (if resources are available). Jeff's going to put some newly compiled Astropulse back end services on line tomorrow.
Hopefully that's all good or else we'll likely run out of work over the weekend (which happened last weekend, but was mostly hidden by the mysql database server crash). It's summertime, so people are in and out of the lab a lot, but enough of us will be in one room at the same time next week that more meaningful plans/management discussions will take place regarding NTPCkr and other scientific analysis stuff. - Matt -- BOINC/SETI@home network/web/science/development person -- "Any idiot can have a good idea. What is hard is to do it." - Jeanne-Claude |
Claggy Joined: 5 Jul 99 Posts: 4654 Credit: 47,537,079 RAC: 4 |
Keep up the good work Matt, Jeff and Others, and thanks for the update, Claggy |
B-Man Joined: 11 Feb 01 Posts: 253 Credit: 147,366 RAC: 0 |
Thank you for the update. Keep up the great work. Seems to be going, if not smoothly. |
Dirk Sadowski Joined: 6 Apr 07 Posts: 7105 Credit: 147,663,825 RAC: 5 |
Thanks for the update! What about the validate errors that began ~2 hours before the current outage? Will you run the famous script to grant the credits? |
Jord Joined: 9 Jun 99 Posts: 15184 Credit: 4,362,181 RAC: 3 |
it doesn't help that shortly after the last post I got a nasty summer cold. Ah, so that's where I caught it from. Thanks for that. 2 weeks in and still battling it, but at least I got my voice back. :-) Thanks for the update, yet for us non-native-English-tonguers, what's Hocket? (I know Steve Hackett) |
kittyman Joined: 9 Jul 00 Posts: 51477 Credit: 1,018,363,574 RAC: 1,004 |
Hocket.... A musical term...fitting for Matt. Thanks so much for taking the time to post the updates, Matt. How much ya figger somebody would have to pony up to get the server you are hankering for? Gives the kitties hope that better times are on the horizon for the Seti project. Meow meow! "Time is simply the mechanism that keeps everything from happening all at once." |
LiliKrist Joined: 12 Aug 09 Posts: 333 Credit: 143,167 RAC: 0 |
Thanks for the update Master Matt =) N = R x fp x ne x fl x fi x fc x L |
ront Joined: 25 Aug 01 Posts: 77 Credit: 386,336 RAC: 0 |
Hi, Thanks for the information Matt. Hope your cold is getting better. Be Blessed & Be A Blessing, ront |
ToxicTBag Joined: 5 Feb 10 Posts: 101 Credit: 57,197,902 RAC: 0 |
Updates are much appreciated Matt, curse those summer colds!! |
Gary Charpentier Joined: 25 Dec 00 Posts: 30989 Credit: 53,134,872 RAC: 32 |
It's summertime, so people are in and out of the lab a lot, but enough of us will be in one room at the same time next week that more meaningful plans/management discussions will take place regarding NTPCkr and other scientific analysis stuff. RAC chasers, be afraid, be very afraid. Last time that happened we got three day breaks! :) Thanks for the update Matt. Much appreciated. |
Fred J. Verster Joined: 21 Apr 04 Posts: 3252 Credit: 31,903,643 RAC: 0 |
You're not the only one, still got no 'voice' and fever too. :( Must be a mutated human/computer virus, LOL ;-) Anyway glad you 'survived' it all and glad to hear from you. And of course thanks for your update on the project. |
kittyman Joined: 9 Jul 00 Posts: 51477 Credit: 1,018,363,574 RAC: 1,004 |
Serious business now........ How much for the new server? Not bare bones........what you need. Or can you find a donor? The kitties need to know. "Time is simply the mechanism that keeps everything from happening all at once." |
S@NL - Vipertje - D. van Es Joined: 19 Oct 02 Posts: 39 Credit: 32,174,152 RAC: 0 |
Thanks for the update Matt, only one thing I don't understand. There is a lot said about not having enough resources, and yet you retire two servers (again). Why don't you use them for the services they can handle, even if it is only one or two services? I really don't understand it. Why not use older servers for one or two services, like one MB and AP splitter on those servers? When you put so much stuff and so many services on one server, you depend on that server working; if you split things across more servers, you have more of a failsafe, so that if one goes down the whole project doesn't go offline!!! And having read your tech posts for 2 years now, I know that you have a lot of old servers in your basement down at Berkeley!!! Don't get me wrong, but I find it mind-boggling every time I read that something is wrong. So now the question: Why don't you split the services more on the old servers??? I do what I can and I can what I do! :P |
Grant (SSSF) Joined: 19 Aug 99 Posts: 13848 Credit: 208,696,464 RAC: 304 |
So now the question: Why don't you split the services more on the old servers??? Because they are unreliable & keep failing. And then people get all upset when they can't upload or download or report or all three until the servers have restarted & the databases have been checked & repaired if necessary. Grant Darwin NT |
Bill Walker Joined: 4 Sep 99 Posts: 3868 Credit: 2,697,267 RAC: 0 |
So now the question: Why don't you split the services more on the old servers??? And when they do fail, the scientists at S@H become the most educated IT department in the world, and spend too much time on all the things mentioned above, when they should be looking for ET. |
S@NL - Vipertje - D. van Es Joined: 19 Oct 02 Posts: 39 Credit: 32,174,152 RAC: 0 |
But why is it unreliable? Are the servers the problem, or the people who install them, or the services they want to run on the servers??? I have almost never seen an unreliable server where the problem was the hardware; in most cases the problem was blamed on the software or OS... I do what I can and I can what I do! :P |
Grant (SSSF) Joined: 19 Aug 99 Posts: 13848 Credit: 208,696,464 RAC: 304 |
But why is it unreliable? Are the servers the problem, The servers are the main problem. Several are pre-production units. Once they've got servers that can be depended on, then they can spend more time working on the software. Grant Darwin NT |
zoom3+1=4 Joined: 30 Nov 03 Posts: 66303 Credit: 55,293,173 RAC: 49 |
But why is it unreliable? Are the servers the problem, Donated Pre Production Servers at that, Hopefully there is enough for 1 or 2 good production blade servers of the type Seti needs to get, Last I heard $7,000 was raised thanks to 1 loud mouth and 6 others, Maybe Seti's equivalent of "the Magnificent Seven"... I have one old Pre Production cpu running My current setup, Which is awaiting Its retirement from crunching, But the next cpu is having to wait until supporting parts are acquired and outfitted before they're gone and so I wait, patiently as I have lots to get done before the computer purchases can begin so that I can be done with this old hardware. Savoir-Faire is everywhere! The T1 Trust, T1 Class 4-4-4-4 #5550, America's First HST |
Grant (SSSF) Joined: 19 Aug 99 Posts: 13848 Credit: 208,696,464 RAC: 304 |
I did wonder at one time whether they were pushing the Informix database to its limits but it appears not, there are larger ones out there. I think the database limits are mostly hardware. More storage & more RAM would help the databases along considerably. Grant Darwin NT |
RoosStar Joined: 16 Oct 99 Posts: 51 Credit: 12,900,339 RAC: 20 |
In music, hocket is the rhythmic linear technique using the alternation of notes, pitches, or chords. In medieval practice of hocket, a single melody is shared between two (or occasionally more) voices such that alternately one voice sounds while the other rests. For a more complete explanation see here :D |