Sad Trombone (Nov 25 2009)
Author | Message |
---|---|
Matt Lebofsky Joined: 1 Mar 99 Posts: 1444 Credit: 957,058 RAC: 0 |
Okay then. The mysql commit behavior we were testing was an absolute failure - though for expected reasons (not enough disk i/o, even with the solid state drives). It was worth a shot, but we fell back to the old commit behavior for now. However, this caused a lot of backend processes to clog up including the transitioners, which ultimately meant the splitters burned through all kinds of raw data files before they realized we had more than enough work on disk. This could have been bad, i.e. filled up our workunit storage server, but luckily it didn't even come close to doing that. Anyway, we reverted this morning and all the dams broke for a while... until we ran out of work to send out. Turns out the last 10 files I brought up from Arecibo are all broken. In better news, we did the last bits to get the Astropulse signal table fully copied over to another database fragment - only losing a few rows here and there (as opposed to many thousands as originally thought). Work will resume on Monday to make this exchange old/new fragments and hopefully the science database will be much happier. That's it for now. - Matt -- BOINC/SETI@home network/web/science/development person -- "Any idiot can have a good idea. What is hard is to do it." - Jeanne-Claude |
kittyman Joined: 9 Jul 00 Posts: 51477 Credit: 1,018,363,574 RAC: 1,004 |
Well...... Hope you find time to enjoy your Thanksgiving anyway Matt and Jeff. The crunchers will wait if need be. So now ya gotta troubleshoot the recorder at Arecibo again on top of everything else, eh? Sad song indeed. Best of luck sorting it. Happy Thanksgiving to you and yours. "Time is simply the mechanism that keeps everything from happening all at once." |
James Sotherden Joined: 16 May 99 Posts: 10436 Credit: 110,373,059 RAC: 54 |
Matt, that stinks you have to check in and do some kicking on the holidays. There's nothing I hate worse than having to think about work on a long weekend. Have a great Thanksgiving. Old James |
Jord Joined: 9 Jun 99 Posts: 15184 Credit: 4,362,181 RAC: 3 |
"So it'll be to me and Jeff to check in over the next few days and kick the pipeline along." Are you crazy? Go on, have that same extra long weekend that everyone else has - medical, fire and police people excluded - as I am sure people's computers can do without work for a while, while the people being boss of said computers have got plenty of projects to choose from to overcome any workflow problems Seti has. Have a fine weekend and a Happy Thanksgiving, Matt (and Jeff). :-) |
Julie Joined: 28 Oct 09 Posts: 34060 Credit: 18,883,157 RAC: 18 |
Have a happy Thanksgiving Matt and Jeff:-) Really glad you keep us so well updated! |
KWSN Ekky Ekky Ekky Joined: 25 May 99 Posts: 944 Credit: 52,956,491 RAC: 67 |
"Are you crazy? Go on, have that same extra long weekend that everyone else has - medical, fire and police people excluded -" What? Let 'em have all that time off? Surely Matt and Co. really are our Fire, Police, Ambulance services all rolled into one? Only kidding - have a great time while all the rest of us stare into blank screens! ;-) |
Peterjansson20 Joined: 12 Oct 00 Posts: 1 Credit: 124,856 RAC: 0 |
I hope it will work better for you and you do not lose any more data. Good luck! Peter |
Saicere Joined: 9 Jul 99 Posts: 7 Credit: 6,717,907 RAC: 0 |
You're talking a lot about MySQL problems on that new monster of yours, and in case you weren't aware, I just thought you should know that MySQL scales horribly on a large number of cores. It can barely scale to 8, so there's virtually no chance that you're getting good results on 24. Sun has an "official" writeup of a scaling attempt here: http://blogs.sun.com/mrbenchmark/entry/scaling_mysql_on_a_256 Their suggestion? Run several instances of MySQL on the same server. Which is a bit meh, but if you insist on running MySQL on a high-performance project like this, it might be worth looking into splitting up the most trafficked tables into separate instances. Then you could also enable the safe commit behavior individually on the critical servers. You've also mentioned running into replica problems on jocelyn after a primary crash, which is very common. Here's a trick to get around it, at least temporarily. After the crash, feed jocelyn the following through the MySQL console: CHANGE MASTER TO MASTER_LOG_FILE=[NEXT FILE], MASTER_LOG_POS=4; START SLAVE; Replace [NEXT FILE] with the name of the first binary log file on mork that it started writing after the crash, typically mysql-bin.002342 or something similar. You can get the name by running SHOW MASTER STATUS; on mork. This will skip the corrupted end of the previous binary log, and restart replication from the new file. Then verify that it's running with a SHOW SLAVE STATUS; Note that you *cannot be sure* that the state on the replica and the primary is now consistent unless you are using safe commits, however it should still be consistent enough for non-science use. |
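The replica recovery steps in the post above can be sketched as a small helper that only assembles the statements to paste into the MySQL console on the replica. This is a minimal sketch, not anything the project is known to run: the names `resync_statements` and `BINLOG_NAME` are hypothetical, the file name `mysql-bin.002343` is just an example, and the leading `STOP SLAVE;` is an added assumption (the replication threads generally have to be stopped before `CHANGE MASTER TO` is accepted). Position 4 is the first event offset in a binary log, right after its 4-byte header. Note that the file name has to be quoted in the real statement.

```python
import re

# Binary log names look like "<basename>.<numeric extension>",
# for example "mysql-bin.002342". Anything else is rejected.
BINLOG_NAME = re.compile(r"[\w.\-]+\.\d+")


def resync_statements(next_log_file):
    """Return, in order, the statements to run on the replica.

    next_log_file is the File value that SHOW MASTER STATUS reports on
    the primary after it has come back up from the crash.
    """
    if not BINLOG_NAME.fullmatch(next_log_file):
        raise ValueError("unexpected binary log name: %r" % (next_log_file,))
    return [
        # Assumption: stop replication first so CHANGE MASTER TO is accepted.
        "STOP SLAVE;",
        # Restart from the first event of the new file, skipping the
        # corrupted tail of the previous binary log.
        "CHANGE MASTER TO MASTER_LOG_FILE='%s', MASTER_LOG_POS=4;" % next_log_file,
        "START SLAVE;",
        # Check that both replication threads report as running.
        "SHOW SLAVE STATUS;",
    ]


if __name__ == "__main__":
    for statement in resync_statements("mysql-bin.002343"):
        print(statement)
```

The helper never talks to a server, so it is safe to run anywhere. As the post says, skipping the end of the old log means the replica may have missed transactions, so the result should be treated as good enough for non-science use only.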
C Joined: 3 Apr 99 Posts: 240 Credit: 7,716,977 RAC: 0 |
Looks as if everything quit completely last night around 2300 PST. Let it go for the day, Matt, and enjoy Thanksgiving Day with friends and family. We'll survive... C Join Team MacNN |
cncr04s Joined: 25 Oct 00 Posts: 6 Credit: 296,024 RAC: 0 |
Seti is always offline it seems. I was enticed by emails to come back, but it seems the same thing is happening that made me leave in the first place. I've been with seti for a long time (9 years), and I'm sad to see so many problems with servers going down lately... can't you guys find someone smart enough to fix stuff, and buy lasting equipment? My server machine is still in top shape after 6 years. I'd be willing to help seti in my spare time, if only I lived that far west. I'm sure others would too, so don't rant to me about "costs" and the lack of funding. |
John McLeod VII Joined: 15 Jul 99 Posts: 24806 Credit: 790,712 RAC: 0 |
"Seti is always offline it seems, I was enticed by emails to come back, but it seems the same thing is happening that made me leave in the first place. I've been with seti for a long time (9) years, and I'm sad to see so many problems with servers going down lately... can't you guys find some one smart enough to fix stuff? and buy lasting equipment, as my server machine is still top shape after 6 years. I'd be willing to do help seti in my spare time, if only I lived that far west, I'm sure others would too so don't rant to me about 'costs' and the lack of funding." They have no money for anything - the entire budget is from donations at the moment. Servers are mostly donated... BOINC WIKI |
Bearcat Joined: 10 Sep 99 Posts: 106 Credit: 10,778,506 RAC: 0 |
"... I was enticed by emails to come back, but it seems the same thing is happening that made me leave in the first place. ..." Same here. I was with SETI from 1999 to 2004 and received a "we need your help" email a few months ago. Unfortunately I can only donate my computer time, but not money, and it seems (at least to me) that SETI either needs more donations or the existing donations are spent unwisely. It is certainly none of my business, but the thought crossed my mind: is there info available about the funding situation and where the money goes? It also seems people are just interested in keeping the computers going to gain credit instead of actually finding ET. Isn't this what it's all about? And how are we going to do this without getting new data from Arecibo? It seems pointless to continue and to waste millions of kW hours. /end of rant + start of apology |
GreggyBee Joined: 9 Mar 01 Posts: 203 Credit: 1,600,521 RAC: 0 |
"... I was enticed by emails to come back, but it seems the same thing is happening that made me leave in the first place. ..." For some people, yes. What you need to remember is that S@H is now only one part of a larger distributed computing project called BOINC. A lot of people are only interested in the number-crunching game, not the individual merits of any one project. Don't believe me? Just check out the shoutbox on Boincstats: there's no loyalty to individual projects, just credit-chasers boasting about the latest multi-million credit day on BOINC. S@H has become a victim of its own pioneering. |
1mp0£173 Joined: 3 Apr 99 Posts: 8423 Credit: 356,897 RAC: 0 |
"Seti is always offline it seems, I was enticed by emails to come back, but it seems the same thing is happening that made me leave in the first place." Then you didn't understand what was going on then, and you don't understand now. Setting aside the funding issue, somewhat: BOINC projects are supposed to do big science on very small budgets. That means that they don't do things like redundant server clusters, multihomed sites and all of the (expensive) things that hide the occasional outage that you might see with Amazon or Space.com. If you have work in your cache, a server outage is a minor inconvenience. It's interesting to know about, but that's all. The work isn't time critical, so getting reported later is no big deal. I wish people would start realizing how well the entire system (client and server) works when none of the individual components are 99.99% reliable. |
ML1 Joined: 25 Nov 01 Posts: 21019 Credit: 7,508,002 RAC: 20 |
"... The work isn't time critical, so getting reported later is no big deal." Considering the overall environment, the vast numbers (data AND users) and the project goals, I'm utterly amazed at how successful and how well all of this works in the first place. Truly a dedication from Matt & Dr A and a few others for the last decade and more! And all despite some flat-Earth political nitwit forbidding all funding in case they might find God?!!! (Or some such?...) Meanwhile, a failed network switch and a few hours downtime over their big festive break is all part of the fun. The big Boinc crunch continues, uninterrupted. Who forsook their turkey to go into the lab to fix that one I wonder?!... ... So when do all the Boinc scheduler problems and all the other minor niggles get fixed? ;-) (Or to rephrase: Any interested volunteers to the rescue?) It's all part of the experiment and the development! Happy Thanksgiving crunchin'! Martin See new freedom: Mageia Linux Take a look for yourself: Linux Format The Future is what We all make IT (GPLv3) |
Odysseus Joined: 26 Jul 99 Posts: 1808 Credit: 6,701,347 RAC: 6 |
|
Wingless Wonder Joined: 14 May 99 Posts: 14 Credit: 12,157,146 RAC: 0 |
In response to some less-than-positive comments about the state of SETI@home, its servers, not being able to obtain work, etc., I myself volunteer use of my computer because I feel like it. No one is twisting my arm to do so. I maintain a two-day work cache in anticipation of the occasional outages, along with participation in other BOINC projects. I rarely run out of work. If I did, no big deal. I don't feel that the folks at SETI@home owe me some sort of debt of gratitude. On the contrary, I feel privileged to be able to participate in any of the BOINC projects. |
Robert Waite Joined: 23 Oct 07 Posts: 2417 Credit: 18,192,122 RAC: 59 |
No worries. A signal may travel for thousands of years to reach Earth. No need to squawk about a few days of down time. |
Dr. C.E.T.I. Joined: 29 Feb 00 Posts: 16019 Credit: 794,685 RAC: 0 |
eh Matt - a big hug for you & your updates from joanne & i BOINC Wiki . . . Science Status Page . . . |
©2024 University of California
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.