Why is BOINC so finicky?
Author | Message |
---|---|
Yogurtron Joined: 31 Jul 03 Posts: 1 Credit: 29,910 RAC: 0 |
First of all, I want to say, this is not a flame thread, and I know the people at SETI and the other BOINC projects are doing their best. But I am confused: how come most BOINC projects (SETI, LHC, and Predictor are the ones where I have noticed the problem so far) have so many problems? I mean, I know that BOINC is still in some... semi-beta form, but most of these downtime notices say that they are replacing actual hardware. I mean, how hard is BOINC to process on the server side that it actually overloads hardware? The software problems I can probably chalk up to it still being in some form of semi-beta... but I mean, why are there problems so often? I dunno, it just seems weird how often these problems arise. But yeah, I just wanted clarification to satisfy my curiosity about just why BOINC has such frequent problems. PS. Still, keep up the good work, guys (and that isn't sarcastic) |
Toby Joined: 26 Oct 00 Posts: 1005 Credit: 6,366,949 RAC: 0 |
A good question actually. Since SETI has so many more users than the other projects, it would be easy to say 'so many users just clog the system'. But I don't think this is the correct answer. First of all, as Matt posted in the "database replica thread" they don't have all that big of a budget. This means they can't get the super enterprise level stuff that might help. They are using a software RAID for crying out loud... I'm not sure what kind of money the other projects have. With Predictor, it isn't so much hardware problems as vendor problems. Dell seems to have 'lost' their server, plus they are still upgrading to BOINC v4 and improving their science client applications, which are software issues. With LHC I suspect they just didn't have the hardware to support the latest rush. They had a bunch of work units that finished really quickly (some within mere seconds) and this obviously put a huge strain on the hardware for a project which has just come out of beta. They probably should have upgraded their hardware before coming out of beta but they are Swiss so we will forgive this error because they make excellent chocolate :) Right now they are simply out of work units. Everything they wanted to have processed has been processed. Until they analyze the data, they can't generate new work units. It has been stated by their admins that this may frequently be the case with their project. CPDN on the other hand has HUGE work units (each one takes 3 weeks or more on a good CPU) so their database/bandwidth demands are much less and they have had much better uptime because of it. Those excuses being made, I think there are a few design issues with the server side part of BOINC. I haven't looked at the code very much but from what I can tell, EVERYTHING runs off of one central database. Most of the science stuff probably has to, but one area that I think should be split off is the message boards. 
LHC implied in one of their posts that the message boards actually put a fair amount of stress on the database. If the message boards were a completely separate database, the load would be reduced but (possibly more importantly) when the project went down or got overloaded, users could still get to the message boards and see what is going on. There is my 2 cents worth - from a recently graduated software engineer with virtually no practical experience. If you think I'm smart and want to hire me, get in touch! If you think I'm stupid, don't tell anyone else! :) --------------------------------------- - A member of The Knights Who Say NI! Possibly the best stats site in the universe: http://boinc-kwsn.no-ip.info |
geckomind Joined: 27 Nov 03 Posts: 5 Credit: 5,063 RAC: 0 |
Hi there! Interesting analysis... Sounds quite logical to me. Why the heck are they running the boards under the same server/database anyway? Wouldn't it be easier to set up a separate thingy with phpBB or vBulletin out o' the box? I mean, I don't know stuff. Just strikes me, that's all... Greetings from the University of Bonn, Germany! |
Paul D. Buck Joined: 19 Jul 00 Posts: 3898 Credit: 1,158,042 RAC: 0 |
> Hi there! > > Interesting analysis... Sounds quite logical to me. Why the heck are they > running the boards under the same server/database anyway? Wouldn't it be > easier to set up a separate thingy with phpBB or vBulletin out o' the box? > > I mean, I don't know stuff. Just strikes me, that's all... Well, not ALL of it has to be out of one database. And they may be doing some more partitioning later. But just one simple example: your work needs your account data ... so does the web site for the "Your Account" page ... Now that they have done some more moving of things around, we can hope for better responses ... LONG term, we may even see multiple databases set up with replication so that the web site can be hosted completely separately from the remainder of the site, with changes to the account data being replicated forward to the science database that the BOINC Work Manager connects to for accumulation of the results. |
geckomind Joined: 27 Nov 03 Posts: 5 Credit: 5,063 RAC: 0 |
Yeah.... That sounds about right. It's just that I'm used to web projects where the forum data is totally separate from the other stuff. OK, you have to sign up twice and stuff, but it makes the actual project you're doing much more stable. My only experience is with MySQL stuff for PHP applications and even there it is sometimes a good idea to separate things... Cheers! It's dog eat dog, rat eat rat Kroc-style - Boom, like that GeckoMind.net |
Scott Brown Joined: 5 Sep 00 Posts: 110 Credit: 59,739 RAC: 0 |
@Toby A nice clear post, but... "Since seti has so many more users than the other projects, it would be easy to say 'so many users just clog the system'." Technically it is not the number of users but the number of hosts that is the basis of the 'clog the system' argument. |
Paul D. Buck Joined: 19 Jul 00 Posts: 3898 Credit: 1,158,042 RAC: 0 |
> @Toby > > A nice clear post, but... > > "Since seti has so many more users than the other projects, it would be easy > to say 'so many users just clog the system'." > > technically it is not the number of users but the number of hosts that are the > basis of the 'clog the system' argument. Actually, it is the number of users, who have a number of hosts, that all return an even greater number of results ... |
Troy_ND Joined: 23 Aug 02 Posts: 8 Credit: 1,879,844 RAC: 0 |
The way I imagine the Seti/BOINC project, and possibly other BOINC projects: Imagine trying to build a huge 747 passenger plane with only a $5000 (USD) yearly budget. Eventually you'll be able to get the plane built how you want it to be, but it takes time. Now I know I'm probably exaggerating a little and the $5000/year is simply a number I made up, but it helps explain why things might not always go how the project teams would like them to go. I'm sure a lot of the problems they've been having here would just go away if they could get a huge quad Xeon (HT) processor enterprise server with like 2-4 GB of RAM, along with nice large and very fast 15K SCSI drives in a hardware array. But it's tough to do if you don't have the budget for it. Coming from an IT background, I figure if the Seti project team is successfully able to run a database server with all the connections that Seti/Boinc has on a software array and probably not a very high-end server, then WAY TO GO, Seti/Boinc Team!!! :) Troy |
Scott Brown Joined: 5 Sep 00 Posts: 110 Credit: 59,739 RAC: 0 |
> > @Toby > > > > A nice clear post, but... > > > > "Since seti has so many more users than the other projects, it would be > easy > > to say 'so many users just clog the system'." > > > > technically it is not the number of users but the number of hosts that > are the > > basis of the 'clog the system' argument. > > Actually, it is the number of users, who have a number of hosts, that all > return an even greater number of results ... Paul, Nope...the number of users is irrelevant. For example, a single user with 1,000 hosts would provide roughly the same load as 1,000 users with 1 host each. Number of users is not even relevant for a minimum host count (i.e., at least one host per user) since one can sign up for an account and never actually crunch anything. |
Matt Lebofsky Joined: 1 Mar 99 Posts: 1444 Credit: 957,058 RAC: 0 |
I'll tell you why BOINC is so finicky: Not enough hardware, not enough staff, not enough air conditioning. This isn't a complaint, or a cry for help/sympathy. This is an obvious artifact of us here at the lab trying to run classic SETI@home (which has older, but much better hardware as it has to handle currently half a million active users), while trying to start up a whole other SETI@home on far less hardware. We were hoping the rise of BOINC and the "fall" of classic SETI@home would be much smoother, in that BOINC could take over the nicer hardware as classic SETI@home wound down and needed it less. Not exactly happening that way. As well, we can't just add hardware for several reasons. One - we don't have the money to obtain the hardware. Two - we barely have the time to set up and integrate the hardware we already have. And Three - above all else, we only have one server closet, and it is completely maxed out as far as power, space, and air conditioning. So.. as users slowly ramp up on BOINC (which is a good thing), we can only adjust so much to handle each new crisis. Okay.. I better get up to the lab for at least a few hours today to start the project back up. Then I'm off to a couple gigs where I'll make a lot more money (another sign of BOINC budgetary constraints - I make more as a musician than a systems administrator, so I gotta take the music gigs when they come up). - Matt BOINC/SETI@home |
John Cropper Joined: 3 May 00 Posts: 444 Credit: 416,933 RAC: 0 |
> I'll tell you why BOINC is so finicky: > > Not enough hardware, not enough staff, not enough air conditioning. Hardware: See if you can get some money from "Uncle Arnold"...yeah, right! > > This isn't a complaint, or a cry for help/sympathy. This is an obvious > artifact of us here at the lab trying to run classic SETI@home (which has > older, but much better hardware as it has to handle currently half a million > active users), while trying to start up a whole other SETI@home on far less > hardware. > > We were hoping the rise of BOINC and the "fall" of classic SETI@home would be > much smoother, in that BOINC could take over the nicer hardware as classic > SETI@home wound down and needed it less. Not exactly happening that way. > In the world of technology, things seldom perform as designed. Perhaps pushing the issue to the client side with a more aggressive cutover should be considered. > As well, we can't just add hardware for several reasons. One - we don't have > the money to obtain the hardware. Two - we barely have the time to set up and > integrate the hardware we already have. And Three - above all else, we only > have one server closet, and it is completely maxed out as far as power, space, > and > air conditioning. Power: Put more gerbils on the wheel. ;o) Space: Can't you get rid of the cot and just put a sleeping bag on the roof? :o) AC: Add a wall vent and get [insert opposing political party here] to suck the air from the room. > > So.. as users slowly ramp up on BOINC (which is a good thing), we can only > adjust so much to handle each new crisis. > It's a shame you don't have a FULLY distributed model that would allow CLIENTS to perform some of the tasks (splitting, for instance) that are needed to support the project when key components take a dirt nap. (Yeah, I know EVERYTHING can't be done outside, mainly for data integrity and security purposes). 
Such a model would allow the project to become more self-supporting, especially if the client could switch back and forth between tasks based on project needs. |
HachPi Joined: 2 Aug 99 Posts: 481 Credit: 21,807,425 RAC: 21 |
Matt : 1. BOINC is in my humble opinion NOT FINICKY!!! 2. Most of the REASONABLE people who have some science experience at university labs KNOW how difficult it is to run projects on low budget costs. 3. We KNOW the TEAM are doing their UTMOST, none of us could ask for more, we are VERY PROUD to have such a bunch of guys on the job. 4. The only criticism I did have was concerning the PR in the first period (NOT in the last two or three weeks - info is important even for us to get going). Greetings from Belgium, we are HERE TO STAY AND TO SUPPORT!!! > > I'll tell you why BOINC is so finicky: > > Not enough hardware, not enough staff, not enough air conditioning. > > This isn't a complaint, or a cry for help/sympathy. This is an obvious > artifact of us here at the lab trying to run classic SETI@home (which has > older, but much better hardware as it has to handle currently half a million > active users), while trying to start up a whole other SETI@home on far less > hardware. > > We were hoping the rise of BOINC and the "fall" of classic SETI@home would be > much smoother, in that BOINC could take over the nicer hardware as classic > SETI@home wound down and needed it less. Not exactly happening that way. > > As well, we can't just add hardware for several reasons. One - we don't have > the money to obtain the hardware. Two - we barely have the time to set up and > integrate the hardware we already have. And Three - above all else, we only > have one server closet, and it is completely maxed out as far as power, space, > and > air conditioning. > > So.. as users slowly ramp up on BOINC (which is a good thing), we can only > adjust so much to handle each new crisis. > > Okay.. I better get up to the lab for at least a few hours today to start the > project back up. 
Then I'm off to a couple gigs where I'll make a lot more > money (another sign of BOINC budgetary constraints - I make more as a musician > than a systems administrator, so I gotta take the music gigs when they come > up). > > - Matt > BOINC/SETI@home |
Paul D. Buck Joined: 19 Jul 00 Posts: 3898 Credit: 1,158,042 RAC: 0 |
> Nope...the number of users is irrelevant. For example, a single user with > 1,000 hosts would provide roughly the same load as 1,000 users with 1 host > each. Number of users is not even relevant for a minimum host count (i.e., at > least one host per user) since one can sign up for an account and never > actually crunch anything. I was not clear (not that it matters), I should have been more specific; as in: Number of users times the number of hosts times the number of results, or: U * H * R = Collapse ... |
haddock29 Joined: 18 Sep 99 Posts: 36 Credit: 26,012,417 RAC: 0 |
> First of all, I want to say, this is not a flame thread, and I know the people > at SETI and the other BOINC projects are doing their best. > > But I am confused, how come most boinc projects (SETI, LHC, and Predictor (the > ones I noticed and refer to with the problem so far)) have so many problems? > I mean, I know that BOINC is still in some... semi-beta form, but most of > these downtimes have description saying that they are replacing actual > hardware. I mean, how hard is BOINC to process on the server side that > causes it to actually overload hardware? > > The software problems I can probably chalk up to it still being in some form > of semi-beta... but I mean, why are there problems so often? I dunno, it just > seems weird how often these problems arise. > > But yeah, I just wanted clarification to answer my curiosity of just why BOINC > has such frequent problems. > > PS. Still, keep up the good works guys (and that isn't sarcastic) > The question may be: "Why is Seti classic so efficient, and Seti BOINC so difficult to start?". BOINC seems to be a good environment for CPDN (a limited number of very large WUs); maybe it is not a good solution for SETI. |
Ulrich Metzner Joined: 3 Jul 02 Posts: 1256 Credit: 13,565,513 RAC: 13 |
[sarcasm mode on] It's the old "University-taught theory meets cold, hard practice". [sarcasm mode off] Nearly nothing I learned at the university as the theoretically best approach survived the practical test in real-life conditions. You always have to adapt the theoretical plan to the real-life situation. Maybe it's time to rethink the much too complicated and database-thrashing validation mechanisms. Throwing only hardware at it will only delay the problems to a later phase. Ever heard of the "KISS principle"? It's the Keep It Small & Simple approach, which often works best. But that's, of course, not academic ;) greetz, Uli |
Papa Zito Joined: 7 Feb 03 Posts: 257 Credit: 624,881 RAC: 0 |
I thought KISS was Keep It Simple, Stupid. Ah well. I have a solution to your space and air conditioning problems. Move the SETI computer stuff to Alaska, and store them in a large shack. See, I contribute... ------------------------------------ The game High/Low is played by tossing two nuclear warheads into the air. The one whose bomb explodes higher wins. This game is usually played by people of low intelligence, hence the name High/Low. |
Toby Joined: 26 Oct 00 Posts: 1005 Credit: 6,366,949 RAC: 0 |
> I thought KISS was Keep It Simple, Stupid. Me too > I have a solution to your space and air conditioning problems. Move the SETI > computer stuff to Alaska, and store them in a large shack. Oh... They finally got internet up there? Or were you proposing using IP over avian carrier? Or maybe Elk? --------------------------------------- - A member of The Knights Who Say NI! Possibly the best stats site in the universe: http://boinc-kwsn.no-ip.info |
JAF Joined: 9 Aug 00 Posts: 289 Credit: 168,721 RAC: 0 |
OK, I'm going to ramble a bit. I see the key, major, and regular sponsors mentioned on the main page. I'm a little surprised the hardware needed to support BOINC SETI isn't sitting on shelves at some of those sponsors. It would seem to be a good public relations donation to get this project up to speed where it should be. I also see a real lack of sponsorship by some companies that profit the most: (in no particular order) Intel, AMD, and Microsoft. I know these three companies make a nice profit from SETI - just look at the "top computers" list. Just my opinion, but I think it is accurate. |
Papa Zito Joined: 7 Feb 03 Posts: 257 Credit: 624,881 RAC: 0 |
> > I thought KISS was Keep It Simple, Stupid. > > Me too > > > I have a solution to your space and air conditioning problems. Move the > SETI > > computer stuff to Alaska, and store them in a large shack. > > Oh... They finally got internet up there? Or were you proposing using IP > over avian carrier? Or maybe Elk? I'm surprised at you. A member of the KWSN should see the obvious solution. Our data comes and goes in packets, right? What better way to encapsulate a packet than... a coconut? Migratory coconuts, my friend. And for faster processing, we might want to employ some swallows. ------------------------------------ The game High/Low is played by tossing two nuclear warheads into the air. The one whose bomb explodes higher wins. This game is usually played by people of low intelligence, hence the name High/Low. |
Stephen Balch Joined: 20 Apr 00 Posts: 141 Credit: 13,912 RAC: 0 |
> > Migratory coconuts, my friend. And for faster processing, we might want to > employ some swallows. > An African or European swallow? "I want to go dancing on the moon, I want to frolic in zero gravity!....", and now, I might be able to go someday! Thanks, SpaceShipOne and crew! |
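
The users-versus-hosts exchange between Scott Brown and Paul D. Buck, together with Toby's remark about work-unit length, can be put into a few lines of Python. This is only a toy model, not anything taken from the BOINC server code: it assumes every host crunches nonstop and reports one result per finished work unit, and the function names, host counts, and runtimes (10 seconds, 3 hours, 3 weeks) are made-up illustrations.

```python
SECONDS_PER_DAY = 86_400


def total_hosts(users: int, hosts_per_user: int) -> int:
    """Hosts attached to the project (hypothetical helper: U * H)."""
    return users * hosts_per_user


def results_per_day(hosts: int, wu_seconds: float) -> float:
    """Results reported per day if every host crunches around the clock.

    wu_seconds is the assumed runtime of one work unit on one host.
    """
    return hosts * SECONDS_PER_DAY / wu_seconds


# Scott Brown's example: the server only sees hosts, not accounts.
farm = total_hosts(users=1, hosts_per_user=1000)    # one user, 1,000 hosts
crowd = total_hosts(users=1000, hosts_per_user=1)   # 1,000 users, one host each
idle = total_hosts(users=1000, hosts_per_user=0)    # accounts that never crunch

# Toby's point: work-unit length changes the result rate for the same hosts.
runtimes = {
    "10-second work units": 10,
    "3-hour work units": 3 * 3600,
    "3-week work units": 21 * SECONDS_PER_DAY,
}

print("hosts:", farm, crowd, idle)
for label, seconds in runtimes.items():
    print(f"{label}: {results_per_day(farm, seconds):,.1f} results/day")
```

Under these assumptions the server sees the same load from one user with 1,000 hosts as from 1,000 users with one host each, accounts that never crunch add nothing, and the work-unit runtime moves the result rate by orders of magnitude. That fits the observation in the thread that CPDN's three-week work units put far less demand on its database than work units that finish within seconds.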
©2024 University of California
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.