Message boards :
Technical News :
Bit Ceiling (Oct 08 2008)
Richard Haselgrove Send message Joined: 4 Jul 99 Posts: 14650 Credit: 200,643,578 RAC: 874 |
P.S. Have you seen the messages about the HTTP service being down on 208.68.240.13 (one of the download servers)? It's muddled in with other routing problems in the "Users from Germany reporting bad route" thread. |
Jon Golding Send message Joined: 20 Apr 00 Posts: 105 Credit: 841,861 RAC: 0 |
I realise that it would require a MASSIVE resource to replicate the complex server network that manages the SETI project at Berkeley. However, on a much more limited scale, could any single server be configured to split, distribute and receive processed SETI workunits, to validate these units and send the results back to Berkeley? If so, then a number of these servers, dotted about the globe, could form the backbone of a "distributed, distributed" SETI project. Several SETI participants already crunch on high-specification server platforms; they might be able to help spread the work, reduce the load on the Berkeley servers, and build in redundancy in case of server problems at any one location. Food for thought, or guaranteed indigestion? |
Matt Lebofsky Send message Joined: 1 Mar 99 Posts: 1444 Credit: 957,058 RAC: 0 |
I realise that it would require a MASSIVE resource to replicate the complex server network that manages the SETI project at Berkeley. This is explained in my last wordy post (earlier today in this thread). We simply cannot have any piece of our backend remotely located from the others. For example, splitters need as-fast-as-possible access to (a) raw data storage, (b) workunit storage, (c) the boinc database, and (d) the science database. Assimilators need as-fast-as-possible access to (a) the boinc db, (b) the science db, (c) uploaded result storage, and (d) workunit storage. I could go on... There is no single backend server (the list including all the above and more) that can be remotely located without taking an IMPOSSIBLY HUGE hit somewhere. - Matt -- BOINC/SETI@home network/web/science/development person -- "Any idiot can have a good idea. What is hard is to do it." - Jeanne-Claude |
Jon Golding Send message Joined: 20 Apr 00 Posts: 105 Credit: 841,861 RAC: 0 |
I didn't mean outsourcing a single backend server function, but instead a collection of individual servers, each one of which would perform ALL backend functions (albeit with a much reduced throughput of data), so that you'd have several "mini-Berkeleys" in different locations. These servers would send out batches of workunits, collect and validate the results, and only the canonical validated data would get returned to Berkeley for addition to the master science database. Because these results had already been validated off-site, no further backend processing of this data would be required at Berkeley. Admittedly, you'd have to come up with some way to ship batches of data to these mini-Berkeleys for them to split. However, I can appreciate that it would be a nightmare to come up with the code that unified all the current backend services and could be run on a single server. |
1mp0£173 Send message Joined: 3 Apr 99 Posts: 8423 Credit: 356,897 RAC: 0 |
This is explained in my last wordy post (earlier today in this thread). We simply cannot have any piece of our backend remotely located from others. For example, splitters need as-fast-as-possible access to (a) raw data storage, (b) workunit storage, (c) boinc database, and (d) science database. Assimilators need as-fast-as-possible access to (a) boinc db, (b) science db, (c) uploaded result storage, and (d) workunit storage. I could go on... There is no single backend server (the list including all the above and more) that can be remotely located without taking an IMPOSSIBLY HUGE hit somewhere. I'm not saying this to contradict Matt in any way, but to perhaps state it differently. Let's pick one example, and say the download server is to be moved off-site. Work to be downloaded still has to be split by the splitter, and stored on the download server. Right now, the HE link is used between the download server and those of us in userland; after the change the bandwidth would be the same, it'd just be between the lab and the download server. It's easier to move problems around than it is to solve them. Most of the mirroring suggestions fall into this category. If there are any details that disagree with any of Matt's posts, please take Matt's statements as "correct." |
Gary Charpentier Send message Joined: 25 Dec 00 Posts: 30651 Credit: 53,134,872 RAC: 32 |
That said, our current private SETI ISP - Hurricane Electric (a.k.a. "HE") - is not directly connected to anything on campus. In fact, the whole thing is a marriage of many parties. We connect to HE down at the PAIX (major exchange in Palo Alto). But to do so we needed space/access at the PAIX. This is where the generous support and donated routers (and rack space) from Bill Woodcock (of Packet Clearing House fame) comes in. He also walked me and Jeff through using IOS to get all this to connect properly. Anyway, bits go through the PAIX to CENIC, which is where they are tunneled (thanks to much help/support from central campus) over an extra 100Mbit cable up to our SETI router in our closet. That's the "simple version" of what's going on. Many parties are involved, all with varied hectic schedules and different priorities. Getting a better link from here to central campus is but one part of the puzzle. So if I've got that straight, there is HE to the outside world at the PAIX. You have boxes at the PAIX and a shared pipe(?) to CENIC. Then there is a shared pipe from CENIC to the campus. There are more boxes down the hill, where it gets split off to a campus-owned fibre up the hill to SSL. Do I have all the hops? I can see a pile of politics in that. Gary |
PhonAcq Send message Joined: 14 Apr 01 Posts: 1656 Credit: 30,658,217 RAC: 1 |
Maybe I'm just dense, but Jon Golding's explanation is what I've had in mind. Nobody is talking about mirroring here, which may have value from a reliability perspective, but that is not the point. The objective is performance and extensibility. * Find a willing partner, say at NCU, with a 'small' server that contains a copy of Berkeley's setup. You set up the ground rules. * Send a subset of raw data disks directly to NCU, with the others going to Berkeley. The number depends on capacity, etc. Say one out of ten. In this example, NCU would be responsible for all pre- and post-processing of the work stream associated with their portion of the data. * Write the software interface required to satisfy download requests from either Berkeley or NCU. Hide the fact that there is more than one participating server site behind the setiathome domain. Again, a sensible, extensible protocol would need to be decided upon. * Write the software interface so that uploaded results go to the originating server (to reduce book-keeping complexity, but probably not needed in the long run). * Once a week/month/whatever, merge the database from NCU into Berkeley. Other solutions exist, but this is a simple way to start. * Establish change control so that the two server systems do not get out of whack with software revisions, etc. * Berkeley and NCU write a paper on new cutting-edge success in DT and ET, get into the popular press, and persuade the idiots in Washington, or better a private corporation, to fund 'us' into the future. But seriously, breaking the DT ice might lead to subsequent partnerships and larger horizons. And we all know that a single central system is not extensible, despite its other advantages. |
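A minimal sketch of the routing step above, assuming the one-in-ten split PhonAcq suggests. The site names and the tape-numbering scheme are invented for illustration; nothing like this exists in the real SETI@home setup.

```python
# Hypothetical routing sketch: hide several participating sites behind
# one domain and send each request to the site that "owns" that slice
# of the raw data. Site hostnames here are made up for illustration.

SITES = {
    "berkeley": "boinc2.ssl.berkeley.edu",
    "ncu": "seti-mirror.example-ncu.edu",  # hypothetical partner site
}

def site_for_tape(tape_id: int) -> str:
    """Send one raw-data tape in ten to the partner site (PhonAcq's
    'say one out of ten'); everything else stays at Berkeley."""
    return "ncu" if tape_id % 10 == 0 else "berkeley"

def download_host(tape_id: int) -> str:
    """Resolve a tape to the host a client should download from."""
    return SITES[site_for_tape(tape_id)]
```

Because the split is a pure function of the tape number, both sites (and the merge step) can agree on ownership without any extra coordination traffic.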
Jon Golding Send message Joined: 20 Apr 00 Posts: 105 Credit: 841,861 RAC: 0 |
Thanks, PhonAcq. You've got it in one! Sorry that my original posts didn't have the clarity and specifics that you outlined here (my first language is "gibberish"). I like the idea of keeping the individual servers within the context of various Universities (e.g. Astronomy departments). This ought to make things easier to administer, should increase confidence and transparency in how the data is being curated locally, and would open up joint funding opportunities for grant money to develop the idea further - towards dozens of such dedicated servers. Cheers, Jon |
1mp0£173 Send message Joined: 3 Apr 99 Posts: 8423 Credit: 356,897 RAC: 0 |
Thanks, PhonAcq. Doing it completely, as you outlined (and PhonAcq restated), is the only practical way. The problem I see is that it'd take resources (hardware, electricity, rack space) and staff-time at the "host" University, and unless the various Universities wanted to donate those resources, SETI@Home would have to pay for them. ... and if it is completely volunteer, SETI would have to guard against a partner site "going missing." It's happened. |
Matt Lebofsky Send message Joined: 1 Mar 99 Posts: 1444 Credit: 957,058 RAC: 0 |
I think Jon's idea is the most feasible solution to distributing the back-end workload. It's worth pursuing, at least academically, but my mind reels thinking about the minor issues, at least in our case, that may make such an endeavour ultimately not worth it (at the moment). The devil is in the details. - Matt -- BOINC/SETI@home network/web/science/development person -- "Any idiot can have a good idea. What is hard is to do it." - Jeanne-Claude |
Ianab Send message Joined: 11 Jun 08 Posts: 732 Credit: 20,635,586 RAC: 5 |
I had been thinking along the same lines as Jon, just never got around to posting about it. One of the problems I see at the moment is that we have quite a complex back-end system with multiple servers all doing different tasks, sharing disk resources and generally either tripping over each other, or just waiting while another server gets its act together. There are too many single points of failure that bring the whole project to a standstill. I see a lot of smaller BOINC projects running on single servers, nowhere near the amount of data that Seti puts through, but they seem relatively stable and chug along at their own pace. I'm thinking of a 'turnkey' single-server solution (with a sensibly sized RAID array) that could easily be tuned to the CPU, disk space and net connection that was available. If ET is found it can be reported right away; if large amounts of data need to be returned for archiving, they can be returned 'Sneakernet' on the original data disks. That makes it a practical solution for another university to contribute with basic donated hardware and a modest net connection. The main advantage is that it takes away the single points of failure that seem to plague the project right now. If you can't get new jobs from the main database, your PC hits the backup servers and picks up whatever work is available. I realise this is a pretty major redesign of the Boinc and Seti system, and it's not going to happen tomorrow, but as an academic exercise I think it's a good one. I just think it's the next natural progression of the distributed computing experiment - distributed back-end servers. Good luck Ian |
speedimic Send message Joined: 28 Sep 02 Posts: 362 Credit: 16,590,653 RAC: 0 |
Another big problem is how to load-balance all the clients across the distributed backend servers. For that load balancing, the geographical location and the load of the servers both have to be taken into account. The first might be done by some routing-protocol tricks (like for the DNS root servers), but the second might need a single backend coordination server... mic. |
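The two-factor choice speedimic describes (prefer a nearby backend, but avoid an overloaded one) can be sketched as a simple scoring function. All server names, coordinates and load figures below are made up; a real deployment would use anycast routing or a coordination server, as the post says.

```python
# Illustrative only: pick the backend with the best combined
# distance-plus-load score. Servers and loads are invented.
import math

SERVERS = [
    {"name": "berkeley",  "lat": 37.87, "lon": -122.26, "load": 0.95},
    {"name": "eu-mirror", "lat": 52.52, "lon": 13.40,  "load": 0.30},
]

def score(server, client_lat, client_lon):
    # Crude flat-earth distance proxy, normalised to roughly 0..1,
    # plus the server's load fraction as a penalty.
    dist = math.hypot(server["lat"] - client_lat,
                      server["lon"] - client_lon)
    return dist / 180.0 + server["load"]

def pick_server(client_lat, client_lon):
    """Return the server with the lowest combined score."""
    return min(SERVERS, key=lambda s: score(s, client_lat, client_lon))
```

With these numbers, a client in Germany lands on the European mirror despite Berkeley being the "home" site, while a Californian client still picks Berkeley even at high load, because the distance term dominates.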
Jon Golding Send message Joined: 20 Apr 00 Posts: 105 Credit: 841,861 RAC: 0 |
What’s the minimum realistic specification for such a distributed backend server? We’re talking in terms of a single machine that would hold about a month’s worth of WU data at a time (is this a sensible amount?) and carry out all of the existing Berkeley backend processes, albeit with a much lower throughput (how low to be worthwhile?), and then upload the validated results to the Berkeley master science database at intervals. If Berkeley came up with a standard specification, then a SETI funding drive might help provide the cash to buy these servers. Some university departments (at least here in the UK) offer matched funding schemes, so only part of the money would come from SETI donations. A grant proposal might fund a proof-of-principle project to write the “all in one” software, although I’m sure the folks over at lunatics.kwsn.net would relish the opportunity to create, debug and optimise such software. |
1mp0£173 Send message Joined: 3 Apr 99 Posts: 8423 Credit: 356,897 RAC: 0 |
I realise this is a pretty major redesign of the Boinc and Seti system, and it's not going to happen tomorrow, but as an academic exercise I think it's a good one. Actually, it's not. The BOINC model allows more than one upload server, more than one download server, and I'm pretty sure it allows more than one scheduler. ... and at least one project (Climate) has servers distributed around the planet. I don't follow "Climate" but I think at one point one of their partners basically disappeared. Work units are "tagged" with an upload URL, and the server name disappeared from DNS. (Note: this was years ago) If SETI distributed servers "out there" it'd be good for them to keep them all in the setiathome.ssl.berkeley.edu namespace. Easy to do. As Matt said, the problems are not insurmountable, and mostly, they aren't technical, they're organizational. I think the biggest problem is that we don't always look at BOINC as a system that includes the BOINC client. If the SETI servers are down, the BOINC client just waits until they're back, and as long as they don't run out of work, they're crunching away -- and work is returned and credit granted when the servers come back up. |
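The client-side behaviour described above (work units tagged with an upload URL, and the client simply deferring and retrying when no server answers) might be sketched like this. The host list and the `try_upload` stub are purely illustrative stand-ins, not the real BOINC client code.

```python
# Sketch of upload fallback: try each known upload server in turn,
# and defer the result for a later retry if none respond, rather
# than failing. try_upload is a stand-in for a real HTTP attempt.

def try_upload(host: str, result: str) -> bool:
    """Stand-in for an HTTP upload attempt. For this sketch we
    pretend only hosts in the berkeley.edu namespace are up."""
    return host.endswith("berkeley.edu")

def upload_result(result: str, hosts: list[str]) -> str:
    """Walk the upload-server list; defer if every attempt fails."""
    for host in hosts:
        if try_upload(host, result):
            return f"uploaded to {host}"
    return "deferred: will retry with backoff"
```

This is why keeping distributed servers inside the setiathome.ssl.berkeley.edu namespace matters: the URL baked into each work unit stays valid even if the machine behind it moves, since only the DNS record has to change.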
Christoph Send message Joined: 21 Apr 03 Posts: 76 Credit: 355,173 RAC: 0 |
@ Jon Golding Are you aware of the fact that the current funding drive for THIS YEAR is barely taking in the money to cover basic operational costs? I had a machine to send over to Berkeley; unfortunately the MoBo broke down. Without hardware donations there is no upgrade/extension possibility this year, because there is no money to spare for that. Happy crunching, Christoph |
Jon Golding Send message Joined: 20 Apr 00 Posts: 105 Credit: 841,861 RAC: 0 |
Hi Christoph Yes, I'm well aware of the shoestring budget that SETI has to work with and the heroic efforts of all the staff there to keep the project running and to expand its capabilities. I'm just trying to stimulate a discussion to consider how the existing pressures on the Berkeley servers might be reduced. Turning any of those ideas into practical solutions is another problem entirely. |
Christoph Send message Joined: 21 Apr 03 Posts: 76 Credit: 355,173 RAC: 0 |
I'm just trying to stimulate a discussion to consider how the existing pressures on the Berkeley servers might be reduced. OK, this is of course a good point. Ideas are always good and needed. Mine is actually to find a mobo on eBay which can host two 3.2 GHz Xeons and PC-3200 registered DDR400 memory, and support 8 GB of memory or more. That's what I need to give the old but still good parts a new home. In that case SETI will have an additional server. Maybe over the same way (eBay) another can be cobbled together for 100 € excluding memory. Memory depends on the board: DDR1 or DDR2? Keep thinking, Christoph |
Speedy Send message Joined: 26 Jun 04 Posts: 1643 Credit: 12,921,799 RAC: 89 |
We are paying a tiny fraction for the current 1Gbit than we were paying for the previous 100Mbit connection If the download/upload servers are on a 1 Gbit connection and it's maxing itself out at around 95 Mbit/s, what's stopping the lab using the other 905 Mbit/s to transfer the raw data from Arecibo to storage in the lab? If this is possible, it would mean that while raw data is being transferred to storage, the crunchers would be able to upload and download work faster. Just my thoughts. Speedy |
kittyman Send message Joined: 9 Jul 00 Posts: 51468 Credit: 1,018,363,574 RAC: 1,004 |
So what is it gonna take to get this fixed??? A new server? 2 new servers? 5? 10? I have asked before, and was answered by Eric, that simply upgrading hard drives was not possible....... Do we need new servers, or just more online hard drive space? Storage should be easy enough to rig up if that is the answer to our woes......... Dammit Scotty, we need more power............ "Freedom is just Chaos, with better lighting." Alan Dean Foster |
ML1 Send message Joined: 25 Nov 01 Posts: 20289 Credit: 7,508,002 RAC: 20 |
We are paying a tiny fraction for the current 1Gbit than we were paying for the previous 100Mbit connection Well... I guess you actually mean 905 Mbit/s... ;-) As described earlier, the cable from the SSL (the s@h servers' physical location) "down the hill" to the comms rack in another building is just a 100 Mbit/s link. The 1 Gbit/s link is further down the line from there. With TCP and other overheads, you can expect 95 Mbit/s max out of that. And there is insurmountable bureaucracy/cost in upgrading the 100 Mbit/s segment. Good suggestions I've seen are a volunteer effort to upgrade the fibre optic nodes, or installing a microwave link. But is that workable for Berkeley? Happy crunchin', Martin See new freedom: Mageia Linux Take a look for yourself: Linux Format The Future is what We all make IT (GPLv3) |
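The rough arithmetic behind the "95 Mbit/s out of 100" figure: Ethernet framing plus TCP/IP headers eat a few percent of every packet. The numbers below are standard textbook values for full-size frames, not measurements of the actual Berkeley link.

```python
# Back-of-envelope goodput on a 100 Mbit/s Ethernet link,
# assuming full 1500-byte frames and no TCP/IP header options.

LINK_MBIT = 100
MTU = 1500          # bytes of IP packet per Ethernet frame
TCP_IP_HDRS = 40    # 20 bytes TCP + 20 bytes IPv4, no options
ETH_OVERHEAD = 38   # preamble + header + FCS + inter-frame gap

payload = MTU - TCP_IP_HDRS          # 1460 bytes of actual data
wire = MTU + ETH_OVERHEAD            # 1538 bytes on the wire
goodput = LINK_MBIT * payload / wire  # roughly 94.9 Mbit/s
```

So even a perfectly clean link tops out just under 95 Mbit/s of useful data, before counting retransmissions or protocol chatter, which matches what the graphs show when the pipe is saturated.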
©2024 University of California
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.