Bit Ceiling (Oct 08 2008)

Message boards : Technical News : Bit Ceiling (Oct 08 2008)

Richard Haselgrove Project Donor
Volunteer tester

Joined: 4 Jul 99
Posts: 14650
Credit: 200,643,578
RAC: 874
United Kingdom
Message 816319 - Posted: 9 Oct 2008, 17:31:18 UTC

P.S. Have you seen the messages about the HTTP service being down on 208.68.240.13 (one of the download servers)? They're mixed in with other routing problems in the "Users from Germany reporting bad route" thread.
Profile Jon Golding
Joined: 20 Apr 00
Posts: 105
Credit: 841,861
RAC: 0
United Kingdom
Message 816346 - Posted: 9 Oct 2008, 18:45:28 UTC

I realise that it would require a MASSIVE resource to replicate the complex server network that manages the SETI project at Berkeley.
However, on a much more limited scale, could any single server be configured to split, distribute and receive processed SETI workunits, to validate these units and send the results back to Berkeley?
If so, then a number of these servers, dotted about the globe, could form the backbone of a "distributed, distributed" SETI project.
Several SETI participants already crunch on high-specification server platforms; they might be able to help spread the work, reduce the load on the Berkeley servers, and build in redundancy in case of server problems at any one location.

Food for thought, or guaranteed indigestion?
Profile Matt Lebofsky
Volunteer moderator
Project administrator
Project developer
Project scientist
Joined: 1 Mar 99
Posts: 1444
Credit: 957,058
RAC: 0
United States
Message 816348 - Posted: 9 Oct 2008, 18:53:13 UTC - in response to Message 816346.  

This is explained in my last wordy post (earlier today in this thread). We simply cannot have any piece of our backend remotely located from others. For example, splitters need as-fast-as-possible access to (a) raw data storage, (b) workunit storage, (c) boinc database, and (d) science database. Assimilators need as-fast-as-possible access to (a) boinc db, (b) science db, (c) uploaded result storage, and (d) workunit storage. I could go on... There is no single backend server (the list including all the above and more) that can be remotely located without taking an IMPOSSIBLY HUGE hit somewhere.

- Matt

I realise that it would require a MASSIVE resource to replicate the complex server network that manages the SETI project at Berkeley.
However, on a much more limited scale, could any single server be configured to split, distribute and receive processed SETI workunits, to validate these units and send the results back to Berkeley?
If so, then a number of these servers, dotted about the globe, could form the backbone of a "distributed, distributed" SETI project.
Several SETI participants already crunch on high-specification server platforms; they might be able to help spread the work, reduce the load on the Berkeley servers, and build in redundancy in case of server problems at any one location.

Food for thought, or guaranteed indigestion?


-- BOINC/SETI@home network/web/science/development person
-- "Any idiot can have a good idea. What is hard is to do it." - Jeanne-Claude
Profile Jon Golding
Joined: 20 Apr 00
Posts: 105
Credit: 841,861
RAC: 0
United Kingdom
Message 816354 - Posted: 9 Oct 2008, 19:26:28 UTC - in response to Message 816348.  
Last modified: 9 Oct 2008, 20:04:46 UTC

I didn't mean outsourcing a single backend server function, but rather setting up a collection of individual servers, each one of which would perform ALL backend functions (albeit with a very reduced throughput of data), so that you'd have several "mini Berkeleys" in different locations.
These servers would send out batches of workunits, collect and validate the results, and only the canonical validated data would get returned to Berkeley for addition to the master science database. Because these results had already been validated off-site, no further backend processing of this data would be required at Berkeley. Admittedly, you'd have to come up with some way to ship batches of data to these mini Berkeleys for them to split.

However, I can appreciate that it would be a nightmare to come up with the code that unified all the current backend services and could be run on a single server.
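
As a concrete sketch of what that periodic hand-off of validated results might look like, assuming an invented batch format and table layout (this is not the real SETI@home science-database schema):

```python
# Hypothetical merge of off-site validated ("canonical") results into a master
# science database. The schema and the batch format are invented for illustration.
import json
import sqlite3

def merge_canonical_batch(db_path: str, batch_path: str) -> int:
    """Insert canonical results from a JSON-lines batch file, skipping duplicates."""
    con = sqlite3.connect(db_path)
    con.execute("""CREATE TABLE IF NOT EXISTS signal (
                       result_id   INTEGER PRIMARY KEY,  -- id assigned off-site
                       wu_name     TEXT,
                       peak_power  REAL,
                       detected_at TEXT)""")
    before = con.total_changes
    with open(batch_path) as f:
        for line in f:
            r = json.loads(line)
            con.execute("INSERT OR IGNORE INTO signal VALUES (?, ?, ?, ?)",
                        (r["result_id"], r["wu_name"], r["peak_power"], r["detected_at"]))
    con.commit()
    merged = con.total_changes - before
    con.close()
    return merged
```
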
1mp0£173
Volunteer tester

Joined: 3 Apr 99
Posts: 8423
Credit: 356,897
RAC: 0
United States
Message 816413 - Posted: 9 Oct 2008, 21:35:28 UTC - in response to Message 816348.  

This is explained in my last wordy post (earlier today in this thread). We simply cannot have any piece of our backend remotely located from others. For example, splitters need as-fast-as-possible access to (a) raw data storage, (b) workunit storage, (c) boinc database, and (d) science database. Assimilators need as-fast-as-possible access to (a) boinc db, (b) science db, (c) uploaded result storage, and (d) workunit storage. I could go on... There is no single backend server (the list including all the above and more) that can be remotely located without taking an IMPOSSIBLY HUGE hit somewhere.

- Matt

I realise that it would require a MASSIVE resource to replicate the complex server network that manages the SETI project at Berkeley.
However, on a much more limited scale, could any single server be configured to split, distribute and receive processed SETI workunits, to validate these units and send the results back to Berkeley?
If so, then a number of these servers, dotted about the globe, could form the backbone of a "distributed, distributed" SETI project.
Several SETI participants already crunch on high-specification server platforms; they might be able to help spread the work, reduce the load on the Berkeley servers, and build in redundancy in case of server problems at any one location.

Food for thought, or guaranteed indigestion?


I'm not saying this to contradict Matt in any way, but to perhaps state it differently.

Let's pick one example and say the download server is to be moved off-site.

Work to be downloaded still has to be split by the splitter, and stored on the download server.

Right now, the HE link is used between the download server and those of us in userland; after the change the bandwidth needed would be the same, it would just be between the lab and the off-site download server.

It's easier to move problems around than it is to solve them. Most of the mirroring suggestions fall into this category.

If there are any details that disagree with any of Matt's posts, please take Matt's statements as "correct."
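
The bandwidth side of that argument is easy to put numbers on. A tiny sketch, with the workunit size and daily volume both assumed purely for illustration:

```python
# Same bits either way: the lab's outbound link carries them whether the far end
# is a cruncher or an off-site download server. Both figures below are assumptions.
WU_MB = 0.37             # assumed workunit size
WUS_PER_DAY = 2_000_000  # assumed daily downloads

mbit_per_s = WU_MB * 8 * WUS_PER_DAY / 86_400
print(f"sustained outbound rate: {mbit_per_s:.0f} Mbit/s")  # ~69 Mbit/s with these assumptions
```
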
Profile Gary Charpentier Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 25 Dec 00
Posts: 30651
Credit: 53,134,872
RAC: 32
United States
Message 816427 - Posted: 9 Oct 2008, 21:58:10 UTC - in response to Message 816305.  

So if I've got that straight, there is HE to the outside world at PAIX. You have boxes at PAIX and a shared pipe(?) to CENIC. Then there is a shared pipe from CENIC to the campus. There are more boxes down the hill where it gets split off to a campus-owned fibre up the hill to SSL. Do I have all the hops?

I can see a pile of politics in that.

Gary

That said, our current private SETI ISP - Hurricane Electric (a.k.a "HE") - is not directly connected to anything on campus. In fact, the whole thing is a marriage of many parties. We connect to HE down at the PAIX (major exchange in Palo Alto). But to do so we needed space/access at the PAIX. This is where the generous support and donated routers (and rack space) from Bill Woodcock (of Packet Clearing House fame) comes in. He also walked me and Jeff through using IOS to get all this to connect properly. Anyway, bits go through the PAIX to CENIC, which is where they are tunneled (thanks to much help/support from central campus) over an extra 100Mbit cable up to our SETI router in our closet. That's the "simple version" of what's going on. Many parties are involved, all with varied hectic schedules and different priorities. Getting a better link from here to central campus is but one part of the puzzle.

(So yeah.. when people on campus running BOINC download a workunit from our servers, their requests go all the way down to Palo Alto first and then back to us.)

- Matt


PhonAcq

Joined: 14 Apr 01
Posts: 1656
Credit: 30,658,217
RAC: 1
United States
Message 816440 - Posted: 9 Oct 2008, 22:10:56 UTC

Maybe I'm just dense, but Jon Golding's explanation is what I've had in mind. Nobody is talking about mirroring here, which may have value from a reliability perspective, but that is not the point. The objective is performance and extensibility.
* Find a willing partner, say at NCU, with a 'small' server that contains a copy of Berkeley's setup. You set up the ground rules.
* Send a subset of raw data disks directly to NCU, with the others going to Berkeley. The number depends on capacity, etc. Say one out of ten. In this example, NCU would be responsible for all pre- and post-processing of the work stream associated with their portion of the data.
* Write the software interface required to satisfy download requests from either Berkeley or NCU. Hide the fact that there is more than one participating server site behind the setiathome domain (one possible sketch of this is shown below). Again, a sensible, extensible protocol would need to be decided upon.
* Write the software interface so that uploaded results go to the originating server (to reduce bookkeeping complexity, but probably not needed in the long run).
* Once a week/month/whatever, merge the database from NCU into Berkeley. Other solutions exist, but this is a simple way to start.
* Establish change control so that the two server systems do not get out of whack with software revisions, etc.
* Berkeley and NCU write a paper on new cutting-edge success in DT and ET, get into popular press, and persuade the idiots in Washington, or better a private corporation, to fund 'us' into the future.

But seriously, breaking the DT ice might lead to subsequent partnerships and larger horizons. And we all know that a single central system is not extensible, despite its other advantages.
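
One possible way to hide several sites behind a single setiathome domain, as mentioned in the list above, is a thin redirecting front end. This is only a sketch: the host names, the round-robin policy, and the port are all invented for illustration, and none of it is existing BOINC or SETI@home code:

```python
# Minimal front end that hides several workunit servers behind one hostname
# by issuing HTTP redirects. The site list and policy are invented for illustration.
from http.server import BaseHTTPRequestHandler, HTTPServer
from itertools import cycle

# Hypothetical participating sites ("mini Berkeleys")
SITES = cycle([
    "http://dl1.setiathome.ssl.berkeley.edu",   # Berkeley itself
    "http://dl2.setiathome.ssl.berkeley.edu",   # partner university A
    "http://dl3.setiathome.ssl.berkeley.edu",   # partner university B
])

class RedirectingFrontEnd(BaseHTTPRequestHandler):
    def do_GET(self):
        # Simple round robin; a smarter policy could use client location or load.
        target = next(SITES) + self.path
        self.send_response(302)
        self.send_header("Location", target)
        self.end_headers()

if __name__ == "__main__":
    HTTPServer(("", 8080), RedirectingFrontEnd).serve_forever()
```
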
Profile Jon Golding
Joined: 20 Apr 00
Posts: 105
Credit: 841,861
RAC: 0
United Kingdom
Message 816447 - Posted: 9 Oct 2008, 22:35:04 UTC - in response to Message 816440.  

Thanks, PhonAcq.

You've got it in one!
Sorry that my original posts didn't have the clarity and specifics that you outlined here (my first language is "gibberish").

I like the idea of keeping the individual servers within the context of various Universities (e.g. Astronomy departments). This ought to make things easier to administer, should increase confidence and transparency in how the data is being curated locally, and would open up joint funding opportunities for grant money to develop the idea further - towards dozens of such dedicated servers.

Cheers,
Jon
1mp0£173
Volunteer tester

Joined: 3 Apr 99
Posts: 8423
Credit: 356,897
RAC: 0
United States
Message 816449 - Posted: 9 Oct 2008, 22:47:19 UTC - in response to Message 816447.  

Thanks, PhonAcq.

You've got it in one!
Sorry that my original posts didn't have the clarity and specifics that you outlined here (my first language is "gibberish").

I like the idea of keeping the individual servers within the context of various Universities (e.g. Astronomy departments). This ought to make things easier to administer, should increase confidence and transparency in how the data is being curated locally, and would open up joint funding opportunities for grant money to develop the idea further - towards dozens of such dedicated servers.

Cheers,
Jon

Doing it completely, as you outlined (and PhonAcq restated), is the only practical way.

The problem I see is that it'd take resources (hardware, electricity, rack space) and staff-time at the "host" University, and unless the various Universities wanted to donate those resources, SETI@Home would have to pay for them.

... and if it is completely volunteer, SETI would have to guard against a partner site "going missing." It's happened.
Profile Matt Lebofsky
Volunteer moderator
Project administrator
Project developer
Project scientist
Joined: 1 Mar 99
Posts: 1444
Credit: 957,058
RAC: 0
United States
Message 816451 - Posted: 9 Oct 2008, 22:49:29 UTC - in response to Message 816447.  

I think Jon's idea is the most plausible approach to distributing the back-end workload. It's worth pursuing, at least academically, but my mind reels thinking about the minor issues, at least in our case, that may make such an endeavour ultimately not worth it (at the moment). The devil is in the details.

- Matt
-- BOINC/SETI@home network/web/science/development person
-- "Any idiot can have a good idea. What is hard is to do it." - Jeanne-Claude
Ianab
Volunteer tester

Joined: 11 Jun 08
Posts: 732
Credit: 20,635,586
RAC: 5
New Zealand
Message 816618 - Posted: 10 Oct 2008, 11:08:20 UTC
Last modified: 10 Oct 2008, 11:09:53 UTC

I had been thinking along the same lines as Jon, just never got around to posting about it.

One of the problems I see at the moment is that we have quite a complex back-end system with multiple servers all doing different tasks, sharing disk resources and generally either tripping over each other, or just waiting while another server gets its act together.

There are too many single points of failure that can bring the whole project to a standstill.

I see a lot of smaller BOINC projects running on single servers, handling nowhere near the amount of data that SETI puts through, but they seem relatively stable and chug along at their own pace.

I'm thinking of a 'turnkey' single-server solution (with a sensibly sized RAID array) that could easily be tuned to whatever CPU, disk space and net connection was available.

If ET is found it can be reported right away; if large amounts of data need to be returned for archiving, they can be returned by 'sneakernet' on the original data disks.

That makes it a practical solution for another University to contribute with basic donated hardware and a modest net connection.

The main advantage is that it takes away the single points of failure that seem to plague the project right now. If you can't get new jobs from the main database, your PC hits the backup servers and picks up whatever work is available.

I realise this is a pretty major redesign of the Boinc and Seti system, and it's not going to happen tomorrow, but as an academic exercise I think it's a good one.

I just think it's the next natural progression of the distributed computing experiment: distributed back-end servers.

Good luck

Ian
Profile speedimic
Volunteer tester
Joined: 28 Sep 02
Posts: 362
Credit: 16,590,653
RAC: 0
Germany
Message 816634 - Posted: 10 Oct 2008, 12:49:12 UTC

Another big problem is how to load-balance all the clients across the distributed backend servers.
For that load-balancing, the geographical location and the load of the servers have to be taken into account. The first might be done by some routing-protocol tricks (like for the DNS root servers), but the second might need a single backend coordination server...


mic.
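
A toy version of speedimic's second point: a coordination server scoring sites by rough distance plus reported load. The sites, coordinates, loads and weighting below are made up purely for illustration:

```python
# Toy backend selection combining rough geographic distance and reported load.
# Site names, coordinates, loads and weights are invented for illustration.
import math

SITES = {
    "berkeley":   {"lat": 37.9,  "lon": -122.2, "load": 0.90},
    "partner_eu": {"lat": 52.5,  "lon": 13.4,   "load": 0.30},
    "partner_nz": {"lat": -41.3, "lon": 174.8,  "load": 0.55},
}

def distance_km(lat1, lon1, lat2, lon2):
    """Great-circle distance (haversine)."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dlat, dlon = p2 - p1, math.radians(lon2 - lon1)
    a = math.sin(dlat / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlon / 2) ** 2
    return 2 * 6371 * math.asin(math.sqrt(a))

def pick_site(client_lat, client_lon, sites=SITES):
    """Lower score wins: normalised distance plus a load penalty (both roughly 0..1)."""
    def score(s):
        return distance_km(client_lat, client_lon, s["lat"], s["lon"]) / 20000 + s["load"]
    return min(sites, key=lambda name: score(sites[name]))

print(pick_site(48.1, 11.6))   # a client in Munich -> "partner_eu" with these numbers
```
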


Profile Jon Golding
Joined: 20 Apr 00
Posts: 105
Credit: 841,861
RAC: 0
United Kingdom
Message 816670 - Posted: 10 Oct 2008, 14:48:25 UTC

What’s the minimum realistic specification for such a distributed backend server?

We're talking in terms of a single machine that would hold about a month's worth of WU data at a time (is this a sensible amount?), carry out all of the existing Berkeley backend processes, albeit with a much lower throughput (how low would still be worthwhile?), and then upload the validated results to the Berkeley master science database at intervals.
If Berkeley came up with a standard specification, then a SETI funding drive might help provide the cash to buy these servers. Some university departments (at least here in the UK) offer matched funding schemes, so only part of the money would come from SETI donations. A grant proposal might fund a proof-of-principle project to write the “all in one” software, although I’m sure the folks over at lunatics.kwsn.net would relish the opportunity to create, debug and optimise such software.
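
For a rough feel of the disk such a box would need, here is a back-of-envelope calculation in which every number (workunit size, result size, share of the project) is an assumption, not a real SETI@home figure:

```python
# Rough sizing for a hypothetical mini back-end holding a month of work.
# All of the figures below are assumptions for illustration only.
WU_MB = 0.37               # assumed workunit size
RESULT_MB = 0.03           # assumed returned result size
WUS_PER_DAY = 100_000      # assumed share of the project handled by this one site
DAYS = 30

wu_store_gb = WU_MB * WUS_PER_DAY * DAYS / 1024
result_store_gb = RESULT_MB * WUS_PER_DAY * 2 * DAYS / 1024   # ~2 results per WU
print(f"workunit storage : {wu_store_gb:,.0f} GB")   # ~1,084 GB
print(f"result storage   : {result_store_gb:,.0f} GB")  # ~176 GB
# i.e. around a terabyte of disk before the databases -- a sensible RAID array,
# not exotic hardware, under these assumptions.
```
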


1mp0£173
Volunteer tester

Joined: 3 Apr 99
Posts: 8423
Credit: 356,897
RAC: 0
United States
Message 816686 - Posted: 10 Oct 2008, 15:56:32 UTC - in response to Message 816618.  
Last modified: 10 Oct 2008, 15:57:05 UTC

I realise this is a pretty major redesign of the Boinc and Seti system, and it's not going to happen tomorrow, but as an academic exercise I think it's a good one.

Actually, it's not. The BOINC model allows more than one upload server, more than one download server, and I'm pretty sure it allows more than one scheduler.

... and at least one project (Climate) has servers distributed around the planet.

I don't follow "Climate", but I think at one point one of their partners basically disappeared. Work units are "tagged" with an upload URL, and the server name disappeared from DNS. (Note: this was years ago.) If SETI distributed servers "out there", it'd be good to keep them all in the setiathome.ssl.berkeley.edu namespace. Easy to do.

As Matt said, the problems are not insurmountable, and mostly, they aren't technical, they're organizational.

I think the biggest problem is that we don't always look at BOINC as a system that includes the BOINC client. If the SETI servers are down, the BOINC client just waits until they're back, and as long as they don't run out of work, they're crunching away -- and work is returned and credit granted when the servers come back up.
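
That "the client just waits" behaviour can be modelled in a few lines. The sketch below is a simplified illustration of the idea (exponential backoff against a list of scheduler URLs), not actual BOINC client code, and the URLs are hypothetical:

```python
# Simplified model of a client that keeps crunching cached work and retries
# its schedulers with exponential backoff. Not actual BOINC code.
import random
import time
import urllib.request

SCHEDULERS = [
    "http://setiathome.ssl.berkeley.edu/sah_cgi/cgi",    # hypothetical URLs
    "http://partner1.setiathome.ssl.berkeley.edu/cgi",
]

def request_work(max_backoff_s: float = 3600.0) -> str:
    """Try each scheduler in turn; back off exponentially while all are down."""
    backoff = 60.0
    while True:
        for url in random.sample(SCHEDULERS, len(SCHEDULERS)):
            try:
                with urllib.request.urlopen(url, timeout=30) as resp:
                    return resp.read().decode()       # got a reply: new work
            except OSError:
                continue                              # that scheduler is unreachable
        # All schedulers down: keep working on cached tasks, retry later.
        time.sleep(backoff + random.uniform(0, 30))
        backoff = min(backoff * 2, max_backoff_s)
```
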
Christoph
Volunteer tester

Joined: 21 Apr 03
Posts: 76
Credit: 355,173
RAC: 0
Germany
Message 816692 - Posted: 10 Oct 2008, 16:08:14 UTC - in response to Message 816670.  
Last modified: 10 Oct 2008, 16:10:14 UTC

@ Jon Golding

Are you aware of the fact that the current funding drive for THIS YEAR is hardly taking in enough money to cover basic operational costs? I had a machine to send over to Berkeley, but unfortunately the MoBo broke down. Without hardware donations there is no upgrade/extension possibility this year, because there is no money to spare for that.

Happy crunching, Christoph
Christoph
Profile Jon Golding
Joined: 20 Apr 00
Posts: 105
Credit: 841,861
RAC: 0
United Kingdom
Message 816701 - Posted: 10 Oct 2008, 16:37:11 UTC - in response to Message 816692.  

Hi Christoph

Yes, I'm well aware of the shoestring budget that SETI has to work with and the heroic efforts of all the staff there to keep the project running and to expand its capabilities.
I'm just trying to stimulate a discussion to consider how the existing pressures on the Berkeley servers might be reduced.
Turning any of those ideas into practical solutions is another problem entirely.
Christoph
Volunteer tester

Joined: 21 Apr 03
Posts: 76
Credit: 355,173
RAC: 0
Germany
Message 816723 - Posted: 10 Oct 2008, 17:11:43 UTC - in response to Message 816701.  

I'm just trying to stimulate a discussion to consider how the existing pressures on the Berkeley servers might be reduced.
Turning any of those ideas into practical solutions is another problem entirely.


Ok, this is of course a good point. Ideas are always good and needed.
Mine is actually to find a mobo on eBay which can host two 3.2 GHz Xeons with PC-3200 registered DDR400 memory, and which supports 8 GB of memory or more. That's what I need to give the old but still good parts a new home. In that case SETI will have an additional server.
Maybe another one can be cobbled together the same way (eBay) for 100 € excluding memory. The memory depends on the board: DDR1 or DDR2?

Keep thinking, Christoph
Christoph
Speedy
Volunteer tester
Joined: 26 Jun 04
Posts: 1643
Credit: 12,921,799
RAC: 89
New Zealand
Message 817180 - Posted: 11 Oct 2008, 20:48:35 UTC

We are paying a tiny fraction for the current 1Gbit connection compared to what we were paying for the previous 100Mbit connection

If the download/upload servers are on a 1Gbit connection & it's maxing itself out at around 95mb/s, what's stopping the lab using the other 905mb/s to transfer the raw data from Arecibo to storage in the lab? If this is possible it would mean that while raw data is being transferred to storage, the crunchers will be able to upload & download work faster. Just my thoughts.
Speedy
kittyman Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 9 Jul 00
Posts: 51468
Credit: 1,018,363,574
RAC: 1,004
United States
Message 817181 - Posted: 11 Oct 2008, 20:53:11 UTC

So what is it gonna take to get this fixed???
A new server? 2 new servers? 5? 10?

I have asked before, and was answered by Eric, that simply upgrading hard drives was not possible.......

Do we need new servers, or just more online hard drive space?

Storage should be easy enough to rig up if that is the answer to our woes.........

Dammit Scotty, we need more power............
"Freedom is just Chaos, with better lighting." Alan Dean Foster

Profile ML1
Volunteer moderator
Volunteer tester

Joined: 25 Nov 01
Posts: 20289
Credit: 7,508,002
RAC: 20
United Kingdom
Message 817203 - Posted: 11 Oct 2008, 21:36:38 UTC - in response to Message 817180.  
Last modified: 11 Oct 2008, 21:37:51 UTC

We are paying a tiny fraction for the current 1Gbit connection compared to what we were paying for the previous 100Mbit connection

If the download/upload servers are on a 1Gbit connection & it's maxing itself out at around 95mb/s, what's stopping the lab using the other 905mb/s...

Well... I guess you actually mean 905 Mbit/s... ;-)


As described earlier, the cable from the SSL (the s@h servers' physical location) "down the hill" to the comms rack in another building is just a 100 Mbit/s link. The 1 Gbit/s link is further down the line from there. With TCP and overheads, you can expect 95 Mbit/s max out of that.
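
The ~95 Mbit/s ceiling follows from framing and protocol overhead alone. A rough calculation, assuming standard 1500-byte Ethernet frames and plain TCP/IPv4 headers:

```python
# Why a "100 Mbit/s" link tops out near 95 Mbit/s of useful payload.
LINE_RATE_MBIT = 100.0
MTU = 1500                      # bytes of IP packet per Ethernet frame
ETH_OVERHEAD = 14 + 4 + 8 + 12  # header + FCS + preamble + inter-frame gap, bytes
IP_TCP_HEADERS = 20 + 20        # plain IPv4 + TCP, no options

payload = MTU - IP_TCP_HEADERS
frame_on_wire = MTU + ETH_OVERHEAD
goodput = LINE_RATE_MBIT * payload / frame_on_wire
print(f"max TCP payload rate: {goodput:.1f} Mbit/s")   # ~94.9 Mbit/s
```
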

And upgrading the 100 Mbit/s segment runs into insurmountable bureaucracy and costs.


Good suggestions I've seen are for a volunteer effort to upgrade the fibre-optic nodes or to install a microwave link. But is that workable for Berkeley?

Happy crunchin',
Martin
See new freedom: Mageia Linux
Take a look for yourself: Linux Format
The Future is what We all make IT (GPLv3)