Message boards :
Number crunching :
Couple of Thoughts on Network Bandwidth
Author | Message |
---|---|
MikeSW17 Send message Joined: 3 Apr 99 Posts: 1603 Credit: 2,700,523 RAC: 0 |
Given the recent outages and network problems, a couple of thoughts come to mind - I could be wrong, but this is my interpretation of what I've observed... 1) When the network is heavily loaded, uploads often connect, transfer 80% of the data (say 9K of 12K), and then time out and fail. Isn't this a terrible waste of bandwidth? Wouldn't it be better not to even begin a transfer if there isn't a very good chance it will complete? These partial transfers must also waste server CPU cycles. Presumably it is possible to limit the number of connections the server will accept? (It might also help dial-up users if they got an immediate 'too busy' and disconnected, rather than transmitting 9K and then timing out.) 2) SETI data transfers seem to be uncompressed. Downloads seem to count to ~360KB, uploads around 10-12KB. ZIPped, the 360K download reduces to around 270K (a 25% saving) and uploads reduce from around 12K to 2.5K (a 75% saving). Some other BOINC projects seem to download ZIP files, so the mechanism is inherently present? (Again, this could help dial-up users by downloading 25% less data.) |
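A quick way to sanity-check the claimed savings (a Python sketch, purely illustrative - the payload below is a made-up stand-in for a result file, not actual SETI data):

```python
import zlib

def compression_ratio(data: bytes, level: int = 9) -> float:
    """Compressed size as a fraction of the original size."""
    return len(zlib.compress(data, level)) / len(data)

# Hypothetical stand-in for an upload: repetitive, text-heavy content.
sample = b"<result><signal peak='1.234' power='5.678'/></result>\n" * 200

# Text-heavy scientific output typically compresses to a fraction of its
# original size, which is the kind of saving described above.
print(f"sample compresses to {compression_ratio(sample):.0%} of original")
```

Real work-unit files will compress less well than this repetitive sample, but the mechanism is the same.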
Josh Linscott Send message Joined: 10 Sep 03 Posts: 121 Credit: 32,250 RAC: 0 |
I like the compressed files idea! |
Murasaki Send message Joined: 22 Jul 03 Posts: 702 Credit: 62,902 RAC: 0 |
I just tried zip compressing and got the same results. Maybe compression could be worked into the next BOINC release. There'd probably have to be some extra handshaking involved so the servers wouldn't try to send compressed work units to users running previous versions of the software. |
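The extra handshaking could look like ordinary HTTP content negotiation - a minimal sketch (the function name is hypothetical; this is not actual BOINC code):

```python
def choose_encoding(request_headers: dict) -> str:
    """Send compressed data only to clients that advertise support, so
    older clients that never send Accept-Encoding still get plain data."""
    accepted = request_headers.get("Accept-Encoding", "")
    return "gzip" if "gzip" in accepted else "identity"

# A new client opts in; an old client is served uncompressed data.
assert choose_encoding({"Accept-Encoding": "gzip, deflate"}) == "gzip"
assert choose_encoding({}) == "identity"
```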
1mp0£173 Send message Joined: 3 Apr 99 Posts: 8423 Credit: 356,897 RAC: 0 |
> 1) When the network is heavily loaded, uploads often connect, transfer 80% of > the data (say 9K of 12K) and then time-out and fail. Isn't this a terrible > waste of bandwidth? This is a catch-22, the BOINC client can't tell there is congestion until the upload fails, and by then the bandwidth is already wasted. |
MikeSW17 Send message Joined: 3 Apr 99 Posts: 1603 Credit: 2,700,523 RAC: 0 |
> > 1) When the network is heavily loaded, uploads often connect, transfer 80% of the data (say 9K of 12K) and then time-out and fail. Isn't this a terrible waste of bandwidth? > This is a catch-22, the BOINC client can't tell there is congestion until the upload fails, and by then the bandwidth is already wasted. I understand that the client has no way to tell. But the server could dynamically adjust the number of connections it will accept based on CPU and IO load, such that it maintains sufficient reserve capacity to always complete what it starts? I run a simple FTP server on one system, and I know it has a 'Maximum Simultaneous Connections' option - I've never reached it (200), so I don't know exactly what it does, but presumably it either won't respond at all to the 201st connection or won't permit the login. |
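The 'maximum simultaneous connections' idea can be sketched as a simple admission gate (Python, illustrative only - a real server would wire this into its accept loop):

```python
import threading

class ConnectionGate:
    """Admit at most max_conns concurrent transfers; refuse the rest
    immediately rather than letting them time out mid-transfer."""

    def __init__(self, max_conns: int):
        self._slots = threading.BoundedSemaphore(max_conns)

    def try_admit(self) -> bool:
        # Non-blocking: a full server answers 'too busy' right away.
        return self._slots.acquire(blocking=False)

    def release(self) -> None:
        self._slots.release()

gate = ConnectionGate(2)       # tiny cap for illustration
assert gate.try_admit()        # 1st client admitted
assert gate.try_admit()        # 2nd client admitted
assert not gate.try_admit()    # 3rd gets an immediate refusal
gate.release()                 # a transfer finishes...
assert gate.try_admit()        # ...and the freed slot admits the next client
```

The point is the immediate refusal: a dial-up client learns in one round trip that the server is busy, instead of pushing 9K of data into a doomed transfer.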
John McLeod VII Send message Joined: 15 Jul 99 Posts: 24806 Credit: 790,712 RAC: 0 |
Even if the connection fails, it would be extremely nice if the transfer picked up where it left off instead of starting over at the beginning. BOINC WIKI |
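Picking up where a transfer left off is exactly what HTTP Range requests provide, assuming the server supports byte ranges (Accept-Ranges: bytes). A hypothetical client-side sketch:

```python
import os
import tempfile

def resume_header(partial_path: str) -> dict:
    """Build the HTTP Range header that asks the server for only the
    bytes not yet on disk, so a failed download restarts mid-file."""
    try:
        done = os.path.getsize(partial_path)
    except OSError:
        done = 0
    return {"Range": f"bytes={done}-"} if done else {}

# Simulate 9K of a 12K transfer already completed before the failure.
with tempfile.NamedTemporaryFile(delete=False) as f:
    f.write(b"\0" * 9216)
assert resume_header(f.name) == {"Range": "bytes=9216-"}
os.unlink(f.name)
```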
Prognatus Send message Joined: 6 Jul 99 Posts: 1600 Credit: 391,546 RAC: 0 |
Maybe Galileo and Kryten could switch places (tasks)? From what is on the Server status page, it seems Galileo is a more powerful server, and there are two other splitters already. (Staff say one splitter is usually enough anyway.) Kryten, OTOH, is alone with all the scheduling, uploading and downloading tasks. For reducing bottlenecks on client loads, it seems logical to have the most powerful server there. |
John McLeod VII Send message Joined: 15 Jul 99 Posts: 24806 Credit: 790,712 RAC: 0 |
> Maybe Galileo and Kryten could switch places (tasks)? From what is on the Server status page (http://setiweb.ssl.berkeley.edu/sah_status.html), it seems Galileo is a more powerful server and there are two other splitters already. (Staff say one splitter is usually enough anyway.) Kryten, OTOH, is alone with all the scheduling, uploading and downloading tasks. For reducing bottlenecks on client loads, it seems logical to have the most powerful server there. Quite a bit of the bottleneck is the DB access. BOINC WIKI |
Grant (SSSF) Send message Joined: 19 Aug 99 Posts: 13736 Credit: 208,696,464 RAC: 304 |
> Quite a bit of the bottleneck is the DB access. Looking at the router info, the bottleneck is the 100Mb connection. It's pretty much fully utilised, so no matter how fast the servers are, they still won't be able to get the data in or out of an already full pipe. Grant Darwin NT |
1mp0£173 Send message Joined: 3 Apr 99 Posts: 8423 Credit: 356,897 RAC: 0 |
> > This is a catch-22, the BOINC client can't tell there is congestion until the upload fails, and by then the bandwidth is already wasted. > I understand that the client has no way to tell. But the server could dynamically adjust the number of connections it will accept based on CPU and IO load such that it maintains sufficient reserve capacity to always complete what it starts? The problem is: even if the server is getting more TCP SYN packets than it can accept, all of those SYN packets hitting the server at once are a big part of the problem. The repeated attempts to connect use up bandwidth, to the point that the connections that are successful can't get enough bandwidth to finish their transfers. The fix is some sort of mechanism to slow down the connection rate at the client after a long outage: instead of backing off between 1 minute and 3 hours, a way to set the back-off on a failed connect to between 1 minute and 30 hours (for example). That'd slow the incoming connections by a factor of 10; more connections would complete on the first try, bandwidth would be available for successful transfers, those transfers wouldn't need to retry, and so on. This is one of those problems that is easy to move around, but harder to solve. |
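The suggested change amounts to widening the randomized back-off window by a factor of ten. A sketch of that policy (illustrative, not actual BOINC client code):

```python
import random

def backoff_seconds(attempt: int, cap_hours: float = 30.0) -> float:
    """Randomized exponential back-off for a failed connect: the window
    doubles each retry from a 1-minute base, capped at cap_hours
    (30h here, i.e. ten times a 3-hour ceiling)."""
    base, cap = 60.0, cap_hours * 3600.0
    upper = min(cap, base * (2 ** attempt))
    return random.uniform(base, upper)

# The first retry waits exactly one minute; later retries spread out
# across an ever-wider window, thinning the post-outage connection storm.
delays = [backoff_seconds(a) for a in range(10)]
assert all(60.0 <= d <= 30 * 3600 for d in delays)
```

The randomization is the important part: after an outage, it stops every client from reconnecting at the same instant.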
Matt Lebofsky Send message Joined: 1 Mar 99 Posts: 1444 Credit: 957,058 RAC: 0 |
> Looking at the router info the bottle neck is the 100Mb connection. It's > pretty much fully utilised so no matter how fast the servers are, they still > won't be able to get the data in or out of an already full pipe. I already mentioned this in some other thread somewhere, but the 100Mb connection isn't really the only bottleneck (if it is one at all) - there's an artificial throttle right on the data server, for example. If the data server were allowed to fork as many CGI processes as it could to keep up with demand, it would spiral out of control within seconds. So we have to keep the maximum number of processes down, which in turn creates an arbitrary ceiling when demand is high. We need a better data server (with more RAM) before we need a bigger pipe. Also bear in mind the whole Space Sciences Laboratory sits on top of the hill on Berkeley campus, and there's only one single 100Mb fibre containing *all* traffic that goes up/down the hill, whether it is destined to go to our private ISP or not. All the other projects in the lab have web sites, too. - Matt -- BOINC/SETI@home network/web/science/development person -- "Any idiot can have a good idea. What is hard is to do it." - Jeanne-Claude |
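The 'artificial throttle' described here is commonly just a process cap in the web server's configuration - for example, something along these lines in Apache's httpd.conf (the values are illustrative, not the project's actual settings):

```apache
# Cap the number of forked children serving requests; beyond this,
# new connections queue or are refused instead of overloading the box.
MinSpareServers    5
MaxSpareServers   20
MaxClients       150
```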
Jan Inge Send message Joined: 24 Sep 02 Posts: 21 Credit: 1,655,076 RAC: 0 |
> I already mentioned this in some other thread somewhere, but the 100Mb connection isn't really the only bottleneck (if it is one at all) - there's an artificial throttle right on the data server, for example. We need a better data server (with more RAM) before we need a bigger pipe. > - Matt How about setting up a donation page where you say that you need, let's say, $10,000 for a better data server? On that page you could list the advantages of getting a new data server, and people using SETI could give money with PayPal or something like it. I know I wouldn't mind giving $10 here and there to secure better hardware for SETI, and I bet there are a lot more people willing to give money. I think more people are willing to give money when they know what they get in return for it. I'm sure a lot of people would have given money so SETI could have had a UPS on the database server. |
Roland S Stubbs Send message Joined: 31 Jul 99 Posts: 4 Credit: 4,560,097 RAC: 0 |
As long as we're dreaming about hardware, how about adding either an infrared laser or a microwave link from the hilltop to the bottom to get some bandwidth without having to trench? I know adding more fiber is the best solution, but sometimes a patch will work just as well to get things moving again. I'd be willing to kick in $10 for the server and $10 for the link - just show me the PayPal link! |
davidmcw Send message Joined: 28 May 99 Posts: 47 Credit: 388,756 RAC: 1 |
If you're discussing raising money, surely someone out there in sunny California could come up with a SETI@Home t-shirt or something. I'm sure you'd sell quite a few, raising much-needed pennies along the way. |
Ralph Waldo Emerson Send message Joined: 17 Mar 00 Posts: 1 Credit: 0 RAC: 0 |
I'm sure it's been discussed elsewhere, but all such discussions need to include ideas about distributing the tasks to several servers rather than how to shoehorn more data through a single pipe. It *might* be easier to get bandwidth donated from some other location than to get more servers or connections "up the hill". My 2 cents. Peace. |
EdwardPF Send message Joined: 26 Jul 99 Posts: 389 Credit: 236,772,605 RAC: 374 |
Perhaps what's needed is another Boinc project to ... say ... be distributed Boinc admin. It could be project specific ... distributed SETI Boinc admin ... or just a generic distributed Boinc admin for the small projects that must exist out there ... hidden around the web. Boinc could start feeding on itself in order to grow! EdwardPF |
trlauer Send message Joined: 6 May 04 Posts: 106 Credit: 1,021,816 RAC: 0 |
I'd donate at least US$10.00 too! Show me the PayPal or credit card link, and I'll donate money. Torrey Lauer www.nospam.rainbowskytravel.com |
Matt Lebofsky Send message Joined: 1 Mar 99 Posts: 1444 Credit: 957,058 RAC: 0 |
> I'd donate at least US$10.00 too! Show me the PayPal or credit card link, and > I'll donate money. https://colt.berkeley.edu:444/urelgift/seti.html Thanks! - Matt -- BOINC/SETI@home network/web/science/development person -- "Any idiot can have a good idea. What is hard is to do it." - Jeanne-Claude |
JavaPersona Send message Joined: 4 Jun 99 Posts: 112 Credit: 471,529 RAC: 0 |
> If you're discussing raising money, surely someone out there in sunny > California could come up with a SETI@Home t-shirt or something, I'm sure you'd > sell quite a few, raising much needed pennies along the way. > Donations are a good idea, always. I think the T-Shirt idea would really catch on. I know someone who could print the shirts and maybe we work out who does the selling? I imagine just the new SETI logo on front, T-shirts in black or white. Let me know. |
©2024 University of California
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.