A Modest Proposal for Easy FREE Bandwidth Relief



Message boards : Number crunching : A Modest Proposal for Easy FREE Bandwidth Relief

jravin
Joined: 25 Mar 02
Posts: 930
Credit: 99,682,508
RAC: 85,465
United States
Message 1334966 - Posted: 5 Feb 2013, 21:19:05 UTC - in response to Message 1334958.

That's why I am advocating sending shorties to CPUs by default, rather than letting them hit GPUs...
____________
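The default-routing rule being advocated here could be sketched as a scheduler-side filter like this (purely hypothetical logic; the field names and the 10-minute "shorty" threshold are assumptions for illustration, not actual SETI@home server code):

```python
# Hypothetical sketch of "send shorties to CPU by default".
# SHORTY_CUTOFF_S and the workunit fields are made up for illustration.

SHORTY_CUTOFF_S = 600  # treat anything under ~10 GPU-minutes as a "shorty"

def pick_resource(est_gpu_runtime_s: float, host_has_gpu: bool) -> str:
    """Route short workunits to CPU so GPUs don't burn through them in
    minutes and immediately hit the scheduler again for more work."""
    if est_gpu_runtime_s < SHORTY_CUTOFF_S:
        return "cpu"                      # shorty: CPU by default
    return "gpu" if host_has_gpu else "cpu"
```

The point of the rule is load-shaping: a shorty that takes 5 minutes on a GPU takes ~40 minutes on a CPU, so routing it to the CPU cuts the host's request rate roughly eightfold for that workunit.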

jravin
Joined: 25 Mar 02
Posts: 930
Credit: 99,682,508
RAC: 85,465
United States
Message 1336564 - Posted: 10 Feb 2013, 9:38:42 UTC

Maybe somebody Up There heard me - lately, I'm getting a lot more MBs for CPU with an elapsed time of around 40 minutes, and fewer GPU MBs with elapsed time around 5 minutes, so the shorties are being biased towards CPU rather than GPU. Or, alternatively, maybe it is just a coincidence.

I'd like to believe the first, but ???.

Can anyone In The Know verify if a change has been made?
____________

rob smith (Project donor)
Volunteer tester
Joined: 7 Mar 03
Posts: 8398
Credit: 56,837,338
RAC: 77,860
United Kingdom
Message 1336565 - Posted: 10 Feb 2013, 9:43:40 UTC

Ah, so now I know where all those shorties on my GPUs came from - they implemented a user-selective algorithm....


The servers are down as I type, indeed have been down since late yesterday evening (UK time).
____________
Bob Smith
Member of Seti PIPPS (Pluto is a Planet Protest Society)
Somewhere in the (un)known Universe?

Staffan Ericsson
Joined: 5 Jan 00
Posts: 6
Credit: 1,496,204
RAC: 1
Sweden
Message 1336614 - Posted: 10 Feb 2013, 13:14:22 UTC

The question we should ask ourselves first is: do we compute faster or slower than the data comes in? If we are faster, then we have to ask ourselves whether we shouldn't reallocate some resources towards other BOINC projects.

If we are slower because of the limited bandwidth, then we need to find a solution that increases bandwidth, or else we will stockpile data.

Looking at my own download speeds from SETI@home, I get the feeling that the servers allow too many connections at the same time. The download rate can drop to 1 kB/s, which probably indicates that the pipe is full and too many users are requesting data from SETI@home at once. I wonder whether bandwidth usage wouldn't be better optimized if the servers were configured with a lower concurrent connection limit.

The SETI@home project should investigate whether it could get another university, preferably in Europe, to join in on the effort. If half of the data distribution/splitting could be "outsourced" to another pipe, the benefit could be enormous. I know people have stated that this wouldn't work because of the distance to the database servers, but with some kind of primitive sharding of the data, distance wouldn't be much of an issue, and a delay in reporting back to the central databases shouldn't be either.

The European space agencies might be a good group of organizations to approach, or some of the universities with space-related research. I know that Swedish universities are generally well connected, so another 100 Mbit/s of load wouldn't be much of a problem.

.staffan
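The connection-count idea above can be illustrated with a toy cap on concurrent downloads (a generic sketch of the principle only, not the project's actual server settings; all numbers are made up):

```python
import threading

# Hypothetical sketch: cap concurrent downloads so each active transfer
# gets a bigger slice of a ~100 Mbit/s pipe, instead of thousands of
# connections all crawling along at ~1 kB/s.
MAX_CONCURRENT = 200
LINK_KBITPS = 100_000  # ~100 Mbit/s expressed in kbit/s

download_slots = threading.Semaphore(MAX_CONCURRENT)

def per_connection_kbitps(active_connections: int) -> float:
    """Rough fair-share rate per connection once the pipe is saturated."""
    return LINK_KBITPS / max(active_connections, 1)

def serve_download(send_file):
    """Admit a download only when a slot is free; others queue."""
    with download_slots:   # blocks when all slots are in use
        send_file()
```

With the pipe saturated, fewer simultaneous connections means a higher per-connection rate; the total throughput is the same, but each transfer finishes sooner and ties up a socket for less time.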


Staffan Ericsson
Joined: 5 Jan 00
Posts: 6
Credit: 1,496,204
RAC: 1
Sweden
Message 1336618 - Posted: 10 Feb 2013, 13:46:17 UTC

Maybe the Onsala radio telescope in Sweden could join as both a bandwidth and data provider?

http://www.chalmers.se/rss/oso-en/

.staffan
____________

rob smith (Project donor)
Volunteer tester
Joined: 7 Mar 03
Posts: 8398
Credit: 56,837,338
RAC: 77,860
United Kingdom
Message 1336620 - Posted: 10 Feb 2013, 13:57:09 UTC

It would be nice to have a high-latitude provider, but that would not help the bandwidth under discussion which is that between the S@H servers and us crunchers. The data that we crunch is sent from the telescope to the S@H labs, where it is cleaned up then split and distributed to us. Of course you may have hit upon a solution - why not have three "versions" of the S@H project, an equatorial (we've got that already), a high northern and a high southern. There would need to be some co-ordination between the three projects, but it might go some way to easing the pressure on the current "equatorial" project.
____________
Bob Smith
Member of Seti PIPPS (Pluto is a Planet Protest Society)
Somewhere in the (un)known Universe?

Staffan Ericsson
Joined: 5 Jan 00
Posts: 6
Credit: 1,496,204
RAC: 1
Sweden
Message 1336646 - Posted: 10 Feb 2013, 15:27:23 UTC - in response to Message 1336620.

"rob smith" wrote:
It would be nice to have a high-latitude provider, but that would not help the bandwidth under discussion which is that between the S@H servers and us crunchers. The data that we crunch is sent from the telescope to the S@H labs, where it is cleaned up then split and distributed to us. Of course you may have hit upon a solution - why not have three "versions" of the S@H project, an equatorial (we've got that already), a high northern and a high southern. There would need to be some co-ordination between the three projects, but it might go some way to easing the pressure on the current "equatorial" project.


My point was that with a new telescope on board that can provide data directly to the crunchers, the Berkeley servers should see a drop in load with no scientific loss at all.

Another data provider that actually has enough bandwidth to feed crunchers directly, without shipping drives back and forth, would reduce the workload on the project and at the same time increase coverage and the chances of success.
____________

bill
Joined: 16 Jun 99
Posts: 861
Credit: 23,492,648
RAC: 25,383
United States
Message 1336676 - Posted: 10 Feb 2013, 16:40:27 UTC

Is there any reason why there can't be two or more SETI@home projects? Legal, copyright, trademark, or some other issues?

The projects could be based in different locations and cover different areas of the sky. With more crunchers than can be handled at Berkeley, a European-based project could handle the northern latitudes, and a South American or Australian location the southern latitudes.

rob smith (Project donor)
Volunteer tester
Joined: 7 Mar 03
Posts: 8398
Credit: 56,837,338
RAC: 77,860
United Kingdom
Message 1336692 - Posted: 10 Feb 2013, 17:19:57 UTC

I think Staffan is looking at a separate project (in BOINC terms), distributing its data in the same way as S@H does but from a different location. Such a project would need to functionally duplicate the existing server set-up, have enough bandwidth and connectivity, and have funding "guaranteed" for a number of years to pay for a local support team.
This is quite attractive in a number of ways: it might remove some bandwidth demand from the current S@H, and it would give access to a new section of the sky, as the current telescopes only see "a third" of the sky and are very immobile (unlike their Northern and Southern cousins, which are highly steerable).
Carrying on thinking out loud: when a user joined S@H they would automatically be joined to their local version of the project, or, for existing users, be migrated there after fair warning - with membership between the three area groups being fully transferable.

All we need is the money to set it up and run it. I'm sure the BOINC back-end is capable of it, but is mankind willing to give it a go?
____________
Bob Smith
Member of Seti PIPPS (Pluto is a Planet Protest Society)
Somewhere in the (un)known Universe?

Staffan Ericsson
Joined: 5 Jan 00
Posts: 6
Credit: 1,496,204
RAC: 1
Sweden
Message 1336788 - Posted: 10 Feb 2013, 19:43:52 UTC - in response to Message 1336692.

"rob smith" wrote:
I think Staffan is looking at a separate project (in BOINC terms), distributing its data in the same way as S@H does but from a different location. Such a project would need to functionally duplicate the existing server set-up, have enough bandwidth and connectivity, and have funding "guaranteed" for a number of years to pay for a local support team.
This is quite attractive in a number of ways: it might remove some bandwidth demand from the current S@H, and it would give access to a new section of the sky, as the current telescopes only see "a third" of the sky and are very immobile (unlike their Northern and Southern cousins, which are highly steerable).
Carrying on thinking out loud: when a user joined S@H they would automatically be joined to their local version of the project, or, for existing users, be migrated there after fair warning - with membership between the three area groups being fully transferable.

All we need is the money to set it up and run it. I'm sure the BOINC back-end is capable of it, but is mankind willing to give it a go?


Somebody posted a link to the traffic graph for SETI@home. The traffic was 10 Mbit/s in and 90 Mbit/s out.

As I understand the setup of the project, it's essentially this:

1. Data is collected at Arecibo.
2. Disks are sent to Berkeley.
3. Data is split from the disks into workunits.
4. Workunits are sent out.
5. Crunchers crunch workunits.
6. Results are returned.
7. Results are validated.
8. Credit is given.
9. SETI is found.

Based on the bandwidth graphs, step 4 is the bottleneck in the project. So another data source that could handle steps 1 to 4 without using Berkeley's bandwidth would solve the bandwidth issue. (Incoming traffic shouldn't be a problem.)

Einstein@home manages to have 3 mirrors for distributing work, so a similar setup would be possible within SETI@home.

.staffan

PS. It looks like SETI@home has 100 Mbit/s summed over in and out on the graphs, not the full 100 in each direction.
____________
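Offloading step 4 to mirrors, Einstein@home-style, might look like the sketch below: the scheduler hands each cruncher a mirror URL instead of a Berkeley one, so the bulk download traffic leaves someone else's pipe. All hostnames are hypothetical placeholders.

```python
import itertools

# Hypothetical mirror pool for step 4 (workunit downloads). Only the
# download step is moved off Berkeley's link; the small result uploads
# (step 6) and reporting still go back to the central servers.
MIRRORS = ["mirror-eu.example.org", "mirror-us.example.org", "mirror-au.example.org"]
_round_robin = itertools.cycle(MIRRORS)

def assign_download_url(workunit_name: str) -> str:
    """Step 4: return a mirror URL for the workunit, round-robin."""
    return f"http://{next(_round_robin)}/wu/{workunit_name}"
```

Round-robin is the simplest possible policy; a real deployment would presumably pick mirrors by client location or current load.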

ivan
Volunteer tester
Joined: 5 Mar 01
Posts: 608
Credit: 138,643,886
RAC: 150,733
United Kingdom
Message 1336792 - Posted: 10 Feb 2013, 19:51:13 UTC - in response to Message 1336788.

"Staffan Ericsson" wrote:
PS. It looks like SETI@home has 100 Mbit/s summed over in and out on the graphs, not the full 100 in each direction.

No, I think that's just the overhead of colliding packets, etc., detracting from the outbound traffic. It's just coincidence that the two channels add up to an "even" value. Any time there's a significant change in the inward traffic (blue) there isn't necessarily an opposite change in the outward (green) -- usually they march in lockstep as conditions change, both up or both down.
____________

juan BFB (Project donor)
Volunteer tester
Joined: 16 Mar 07
Posts: 5312
Credit: 294,625,786
RAC: 469,903
Brazil
Message 1336819 - Posted: 10 Feb 2013, 20:34:55 UTC
Last modified: 10 Feb 2013, 20:38:33 UTC

There's a simpler way to resolve that: move the SETI servers (at least the ones with heavy traffic limitations) to a location with access to the 1 Gbit/s link; they already paid for that link.
____________

rob smith (Project donor)
Volunteer tester
Joined: 7 Mar 03
Posts: 8398
Credit: 56,837,338
RAC: 77,860
United Kingdom
Message 1336829 - Posted: 10 Feb 2013, 20:48:46 UTC

Cost???
University politics??

I'm not sure which is the bigger hurdle to overcome, but I suspect that the latter might be harder than the former; after all, some of the funding is based on the project being "on campus".

Moving off-campus would require the "acquisition" of premises to operate from, server space, the required links and a whole host of other things that are currently provided under the umbrella of the university.

(Remember that the line we are using today is a 1 Gbit/s line, throttled by university politics to 100 Mbit/s.)
____________
Bob Smith
Member of Seti PIPPS (Pluto is a Planet Protest Society)
Somewhere in the (un)known Universe?

MarkJ (Project donor)
Volunteer tester
Joined: 17 Feb 08
Posts: 939
Credit: 24,084,767
RAC: 74,867
Australia
Message 1336833 - Posted: 10 Feb 2013, 20:52:40 UTC

While sending shorties to the CPU can reduce the load on the server, this would only work for a short time before the server gets overwhelmed by the sheer number of requests.

We have suggested before that data could be compressed. Yes, there is a server CPU overhead in doing it, but it reduces the comms bandwidth. We've also suggested that scheduler requests and replies could be compressed. Neither of these has happened to date.

However, compression will only improve the bandwidth situation for a while before the sheer number of users takes up any savings. The only ways I can see to reduce that are either to reduce the number of machines connecting or to reduce the frequency of connection. They already do the frequency part with back-offs and work quotas. What I think needs to happen is for them to have servers in multiple locations. There was a BOINC proposal called SuperHost that has never been acted upon. This could spread the load; the SETI servers would then simply need to provide the SuperHosts with the data for the end users. Put simply, whatever bandwidth SETI has will always be exceeded by the sheer number of users. The whole idea of the Internet was to spread things around, thereby reducing the stress on any single point.

So the quick solution, which will only work for a while, is to compress the data; longer term, they need multiple servers in multiple locations.
____________
BOINC blog
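The compression suggestion above is cheap to sketch with Python's stdlib gzip; repetitive XML scheduler traffic compresses well, trading a little server CPU for bandwidth. This is an illustration of the idea only, not anything the project actually does (that's the point of the suggestion).

```python
import gzip

def compress_reply(xml_reply: str) -> bytes:
    """Gzip a scheduler reply before sending. Repetitive XML typically
    shrinks severalfold, cutting outbound comms bandwidth."""
    return gzip.compress(xml_reply.encode("utf-8"))

def decompress_reply(payload: bytes) -> str:
    """Client side: restore the original XML text."""
    return gzip.decompress(payload).decode("utf-8")
```

The same pair works for scheduler requests in the other direction; the only requirement is that both ends agree on the encoding (e.g. via a `Content-Encoding: gzip` header).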

juan BFB (Project donor)
Volunteer tester
Joined: 16 Mar 07
Posts: 5312
Credit: 294,625,786
RAC: 469,903
Brazil
Message 1336844 - Posted: 10 Feb 2013, 21:17:03 UTC - in response to Message 1336829.
Last modified: 10 Feb 2013, 21:18:35 UTC

"rob smith" wrote:
Cost???
University politics??

I'm not sure which is the bigger hurdle to overcome, but I suspect that the latter might be harder than the former; after all, some of the funding is based on the project being "on campus".

Moving off-campus would require the "acquisition" of premises to operate from, server space, the required links and a whole host of other things that are currently provided under the umbrella of the university.

(Remember that the line we are using today is a 1 Gbit/s line, throttled by university politics to 100 Mbit/s.)

I'm not talking about taking the project off campus, just a simple relocation to somewhere on campus that actually has the means to use the 1 Gbit/s link. That would cost nothing more than they already spend. Maybe the computer science lab/building is the best place: at least there it is expected that 24/7 support will be available, and they surely already have access to (and use) the high-speed optical fibre needed to support a 1 Gbit/s link.
____________

SockGap
Joined: 16 Apr 07
Posts: 13
Credit: 5,863,192
RAC: 2,254
Australia
Message 1336887 - Posted: 11 Feb 2013, 1:29:08 UTC - in response to Message 1336829.

"rob smith" wrote:
(Remember that the line we are using today is a 1 Gbit/s line, throttled by university politics to 100 Mbit/s.)


From what I remember (mainly from this post from Matt), the link "up the hill" is a spare 100 Mbit/s cable. This is probably old multi-mode fibre (although it could be a collection of links across other networks and through other routers). The 1 Gbit/s ISP connection comes in at the bottom of the hill, and Campus IT have to get the data to the lab. Considering Matt's post was from 2009, I would have thought there might have been some movement on this by now.

It would be worth asking Campus IT whether there's another "spare" 100 Mbit/s link next to the current one - it would be relatively easy to double the connection to 200 Mbit/s if the link were there.

On co-location: maybe the discussion on whether to move should include the cost of installing a pair of single-mode fibres from the new co-location site to where the 1 Gbit/s ISP link terminates. Moving the servers to somewhere with 24x7 onsite support AND the ability to use the 1 Gbit/s link would be a dream come true.

Cheers
Jeff
____________

dsh
Joined: 6 Jan 08
Posts: 106
Credit: 9,567,891
RAC: 11,327
United States
Message 1336888 - Posted: 11 Feb 2013, 1:33:48 UTC

Just out of curiosity: when the current setup in the lab is working as it should with the current bandwidth, if there is X amount of data coming from Arecibo in a week, are we also processing X amount of data in that week?

It would seem to me that if the collective SETI@home clients cannot process in a week all the data that is generated in a week, then we should be continually falling further and further behind. That would certainly be the best argument for doing things differently, such as increasing the bandwidth. If, on the other hand, the current setup can handle the data as fast as or faster than it is generated at Arecibo, even with some of the day-to-day problems we are having, what is the point? If the drives with data are not stacking up higher and higher on the shelf, then there is no real incentive or strong reason to do things differently.

I would think the powers that be would be making decisions based on the amount of data coming in the front door rather than going out the back door. It would be a real shame if some data were dropped and never processed for lack of resources such as bandwidth; Murphy would almost certainly dictate that the dropped data would be the data that had ET's message. The converse would be wasted resources, because you would have more resources than you need to stay current with the data arriving from Arecibo. The balance between the two extremes should be the guiding factor.
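The balance test described above can be stated precisely as a comparison of weekly intake against weekly throughput (a back-of-envelope sketch; the figures in the test are placeholders, not real project numbers):

```python
def backlog_trend(data_in_per_week: float, data_crunched_per_week: float) -> str:
    """Compare weekly intake from the telescope against weekly crunching
    throughput. If intake exceeds throughput, drives pile up on the shelf;
    if throughput exceeds intake, capacity is going spare."""
    if data_in_per_week > data_crunched_per_week:
        return "falling behind"   # the case that argues for more bandwidth
    if data_in_per_week < data_crunched_per_week:
        return "spare capacity"   # resources could go to other projects
    return "balanced"
```

Either imbalance, sustained over time, compounds week after week, which is why the trend matters more than any single week's numbers.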




Copyright © 2014 University of California