a thought about bandwidth

David S
Volunteer tester
Avatar

Send message
Joined: 4 Oct 99
Posts: 18352
Credit: 27,761,924
RAC: 12
United States
Message 1179384 - Posted: 19 Dec 2011, 21:11:55 UTC

Many of us, including me, have been bitching about bandwidth and asking why they're requesting hardware donations that don't seem to have anything to do with bandwidth.

But let me ask something, in the hope that someone knows fairly surely:

If they suddenly did have enough bandwidth for all the uploads and downloads to fly with no problems, would the splitters be able to keep up? The "results ready to send" count tends to stay around 200,000 at the current level of WU consumption. Do the splitters have the capacity to split as much work as would be demanded if it could flow without restriction?

If they don't, then within days, perhaps hours, of the new bandwidth being brought online, we would all be bitching about no work available (which is one of the things we're already bitching about, simply because the somethingorother gets filled in 100-unit batches and answers "no work available" if someone asks in the moment between it sending the last one and getting the next batch).
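
To make the feeder behaviour I mean concrete, here is a toy sketch (hypothetical names and numbers, not the real BOINC feeder code): a cache topped up in 100-unit batches that answers "no work available" to a request landing in the gap between batches, even while the backlog is huge.

```python
# Toy model (not the real BOINC feeder) of a cache refilled in fixed batches.
# A request that lands after the last cached result has been handed out, but
# before the next batch is loaded, is told "no work available" even though
# the ready-to-send backlog is far from empty.
from collections import deque

BATCH_SIZE = 100                       # assumed batch size from the post
backlog = deque(range(100_000))        # pretend ready-to-send queue
cache = deque()

def refill():
    """Load the next batch from the backlog into the feeder cache."""
    while backlog and len(cache) < BATCH_SIZE:
        cache.append(backlog.popleft())

def request_work():
    """One scheduler request; the refill happens on its own schedule."""
    if not cache:
        return "no work available"     # the unlucky gap between batches
    return cache.popleft()

refill()
replies = [request_work() for _ in range(BATCH_SIZE + 1)]
print(replies[-1], "-- backlog still holds", len(backlog), "results")
```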

Is increasing splitter capacity one of the things they intend to do with the hardware they're requesting?

David
Sitting on my butt while others boldly go,
Waiting for a message from a small furry creature from Alpha Centauri.

ID: 1179384 · Report as offensive
Treasurer

Send message
Joined: 13 Dec 05
Posts: 109
Credit: 1,569,762
RAC: 0
Germany
Message 1179390 - Posted: 19 Dec 2011, 21:21:14 UTC

I think more bandwidth and/or faster splitters would both lead to the one thing nobody wants: something like "all tapes split, waiting for Arecibo to send more tapes".
All SETI needs is a splitting/distribution rate about 1.1 times Arecibo's recording rate.
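
A back-of-the-envelope version of that worry, with entirely made-up numbers (only the 1.1 factor is from the post):

```python
# Made-up numbers: if the splitters consume tapes even 1.1x faster than
# Arecibo records them, the backlog of unsplit tapes eventually hits zero
# and the project sits there "waiting for more tapes".
record_rate = 10      # tape-units recorded per day (arbitrary)
split_rate = 11       # tape-units split per day, i.e. 1.1x the recording rate
backlog = 300         # tape-units on hand

days = 0
while backlog > 0:
    backlog += record_rate - split_rate
    days += 1
print(f"tape backlog exhausted after {days} days")   # 300 days here
```
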
ID: 1179390 · Report as offensive
musicplayer

Send message
Joined: 17 May 10
Posts: 2430
Credit: 926,046
RAC: 0
Message 1179393 - Posted: 19 Dec 2011, 21:32:33 UTC
Last modified: 19 Dec 2011, 21:35:55 UTC

Can the server handle one million new results an hour?

Or at least one thousand new results each hour.

I guess these results are automatically validated against each other.

From these you pick the best results for further numerical analysis of the data, as well as possible re-observation, in order to obtain even more results or perhaps something else entirely.

Which means that the human factor is always the limiting factor.
ID: 1179393 · Report as offensive
Profile perryjay
Volunteer tester
Avatar

Send message
Joined: 20 Aug 02
Posts: 3377
Credit: 20,676,751
RAC: 0
United States
Message 1179395 - Posted: 19 Dec 2011, 21:40:35 UTC - in response to Message 1179390.  

Treasurer,
Don't forget we will be getting data from the Green Bank Telescope (I think that's the one) in West Virginia soon. Plus, it won't be long before we start getting the new V7 apps. There should be no problem having enough work to do soon.

As to the equipment they are requesting, it is to upgrade some of their servers and/or replace others that are just too old to do the job required. It may not speed things up, but it sure won't slow us down as much as a down server does. If GPUUG can get the things SETI is asking for, they should have no problem getting work out to us.


PROUD MEMBER OF Team Starfire World BOINC
ID: 1179395 · Report as offensive
Profile zoom3+1=4
Volunteer tester
Avatar

Send message
Joined: 30 Nov 03
Posts: 65827
Credit: 55,293,173
RAC: 49
United States
Message 1179398 - Posted: 19 Dec 2011, 21:48:48 UTC

Might bandwidth also apply to data written to the hard disk/SSD?

I know Matt tried an SSD and found it lacking. Does anyone know if he used an off-the-shelf unit or an enterprise-type SSD with a SandForce chipset?

I'm not sure what the spindle speed on the current drives at SETI is, but if they're 10K, then 15K might not be all that much faster, considering that SSDs are faster still.
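
For a rough sense of why 10K vs 15K is a small jump next to an SSD, here is a back-of-the-envelope access-time comparison (the seek times are assumed typical figures, not measurements of SETI's drives):

```python
# Rough average random-access comparison. Seek times are assumed typical
# values for illustration, not measurements of SETI's actual drives.
def avg_access_ms(rpm, avg_seek_ms):
    rotational_ms = 0.5 * 60_000 / rpm   # average rotational latency: half a revolution
    return avg_seek_ms + rotational_ms

for name, rpm, seek in [("10K drive", 10_000, 4.0), ("15K drive", 15_000, 3.5)]:
    print(f"{name}: ~{avg_access_ms(rpm, seek):.1f} ms per random access")
print("typical SSD: ~0.1 ms per random access, regardless of spindles")
```
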
The T1 Trust, PRR T1 Class 4-4-4-4 #5550, 1 of America's First HST's
ID: 1179398 · Report as offensive
Profile Frizz
Volunteer tester
Avatar

Send message
Joined: 17 May 99
Posts: 271
Credit: 5,852,934
RAC: 0
New Zealand
Message 1179399 - Posted: 19 Dec 2011, 21:51:54 UTC - in response to Message 1179395.  

What we really need is increased runtimes. That would solve all of our problems.
ID: 1179399 · Report as offensive
Claggy
Volunteer tester

Send message
Joined: 5 Jul 99
Posts: 4654
Credit: 47,537,079
RAC: 4
United Kingdom
Message 1179402 - Posted: 19 Dec 2011, 21:58:16 UTC - in response to Message 1179399.  

What we really need is increased runtimes. That would solve all of our problems.

v7 is coming sometime; that has increased runtimes, especially for shorties.

Claggy
ID: 1179402 · Report as offensive
JohnDK · Crowdfunding Project Donor · Special Project $250 donor
Volunteer tester
Avatar

Send message
Joined: 28 May 00
Posts: 1222
Credit: 451,243,443
RAC: 1,127
Denmark
Message 1179403 - Posted: 19 Dec 2011, 22:04:03 UTC - in response to Message 1179402.  

What we really need is increased runtimes. That would solve all of our problems.

v7 is coming sometime, that has increased runtimes, especially shorties,

Claggy

I'm impatient and I know I've asked before, but do you have an ETA for v7? Weeks? Months?
ID: 1179403 · Report as offensive
Profile Mike Special Project $75 donor
Volunteer tester
Avatar

Send message
Joined: 17 Feb 01
Posts: 34270
Credit: 79,922,639
RAC: 80
Germany
Message 1179405 - Posted: 19 Dec 2011, 22:36:09 UTC - in response to Message 1179403.  

What we really need is increased runtimes. That would solve all of our problems.

v7 is coming sometime, that has increased runtimes, especially shorties,

Claggy

I'm impatient and know I've asked before, but do you have an ETA of V7? Weeks? Months?


Not yet.
When the stuff is ready.



With each crime and every kindness we birth our future.
ID: 1179405 · Report as offensive
Profile zoom3+1=4
Volunteer tester
Avatar

Send message
Joined: 30 Nov 03
Posts: 65827
Credit: 55,293,173
RAC: 49
United States
Message 1179407 - Posted: 19 Dec 2011, 22:38:25 UTC - in response to Message 1179405.  

What we really need is increased runtimes. That would solve all of our problems.

v7 is coming sometime, that has increased runtimes, especially shorties,

Claggy

I'm impatient and know I've asked before, but do you have an ETA of V7? Weeks? Months?


Not yet.
When the stuff is ready.

Yep, the oven hasn't said Ding yet.
The T1 Trust, PRR T1 Class 4-4-4-4 #5550, 1 of America's First HST's
ID: 1179407 · Report as offensive
Profile Slavac
Volunteer tester
Avatar

Send message
Joined: 27 Apr 11
Posts: 1932
Credit: 17,952,639
RAC: 0
United States
Message 1179420 - Posted: 19 Dec 2011, 23:35:20 UTC - in response to Message 1179407.  

Hey guys,

Please take a second to read http://setiathome.berkeley.edu/forum_thread.php?id=66044&nowrap=true#1179419 which should answer some questions as to why we're prioritizing the RAID array project.

As to the bandwidth question: I think it's safe to assume that once we have provided SETI@home with a new hardware infrastructure, we can begin addressing other concerns such as bandwidth. Essentially, we need to put gas in the car before we can worry about how fast it goes around the track.


Executive Director GPU Users Group Inc. -
brad@gpuug.org
ID: 1179420 · Report as offensive
Terror Australis
Volunteer tester

Send message
Joined: 14 Feb 04
Posts: 1817
Credit: 262,693,308
RAC: 44
Australia
Message 1179439 - Posted: 20 Dec 2011, 1:12:19 UTC
Last modified: 20 Dec 2011, 1:39:32 UTC

It isn't just a plain bandwidth problem; I'm convinced there is still something dodgy happening at the PAIX where SAH connects to the outside world.

If there isn't, can someone explain why I get a more reliable and consistent connection going through a proxy in Brazil than I do connecting directly?

With a direct connection the download speeds vary wildly from fast (20+ kB/s) to nearly stalled (<1 kB/s), and retries and project backoffs are common.

Through the proxy, the peak speeds are not as high, but there is a consistent download speed of around 8-10 kB/s, and retries and backoffs occur nowhere near as often. When they do, the direct path is frozen as well.

Whether the infamous "HE Router" is still having brain seizures or the problem is further upstream I don't know, but there is definitely a problem there somewhere.

There are many crunchers experiencing the same problems, and it appears to come down to which long-haul provider their ISP uses and where that provider connects into the HE system.

This is one reason why some proxies work and others don't: you have to find one that has the right entry point, and preferably one that isn't at the PAIX.

T.A.
ID: 1179439 · Report as offensive
Profile SciManStev · Crowdfunding Project Donor · Special Project $75 donor · Special Project $250 donor
Volunteer tester
Avatar

Send message
Joined: 20 Jun 99
Posts: 6653
Credit: 121,090,076
RAC: 0
United States
Message 1179441 - Posted: 20 Dec 2011, 1:25:50 UTC

I was just wondering if somehow some future version of BOINC would have an auto-proxy search engine. It would just find the best connection at the moment and use it. It could work for all projects.

That would put a huge strain on the servers, though. But once several of the fundraisers in progress have completed, the system would be much better equipped to handle a larger data load than we currently have.
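
A minimal sketch of what such an auto-proxy picker might do, assuming a hand-maintained list of candidate proxies (the addresses and the test URL are just placeholders):

```python
# Hypothetical sketch of an "auto-proxy" picker: time a small fetch through
# each candidate connection and use whichever answers fastest. The proxy
# addresses below are placeholders, not real servers.
import time
import requests

TEST_URL = "http://setiathome.berkeley.edu/favicon.ico"   # example small file
CANDIDATES = [None,                                        # None = direct connection
              "http://proxy-a.example.org:3128",
              "http://proxy-b.example.org:8080"]

def probe(proxy):
    proxies = {"http": proxy} if proxy else None
    try:
        start = time.time()
        requests.get(TEST_URL, proxies=proxies, timeout=15)
        return time.time() - start
    except requests.RequestException:
        return float("inf")                                # unreachable
 
best = min(CANDIDATES, key=probe)
print("best connection right now:", best or "direct")
```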

Steve
Warning, addicted to SETI crunching!
Crunching as a member of GPU Users Group.
GPUUG Website
ID: 1179441 · Report as offensive
Josef W. Segur
Volunteer developer
Volunteer tester

Send message
Joined: 30 Oct 99
Posts: 4504
Credit: 1,414,761
RAC: 0
United States
Message 1179454 - Posted: 20 Dec 2011, 3:37:06 UTC - in response to Message 1179384.  

...
If they suddenly did have enough bandwidth for all the uploads and downloads to fly with no problems, would the splitters be able to keep up? The "results ready to send" tends to stay around ~200,000, more or less, with the current level of WU consumption. Do the splitters have the capacity to split as much work as would be demanded if it could flow without restriction?
...

The project controls the mb_splitter processes so they don't start on a new channel if there's more than 200,000 "results ready to send". There's some overshoot because each channel may produce up to about 15872 WUs (31744 results). The splitters rest until "results ready to send" drops below a lower level, probably 150,000 or so.
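
A minimal sketch of that high/low watermark behaviour (the names, the delivery rate, and the step structure are made up; only the thresholds and per-channel yield come from Joe's description):

```python
# Hypothetical illustration of the splitter throttle described above: pause
# above the high watermark, resume once the queue falls below the low one.
HIGH_WATERMARK = 200_000      # don't start a new channel above this
LOW_WATERMARK = 150_000       # resume splitting once below this
RESULTS_PER_CHANNEL = 31_744  # ~15872 WUs x 2 results each
DELIVERY_PER_STEP = 25_000    # assumed results sent to hosts per step

queue, resting = 190_000, False
for step in range(8):
    if resting:
        resting = queue >= LOW_WATERMARK     # rest until the queue drains
    elif queue > HIGH_WATERMARK:
        resting = True                       # too much work queued; pause
    else:
        queue += RESULTS_PER_CHANNEL         # split one more channel (overshoot allowed)
    queue -= DELIVERY_PER_STEP               # hosts keep downloading regardless
    print(f"step {step}: ready to send {queue:>7}  resting={resting}")
```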

It's fairly common, when all 5 mb_splitter processes are actually active, to see "Current result creation rate" over 40 per second. If the ~93 Mbits/sec shown on Cricket were clean delivery of only MB WUs, that would be about 31 per second delivery. Then the "Results received in last hour" would have to average about 111,600 some time later. With Astropulse work also being delivered, and the inefficiencies of saturation, actual MB delivery is considerably less. If they needed to create work faster, more splitters could be run, though with space in the closet for only 3 racks and limited power available, that kind of growth cannot work miracles. There's also the data pipeline ahead of the splitters to consider; at one point Matt Lebofsky said that the examination of the data for software blanking used more compute time than splitting.
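
For what it's worth, the 31-per-second figure is consistent with a task download of roughly 375 kB; a quick sanity check (the task size is my assumption, not from the post):

```python
# Quick sanity check of the delivery numbers above. The ~375 kB multibeam
# task size is an assumption for illustration; only the 93 Mbit/s figure
# comes from the post.
LINK_BITS_PER_SEC = 93e6           # roughly what Cricket shows at saturation
TASK_BYTES = 375_000               # assumed size of one MB task download

tasks_per_sec = LINK_BITS_PER_SEC / (TASK_BYTES * 8)
print(f"{tasks_per_sec:.0f} task downloads per second")          # ~31
print(f"{tasks_per_sec * 3600:,.0f} results received per hour")  # ~111,600
```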

My guess is that doubled bandwidth would temporarily satisfy demand, that the extra work could be supplied with a couple more MB splitters, and that the rest of the servers, as improved by donations through GPUUG, could cope with the increased load.

Moore's Law and possibly lapsed users returning might grow demand beyond capacity again, though.
                                                                 Joe
ID: 1179454 · Report as offensive
Profile Gary Charpentier · Crowdfunding Project Donor · Special Project $75 donor · Special Project $250 donor
Volunteer tester
Avatar

Send message
Joined: 25 Dec 00
Posts: 30721
Credit: 53,134,872
RAC: 32
United States
Message 1179455 - Posted: 20 Dec 2011, 3:38:20 UTC - in response to Message 1179439.  

It isn't just a plain bandwidth problem, I'm convinced there is still something dodgy happening at the PAIX where SAH connects to the outside world.

If there isn't, can someone explain why I get a more reliable and consistent connection going through a proxy in Brazil than I do connecting direct?

With a direct connection the download speeds vary wildly from fast (20kBs+) to nearly stalled (<1kBs), retries and project backoffs are common.

Through the proxy, the peak speeds are not as high but there is a consistent download speed of around 8-10 kBs and retries and backoffs occur nowhere near as often. When they do, the direct path is frozen as well.

Whether the infamous "HE Router" is still having brain seizures or the problem is further upstream I don't know but there is definitely a problem there somewhere.

There are many crunchers experiencing the same problems and it appears to come down to which long haul provider their ISP uses and where that provider connects into the HE system.

This is one reason why some proxies work and others don't, you have to find one that has the right entry point and preferably it isn't at the PAIX.

T.A.

Traceroute may tell you what is up; if others have the same issue at the same router, it may even be possible to get it fixed. I know some time ago I had an issue reaching LHC@Home from rigs I had at one location but not from rigs I had at another. The ISPs were different, and the routes the packets took were different. Somewhere in the middle there was a misconfigured router, and it ate the packets from the one location. It wasn't near either end, and it was impossible for me to find out who owned it so I could send them an e-mail asking them to please check their equipment. This could be your issue as well. The direct route from your place to the SSL may go through a bad router, while going from you to Brazil may bypass that bad box. Some experimentation on your part might find it by trying different proxies around the world.
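
For comparing notes between crunchers, something like this captures the whole hop list in one go (it just wraps the system traceroute, which must be installed; setiathome.berkeley.edu is only an example target):

```python
# Capture the route to the SETI servers so hop-by-hop results can be compared
# with other crunchers. Wraps the system traceroute (tracert on Windows);
# setiathome.berkeley.edu is just an example target host.
import platform
import subprocess

TARGET = "setiathome.berkeley.edu"
cmd = ["tracert", TARGET] if platform.system() == "Windows" else ["traceroute", TARGET]

route = subprocess.run(cmd, capture_output=True, text=True, timeout=300)
print(route.stdout)
```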

All I can say is presently from two different locations none of my boxes is having any problems contacting SETI.

ID: 1179455 · Report as offensive
Cosmic_Ocean
Avatar

Send message
Joined: 23 Dec 00
Posts: 3027
Credit: 13,516,867
RAC: 13
United States
Message 1179472 - Posted: 20 Dec 2011, 6:44:59 UTC

I've been commenting/musing for the past two years in various threads about how more than 100 Mbit would certainly be welcomed by a lot of people, but the other half of the situation is: can the servers handle it? More bandwidth means more task requests, which means more DB queries.

Presently, DB queries and disk I/O are the main bottlenecks, so a bigger communications channel will only amplify the effects of the bottleneck. This IS just speculation, but it seems pretty reasonable. Of course, the only real way to find out for sure is to just do it and then observe/make panicked fixes, whichever the case may be.

Of course, doubling the run-time of MBs (again) could make a difference as well, but I'm not sure if it would be enough.
Linux laptop:
record uptime: 1511d 20h 19m (ended due to the power brick giving up)
ID: 1179472 · Report as offensive
Lionel

Send message
Joined: 25 Mar 00
Posts: 680
Credit: 563,640,304
RAC: 597
Australia
Message 1179501 - Posted: 20 Dec 2011, 9:06:35 UTC - in response to Message 1179455.  

It isn't just a plain bandwidth problem, I'm convinced there is still something dodgy happening at the PAIX where SAH connects to the outside world.

If there isn't, can someone explain why I get a more reliable and consistent connection going through a proxy in Brazil than I do connecting direct?

With a direct connection the download speeds vary wildly from fast (20kBs+) to nearly stalled (<1kBs), retries and project backoffs are common.

Through the proxy, the peak speeds are not as high but there is a consistent download speed of around 8-10 kBs and retries and backoffs occur nowhere near as often. When they do, the direct path is frozen as well.

Whether the infamous "HE Router" is still having brain seizures or the problem is further upstream I don't know but there is definitely a problem there somewhere.

There are many crunchers experiencing the same problems and it appears to come down to which long haul provider their ISP uses and where that provider connects into the HE system.

This is one reason why some proxies work and others don't, you have to find one that has the right entry point and preferably it isn't at the PAIX.

T.A.

Traceroute may tell you what is up; if others have the same issue at the same router, it may even be possible to get it fixed. I know some time ago I had an issue reaching LHC@Home from rigs I had at one location but not from rigs I had at another. Different ISP's and the routes the packets took were different. Somewhere in the middle there was a misconfigured router and it ate the packets from the one location. It wasn't near either end and impossible for me to find who owned it to send them and e-mail to please check their equipment. This could be your issue as well. The direct route from your place to the SSL may go through a bad router, but going from you to Brazil may bypass that bad box. Some experimentation on your part might find it by trying different proxies around the world.

All I can say is presently from two different locations none of my boxes is having any problems contacting SETI.


I'm with TA in this boat. TA is in the middle of Oz; I'm down south and am forever having the same trouble ... using proxies helps to bypass the issue ... the real issue is what the problem actually is, and why isn't someone from Berkeley following the data to see what is happening (that's two issues, actually) ...

:)



ID: 1179501 · Report as offensive
Lionel

Send message
Joined: 25 Mar 00
Posts: 680
Credit: 563,640,304
RAC: 597
Australia
Message 1179503 - Posted: 20 Dec 2011, 9:10:08 UTC - in response to Message 1179472.  

I've been commenting/musing for the past two years in various threads about how more than 100mbit would certainly be welcomed by a lot of people, but the other half of the situation is can the servers handle it? More bandwidth means more task requests, which means more DB queries.

Presently, DB queries and disk I/O are the main bottlenecks, so a bigger communications channel will only amplify the effects of the bottleneck. However, this IS just speculation, but it seems pretty reasonable. Of course the only real way to find out for sure is to just do it and then observe/make panicked fixes, whichever the case may be.

Of course, doubling the run-time of MBs (again) could make a difference as well, but I'm not sure if it would be enough.


Kepler GPUs will chop run time to bits again ... get ready to batten down the hatches when they come ...
ID: 1179503 · Report as offensive
Blake Bonkofsky
Volunteer tester
Avatar

Send message
Joined: 29 Dec 99
Posts: 617
Credit: 46,383,149
RAC: 0
United States
Message 1179510 - Posted: 20 Dec 2011, 10:35:25 UTC - in response to Message 1179503.  

Kepler GPUs will chop run time to bits again ... get ready to batten down the hatches when they come ...


I can't wait. I would love 8+ hour GPU tasks, especially if they validate and I get 2400+ credits for them :D
ID: 1179510 · Report as offensive
Profile zoom3+1=4
Volunteer tester
Avatar

Send message
Joined: 30 Nov 03
Posts: 65827
Credit: 55,293,173
RAC: 49
United States
Message 1179536 - Posted: 20 Dec 2011, 14:22:30 UTC - in response to Message 1179503.  

I've been commenting/musing for the past two years in various threads about how more than 100mbit would certainly be welcomed by a lot of people, but the other half of the situation is can the servers handle it? More bandwidth means more task requests, which means more DB queries.

Presently, DB queries and disk I/O are the main bottlenecks, so a bigger communications channel will only amplify the effects of the bottleneck. However, this IS just speculation, but it seems pretty reasonable. Of course the only real way to find out for sure is to just do it and then observe/make panicked fixes, whichever the case may be.

Of course, doubling the run-time of MBs (again) could make a difference as well, but I'm not sure if it would be enough.


Kepler GPUs will chop run time to bits again ... get ready to batten down the hatches when they come ...

Glad you can afford new stuff; I can barely afford 295 cards now. Thankfully I only need one more v2 card. Then there's my anxiety, which some in DC keep pushing; it'll kill me yet, one way or another.
The T1 Trust, PRR T1 Class 4-4-4-4 #5550, 1 of America's First HST's
ID: 1179536 · Report as offensive