Communication problem

Francesco Forti
Joined: 24 May 00
Posts: 334
Credit: 204,421,005
RAC: 15
Switzerland
Message 1200012 - Posted: 26 Feb 2012, 16:20:00 UTC

As we can see here, when everything is working at full speed in Berkeley, we reach the maximum (Avg: 93.49 Mbits/sec) and the green graph is flat.

But I can also see that my DL is very, very slow.
In theory I can DL at 100Mb, but I see that I DL at 30~50Kb.

The result is a lot of open DL (hundreds), all going slowly.

When for any reason Berkeley is out of work, I see that suddenly all my pending DL run at maximum speed.

So I think there must be something wrong in how the communication line is used.
There are too many of us trying to pass through a small door, as in a cinema when someone is shouting "fire!".

Maybe a better approach would have everyone wait for a ticket and then pass data at maximum speed.
Or any other algorithm that handles the queue better.
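
To put rough numbers on the ticket idea, here is a minimal sketch in Python. It assumes an idealized line with no protocol overhead, equal-sized files, and no dropped connections. The function names and the `slots` parameter are made up for illustration; this is not how the project's servers actually schedule transfers.

```python
def fair_share_finish_times(n_downloads, size_mbit, link_mbit_s):
    """Everyone transfers at once; the line is split evenly (idealized)."""
    finish = n_downloads * size_mbit / link_mbit_s
    return [finish] * n_downloads


def ticket_queue_finish_times(n_downloads, size_mbit, link_mbit_s, slots=1):
    """Only `slots` downloads run at a time; the rest wait for a ticket.

    Simplification: each active download gets link/slots, even when the
    last batch is only partly filled.
    """
    batch_seconds = slots * size_mbit / link_mbit_s
    return [batch_seconds * (i // slots + 1) for i in range(n_downloads)]


def mean(values):
    """Average finish time in seconds."""
    return sum(values) / len(values)


fair = fair_share_finish_times(4, size_mbit=100, link_mbit_s=100)
ticket = ticket_queue_finish_times(4, size_mbit=100, link_mbit_s=100)
print(mean(fair), mean(ticket))
```

With 4 downloads of 100 Mbit on a 100 Mbit/s line, both schemes finish the last file after 4 seconds, but the average finish time drops from 4 seconds to 2.5 seconds when one ticket is served at a time. The total throughput is the same; only the waiting is distributed differently.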

Some weeks ago I saw some posts here about proxy settings. I tried that and for some days I had good speed. Is this option still valid?

Bye
Franz

HAL9000
Volunteer tester
Joined: 11 Sep 99
Posts: 6534
Credit: 196,805,888
RAC: 57
United States
Message 1200033 - Posted: 26 Feb 2012, 17:16:36 UTC
Last modified: 26 Feb 2012, 17:17:16 UTC

With a 100Mb line and 226,274 active hosts, that would give about 0.442Kb per host. But our servers suffer from the C10K issue, at least last I heard. So with only 10,000 active connections, or maybe 20,000 with the 2 download servers, that would give 10Kb or 5Kb per connection, not including the overhead from all the dropped connections pounding at the door. So anything more than that is a bonus!
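
The arithmetic in this post can be checked with a few lines of Python. This is only a sketch of an even split of the line; the function name is invented here, and it assumes 1 Mbit = 1000 Kbit and ignores all overhead.

```python
def per_connection_kbit(link_mbit, connections):
    """Even split of the line, in Kbit per connection (1 Mbit = 1000 Kbit)."""
    return link_mbit * 1000 / connections


# All 226,274 active hosts connected at once
print(round(per_connection_kbit(100, 226_274), 3))
# One download server limited to about 10,000 connections (C10K)
print(per_connection_kbit(100, 10_000))
# Two download servers, about 20,000 connections in total
print(per_connection_kbit(100, 20_000))
```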

I think the guys in the lab should have a standing item on their wish list for something like "Data center rental", which would include the required servers and connection. I imagine this would cost many times what they currently spend on the connection and on maintaining the servers.
SETI@home classic workunits: 93,865 CPU time: 863,447 hours
Join the BP6/VP6 User Group (http://tinyurl.com/8y46zvu)

Francesco Forti
Joined: 24 May 00
Posts: 334
Credit: 204,421,005
RAC: 15
Switzerland
Message 1200040 - Posted: 26 Feb 2012, 17:36:38 UTC
Last modified: 26 Feb 2012, 17:49:21 UTC

Thanks for the C10K info.
Every client host can handle two DL in parallel, and I usually see speeds from 5 to 20Kb/s each when things are good, but sometimes less than 1.
With BoincTasks I can see all the DL/UL jobs of my 8 hosts here (the others are far away).
When Berkeley is out of work and the green Cricket line is falling, I see that my DL run at 150 to 200Kb each. So I think that if you at Berkeley set up some filter to allow only 10'000 clients on each of your DL servers, maybe the queue would be used better.
I don't know if you have some "tap" in your communication equipment, but if you have, try setting it to ~10'000 connections.

Is it possible?
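
For what such a "tap" could look like in software, here is a minimal sketch in Python using `asyncio.Semaphore` to cap the number of simultaneous transfers. Everything in it is hypothetical: `serve_with_cap`, the client count, and the sleep that stands in for a real transfer are illustrations only, not the project's server code.

```python
import asyncio


async def serve_with_cap(n_clients, max_active, transfer_seconds=0.01):
    """Run n_clients simulated transfers, at most max_active at a time.

    Returns the peak number of transfers that were active together.
    """
    gate = asyncio.Semaphore(max_active)  # the "tap"
    active = 0
    peak = 0

    async def client():
        nonlocal active, peak
        async with gate:  # wait here until a slot is free
            active += 1
            peak = max(peak, active)
            await asyncio.sleep(transfer_seconds)  # stand-in for the transfer
            active -= 1

    await asyncio.gather(*(client() for _ in range(n_clients)))
    return peak


# 50 clients knocking at the door, but only 10 let in at once
print(asyncio.run(serve_with_cap(50, 10)))
```

Clients beyond the cap simply wait at the gate instead of opening a connection that crawls or gets dropped, which is the behaviour the post is asking for.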

Wiggo
Joined: 24 Jan 00
Posts: 34744
Credit: 261,360,520
RAC: 489
Australia
Message 1200128 - Posted: 26 Feb 2012, 20:20:00 UTC - in response to Message 1200040.  

There has been a known problem with the connection for quite some time, which Matt has already posted about here:
Speaking of network competition - yes, we're aware that we are dropping all kinds of connections during uploads/downloads. This isn't because of our router (which was definitely the problem over the summer before we added RAM to it), but somewhere else further up the pipeline. Still figuring this out, but it's certainly load related.
That is why a lot of us have been using proxies to get around that particular bit of hardware. The problem mainly happens when AP's are being handed out or we're having a shortie storm.

Cheers.

Alex Erne
Joined: 18 Nov 08
Posts: 12
Credit: 800,330
RAC: 0
Netherlands
Message 1200135 - Posted: 26 Feb 2012, 20:31:30 UTC

I have no problem with the slow download itself. But the downloads are constantly aborted and then put on hold by BOINC for 1-2 hours, which leaves my work queue almost empty (last task running and only 2 to download).
<--- Searching for this one ;-)

Warren Kozey
Joined: 6 Jul 99
Posts: 54
Credit: 5,026,721
RAC: 0
Canada
Message 1200188 - Posted: 26 Feb 2012, 23:43:41 UTC - in response to Message 1200135.  

I have an upload speed of 5Mbit and a download speed of 10Mbit, with 6 local machines sharing the connection. Only one uses the web; the others just run SETI, and they all have units in the download queue, but none have completed. I think the download limit should be only 10 units in your crunch queue: return one, then get a new one. All my machines are slower and have small hard drives, and an Astropulse unit takes something like 250+ hours by the time they finish it. CUDA units take about 4 hours each on the old 8600GT and about 5 hours on the old 9300GE card. The real problem is all these faster computers that do SETI 6.03 in very little time, and CUDA units in about 30 seconds, and keep hammering the servers so that they cannot get any units out. I think any computer faster than a 1.8GHz core clock should be made to do the large Astropulse units and leave the small units to us people with old systems. We are no threat to your stats; we just believe in the science and are here to support the project.
Gimme BEER and WU's!!!!

bill
Joined: 16 Jun 99
Posts: 861
Credit: 29,352,955
RAC: 0
United States
Message 1200191 - Posted: 26 Feb 2012, 23:57:35 UTC - in response to Message 1200188.  
Last modified: 27 Feb 2012, 0:00:51 UTC

No argument with what you propose but my guess is that would
have to be decided from the server side.

How would you code that? Or could you find a volunteer to
write the code because the people at Berkeley are already up
to their eyeballs in work.

HAL9000
Volunteer tester
Joined: 11 Sep 99
Posts: 6534
Credit: 196,805,888
RAC: 57
United States
Message 1200211 - Posted: 27 Feb 2012, 0:57:29 UTC - in response to Message 1200188.  

So my 24 core machine can only do 10 at a time? :(

Are you using the stock S@H application? I wonder as my GT8500 normally does tasks in 1.5 hours.
SETI@home classic workunits: 93,865 CPU time: 863,447 hours
Join the BP6/VP6 User Group (http://tinyurl.com/8y46zvu)

Warren Kozey
Joined: 6 Jul 99
Posts: 54
Credit: 5,026,721
RAC: 0
Canada
Message 1200524 - Posted: 28 Feb 2012, 3:36:14 UTC - in response to Message 1200211.  

I am running BOINC version 6.12.34 (x64), wxWidgets 2.8.10, and the Lunatics 0.39 optimized clients. Until I installed the Lunatics optimized client and the latest Nvidia 295.73 driver, my machines only crunched setiathome_enhanced 6.03. I did not think my old 9300GE and 8600GT cards were capable of doing CUDA. But since I installed the Lunatics 0.39 x64 client and the latest video drivers, this old hardware is crunching setiathome_enhanced 6.10 (cuda_fermi) WUs.
I have noticed that the CUDA unit times have gone down significantly since upgrading the video driver from 285.62 to 295.73.
Now the CUDA units range from about 25 minutes up to 1.5 hours.
With older drivers and the stock client that runs on BOINC, I never got CUDA units, just the 6.03 enhanced CPU units.
Gimme BEER and WU's!!!!

Blake Bonkofsky
Volunteer tester
Joined: 29 Dec 99
Posts: 617
Credit: 46,383,149
RAC: 0
United States
Message 1200527 - Posted: 28 Feb 2012, 4:25:16 UTC - in response to Message 1200188.  
Last modified: 28 Feb 2012, 4:26:52 UTC

Astropulse tasks are actually harder on the servers than MB, from a time vs. size comparison. They are roughly 22 times the size, but only take about 8 times as long to complete.

We need more bandwidth, there is no other way around it. The only other possibility is making the WU's longer to complete, by increasing the precision used to crunch them, while not making them any bigger. That is what happened when MB-Enhanced was released. If you double the crunching time, you effectively need half of the original bandwidth. That is just a patch though.

When the servers have work, the line is almost always maxed out. Only when we've been up for several days or more, and there is no new Astropulse going out, do we see a drop off in the graphs. If there is any significant work shortage, it's days again before we aren't maxing the line out.
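
The size vs. time argument in this post comes down to one ratio. A small Python sketch, assuming a host downloads continuously to keep itself busy; the function name is made up for illustration, and the 22x and 8x figures are the rough ones quoted above.

```python
def relative_bandwidth_demand(size_ratio, time_ratio):
    """Download bandwidth needed per hour of crunching, relative to a baseline.

    size_ratio: task size compared with the baseline task
    time_ratio: crunch time compared with the baseline task
    """
    return size_ratio / time_ratio


# Astropulse vs. MB: about 22 times the size, about 8 times the crunch time
print(relative_bandwidth_demand(22, 8))
# Same task size but double the crunch time
print(relative_bandwidth_demand(1, 2))
```

The first call gives 2.75, meaning a host kept busy with Astropulse pulls about 2.75 times the data per hour of crunching that it would with MB. The second gives 0.5, which is the "double the crunching time, half the bandwidth" point.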

Francesco Forti
Joined: 24 May 00
Posts: 334
Credit: 204,421,005
RAC: 15
Switzerland
Message 1200560 - Posted: 28 Feb 2012, 8:26:26 UTC

Since yesterday, 27 Feb, the speed of my connection is 3 times faster.
I didn't make any change here, and the green line in the graphs for gigabitethernet2_3 was always at max level.

It's just like someone closed the tap a little bit :-) optimizing the data flow


Warren Kozey
Joined: 6 Jul 99
Posts: 54
Credit: 5,026,721
RAC: 0
Canada
Message 1200576 - Posted: 28 Feb 2012, 9:41:37 UTC - in response to Message 1200543.  

There is a different thread regarding the 295.73 driver update. Many, including me, with bad results.

Just stop BOINC - Uninstall Lunatics - Do a 'Clean Install' of the 285.62 driver, and reinstall the Lunatics application.

Why would I do that? I have no problems or errors with the 295.73 driver; all is up and running well. I have the 295.73 driver on 3 systems: one with a Sempron 3200+ and a 9300GE 256meg, an Athlon 64 3200+ with an 8600GT 512meg, and an Athlon 64 with another 8600GT 512meg.


Gimme BEER and WU's!!!!

Claggy
Volunteer tester
Joined: 5 Jul 99
Posts: 4654
Credit: 47,537,079
RAC: 4
United Kingdom
Message 1200588 - Posted: 28 Feb 2012, 10:42:22 UTC - in response to Message 1200543.  

There is a different thread regarding the 295.73 driver update. Many, including me, with bad results.

Just stop BOINC - Uninstall Lunatics - Do a 'Clean Install' of the 285.62 driver, and reinstall the Lunatics application.

You don't need to uninstall the Lunatics apps to change driver, but what you might want to do is disable BOINC Manager from starting up until the driver install is complete.
Just untick 'Run Manager at login?' (in Advanced>Options with BOINC 6.10.x, and Tools>Options with BOINC 6.12.x and later). Note: you'll have to fully exit BOINC Manager before this will work.
Once the drivers are installed, retick 'Run Manager at login?' and fully exit BOINC Manager again.

Claggy

Claggy
Volunteer tester
Joined: 5 Jul 99
Posts: 4654
Credit: 47,537,079
RAC: 4
United Kingdom
Message 1200589 - Posted: 28 Feb 2012, 10:44:02 UTC - in response to Message 1200576.  
Last modified: 28 Feb 2012, 10:44:17 UTC

What ports are these GPUs connected to their monitors with?

Claggy

LadyL
Volunteer tester
Joined: 14 Sep 11
Posts: 1679
Credit: 5,230,097
RAC: 0
Message 1200590 - Posted: 28 Feb 2012, 10:46:04 UTC - in response to Message 1200543.  

There is a different thread regarding the 295.73 driver update. Many, including me, with bad results.

Just stop BOINC - Uninstall Lunatics - Do a 'Clean Install' of the 285.62 driver, and reinstall the Lunatics application.



It's completely unnecessary to uninstall Lunatics if you want to change driver.

Of course downgrading because of issues calls for a clean driver install.

Richard Haselgrove (Project Donor)
Volunteer tester
Joined: 4 Jul 99
Posts: 14649
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1200593 - Posted: 28 Feb 2012, 10:53:42 UTC - in response to Message 1200590.  

There is a different thread regarding the 295.73 driver update. Many, including me, with bad results.

Just stop BOINC - Uninstall Lunatics - Do a 'Clean Install' of the 285.62 driver, and reinstall the Lunatics application.

It's completely unnecessary to uninstall Lunatics if you want to change driver.

Of course downgrading because of issues calls for a clean driver install.

But as Claggy says, it's important that you prevent BOINC from running automatically following the Windows restarts that are likely to be necessary during such radical driver work.

Warren Kozey
Joined: 6 Jul 99
Posts: 54
Credit: 5,026,721
RAC: 0
Canada
Message 1200750 - Posted: 28 Feb 2012, 23:54:00 UTC - in response to Message 1200645.  

They are connected through the DVI ports to an 8-port KVM, and the KVM connects to the main keyboard, video display, and mouse.
Like I said, why would I want to change the display driver? It works just fine with the two Asus EN8600GT cards and the HP 9300GE. BOINC and setiathome_enhanced 6.10 (cuda_fermi) have no problems with the 295.73 Nvidia drivers on my hardware. In fact the GPUs seem to like it: I have yet to have any computation errors using the 295.73 drivers. Again I ask, why would I revert to the old driver? Even with Lunatics 0.39 installed, my cards got no CUDA units with the 285.62 drivers. With the 295.73 driver I do get CUDA units, and lots of them, and I get no computation errors at all; they have been running error free since I installed them...
Gimme BEER and WU's!!!!

Francesco Forti
Joined: 24 May 00
Posts: 334
Credit: 204,421,005
RAC: 15
Switzerland
Message 1201632 - Posted: 2 Mar 2012, 7:59:53 UTC
Last modified: 2 Mar 2012, 8:00:25 UTC

I don't know if someone in Berkeley has closed some tap (a little), but what I see now is that:
1) the data comm graph isn't always at max speed
2) there are 256,832 results ready to send
3) my hosts have the max of results to crunch
4) I don't have any transfer pending

So this is the ideal condition. :-)
Of course if it is so for everyone!

Is it so?

kittyman (Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor)
Volunteer tester
Joined: 9 Jul 00
Posts: 51468
Credit: 1,018,363,574
RAC: 1,004
United States
Message 1201644 - Posted: 2 Mar 2012, 8:46:16 UTC - in response to Message 1201632.  

Kitties are very happy here right now, Francesco!
"Freedom is just Chaos, with better lighting." Alan Dean Foster

Gundolf Jahn
Joined: 19 Sep 00
Posts: 3184
Credit: 446,358
RAC: 0
Germany
Message 1201652 - Posted: 2 Mar 2012, 9:04:26 UTC - in response to Message 1201632.  

I don't know if someone in Berkeley has closed some tap (a little)...

Yes, of course: AstroPulse generation is suspended.

Gruß,
Gundolf
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.