More of the Same (Jul 09 2009)

JPP

Joined: 31 May 99
Posts: 18
Credit: 59,436,360
RAC: 47
France
Message 916569 - Posted: 10 Jul 2009, 18:11:17 UTC - in response to Message 916554.  

hi

I have no idea of the network you are using:
- which products are involved in the data path;
- how it is connected to the Internet;
- how an upload/download is supposed to work.

What I perceive is that uploads (the results from me to you) generally fail, and badly, but when they do manage to **start**, they run rather fast - almost the usual speed - and then the last phase is slow (when everything is copied and we wait for some ack). The workunit downloads (from you to me) are fast. So I could believe the trouble is in establishing the copy session upstream (from me to you).

If that is true, I would not be totally convinced it is a network problem without knowing who (which software/system) is involved in accepting results from the users. I am a bit too far away to plug in my sniffer...

cheers
jeanpierre@jpp
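
FWIW, here is a minimal sketch of the kind of measurement jeanpierre describes: timing the TCP connect separately from the data transfer, to see whether session setup or the copy itself is what stalls. The host, port, and payload are placeholders, not the project's real endpoints.

    import socket
    import time

    HOST = "uploads.example.org"   # placeholder, not the real upload server
    PORT = 80
    PAYLOAD = b"x" * 65536         # crude stand-in for a result upload

    t0 = time.monotonic()
    try:
        sock = socket.create_connection((HOST, PORT), timeout=30)
    except OSError as err:
        print(f"connect failed after {time.monotonic() - t0:.1f}s: {err}")
    else:
        t_connect = time.monotonic() - t0
        sock.sendall(PAYLOAD)
        t_send = time.monotonic() - t0 - t_connect
        print(f"connect took {t_connect:.2f}s, send took {t_send:.2f}s")
        sock.close()

If connects routinely time out while established sessions move data quickly, that points at session establishment (or whatever accepts the connection) rather than raw bandwidth.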

ID: 916569
tullio
Volunteer tester
Joined: 9 Apr 04
Posts: 8797
Credit: 2,930,782
RAC: 1
Italy
Message 916572 - Posted: 10 Jul 2009, 18:25:11 UTC - in response to Message 916554.  


In all seriousness, though, I've spent the past four decades primarily as a software-type, but my first serious job was in "Design Automation" -- helping build the Burroughs B6800 and B6900 mainframes through software.

The line between software and hardware has always been a little blurry for me.

I will admit that because I'm a software guy, I prefer a smaller iron when doing PWB work, because things happen a little slower. If I did it all the time, I'd want a hotter iron so I could go fast.

I have a collection of books and manuals (I was a technical writer) about past machines. One of them you must certainly know:
"Computer System Organization" by Elliott I. Organick, in the B5700/B6700 series. Very smart systems.
Tullio
ID: 916572
mr.mac52
Joined: 18 Mar 03
Posts: 67
Credit: 245,882,461
RAC: 0
United States
Message 916581 - Posted: 10 Jul 2009, 19:41:39 UTC

Sorry for the length of this post.

From: Gary Charpentier’s note 916371

Answered a long time ago.
Tweenday Two (Dec 27 2007)

In reality, we have a 1GBit connection to the world via Hurricane Electric, but alas this is constrained by a 100 Mbit fiber coming out of the lab down to campus - it will take some big $$$ to upgrade that, which may happen sooner or later (as it would not only benefit us).

I mentioned how we seem to have a 60Mbit ceiling...

... We have gigabit all over our server closet (more or less - some older servers are 100Mbit going into the 1Gbit switch).


From: seti@elrcastor.com’s note 916409

One problem is that it could be multimode fiber with a length short enough for a 100 Mb connection to work - less than 2 km. But to do gigabit over multimode with a WS-G5486 LX/LH GBIC you are limited to 550 m. To get gigabit to go 2 km or more you need single-mode fiber; when you use single-mode fiber with an LX/LH GBIC you get 10 km.


From: Ned Ludd’s note 916550

I know I pointed to 6a) but many of the solutions proposed (P2P, Torrent, offsite upload/download servers) feel like taking a problem and moving it around.

The problem is the 100 megabit pipe, and the servers themselves.

It looks to me like the current infrastructure can handle the current average load. The problem is how the load builds during an outage, and how networking works when the load is near or above 100%.

Increasing the bandwidth (and adding more/better servers) raises the 100% mark, and is always going to be a good idea.

I just wonder if there are some others that could help, without reshuffling the problem.


----
OK, I accept the challenge of looking at the 100 Mb pipe.

The use of laser-based GBICs will not extend the range of multimode fiber beyond the 2 km limit, and most limit it to 500 m or less. To go beyond that you need products that use other technologies to exceed the distance limitations.

There is some basic information about the existing fiber link needed before we can realistically speculate on potential solutions.

#1 is understanding the campus infrastructure. Many campus-level fiber networks have used a higher-level switch to drive individual drops to a point of presence (POP) in each building. If that is the case here, whether facilities would agree to let this specific fiber-optic pair be used as 'dark fiber' is critical. If it is dark fiber, the owner of the fiber usually does not limit what equipment uses it.

#2 is the type of fiber in use. The common multimode fiber in use today is either 50/125 (OM2) or 62.5/125 (OM1); there was a 100/140 variant, but I've never seen it in the field. Knowing this can give us some idea of the optical losses to expect.

#3 is the length of the physical fiber in use today. Distance is only one of the limiting factors, but it is important. When facilities install fiber infrastructure it is generally not a shortest-path run. If the real length is known, fine; if not, an OTDR can estimate the length as well as the overall attenuation of the cable. It is also useful to know whether any patch panels are in use, as these introduce more light loss and possible problems.

#4 is the light loss in this specific fiber run. There are two ways this can be specified: round-trip light loss, assuming equal loss for the pair, or loss per individual fiber, so that we really know the quality of each strand. Most optical testers will test at more than one wavelength, which is useful when looking at current technologies to solve the speed problem (a worked example follows).
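
To make #4 concrete, here is a worked loss budget with illustrative numbers, not measurements from this link: received power is transmit power minus fiber attenuation and connector losses,

$$P_{rx} = P_{tx} - \alpha L - N_{conn}\,l_{conn} = -9\,\text{dBm} - (1.0\,\text{dB/km})(2\,\text{km}) - 2(0.5\,\text{dB}) = -12\,\text{dBm},$$

which must stay above the receiver sensitivity (around -19 dBm for typical gigabit optics) with a few dB of margin. An OTDR trace would replace these assumed figures with real ones.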

Once these questions are answered we can proceed to consider the options that might work in this specific environment. What options can we speculate on without this info? I was hoping you would ask.

#1: if the length is 2 km or less and we know the light-loss figures, the laser GBIC solution could work, provided the equipment using the GBICs can work at the higher link speeds and distances.

#2: if #1 is not valid, we could consider using wavelength division multiplexing (WDM) to run separate wavelengths in parallel and reach higher speeds.

Two products that I've seen in use with multimode fiber are the PL-400 and PL-1000 by PacketLight Networks. The PL-400 can handle GbE, the PL-1000 10GbE, and, depending on the physical fiber, multiples of these interfaces. They have options for either Coarse WDM (CWDM) or Dense WDM (DWDM); CWDM is usually significantly less expensive. Speaking of costs, they are not inexpensive, but I think much less costly than installing new fiber infrastructure, as they are built for this very purpose.

If on the other hand, facilities will not allow the use of their physical fiber as dark fiber, you are at their mercy on costs to get higher speeds to your building.

FWIW and YMMV

John

ID: 916581
KWSN THE Holy Hand Grenade!
Volunteer tester
Joined: 20 Dec 05
Posts: 3187
Credit: 57,163,290
RAC: 0
United States
Message 916582 - Posted: 10 Jul 2009, 19:49:09 UTC - in response to Message 916554.  
Last modified: 10 Jul 2009, 19:55:03 UTC



Yeah, yeah, yeah, I know:

Question: how many software guys does it take to change a light bulb?

Answer: they can't, light bulbs are hardware.

Seriously, some of us software dudes know a little about hardware. A few of us even have DVMs, oscilloscopes, and soldering irons, and know how to use them.


Ned, the correct answer is "they don't, etc..." - it's not that they can't, it's just not in their job description! (So, like government workers everywhere, their attitude is "Not My Job!") ;-) (from a former government employee...)

Actually, this is what is known as a Joke.

It isn't necessarily true, it is meant to be humorous.

In all seriousness, though, I've spent the past four decades primarily as a software-type, but my first serious job was in "Design Automation" -- helping build the Burroughs B6800 and B6900 mainframes through software.

The line between software and hardware has always been a little blurry for me.

I will admit that because I'm a software guy, I prefer a smaller iron when doing PWB work, because things happen a little slower. If I did it all the time, I'd want a hotter iron so I could go fast.


Yes, I know it's a joke; just one that isn't being told correctly... :-)

The real joke is:
"How many software engineers (or "programmers") does it take to change a light bulb?"
"NONE - that's a hardware problem!"

--- after a string of other "light bulb" jokes... This was told by an old-time computer (IBM 360, 370, etc.) operator - who's caught hell from both sides!

Hello, from Albany, CA!...
ID: 916582
1mp0£173
Volunteer tester
Joined: 3 Apr 99
Posts: 8423
Credit: 356,897
RAC: 0
United States
Message 916584 - Posted: 10 Jul 2009, 19:52:43 UTC - in response to Message 916572.  


In all seriousness, though, I've spent the past four decades primarily as a software-type, but my first serious job was in "Design Automation" -- helping build the Burroughs B6800 and B6900 mainframes through software.

The line between software and hardware has always been a little blurry for me.

I will admit that because I'm a software guy, I prefer a smaller iron when doing PWB work, because things happen a little slower. If I did it all the time, I'd want a hotter iron so I could go fast.

I have a collection of books and manuals (I was a technical writer) about past machines. One of them you must certainly know:
"Computer System Organization" by Elliott I. Organick, in the B5700/B6700 series. Very smart systems.
Tullio

They still are; Unisys "Libra" machines run the same architecture. I remember the book, but it was a long time ago.
ID: 916584
1mp0£173
Volunteer tester
Joined: 3 Apr 99
Posts: 8423
Credit: 356,897
RAC: 0
United States
Message 916586 - Posted: 10 Jul 2009, 19:56:29 UTC - in response to Message 916581.  

If on the other hand, facilities will not allow the use of their physical fiber as dark fiber, you are at their mercy on costs to get higher speeds to your building.

They won't. That isn't to say that they aren't open to some pretty original thinking, because the gigabit pipe SETI has comes from PAIX, through campus-provided facilities, and into the closet.

What they don't want is someone implementing some solution, and then being stuck with trying to fix it if it breaks later.

ID: 916586
JPP
Joined: 31 May 99
Posts: 18
Credit: 59,436,360
RAC: 47
France
Message 916647 - Posted: 10 Jul 2009, 23:15:22 UTC - in response to Message 916586.  

Well, in my previous reply I was trying to understand the exact current situation and therefore to identify the exact bottleneck; I was not trying to play Star Wars.
It is true the situation is a bit peculiar; I am available if network tech expertise is really needed.
BTW, LEDs are used for MM (multimode) GBICs, while lasers are used for SM (single-mode) GBICs.
cheers
jeanpierre@jpp

ID: 916647
John
Joined: 5 Jun 99
Posts: 30
Credit: 77,663,734
RAC: 236
United States
Message 916668 - Posted: 11 Jul 2009, 0:20:47 UTC

Perhaps I am oversimplifying my take on the solution to the problem of "log jams" on the SETI servers, BUT it seems to me that CUDA work units need to be at least 10 times longer. On one of my systems a CUDA work unit runs from 6 to 15 minutes, and since there are so many of these, making them larger would cut down on the number of requests and therefore the traffic to the servers. While I do not know the percentage of total traffic that CUDA units make up, it certainly must be significant. Is this a reasonable partial solution?
Later. John (OS/2 Warp team founder/administrator)
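
The arithmetic behind the suggestion is simple; a back-of-envelope sketch using the post's own 6-15 minute figure (nothing here is measured):

    # A GPU finishing a task every ~10 minutes makes ~144 scheduler
    # contacts per day; tasks ten times longer cut that to ~14.
    MINUTES_PER_TASK = 10          # midpoint-ish of the 6-15 min range
    requests_per_day = 24 * 60 / MINUTES_PER_TASK
    print(requests_per_day, requests_per_day / 10)   # 144.0 14.4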
ID: 916668
ML1
Volunteer moderator
Volunteer tester
Joined: 25 Nov 01
Posts: 20084
Credit: 7,508,002
RAC: 20
United Kingdom
Message 916693 - Posted: 11 Jul 2009, 1:51:07 UTC - in response to Message 916581.  

... Two products that I’ve seen in use with multimode fiber are the PL-400 and PL-1000 by PacketLight Networks.

Well, that's a start even though the PL-400 looks to be overkill. How much for two of 'em?

Others?

Note that we don't know if the fibre is multi-mode or single-mode. However, considering the 2km+ length, I would expect that single-mode fibre would have been installed. (Note: I have no connection there and don't actually know. Any clues Matt?)

Keep searchin',
Martin

See new freedom: Mageia Linux
Take a look for yourself: Linux Format
The Future is what We all make IT (GPLv3)
ID: 916693
W-K 666 Project Donor
Volunteer tester
Joined: 18 May 99
Posts: 18996
Credit: 40,757,560
RAC: 67
United Kingdom
Message 916702 - Posted: 11 Jul 2009, 2:25:16 UTC - in response to Message 916668.  

Perhaps I am oversimplifying my take on the solution to the problem of "log jams" on the SETI servers, BUT it seems to me that CUDA work units need to be at least 10 times longer. On one of my systems a CUDA work unit runs from 6 to 15 minutes, and since there are so many of these, making them larger would cut down on the number of requests and therefore the traffic to the servers. While I do not know the percentage of total traffic that CUDA units make up, it certainly must be significant. Is this a reasonable partial solution?
Later. John (OS/2 Warp team founder/administrator)

One has to agree that the CUDA tasks need to be made longer. BUT there are no specific CUDA tasks; the tasks are all the same, it is just that there are two methods of processing them, either on the CPU or on the GPU. The results from both methods are the same, or else they wouldn't validate.

The Berkeley staff have the knowledge to modify the CPU app to make it more sensitive etc., but I am not sure about the CUDA GPU app, as that was mainly done by Nvidia. It is generally thought that this was probably a one-time gift from Nvidia, or else we would have had the problem with VLAR tasks sorted by now.
ID: 916702
John
Joined: 5 Jun 99
Posts: 30
Credit: 77,663,734
RAC: 236
United States
Message 916737 - Posted: 11 Jul 2009, 3:56:27 UTC - in response to Message 916702.  

It seems that once a work unit for CUDA gets on my machines it is marked as a CUDA unit - the application is "Seti@home Enhanced 6.08 (CUDA)" - and no other SETI application will process it. Perhaps it is not specifically a CUDA work unit until I receive it, but it seems like it is. Oh well, I will leave that determination up to the software authors. Later. John (OS/2 Warp team founder/administrator)
ID: 916737
W-K 666 Project Donor
Volunteer tester
Joined: 18 May 99
Posts: 18996
Credit: 40,757,560
RAC: 67
United Kingdom
Message 916742 - Posted: 11 Jul 2009, 4:29:53 UTC - in response to Message 916737.  

It seems that once a work unit for CUDA gets on my machines it is marked as a CUDA unit - the application is "Seti@home Enhanced 6.08 (CUDA)" - and no other SETI application will process it. Perhaps it is not specifically a CUDA work unit until I receive it, but it seems like it is. Oh well, I will leave that determination up to the software authors. Later. John (OS/2 Warp team founder/administrator)

Yes, once the task is received on your computer it has been marked to be processed by either the GPU (CUDA) app or the CPU app. But as the Shameless Plug for Lunatics Forum - ReSchedule 1.7 thread shows, a way to get the most from your computer is to reschedule the tasks that run slowly with the CUDA app - the VLAR tasks - onto the CPU, and vice versa.
It is not the best solution, but it is one that seems to work for most people, and it is better than the previous method, which was to 'kill VLARs' in the CUDA app by introducing an error - which caused reissues and therefore more server load.
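
A rescheduler has to identify the VLAR tasks before it can move them. Here is a minimal sketch of just that detection step; the project-directory path, the <true_angle_range> field, and the 0.12 cutoff are my assumptions, and the real tool also rewrites which app each task is assigned to, which this deliberately does not attempt:

    import glob
    import re

    VLAR_THRESHOLD = 0.12  # assumed cutoff for "very low angle range"

    def find_vlars(project_dir: str) -> list[str]:
        """Return paths of workunit files whose angle range looks VLAR."""
        vlars = []
        for path in glob.glob(project_dir + "/*"):
            try:
                with open(path, encoding="utf-8", errors="ignore") as f:
                    head = f.read(4096)      # the WU header sits near the top
            except OSError:
                continue
            m = re.search(r"<true_angle_range>\s*([0-9.]+)", head)
            if m and float(m.group(1)) < VLAR_THRESHOLD:
                vlars.append(path)
        return vlars

    print(find_vlars("BOINC/projects/setiathome.berkeley.edu"))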
ID: 916742
Jeff Griffith
Joined: 14 May 99
Posts: 3
Credit: 185,317
RAC: 0
United States
Message 916930 - Posted: 11 Jul 2009, 22:03:56 UTC - in response to Message 916742.  

So, when oh when are our Macs going to get the Cuda access so that we can start crunching faster? I've got two Intel Macs rarin' to go and so do a lot of the rest of us from the looks of your stats. Since we don't have Snow Leopard yet, there's no way yet to access the GPU for Boinc. Got a time frame in mind?
ID: 916930
1mp0£173
Volunteer tester
Joined: 3 Apr 99
Posts: 8423
Credit: 356,897
RAC: 0
United States
Message 917182 - Posted: 12 Jul 2009, 17:01:22 UTC - in response to Message 916930.  

So, when oh when are our Macs going to get the Cuda access so that we can start crunching faster? I've got two Intel Macs rarin' to go and so do a lot of the rest of us from the looks of your stats. Since we don't have Snow Leopard yet, there's no way yet to access the GPU for Boinc. Got a time frame in mind?

When someone sits down and writes it.

Nvidia did the Windows app. I don't know if the Mac drivers have a similar API, but I suspect that someone who has a Mac and knows the OS could do the port, using the WinApp as a model.

It'll likely be someone outside SETI@Home.
ID: 917182
ivan
Volunteer tester
Joined: 5 Mar 01
Posts: 783
Credit: 348,560,338
RAC: 223
United Kingdom
Message 917199 - Posted: 12 Jul 2009, 17:54:26 UTC - in response to Message 917182.  
Last modified: 12 Jul 2009, 17:55:38 UTC

So, when oh when are our Macs going to get the Cuda access so that we can start crunching faster? I've got two Intel Macs rarin' to go and so do a lot of the rest of us from the looks of your stats. Since we don't have Snow Leopard yet, there's no way yet to access the GPU for Boinc. Got a time frame in mind?

When someone sits down and writes it.

Nvidia did the Windows app. I don't know if the Mac drivers have a similar API, but I suspect that someone who has a Mac and knows the OS could do the port, using the WinApp as a model.

It'll likely be someone outside SETI@Home.


The CUDA toolkit appears to be identical between Linux and Mac, so the difference is just in the compiler and linker. I'm not sure that it's that easy to port from Windows code, though. I tried crunch3r's Linux code on my new net-top and couldn't resolve all the libraries (I suspect he used V2.1, while I tried V2.2), so I plonked in Win 7 RC and it's now crunching along with the CPU matching my laptop's throughput and the GPU adding another 120% or so.
ID: 917199
Edywison
Joined: 31 May 09
Posts: 13
Credit: 20,563
RAC: 0
Indonesia
Message 917205 - Posted: 12 Jul 2009, 18:29:08 UTC - in response to Message 916334.  

What you need is to control the number of tasks sent out, and thereby reduce the number of completed tasks that need to be received - because, simply, SETI@HOME doesn't have the required resources.

I guess some users might be frustrated at not being able to receive tasks, but SETI@HOME is a science project, not some CRUNCHING COMPETITION. Why should we forcefully request tasks?
ID: 917205
Edywison
Joined: 31 May 09
Posts: 13
Credit: 20,563
RAC: 0
Indonesia
Message 917212 - Posted: 12 Jul 2009, 18:40:07 UTC - in response to Message 916334.  

A programmer who can't handle a LAN is unforgivable.
A programmer who can't handle a WAN is forgivable.

A programmer who can't even handle the SETI@HOME network is TOTALLY FINE, nothing to worry about.

Because I am a programmer, and I know some of this network stuff is totally outside our field.
ID: 917212
1mp0£173
Volunteer tester
Joined: 3 Apr 99
Posts: 8423
Credit: 356,897
RAC: 0
United States
Message 917226 - Posted: 12 Jul 2009, 19:51:40 UTC - in response to Message 917205.  

what you need is to control the number of tasks to be sent out, therefore reduce the number of completed tasks need to be received. Because, simply SETI@HOME doesn't have the required resources.

I'm not at all sure I agree with that statement, because of Zimmerman's Law:

"Nobody notices when things go right."

We only talk about resources during unusual loads caused by outages, failures, runs of "shorties", or whatever.

But let's take your model for a moment. Picture an average cache of four days, and you "flow control" the outbound work to the most you can comfortably get back each day.

Then, you have a four day outage.

When you come back up, you're going to get four days' worth of work back within the maximum retry interval, which is about four hours.

That's the wrong place to control flow.
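
A back-of-envelope version of the point, using the figures from the post (a four-day cache and a retry backoff capped around four hours - quoted, not measured):

    # If every client holds four days of finished work and retries
    # within a ~4 hour backoff cap, the backlog compresses into the
    # first retry window after an outage.
    DAYS_OF_BACKLOG = 4
    NORMAL_RATE = 1.0              # results/hour the pipe absorbs, normalized
    MAX_RETRY_HOURS = 4

    backlog = DAYS_OF_BACKLOG * 24 * NORMAL_RATE   # queued results
    burst = backlog / MAX_RETRY_HOURS              # offered load at restart
    print(f"offered load right after the outage: {burst:.0f}x normal")
    # -> 24x normal: throttling *new* work wouldn't blunt that spike.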

ID: 917226
HAL9000
Volunteer tester
Joined: 11 Sep 99
Posts: 6534
Credit: 196,805,888
RAC: 57
United States
Message 917278 - Posted: 13 Jul 2009, 3:42:08 UTC - in response to Message 916930.  
Last modified: 13 Jul 2009, 3:43:09 UTC


So, when oh when are our Macs going to get the Cuda access so that we can start crunching faster? I've got two Intel Macs rarin' to go and so do a lot of the rest of us from the looks of your stats. Since we don't have Snow Leopard yet, there's no way yet to access the GPU for Boinc. Got a time frame in mind?

You could use Boot Camp and set up a Windows partition to run BOINC on. :D
SETI@home classic workunits: 93,865 CPU time: 863,447 hours
Join the BP6/VP6 User Group: http://tinyurl.com/8y46zvu
ID: 917278
Peter M. Ferrie
Volunteer tester
Joined: 28 Mar 03
Posts: 86
Credit: 9,967,062
RAC: 0
United States
Message 917352 - Posted: 13 Jul 2009, 13:41:14 UTC

Isn't the current upload/download server set to allow you to download or upload 2 workunits at a time? Would setting it to 1 at a time help a little? Also, there is no need to download or upload at 200+ kbps when it would get there at 10 kbps (numbers are just out of my head), so traffic shaping or setting an upload/download speed limit would ease the burden...
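
Traffic shaping of that sort is usually done with a token bucket. A minimal sketch using the post's made-up 10 kbps-ish figure (the rate, burst, and chunk sizes here are placeholders):

    import time

    class TokenBucket:
        """Cap a transfer at a fixed byte rate."""
        def __init__(self, rate_bytes_per_s: float, burst: float):
            self.rate = rate_bytes_per_s
            self.capacity = burst
            self.tokens = burst
            self.last = time.monotonic()

        def consume(self, n: int) -> None:
            """Block until n bytes' worth of tokens are available."""
            while True:
                now = time.monotonic()
                self.tokens = min(self.capacity,
                                  self.tokens + (now - self.last) * self.rate)
                self.last = now
                if self.tokens >= n:
                    self.tokens -= n
                    return
                time.sleep((n - self.tokens) / self.rate)

    bucket = TokenBucket(rate_bytes_per_s=10_000, burst=20_000)
    for chunk in range(5):
        bucket.consume(10_000)     # pace each 10 KB chunk of an upload
        print(f"sent chunk {chunk} at {time.monotonic():.2f}")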
ID: 917352