Message boards :
Number crunching :
What's up with NEZ?
ML1 Send message Joined: 25 Nov 01 Posts: 20289 Credit: 7,508,002 RAC: 20 |
[...] OK, so Atlas includes: Debian GNU/Linux powers Max Planck Institute 32.8 TFlops supercomputer ... hierarchical fully non-blocking network. The EFX 1000 core switch features 144 10 Gb/s CX4 ports and currently connects to 32 TRX100 edge switches, which feature 48 1 Gb/s ports and 4x10 Gb/s uplinks, reaching 2880 Gb/s. Their Sun Fire X4500s are also directly connected to the core switch. ... which is a fair bit better than your usual 1 Gbit/s LAN interconnect! I think that rates more as a cluster than a mere loosely connected farm. In contrast, the interconnect for Merlin and Morgane looks to be just a hierarchically connected Gbit LAN. Myself, I'd rate them as big farms, but you can call them a cluster if the interconnect isn't a bottleneck for the overall performance... Happy fast crunchin', Martin See new freedom: Mageia Linux Take a look for yourself: Linux Format The Future is what We all make IT (GPLv3)
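A back-of-envelope check of the quoted switch figures, assuming the 2880 Gb/s aggregate counts both directions of each full-duplex core port (an assumption, since the quote does not say how the figure was derived):

```python
# Sanity-check the quoted EFX 1000 / TRX100 numbers.
core_ports = 144            # 10 Gb/s CX4 ports on the EFX 1000 core switch
port_speed_gbps = 10

# Assumption: 2880 Gb/s counts both directions of each full-duplex port.
aggregate_gbps = core_ports * port_speed_gbps * 2
print(aggregate_gbps)       # 2880

edge_switches = 32          # TRX100 edge switches
uplinks_per_edge = 4        # 4x10 Gb/s uplinks per edge switch
edge_uplink_gbps = edge_switches * uplinks_per_edge * port_speed_gbps
print(edge_uplink_gbps)     # 1280 Gb/s of edge-to-core uplink capacity
```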
tullio Send message Joined: 9 Apr 04 Posts: 8797 Credit: 2,930,782 RAC: 1 |
Do all the ATLAS nodes share a global memory (multiprocessor like) or each has its own private memory (multicomputer) and communicate by passing messages like PVM? This, in my opinion, is the focal point. Tullio |
ML1 Send message Joined: 25 Nov 01 Posts: 20289 Credit: 7,508,002 RAC: 20 |
Do all the ATLAS nodes share a global memory (multiprocessor like) or each has its own private memory (multicomputer) and communicate by passing messages like PVM? This, in my opinion, is the focal point. All three use "off-the-shelf" PC parts. Each node has a multi-core CPU with memory shared just between the cores on that motherboard, as per any PC. Inter-node communication goes through the OS over the network connections. Physically, it is like having a shelf of individual PCs but with a 10 Gbit/s LAN instead of the more common 100 Mbit/s or 1 Gbit/s LAN. The focus is: When is the inter-node communication "fast enough" to call the system a cluster? How "closely" must the individual CPUs work on the same data? Happy crunchin', Martin
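The private-memory point above can be sketched in a few lines: one "node" cannot read another's memory, so a partial result has to travel as a message through the OS network stack. This is a minimal loopback sketch with Python's standard sockets, not the project's actual code; a real cluster would use MPI, PVM or similar over the LAN.

```python
# Two simulated nodes with private memory, communicating only by
# message passing over a (loopback) network connection.
import socket
import threading

received = []                        # node B's private memory

server = socket.socket()             # node B's listening endpoint
server.bind(("127.0.0.1", 0))        # port 0: let the OS pick a free port
server.listen(1)

def node_b():
    """Node B: data arrives only as messages, never by direct memory access."""
    conn, _ = server.accept()
    with conn:
        received.append(conn.recv(1024).decode())

t = threading.Thread(target=node_b)
t.start()

# Node A cannot address node B's memory; it must send a message
# through the OS network stack, just as cluster nodes do over the LAN.
with socket.socket() as client:
    client.connect(server.getsockname())
    client.sendall(b"partial result from node A")

t.join()
server.close()
print(received)  # ['partial result from node A']
```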
tullio Send message Joined: 9 Apr 04 Posts: 8797 Credit: 2,930,782 RAC: 1 |
Do all the ATLAS nodes share a global memory (multiprocessor like) or each has its own private memory (multicomputer) and communicate by passing messages like PVM? This, in my opinion, is the focal point. I would call it a NUMA machine (Not Uniform Memory Architecture). Right? Tullio |
ML1 Send message Joined: 25 Nov 01 Posts: 20289 Credit: 7,508,002 RAC: 20 |
[ATLAS, Merlin, Morgane] For this example, that depends on what level you consider. NUMA: Non-Uniform Memory Access or Non-Uniform Memory Architecture. You have NUMA at the motherboard level for the multiple cores of each node. However, I very much doubt that each node's CPU can directly address or access the memory space of any other node. Inter-node communication must be via 'message passing' at a level higher than the memory system. For example, the OS or application must implement communication with the other nodes. Or have they done something clever with a hypervisor or some such? (I would not expect so; I doubt they have the need.) Further thoughts on the cluster/farm distinction: I think the emphasis must be on whether the nodes are connected via a 'standard' LAN or whether something 'special' has been done to give a significantly faster/better node interconnect (cluster) than is possible with a mere LAN (farm). Happy crunchin', Martin
tullio Send message Joined: 9 Apr 04 Posts: 8797 Credit: 2,930,782 RAC: 1 |
[ATLAS, Merlin, Morgane] Here is what the top500 list says (#58): Pyramid Cluster Xeon QC 32xx 2.4 GHz, GigEthernet |
speedimic Send message Joined: 28 Sep 02 Posts: 362 Credit: 16,590,653 RAC: 0 |
Further thoughts on the cluster/farm distinction: I think the emphasis must be on whether the nodes are connected via a 'standard' LAN or whether something 'special' has been done to give significantly faster/better node interconnect (cluster) than that possible by a mere LAN (farm). What I get from reading across the net is that it's the cluster software together with the head nodes that makes the difference. In a server-farm setup, every computer does the work it gets assigned, no matter if its neighbour sits idle. It's just a bunch of somewhat independent computers. In a cluster setup you have head nodes with cluster software (Condor in this case) on them, which distributes the work and checks the status (working/idle) of every computing node. The nodes aren't independent - they only do what the head-node/cluster-software tells them to do. mic.
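The head-node idea described above can be sketched as a toy scheduler: the head node tracks which compute nodes are idle and hands queued jobs only to those. This is purely illustrative (the class and method names are invented, not Condor's actual API, and job completion is simulated instead of dispatched remotely):

```python
# Toy head-node scheduler in the spirit of the Condor setup described above.
from collections import deque

class HeadNode:
    def __init__(self, nodes):
        self.status = {n: "idle" for n in nodes}  # working/idle per node
        self.queue = deque()                      # jobs waiting to run
        self.results = {}                         # job -> node that ran it

    def submit(self, job):
        self.queue.append(job)

    def schedule(self):
        """Hand queued jobs only to nodes the head node sees as idle."""
        while self.queue:
            idle = [n for n, s in self.status.items() if s == "idle"]
            if not idle:
                break
            node, job = idle[0], self.queue.popleft()
            self.status[node] = "working"
            # A real cluster would dispatch the job to the remote node
            # and poll for completion; here we simulate instant success.
            self.results[job] = node
            self.status[node] = "idle"

head = HeadNode(["node01", "node02"])
for job in ["wu-1", "wu-2", "wu-3"]:
    head.submit(job)
head.schedule()
print(sorted(head.results))  # ['wu-1', 'wu-2', 'wu-3']
```

The contrast with a farm is that here no node picks up work on its own; everything flows through the head node's view of who is idle.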
tullio Send message Joined: 9 Apr 04 Posts: 8797 Credit: 2,930,782 RAC: 1 |
I think that the first example of a cluster was the C.mmp or Cm* (if I remember) of Carnegie Mellon University back in the Seventies. At Elettronica San Giorgio ELSAG in Genoa they had conceived a similar system (EMMA, Elaboratore Multi Mini Associativo) where each node was a minicomputer; it was used in an automated postal system later sold also to the US Postal Service. When I described it to Prof. Emilio Segrè, he told me "then I won't receive any mail". He did not trust Italian technology after having taken part in the Manhattan Project. Tullio
OzzFan Send message Joined: 9 Apr 02 Posts: 15691 Credit: 84,761,841 RAC: 28 |
Can you say CLUSTER? While that may clarify (somewhat) that Atlas could be considered a cluster in the most basic sense, bringing the focus back to the original statement: BOINC does not work with clustering in that it uses each machine as a single node, which is more like a farm. |
tullio Send message Joined: 9 Apr 04 Posts: 8797 Credit: 2,930,782 RAC: 1 |
I agree. Cheers. Tullio |
ML1 Send message Joined: 25 Nov 01 Posts: 20289 Credit: 7,508,002 RAC: 20 |
While that may clarify (somewhat) that Atlas could be considered a cluster in the most basic sense, bringing the focus back to the original statement: BOINC does not work with clustering in that it uses each machine as a single node, which is more like a farm. Sorry... That's far too easy an answer. We should be able to thrash about over when a cluster is not a cluster, and when a farm is a cluster or a farm, for another 100 posts or so at least! I prefer the description of a cluster as interconnected nodes that are coordinated by supervisor nodes/software, whereas a farm is a collection of nodes independently working on a similar task. A cluster should also imply that you have a "good" interconnect between the nodes for interprocess and supervisor communication. Meanwhile, what has happened to Nez? Happy fast crunchin', Martin
tullio Send message Joined: 9 Apr 04 Posts: 8797 Credit: 2,930,782 RAC: 1 |
While that may clarify (somewhat) that Atlas could be considered a cluster in the most basic sense, bringing the focus back to the original statement: BOINC does not work with clustering in that it uses each machine as a single node, which is more like a farm. I admit that I wanted to end the discussion in order to watch the presentations at the Grenoble BOINC Workshop. Unfortunately, some of the slides, including those of Prof. Allen of Einstein@home, were practically invisible on my screen. Too bad. Practically all of the big clusters in the top500 list have high-speed interconnections between the nodes (InfiniBand or Quadrics), Xeon processors and Linux as the OS, plus hypervisor software. I have no experience of clusters, but years ago, while at the Trieste Science Park as manager of the Unix Bull Laboratory (four Bull employees plus four graduate students), I downloaded a software package called PVM (Parallel Virtual Machine) from the University of Tennessee and compiled it on my Bull/MIPS R6000 minicomputer (not to be confused with the IBM RS/6000) running UNIX System V. I compiled both server and client, then put a client on a Sun SPARC workstation running SunOS (practically Berkeley Unix). I was thus able to parallelize a job across those two machines. When I suggested to the people of the Istituto Nazionale di Fisica Nucleare that they put a client also on their DEC systems running both VMS and the DEC version of UNIX (Ultrix), they looked at me with horror. They believed in DEC more than in God. Poor guys. Tullio
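PVM itself is long obsolete, but the master/worker split described in the two-machine experiment above can be sketched with Python's standard library. This is a rough analogue only: a thread pool stands in for the PVM tasks on the two machines, and the split into two chunks mirrors the Bull and Sun hosts.

```python
# Rough analogue of the PVM master/worker experiment: the master splits
# a job, farms the parts out to workers, and combines the partial results.
from concurrent.futures import ThreadPoolExecutor

def worker(chunk):
    """Each 'client' machine computes a partial result on its chunk."""
    return sum(chunk)

def parallel_sum(data, nworkers=2):
    """Master: split the job nworkers ways, dispatch, combine."""
    chunks = [data[i::nworkers] for i in range(nworkers)]
    with ThreadPoolExecutor(max_workers=nworkers) as pool:
        partials = list(pool.map(worker, chunks))
    return sum(partials)

# Two workers, mirroring the Bull minicomputer plus the Sun workstation.
print(parallel_sum(list(range(100))))  # 4950
```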
Misfit Send message Joined: 21 Jun 01 Posts: 21804 Credit: 2,815,091 RAC: 0 |
Meanwhile, what has happened to Nez? The thread turned into a cluster .... me@rescam.org |
tullio Send message Joined: 9 Apr 04 Posts: 8797 Credit: 2,930,782 RAC: 1 |
Meanwhile, what has happened to Nez? bomb |
©2024 University of California
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.