As briefly explained on the overview page, distributed computing is a method for utilizing extra CPU cycles on computers linked together over a network. According to Claudia Leopold, a distributed system can be defined as follows:
“A distributed system is a collection of autonomous computers that are interconnected with each other and cooperate, thereby sharing resources such as printers and databases” (Leopold 2).
Distributed computing systems group individual computers together and pool their computing resources in order to accomplish higher-level computation. The practice of distributed computing requires that unique - and, as Leopold notes, autonomous - computers be networked over either a Local Area Network (LAN) or a Wide Area Network (WAN). The network provides a means by which client and host machines communicate, sharing computed information or passing along information that requires analysis or computation.
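The client-host exchange described above can be sketched with a minimal example using TCP sockets. Everything here is a deliberately simplified stand-in: the address and port, the two-number "work unit," and the one-shot protocol are invented for illustration, not the wire format of any real project:

```python
import socket
import threading

HOST, PORT = "127.0.0.1", 50007   # hypothetical address and port
results = []                       # results collected by the host
ready = threading.Event()          # signals that the host is listening

def host():
    """Host: sends one work unit to a client and collects the result."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as srv:
        srv.bind((HOST, PORT))
        srv.listen(1)
        ready.set()
        conn, _ = srv.accept()
        with conn:
            conn.sendall(b"3 4")   # work unit: two numbers to add
            results.append(conn.recv(1024).decode())

def client():
    """Client: receives a work unit, computes it, sends the answer back."""
    ready.wait()
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as cli:
        cli.connect((HOST, PORT))
        a, b = map(int, cli.recv(1024).split())
        cli.sendall(str(a + b).encode())

t = threading.Thread(target=host)
t.start()
client()
t.join()
print(results)  # ['7']
```

A real system would, of course, send many work units to many clients and handle failures; the point of the sketch is only the back-and-forth between host and client over the network.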
Current distributed computing projects such as the well-known SETI@Home project of the University of California, Berkeley, or smaller projects such as Folding@Home and Genome@Home from Stanford University all utilize the client-server interaction briefly described in the overview section. The server, or host computer, distributes data that needs to be processed to individual computers on the network: the client machines. These client machines process the data in CPU cycles when the machine is otherwise idle.
For example, after the client machine's owner finishes word processing and walks away, the computer becomes idle, switches on a screen-saver, and wastes all of its CPU cycles while in this idle state. However, if distributed computing software is installed on this networked client machine, the client software begins completing computations on the data received from the server. If the client connects to the host via an always-on LAN connection, the results are immediately transmitted to the server and then processed. If the client connects with a dial-up connection, however, the client software waits for a connection to be established, at which point the information is transmitted back to the server.
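The idle-cycle behavior just described, including the dial-up case where results must wait for a connection, can be sketched roughly as follows. The idleness and connectivity checks are hypothetical stubs standing in for whatever a real client would query, and squaring a number stands in for the actual computation:

```python
import queue
import time

work_queue = queue.Queue()   # work units received from the server
for n in range(5):
    work_queue.put(n)

pending_results = []         # results buffered until they can be uploaded

def machine_is_idle():
    return True              # assumption: the owner has walked away

def connection_available():
    return False             # assumption: a dial-up client, currently offline

def upload(results):
    print("uploading", len(results), "results to the server")
    results.clear()

while not work_queue.empty():
    if not machine_is_idle():
        time.sleep(1)                        # yield the CPU to the owner
        continue
    unit = work_queue.get()
    pending_results.append(unit * unit)      # stand-in computation
    if connection_available():
        upload(pending_results)              # LAN case: send immediately

# dial-up case: results sit in pending_results until a connection appears
print(pending_results)  # [0, 1, 4, 9, 16]
```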
By utilizing the idle CPU time of autonomous machines that are all connected via a network, the server can complete computations that would otherwise be impossible on a single machine. Folding@Home, for example, uses distributed computing methods to complete computations that would otherwise require centuries on a single supercomputer. Because the task is spread among numerous computers, no single machine must spend an extraordinarily large amount of time or CPU cycles on the computation, and information is processed by different machines at the same time, greatly diminishing the number of hours researchers must spend on computation.
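This division of labor can be illustrated with a toy calculation: one large job is cut into independent work units, each of which could be handed to a different client machine, and the server only has to combine the partial results. The number of clients and the summing task are invented for illustration:

```python
data = list(range(1_000))     # the full job: sum a large list of numbers
num_clients = 4               # hypothetical number of client machines

# Split the job into one independent work unit per client.
chunk = len(data) // num_clients
work_units = [data[i * chunk:(i + 1) * chunk] for i in range(num_clients)]

# Each "client" computes its unit; here we simply do them in turn.
partial_sums = [sum(unit) for unit in work_units]

# The server combines the partial results into the final answer.
total = sum(partial_sums)
print(total)  # 499500, the same answer a single machine would compute
```

Each client touches only a quarter of the data, which is exactly why no single machine needs extraordinary time or CPU cycles.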
According to Claudia Leopold, there are eight main motivations for implementing a distributed system instead of simply utilizing the computing resources of a standard computer. These advantages are described briefly below:
(1) Distributed systems improve the “absolute performance” of the computing system
(2) The Price to Performance ratio for the system is more favorable for a distributed system
(3) Technological advantages
(4) Some applications are inherently distributed problems (they are solved most easily using the means of distributed computing)
(5) Distributed computing allows the sharing of resources - both hardware and software
(6) Each piece of hardware is replaceable should it fail.
(7) Distributed Computing allows the system to grow incrementally as computers are added one by one.
(8) Distributed computing allows for “scavenging.” According to Leopold, “a lot of power is wasted, particularly during business hours. By integrating the computers into a distributed system, the excess computing power can be made available to other users or applications.” (8)
Although distributed computing is a distinct method for harnessing the unused power of networked computers, it bears a close resemblance to another multiple-processor computing architecture: parallel computing, the practice of employing multiple processors at the same location to break down the computing task. In fact, because of the close similarity between the two, many authors fail to distinguish between the two computing strategies. For a clear distinction between the two tactics, we once again look to Leopold's book:
“Parallel computing splits an application up into tasks that are executed at the same time whereas distributed computing splits an application up into tasks that are executed at different locations using different resources” (Leopold 3).
Parallel computing is a computational method that is extremely similar to distributed computing, but it is, for the most part, outside the scope of this website. The basics behind parallel computing are explained well in Claudia Leopold's text, Parallel and Distributed Computing: A Survey of Models, Paradigms, and Approaches. The basic practice of parallel computing splits an application or process into subtasks that are solved at the same time (sometimes in a "tightly coupled manner"). Each task must be able to be considered individually by any given machine running "homogeneous architectures," which may or may not share memory.
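As a toy illustration of the parallel side of this distinction, the sketch below splits a job into subtasks run concurrently by a pool of worker threads that all share one address space. This is only a conceptual sketch (CPython's interpreter lock limits true simultaneity for pure-Python work), not how a real parallel machine is programmed:

```python
from concurrent.futures import ThreadPoolExecutor

def subtask(n):
    """One subtask of the split application; squaring stands in for real work."""
    return n * n

# Four workers execute the subtasks concurrently, sharing the same memory,
# in contrast to distributed clients that each hold their own copy of the data.
with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(subtask, range(8)))

print(results)  # [0, 1, 4, 9, 16, 25, 36, 49]
```

The key contrast with the distributed sketches above is that nothing here crosses a network: all the processors (workers) live in one place and see the same data directly.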