 Distributed processing - Definition 

Distributed computing is the process of aggregating the power of several computing entities to collaboratively run a single computational task in a transparent and coherent way, so that they appear as a single, centralized system.

Introduction

Distributed computing differs from cluster computing in that computers in a distributed computing environment are typically not exclusively running 'group' tasks, whereas clustered computers are usually much more tightly coupled. This difference makes distributed computing attractive because, when properly configured, it can use computational resources that would otherwise go unused. It can also make feasible computations that would otherwise be out of reach. For example, the SETI@home project uses 'idle time' on many thousands of computers throughout the world to analyze received signals, a task that would otherwise require the power of expensive supercomputers.

Distributed computing is very attractive in part because interactive operation leaves most computers idle most of the time. The process that implements the distributed aspect (i.e., the one running on a machine normally devoted to other work) is usually specially designed to run at low priority, using only computing power that would be 'wasted' anyway.
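
As a minimal sketch of the low-priority idea, the Python fragment below raises the niceness of the current process before doing any work, so that the scheduler favors interactive programs. The function names and the sum-of-squares "work unit" are hypothetical stand-ins; os.nice is only available on Unix-like systems, so the sketch falls back to doing nothing elsewhere.

```python
import os


def lower_priority(increment=19):
    """Ask the scheduler to run this process only when others are idle.

    Returns the new niceness, or None where the priority could not be
    changed (for example on platforms without os.nice).
    """
    if not hasattr(os, "nice"):
        return None
    try:
        return os.nice(increment)
    except OSError:
        return None


def process_work_unit(samples):
    # Hypothetical stand-in for the real analysis: a sum of squares.
    return sum(x * x for x in samples)


niceness = lower_priority()
result = process_work_unit(range(1000))
```

A real client would fetch work units from a project server and report results back; only the priority handling is illustrated here.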

However, having a low-priority process constantly running prevents operating system power management routines from putting the processor into a low-power mode, resulting in increased electricity consumption. For some (typically recent, high-speed) CPUs, the difference can be on the order of tens of watts.

Distributed computing also often involves competition with other distributed systems. This competition may be for prestige, or it may be a means of enticing users to donate processing power to a specific project. For example, there is the so-called "stat race": a measure of which project has performed the most distributed work over the past day or week. This has proved so important in practice that virtually all distributed computing projects offer online statistical analyses of their performance, updated at least daily, if not in real time.

Distributed computing is also an active area of research with an abundant literature. The best known distributed computing conferences are the International Conference on Dependable Systems and Networks [1] (http://www.dsn.org/) and the ACM Symposium on Principles of Distributed Computing [2] (http://www.podc.org). Journals include the Journal of Parallel and Distributed Computing [3] (http://www.academicpress.com/jpdc), IEEE Transactions on Parallel and Distributed Systems [4] (http://www.computer.org/tpds/about.htm), and Distributed Computing [5] (http://www.springeroline.com).

The rendering of 3D computer images is often spread between several computers to speed up the process. These computers are often referred to as render farms.

Goals

There are many different types of distributed computing systems, and many challenges to overcome in successfully architecting one. The main goal of a distributed operating system is to connect users and resources in a transparent, open, and scalable way.

Transparency

Transparency means that a distributed system should hide its distributed nature from its users, appearing and functioning as a normal centralized system. There are many types of transparency:

  • Access transparency - Regardless of how resource access and representation has to be performed on each individual computing entity, the users of a distributed system should always access resources in a single, uniform way.
  • Location transparency - Users of a distributed system should not have to be aware of where a resource is physically located.
  • Migration transparency - Users should not be aware of whether a resource or computing entity possesses the ability to move to a different physical or logical location.
  • Relocation transparency - Should a resource move while in use, this should not be noticeable to the end user.
  • Replication transparency - If a resource is replicated among several locations, it should appear to the user as a single resource.
  • Concurrency transparency - While multiple users may compete for and share a single resource, this should not be apparent to any of them.
  • Failure transparency - The failure and recovery of computing entities and resources should be hidden from users wherever possible.
  • Persistence transparency - Whether a resource lies in volatile or permanent memory should make no difference to the user.
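
Replication and failure transparency can be illustrated with a small sketch. The classes below are hypothetical and keep everything in one process; in a real system each replica would live on a different machine. The caller sees a single resource and never learns which copy answered or that one of them was down.

```python
class ReplicaUnavailable(Exception):
    """Raised when a replica cannot serve a request."""


class Replica:
    def __init__(self, name, data, alive=True):
        self.name = name
        self.data = data
        self.alive = alive

    def read(self, key):
        if not self.alive:
            raise ReplicaUnavailable(self.name)
        return self.data[key]


class TransparentResource:
    """Presents several replicas to the user as a single resource."""

    def __init__(self, replicas):
        self._replicas = list(replicas)

    def read(self, key):
        for replica in self._replicas:
            try:
                return replica.read(key)
            except ReplicaUnavailable:
                continue  # hide the failure and try the next copy
        raise ReplicaUnavailable("no replica could serve the request")


data = {"motd": "hello"}
resource = TransparentResource([
    Replica("node-a", data, alive=False),  # simulated failed node
    Replica("node-b", data),
])
value = resource.read("motd")  # served by node-b; the caller cannot tell
```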

The degree to which these properties can or should be achieved may vary widely. Not every system can or should hide everything from its users. For instance, because no signal can travel faster than the speed of light, accessing resources distant from the user will always incur more latency. If one expects real-time interaction with the distributed system, this may be very noticeable.
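
The latency floor imposed by the speed of light can be estimated with simple arithmetic. In the sketch below, the 10,000 km distance and the 0.67 velocity factor for optical fibre are illustrative assumptions, and the function name is hypothetical; real round-trip times are higher still because of routing and processing delays.

```python
SPEED_OF_LIGHT_M_PER_S = 299_792_458  # exact, by definition of the metre


def min_round_trip_ms(distance_km, velocity_factor=1.0):
    """Lower bound on round-trip time for a signal over distance_km.

    velocity_factor is the signal speed as a fraction of the speed of
    light in vacuum (1.0 for vacuum; an assumed 0.67 for fibre below).
    """
    one_way_s = (distance_km * 1000) / (SPEED_OF_LIGHT_M_PER_S * velocity_factor)
    return 2 * one_way_s * 1000


rtt_vacuum = min_round_trip_ms(10_000)                        # about 66.7 ms
rtt_fibre = min_round_trip_ms(10_000, velocity_factor=0.67)   # about 99.6 ms
```

No amount of engineering brings the round trip below the first figure, which is why perfect location transparency is unattainable for latency-sensitive uses.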

Openness

Openness is the property of a distributed system that measures the extent to which it offers a standardized interface that allows it to be extended and scaled. A system that easily allows more computing entities to be plugged into it and more features to be added to it has a clear advantage over a perfectly closed and self-contained system. Openness is usually achieved by using an Interface Definition Language (IDL) that captures the syntax of all services offered by the system.
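
Real IDLs are language-neutral, but the idea of publishing only the syntax of a service can be approximated in a single language. The sketch below uses Python's typing.Protocol as a rough analogue; the KeyValueStore interface and the InMemoryStore implementation are hypothetical examples, not part of any particular system.

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class KeyValueStore(Protocol):
    """Interface only: names and signatures, no implementation."""

    def get(self, key: str) -> bytes: ...

    def put(self, key: str, value: bytes) -> None: ...


class InMemoryStore:
    """One possible implementation; it never mentions the interface."""

    def __init__(self):
        self._items = {}

    def get(self, key):
        return self._items[key]

    def put(self, key, value):
        self._items[key] = value


store = InMemoryStore()
store.put("greeting", b"hi")
conforms = isinstance(store, KeyValueStore)  # structural check on method names
```

Any component offering the same operations can be plugged in without changing its clients, which is the extensibility that openness aims for. Note that the runtime check only verifies that the methods exist, not their signatures or behavior.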

Scalability

A scalable system is one that can easily be altered to accommodate changes in the number of users, resources and computing entities assigned to it. Scalability can be measured in three different dimensions:

  • Load scalability - A distributed system should make it easy for us to expand and contract its resource pool to accommodate heavier or lighter loads.
  • Geographic scalability - A geographically scalable system is one that maintains its usefulness and usability, regardless of how far apart its users or resources are.
  • Administrative scalability - No matter how many different organizations need to share a single distributed system, it should still be easy to use and manage.

Some loss of performance may occur in a system that allows itself to scale in one or more of these dimensions.
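
Load scalability, in its simplest form, amounts to resizing the resource pool as demand changes. The following sketch assumes a hypothetical fixed capacity per worker and ignores the coordination overhead that causes the performance loss mentioned above.

```python
import math


def workers_needed(requests_per_s, capacity_per_worker, min_workers=1):
    """Smallest pool size that can absorb the given load."""
    return max(min_workers, math.ceil(requests_per_s / capacity_per_worker))


light = workers_needed(120, 50)     # pool contracts under light load
heavy = workers_needed(1_000, 50)   # pool expands under heavy load
```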

Architecture

Various hardware and software architectures are used for distributed computing. At a lower level, it is necessary to interconnect multiple CPUs with some sort of network, whether that network is printed onto a circuit board or made up of several loosely-coupled devices and cables. At a higher level, it is necessary to interconnect processes running on those CPUs with some sort of communication system.
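
The higher-level half of this picture, processes exchanging messages over a communication channel, can be sketched with a pair of connected sockets. For brevity a thread stands in for the remote process here, and the upper-casing "service" is a hypothetical placeholder; between real machines the same pattern would use network sockets.

```python
import socket
import threading


def worker(conn):
    """Receive one request, send back a reply, then close the channel."""
    with conn:
        request = conn.recv(1024)
        conn.sendall(request.upper())


# socketpair() returns two already-connected endpoints.
left, right = socket.socketpair()
thread = threading.Thread(target=worker, args=(right,))
thread.start()

with left:
    left.sendall(b"ping")
    reply = left.recv(1024)

thread.join()
```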

Hardware

Multiprocessor systems

A multiprocessor system is simply a computer that has more than one CPU on its motherboard. If the operating system is built to take advantage of this, it can run different processes on different CPUs, or different threads belonging to the same process.

Over the years, many different multiprocessing options have been explored for use in distributed computing. CPUs can be connected by bus or switch networks, use shared memory or their own private RAM, or even a hybrid approach.

These days, multiprocessor systems are available commercially for end-users, and mainstream operating systems like Windows and Linux already have built-in support for them. Additionally, recent Intel CPUs have begun to employ a technology called Hyper-Threading that allows more than one thread to run on the same CPU at the same time.

Multicomputer systems

A multicomputer system is a system made up of several independent computers interconnected by a telecommunications network.

Multicomputer systems can be homogeneous or heterogeneous:

A homogeneous multicomputer is one where all CPUs are similar and are connected by a single type of network. Such systems are often used for parallel computing, a kind of distributed computing in which every computer works on a different part of a single problem.
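
The pattern of giving each computer a different part of one problem can be sketched as split, compute, combine. In the example below a thread pool stands in for the separate machines, and the sum-of-squares task and helper names are hypothetical; on a real multicomputer each chunk would be shipped to a different node.

```python
from concurrent.futures import ThreadPoolExecutor


def split(items, parts):
    """Divide items into `parts` contiguous chunks of near-equal size."""
    size, extra = divmod(len(items), parts)
    chunks, start = [], 0
    for i in range(parts):
        end = start + size + (1 if i < extra else 0)
        chunks.append(items[start:end])
        start = end
    return chunks


def partial_sum(chunk):
    # The work each "computer" performs on its own part of the problem.
    return sum(x * x for x in chunk)


numbers = list(range(1, 101))
chunks = split(numbers, 4)

with ThreadPoolExecutor(max_workers=4) as pool:
    partials = list(pool.map(partial_sum, chunks))

total = sum(partials)  # combine the partial results
```

Equal-sized chunks suit a homogeneous system, where every node is expected to finish in about the same time.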


A heterogeneous multicomputer is one that can be made up of all sorts of different computers, possibly with vastly differing memory sizes, processing power and even basic underlying architecture. They are in widespread use today, with many companies adopting this architecture due to the speed with which hardware becomes obsolete and the cost of upgrading a whole system simultaneously. The Second Life grid is a heterogeneous multicomputer and so are most Beowulf (http://www.beowulf.org) clusters.

Software

Distributed Operating Systems

Network Operating Systems

Middleware

Distributed computing infrastructure

Proprietary

  • OfficeGRID [6] (http://www.officegrid.net) is a grid solution from MESH-Technologies A/S [7] (http://www.meshtechnologies.com).
  • Xgrid
  • ICE
  • United Devices [8] (http://www.ud.com) is the largest commercial distributed computing network.
  • DataSynapse [9] (http://www.datasynapse.com) is another commercial provider of distributed computing software.
  • Entropia [10] (http://www.entropia.com) is a vendor of distributed computing software technologies.




Copyright 2008 WordIQ.com
This article is licensed under the GNU Free Documentation License. It uses material from the Wikipedia article "Distributed processing".