Key Characteristics of Distributed Systems

scalability

the capability of a system, process, or a network to grow and manage increased demand.

a scalable system has to be able to <span class="text-highlight">continuously evolve</span> in order to support the growing amount of work

reasons to scale:

generally the performance of a system declines with system size, due to:

some tasks cannot be distributed <span class="text-highlight">exercise: identify which parts of a system cannot be distributed (at work or in an example large system)</span>
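One way to put numbers on the cost of non-distributable work is Amdahl's law. These notes do not name it; it is brought in here only as an illustration. The sketch below assumes a fixed fraction of the work is strictly serial and the rest splits perfectly across workers. The function name and the 5% figure are hypothetical.

```python
def amdahl_speedup(serial_fraction: float, workers: int) -> float:
    """Upper bound on speedup when `serial_fraction` of the work cannot be distributed."""
    if not 0.0 <= serial_fraction <= 1.0:
        raise ValueError("serial_fraction must be between 0 and 1")
    if workers < 1:
        raise ValueError("workers must be at least 1")
    parallel_fraction = 1.0 - serial_fraction
    # The serial part takes the same time no matter how many workers exist;
    # only the parallel part shrinks as workers are added.
    return 1.0 / (serial_fraction + parallel_fraction / workers)


if __name__ == "__main__":
    # Assumed example: 5% of the work is serial.
    for n in (1, 10, 100, 1000):
        print(f"{n:>5} workers -> speedup {amdahl_speedup(0.05, n):.2f}x")
```

With a 5% serial portion the speedup can never reach 20x (1 / 0.05), however many servers are added, which is one reason performance per server tends to drop as the system grows.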

horizontal scaling - add servers

vertical scaling - add resources (RAM, CPU, storage, etc.) to a single existing server

reliability

reliability = the probability that a system will not fail in a given period

availability

availability = the percentage of time a system remains operational to perform its required function in a specific period.

if an aircraft/app is down for maintenance, it is considered not available during that time.
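A minimal sketch of how availability is usually computed, assuming it is measured as uptime divided by total time in the period. The function names and sample figures are illustrative, not from these notes.

```python
def availability(uptime_hours: float, downtime_hours: float) -> float:
    """Fraction of the period during which the system was operational."""
    total = uptime_hours + downtime_hours
    if total <= 0:
        raise ValueError("the period must have a positive length")
    return uptime_hours / total


def allowed_downtime_minutes(availability_pct: float, period_days: float = 365.0) -> float:
    """Downtime budget (in minutes) implied by an availability target over a period."""
    if not 0.0 <= availability_pct <= 100.0:
        raise ValueError("availability_pct must be between 0 and 100")
    minutes_in_period = period_days * 24 * 60
    return (1.0 - availability_pct / 100.0) * minutes_in_period


if __name__ == "__main__":
    # Assumed example: a 30-day month (720 h) with 2 h of planned maintenance.
    print(f"availability: {availability(718, 2):.4%}")
    # A 99.99% target over a 365-day year leaves roughly 52.56 minutes of downtime.
    print(f"budget: {allowed_downtime_minutes(99.99):.2f} min/year")
```

Planned maintenance counts as downtime in this calculation, matching the aircraft/app example above.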

reliability vs availability

if a system is reliable, it is available. However, if it is available, it is not necessarily reliable.

Let’s take the example of an online retail store that has 99.99% availability for the first two years after its launch. However, the system was launched without any information security testing. The customers are happy with the system, but they don’t realize that it isn’t very reliable as it is vulnerable to likely risks. In the third year, the system experiences a series of information security incidents that suddenly result in extremely low availability for extended periods of time. This results in reputational and financial damage to the customers.

efficiency

two standard measures of efficiency are:

The two measures correspond to the following unit costs:

the complexity of operations supported by distributed data structures (e.g. searching for a specific key in a distributed index) can be characterized as a function of one of these cost units.

serviceability or manageability

serviceability or manageability = the speed with which a system can be repaired or maintained

if time to fix a failed system increases, availability will decrease.
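The link between repair time and availability can be sketched with the common steady-state approximation availability = MTBF / (MTBF + MTTR), where MTBF is mean time between failures and MTTR is mean time to repair. These notes do not give that formula; it is an assumption used for illustration, and the numbers below are hypothetical.

```python
def steady_state_availability(mtbf_hours: float, mttr_hours: float) -> float:
    """Approximate availability from mean time between failures and mean time to repair."""
    if mtbf_hours <= 0 or mttr_hours < 0:
        raise ValueError("mtbf_hours must be positive and mttr_hours non-negative")
    return mtbf_hours / (mtbf_hours + mttr_hours)


if __name__ == "__main__":
    mtbf = 1000.0  # assumed: one failure per 1000 hours on average
    for mttr in (1.0, 4.0, 24.0):
        a = steady_state_availability(mtbf, mttr)
        print(f"MTTR {mttr:>4.0f} h -> availability {a:.4%}")
```

With the failure rate held constant, each increase in repair time lowers availability, so faster diagnosis and repair raise availability directly.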

things to consider:

early detection of faults can decrease or avoid system downtime.