Created by Lenio Vogt
about 5 years ago
Question | Answer |
What is a Distributed System? | A collection of autonomous computing elements that appears to its users as a single coherent system |
Autonomous computing elements | - consists of all kinds of nodes - nodes can act independently |
Single coherent system | - where data is stored should be of no concern to the user - the infrastructure in the background is not visible to the user |
Centralized/Decentralized/Distributed | |
Middleware | - assists the development of distributed applications - separate layer of software that is logically placed on top of the respective operating system - enables data exchange between different operating systems |
RPC | Remote Procedure Call - Communication service - Allows an application to invoke a function on a remote computer |
Transactions | - applications make use of multiple services that are distributed among several computers - makes sure that every service is invoked, or none at all |
Goals for Distributed Systems | - make resources easily accessible - hide the fact that resources are distributed across a network - open (components can be easily integrated) - be scalable |
Types of Distributed Systems | - computing systems - information systems - pervasive systems |
Computing Systems | - Cluster computing - Grid computing - Cloud computing |
Cluster Computing | - Group of connected computers - Act like a single entity --> High redundancy and distributed workload |
Grid Computing | - Loosely coupled (decentralized) - Sharing tasks over multiple computers |
Cloud computing | - Storing and accessing applications and data over the internet - Coupled (distributed) - Single system image |
Information Systems | - Server runs an application and makes it available to remote programmes (clients) - Clients send a request --> Server sends a response - A request that spans different servers is called a distributed transaction |
Pervasive Systems | - System is often equipped with many sensors that pick up various aspects - Small, battery-powered, mobile and having a wireless connection --> IoT |
Internet of Things (IoT) | - Connecting all kinds of electronic devices to the internet - Benefits: devices pick up data - Downsides: data safety, over-reliance on technology |
Reasons for distributed data | - Scalability - Fault tolerance - High availability - Latency |
Replication | - keeping a copy of the same data on several different nodes in different locations |
Leader-based replication | |
Leader-based replication | - one leader and several followers - every write request goes to the leader - read requests can be served by the leader or by any follower |
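The read and write paths of leader-based replication can be sketched in a few lines. This is a minimal illustration, not a real database: the `Node` and `Cluster` classes are hypothetical, and the leader is assumed to copy each write to its followers synchronously.

```python
class Node:
    """A replica holding a full copy of the data (hypothetical class)."""

    def __init__(self, name):
        self.name = name
        self.data = {}


class Cluster:
    """One leader plus followers (hypothetical class)."""

    def __init__(self, leader, followers):
        self.leader = leader
        self.followers = followers

    def write(self, key, value):
        # Every write goes to the leader first ...
        self.leader.data[key] = value
        # ... and is then forwarded to each follower.
        # Assumption: forwarding is synchronous in this sketch.
        for follower in self.followers:
            follower.data[key] = value

    def read(self, key, node=None):
        # A read may be served by the leader or by any follower.
        node = node or self.leader
        return node.data.get(key)


leader = Node("leader")
followers = [Node("f1"), Node("f2")]
cluster = Cluster(leader, followers)
cluster.write("x", 1)
```

Reading `"x"` through `cluster.read("x")` or through `cluster.read("x", followers[1])` returns the same value, because the write was copied to every replica before `write` returned.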
Synchronous vs. Asynchronous Replication | |
Handling node outages | - Follower: each follower keeps a log of the data changes it has received; after recovery it resynchronises with the leader - Leader: a timeout indicates that the leader has failed; the follower with the most up-to-date data becomes the new leader |
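Both recovery paths can be sketched with plain lists standing in for replication logs. The function names `catch_up` and `elect_new_leader` are hypothetical, and "most up-to-date" is assumed to mean "longest log".

```python
def catch_up(leader_log, follower_log):
    """Follower recovery: append every leader entry the follower missed."""
    follower_log.extend(leader_log[len(follower_log):])
    return follower_log


def elect_new_leader(follower_logs):
    """Leader failure: pick the follower with the longest log.

    Assumption: a longer log means more up-to-date data.
    """
    return max(follower_logs, key=lambda name: len(follower_logs[name]))


# A follower that was down while two writes happened.
leader_log = ["set a=1", "set b=2", "set c=3"]
stale_log = ["set a=1"]
catch_up(leader_log, stale_log)

# The leader has timed out; choose its successor.
logs = {
    "f1": ["set a=1", "set b=2"],
    "f2": ["set a=1", "set b=2", "set c=3"],
}
new_leader = elect_new_leader(logs)
```

Here `f2` is chosen because it has seen all three writes, while `f1` is missing the last one.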
Data loss | |
Split Brain | |
Timeout | |
Read your own writes | |
Monotonic reads | |
Multi-Leader Replication | |
Single vs Multileader | - Performance - Tolerance of outages - Tolerance of network problems |
Performance | - Single: every write must go over the internet to the leader - Multi: every write can be processed by the local datacenter |
Tolerance of outages | - Single: if the leader fails, failover can promote a follower to be the new leader - Multi: if a leader fails, the other leaders continue operating independently |
Tolerance of network problems | - Single: very sensitive, because writes are made synchronously - Multi: can tolerate temporary network problems |
Leaderless Replication | |
Partitioning | Splitting data into smaller subsets called partitions so that different partitions can be assigned to different nodes |
Hotspot | Partition with disproportionately high load |
Partitioning by Hash Key | - Takes skewed data (e.g. timestamps) and distributes it uniformly across partitions Disadvantages: - loses the main property of key-range partitioning -> the ability to do efficient range queries - keys that were once adjacent are scattered across partitions --> the sort order is lost |
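The difference between key-range and hash partitioning can be shown with timestamp keys. This is a sketch under assumptions: four partitions, a range scheme that partitions by day of the month, and MD5 as the hash function. None of these choices come from the cards.

```python
import hashlib

NUM_PARTITIONS = 4  # assumed partition count


def range_partition(key: str) -> int:
    """Key-range style: all timestamps of one day share a partition."""
    day = int(key[8:10])  # "2024-01-15T10:30" -> 15
    return day % NUM_PARTITIONS


def hash_partition(key: str) -> int:
    """Hash style: the hash of the key decides the partition."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % NUM_PARTITIONS


# One day of per-minute timestamps: heavily skewed for a range scheme.
keys = [f"2024-01-15T{h:02d}:{m:02d}" for h in range(24) for m in range(60)]

range_targets = {range_partition(k) for k in keys}
hash_targets = {hash_partition(k) for k in keys}
```

With the range scheme every key of the day lands on one partition, which becomes a hotspot. With the hash scheme the same keys are spread over several partitions, but neighbouring timestamps no longer sit together, so a range query for one hour has to ask every partition.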
Rebalancing | |
Request routing | |
Zoo Keeper | |
Models of Data Flow | - via Databases - via service calls - asynchronous message passing |
via Database | - data outlives the code - one process writes encoded data, another process reads it again sometime in the future - Migrating data is possible, but expensive on a large dataset |
via Service Calls | |
Service Calls - Web services | - when HTTP is used as the underlying protocol for talking to the service, it is called a web service |
Service Calls - REST REpresentational State Transfer | - not a protocol - design philosophy - builds upon the principles of HTTP - uses URLs for identifying resources |
Service Calls - SOAP Simple Object Access Protocol | - XML-based protocol for making network API requests |
via Asynchronous Message Passing | - sender doesn't wait for the message to be delivered - simply sends it and then forgets about it |
SOA | Service-oriented architecture: decomposing large applications into smaller services by functionality |
Message broker | - stores the messages temporarily - acts as a buffer if the recipient is unavailable - redelivers messages to a recipient that has crashed - allows a message to be sent to several recipients |
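The buffering and redelivery behaviour of a message broker can be sketched with an in-memory queue. The `Broker` class and the `flaky_handler` recipient are hypothetical, and a recipient crash is modelled simply as an exception raised by the handler.

```python
from collections import deque


class Broker:
    """In-memory stand-in for a message broker (hypothetical class)."""

    def __init__(self):
        self.queue = deque()

    def send(self, message):
        # Fire and forget: the sender returns as soon as the message is stored.
        self.queue.append(message)

    def deliver(self, handler):
        """Pass buffered messages to handler; keep any that fail."""
        delivered = 0
        while self.queue:
            message = self.queue[0]
            try:
                handler(message)
            except Exception:
                # Recipient crashed: the message stays queued for redelivery.
                break
            self.queue.popleft()
            delivered += 1
        return delivered


received = []
crashed_once = []


def flaky_handler(message):
    # Simulated recipient that crashes once on "order-2".
    if message == "order-2" and not crashed_once:
        crashed_once.append(True)
        raise RuntimeError("recipient crashed")
    received.append(message)


broker = Broker()
broker.send("order-1")
broker.send("order-2")
first_round = broker.deliver(flaky_handler)   # stops at the crash
second_round = broker.deliver(flaky_handler)  # retries the kept message
```

The sender never waits for the recipient. The message that hit the crash stays in the queue and is handed over again on the next delivery attempt.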