Distributed Database Systems
A Distributed Database System (DDBS) is a single logical database that is spread physically across multiple computers, nodes, or even distinct geographical data centers. From the user's or application's perspective, it appears and behaves exactly like a single, centralized database.
Unlike a centralized database running on a massive single machine, a distributed database connects multiple, smaller machines together to form a highly resilient, scalable cluster.
Why Distribute a Database?
- Horizontal Scalability: Add more commodity machines to the cluster instead of buying one impossibly expensive supercomputer.
- High Availability & Fault Tolerance: If one node crashes or a whole data center goes offline, the system continues operating from other nodes.
- Low Latency: Data can be placed geographically closer to the users accessing it (e.g., storing European user data in an EU data center).
Core Concepts
1. The CAP Theorem
When designing or choosing a distributed database, you are bound by the CAP Theorem, which states that a distributed data store can only simultaneously guarantee two out of the following three:
- Consistency (C): Every read receives the most recent write or an error. All nodes see the exact same data at the same time.
- Availability (A): Every request receives a (non-error) response, without the guarantee that it contains the most recent write. The system stays up even if nodes fail.
- Partition Tolerance (P): The system continues to operate despite an arbitrary number of messages being dropped or delayed by the network between nodes.
Note: Because network partitions (P) are a reality of distributed systems (networks fail, cables get cut), distributed databases must choose between CP (Consistency and Partition Tolerance) or AP (Availability and Partition Tolerance).
2. Data Distribution Strategies
How do we actually spread the data? Usually through a combination of two techniques:
A. Replication
Copying the exact same data across multiple nodes. This ensures Availability and Fault Tolerance.
B. Partitioning (Sharding)
Splitting a large dataset into smaller chunks (shards/partitions) and storing them on different nodes. This ensures Scalability.
Architectural Diagram
Here is a conceptual look at a typical distributed database using both partitioning and replication (often seen in systems like Cassandra or DynamoDB).
Flow Explanation
- The Client sends a request to the database cluster.
- A Coordinator Node receives it. It uses a hashing algorithm on the partition key (e.g.,
User ID) to determine which partition owns this data. - The request is routed to the Primary Node of that partition.
- The Primary Node handles the write/read and replicates the data to its Replica nodes to ensure high availability.
Examples of Distributed Databases
1. Amazon DynamoDB (AP System)
A fully managed, serverless, key-value NoSQL database designed for high availability and single-digit millisecond performance. It replicates data synchronously across multiple Availability Zones. It prioritizes availability and partition tolerance, offering "eventual consistency" by default (though strong consistency can be requested).
2. Apache Cassandra (AP System)
A masterless distributed database designed to handle massive amounts of data across commodity hardware with no single point of failure. It uses a ring architecture where every node is equal. Excellent for heavy write workloads.
3. Google Cloud Spanner (CP System)
A globally distributed relational database that uniquely provides ACID transactions and SQL semantics at a massive global scale. It relies on Google's TrueTime (atomic clocks and GPS receivers) to maintain strict consistency globally.
Distributed Transactions: The Two-Phase Commit (2PC)
When a transaction needs to modify data that lives on multiple different nodes, ensuring ACID properties (specifically Atomicity) becomes complex. The most common protocol to handle this is the Two-Phase Commit (2PC).
- Prepare Phase: The coordinator asks all participating nodes if they are ready to commit the transaction. The nodes lock the necessary resources and reply "Yes" or "No".
- Commit Phase: If all nodes said "Yes", the coordinator tells them to physically commit the changes. If any node said "No" (or timed out), the coordinator tells everyone to rollback.
The Problem with 2PC
It is a blocking protocol. If the coordinator goes down after the prepare phase but before the commit phase, the participant nodes are stuck holding locks indefinitely until the coordinator recovers. This severely impacts availability, which is why many modern NoSQL systems avoid multi-record distributed transactions entirely.
