1 parent 93828f3, commit 20859eb
Showing 1 changed file with 32 additions and 0 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,32 @@ | ||
# Partitioning Vs Sharding | ||
|
||
Both sharding and partitioning are techniques used to manage large databases, but they differ in how they distribute the data: | ||
|
||
**Sharding** | ||
|
||
- **Distribution:** Sharding splits the data horizontally across **multiple servers or nodes**. Each shard holds an independent subset of the rows and uses the same table schema as every other shard. | ||
- **Scalability:** Sharding excels at horizontal scaling. As your data grows, you can add more servers to spread the load, although existing data usually has to be rebalanced across the new set of shards. | ||
- **Complexity:** Sharding introduces the complexity of managing a distributed system. You need to route each query to the appropriate shard, combine results when a query spans several shards, and keep data consistent across all of them. | ||
- **Example:** Imagine a social media platform with sharded user data. Users from North America might be stored on one shard, while users from Europe reside on another. | ||
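
The routing step described above can be sketched in a few lines. This is a minimal in-memory illustration, not a production design: the shard names (`shard-na`, `shard-eu`), the `ShardedUserStore` class, and the hash-based helper `shard_for_user_id` are all hypothetical and exist only for this example. Real shards would be separate database servers rather than Python dictionaries.

```python
import hashlib
from collections import defaultdict

# Hypothetical region-to-shard map, mirroring the social media example.
REGION_TO_SHARD = {
    "north_america": "shard-na",
    "europe": "shard-eu",
}


def shard_for_region(region: str) -> str:
    """Pick the shard for a region (directory-based routing)."""
    try:
        return REGION_TO_SHARD[region]
    except KeyError:
        raise ValueError(f"no shard configured for region {region!r}") from None


def shard_for_user_id(user_id: int, shard_count: int) -> int:
    """Alternative: hash-based routing that spreads users evenly.

    A stable hash is used because Python's built-in hash() for strings
    changes between interpreter runs.
    """
    digest = hashlib.sha256(str(user_id).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count


class ShardedUserStore:
    """Each dict stands in for one independent database server."""

    def __init__(self) -> None:
        self._shards = defaultdict(dict)  # shard name -> {user_id: profile}

    def put(self, user_id: int, region: str, profile: dict) -> None:
        # A write touches exactly one shard.
        self._shards[shard_for_region(region)][user_id] = profile

    def get(self, user_id: int, region: str):
        # A read with a known shard key also touches exactly one shard.
        return self._shards[shard_for_region(region)].get(user_id)

    def count_all(self) -> int:
        # A query without the shard key must visit every shard
        # and combine the partial results.
        return sum(len(rows) for rows in self._shards.values())
```

Reads and writes that carry the shard key (here, the region) stay on one shard, while `count_all` shows why cross-shard queries are more expensive: every shard has to be consulted.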
|
||
**Partitioning** | ||
|
||
- **Distribution:** Partitioning divides a **single table** horizontally within the same database server. Partitions are essentially sub-tables, each holding a subset of the rows selected by a chosen criterion (the partition key). | ||
- **Performance:** Partitioning improves query performance when a query filters on the partition key. The database can then skip the partitions that cannot contain matching rows, reducing the amount of data scanned. | ||
- **Management:** Partitioning is easier to manage than sharding because everything remains within a single server. | ||
- **Example:** An e-commerce website might partition its order table by year. Queries for past orders can then be directed to the appropriate year partition. | ||
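
The effect of partitioning by year can be illustrated with a small in-memory model. This is a sketch under stated assumptions: the `PartitionedOrders` class and its method names are hypothetical, and a real database performs this partition selection internally rather than in application code.

```python
from collections import defaultdict
from datetime import date


class PartitionedOrders:
    """One logical orders table, stored as one list of rows per year."""

    def __init__(self) -> None:
        self._partitions = defaultdict(list)  # year -> list of order rows

    def insert(self, order_id: int, order_date: date, total: float) -> None:
        # The partition key (the order year) decides where the row is stored.
        row = {"id": order_id, "date": order_date, "total": total}
        self._partitions[order_date.year].append(row)

    def orders_in_year(self, year: int) -> list:
        # The filter is on the partition key, so only one partition is read.
        return list(self._partitions.get(year, []))

    def orders_over(self, min_total: float) -> list:
        # No filter on the partition key: every partition must be scanned.
        return [
            row
            for rows in self._partitions.values()
            for row in rows
            if row["total"] >= min_total
        ]

    def drop_year(self, year: int) -> int:
        # Removing old data is a cheap whole-partition operation.
        return len(self._partitions.pop(year, []))
```

A query such as `orders_in_year(2023)` reads a single partition, whereas `orders_over(100.0)` gains nothing from the partitioning because it does not filter on the year.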
|
||
**Here's a table summarizing the key differences:** | ||
|
||
| Feature | Sharding | Partitioning | | ||
| ------------ | -------------------------------------------------------- | ---------------------------------------- | | ||
| Distribution | Across multiple servers | Within a single server | | ||
| Scalability | Excellent horizontal scaling | Limited by server capacity | | ||
| Complexity | More complex (distributed system management) | Simpler management | | ||
| Performance | Improved by spreading reads and writes across servers | Improved for queries that filter on the partition key | | ||
| Consistency | Maintaining consistency across shards can be challenging | Consistency is generally straightforward | | ||
|
||
**In conclusion:** | ||
|
||
- Use sharding for massive datasets requiring horizontal scalability and potentially high write volume. | ||
- Use partitioning for improved query performance on large tables within a single server, especially when queries target specific subsets of data. |