-
Notifications
You must be signed in to change notification settings - Fork 100
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Protocol Design 🚀 #262
Protocol Design 🚀 #262
Changes from 8 commits
298749b
f56dac4
fe8872a
3093e62
328af7e
864dbd5
a4e4e66
af15db2
c30916b
305b24c
39bec46
8d18e17
6131405
c1d0fa7
9d91347
4427eb1
431cfbf
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change | ||||||||
---|---|---|---|---|---|---|---|---|---|---|
@@ -1,9 +1,118 @@ | ||||||||||
# Protocol Design Philosophy | ||||||||||
# [Protocol Design Philosophy](#protocol-design-philosophy) | ||||||||||
|
||||||||||
> :warning: This article is a [stub](https://en.wikipedia.org/wiki/Wikipedia:Stub), help the wiki by [contributing](/contributing.md) and expanding it. | ||||||||||
These are the core tenets that sparked the work on Ethereum's architeture and imlementation. | ||||||||||
|
||||||||||
- **Simplicity**: | ||||||||||
Since its inception, the Ethereum protocol was designed in a way that reduced complexity and made it simpler, even at the cost of data storage inefficiency and time inefficiency. This stemmed from the idea that any average programmer should ideally be able to understand and implement the entire specification -- primarily to minimize the influence on the protocol by an individual or an elite group of developers. While this narrative has transmuted largely due to major changes that have been made to the protocol. Further | ||||||||||
|
||||||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Suggested change
|
||||||||||
- **Universality**: | ||||||||||
One of the fundamental doctrine of Ethereum's design philosophy is that Ethereum has no ***features***. | ||||||||||
Instead, Ethereum provides an internal Turing-complete scripting language [**solidity**](https://docs.soliditylang.org/en/v0.8.25/), which you can use to construct any smart contract or transaction type that can be mathematically defined. Ethereum tries to become a platform where new age developers can build decentralised and truly trustless applications without being prudent about the underlying complexity. | ||||||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Suggested change
Saying Ethereum has no feature is quite radical. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think no features is indeed the right term here but it needs to be explained correctly. The idea is that ethereum is generalized platform and doesn't provide features for specific use cases. |
||||||||||
|
||||||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Suggested change
Ethereum protocol per say doesn't concern with Solidity or any high level programming language. Which is why we (or the yellow paper) does not cover it. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. oh yeah just realised it. will make the changes |
||||||||||
- **Modularity**: | ||||||||||
Making the Ethereum protocol modular is crucial to it being **future-proof**. While Ethereum is far from being perfect, there is a continuous and rigourous research and engineering effort that runs parallel to the existence of the protocol. Over the course of development, it should be easy to make a small protocol modification in one place and have the application stack continue to function without any further modification. Innovations such as Dagger, Patricia trees and RLP should be implemented as separate libraries and made to be feature-complete even if Ethereum does not require certain features so as to make them usable in other protocols as well. Ethereum development should be maximally done so as to benefit the entire cryptocurrency ecosystem, not just itself. | ||||||||||
|
||||||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Suggested change
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. The modularity should also explain another important benefits. It talks only about modifying individual parts but it should also mention the encapsulated complexity, easier debugging of individual pieces, open plugable designs for different implementations |
||||||||||
- **Non-discriminant**: | ||||||||||
Ethereum was born out of the pillars set by various movements like [**FOSS**]() and [**Cypherpunk**](). And non-discrimination is foundational to fabric of Ethereum's design philosophy. The protocol should not attempt to actively restrict or prevent specific categories of usage, and all regulatory mechanisms in the protocol should be designed to directly regulate the harm, not attempt to oppose specific undesirable applications. You can even run an infinite loop script on top of Ethereum for as long as you are willing to keep paying the per-computational-step transaction fee for it. | ||||||||||
|
||||||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Suggested change
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I would link foss to something with better explanation of the idea because we are referring to philosophy influence. Maybe something like https://www.fsf.org/about/what-is-free-software which is great but ethereum also benefits from the pragmatic oss part |
||||||||||
- **Agility**: | ||||||||||
Details of the Ethereum protocol are not set in stone. It's crucial to be extremely judicious about making modifications to high-level contructs such as solidity and the address system, computational tests later on in the development process may lead to dicovery that certain modifications to the algorithm or scripting language will substantially improve scalability or security. If any such opportunities are found, they will be utilized. | ||||||||||
|
||||||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Suggested change
|
||||||||||
|
||||||||||
# [Principles](#principles) | ||||||||||
|
||||||||||
The Ethereum protocol evolves and changes over time but it always follow certain principles. These principles reflect values of the whole community and are reflected in some of the main design decisions of Ethereum. | ||||||||||
|
||||||||||
- **Managing Complexity**: One of the main goals of Ethereum protocol design is to minimize complexity: make the protocol as simple as possible, while still making a blockchain that can do what an effective blockchain needs to do. The Ethereum protocol is far from perfect at this, especially since much of it was designed in 2014-16 when we understood much less, but we nevertheless make an active effort to reduce complexity whenever possible. | ||||||||||
One of the challenges of this goal, however, is that complexity is difficult to define, and sometimes, you have to trade off between two choices that introduce different kinds of complexity and have different cost | ||||||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This overlaps with the first point of philosophy. Maybe combine this? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. The first part is about philosophy, this section should focus on design principles stemming from that. Managing complexity is an important piece elaborating a part mentioned in simplicity |
||||||||||
1. **Sandwich model complexity**: The sandwich model focused simplifying the bottom layer of the architecture of Ethereum and the interface to Ethereum should be as easy to understand as possible. Where complexity is inevitable, it should be pushed into the "middle layers" of the protocol, that are not part of the core consensus but are also not seen by end users - high-level-language compilers, argument serialization and deserialization scripts, storage data structure models, the leveldb storage interface and the wire protocol, etc. | ||||||||||
2. **Encapsulated complexity**: This occurs when there is a system with sub-systems that are internally complex, but that present a simple "interface" to the outside. Systemic complexity occurs when the different parts of a system can't even be cleanly separated, and have complex interactions with each other. Often, the choice with less encapsulated complexity is also the choice with less systemic complexity, and so there is one choice that is obviously simpler. But at other times, you have to make a hard choice between one type of complexity and the other. What should be clear at this point is that complexity is less dangerous if it is encapsulated. The risks from complexity of a system are not a simple function of how long the specification is; a small 10-line piece of the specification that interacts with every other piece adds more complexity than a 100-line function that is otherwise treated as a black box. Here are few [examples](https://vitalik.eth.limo/general/2022/02/28/complexity.html) | ||||||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Suggested change
|
||||||||||
The preference order for where the complexity goes in: layer 2 > client implementation > protocol spec | ||||||||||
|
||||||||||
- **Freedom**: Users should not be restricted in what they use the Ethereum protocol for, and we should not attempt to preferentially favor or disfavor certain kinds of Ethereum contracts or transactions based on the nature of their purpose. This is similar to the guiding principle behind the concept of "net neutrality". One example of this principle not being followed is the situation in the Bitcoin transaction protocol where use of the blockchain for "off-label" purposes (eg. data storage, meta-protocols) is discouraged, and in some cases explicit quasi-protocol changes (eg. OP_RETURN restriction to 40 bytes) are made to attempt to attack applications using the blockchain in "unauthorized" ways. In Ethereum, instead strongly favor the approach of setting up transaction fees in such a way as to be roughly incentive-compatible, such that users that use the blockchain in bloat-producing ways internalize the cost of their activities (ie.[Pigovian taxation](https://en.wikipedia.org/wiki/Pigouvian_tax)). | ||||||||||
|
||||||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This too overlaps with non-discrimination from philosophy. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @raxhvl Check the original doc he follows https://web.archive.org/web/20211121044757/https://ethereumbuilders.gitbooks.io/guide/content/en/design_rationale.html I agree there is a certain overlap, that being said, I like each point properly articulated. I am on the fence whether to keep it shorter or elaborate, it doesn't have to map 1:1 to original |
||||||||||
- **Generalization**: Protocol features and opcodes in Ethereum should embody maximally low-level concepts, so that they can be combined in arbitrary ways including ways that may not seem useful today but which may become useful later, and so that a bundle of low-level concepts can be made more efficient by stripping out some of its functionality when it is not necessary. An example of this principle being followed is our choice of a LOG opcode as a way of feeding information to (particularly light client) dapps, as opposed to simply logging all transactions and messages as was internally suggested earlier - the concept of "message" is really the agglomeration of multiple concepts, including "function call" and "event interesting to outside watchers", and it is worth separating the two. | ||||||||||
|
||||||||||
- **We have no features**: As a corollary to generalization, we often refuse to build in even very common high-level use cases as intrinsic parts of the protocol, with the understanding that if people really want to do it they can always create a sub-protocol (eg. ether-backed subcurrency, bitcoin/litecoin/dogecoin sidechain, etc) inside of a contract. An example of this is the lack of a Bitcoin-like "locktime" feature in Ethereum, as such a feature can be simulated via a protocol where users send "signed data packets" and those data packets can be fed into a specialized contract that processes them and performs some corresponding function if the data packet is in some contract-specific sense valid. | ||||||||||
|
||||||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Suggested change
|
||||||||||
|
||||||||||
|
||||||||||
|
||||||||||
# [Blockchain level protocol](#blockchain-level-protocol) | ||||||||||
|
||||||||||
### **Accounts over UTXOs** | ||||||||||
Earliest implementations of blockchain including bitcoin and it's derivatives, store user balance in structure based in unspent transaction outputs (UTXOs). Ethereum on the other hand uses an account based model. The account based model is more flexible and allows for more complex transactions. | ||||||||||
|
||||||||||
> **UTXO**: an unspent transaction output (UTXO) is a distinctive element in a subset of digital currency models. A UTXO represents a certain amount of cryptocurrency that has been authorized by a sender and is available to be spent by a recipient. | ||||||||||
|
||||||||||
A user's "balance" in the system is thus the total value of the set of coins for which the user has a private key capable of producing a valid signature. The account based model is more flexible and allows for more complex transactions. | ||||||||||
|
||||||||||
#### Benefits of UTXOs | ||||||||||
- **Higher degree of privacy**: if a user uses a new address for each transaction that they receive then it will often be difficult to link accounts to each other. This applies greatly to currency, but less to arbitrary dapps, as arbitrary dapps often necessarily involve keeping track of complex bundled state of users and there may not exist such an easy user state partitioning scheme as in currency. | ||||||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Not sure if covering UTXO is worthwhile. cc: @taxmeifyoucan There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. The general reason for account model should be explained but I think we just could sum up pros and cons in few sentences without its own section on Benefits of utxo. Utxo means more complexity for system like Eth, more privacy in certain use cases but less fungibility, etc.. |
||||||||||
|
||||||||||
- **Potential scalability paradigm**: UTXOs are more theoretically compatible with certain kinds of scalability paradigms, as we can rely on only the owner of some coins maintaining a Merkle proof of ownership, and even if everyone including the owner decides to forget that data then only the owner is harmed. In an account paradigm, everyone losing the portion of a Merkle tree corresponding to an account would make it impossible to process messages that affect that account at all in any way, including sending to it. However, non-UTXO-dependent scalability paradigms do exist | ||||||||||
|
||||||||||
#### Benefits of Accounts | ||||||||||
- **Space Saving**: for example, if an account has 5 UTXO, then switching from a UTXO model to an account model would reduce the space requirements from (20 + 32 + 8) * 5 = 300 bytes (20 for the address, 32 for the txid and 8 for the value) to 20 + 8 + 2 = 30 bytes (20 for the address, 8 for the value, 2 for a nonce(see below)). In reality savings are not nearly this massive because accounts need to be stored in a Patricia tree (see below) but they are nevertheless large. Additionally, transactions can be smaller (eg. 100 bytes in Ethereum vs. 200-250 bytes in Bitcoin) because every transaction need only make one reference and one signature and produces one output. | ||||||||||
|
||||||||||
- **Great fungibility**: UTXOs are not perfectly fungible, as a UTXO can be tainted by being used in a transaction with a tainted UTXO, and there are some heuristics that can be used to track the history of a coin. Accounts are perfectly fungible, as any coin can be replaced by any other coin. | ||||||||||
|
||||||||||
- **Simiplicity**: Accounts are simpler to implement and reason about than UTXOs. UTXOs require a more complex transaction validation algorithm, and the UTXO model is less flexible and less powerful than the account model. For example, it is impossible to implement a decentralized exchange in the UTXO model, as the UTXO model does not allow for the existence of a "sell" order that is not tied to a specific UTXO. | ||||||||||
|
||||||||||
One weakness of the account paradigm is that in order to prevent replay attacks, every transaction must have a [**nounce**](https://ethereum.stackexchange.com/questions/27432/what-is-nonce-in-ethereum-how-does-it-prevent-double-spending), such that the account keeps track of the nonces used and only accepts a transaction if its nonce is 1 after the last nonce used. This means that even no-longer-used accounts can never be pruned from the account state. A simple solution to this problem is to require transactions to contain a block number, making them un-replayable after some period of time, and reset nonces once every period. Miners or other users will need to "ping" unused accounts in order to delete them from the state, as it would be too expensive to do a full sweep as part of the blockchain protocol itself. | ||||||||||
|
||||||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Suggested change
Nonce is simply the number of transaction sent from an account. I don't think its much of a storage overhead. cc: @taxmeifyoucan There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. The nonce has to be mentioned here, it's necessary for account model and it should be clear it is certain complexity that protocol needs to deal with. But the section could be shorter, the pruning of old nonce transactions etc is irrelevant old idea, nodes don't prune txs now |
||||||||||
### **Merkle Patricia Trie(MPT)** | ||||||||||
Ethereum's data structure is a 'modified Merkle-Patricia Trie', named so because it borrows some features of PATRICIA (the Practical Algorithm To Retrieve Information Coded in Alphanumeric), and because it is designed for efficient data retrieval of items that comprise the Ethereum state. | ||||||||||
|
||||||||||
A Merkle-Patricia trie is deterministic and cryptographically verifiable: The only way to generate a state root is by computing it from each individual piece of the state, and two states that are identical can be easily proven so by comparing the root hash and the hashes that led to it (a Merkle proof). Conversely, there is no way to create two different states with the same root hash, and any attempt to modify state with different values will result in a different state root hash. Theoretically, this structure provides the 'holy grail' of O(log(n)) efficiency for inserts, lookups and deletes. | ||||||||||
|
||||||||||
Merkle-Patricia implemented trie are staged for deprecation to be replaced by a more efficient data structure called [**Verkle**](https://vitalik.ca/general/2022/02/28/complexity.html#verkle-trees). | ||||||||||
### **Introduction to Verkle** | ||||||||||
|
||||||||||
> :warning: Verkle trees are currently an active research area and this article may not be up to date with the latest developments. One can participate in the development and dicussions on [Ethereum Research](https://ethresear.ch/t/portal-network-verkle/19339) | ||||||||||
|
||||||||||
MPTs are currently employed in a variety of application in which membership proofs are sent across a network, including protocols, public-key directories, cryptocurrency such as Bitcoin and Secure File Systems. A Merkle Tree with $n$ leaves has $O(log{_2}{n})$-sized proofs. Althought, $O(log{n})$ complexity can be quite comforting, however, in large trees, sending proofs can dominate bandwidth comsumption. Verkle tree with branching factor $k$ achieve $O(kn)$ construction time and $O(log{_k}{n})$ membership proof-size. This means that the branching factor $k$ offers a tradeoff between computational power and bandwidth. | ||||||||||
|
||||||||||
|
||||||||||
One of the pressing problems of Ethereum is the current state size. Estimate at around 1-2TB(at the time of writing this article). It is impractical for nodes to hold in working memory or even in slower permanent storage per se, thus, the need for statelessness becomes crucial to growth of the network. Verkle trees with it's vector commitments allow for much smaller proofs (**called witnesses**). Instead of needing to provide hashes of all "sibling nodes" at each level, Merkle Trees, the prover needs only to provide all parent nodes(plus an extra proof, called an optional) along the path from each each leaf to the root. | ||||||||||
|
||||||||||
### **Recursive Length Prefix (RLP)** | ||||||||||
The rationale begin creating a new serialization scheme, lies in the | ||||||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. RLP is covered in the wiki maybe give a one line intro and link to the RLP page |
||||||||||
probablistic nature of other schemes. RLP solves this problem by being highly minimalistic serialization; and guarantees absolute byte-perfect consistency. RLP does not attempt to define any specific data type such as boolean, floats, doubles or even integers -- instead, it simply exists to store structure, in the form of nested arrays. Key/value maps are also not explicitly supported; the semi-official suggestion for supporting key/value maps is to represent such maps as``` [[k1, v1], [k2, v2], ...]``` where ```k1, k2...``` are sorted using the standard ordering for strings. | ||||||||||
|
||||||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Suggested change
|
||||||||||
The notion of complete anonymity of the data structure to the serizalation algorithm over the course of time has turn out to ineffinient in case of fixed length data types like boolean, integers. SimpleSerialize(SSZ) was introduced in Ethereum 2.0 which supported both variable sized and fixed sized data types with additional features like Merkleization. | ||||||||||
|
||||||||||
### **Simple serialize (SSZ)** | ||||||||||
|
||||||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Like RLP, SSZ is covered in the wiki. Consider linking the page without going into depth here. |
||||||||||
Serialization is the process of converting data structures into format that can be transmuted, transmitted and reconstructed later. SSZ is a serialization format that is used in Ethereum 2.0 Beacon chain. Designed to be serialization scheme that is not self-describing -- rather it relies on a schema that must be known in advances. SSZ has a bunch of advantages over RLP, like efficient re-hashing of objects and fast indexing which RLP lacks resulting in $O(N)$ complexity. | ||||||||||
|
||||||||||
Based on [V's comment](https://ethresear.ch/t/replacing-ssz-with-rlp-zip-and-sha256/5706/12), one of the major problem SSZ tries to solve is RLP doesn't allow Merkelization, and this would mean disqualifying any possibility of succinct light client proofs of anything. Thus, leaving no scope of achieving statelessness -- while stateleness remains a crucial objective of current Ethereum's R&D. | ||||||||||
|
||||||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Suggested change
|
||||||||||
Further implementation and details about Simple Serialize can be found [**here**](#) | ||||||||||
|
||||||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Suggested change
|
||||||||||
### **Tryst with Finality** | ||||||||||
In Ethereum's proof-of-stake based consensus mechanisms, finality refers to the guarantee that a block cannot be altered or removed from the blockchain without burning at least 33% of the total stacked ETH. The underlying algorithm to achieve this is called **Gasper** | ||||||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Did you mean trust here? |
||||||||||
|
||||||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Suggested change
Suggested change
|
||||||||||
- ***Casper FFG*** | ||||||||||
The [Casper FFG](https://arxiv.org/abs/1710.09437v4) is an overlay atop a proposal mechanism -- a mechanism which proposes blocks. Casper is responsible for finalizing these blocks, essentially selecting a unique chain which represents the canonical transaction in the ledger. This is achieved by employing [slashing](https://blog.ethereum.org/2014/01/15/slasher-a-punitive-proof-of-stake-algorithm) which was first proposed in 2014. Casper is follows a BFT tradition with some modifications to achieve PoS. | ||||||||||
Simply put, each validator will vote on the checkpoint, and after two rounds of voting, the checkpoint will be **finalized**. All the finalized checkpoints become the canonical chain(part of the blockchain history). While Casper is used to guarantee **finality**, brought about by attestations to the latest block addition to the canonical chain -- it requires a fork-choice rule where validators attest to blocks to signal support for those blocks. | ||||||||||
|
||||||||||
- ***LMD GHOST*** | ||||||||||
Latest Message Driven Greediest Heaviest Observed Sub-Tree (LMD-GHOST) is a *fork choice rule* where validators attests to blocks to signal support for those blocks. This similar in some ways to the fork choice rule used in Proof-of-Work network -- where the fork with the most work done is selected to be the canonical chain. | ||||||||||
|
||||||||||
![LMD-GHOST-Alogrithm](./img/lmt-ghost.png) | ||||||||||
|
||||||||||
Gapser is full proof-of-stake protocol that is an idealized abstraction of the Ethereum implementation. A combination of Casper FFG and LMDGHOST driving the consensus mechanism for the Eth2. | ||||||||||
# References | ||||||||||
|
||||||||||
- https://web.archive.org/web/20211121044757/https://ethereumbuilders.gitbooks.io/guide/content/en/design_rationale.html | ||||||||||
|
||||||||||
- https://vitalik.eth.limo/general/2022/02/28/complexity.html | ||||||||||
|
||||||||||
- https://dankradfeist.de/ethereum/2021/02/14/why-stateless.html | ||||||||||
|
||||||||||
- https://math.mit.edu/research/highschool/primes/materials/2018/Kuszmaul.pdf | ||||||||||
|
||||||||||
- https://arxiv.org/pdf/2003.03052 | ||||||||||
|
||||||||||
- https://blog.ethereum.org/2014/01/15/slasher-a-punitive-proof-of-stake-algorithm |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.