When we buy breakfast at 7-11 and there is only one cashier, a long queue forms at the checkout; with two cashiers, checkout speed doubles immediately; with four cashiers, there may be no need to queue at all. This is the basic logic of sharding: split one person’s work among several people to improve efficiency.
From the perspective of Ethereum’s distributed ledger: before sharding, there is only one ledger on the main chain, which can process approximately 12 to 45 transactions per second. When transaction volume exceeds this figure, transactions have to queue, that is, the network becomes congested. Sharding turns the one ledger into 64 ledgers that process transactions at the same time, which is equivalent to 7-11 opening 64 checkout counters.
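As a back-of-the-envelope check of the cashier analogy, the sketch below multiplies the single-chain throughput quoted above by the number of shards. It assumes perfectly linear scaling with no cross-shard overhead, which is an idealisation, and the function name is purely illustrative.

```python
def ideal_sharded_tps(single_chain_tps: float, shard_count: int) -> float:
    """Upper bound on throughput, assuming every shard works fully in parallel."""
    return single_chain_tps * shard_count


SHARDS = 64                  # number of shards quoted in the text
LOW_TPS, HIGH_TPS = 12, 45   # single-chain range quoted in the text

low = ideal_sharded_tps(LOW_TPS, SHARDS)
high = ideal_sharded_tps(HIGH_TPS, SHARDS)
print(f"ideal range: {low} to {high} transactions per second")
```

Real throughput would be lower than this bound, since cross-shard transactions add the overheads discussed later in the article.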
The logic of sharding is simple, so why is it so difficult to implement? Because dividing one ledger into 64 ledgers raises many new problems, and the job of sharding technology is to solve them. This article starts from these problems to explain what Ethereum 2.0 sharding is all about.
1. How to shard
1. Allocate transactions to shards
A shard contains transactions and the validators who package those transactions into blocks. The first step in sharding is to decide how transactions and validators are allocated to a shard. Let’s look at allocating transactions first.
Let us use a story of three villages: a fishing village, Orion village (a village of hunters), and a farmers’ village. People often trade within and between the villages, but there is no currency, so everyone keeps accounts. In the past, a single ledger recorded the accounts of all three villages, which was rather slow. Now there are three ledgers. Which ledger should record which accounts?
One way is to lay out three ledgers and, when a transaction arrives, record it in whichever ledger has no queue. But this creates a problem: every ledger must hold everyone’s account information, otherwise I might queue at your ledger only to find that it does not have my account.
As a result, one of the main problems with this sharding method is that it does not reduce the amount of data stored in any single ledger, and that storage requirement is a high barrier for nodes that want to take part in bookkeeping. This method also has to solve the double-spending problem, because one person could spend the same money on different shards at the same time.
Another method is for the fishing village, Orion village, and the farmers’ village to each have their own ledger. Each ledger contains only the account information of its own village and records only transactions within that village. In this way the three ledgers can be written at the same time, with high bookkeeping efficiency and low storage requirements. This is exactly the sharding method adopted by Ethereum: state sharding. Each shard stores the account state belonging to that shard, and only that state. In the actual implementation, users choose which shard to join, rather than being grouped by a natural boundary such as a village.
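A minimal sketch of the state-sharding idea, assuming a toy model in which each shard is a dictionary of balances. The `Shard` class, its method, and the account names are all hypothetical and serve only to show that a shard can settle a transfer by itself only when both accounts live on it.

```python
from dataclasses import dataclass, field


@dataclass
class Shard:
    name: str
    balances: dict[str, int] = field(default_factory=dict)

    def transfer(self, sender: str, receiver: str, amount: int) -> bool:
        """Apply a transfer only if both accounts live on this shard."""
        if sender not in self.balances or receiver not in self.balances:
            return False  # cross-shard: this ledger lacks one of the accounts
        if self.balances[sender] < amount:
            return False  # insufficient funds
        self.balances[sender] -= amount
        self.balances[receiver] += amount
        return True


fishing = Shard("fishing", {"alice": 120, "carol": 30})
orion = Shard("orion", {"bob": 10})

ok_local = fishing.transfer("alice", "carol", 20)  # both in the fishing village
ok_cross = fishing.transfer("alice", "bob", 20)    # bob's account is elsewhere
print(ok_local, ok_cross)
```

The rejected second transfer is the cross-shard case that the rest of the article is concerned with.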
The biggest problem with state sharding is: what if someone in the fishing village wants to trade with someone in Orion village? The fishing village’s ledger has no accounts for Orion villagers, and Orion village’s ledger has no accounts for fishing villagers. This is in fact the biggest test facing sharding technology: cross-shard communication. Once this problem is fully solved, Ethereum 2.0 will be ready for use. The second part of this article discusses some solutions to it.
2. Assign validators to shards
After transactions have been allocated to shards, the next problem is how to assign bookkeepers, that is, validators, to a given shard.
Ethereum has 64 shards, and each shard has 128 validators. If a shard’s validators were fixed or predictable, an attacker could take control of the shard by buying off 2/3 of those 128 validators, which would be fairly easy. What can be done about this? Ethereum’s solution is to select the validators for each shard at random from the full validator set and to replace them every 6.4 minutes (the length of an epoch). This way, the chance that an attacker controls 2/3 of the validators in a shard is less than one in a trillion (see Reference 1 for the reasoning).
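The one-in-a-trillion figure can be roughly reproduced with a binomial tail. The sketch below assumes that the attacker controls one third of all validators and that each seat among the 128 is an independent random draw. Both are simplifying assumptions made here for illustration, and the function name is hypothetical.

```python
from math import comb


def takeover_probability(committee_size: int, attacker_share: float) -> float:
    """P(attacker holds at least 2/3 of a randomly sampled set of validators).

    Treats every seat as an independent draw (binomial approximation), which is
    reasonable when the validator pool is much larger than the sampled set.
    """
    threshold = -(-2 * committee_size // 3)  # ceil(2n/3) using integer arithmetic
    return sum(
        comb(committee_size, k)
        * attacker_share ** k
        * (1 - attacker_share) ** (committee_size - k)
        for k in range(threshold, committee_size + 1)
    )


p = takeover_probability(committee_size=128, attacker_share=1 / 3)
print(f"takeover probability: {p:.3e}")
print("below one in a trillion:", p < 1e-12)
```

Under these assumptions the attacker needs at least 86 of the 128 seats, roughly twice the 43 seats a one-third share would get on average, which is why the tail probability is so small.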
One of the main tasks of the beacon chain is to assign validators to the shard chains, and the most important aspect of this work is achieving randomness. First, randomness matters: if validators cannot be assigned randomly, the security of the ledger cannot be guaranteed. Second, randomness is hard: achieving it on a blockchain is extremely difficult, and it is fair to say that so far no truly random algorithm has been proven in engineering practice.
Ethereum’s solution is to obtain random numbers from RANDAO+VDF. RANDAO is easy to understand if you split it into RAN (random) and DAO: each member of a group proposes a random number individually, and the numbers from all members are then combined to produce the random number that is finally used. Because it is difficult for anyone to know the numbers provided by the others, it is also difficult to predict the combined result.
However, the RANDAO model has a flaw: whoever provides the last number has an opportunity to cheat. He knows the combined value of the numbers provided by everyone before him, so he can adjust his own number to make the final result favor him.
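The flaw can be illustrated with a toy version of the scheme. The sketch below assumes contributions are combined with XOR, which is a mixing rule chosen here for simplicity, and it lets the last participant pick any number freely. Real designs restrict that freedom, for example with commitments, which this sketch leaves out.

```python
import secrets
from functools import reduce


def combine(contributions: list[int]) -> int:
    """Mix all contributions into one value by XOR (an assumed mixing rule)."""
    return reduce(lambda acc, x: acc ^ x, contributions, 0)


# Honest participants each pick a 32-bit number independently.
honest = [secrets.randbits(32) for _ in range(4)]

# The last participant sees the running mix before choosing a number,
# so any target value can be forced onto the final result.
target = 7
running_mix = combine(honest)
malicious = running_mix ^ target

final = combine(honest + [malicious])
print("last participant forced the result:", final == target)
```

However random the honest contributions are, the final value always equals the attacker’s target in this naive model.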
To solve this problem, Ethereum introduces a VDF (Verifiable Delay Function). Its role is simple: it prevents the last contributor from working out the combined result of everyone else’s numbers before he has to submit his own, so he cannot manipulate the random number. (For a detailed introduction to RANDAO+VDF, see Reference 2)
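The delay idea can be sketched with a chain of hashes in which every step depends on the previous one, so the work cannot be parallelised. This is only a stand-in: iterated SHA-256 is not a real VDF, because checking the result means redoing all the work, whereas a VDF also yields a proof that is quick to verify. The function name and iteration count are assumptions for illustration.

```python
import hashlib


def slow_mix(seed: bytes, iterations: int) -> bytes:
    """Hash the seed repeatedly; each step needs the previous step's output."""
    value = seed
    for _ in range(iterations):
        value = hashlib.sha256(value).digest()
    return value


ITERATIONS = 100_000  # illustrative; a real delay is tuned to wall-clock time
randao_output = b"combined contributions"
final_random = slow_mix(randao_output, ITERATIONS)
print(final_random.hex())
```

If the delay is longer than the window for submitting a number, the last contributor cannot learn how a candidate number would affect the final output before the deadline passes.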
3. Relayers store the shard state
You may have noticed that rotating validators between ledgers brings a new problem: a validator is assigned to keep accounts in the fishing village for a while and is then reassigned to Orion village. If he does not have all the account information, how can he keep the accounts? If he does have all the account information, he is holding a full ledger, and state sharding has not been achieved.
To solve this problem, Ethereum proposes an important new design: the stateless client. Put simply, the fishing village’s ledger stays in the fishing village and Orion village’s ledger stays in Orion village. The validator does not hold any ledger himself and is only responsible for travelling between villages to keep accounts.
So who keeps the ledgers of the different villages? Ethereum introduces the role of relayers (state providers), who are responsible for storing the account state of the shards; a relayer may serve just one particular shard. The work of relayers is easy to understand, but how to pay for their services, how to keep them honest, and so on are brand-new problems whose mechanisms still need to be designed. They are also governance questions that community members should help discuss.
The actual situation of stateless clients is much more complicated than described above. The composition of a “transaction” itself differs from the unsharded case: it must be accompanied by witness data proving that it is valid. You can think of it this way: in 1.0, the validator has to store the old accounts in order to verify a new transaction; in 2.0, the transaction has to carry the relevant old account data to the validator for verification.
But we cannot require every user to store all the old accounts just so that a transaction can be proved once it is initiated. This is where the relayer comes in: it stores the entire account state of the shard and, whenever a user asks, it can supply the witness data for the user’s transaction to the validators.
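The division of labour between relayer and validator can be sketched with a Merkle proof. The sketch assumes the witness is a list of sibling hashes under a state root, which is one common format and not necessarily the one Ethereum 2.0 will use. All function names and the account encoding are hypothetical.

```python
import hashlib


def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def build_levels(leaves: list[bytes]) -> list[list[bytes]]:
    """Return every level of the tree, from hashed leaves up to the root.

    Assumes the number of leaves is a power of two to keep the sketch short.
    """
    levels = [[h(leaf) for leaf in leaves]]
    while len(levels[-1]) > 1:
        prev = levels[-1]
        levels.append([h(prev[i] + prev[i + 1]) for i in range(0, len(prev), 2)])
    return levels


def make_witness(levels: list[list[bytes]], index: int) -> list[bytes]:
    """Relayer side: collect the sibling hash at each level for leaf `index`."""
    witness = []
    for level in levels[:-1]:
        witness.append(level[index ^ 1])
        index //= 2
    return witness


def verify(root: bytes, leaf: bytes, index: int, witness: list[bytes]) -> bool:
    """Validator side: recompute the root from the leaf and its witness."""
    node = h(leaf)
    for sibling in witness:
        node = h(node + sibling) if index % 2 == 0 else h(sibling + node)
        index //= 2
    return node == root


# The relayer stores the full state; the validator only knows the root.
accounts = [b"alice:120", b"bob:10", b"carol:30", b"dave:5"]
levels = build_levels(accounts)
state_root = levels[-1][0]

witness = make_witness(levels, index=0)
print(verify(state_root, b"alice:120", 0, witness))  # genuine account data
print(verify(state_root, b"alice:999", 0, witness))  # tampered account data
```

The validator checks the claim with only the 32-byte root and two sibling hashes, without ever holding the shard’s ledger.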
Vitalik Buterin published an article on March 11 proposing to use polynomial commitments instead of state roots, and that technique applies here. It draws on zero-knowledge proof methods to provide proofs for transactions: the validator is given a result computed over the data and verifies that, instead of being given all the relevant data directly. This method can greatly reduce the size of the witness data and effectively reduce various overheads. (For a detailed introduction to Vitalik’s new method, see Reference 3; for a detailed introduction to stateless clients, see Reference 4)
At this point, the work of dividing one ledger into multiple ledgers, that is, the sharding itself, is complete.
2. Cross-shard transactions
If the people in the fishing village only trade with one another, and the people in Orion village only trade with one another, then each village can simply keep its own accounts, and no new technology is needed. But what if people in the fishing village want to trade with people in Orion village? How do different ledgers communicate? This is the most difficult problem facing state sharding.
There are two ways to solve this problem: one is synchronous (tightly coupled) and the other is asynchronous (loosely coupled).
Suppose A lives in the fishing village, B lives in Orion village, and A wants to send B 100 yuan. Synchronous means that when A initiates the transfer, the bookkeepers in both the fishing village and Orion village know about the transaction and follow its progress. The fishing village’s bookkeeper subtracts 100 from A in his ledger, Orion village’s bookkeeper adds 100 to B in his, and once the transaction is complete the two villages produce their new blocks at the same time.
Asynchronous means that when A initiates the transfer, the fishing village’s ledger subtracts 100 from A and a new block is produced. The bookkeeper in Orion village receives this message in some way, confirms that A’s money really has been deducted, and adds 100 to B in his own ledger. The transaction is then complete, but the two villages produce their new blocks asynchronously.
The synchronous method looks friendly: the look and feel of executing a transaction is the same as on an unsharded chain. But it hides a big problem: it is difficult to handle “continuous state changes”. What does that mean?
If A only transfers 100 yuan to B, the fishing village and Orion village can easily agree on the bookkeeping once they hear about the transaction: the fishing village’s ledger subtracts 100 from A, Orion village’s ledger adds 100 to B, and the bookkeeping is done. However, if A transfers 100 to B and then another 50 to B while holding only 120 yuan in total, a continuous state change occurs. At this point it is difficult for each village to confirm how the other has recorded the transfers:
If every validator communicates with the other side’s validators, the communication cost rises sharply and it becomes extremely difficult to reach a definite result. If instead each village head tells the other side a definite result, then apart from the added cost this is hard to achieve, because Ethereum’s consensus mechanism cannot itself deliver a definite result (finality).
The asynchronous method is not troubled by continuous state changes, because its approach is to “wait”: only when your state is confirmed do I proceed to the next step. Once the fishing village has finished A’s bookkeeping, Orion village sees whether 100 or 50 was subtracted from A and then decides whether to add 100 or 50 to B.
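The “wait” approach can be sketched with receipts: the source shard debits first and emits a receipt, and the destination shard credits only what it finds in confirmed receipts. The `Receipt` and `Shard` classes and the in-memory outbox are hypothetical simplifications, and the sketch ignores forks and finality.

```python
from dataclasses import dataclass, field


@dataclass
class Receipt:
    receiver: str
    amount: int


@dataclass
class Shard:
    balances: dict[str, int]
    outbox: list[Receipt] = field(default_factory=list)

    def debit(self, sender: str, receiver: str, amount: int) -> bool:
        """Source shard: deduct first, and emit a receipt only on success."""
        if self.balances.get(sender, 0) < amount:
            return False
        self.balances[sender] -= amount
        self.outbox.append(Receipt(receiver, amount))
        return True

    def credit_from(self, source: "Shard") -> None:
        """Destination shard: wait for confirmed receipts, then credit them."""
        while source.outbox:
            receipt = source.outbox.pop(0)
            self.balances[receipt.receiver] = (
                self.balances.get(receipt.receiver, 0) + receipt.amount
            )


fishing = Shard({"A": 120})
orion = Shard({"B": 0})

first = fishing.debit("A", "B", 100)  # succeeds, A has 20 left
second = fishing.debit("A", "B", 50)  # fails, A cannot cover it
orion.credit_from(fishing)            # B is credited only for confirmed debits
print(first, second, fishing.balances, orion.balances)
```

With A holding 120, only the transfer of 100 produces a receipt, so Orion village never has to guess which of the two transfers went through.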
The problem with the asynchronous mode is atomicity failure. A transaction should be atomic, either executed in full or not executed at all, but in asynchronous mode one part of a transaction may be confirmed while the other part is discarded.
For example, the fishing village’s block that subtracts 100 from A is finalized on the fishing village’s main chain, but the block in which Orion village adds 100 to B ends up on a side branch (fork) of Orion village’s chain and is abandoned. Atomicity failure is a problem, but it can be solved by design. For a detailed introduction to this topic, see Reference 5 at the end of the article.
Another problem with the asynchronous method is its cost in time, communication, and storage, that is, the waiting time and the resources needed to complete one cross-shard transaction. The way information is passed between shards determines the size of these overheads. The different overheads are interrelated and cannot all be minimized at once, so the design has to strike a balance. The future performance of Ethereum 2.0 will be largely determined by how information is passed between shards.
Ethereum has discussed several asynchronous architecture models. The latest was proposed by Vitalik at the DevCon 5 conference in October 2019. The basic idea is to use the beacon chain to pass information: in each slot (12 seconds), every shard chain produces a block and is cross-linked with a beacon chain block. The connection method is as shown in the figure below. In this way, any shard can learn the state of all other shards through the beacon chain when it packages its own new transactions. Different shards are asynchronous by one slot.
This method reduces the waiting time for cross-shard transactions, but it places higher demands on the beacon chain, which has to store proof data for all shards. It also increases the number of cross-links (in the original design, a cross-link was made once per epoch, that is, every 6.4 minutes or 32 blocks), which inevitably increases the related overheads. For this reason the number of shards in Ethereum was changed from 1024 to 64, reducing the total number of cross-links from another direction. (For a detailed introduction to the architecture, see Reference 6)
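The trade-off behind the move from 1024 to 64 shards can be checked with simple arithmetic. The sketch below uses the slot and epoch lengths quoted in the text and assumes the best case in which every shard is cross-linked in every slot. The function name is illustrative.

```python
SLOT_SECONDS = 12
SLOTS_PER_EPOCH = 32


def crosslinks_per_epoch(shards: int, crosslinks_per_shard: int) -> int:
    """Total cross-links the beacon chain must handle in one epoch."""
    return shards * crosslinks_per_shard


epoch_minutes = SLOT_SECONDS * SLOTS_PER_EPOCH / 60

original = crosslinks_per_epoch(1024, 1)                     # once per epoch
per_slot_1024 = crosslinks_per_epoch(1024, SLOTS_PER_EPOCH)  # every slot, 1024 shards
per_slot_64 = crosslinks_per_epoch(64, SLOTS_PER_EPOCH)      # every slot, 64 shards

print(f"epoch length: {epoch_minutes} minutes")
print(original, per_slot_1024, per_slot_64)
```

Cross-linking 1024 shards every slot would mean 32 times the original load per epoch, while cutting the shard count to 64 brings it down to about twice the original.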
Judging from current sharding designs, the synchronous model leans toward shards communicating with each other directly, while the asynchronous model leans toward shards communicating through a third party. The former faces the problem of communication volume; the latter faces a trade-off among several costs. The design and implementation of cross-shard transactions are still in progress, and it is not yet certain which architecture Ethereum 2.0 will adopt.
3. Smart contracts across shards
With sharding and cross-shard transactions covered, we come to the final boss on the road to Ethereum 2.0: cross-shard smart contracts. The difference between cross-shard transactions and cross-shard smart contracts is that transactions involve only global variables, while smart contracts also have local variables. What trouble can local variables cause?
After Ethereum is sharded, there are 64 ledgers from a physical point of view, but from an abstract point of view there is still only one. Think of a ledger as a big tree in which each leaf stores the state data of one account. The 64 ledgers are 64 trees, and the roots of these trees are handed to the beacon chain to form a new, bigger tree, so that the 64 ledgers are combined into one (this is only an approximate analogy).
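The tree-of-trees picture can be sketched with hash trees: each shard hashes its accounts into a root, and the beacon chain hashes the shard roots into one more root. The sketch assumes a plain binary SHA-256 tree and uses two toy shards instead of 64. The account encoding and function names are hypothetical.

```python
import hashlib


def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def merkle_root(leaves: list[bytes]) -> bytes:
    """Root of a binary hash tree; an odd node is paired with itself."""
    level = [h(leaf) for leaf in leaves]
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]


# Each shard commits to its own accounts (two toy shards instead of 64).
shard_states = {
    "fishing": [b"A:120", b"C:30"],
    "orion": [b"B:10"],
}
shard_roots = [merkle_root(accounts) for accounts in shard_states.values()]

# The beacon chain commits to all shard roots with one more tree.
beacon_root = merkle_root(shard_roots)
print(beacon_root.hex())
```

Any change to an account on any shard changes that shard’s root and therefore the beacon root, which is what lets the separate ledgers be treated as one.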
In a cross-shard transaction, when one shard needs to know the state of an account on another shard, it can always follow the tree to find the leaf that stores that state, then update the account state on its own shard and complete the transaction. In this sense, the tree is what lets different shards exchange information.
But for cross-shard smart contracts, the problem is that the data stored in the leaves of this tree are global variables; there are no local variables. If a smart contract on one shard calls a smart contract on another shard, how do the two pass information about their local variables? This tree cannot help them.
Put another way, cross-shard transactions only need to look at global variables, that is, first-level state, whereas cross-shard smart contracts need to look at local variables, that is, second-level state. The design difficulty of the two is not of the same order of magnitude.
So far we have not seen a systematic design for cross-shard smart contracts, but there are two proposals. One is to place related smart contracts in the same shard for execution, which removes the need for contracts to cross shards. The other is to use techniques such as SIMD (Single Instruction, Multiple Data) so that a smart contract itself can be executed in parallel.
Ethereum 2.0 will introduce smart contracts in Phase 2, which means that cross-shard smart contracts will not be realized until Phase 2. Only after this step can Ethereum truly be said to have entered the 2.0 era.
The above is an introduction to the design and difficulties of Ethereum sharding. Ethereum 2.0 is still at an early stage of implementation, and the following keywords are worth watching at this stage: state sharding, stateless client, random number.
1. “Minimum Committee Size Explained”; author, Chih-Cheng Liang; https://medium.com/@chihchengliang/minimum-committee-size-explained-67047111fa20
2. “Ethereum 2.0: Randomness”; author, Bruno Škvorc; translation, Jhonny, Ajian; https://ethfans.org/posts/two-point-oh-randomness
3. “Using polynomial commitments to replace state roots”; author, Vitalik Buterin; https://ethresear.ch/t/using-polynomial-commitments-to-replace-state-roots/7095
4. “Eth2.0 Relayer Network and Fee Mechanism”; author, John Adler; translation, IAN LIU, A Jian; https://ethfans.org/posts/relay-networks-and-fee-markets-in-eth-2
5. “Concepts and Challenges of Blockchain Sharding”; author, Alexander Skidanov; translation, Jhonny, Echo, A Jian; https://ethfans.org/posts/the-authoritative-guide-to-blockchain-sharding-part-1
6. “Eth2 shard chain simplification proposal”; author, Vitalik Buterin; https://notes.ethereum.org/@vbuterin/HkiULaluS
7. “ETH 2.0 Guide for Engineers”; author, James Prestwich; translation, Aisling, Qiqi, stormpang, Ajian; https://ethfans.org/posts/what-to-expect-when-eths-expecting
8. “Merge blocks and synchronous cross-shard state execution”; author, Vitalik Buterin; https://ethresear.ch/t/merge-blocks-and-synchronous-cross-shard-state-execution/1240