🐘

Verkle Trees for Statelessness

💡
Verkle Trees = Vector Commitments + Merkle Trees

site maintained by @rudolf6, @ignaciohagopian, and @gballet (ping with any questions/requests)

1  Introduction

Updated: Nov 21, 2023

Tl;dr: Verkle Trees and statelessness bring many benefits to Ethereum
  • Smaller proof sizes allow proofs to be passed over the network, which will unlock many new types of functionality in addition to stateless clients
  • Lower hardware requirements to run a node, which improves decentralization
  • New nodes can join the network right away with faster sync
  • Potential scaling benefits in that it could allow for higher gas limits
  • More compatible with a zk-EVM future

What are Verkle Trees?

Verkle Tries are a type of data structure similar to Merkle Patricia Trees (MPT), which is the data structure used in Ethereum today. While Verkle Tries utilize a tree-like structure similar to MPT, a key difference is each node uses a special type of hash (called a vector commitment) to commit to children nodes. These vector commitments provide several important, long-term benefits, and help pave the way for statelessness in Ethereum.

Note: Both Verkle Tries and Merkle Trees serve a key purpose in blockchain networks — providing data in a format that allows for network participants to easily verify network state (needing only a short proof, aka “witness”, and the root of the tree).

Why?

Because they can help Ethereum achieve statelessness! The key benefit of the vector commitments used in Verkle is that they allow for much smaller proof sizes (called witnesses). Instead of needing to provide hashes of all "sibling nodes" at each level, as in Merkle Trees, the prover needs only to provide all parent nodes (plus an extra proof, called an opening) along the paths from each leaf node to the root. Since we no longer need to provide all sibling nodes, we can structure Verkle Tries to be much wider than Merkle Trees. All of this allows for much smaller witness sizes.
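To make the sibling-versus-parent argument concrete, here is a rough back-of-the-envelope sketch. It assumes perfectly balanced trees, 32-byte hashes and commitments, a width of 16 for the Merkle case and 256 for the Verkle case, and it leaves the size of the extra proof as a parameter rather than guessing it. The function names are hypothetical, and real tries are not balanced, so treat the numbers as orders of magnitude only.

```python
NODE_SIZE = 32  # assumed size in bytes of one hash or one commitment


def tree_depth(num_leaves: int, width: int) -> int:
    """Smallest depth at which a balanced width-ary tree can hold num_leaves."""
    depth, capacity = 0, 1
    while capacity < num_leaves:
        capacity *= width
        depth += 1
    return depth


def merkle_witness_bytes(num_leaves: int, width: int = 16) -> int:
    """Merkle-style proof: every sibling hash at every level of the path."""
    return tree_depth(num_leaves, width) * (width - 1) * NODE_SIZE


def verkle_witness_bytes(num_leaves: int, width: int = 256,
                         extra_proof_bytes: int = 0) -> int:
    """Verkle-style proof: one parent commitment per level, plus the extra proof.

    extra_proof_bytes is left at 0 by default because its real size is not
    stated in the text above.
    """
    return tree_depth(num_leaves, width) * NODE_SIZE + extra_proof_bytes


if __name__ == "__main__":
    leaves = 2 ** 30  # roughly one billion entries, chosen for illustration
    print("Merkle, width 16 :", merkle_witness_bytes(leaves), "bytes")
    print("Verkle, width 256:", verkle_witness_bytes(leaves), "bytes + extra proof")
```

Widening a Merkle tree makes each level more expensive because the sibling count grows with the width; in the Verkle-style count the per-level cost stays constant, so a wider tree only reduces the depth.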

What’s so great about small witness sizes? Large witnesses are the key problem standing in the way of statelessness. When witnesses (proofs) are small enough, they can be contained within each block. This allows for the verification of any block using only what’s contained within the block itself.

The general principle of “statelessness” is that nodes verifying blocks no longer need to store state. See this writeup from Dankrad Feist on why statelessness is a worthy goal.

How?

How do we actually get from the current Merkle world to Verkle?

See this writeup on potential migration paths, which currently include four options:

  1. Overlay Method:
    • This is currently the leading candidate. In this method, you start with an empty, new overlay tree (the Verkle tree), and a base tree (the existing Merkle tree). Each block, a fixed number of values are copied from the base tree to the overlay tree. When reading the state, you first search the overlay tree for a key, and if it’s not found then search the base tree. When writing to the state, you always write to the overlay tree.
  2. Conversion Node:
    • A small group of very powerful machines do the translation at a fixed block height, and then share the result with the rest of the network. Less powerful clients download the conversion and replay blocks in Verkle mode until the head has been reached.
  3. Local Bulk:
    • Similar to the “Conversion Node” method, except there is no special class of nodes and all machines do the conversion themselves. This approach was previously not feasible, but has started to look possible based on recent performance optimizations.
  4. State Expiry:
    • The Merkle tree is frozen in place, and we start from a fresh tree. Reading and writing work the same as with the overlay method, but no conversion/merging of the trees ever happens.
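The read, write, and copy rules of the overlay method can be sketched with two plain dictionaries standing in for the trees. This is a minimal illustration, not client code: the class name, the `values_per_block` parameter, and the sorted walking order are all assumptions, and deletions are not modeled.

```python
class OverlayState:
    """Toy model of the overlay method using dicts in place of real trees."""

    def __init__(self, base: dict, values_per_block: int):
        self.base = base                      # stands in for the existing Merkle tree
        self.overlay = {}                     # stands in for the new Verkle tree
        self.values_per_block = values_per_block
        # Walking order is an assumption; the real order is still to be specified.
        self._pending = iter(sorted(base))

    def read(self, key):
        # Search the overlay tree first, then fall back to the base tree.
        if key in self.overlay:
            return self.overlay[key]
        return self.base.get(key)

    def write(self, key, value):
        # Writes always go to the overlay tree.
        self.overlay[key] = value

    def migrate_block(self) -> int:
        """Visit up to values_per_block base keys and copy them to the overlay."""
        visited = 0
        for key in self._pending:
            if key not in self.overlay:       # keep newer writes, never overwrite them
                self.overlay[key] = self.base[key]
            visited += 1
            if visited == self.values_per_block:
                break
        return visited


if __name__ == "__main__":
    state = OverlayState({"a": 1, "b": 2, "c": 3}, values_per_block=2)
    state.write("a", 10)
    while state.migrate_block():
        pass
    print(state.overlay)
```

Under the State Expiry option described above, the same `read` and `write` rules would apply, but `migrate_block` would never be called.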

Want to help?
🤝
Join the conversation in the Ethereum R&D discord (#verkle-trie channel)

Specific things you can do right away:

  1. Implement it in your clients, & try to join the testnets.
  2. Deploy your dapp on one of the testnets and report issues in Discord.
  3. Write some tooling!
  4. Ask a question in Discord :)

What’s up with the elephant?

Elephants have been known to uproot trees :)

This is the latest and most up-to-date talk about Verkle Trees:

EthCC 2023

2  FAQ

Updated: July 24, 2023

Why does Verkle require gas cost changes?

The motivation can be summarized as follows:

There is a strong desire to allow stateless verification of blocks. A client should be able to verify the correctness of any individual block without any extra information except for a small file that any state-holding node can generate, called a witness, that contains the portion of the state accessed by the block along with proofs of correctness. Stateless verification has important benefits including reducing disk-space requirements, allowing semi-light “trust but verify if alerted” clients, and supporting sharding setups where clients frequently jump between shards.

Stateless verification is not viable at present, but the introduction of Verkle trees will make it viable. However, for this to work we need to adjust gas costs so that the gas cost of an operation corresponds to the witness size required for that operation. Operations that require witness sizes include:

  1. Account reads (mostly calls but also EXTCODEHASH and similar operations)
  2. SLOAD and SSTORE
  3. Contract code execution

This EIP introduces a cleaner and simpler gas cost schedule that directly charges for accessing a subtree and accessing an element within the subtree. For (1) and (2), this EIP does not greatly change costs and is largely a simplification, as EIP 2929 already raised gas costs for those operations to a sufficiently high level. For (3), however, it does add substantial gas cost increases, as contract code access is not properly charged for without this EIP.
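One way to picture a schedule that "charges for accessing a subtree and accessing an element within the subtree" is a per-transaction set of already-touched locations, where only the first touch adds to the witness and is therefore charged. The sketch below is an assumption-laden illustration: the class name and the two cost constants are placeholders chosen for the example, so consult the EIP for the actual values and rules.

```python
# Placeholder costs for illustration only; check the EIP for the real schedule.
SUBTREE_ACCESS_COST = 1900
ELEMENT_ACCESS_COST = 200


class AccessWitness:
    """Tracks which subtrees and elements a transaction has already touched."""

    def __init__(self):
        self.subtrees = set()
        self.elements = set()

    def touch(self, subtree: bytes, element: int) -> int:
        """Return the gas charged for accessing one element of one subtree."""
        gas = 0
        if subtree not in self.subtrees:
            # First access to this subtree: it must be added to the witness.
            self.subtrees.add(subtree)
            gas += SUBTREE_ACCESS_COST
        if (subtree, element) not in self.elements:
            # First access to this element within the subtree.
            self.elements.add((subtree, element))
            gas += ELEMENT_ACCESS_COST
        return gas


if __name__ == "__main__":
    witness = AccessWitness()
    print(witness.touch(b"account-1", 0))  # new subtree and new element
    print(witness.touch(b"account-1", 1))  # same subtree, new element
    print(witness.touch(b"account-1", 0))  # already in the witness
```

The design ties gas to marginal witness growth: repeated reads of the same location cost nothing extra in this model because they add no new data to the witness.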

Could these gas cost changes break some fundamental assumptions for dapps/L2s/etc?

That might be the case, yes. See list of “Open Questions” in the Dashboard below. Help is appreciated!

What are stateless clients?

Today, to validate the chain, you need a fully synced node containing all the state. This is because blocks currently do not carry the state required to execute them: generating a witness with the current Merkle Patricia Trie is too costly.

With the Verkle Trees EIP, blocks become self-contained execution units that can be verified without any extra information, in particular without the full state of the chain.

When will Verkle Trees be available on mainnet?

There is currently no defined date, but the target is sometime in 2024.

Regarding the Overlay Tree migration, how many key-values are migrated per block?

This number is still undecided, since it is mostly a trade-off between shortening the overall migration (good!) and raising the hardware requirements for validators following the chain (bad!). See the “Open Questions” section of the Dashboard below.
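The trade-off can be made tangible with simple arithmetic. The sketch below assumes 12-second slots with no missed blocks; the total number of key-values and the per-block figures are purely hypothetical inputs, not estimates from the Verkle team.

```python
SECONDS_PER_SLOT = 12       # assumes one block in every slot
SECONDS_PER_DAY = 86_400


def migration_days(total_key_values: int, per_block: int) -> float:
    """Days needed to migrate everything at a fixed number of key-values per block."""
    blocks = -(-total_key_values // per_block)  # integer ceiling division
    return blocks * SECONDS_PER_SLOT / SECONDS_PER_DAY


if __name__ == "__main__":
    total = 1_000_000_000  # hypothetical state size, for illustration only
    for per_block in (1_000, 10_000, 50_000):
        days = migration_days(total, per_block)
        print(f"{per_block:>6} key-values/block -> {days:6.1f} days")
```

Doubling the per-block number halves the duration, but it also roughly doubles the extra migration work every node must do on top of normal block execution.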

3  Dashboard

Updated: Nov 10, 2023

All progress is shared in the Verkle Implementers Calls, which anyone can join.

Current main tasks

Continue to refine the overall state migration strategy (Merkle to Verkle)
Shadow fork
Verkle Snap sync testing on Kaustinen
Activate proof generation/verification in existing benchmarks to see how it affects performance and the potential number of key-values migrated per block.
Produce test vectors

Upcoming tasks

Include in the spec the overlay tree migration logic (e.g. walking order of the tree, max number of key-values per block, what to do on empty blocks, reorg's impact, etc.).
Prepare new benchmark data closer to the current mainnet state (i.e., ~2023).
Do some analytics on VKT's final shape post-migration.
Update all VKT-related EIPs to make sure they’re up to date with the latest information.

Note: OQ = Open question

Future milestones

More CL and EL clients joining testnet
Additional large contracts deployed on testnet
CL & EL passing spec tests
Preimage strategy:
  • Recording activation, if needed.
  • Distribution (p2p/file/torrent, TBD)
Shadow forks:
  • Testnet shadow forks
  • Mainnet shadow forks

Open questions

Preimage generation and distribution strategy?

The overlay tree migration strategy requires nodes to have the preimages of all the state keys in the MPT, since each entry must be re-keyed into the VKT.
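The reason preimages matter can be shown with a toy re-keying routine: if the old tree only stores hashed keys, a new key derived by a different function cannot be computed from the hash alone. In this sketch both hash functions are stand-ins picked from Python's standard library (real Ethereum uses Keccak-256 for the MPT, and the VKT key derivation is different again), and all function names are hypothetical.

```python
import hashlib


def old_tree_key(address: bytes) -> bytes:
    # Stand-in for the MPT's hashed key; NOT the real Keccak-256.
    return hashlib.sha3_256(address).digest()


def new_tree_key(address: bytes) -> bytes:
    # Stand-in for the VKT key derivation; the real scheme differs.
    return hashlib.blake2b(address, digest_size=32).digest()


def rekey(old_tree: dict, preimages: dict):
    """Move entries to the new key scheme; report hashed keys with no preimage."""
    new_tree, missing = {}, []
    for hashed_key, value in old_tree.items():
        address = preimages.get(hashed_key)
        if address is None:
            # Without the preimage, the new key cannot be derived.
            missing.append(hashed_key)
            continue
        new_tree[new_tree_key(address)] = value
    return new_tree, missing


if __name__ == "__main__":
    alice, bob = b"\x01" * 20, b"\x02" * 20
    old = {old_tree_key(alice): 100, old_tree_key(bob): 200}
    known = {old_tree_key(alice): alice}   # bob's preimage was never recorded
    migrated, missing = rekey(old, known)
    print(len(migrated), "migrated;", len(missing), "stuck without a preimage")
```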

If you want to understand more about the problem, see this document.

What are witness and block size estimates post Verkle EIP?

Shadow forks will give us more up-to-date information. Before then, activating witness/proof generation in our benchmarks should give us a reasonable estimate, so we can get an indirect sense of this sooner than shadow forks allow.

Backward compatibility impact on the ecosystem?
  1. How does this affect L2s?
  2. How does this affect dev tools?
  3. How does this affect most used contracts/dapps?

From Vitalik’s writeup here

The three main backward-compatibility-breaking changes are:

  1. SELFDESTRUCT neutering (see here for a document stating the case for doing this despite the backward compatibility loss)
  2. Gas costs for code chunk access make some applications less economically viable.
  3. Tree structure change makes in-EVM proofs of historical state no longer work

(2) can be mitigated by increasing the gas limit at the same time as implementing this EIP, reducing the risk that applications will no longer work due to transaction gas usage rising above the block gas limit. (3) cannot be mitigated this time, but this proposal could be implemented to make this no longer a concern for any tree structure changes in the future.

Testing

Apart from EL+CL clients joining devnets and shadow forks, it will be helpful to have a set of spec tests that all clients can run to get quick feedback about correctness. This EIP changes many data structures and the gas model, which may produce many edge cases that are hard to test on test networks.

Tests for:

Consensus (e.g block format changes)
Execution
Data structure (Verkle Tree)?
Proofs?
New gas accounting?
Tests for migration phase?

This is just a braindump and requires more thinking on how to organize.

How do reorgs impact clients?

Chain reorgs raise a set of questions that we should explore and maybe document further:

  • How does a deep reorg impact low-powered nodes when catching up to the tip again?
  • Although the answer may be simple, should we document any considerations for reorgs during the migration phase?

Impact on APIs

How do Verkle Trees impact existing APIs (e.g., JSON-RPC)? For example:

  • APIs related to proof generation.
  • Any consideration for the migration phase?

Minimal hardware setup support + final migrated # key-values per block?

The Verkle Trees EIP introduces the following overheads:

  • It uses more complex cryptography, which will use more CPU on EL clients. This is independent of the migration phase or strategy.
  • The overlay migration period has the extra load (IO + CPU) of migrating state from MPT to VKT apart from usual block execution duties.
  • Producing and validating blocks have different CPU overheads, mainly regarding proof generation vs validation.

A natural question is whether this overhead will make some currently viable hardware setups unviable. We should separate this into more precise phases:

  1. During the migration phase:
    1. Creating blocks.
    2. Validating blocks.
  2. After the migration phase:
    1. Creating blocks.
    2. Validating blocks.

For now, validating blocks during and after the migration phase shouldn’t be a big problem for lower hardware setups (e.g., Rock5B). Generating blocks needs more research, especially during the migration phase, where the worst-case scenario is likely to occur. We should remember that the migration phase will have a limited timespan.

Finally, most of our benchmarks today don’t account for the overhead of EIP-4844, which isn’t negligible. We have no historical chain to use in benchmarks, nor have we tried any draft version of clients supporting 4844 with a generated “high load” chain to understand its impact. It will be somewhat hard to be 100% sure about real-world performance on low-power setups until 4844 is close to, or fully deployed on, mainnet.

Verkle Implementers Call

Completed tasks

4  Testnets

Updated: Dec 1, 2023

Kaustinen

https://verkle-gen-devnet-2.ethpandaops.io
  • Contains proofs in blocks
  • (Lighthouse/Lodestar) + Geth combo is live

See the following tutorial to learn how to join the testnet.

5  Resources

Updated: Nov 10, 2023

💬 Join the Eth R&D discord server (#verkle-trie-migration)

🧠 Bringing Verkle into Ethereum involves many changes in the protocol:

  • A new data structure to save the state of the network
  • A new gas accounting model
  • A strategy to migrate the existing state from the MPT to the VKT
  • A new set of cryptographic primitives
  • New fields at the block level

6  Client Implementations

Updated: Nov 10, 2023

Updates shared in Verkle Implementers Calls…

Latest summary here

Cryptography and other

EL Status

Client     | Implementation | Testnet
Besu       | WIP            | 🔧 In progress
Erigon     | WIP            | -
EthJS      | WIP            | -
Geth       | WIP            | ✅ Kaustinen
Nethermind | WIP            | ✅ Kaustinen
Reth       | -              | -

CL Status

Client     | Implementation | Testnet
Lighthouse | WIP            | ✅ Kaustinen
Lodestar   | WIP            | ✅ Kaustinen
Nimbus     | WIP            | 🔧 In progress
Prysm      | -              | -
Teku       | -              | -

💻
Client Updates (old)