Technical
Sep 4, 2024
Objective
This document is a brainstorming exercise that delves into the complexities and challenges encountered in the design and implementation of a compiler to translate Ethereum Virtual Machine (EVM) bytecode into Valida bytecode. Throughout this exploration, it became evident that constructing the EVM to Valida compiler is a more intricate task than initially anticipated. Without relying on the LLVM Valida backend, we found ourselves on a path toward crafting a more nuanced and sophisticated solution, navigating through myriad considerations such as word size disparities, interaction models with external worlds, and opcode implementations among others.
The motivating question of this document is: how can we design and implement a compiler which translates Ethereum Virtual Machine (EVM) bytecode into Valida bytecode? The reason we (i.e., Lita Foundation) find this question interesting is that we believe that Valida can offer a more performant zk- EVM solution, compared to extant solutions. Creating a zk-EVM which directly executes EVM bytecode is not simple or easy, because of the complexity of the EVM bytecode language and its individual instructions. The resulting ZK constraint systems may have hundreds of columns and thousands of constraints. The hypothesis behind the present work is that a more performant solution (in terms of the time and space complexity of proving) is attainable by translating EVM bytecode into Valida bytecode, and executing the resulting Valida bytecode on the Valida zk-EVM.
A possible criticism of this approach is that it modifies the statements being proven. Instead of proving statements of the form “EVM program p executed on some input produces output x,” the statements proven would be of the form “Valida program p′ executed on some input produces output x.” This is not an issue if the verifier can verify that p′ is the compilation of p to Valida by re-running the EVM to Valida compiler on p (or by just being hard-coded to know this fact). Another way to address the criticism is by running the EVM to Valida compiler in the Valida zk-VM, to produce a succinct proof of “p′ is the result of running the EVM to Valida compiler on p.” Then the verifier can succinctly verify both the execution of p′ and the compilation of p to p′.
The EVM to Valida compiler should be written in Rust. There are three reasons for this:
1. This is in order to facilitate compiling the EVM to Valida compiler to Valida in order to use Valida to succinctly prove executions of the EVM to Valida compiler.
2. This is in order to facilitate efficiently writing correct code (as compared to C/C++).
3. This is in order to use one of the first languages we plan to support in the Valida LLVM backend.
The EVM is a model of computation. The following key distinctions make EVM unique as a model of computation:
1. EVM uses a 256-bit word size.
2. EVM uses a Harvard architecture, with separate address spaces for RAM and program ROM. The program ROM is readable and executable.
3. EVM is a stack machine, with a 1024-word stack (separate from RAM) which EVM opcodes implicitly reference.
4. EVM bytecodes interact with the outside world via two state tries: the global state trie and the contract data storage trie. The global state trie is shared by all Ethereum contracts, and each Ethereum contract has its own contract data storage trie.
For comparison:
1. Valida uses a 32-bit word size.
2. Valida uses a Harvard architecture, with separate address spaces for RAM and program ROM. The program ROM is executable but not readable.
3. Valida is a stack machine. Its stack exists in the RAM address space.
4. Valida bytecodes interact with the outside world via a read-only input tape and a write-only output tape.
Valida is not a bad target for EVM compilation. EVM and Valida have certain similarities which would assist with this process; they are both Harvard architecture stack machines where code runs in a hermetic environment with only circumscribed ways of interacting with the outside world. There is some non-trivial work to do to figure out how to compile EVM to Valida, though, because of the impedance mismatches: mainly the different word size, and the different model for interactions with the outside world.
It is possible to emulate 256-bit arithmetic opcodes in Valida, using eight 32-bit words to represent one 256-bit word. This is one of the necessary steps for EVM to Valida compilation. The implementations of 256-bit arithmetic opcodes can either be inlined or separated as subroutines, whatever seems to be more efficient. It would be nice to be able to detect when the full 256-bit range is not used and a smaller word size will suffice, but that optimization is beyond the level of sophistication contemplated in the current design.
One of the challenges to consider is that EVM has a 256-bit address space, whereas Valida’s address space is smaller than 32 bits. From a performance standpoint, the best solution would be to assume that the input EVM bytecode does not use addresses larger than the max address in the Valida address space. For example, loads from RAM could use just the least significant 32 bits of the 256-bit address. This would result in lots of equations between different addresses in the EVM address space. This could be correct for some EVM bytecodes, but not for others.
A more general, less efficient, yet still reasonably efficient solution would be to use a direct mapping between the EVM RAM address space and the Valida RAM address space for a range of small addresses, and then hold the values at larger addresses in a data structure (such as a hash table or a binary search tree) held in a section of Valida RAM disjoint from the part which is mapped directly to the EVM RAM address space. This approach would have the advantage of semantics preservation for all EVM bytecodes which don’t consume more RAM than Valida can make available. It would have the (dis/)advantage of being efficient for small enough addresses, and less efficient yet still correct for larger addresses.
Another challenge to consider is that EVM bytecodes have two state tries which they can interact with, whereas Valida bytecodes only have a read-only input tape and a write-only output tape. To solve this challenge, we need to come up with an EVM ABI (Application Binary Interface) for Valida which defines how interactions with the EVM state tries should be encoded as interactions with the Valida input and output tapes.
Semantically, a state trie is a key-value mapping. State tries support two forms of interaction: reading the value at a key, and writing the value at a key. A naive approach to designing a Valida ABI for this would be to supply the state tries on the input tape, and let the program write a log of state trie changes to the output tape. This na ̈ıve approach will not work, because the state tries in general are much too large to supply them in their entirety.
The ABI style proposed in this design has the following elements:
1. The host code, the code which calls the zk-VM prover, supplies the initial root hash of each state trie as the first things in the input tape.
2. The host code supplies the following data as the next things in the input tape:
(a) the gas price (of executing the transaction)
(b) the coinbase (i.e., address of the current block’s miner)
(c) the network ID (i.e., the chain ID)
(d) the current block timestamp, number, difficulty, gas limit, and base fee
(e) the hashes of the most recent 256 blocks
3. The host code supplies the calldata as the next thing on the input tape.
4. The guest code, i.e. the Valida bytecodes, copy the initial state trie root hashes from the input tape as the first things in the output tape, followed by the gas price, the coinbase, and the hashes of the most recent 256 blocks (or whatever subset thereof are relevant).
5. The host code runs the EVM bytecodes in a non-ZK EVM in order to obtain a trace which specifies all of the read state trie and write state trie operations and their results, as well as the final state trie hashes which are obtained.
6. The host code writes to the input tape some claims about the contents of the state tries at the keys which the program queries. Along with these claims, the host code writes opening proofs which prove that the state tries do in fact contain those key/value pairs.
7. At the point of each write to a state trie in the trace, the host code writes the new state trie root hash to the input tape.
8. The guest code verifies the opening proofs sent by the host code on the input tape.
9. The guest code writes to the output tape all inputs to LOGi EVM opcodes which it executes.
10. The guest code writes to the output tape all calls to external contracts which it makes.
11. The host code writes to the input tape all calls to external contracts which the guest code makes, and the resulting state trie root hashes.
12. The host code writes to the input tape the results of all invocations of the RETURNDATASIZE and RETURNDATACOPY opcodes. The guest code copies all results of invocations of these opcodes from the input tape to the output tape.
13. If the source guest code’s execution trace includes an invocation of SELFDESTRUCT, then the compiled guest code writes a message unambiguously to the effect that it has done so it has done so to the output tape.
14. The guest code writes to the output tape the final root hashes for each of the two state tries, by copying the last root hash for each state trie supplied on its input tape by the host code.
This ABI style is designed to prove statements of the form “program p, when run over the state tries with root hashes (a0,b0), gas price p, coinbase c, network id N, block timestamp t, block number i, block difficulty d, gas limit l, base fee b, recent block hashes h0,...,h255, and some calldata, results in state tries with root hashes (a1,b1) and logging outputs L.”
Another challenge to consider is that EVM contains the CODECOPY opcode, which can be used to copy bytes from the program ROM into the RAM. To emulate this opcode in Valida, we can add a preamble to compiled EVM programs which use this opcode. The preamble would write all of the EVM bytecode (the source program) to a fixed location in RAM.
A further challenge to consider is that EVM contains the EXTCODESIZE and EXTCODECOPY opcodes, which can be used to check the size of or copy bytes from a program ROM for a different contract than the one currently executing. To emulate these opcodes in Valida, we can recognize that they are examples of reading from the global state trie, since the global state trie contains the hashes of all contracts, and the host code can provide opening proofs for these hashes, by exhibiting the source data that was used to create them.
A further challenge to consider is that EVM contains the GAS instruction, which outputs the current remaining gas limit. For programs which contain this instruction, we can initialize a global variable containing the current remaining gas limit, initially to the gas limit, and we can follow the compilation of each opcode with an instruction or sequence of instructions which decrements the remaining gas limit by the gas cost of the opcode.
Another challenge to consider is the PC instruction, which outputs the current value of the program counter. This is relative to the location of the program counter in the EVM source bytecode, not the target Valida bytecode. Thus, we cannot get the answer we need by reading the value of the ‘pc‘ register. What makes this instruction readily implementable is the fact that PC is a compile-time constant, or else it’s equal to a compile-time constant plus the address of the first instruction in ROM (if the address of the first instruction in ROM might not be equal to zero). If the address of the first instruction in ROM might not be zero, then I suppose we can supply the address of the first instruction on the input tape.
To implement the REVERT opcode, the EVM to Valida compiler can simply write the final state root hashes to the output tape, indicating the initial state root hashes as the final state root hashes, and then terminate.
Another set of challenges to consider is how to deal with the opcodes related to external contracts: CREATE, CALL, RETURNDATASIZE, RETURNDATACOPY, RETURN, DELEGATECALL, CREATE2, and STATICCALL. These have to do with creating and calling other Ethereum contracts. When calling another contract, it has its own RAM and ROM address spaces, and it can add its own calls to the call stack, each of which has its own RAM and ROM address spaces.
CREATE adds a new contract to the global state. This opcode can be implemented in the compiler by considering that this is an instance of writing to the global state trie. In the ABI, the host code writes the resulting global state trie root has to the input tape, and the guest code writes the new element(s) of the global state trie to the output tape. CREATE2 is similar and can be implemented similarly.
The remaining opcodes related to external contracts have to do with calling other Ethereum contracts. It seems simplest to handle these opcodes by externalizing calls to other contracts. In the ABI, the host code writes to the input tape the calls, and the results of the calls, that the guest code makes. The guest code reads the input tape and checks that the calls it wants to make are recorded there where it expects them to be. It reads the results of the calls. The guest code writes the same data that it reads about calls to the output tape, so that this data becomes part of the statement being proven. The data about the calls includes the resulting state tries after each call.
In the presence of external calls, the statement being proven does not assert the correctness of a state transition; instead, it conditionally asserts the correctness of a state transition, conditional on the correctness of the state transitions claimed for the external calls. In order to succinctly prove the correctness of a state transition unconditionally in the presence of external calls, a proof aggregator can combine execution proofs relating to the external calls, with conditional execution proofs relating to the contexts in which those calls occur.