ERC20-K: Formal Executable Specification of ERC20
The ERC20 standard is one of the most important standards for the implementation of tokens within Ethereum smart contracts. ERC20 provides basic functionality to transfer tokens and to be approved so they can be spent by another on-chain third party. Unfortunately, ERC20 leaves several corner cases unspecified, which makes it less than ideal to use in the formal verification of token implementations. Indeed, we at RV, Inc., have been asked to verify smart contracts for ERC20 compliance. However, we found that it is unclear what ERC20 compliance means, because the existing presentations of ERC20 are far from serving as mathematical models of the standard token. Consequently, we decided to create ERC20-K, a mathematically rigorous formalization of ERC20, making sure that all corner cases are thought through, explicitly covered, and thoroughly tested. From here on, when we claim that we prove implementations of ERC20 tokens correct, we mean that they provably satisfy the 13 rules of ERC20-K.
We have chosen K as a semantic framework for our formalization effort for several reasons.
First, K is executable. So the defined semantic models can be tested to find early errors in their design and to accumulate confidence in their adequacy.
Second, K provides a suite of formal analysis tools that can be used in combination with any semantics:
Thus, we can prove properties about the ERC20 specification itself, such as "the total supply is not changed by applications of ERC20 operations" on the one hand, and also prove properties about smart contract implementations of ERC20 tokens.
Third, K is modular. This allows us to compositionally define semantics for execution environments or programming languages on top of ERC20-K, giving the feel of "plug-and-play ERC20 into your favorite language". Moreover, once done, we can then verify complex programs in the resulting languages.
Fourth, K is under active development and we have control over its design and implementation, so it can quickly respond to future needs of ERC20-K. For example, we plan to incorporate ERC20-K within the IELE VM, and implement a fast LLVM backend for K; if successful, this will eliminate the need for adhoc implementations of VMs and ERC20 tokens on top of them. The same can be done with the EVM through its KEVM semantics, provided an injection morphism of ERC20-K configurations (based on semantics cells) into EVM configurations (based on partial maps) is adequately defined.
Highlights of ERC20-K
ERC20-K is parametric in addresses, values, and maxvalue. This way it can be easily instantiated to various target execution environments besides EVM, such as eWASM or IELE, and to various programming languages besides Solidity, such as Viper or Plutus.
ERC20-K is parametric in the execution environment. It can be plugged-and-played within any execution environment which has a formal semantics, such as a virtual machine or a programming language. The ERC20 functions are simply added to the target VM/language's syntax, and the ERC20 semantics smoothly composes with the VM/language's semantics, yielding a variant of the VM/language with ERC20 support. We have done that with a simple imperative language, IMP. The video below shows an interactive program in IMP that calls the ERC20 functions. This can be easily adapted to language implementations, where ERC20 can be regarded as a library.
ERC20-K covers all the corner cases including, explicitly, the cases where exceptions are thrown. For example, the
transfer(To,Value) function requires 4 cases, depending on whether the caller is different from
To or not, and on whether the transfer succeeds or it throws.
ERC20-K is executable. This not only allows us to elegantly integrate it with programming languages (item 2 above), but also allows us to test it. Indeed, like code, a formal specification can also have errors, especially when there is no other formal specification to formally compare with. Unlike code, which can be verified against a specification, there is nothing we can do to prove a formal specification correct: since it serves as a definition, it is correct by definition. The best we can do is to accumulate confidence in its correctness, by various means. One of the most tangible means is to test it. To this end, we provide more than 60 unit tests for our ERC20 semantics; these can be easily translated to tests for implementations of ERC20. It is also possible to translate and run test suites already deployed for Ethereum tokens on ERC20-K, building confidence that our semantics accurately reflects popular deployed tokens.
Technical Details and Download
ERC20-K is thoroughly commented and freely available under the UIUC license (as permissive as the MIT license) on Github:
Comments and contributions are highly appreciated.
In the spirit of open source, community-driven development, we will be holding all ERC20-K formal specification discussions on our channels
We encourage any interested parties to engage us, ask questions, contribute code, or build experience with our tools. We are also always looking for contributors able to work on documentation, efficient install/quickstart process for new developers, and more examples and tests. We are hiring, and will be sure to keep an eye open for helpful contributors!
We will also be posting updates on our brand new Twitter page @rv_inc, which we hope any interested developers will follow and interact with.
Let's build more secure smart contracts for everybody, together!
We thank the K team who defined the KEVM semantics (see technical report, too) and verified smart contracts for ERC20 compliance. It is their effort that inevitably led to the quest for a formal specification of ERC20. In particular, we would like to thank Philip Daian for brainstorming and help with understanding the corner cases of ERC20.
We also warmly thank IOHK not only for their generous funding support of both KEVM and IELE, but also for the stimulating technical discussions we had with their research team. Discussions about IELE and about the planned separation between the settlement and the computational layers in Cardano, in particular, led to the question of whether the ERC20 specification can be defined in a more abstract way that makes it usable in combination with computational layers different from EVM.