New Technologies for the Blockchain: IELE (virtual machine) and K (universal language framework)

IELE and K Team Runtime Verification has been recently awarded a research and development contract by IOHK to design a next generation virtual machine and a universal language framework to be used as core infrastructure for future blockchain technologies. The formal analysis and verification technology employed in this project has been developed and improved over more than 15 years of research and development, both in the Formal System Laboratory (FSL) at the University of Illinois at Urbana-Champaign and at Runtime Verification, with generous funding from organisations including NSF, NASA, DARPA, NSA, Boeing, Microsoft, Toyota, and Denso. It is about time that aircraft grade, software analysis technology used for mission critical software gets deployed to smart contracts, the blockchain and cryptocurrencies. The project will be executed by a team of Runtime Verification experts led by Prof. Rosu, who will work closely with students at the University of Illinois, also funded by IOHK, and with IOHK R&D personnel. IELE and K Team Photo, left to right: Daejun Park (PhD student at UIUC, RV intern); Theodoros Kasampalis (PhD student at UIUC, RV intern); Yi Zhang (PhD student at UIUC, RV intern); Traian Serbanuta (RV; screen, left bottom); Grigore Rosu (RV and UIUC; screen, center, taking the picture); Virgil Serbanuta (RV; screen, right bottom); David Young (RV); Brandon Moore (RV); Yiyi Wang (RV); Dwight Guth (RV). Also Chris Hathhorn (RV), who missed picture.

IELE – A Register-Based Virtual Machine (VM) for the Blockchain

Based on learnings from defining KEVM, our semantics of EVM in K, we will design and define a new VM, which we call IELE (after the Mythological Iele). Unlike the EVM, which is a stack-based machine, IELE will be a register-based machine, like LLVM. It will have an unbounded number of registers and will also support unbounded integers. There are some tricky but manageable aspects with respect to gas calculation, a critical part of the design. Here are the forces that will drive the design of IELE:

  1. To serve as a uniform, lower-level platform for translating and executing smart contracts from higher-level languages, which can also interact with each other by means of an ABI (Application Binary Interface). The ABI will be a core element of IELE, and not just a convention on top of it. Also, unbounded integers and unbounded number of registers will make compilation from higher-level languages more straightforward and elegant and, looking at the success of LLVM, more efficient in the long term. Indeed, many of the LLVM optimizations are expected to carry over. For that reason, IELE will follow the design choices and representation of LLVM as much as possible. The team also includes LLVM experts from the University of Illinois (where LLVM was created).
  2. To provide a uniform gas model, across all languages. The general design philosophy of gas calculation in IELE is “no limitations, but pay for what you consume”. For example, the more registers a IELE program uses, the more gas it consumes. Or the larger the numbers computed at runtime, the more gas it consumes. The more memory it uses, in terms of both locations and size of data stored at locations, the more gas it consumes. And so on.
  3. To make it easier to write secure smart contracts. This includes writing requirements specifications that smart contracts must obey as well as making it easier to develop automated techniques that mathematically verify / prove smart contracts correct wrt to such specifications. For example, pushing a possibly computed number on the stack and then jumping to it regarded as an address makes verification hard, and thus security weaker, with current smart contract paradigms. We will have actual labels in IELE, like in LLVM, and structured jumps to those labels. Also, avoiding the use of a bounded stack and not having to worry about stack or arithmetic overflow will make specification and verification of smart contracts significantly easier.

Like KEVM, the formal semantics of EVM that we previously defined, validated and evaluated using the K framework, the design of IELE will also be done in a semantics-based style, using K. Together with a fast (LLVM-based) execution backend for K, it is expected that the interpreter obtained automatically from the semantics of IELE will be sufficiently efficient to serve as a reference implementation of IELE.

K as a Universal Language for the Blockchain

Besides EVM, several languages have been given a complete formal semantics in K, including C, Java, and JavaScript. Several others will be given K semantics as part of this project, including IELE itself, Solidity and Plutus. We want to allow all these languages, and possibly more as they are given K semantics, to be used for writing smart contracts. For that, we need to enhance the capabilities and implementation of the K platform itself, which will increase the practicality of all new and existing K semantics. We will implement several performance improvements not yet taken beyond proofs of concept, and develop production-quality implementations of analysis features such as the K symbolic execution engine, the semantics-based compiler (SBC), and the program verifier, which currently have only prototype-quality implementations in the academic K project.

Faster Execution

We plan to develop a concrete execution backend to K that will be at least one order of magnitude faster than the current one. The current one is based on translation to OCaml; we plan on translating to LLVM and specializing the pattern matcher to the specific patterns that occur in semantics. We believe it will be possible to execute programs against our KEVM semantics as efficiently as the reference C++ EVM implementation (!). If that will indeed be the case, and we strongly believe it will, then this will mark an unprecedented moment in the history of programming languages, when a language implementation automatically derived from a formal semantics of the language can serve as a realistic implementation of that language. While this was proved as a concept with toy languages, it has never been proven to work with real languages in practice. The K technology has reached a point where this is possible now. And not only to execute programs, or smart contracts, but also to reason about them, because a formal semantics, unlike an interpreter, can also be used for formal verification.

Semantics-Based Compilation

One of the most challenging components of the K framework that will be built in this project is what we call semantics-based compilation. The following picture shows how SBC works:

5104335

We have implemented a rough prototype and were able to make it work with a simple imperative language, which we call IMP. Here is an example:

5104335

The program to the left is transformed, using the semantics, into a much simpler program that looks like an abstract machine. The four states represent the “instructions” of the new language L’, and the edges are the new semantic rules of L’. As seen, the semantics of the various sequences of instructions has been symbolically summarized, so that the amount of computation that needs to be done at runtime is minimized and everything that can be done statically is hardwired in the new semantics of L’, so all done before the program is executed. Preliminary experiments are encouraging, confirming our strong belief that the resulting SBC programs will execute one order of magnitude, or more, faster:

5104335

Big Picture and Vision

These improvements to the K framework will not only yield a reasonably efficient prototype of executing smart contracts on IELE, but, more importantly, will give us an approach to write smart contracts in any programming languages that have a formal semantics in K.

Many of the new technologies and tools that will be developed in this project, such as program verifiers and semantics-based compilers, will be useful for any blockchain-based smart contract platform, including Ethereum VM and projects still in development, like Cardano and Tezos.

As a company, we are currently undertaking a number of efforts across the smart contract space. We are also defining the formal semantics of Viper in K, and will soon be announcing smart contract verification services for high-assurance contracts.

Get involved

In the spirit of open source, community-driven development, we will be holding project discussions on our new K Framework Riot channels at #k:matrix.org.

All software is released under the MIT-like UIUC license, and will be totally free for the community to use as they see fit. We welcome contributions on Git.

We encourage any interested parties to engage us, ask questions, contribute code, or build experience with our tools. We are also always looking for contributors able to work on documentation, efficient install/quickstart process for new developers, and more proof examples. We are hiring, and will be sure to keep an eye open for helpful contributors!

We will also be posting updates on our brand new Twitter page @rv_inc, which we hope any interested developers will follow and interact with.

Let's build a better technology for more secure smart contracts for everybody, together!

IOHK - The Project Funder

IOHK is a research and engineering company committed to using peer-to-peer innovations to provide financial services to the three billion people who don’t have them. IOHK builds cryptocurrencies and blockchains for academic institutions, government entities and corporations. IOHK focuses on practical, peer reviewed research to create live protocols, and the technological underpinnings to next-generation cryptocurrencies.

Charles Hoskinson, CEO of IOHK, said:

I’m honored and humbled to work with Grigore and his team at Runtime Verification on bringing both the K Framework and semantics based compilation into the cryptocurrency space. These innovations alongside Cardano’s new virtual machine – IELE – will eventually enable developers to write smart contracts in the programming language of their choice and with the tools they have come to know and love. Furthermore, we are extremely excited to explore formal verification of smart contracts with the RV team in order to ensure higher security, performance and reliability as smart contracts begin to pivot from experiments to enterprise and government adoption.

See also the Paymentweek Press Release.

10 thoughts on “New Technologies for the Blockchain: IELE (virtual machine) and K (universal language framework)

  1. You mention here (and in other places) the "reference C++ EVM implementation". I don't know which version of the code you refer to.

    The original version was a straightforward but inefficient "switch-on-opcode" implementation with an inefficient "gasometer" off to the side. Some other versions copied it, so in that sense it served as a reference. You might be able to approach its level of performance. The Yellow Paper remains authoritative, and I use it when I have questions about the intended semantics of the code I am crafting.

    The current version is more difficult to understand, because I put a lot of work into making it a lot more more efficient. I doubt you can approach its performance. With more work its speed could probably be doubled or more. And the more the efficiency of the C++ interpreter is improved the less useful it becomes as a reference. So we have a C++ implementation that is accessible to experts in C++ and optimized interpreter implementation. And a K specification that is incomprehensible to me, and probably to most other mere programmers. And a Yellow Paper that is difficult, but comprehensible to anyone who can read English and basic logical and mathematical notation. This is not a good situation. Or maybe I'm just old and in the way.
    -- Greg Colvin -- [email protected]

    • Thank you for your comment, Greg. You are not old and not in the way, you a major player in this field and a source of inspiration for all of us.

      We were referring to the C++ implementation of the EVM that we compared KEVM extensively with and reported in our technical report at https://www.ideals.illinois.edu/handle/2142/97207. which is whatever was at https://github.com/ethereum/cpp-ethereum a couple of months ago. When did you release the latest and fastest C++ implementation? We'd love to consider it as well in our future experiments. Note that we are also actively working on improving the performance of the K execution backend, through a translation to LLVM.

      We put our K specification of the EVM into a more human readable format at https://jellopaper.org/. Maybe this is easier to grasp? Probably what stays in the way is K itself, which is not like any usual programming language, because its purpose is to define a mathematical model of the language that can be used to formally verify programs in addition to executing them. But once you pass the K barrier, the EVM semantics itself is rather trivial; at lease when compared to other language semantics defined in K (like C11, Java 1.4, JavaScript ES5, etc.). The Jellopaper has a link at the top to a short overview video introducing the K notation; maybe that help.

      All the best,
      Grigore

  2. Hello Grigore.

    An exact date and time or Github commit hash would be needed to identify a version, but I haven't done anything to the code since October. If you can generate code that matches its performance I will be amazed and gratified. Further speed improvements would require things like moving the gas calculations to the branches, recognizing sequences of operations where native registers can be used instead of wide-integer libraries, and other such. Also, and Pawel tells me this is underway, using a better wide-integer library. All of this will further obscure the simple logic of the machine. The code stopped being simple sometime before Devcon2. So I guess I prefer it not be referred to as a reference implementation. The Yellow Paper is the spec.

    I feel old and in the way in the sense that I was taught to program in Fortran and several assembly languages, and like many of my peers have been producing extremely reliable software in insecure languages without the aid of formal methods ever since. And yet with great anxiety that somewhere fatal errors still lurk. "Testing can only demonstrate the presence of errors, not their absence." So I have great hopes for formal methods and am discouraged that I cannot understand the specification of a machine I know so well. I probably need time to work with someone who can teach me. And to get past (sorry) the sheer ugliness and redundancy of the syntax 😉

    • Greg, my colleagues doing the actual experiments will answer the questions about the exact version of the C++ implementation they are using. Our plan with the KEVM is to turn it into a reference semantics and model for the EVM. So it is beneficial for all of us if the C++ implementation goes for speed and worries less about human readability. That being said, we do have to improve the readability of our K EVM semantics, and we will. While in general we can always safely make an argument that an automatically generated interpreter can never beat a hand-crafted one implemented in C++, in the case of the EVM there is some hope to get close, because we may get to a point where a large portion of the execution time is spent in cryptographic functions, which we do not formalize.

      • To the extent I can follow them these sorts of formalizations look like implementations of the EVM in a very-high-level declarative language that in principle could be compiled all the way down to machine code. If so, we humans might not stand a chance 😉

  3. Also, the final word "syntax" above was humorously surrounded by opening and closing XML-style angle brackets enclosing the letter K. I don't know where they went, or how to prevent them being swallowed. This isn't the first place this has happened to me. I'm sure I'm not the first and won't be the last to wonder why almost any other syntax wasn't chosen. But as Bjarne Stroustrup says, there two kinds of programming languages, those people complain about, and those nobody uses. 😉

    All the best to you and your team too,
    Greg

  4. I hear you, but syntax matters. Programmers fight religious wars over syntax. To the extent that papers in formal semantics use familiar logical or programming notations backed up by English I can usually follow them. K is difficult that way. I understand that there is a tension there with a machine-readable syntax. But the sheer amount of money riding on such small programs creates a need and an opportunity for formal methods that this old man would like to take advantage of.

Leave a Reply

Your email address will not be published. Required fields are marked *