Formal Verification 101 for Blockchain Systems and Smart Contracts: Formalizing Requirements
In this second part of our four part series, we will discuss the process of formalizing system requirements and how it fits into the larger context of formal verification for blockchain systems and smart contracts.
Recall that formal verification is all about knowing whether our system implementation (e.g., blockchain system/smart contract), satisfies our system requirements.
Today’s article is about the process of converting our requirements document into an equivalent formal, mathematical requirements specification.
The Long Road from English to Logic
English is the lingua franca of modern business. For that reason, unsurprisingly, we often write our requirements as simple English text documents. Unfortunately, for formal verification purposes, English language documents are far too imprecise and typically gloss over important implementation details. For this reason (as we will see later), we typically want a formal verification expert to help us rewrite our English language requirements document into a formal and mathematical requirements specification. For example, suppose we are designing a smart contract that implements an open bid auction. We might write the requirements for an auction contract as follows:
English Language Requirements
Each auction has a host, an active period, a minimum bid, an optional highest bidder (defaulting to none), and an unbounded number of bidders. While the auction is active, bidders may place bids in sequence. If the bid is greater than the minimum bid, then it is recorded as the new minimum bid and the bidder is simultaneously recorded as the highest bidder. Otherwise, the bid is ignored. When the auction ends, an amount equal to the highest bidder’s bid is deducted from the bidder’s account and transferred to the auction host, unless there were no bidders.
So, now we want to convert that English description into a formal mathematical description. To do that, we will choose a mathematical logic that is appropriate to express our desired properties. Just like there are different languages like English; French; Chinese; etc…, there are different mathematical logics that we can use. Some common choices include: equational logic, first-order logic, set theory, higher-order logic/type theory, rewriting logic, or reachability logic. How do we know which one to use? As it turns out, the logics used in (a) formalizing our requirements, (b) our theorem prover, and (c) our formal semantics all need to fit together, so how we make this choice for one part of our process will affect the others.
For our purposes we will use reachability logic (this is also the logic used by the K Framework) since it both (1) allows us to very naturally formalize requirements for systems that evolve and (2) has great tool support. So that means we will describe our system as a reachability logic theory. Reachability logic theories have (a) syntax and configuration that describes our system state; and (b) reachability rules that describe our system requirements. Let’s invent some syntax and a configuration now to define an auction contract state. To do this, we will use the syntax schema provided by K language.
Auction State (Syntax/Configuration)
syntax Address ::= (Letter | Number)* syntax MaybeAddress ::= Address | “None” configuration <host> Address </host> <start> Number </start> <end> Number </end> <highestBidder> Address </highestBidder> <minBid> Number </minBid>
With this syntax and configuration, we can write concrete auction states like:
<host> ab2r3f <host> <start> 10 </start> <end> 20 </end> <highestBidder> None </highestBidder> <minBid> 15 </minBid>
The meaning of the configuration hopefully is clear: the auction starts at time unit 10, ends at time unit 20, is hosted by the the owner of the account with address ab2r3f, with no viable bids yet, and a minimum bid of 15 tokens. Now, this simple syntactic description already has the potential to help us spot problems---in particular, it helps us see whether our system state description is incomplete. In fact, our auction configuration is missing one crucial element: the current time! This omission makes sense, since time was never directly mentioned in our original contract description and only indirectly referenced. So, a revised auction contract state configuration will look like:
configuration <host> Address </host> <start> Number </start> <end> Number </end> <currentTime> Number </currentTime> <highestBidder> Address </highestBidder> <minBid> Number </minBid>
Now we know when the auction is active. For example, the auction configuration:
<host> ab2r3f <host> <start> 10 </start> <end> 20 </end> <currentTime> 5 </currentTime> <highestBidder> None </highestBidder> <minBid> 15 </minBid>
describes an auction that has not started yet. There are other issues we might clarify as well, e.g., a Number must be zero or positive, since we don’t want negative money or time.
Auction Contract Requirements (Rules)
Reachability logic rules have the form:
rule Configuration1 => Configuration2 requires A ensures B
This can be read as: a system in states described by Configuration1 that satisfy condition A must evolve into systems described by Configuration2 that ensure condition B. As a convenience, we will assume that parts of the configuration that are not mentioned do not change. Now let’s see some example rules.
Immediately, a simple rule comes to mind: for any duration of time, we need (a) the duration to be static, i.e. unchanging; and (b) the start to come before the end. This can be described by a rule of the form:
rule <start> S </start> <end> E </end> => <start> S </start> <end> E </end> requires S < E
Requirement (a) is covered by the fact that the start and end time are unchanged by this rule; requirement (b) is covered by the requires clause.
Monotonicity of Time and Bid Values
In all but the most exotic of physics studies, time always moves forward. Let’s encode that requirement into a rule:
rule <curTime> C </curTime> => <curTime> C’ </curTime> requires C <= C’
This rule is saying that current time C must evolve into some (possibly different) time C’ such that C is less than or equal to C’.
We have an almost identical rule that states the minimum bid must always increase in value or stay the same (in case no new bid is made):
rule <minBid> M </minBid> => <minBid> M’ </minBid> requires M <= M’
The finality rule states that the auction state must not change after the auction has ended. We can write that as the rule:
rule <end> E </end> <curTime> C </curTime> <highestBidder> B </highestBidder> <minBid> M </minBid> => <end> E </end> <curTime> C’ </curTime> <highestBidder> B </highestBidder> <minBid> M </minBid> requires E <= C
This rule exactly encodes our requirements.
Pulling Back the Curtain
From the above description, one might be lulled into thinking that smart contract verification isn’t really that hard after all. Unfortunately, we have hidden much of the complexity of the verification problem for the purposes of simplifying our explanation. Let us now pull back the curtain to see the complexity of verifying real smart contracts. In practice:
- Our system syntax/configuration actually describes an entire blockchain system, e.g., our syntax and configuration would be that of Ethereum virtual machine (EVM) bytecode;
- Our rules would talk about how a particular EVM bytecode auction smart contract evolves. That contract would involve variables for our state components we mentioned above, i.e., it will have a variable host for the auction host which is an address, a variable start for the auction start time, etc…
The upshot of (1)-(2) is that, in practice, the source code programs we reason about (whether smart contracts or blockchain system code) only indirectly describe the system we want to design, i.e., we do not get to design whole new languages from scratch to specifically describe our problem. This means the mapping from our formalized requirements must talk about concepts totally unrelated to our overall desired system behavior, e.g., EVM details that have absolutely nothing to do with auctions like EVM instructions and memory locations.
We have seen an overview of how English language requirements are mapped into formalized mathematical requirements using K with an eye towards verifying blockchain systems and smart contracts. Furthermore, we see that the formal verification problem goes deep. It requires someone who is simultaneously (i) a logic expert (e.g. reachability logic) and; (ii) a systems expert (e.g. Ethereum). Fortunately, many logical tools (e.g. the K Framework) and textbooks exist to help systems experts become more proficient in the mathematical logic skills needed to tackle formal verification.
Is your company or organization using formal verification to accelerate your development processes and mitigate potential design risks? Do you need a trusted partner to help you untangle the issues surrounding your blockchain system design, smart contract language, or smart contract implementation? If so, we’d love to help.
Please contact us to get in touch!