Formal Verification Lore
Intuitive Intro to Why We Can Prove Programs Correct

Posted on April 2nd, 2024 by Juan Conejero

fv lore.png

This article aims to provide context to formal verification, its different approaches, and why it can provide ultimate assurance of code correctness. Throughout the article, we provide insights into the field of formal verification and some of its components rather than explain how to use a specific tool. These insights should make it clearer to the newcomer to the field why we can claim mathematical rigorousness when we perform formal verification.

Intuitively, formal verification can be described as: you specify what your program should do, and then prove that your program behaves as specified. But instead of explaining what specifying something is and then what proving that your program adheres to said specification means, let’s take one step back. What is a proof? What does it mean that something is “proven”? Why does it mean anything at all? These are by no means trivial questions, and, in fact, at the turn of the 20th century, the entire mathematical field was submerged in a crisis trying to come up with satisfying answers to these questions. To understand how we arrived at this crisis, and hence to understand what a proof is, we first need to know (by no means in exhaustive terms) the history of mathematics.

A Brief History of Mathematics 1

The concept of mathematics has been far from static throughout history. In the beginning (i.e., in Plato), mathematics was thought to be one of the most elevated sources of knowledge that we sensible humans could aspire to acquire2, offering the truth about a more perfect realm than the one we live in. Science, having to rely on faulty sensory organs to acquire its knowledge, was regarded as mere opinion. However, this was not the case regarding the relationship between mathematics and science at the time of the scientific revolution (a giant leap forward in time).

In science, parting from the Aristotelian goal of explaining the underlying principles that govern the world, Galileo introduced the methodological shift of describing how the world behaves. Thus, while eliminating the need to find the ultimate cause of such behaviors, mathematics was established as the ultimate language in which the Book of Nature was written. Using the same conceptual methodology as Galileo to study freefall, Newton studied gravity, seeking a mathematical description with no concern for its root cause. However, this mathematical description is by no means what we would intuitively think of today. Rather than laying out the foundations of the mathematical theory of calculus and, from there, using general mathematical methods to describe the desired natural phenomena, Newton resolved each problem individually. He didn’t devise a general mathematical theory that would seek to address all individual cases. In fact, after Newton’s “Principia” from 16873, it took more than 50 years until Euler published Newton’s laws of motion (which includes the modern formulation of the famous F=ma). The mathematics used by Newton in the Principia is more akin to synthetic geometry4. Modern calculus comes from the Leibnizian tradition, which uses a familiar algebraic and symbolic approach.

The algebraic and symbolic approach prevailed throughout the field of mathematics during the 18th century, but the goal remained the same: use mathematics to understand the Book of Nature. Researching new mathematical areas made sense if they could be applied to understanding the world around us.

At the same time, for mathematical proofs, we have to ask, “What was the source of mathematical truth?" Since mathematics was the language in which the world was written and could be described, mathematical truth was derived from the world it described. But there was also rigor, not just observations. However, mathematical rigor was not what we hold today with such high standards. Inference based on intuitive principles, such as the Principle of Permanence or the Generality of Algebra, were not out of place. Neither was filling proof gaps with empirical observations.

In that sense, differential equations that describe heat dissipation could be justified by observing how their formulation matched against empirical observations. In order to derive conclusions from such equations, one could have arguments at their disposal that we no longer deem valid. Such was the state of things during the 18th century.

Enter the 19th century. The century in which mathematics gained independence from any worldly business. With increasingly abstract and complex subjects, around the mid-century, it was well established that mathematics could be concerned with matters that had nothing to do with the physical world; being useful for understanding the Book of Nature was no longer required for a mathematical theory to exist.

Moreover, Euclidean geometry, thought to be the true description of how physical space works for more than a thousand years, had to contend with the rise of non-Euclidean geometries, which posed a different description of reality and were mathematically consistent. The state-of-the-art and true description of space became just one more theory within the ever-growing mathematical body of knowledge.

However, in the early 19th century there was still the assumption that the physical world could be faithfully described with mathematics. Sure, mathematicians can grow their abstract garden, but the world is mathematical in nature, and as such, it can be described as-is with mathematics. This view of mathematics also saw its end during the 19th century. Abstract mathematics with no physical representation started to yield better descriptions of the physical world than mathematics with more straightforward ties to what is observable5. Thus, the link between mathematics and reality was broken. Mathematical objects, structures, and theories became means to approximate or conceptualize how the actual physical world might be.

By the end of the 19th century, myriad branches of pure mathematics had been created, but there was no universal consensus on which methods were appropriate for creating new mathematical knowledge. In the old days, nature acted as a sort of foundation for mathematical knowledge, but there was no replacement for it yet. The pragmatism and intuition that were present throughout mathematics emphasized the need for rigor and explicitness.

This emphasis was not just a “nice to have.” There were inadequacies in proofs and a lack of understanding of the fundamental concepts and foundation that gave consistency to all these new mathematics. Closely inspecting the state of mathematics gave rise to paradoxes or inconsistencies that had to be solved. Two main fronts needed to be addressed: understanding the primitive notions each mathematical discipline built upon, and rigorously defining what constituted correct mathematical reasoning.

To make a very long story short, the goal was to devise a good enough axiomatic system to virtually formalize all of mathematics. But what is an axiomatic system, and what does it actually mean to formalize something?

Axiomatic Systems & Formalizing Stuff

An axiomatic system has two key components: a rigorous reasoning engine and a set of mathematical assumptions.

The rigorous reasoning engine is an unambiguous description of what it means for an expression to be true. If we want to know if 1 + 1 = 2 is true, there are two items that need to be resolved from the reasoning engine’s perspective for that statement to actually mean something. The first is if “1 + 1 = 2” constitutes a valid syntactical statement. To know if it’s a valid statement, we need to know the following:

  • What exactly is a number? When we say “1”, what is the exact mathematical object that 1 is? The mere concept of 1 or what it intuitively means is no longer valid.
  • What is this “+” symbol, and how can we use it? Is “1 + 2 +” a valid thing to say?
  • Likewise, what is “=”? We have an intuition of what “equal” is supposed to mean, but we need to lay down the rules for using it. Why is “1 = + 2” a valid or invalid statement? But what are the actual underlying rules that make it unequivocally clear when two things are equal?

It is important to know that when answering these questions, we are not looking for universal answers. That is, we’re not looking for the “true nature” of “1” or “+” or “=”. We are only looking for a rigorous definition of these symbols that leaves no room for interpretation or ambiguity.

Once this step is done, we should know what constitutes a valid expression and what doesn’t. That is, we have an argument that explains why “1 + 1 = 2” is a syntactically correct expression. But we still need to know if “1 + 1 = 2” is true or false.

First, without adding any mathematical assumptions, we must specify which deductive arguments are valid. Arguments outside of those specified will not be considered valid. A usual valid reasoning argument is modus ponens: if A is valid and we know that “A implies B” is also valid, we can conclude that B is valid.

Once we know what we accept as valid reasoning steps, we can add our list of mathematical assumptions. To exemplify how to formalize that one plus one equals two, we’ll use Robinson arithmetic 6.

Robinson arithmetic uses first-order logic with equality as a reasoning engine. It is not necessary to fully understand it for the purpose of this example.

The mathematical language of Robinson arithmetic defines the set of valid expressions N as follows:

  • The symbol 0: represents the concept of 0 and is a valid expression; hence, it belongs to N.
  • Applying the unary function S to a valid expression: if n is a valid expression (belongs to N), S(n) also belongs to N. This function represents the “successor” function. In this context, the number 1 would be represented as S(0) and 2 as S(S(0)).
  • Applying any of the binary infix functions + or · to valid expressions: if n and m are valid expressions (belong to N), n + m and n · m are also valid expressions.

Knowing the set of valid expressions in Robinson arithmetic, we need to define its mathematical axioms. Note that these axioms tell us how the symbols above behave; they don’t influence what constitutes a valid argument. What constitutes a valid argument is set by the above-mentioned first-order logic.

Let’s lay down Robinson’s arithmetic axioms. First, we define how the successor function S behaves:

  • S(n) ≠ 0: Zero is not the successor of any number
  • S(n) = S(m)n = m: if the successor of n equals the successor of m, then n must be equal to m
  • n=0 ∨ ∃m (S(m) = n): given a number n, either it is 0 or is the successor of another number m

Second, we define how the addition function + behaves:

  • n + 0 = n
  • n + S(m) = S(n + m)

Finally, we define how the multiplication function · behaves:

  • n·0 = 0
  • n·S(m) = (n·m) + n

We’re only interested in the successor and addition function axioms to prove that 1 + 1 = 2:

First, we need to convert the statements “1 + 1” and “2” into valid expressions of Robinson arithmetic: S(0) + S(0) and S(S(0)). Then, applying the second axiom of how + behaves with n being S(0) and m being 0, we have S(0) + S(0) = S(S(0) + 0).

For the next step, we won’t start with any of Robinson’s arithmetic axioms. Rather, we will use a deductive step from our reasoning engine, first-order logic with equality. This axiom is called Substitution for functions. It states that for all numbers n and m, and any function symbol f (such as S), we have that n = m → f(..., n, ...) = f(..., m, ...). For our particular case, this means that n = m →S(n) = S(m) is a valid implication.

From the first addition axiom, we know that S(0) + 0 = S(0). Combining this with the valid implication S(0) + 0 = S(0)→S(S(0) + 0) = S(S(0)), we can conclude by modus ponens that S(S(0) + 0) = S(S(0)).

We have produced two different equalities:

  • The first one, directly from the axioms of addition: S(0) + S(0) = S(S(0) + 0)
  • The second one, by applying the function substitution axiom from our reasoning engine to the first axiom of addition via modus ponens: S(S(0) + 0) = S(S(0))

We can now use the transitivity of equality: if n = m and m = p hold, then n = p holds. This is inherited from our reasoning engine, not our mathematical axioms of arithmetic. Now we can conclude our desired result: from S(0) + S(0) = S(S(0) + 0) and S(S(0) + 0) = S(S(0)), we can derive S(0) + S(0) = S(S(0)), which represents 1 + 1 = 2 in the language of Robinson arithmetic.

Although (admittedly very) tedious, we have been aware of the meaning of every symbol, its underlying assumptions, and the validity of each reasoning step we took to prove 1 + 1 = 27. We know why this constitutes a valid mathematical proof: all mathematical assumptions were clearly defined axioms, and our reasoning steps were clearly defined and studied (although not here). Note the stark contrast between this methodological approach and a “proof” based on the intuition of what natural numbers are and crediting truth to the statement 1 + 1 = 2 as being a fact of nature. Hopefully, this gives some intuition as to what it means to work with a formal system and the precision achieved with them.

As per how the crisis of rigor was solved, it is a long story full of intricacies, but here is a very general tl;dr:

  • Many different formalisms were devised to resolve the paradoxes that either were already present in the body of mathematics or arose from previous attempts at closing the rigor gap (such as Russell’s or Burali-Forti paradoxes).
  • The quest for finding an axiomatic system that could formalize all of mathematics was proven impossible by Gödel’s incompleteness theorems. These theorems state that any axiomatic system that is expressive enough will have statements that cannot be proven true or false within the system. That the axiomatic system in question is consistent (i.e., cannot derive any contradiction) is one of such statements.
  • Zermelo-Fraenkel set theory (ZFC) has been established as the canonical formalization of mathematics. Many independent statements of ZFC have been discovered over the years.
  • Although ZFC is considered the canonical formalization of mathematics, many other formalisms were created during this period, giving rise to a wide range of applications within and outside the foundations of mathematics.
  • As per questions like the concept of truth in mathematics or what constitutes a valid axiom, the field of philosophy of mathematics is rich in answers.

Formal Methods and Verification

So far, we’ve seen why rigorous techniques and methodologies were needed to produce mathematical knowledge. We also went through an example of how rigorous mathematical reasoning looks. But we want to prove programs correct, so what has all of this been for? Well, the tools developed out of the need for mathematical rigor are the basis or the starting point for the mathematical foundations that allow us to prove programs correct. Instead of just rehearsing what a formal system is, we now understand why they were needed in the first place and how to produce a formal proof by hand. Next, we show different approaches to prove properties of programs:

Interactive Theorem Provers (ITPs)

Also known as proof assistants, ITPs are software that can produce formal mathematical proofs with the user’s help. A proof assistant has three main components:

Foundations: each proof assistant has an underlying formal system. This formal system can be a foundational theory8 such as set theory or type theory. It can also be a logical system such as higher-order logic. The foundation of an ITP is one of its most defining features.

Proof Checker: with the foundations of an ITP comes the notion of formal proof. Each formal system has a rigorous definition of what a proof is, and with that definition comes a decision procedure to determine whether a given proof candidate (i.e., a series of arguments attempting to prove a given statement) actually constitutes a proof or not. The foundational rigorous definition of proof is translated to an algorithm (the proof checker), which is in charge of determining whether a statement has been proven or not.

Interface: the method for providing the arguments for the proof checker to validate. Proof assistants have specialized user interfaces to allow the creation of proofs. In addition to the actual frontend interface to write the proof steps, it also determines what kind of steps one can provide or how the proof development process is shown.

Interactive theorem provers can be used to formalize mathematics and do formal verification.

Mathematics papers don’t include proofs in the style we proved 1 + 1 = 2 above; that would be impractical and arguably impossible to do by hand. Formalizing mathematics means translating “pen and paper” proofs into a proof assistant to have it verify that the arguments provided are correct and that the result indeed holds. For those interested in the formalization of mathematics, we recommend watching Terence Tao’s talk “Machine Assisted Proof”.

Besides formalizing mathematics, proof assistants are also used to verify programs. Given a programming language, one can define the mathematical theory corresponding to it in a proof assistant. Once the mathematical foundations of a programming language are laid down, one can state properties about its programs as theorems and attempt to prove them. See this survey for a more in-depth explanation of how this can be done. As an example of formally verified software using a proof assistant, see the C compiler CompCert.

The price to pay for the virtually unbounded expressiveness that an ITP allows for is the lack of automation. Producing a moderately complicated proof in an ITP relies heavily on user input. Although different automation strategies are built into the different ITPs, a high degree of expertise is still needed to achieve meaningful results using a proof assistant for software verification.

Satisfiability Modulo Theories (SMT Solvers)

The SMT problem extends the Boolean Satisfiability problem (SAT) to domains that are more complex than the Boolean. But what do these words mean?

The SAT problem is deciding whether a formula in propositional classical logic can be satisfied. For example, can the formula p, where p is a boolean variable (that is, either true or false), have an evaluation (an assignment of its variables) such that the truth value of the formula is true? For this example, the answer is trivially yes since we can assign p the value true, satisfying the expression p. However, for a formula such as p & ¬p (that is, p AND the negation of p), any truth value assigned to p will render p & ¬p as false. These types of formulas are called unsatisfiable or UNSAT. That is, formulas for which no assignment of their variables exists that leads to the whole expression being evaluated as true. Given a propositional formula P(x1, x2, …, xn), the general problem of finding if there exists or not an evaluation (i.e., a truth assignment to the variables involved) such that the formula evaluates to true is called the Boolean Satisfiability problem.

The Satisfiability Modulo Theories problem extends the SAT problem by introducing predicates with more complex variables than Boolean. For example, we can have the boolean predicates P(x, y) := x > 10y and Q(x, y) := 5x > y, which take values true or false depending on the integer values of x and y (for instance, P(0, 2) evaluates to false). An SMT problem is asking if the formula S(x, y) := P(x, y) & Q(x, y) is satisfiable. Indeed, it is satisfiable since an assignment of x and y exists, such that S(x, y) evaluates as true (and that assignment can be x=23y=0).

The meaning of “Modulo Theories” is that the truth of, in this case, S(x, y), depends on the defined theory of arithmetic and how all the symbols involved in that expression have been defined. That is, the truth of the statements is specific to the theory of arithmetic (or whatever other theory) we define.

SMT solvers are tools that automatically solve the SMT problem for a restricted set of inputs and formulas. For instance, it might be that an SMT solver only accepts formulas without quantifiers.

These restrictions make it possible for algorithms to decide whether or not a given formula is satisfiable. If an algorithm finds a formula unsatisfiable, we have the formal guarantee that it indeed is. This also means that we can check for the validity of a formula. For example, if we want to know if S(x, y) is true for any possible inputs (i.e., valid), we can ask an SMT solver if the negation of S(x, y) is satisfiable. If that is the case, then S(x, y) is not valid because we know there are values for which it doesn’t hold.

It can also happen that an SMT solver doesn’t find an answer to a problem. If a specified amount of time has passed and the solver has not been able to determine whether a statement is SAT or UNSAT, it will stop trying to find an answer and inform the user that no conclusion has been reached.

Symbolic Execution

As the name implies, symbolic execution means executing a program with symbolic parameters rather than concrete.

When concretely executing a function, we obtain either an exception (if something went wrong) or a concrete return value. In contrast, when symbolically executing a function, we receive a graph exploring every possible execution path. The graph reflects how different execution paths change the execution environment.

To successfully extract every possible execution path, we first need a model of computation for that program. The model of computation (for instance, a rewriting system) allows us to rigorously explore possible outcomes of executing a program by expressing the programming language in it and having a formalized notion of computation.

Note that “possible outcomes” may have various formats. For instance, if we are modeling the execution of a function inside a virtual machine (VM), each node in the execution path can contain a snapshot of the entire state of the VM, thus keeping track of all the effects a specific execution path has on the whole environment.

Similarly to SMT solvers, and in contrast with ITPs, symbolic execution is an automated technique. That is, it doesn’t require constant interaction with the user to be performed. Although automation softens the user experience, it has other limitations. With the increasing complexity of a program comes the increased complexity of the symbolic execution graph, leading to the path explosion problem. When the execution graph of a function is just too large for a computer to analyze feasibly, heuristics are used to perform symbolic execution and not lose its generality.

Other Verification Approaches

We have only scratched the surface of what formal methods and formal verification have to offer. We have not explored, for instance, Model Checking or Automated Theorem Provers (ATPs). There is a large body of formalisms to reason about programs, such as Hoare Logic or Separation Logic, just to name a couple. An even larger body of tools uses those formalisms to perform rigorous analysis on computer programs, but we will leave these topics for another post.

We hope this article has helped you gain an intuition of what is going on behind the scenes in formal verification, and why it is an essential tool when dealing with critical software!


  1. For references, consult the first chapter of Penelope Maddy’s Defending the Axioms: On the Philosophical Foundations of Set Theory.

  2. Or, in Plato’s terms, remember.

  3. Not to be confused with Whitehead and Russell’s Principia Mathematica.

  4. See Newton's proof of Kepler's second law.

  5. For example, modeling a fluid as continuous.

  6. Deceivingly simple, Robinson arithmetic is powerful for Gödel’s incompleteness theorems to apply to it.

  7.  This is not, by far, the longest proof of 1 + 1 = 2 that exists.

  8. It is important to note that, although ZFC has been established as the canonical foundation for mathematics, it is not by far the only theory in which one can satisfactorily formalize a large part of mathematics.