Code smell: Boolean blindness

At Runtime Verification, we are using Haskell to develop the next generation of formal verification tools based on the K Framework. This article describes how we use algebraic data types to write expressive Haskell code.

Bool represents a single bit of information:

data Bool = False | True

The popular term “boolean blindness” refers to the information lost by functions that operate on Bool when richer structures are available. Erasing such structure can give code a bad smell. Using more structure can produce interfaces that are easier to document, use, decompose, and generalize.

Example

Data.List.filter is a classic example of a function that would benefit from a richer interface type:

filter :: (a -> Bool) -> [a] -> [a]
--         ^^^^^^^^^
--         predicate: keep or discard each element

A “filter” has two uses:

  1. To collect the fluid (filtrate) and discard the particulate, such as when making coffee.
  2. To collect the particulate and discard the filtrate, such as when collecting gold.

Just as there are two uses for a filter, I can never remember if filter’s predicate means “keep” or “discard”. For the record, the Prelude definition is:

filter _    []       = []
filter keep (a : as)
  | keep a           = a : filter keep as
  | otherwise        =     filter keep as

… I think. If I had the other use of “filter” in mind when I wrote the function, I might have written instead,

filter' :: (a -> Bool) -> [a] -> [a]
filter' _       []       = []
filter' discard (a : as)
  | discard a            =     filter' discard as
  | otherwise            = a : filter' discard as

filter and filter' are both perfectly fine functions, only their intent differs slightly.

Expressive renaming

One (ahem) solution to clarify intent is to give filter a better name, one that does not have a dual identity:

select :: (a -> Bool) -> [a] -> [a]
--         ^^^^^^^^^
--         predicate: select each element
select _    []       = []
select keep (a : as)
  | keep a           = a : select keep as
  | otherwise        =     select keep as

An alternative is to supply rename the type and constructors expressively,

data Keep = Discard | Keep

filter1 :: (a -> Keep) -> [a] -> [a]
filter1 _    []       = []
filter1 keep (a : as)
  | Keep <- keep a    = a : filter1 keep as
  | otherwise         =     filter1 keep as

This popular style has a practical disadvantage: there are already many functions that work with Bool, but none will work with Keep unless we reimplement them ourselves.

Expressive structure

Including more structure in the type of filter can make the definition more general while expressing our intent more clearly in the code. The most important feature of filter is that each element of the input list corresponds to zero (Discard) or one (Keep) elements of the output. There is already a type which structurally represents zero or one elements,

data Maybe a = Nothing | Just a

We can use Maybe to express the intent that each input element yields zero or one output elements:

filter2 :: (a -> Maybe a) -> [a] -> [a]
filter2 _    []         = []
filter2 keep (a : as)
  | Just b <- keep a    = b : filter2 keep as
  | otherwise           =     filter2 keep as

filter2 is a generalization of filter because we might transform the list as we filter it:

-- | Select only the positive elements and decrement them.
decrementPositive :: [Integer] -> [Integer]
decrementPositive = filter2 (\x -> if x > 0 then Just (x - 1) else Nothing)

Although the type of filter2 is quite suggestive to the programmer, it is still not fully faithful to our intent! Consider the following implementation with the same type, which has a subtle bug:

filter2' :: (a -> Maybe a) -> [a] -> [a]
filter2' _    []         = []
filter2' keep (a : as)
  | Just b <- keep a     = a : filter2' keep as
  | otherwise            =     filter2' keep as

We can use parametric polymorphism to express our intent faithfully in the code,

filter3 :: (a -> Maybe b) -> [a] -> [b]
filter3 _    []         = []
filter3 keep (a : as)
  | Just b <- keep a    = b : filter3 keep as
  | otherwise           =     filter3 keep as

The type of filter3 ensures that its implementation only collects outputs from the predicate, because that is the only thing in scope which can produce a value of b. Note that we did not have to change the implementation because it was already correct! We only needed to change the type to express our intent more clearly. Changing the type made the implementation still more general: the predicate may now even transform the elements to a different type while filtering the list.

Generalization, composition, and decomposition

filter3 is not an evocative name, and in generalization we have strayed from the original use of “filter”. Luckily, we do not need to name the function ourselves: it already exists as Data.Witherable.mapMaybe. The original filter is implemented with a little help:

keeping :: (a -> Bool) -> (a -> Maybe a)
keeping predicate a
  | predicate a = Just a
  | otherwise   = Nothing

filter'' :: (a -> Bool) -> [a] -> [a]
filter'' predicate = mapMaybe (keeping predicate)

The generalized filter encourages to build a reusable components like keeping, and keeping also allows us to reuse all the convenient definitions Prelude provides for Bool.

Post script: How far is too far?

It might be tempting to carry on generalizing. Instead of allowing zero or one output elements per input, we might allow any number of outputs:

filter4 :: (a -> [b]) -> [a] -> [b]
filter4 _    []       = []
filter4 keep (a : as) = keep a ++ filter4 keep as

This function is probably too general to be useful as a generalization of filter: by allowing any number of outputs, we allow so many transformations that it makes the intent unclear. This generalization does appear, though, in the Monad instance of lists:

flip (>>=) :: (a -> [b]) -> [a] -> [b]
-- flip (>>=) === filter4

In the context of filter, we may have gone too far, but in another context this might be exactly the abstraction we need.

4 thoughts on “Code smell: Boolean blindness

  1. No need to bring up monads when talking about `filter4`. It's actually equivalent to the simple `concatMap` function.

  2. Would be great to have some comment on why the move from OCaml to Haskell. Was it because of advantages on the language, or availability of tools/libraries, of developers...?

    • I wouldn't describe this as a move away from OCaml. We have a concrete execution engine for K which targets OCaml, but the new work in Haskell is on a symbolic execution engine. The new work complements the existing symbolic execution engine (implemented in Java). (We also have a concrete execution engine that targets LLVM.)

Leave a Reply

Your email address will not be published. Required fields are marked *