Using RV-Predict to track down race conditions

Posted on July 2nd, 2015 by Dwight Guth
Posted in RV-Match, RV-Predict

We've all had it happen: you write some multithreaded code, but when you run it, it crashes with some kind of error every second or third or hundredth run. The traditional way of trying to debug this problem is very tedious and involves attempting to reproduce the error (difficult) and then tracing the inconsistent state backwards to find the source of the race condition (even more difficult).

Enter RV-Predict. Today I was developing code for RV-Match when I ran into an intermittently occurring NullPointerException in its parser. Instead of trying to reproduce the bug and tracking its behavior backwards, I assumed going in that it was caused by a race condition, and ran RV-Predict on the program. After it crashed, it spat out the following error report:

Data race on field org.kframework.parser.concrete2kore.kernel.Grammar$State.counter: {{{     Concurrent write in thread T1 (locks held: {})  ---->  at org.kframework.parser.concrete2kore.kernel.Grammar$State.<init>(Grammar.java:373)         at org.kframework.parser.concrete2kore.kernel.Grammar$NonTerminal.<init>(Grammar.java:287)         at org.kframework.parser.concrete2kore.kernel.KSyntax2GrammarStatesFilter.getGrammar(KSyntax2GrammarStatesFilter.java:44)         at org.kframework.parser.concrete2kore.ParseInModule.getGrammar(ParseInModule.java:100)         at org.kframework.parser.concrete2kore.ParseInModule.parseString(ParseInModule.java:107)         at org.kframework.kompile.Kompile.performParse(Kompile.java:406)         at org.kframework.kompile.Kompile.lambda$resolveBubbles$30(Kompile.java:315)         at org.kframework.kompile.Kompile$$Lambda$110.apply(Unknown:n/a)         at org.kframework.kompile.Kompile$$Lambda$102.apply(Unknown:n/a)         at org.kframework.definition.DefinitionTransformer.apply(transformers.scala:63)         at org.kframework.kompile.Kompile.parseDefinition(Kompile.java:247)         at org.kframework.kompile.Kompile.run(Kompile.java:125)         at org.kframework.kompile.KompileFrontEnd.run(KompileFrontEnd.java:89)         at org.kframework.main.FrontEnd.main(FrontEnd.java:52)         at org.kframework.main.Main.runApplication(Main.java:109)         at org.kframework.main.Main.runApplication(Main.java:99)         at org.kframework.main.Main.main(Main.java:51)     T1 is the main thread     Concurrent read in thread T16 (locks held: {})  ---->  at org.kframework.parser.concrete2kore.kernel.Grammar$State.<init>(Grammar.java:373)         at org.kframework.parser.concrete2kore.kernel.Grammar$NonTerminal.<init>(Grammar.java:287)         at org.kframework.parser.concrete2kore.kernel.KSyntax2GrammarStatesFilter.getGrammar(KSyntax2GrammarStatesFilter.java:44)         at org.kframework.parser.concrete2kore.ParseInModule.getGrammar(ParseInModule.java:100)         at org.kframework.parser.concrete2kore.ParseInModule.parseString(ParseInModule.java:107)         at org.kframework.kompile.Kompile.performParse(Kompile.java:406)         at org.kframework.kompile.Kompile.lambda$resolveBubbles$30(Kompile.java:315)         at org.kframework.kompile.Kompile$$Lambda$110.apply(Unknown:n/a)     T16 is created by n/a }}}

Turns out I had written the following snippet of code:

private static int counter = 0; final int uniqueId = counter++;

Sadly, post-increment is not an atomic operation, making this code not thread-safe.
The fix is simple, though, thankfully:

private static int counter = 0; final int uniqueId; public State() { ... synchronized (State.class) { uniqueId = counter++; } }

By using RV-Predict to localize the race down to a single read and a single write on a field, I was able to quite rapidly locate the bug and fix it without even having to be able to reproduce it.