
BuggyPrecondition


Authors

John Hughes (posted by Thomas Arts)

http://www.quviq.com/

How to find buggy preconditions

When writing state machine models with QuickCheck's eqc_statem, you specify preconditions that determine when a command may be included in a generated test. How can you tell whether a precondition you wrote is buggy?

When tests pass, you never see the test data—so unless you take care to gather relevant statistics, you can end up running thousands of successful tests that test something quite different from what you intended! In particular, when you use eqc_statem to generate tests, you would like to know that your tests contain a good mix of all the commands that you are supposed to be testing—and hitherto, this hasn’t been easy to check. Sound familiar? Then read on.
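For context, an eqc_statem precondition is just a callback that inspects the model state. Here is a minimal sketch of what one might look like for the registry example below; the register/2 clause and the exact #state{} fields are assumptions for illustration, not the course's actual model:

precondition(S, {call,?MODULE,register,[_Name,Pid]}) ->
    %% only allow register/2 when the pid was actually spawned
    %% earlier in this test (sketch)
    lists:member(Pid, S#state.pids);
precondition(_S, _Call) ->
    true.

A bug in such a clause silently changes which command sequences get generated, which is exactly why the statistics discussed below matter.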

Example - process registry

For example, suppose we test the process registry, as in our basic training course, generating command sequences that contain calls to spawn, register, unregister and whereis:

command(S) ->
    oneof([{call,?MODULE,spawn,[]}] ++
          %% register/2 is only generated once at least one pid exists;
          %% [Expr || Cond] yields [Expr] when Cond is true, [] otherwise
          [{call,?MODULE,register,
               [name(),elements(S#state.pids)]}
           || S#state.pids /= []] ++
          [{call,?MODULE,unregister,[name()]},
           {call,erlang,whereis,[name()]}]).
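The generator above reads S#state.pids, and the property below also uses S#state.regs, so a state record along the following lines is assumed. This is a sketch; the actual record definition isn't shown in this excerpt:

-record(state, {pids = [],    % pids spawned so far in this test
                regs = []}).  % {Name,Pid} pairs currently registered

initial_state() ->
    #state{}.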

Here’s the property we’re testing:

prop_registration() ->
   ?FORALL(Cmds,commands(?MODULE),
           begin
             {_H,S,Res} = run_commands(?MODULE,Cmds),
             %% clean up after each test: drop leftover registrations
             %% and kill the pids the test spawned
             [?MODULE:unregister(N) || {N,_} <- S#state.regs],
             [exit(P,kill) || P <- S#state.pids],
             Res == ok
           end).

There’s a little clean-up code in there to unregister and kill the pids used in each test, but otherwise this is pretty standard. Now, there is a problem in the QuickCheck specification we’re using here (not in these definitions, but elsewhere). All the tests pass, but we’re not really testing what we think. Let’s see how the problem can be found.

Collecting the lengths of test sequences

Of course, we can check the lengths of the generated test cases by adding a collect(length(Cmds),…) to our property.
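One way to wire that in is to wrap collect around the body of the property; this is a sketch of the same prop_registration/0 with collect added:

prop_registration() ->
   ?FORALL(Cmds,commands(?MODULE),
           collect(length(Cmds),
                   begin
                     {_H,S,Res} = run_commands(?MODULE,Cmds),
                     [?MODULE:unregister(N) || {N,_} <- S#state.regs],
                     [exit(P,kill) || P <- S#state.pids],
                     Res == ok
                   end)).

If we do so, we'll see output something like this: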

OK, passed 1000 tests 
8% 0 
8% 2 
8% 1 
6% 4 
6% 3 
4% 6
4% 5 
4% 8 
4% 7
3% 12 
3% 11 
3% 10 
3% 9 
2% 16 
2% 15 
2% 18 
2% 14 
… 

We can see that test sequences of many different lengths were run… up to 111 commands in this run, in fact… but it's hard to draw firm conclusions about the test data from this alone.