Erlang Central

Difference between revisions of "Generate Unique Values"

From ErlangCentral Wiki

Line 64: Line 64:
  
 
What if one wants to create a fixed amount of unique values, say a vector of  
 
What if one wants to create a fixed amount of unique values, say a vector of  
length N. Once again, you may be happy with lists:seq(1,N), if you have
+
length N. Once again, you may be happy with lists:seq(1,N), if you  
 
are happy with integers. You can even write a function from integers to your
 
are happy with integers. You can even write a function from integers to your
 
actual value that generates a unique value depending on your input.
 
actual value that generates a unique value depending on your input.
Line 90: Line 90:
 
</code>
 
</code>
  
Other approach is to define a recursive generator.
+
It might feel counter-intuitive to generate random lists until finally one finds one in which no duplicates occur...
 +
The temptation is strong to be smarter than QuickCheck and do an "efficient" implementation.
 +
One way would be to generate new elements until that element has not been generated before until enough elements are generated.
 
<code>
 
<code>
 
uvector(0,Gen) ->
 
uvector(0,Gen) ->

Revision as of 12:52, 11 December 2009


Contents

Authors

Thomas Arts with additional ideas from John Hughes

http://www.quviq.com/

Generating Unique Values

How to write a generator that produces unique values?

Normally, if you need unique values, there is some notion of 'state', only when there is state, it is important that a value differs from a previous value. For example, if you send a sequence of messages to a server and each message needs a unique tag, different from previous tags. The server 'remembers' the state and if you want to perform positive testing, you want to avoid sending the same tag twice.

Lists of unique values

It may be worth considering to just create a deterministic list and take consecutive values from that list.

unique() ->
   lists:seq(0,100).

This is very deterministic and no random testing is involved at all, but it may well suffice your purpose.

More advanced, if you do not use integers as values, you could create the list and filter out the values that occur more than once. Since all terms in Erlang can be compared, there is a very simple way to filter duplicates, just use lists:usort/1.

unique(Generator) ->
   ?LET(Values,list(Generator),
	lists:usort(Values)).

Note that efficiency is no issue in test value generation!

In some cases, having a strictly increasing lists of values may be insufficient for testing. One may want to shuffle the values around to see the subject under test handle that well.

shuffle([]) ->
    [];
shuffle(L) ->
    ?LET(X,elements(L),[X|shuffle(lists:delete(X,L))]).

One possibly nice thing about this is the way that it shrinks:

prop_shuffle() ->
    ?FORALL(L,shuffle(lists:seq(1,10)),
            false).


erlang> eqc:quickcheck(example:prop_shuffle()).
Failed! After 1 tests.
[1,10,3,8,6,2,5,9,4,7]
Shrinking......(6 times)
[1,2,3,4,5,6,7,8,9,10]
false

Which may be of value in your testing in case you are sure that in particular for sorted lists, your software should work.

Vectors of unique values

What if one wants to create a fixed amount of unique values, say a vector of length N. Once again, you may be happy with lists:seq(1,N), if you are happy with integers. You can even write a function from integers to your actual value that generates a unique value depending on your input.

Another surprisingly well working approach is to generate vectors until you get one with only unique numbers.

uvector(N,G) ->
    ?SUCHTHAT(V,vector(N,G),V--lists:usort(V)==[]).

This may seem rather inefficient, since many vectors may contain a duplicate and you throw the vector away and build a new one. It only seems, since this performs really well as long as the number of elements that G can generate is sufficiently larger than the length N of the vector.

erlang> eqc_gen:sample(example:uvector(20,eqc_gen:int()))).
[-17,27,8,21,2,-48,32,47,46,-31,5,33,-22,50,35,30,18,-37,26,37]
[2,1,-1,29,-22,4,31,-12,-30,24,-2,27,-28,-32,19,15,30,33,8,-23]
[10,36,-40,-34,9,-20,-5,29,24,-11,-8,20,19,-26,-37,21,38,40,-12,15]
[-44,20,15,38,-10,-11,28,-12,0,-23,5,44,10,4,-37,-27,30,1,16,-42]
[-27,3,-10,15,12,39,-35,-15,-13,-29,-1,-31,1,7,18,-11,-34,-8,2,34]
[17,23,0,-6,-5,21,-13,8,-19,-24,4,10,22,-21,-15,-11,-23,19,-16,-10]
[44,-36,-8,9,12,13,-33,41,24,36,6,-32,27,21,31,-24,-39,1,-40,25]
[23,2,-17,31,-5,-11,18,-38,38,-34,6,-10,-37,9,-26,-19,-14,-28,-39,-9]
[26,-9,31,-3,30,2,4,3,-13,13,32,-26,15,-5,-11,-6,-29,-2,-20,-27]
[-19,23,9,14,2,10,13,-8,-18,19,-7,15,16,-4,-13,-3,-1,4,-14,1]
[-40,-8,18,47,-34,-50,-37,36,5,22,-5,6,1,-26,23,26,14,2,51,-17]

It might feel counter-intuitive to generate random lists until finally one finds one in which no duplicates occur... The temptation is strong to be smarter than QuickCheck and do an "efficient" implementation. One way would be to generate new elements until that element has not been generated before until enough elements are generated.

uvector(0,Gen) ->
   [];
uvector(N,Gen) ->
   ?LET(Values,uvector(N-1,Gen),
        ?LET(Value,?SUCHTHAT(V,Gen, not lists:member(V,Values)),
             [Value|Values])).

A simple use of this generator could be to create unique integers or to fail on a unique vector of booleans.

9> eqc_gen:sample(example:uvector(16,eqc_gen:int())).
[-9,9,-7,-8,8,-5,2,-3,0,4,-6,6,10,-1,5,1]
[9,12,3,1,0,7,2,-9,-11,4,-2,11,-10,8,-3,10]
[13,-11,5,12,-1,10,-2,0,-10,-6,-7,3,-3,6,9,11]
[-10,9,-4,-13,1,-12,10,13,-6,-7,-14,6,-8,-11,2,-1]
[8,-2,9,-1,-11,-13,6,-8,3,15,4,-14,-7,-5,2,-3]
[-4,16,3,2,1,-13,5,0,-10,6,4,9,14,-8,-1,12]
[16,7,-5,14,-16,5,9,-10,-15,11,3,10,-2,-8,-6,-12]
[9,17,-9,-16,-3,3,-12,-13,8,7,-15,2,-5,15,13,-8]
[6,-7,18,-12,-16,-3,-17,-5,2,-6,16,4,3,11,-13,14]
[-6,18,19,20,-17,9,-14,1,15,-1,-4,-11,2,5,-18,8]
[-18,7,-7,-21,10,19,-16,-3,4,-11,3,-15,-12,-5,15,-2]
ok
10> eqc_gen:sample(example:uvector(4,eqc_gen:bool())).
** exception exit: "?SUCHTHAT failed to find a value within 100 attempts."
     in function  eqc_gen:sample/1