Erlang Central

Random Numbers Biased

Revision as of 01:52, 25 August 2006 by Cyberlync (Talk | contribs)

(diff) ← Older revision | Latest revision (diff) | Newer revision → (diff)

Problem

Instead of a truly random number, you wish to randomly select a value from a set in which some values are more likely than others. For example, you may wish to simulate a normal distribution (i.e., a "bell curve") for a set of data.

Solution

We will give a recipe for generating numbers with a normal distribution (aka Gaussian distribution, the bell shaped one). No existing library supports this functionality, though inspiration for one can be gained by viewing the Schemathics CVS. We will discuss a do-it-yourself method for explanatory purposes.

You will have to determine what kind of distribution you want, and locate the appropriate algorithm from a statistics reference.

For this recipe, we will consider the normal (Gaussian) distribution. If you need other distributions see either the CVS or consult a numerical analyst.

The function dis_var_new returns a stochastic variable (a thunk) with mean mu and standard deviance sigma.:

% derived from example in the documentation of SRFI27
% and translated to Erlang
-record( dist_state, {state, mu, sigma} ).

dist_var_new(Mu, Sigma) ->
    dist_var_new(false, Mu, Sigma).

% create the thunk
dist_var_new(State, Mu, Sigma) ->
    This = #dist_state {
        state = State,
        mu = 1.0 * Mu,
        sigma = 1.0 * Sigma},
    IntPid = spawn(cookbook, dispatch, [This]),
    fun () ->
        IntPid ! {self(), value},
        receive
            {retval, Any} -> Any
        end
    end.

dispatch(This) ->
    receive
        {Pid, value} ->
            {NewThis, Value} = value(This),
            Pid!{retval, Value},
            dispatch(This)
    end.

value(This) ->
    case This#dist_state.state of
        true -> Val = This#dist_state.mu
                    + (This#dist_state.sigma * This#dist_state.state),
                {This#dist_state{state = false}, Val};
        _    -> sigma_loop(This)
    end.
   
sigma_loop(This) ->
    V1 = 2.0 * random:uniform() - 1.0,
    V2 = 2.0 * random:uniform() - 1.0,
    S  = (V1 * V1) + (V2 * V2),
    if
        S >= 1.0 ->
            sigma_loop(This);
        true     ->
            Scale = math:sqrt( (-2.0 * math:log(S)) / S),
            Val = This#dist_state.mu
                  + (This#dist_state.sigma * Scale * V1),
            {This#dist_state{state = Scale * V2}, Val}
    end.

This is an interesting example, because it makes use of the fact that Erlang processes are so cheap to create and use. In effect, we've created a tiny server application who's only purpose is to listen for invocation requests and return a new normalized random number.

An example of usage:

1> X=dist_var_new(0, 1).      
#Fun<cookbook.4.38595032>
2> X1().
-0.873932
3> X1().
4.91005e-2
4> X1().
1.55993e-2
5> X1().
0.181456

If you are unsatisfied with the fact that you get the same numbers as above, then randomize the source of the random numbers:

6> random:seed()

The algorithm used is the polar Box Muller method. The algorithm takes two independent uniformly distributed random numbers between 0 and 1 (present in the code as random:uniform()) and generates two numbers with a mean of my and standard deviation sigma. Note that the method produces two numbers at a time. Since we only need one, the second is saved for later in the variable next.


Note that the Perl Cookbook includes an interesting discussion of converting a set of values (and weights) into a distribution. This should also be converted to Erlang and shown here.

Mathematically-inclined Erlangers should also take a good look at Schemathics, which contains these and many other statistical methods.