Erlang Central

Extracting Unique Elements From a List

Revision as of 02:32, 31 August 2006 by Cyberlync (Talk | contribs)

(diff) ← Older revision | Latest revision (diff) | Newer revision → (diff)

Problem

You want to eliminate duplicate values from a list.

Solution

There are several possible solutions for this problem. Here are some of these:

Using lists:usort

The lists module contains a wealth of list processing functionality. One possible solution to this problem is to use the lists:usort function, which takes a list and returns a sorted copy of the original list, with all duplicates removed:

1> UL = [1,2,8,7,8,10,3,12,3,99,188,3,2,1,3,5,15,72].
[1,2,8,7,8,10,3,12,3,99,188,3,2,1,3,5,15,72]
2> lists:usort(UL).
[1,2,3,5,7,8,10,12,15,72,99,188]

Using the sets Module

Erlang standard libraries includes a module, sets, with a variety of functions related to generating, creating, and manipulating mathematical sets.

10> Set = sets:from_list(UL).
{sets,12,
      16,
      16,
      8,
      80,
      48,
      {[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[]},
      {{[],[99,3],[],[],"\274\f",[15],[2],[5],"H\b",[],[],[1],[],[7],"\n",[]}}}
11> sets:to_list(Set).
[3,99,12,188,15,2,5,8,72,1,7,10]

Note that the final list produced by this function is not sorted, so this is the best approach if you want the set to contain the data in the order in which it was inserted.

Using a General Balanced Tree Set (gb_set)

Erlang's standard libraries includes an implementation of Professor Arne Andersson's General Balanced Trees. These structures are more costly than sorting lists for small sets, but this is a much more efficient implementation when working with large sets of data.

The gb_set:from_list function will produce an ordered set of elements (dropping duplicates). The set can then be extracted back to a list for other use:

3> GBSet = gb_sets:from_list(UL).
{12,
 {10,
  {5,{2,{1,nil,nil},{3,nil,nil}},{8,{7,nil,nil},nil}},
  {72,{15,{12,nil,nil},nil},{188,{99,nil,nil},nil}}}}
4> gb_sets:to_list(GBSet).
[1,2,3,5,7,8,10,12,15,72,99,188]