Erlang Central

ISO 8859 1 TO UTF8

Revision as of 14:27, 9 March 2007 by Tmster

Problem

You want to transform a string encoded as ISO-8859-1 into UTF-8 format.

Solution

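The following function takes a string as a list of ISO-8859-1 byte values and returns the corresponding list of UTF-8 bytes. Bytes below 16#80 are plain ASCII and pass through unchanged; every other byte becomes a two-byte sequence: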

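-module(iso_8859_1).
-export([to_utf8/1]).

%% Input: list of ISO-8859-1 bytes (0..255). Output: list of UTF-8 bytes.
to_utf8([H|T]) when H < 16#80 -> [H | to_utf8(T)];            % ASCII, unchanged
to_utf8([H|T]) when H < 16#C0 -> [16#C2, H | to_utf8(T)];     % 16#80..16#BF
to_utf8([H|T])                -> [16#C3, H-64 | to_utf8(T)];  % 16#C0..16#FF
to_utf8([])                   -> [].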
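
The constants 16#C2, 16#C3 and 64 in the function above follow from the two-byte UTF-8 layout (110xxxxx 10xxxxxx). The following is a minimal sketch that spells this out with bit operations. It is not part of the original recipe, and the module name latin1_utf8_sketch and the helper encode_byte/1 are hypothetical names chosen for illustration.

```erlang
-module(latin1_utf8_sketch).
-export([to_utf8/1, encode_byte/1]).

%% Encode one ISO-8859-1 byte (0..255) as a list of UTF-8 bytes.
encode_byte(B) when B >= 0, B < 16#80 ->
    [B];                              % ASCII: one byte, unchanged
encode_byte(B) when B >= 16#80, B =< 16#FF ->
    [16#C0 bor (B bsr 6),             % lead byte 110xxxxx: 16#C2 or 16#C3
     16#80 bor (B band 16#3F)].       % continuation byte 10xxxxxx

%% Convert a whole ISO-8859-1 string given as a list of bytes.
to_utf8(Bytes) ->
    lists:flatmap(fun encode_byte/1, Bytes).
```

For a byte in 16#80..16#BF the top two bits are 10, so the lead byte is 16#C2 and the continuation byte equals the input byte. For 16#C0..16#FF the top two bits are 11, so the lead byte is 16#C3 and the continuation byte is the input minus 64. This matches the three clauses of the recipe.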

Example

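1> "This is some extra Swedish chars: åäö".
"This is some extra Swedish chars: åäö"
2> iso_8859_1:to_utf8("This is some extra Swedish chars: åäö").
"This is some extra Swedish chars: åäö"

The two results look alike only when the second one is viewed as UTF-8 text. The returned list is three elements longer than the input, because each of å, ä and ö (16#E5, 16#E4, 16#F6 in ISO-8859-1) becomes a two-byte sequence: 16#C3,16#A5, then 16#C3,16#A4, then 16#C3,16#B6.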
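
Because the shell output depends on the terminal encoding, it can be easier to check the result as raw byte values. The sketch below is an illustration under assumptions, not part of the original recipe. It repeats the recipe's function so that it compiles on its own and compares the result with unicode:characters_to_binary/3, which is available in OTP releases that ship the unicode module (later than this revision of the page). The module name latin1_utf8_check and the function check/0 are hypothetical.

```erlang
-module(latin1_utf8_check).
-export([check/0]).

%% The recipe's function, repeated so this module is self-contained.
to_utf8([H|T]) when H < 16#80 -> [H | to_utf8(T)];
to_utf8([H|T]) when H < 16#C0 -> [16#C2, H | to_utf8(T)];
to_utf8([H|T])                -> [16#C3, H-64 | to_utf8(T)];
to_utf8([])                   -> [].

%% Returns ok if the conversion gives the expected bytes,
%% otherwise fails with a badmatch error.
check() ->
    Latin1 = [16#E5, 16#E4, 16#F6],            % "åäö" as ISO-8859-1 bytes
    Utf8 = to_utf8(Latin1),
    [195,165,195,164,195,182] = Utf8,          % two bytes per character
    %% Cross-check against the standard library conversion.
    Expected = unicode:characters_to_binary(Latin1, latin1, utf8),
    Expected = list_to_binary(Utf8),
    ok.
```

Writing the input as integers rather than as a string literal avoids any dependence on the encoding of the source file or the shell.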