L1329 - Latin-1 Encoded Binary String

Warning

motorhead() ->
<<"Motörhead">>.
%% ^^^^^^^^^^^^^ warning: binary string will be Latin-1 encoded

Explanation

A binary text segment with no /utf8, /utf16, or /utf32 specifier is encoded using one byte per character (Latin-1), for historical reasons. This is fine for characters in the ASCII range (0-127), but characters in the range 128-255 are encoded as a single Latin-1 byte that is not compatible with UTF-8, and code points above 255 cannot be represented at all (see L1330).

In the example, <<"Motörhead">> encodes ö (code point 246) as the single byte 246, which is not valid UTF-8. The UTF-8 encoding of ö is the two bytes 195, 182.
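To make the byte-level difference concrete, here is a minimal sketch. The module name `latin1_demo` and the helper `is_valid_utf8/1` are hypothetical names chosen for illustration, and the sketch assumes the source file is saved as UTF-8 (the default encoding for Erlang source files).

```erlang
-module(latin1_demo).
-export([latin1/0, utf8/0, is_valid_utf8/1]).

%% No type specifier: every code point becomes one byte, so ö is the
%% single byte 246. This is the form that triggers the warning.
latin1() -> <<"Motörhead">>.

%% With /utf8, ö is encoded as the two bytes 195, 182.
utf8() -> <<"Motörhead"/utf8>>.

%% Hypothetical helper: true if Bin decodes cleanly as UTF-8.
%% unicode:characters_to_list/2 returns a list on success and an
%% {error, ...} or {incomplete, ...} tuple otherwise.
is_valid_utf8(Bin) when is_binary(Bin) ->
    case unicode:characters_to_list(Bin, utf8) of
        Chars when is_list(Chars) -> true;
        _ -> false
    end.
```

Under these assumptions, `latin1_demo:latin1()` is 9 bytes long and fails the UTF-8 check, while `latin1_demo:utf8()` is 10 bytes long and passes it.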

%% Instead of:
motorhead() -> <<"Motörhead">>.

%% Write (UTF-8):
motorhead() -> <<"Motörhead"/utf8>>.

%% Or, since OTP 27, simply:
motorhead() -> ~"Motörhead".

If Latin-1 is really what you want, this warning can be suppressed with -compile(nowarn_latin1_binary).
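As a sketch of what deliberate Latin-1 use might look like, the module below keeps a Latin-1 literal, silences the warning with the option named above, and re-encodes the value as UTF-8 only where a consumer needs it. The module and function names (`legacy_labels`, `label/0`, `label_utf8/0`) are hypothetical, and the source file is again assumed to be saved as UTF-8.

```erlang
-module(legacy_labels).
-compile(nowarn_latin1_binary).  %% option name as given in the text above
-export([label/0, label_utf8/0]).

%% Deliberately Latin-1: ö is stored as the single byte 246.
label() -> <<"Motörhead">>.

%% Re-encode the Latin-1 bytes as UTF-8 for consumers that expect it.
%% Every byte is a valid Latin-1 character, so this returns a binary.
label_utf8() ->
    unicode:characters_to_binary(label(), latin1, utf8).
```

Keeping the conversion in one place makes it explicit which encoding each binary carries, which is the ambiguity the warning is meant to surface.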