Why list and charlist are confusing in Elixir

This post talks about list and charlist in Elixir, and discusses one specific issue in Elixir.

Let’s look at the confusing thing in Elixir:

iex(2)> a = [7]
‘\a’
iex(3)> b = ‘\a’
‘\a’
iex(4)> a == b
true
iex(5)> is_list(a)
true
iex(6)> is_list(b)
true

First, a list is shown as a charlist in iex; second, (without surprise if we believe in iex), the charlist equals the list; last, both the list and the charlist are lists.

Why? Because charlist in represented as a list with code point(s) inside, and those encoding scheme is UTF-8 (rather than UTF-16 in Erlang). To convince yourself:

iex(7)> Enum.map(0..255, fn x -> [x] end)
[[0], [1], [2], [3], [4], [5], [6], ‘\a’, ‘\b’, ‘\t’, ‘\n’, ‘\v’, ‘\f’, ‘\r’,
[14], [15], [16], [17], [18], [19], [20], [21], [22], [23], [24], [25], [26],
‘\e’, [28], [29], [30], [31], ‘ ‘, ‘!’, ‘”‘, ‘#’, ‘$’, ‘%’, ‘&’, ‘\”, ‘(‘,
‘)’, ‘*’, ‘+’, ‘,’, ‘-‘, ‘.’, ‘/’, ‘0’, ‘1’, …]

Looks familiar? Yes, those are ASCII codes. So the conclusion seems to be that list and charlist are the same in Elixir. Well, this is mostly right, as we have seen, but not exactly right. Run the file below to inspect into list and charlist internal using pry:

require IEx

value_list = [7]
value_list2 = [1]
value_char = '\a'

IEx.pry

daveti@Daves-MacBook-Pro:~/elixir/list_issue$ iex list_issue.exs
Erlang/OTP 20 [erts-9.0] [64-bit] [smp:4:4] [ds:4:4:10] [async-threads:10] [hipe] [kernel-poll:false] [dtrace]

Request to pry #PID<0.83.0> at list_issue.exs:7

5: value_char = ‘\a’
6:
7: IEx.pry

Allow? [Yn]
Interactive Elixir (1.5.1) – press Ctrl+C to exit (type h() ENTER for help)
pry(1)> i value_list
Term
‘\a’
Data type
List
Description
This is a list of integers that is printed as a sequence of characters
delimited by single quotes because all the integers in it represent valid
ASCII characters. Conventionally, such lists of integers are referred to as
“charlists” (more precisely, a charlist is a list of Unicode codepoints,
and ASCII is a subset of Unicode).
Raw representation
[7]
Reference modules
List
Implemented protocols
IEx.Info, Collectable, Enumerable, Inspect, List.Chars, String.Chars
pry(2)>
nil
pry(3)> i value_list2
Term
[1]
Data type
List
Reference modules
List
Implemented protocols
IEx.Info, Collectable, Enumerable, Inspect, List.Chars, String.Chars
pry(4)>
nil
pry(5)> i value_char
Term
‘\a’
Data type
List
Description
This is a list of integers that is printed as a sequence of characters
delimited by single quotes because all the integers in it represent valid
ASCII characters. Conventionally, such lists of integers are referred to as
“charlists” (more precisely, a charlist is a list of Unicode codepoints,
and ASCII is a subset of Unicode).
Raw representation
[7]
Reference modules
List
Implemented protocols
IEx.Info, Collectable, Enumerable, Inspect, List.Chars, String.Chars
pry(6)>

Now you see the difference between list and charlist. Both have data type list, which explains why they are equivalent. Moreover, when the numbers inside the list could be interpreted as valid ASCII code points, Elixir shows it as a charlist rather than a list, although the raw presentation is always a list.

Why this is confusing? Let’s say you wanna print out a list, e.g., [7], but what Elixir displays is ‘\a’. Although it does not affect the internal computation, e.g., map, this display issue is really annoying and confusing. In this case, Erlang seems doing sane:

Eshell V9.0 (abort with ^G)
1> [7].
[7]
2> ‘\a’.
a
3> [7] == ‘\a’.
false

About daveti

Interested in kernel hacking, compilers, machine learning and guitars.
This entry was posted in Programming and tagged , , , , . Bookmark the permalink.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s