This post looks at lists and charlists in Elixir, and at one specific source of confusion between them.
Let's start with the confusing behavior:
iex(2)> a = [7]
'\a'
iex(3)> b = '\a'
'\a'
iex(4)> a == b
true
iex(5)> is_list(a)
true
iex(6)> is_list(b)
true
Three things stand out. First, iex displays the list as a charlist. Second (no surprise, if we trust iex), the charlist equals the list. Third, both the list and the charlist are lists.
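The same three observations can be checked outside iex with a small script. This is only a sketch: the variable names are mine, and I write the charlist as `[?\a]` (`?\a` is the code point of the bell character) rather than as a single-quoted literal, on the assumption that this form behaves the same across Elixir versions.

```elixir
# ?\a evaluates to the integer 7, so [?\a] is the same value the post
# writes as a single-quoted charlist literal.
a = [7]
b = [?\a]

same_value = a == b                      # value equality
same_strict = a === b                    # strict equality: both are lists holding the integer 7
both_lists = is_list(a) and is_list(b)   # there is no separate charlist type

IO.inspect({same_value, same_strict, both_lists})
# prints {true, true, true}
```

Nothing here depends on how the values are displayed, which is the point: the two names are bound to identical data.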
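Why? Because a charlist is represented as a list of integers, each one a Unicode code point. No byte encoding such as UTF-8 is involved at this level (UTF-8 is how Elixir's double-quoted strings, which are binaries, are encoded), and Erlang strings are lists of code points in the same way. To convince yourself: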
iex(7)> Enum.map(0..255, fn x -> [x] end)
[[0], [1], [2], [3], [4], [5], [6], '\a', '\b', '\t', '\n', '\v', '\f', '\r',
[14], [15], [16], [17], [18], [19], [20], [21], [22], [23], [24], [25], [26],
'\e', [28], [29], [30], [31], ' ', '!', '"', '#', '$', '%', '&', '\'', '(',
')', '*', '+', ',', '-', '.', '/', '0', '1', …]
Look familiar? Yes, those are ASCII codes. So the conclusion seems to be that lists and charlists are the same thing in Elixir. As we have seen, this is mostly right, but not exactly. Run the file below to look inside a list and a charlist using pry:
require IEx

value_list = [7]
value_list2 = [1]
value_char = '\a'

IEx.pry
daveti@Daves-MacBook-Pro:~/elixir/list_issue$ iex list_issue.exs
Erlang/OTP 20 [erts-9.0] [64-bit] [smp:4:4] [ds:4:4:10] [async-threads:10] [hipe] [kernel-poll:false] [dtrace]
Request to pry #PID<0.83.0> at list_issue.exs:7
5: value_char = '\a'
6:
7: IEx.pry
Allow? [Yn]
Interactive Elixir (1.5.1) - press Ctrl+C to exit (type h() ENTER for help)
pry(1)> i value_list
Term
'\a'
Data type
List
Description
This is a list of integers that is printed as a sequence of characters
delimited by single quotes because all the integers in it represent valid
ASCII characters. Conventionally, such lists of integers are referred to as
"charlists" (more precisely, a charlist is a list of Unicode codepoints,
and ASCII is a subset of Unicode).
Raw representation
[7]
Reference modules
List
Implemented protocols
IEx.Info, Collectable, Enumerable, Inspect, List.Chars, String.Chars
pry(2)>
nil
pry(3)> i value_list2
Term
[1]
Data type
List
Reference modules
List
Implemented protocols
IEx.Info, Collectable, Enumerable, Inspect, List.Chars, String.Chars
pry(4)>
nil
pry(5)> i value_char
Term
'\a'
Data type
List
Description
This is a list of integers that is printed as a sequence of characters
delimited by single quotes because all the integers in it represent valid
ASCII characters. Conventionally, such lists of integers are referred to as
"charlists" (more precisely, a charlist is a list of Unicode codepoints,
and ASCII is a subset of Unicode).
Raw representation
[7]
Reference modules
List
Implemented protocols
IEx.Info, Collectable, Enumerable, Inspect, List.Chars, String.Chars
pry(6)>
Now you can see the difference between a list and a charlist, and it is only in the display. Both have the data type List, which explains why they compare equal. When every integer in the list is a printable ASCII character (including escapes such as \a), Elixir shows the list as a charlist, although the raw representation is always a list. That is why [1], which is valid ASCII but not printable, stays [1], while [7] is shown as '\a'.
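To make the printing rule concrete, here is a rough sketch using `List.ascii_printable?/1`. I am assuming a newer Elixir than the 1.5.1 used in this post (as far as I know the function arrived in 1.6), and I am assuming it mirrors the check the inspector makes rather than claiming it is the exact code path.

```elixir
# Assumption: Elixir 1.6 or later, where List.ascii_printable?/1 is available.
bell = List.ascii_printable?([7])       # 7 is the \a escape, treated as printable
soh = List.ascii_printable?([1])        # 1 is valid ASCII but has no printable form
mixed = List.ascii_printable?([7, 1])   # a single non-printable element is enough to fail

IO.inspect({bell, soh, mixed})
# prints {true, false, false}
```

This lines up with the pry session: `[7]` got the charlist description, `[1]` did not.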
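Why is this confusing? Say you want to print a list, e.g., [7], but what Elixir displays is '\a'. Although this does not affect internal computation, e.g., map, the display behavior is annoying and confusing. Erlang appears to behave more sensibly here, though note that single quotes in Erlang delimit an atom, so '\a' below is the atom a rather than a string, which is why the comparison is false: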
Eshell V9.0 (abort with ^G)
1> [7].
[7]
2> '\a'.
a
3> [7] == '\a'.
false
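If the goal is simply to see the integers, one option on the Elixir side is the `:charlists` option of `inspect/2` and `IO.inspect/2`. A minimal sketch, assuming Elixir 1.5 or later (as far as I know, older releases spelled the option `:char_lists`); the variable names are mine:

```elixir
# charlists: :as_lists asks the inspector to always print lists of integers as lists.
raw = inspect([7], charlists: :as_lists)
word = inspect([104, 105], charlists: :as_lists)

IO.puts(raw)    # prints [7]
IO.puts(word)   # prints [104, 105]

# IO.inspect/2 accepts the same option and returns its argument unchanged.
IO.inspect([7], charlists: :as_lists)
```

This only changes how the value is rendered. The data itself is the same list either way.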