Catalog of Elixir-specific code smells
Table of Contents
- Introduction
- Design-related smells
- GenServer Envy
- Agent Obsession
- Unsupervised process
- Large messages
- Unrelated multi-clause function
- Complex extractions in clauses 1
- Using exceptions for control-flow
- Untested polymorphic behaviors
- Code organization by process
- Large code generation by macros 1
- Data manipulation by migration
- Using App Configuration for libraries
- Compile-time global configuration
- "Use" instead of "import"
- Low-level concerns smells
- Traditional code smells
- About
- Acknowledgments
Introduction
Elixir is a functional programming language whose popularity is rising in the industry link. However, there are few works in the scientific literature focused on studying the internal quality of systems implemented in this language.
In order to better understand the types of sub-optimal code structures that can harm the internal quality of Elixir systems, we scoured websites, blogs, forums, and videos (grey literature review), looking for specific code smells for Elixir that are discussed by its developers.
As a result of this investigation, we have initially proposed a catalog of 18 new smells that are specific to Elixir systems. After that, 1 new smell emerged from a study with mining software repositories (MSR) performed by us, and other smells are being suggested by the community, so this catalog is constantly being updated (currently 23 smells). These code smells are categorized into two different groups (design-related and low-level concerns), according to the type of impact and code extent they affect. This catalog of Elixir-specific code smells is presented below. Each code smell is documented using the following structure:
- Name: Unique identifier of the code smell. This name is important to facilitate communication between developers;
- Category: The portion of code affected by smell and its severity;
- Problem: How the code smell can harm code quality and what impacts this can have for developers;
- Example: Code and textual descriptions to illustrate the occurrence of the code smell;
- Refactoring: Ways to change smelly code in order to improve its qualities. Examples of refactored code are presented to illustrate these changes.
In addition to the Elixir-specific code smells, our catalog also documents 12 traditional code smells discussed in the context of Elixir systems.
The objective of this catalog of code smells is to instigate the improvement of the quality of code developed in Elixir. For this reason, we are interested in knowing Elixir's community opinion about these code smells: Do you agree that these code smells can be harmful? Have you seen any of them in production code? Do you have any suggestions about some Elixir-specific code smell not cataloged by us?...
Please feel free to make pull requests and suggestions (Issues tab). We want to hear from you!
Design-related smells
Design-related smells are more complex, affect a coarse-grained code element, and are therefore harder to detect. In this section, 14 different smells classified as design-related are explained and exemplified:
GenServer Envy
-
Category: Design-related smell.
-
Problem: In Elixir, processes can be primitively created by
Kernel.spawn/1
,Kernel.spawn/3
,Kernel.spawn_link/1
andKernel.spawn_link/3
functions. Although it is possible to create them this way, it is more common to use abstractions (e.g.,Agent
,Task
, andGenServer
) provided by Elixir to create processes. The use of each specific abstraction is not a code smell in itself; however, there can be trouble when either aTask
orAgent
is used beyond its suggested purposes, being treated like aGenServer
. -
Example: As shown next,
Agent
andTask
are abstractions to create processes with specialized purposes. In contrast,GenServer
is a more generic abstraction used to create processes for many different purposes:Agent
: As Elixir works on the principle of immutability, by default no value is shared between multiple places of code, enabling read and write as in a global variable. AnAgent
is a simple process abstraction focused on solving this limitation, enabling processes to share state.Task
: This process abstraction is used when we only need to execute some specific action asynchronously, often in an isolated way, without communication with other processes.GenServer
: This is the most generic process abstraction. The main benefit of this abstraction is explicitly segregating the server and the client roles, thus providing a better API for the organization of processes communication. Besides that, aGenServer
can also encapsulate state (like anAgent
), provide sync and async calls (like aTask
), and more.
Examples of this code smell appear when
Agents
orTasks
are used for general purposes and not only for specialized ones such as their documentation suggests. To illustrate some smell occurrences, we will cite two specific situations. 1) When aTask
is used not only to async execute an action, but also to frequently exchange messages with other processes; 2) When anAgent
, beside sharing some global value between processes, is also frequently used to execute isolated tasks that are not of interest to other processes. -
Refactoring: When an
Agent
orTask
goes beyond its suggested use cases and becomes painful, it is better to refactor it into aGenServer
.
Agent Obsession
-
Category: Design-related smell.
-
Problem: In Elixir, an
Agent
is a process abstraction focused on sharing information between processes by means of message passing. It is a simple wrapper around shared information, thus facilitating its read and update from any place in the code. The use of anAgent
to share information is not a code smell in itself; however, when the responsibility for interacting directly with anAgent
is spread across the entire system, this can be problematic. This bad practice can increase the difficulty of code maintenance and make the code more prone to bugs. -
Example: The following code seeks to illustrate this smell. The responsibility for interacting directly with the
Agent
is spread across four different modules (i.e,A
,B
,C
, andD
).defmodule A do #... def update(pid) do #... Agent.update(pid, fn _list -> 123 end) #... end end
defmodule B do #... def update(pid) do #... Agent.update(pid, fn content -> %{a: content} end) #... end end
defmodule C do #... def update(pid) do #... Agent.update(pid, fn content -> [:atom_value | [content]] end) #... end end
defmodule D do #... def get(pid) do #... Agent.get(pid, fn content -> content end) #... end end
This spreading of responsibility can generate duplicated code and make code maintenance more difficult. Also, due to the lack of control over the format of the shared data, complex composed data can be shared. This freedom to use any format of data is dangerous and can induce developers to introduce bugs.
# start an agent with initial state of an empty list iex(1)> {:ok, agent} = Agent.start_link fn -> [] end {:ok, #PID<0.135.0>} # many data format (i.e., List, Map, Integer, Atom) are # combined through direct access spread across the entire system iex(2)> A.update(agent) iex(3)> B.update(agent) iex(4)> C.update(agent) # state of shared information iex(5)> D.get(agent) [:atom_value, %{a: 123}]
-
Refactoring: Instead of spreading direct access to an
Agent
over many places in the code, it is better to refactor this code by centralizing the responsibility for interacting with anAgent
in a single module. This refactoring improves the maintainability by removing duplicated code; it also allows you to limit the accepted format for shared data, reducing bug-proneness. As shown below, the moduleKV.Bucket
is centralizing the responsibility for interacting with theAgent
. Any other place in the code that needs to access shared data must now delegate this action toKV.Bucket
. Also,KV.Bucket
now only allows data to be shared inMap
format.defmodule KV.Bucket do use Agent @doc """ Starts a new bucket. """ def start_link(_opts) do Agent.start_link(fn -> %{} end) end @doc """ Gets a value from the `bucket` by `key`. """ def get(bucket, key) do Agent.get(bucket, &Map.get(&1, key)) end @doc """ Puts the `value` for the given `key` in the `bucket`. """ def put(bucket, key, value) do Agent.update(bucket, &Map.put(&1, key, value)) end end
The following are examples of how to delegate access to shared data (provided by an
Agent
) toKV.Bucket
.# start an agent through a `KV.Bucket` iex(1)> {:ok, bucket} = KV.Bucket.start_link(%{}) {:ok, #PID<0.114.0>} # add shared values to the keys `milk` and `beer` iex(2)> KV.Bucket.put(bucket, "milk", 3) iex(3)> KV.Bucket.put(bucket, "beer", 7) # accessing shared data of specific keys iex(4)> KV.Bucket.get(bucket, "beer") 7 iex(5)> KV.Bucket.get(bucket, "milk") 3
These examples are based on code written in Elixir's official documentation. Source: link
Unsupervised process
-
Category: Design-related smell.
-
Problem: In Elixir, creating a process outside a supervision tree is not a code smell in itself. However, when code creates a large number of long-running processes outside a supervision tree, this can make visibility and monitoring of these processes difficult, preventing developers from fully controlling their applications.
-
Example: The following code example seeks to illustrate a library responsible for maintaining a numerical
Counter
through aGenServer
process outside a supervision tree. Multiple counters can be created simultaneously by a client (one process for each counter), making these unsupervised processes difficult to manage. This can cause problems with the initialization, restart, and shutdown of a system.defmodule Counter do use GenServer @moduledoc """ Global counter implemented through a GenServer process outside a supervision tree. """ @doc """ Function to create a counter. initial_value: any integer value. pid_name: optional parameter to define the process name. Default is Counter. """ def start(initial_value, pid_name \\ __MODULE__) when is_integer(initial_value) do GenServer.start(__MODULE__, initial_value, name: pid_name) end @doc """ Function to get the counter's current value. pid_name: optional parameter to inform the process name. Default is Counter. """ def get(pid_name \\ __MODULE__) do GenServer.call(pid_name, :get) end @doc """ Function to changes the counter's current value. Returns the updated value. value: amount to be added to the counter. pid_name: optional parameter to inform the process name. Default is Counter. """ def bump(value, pid_name \\ __MODULE__) do GenServer.call(pid_name, {:bump, value}) get(pid_name) end ## Callbacks @impl true def init(counter) do {:ok, counter} end @impl true def handle_call(:get, _from, counter) do {:reply, counter, counter} end def handle_call({:bump, value}, _from, counter) do {:reply, counter, counter + value} end end #...Use examples... iex(1)> Counter.start(0) {:ok, #PID<0.115.0>} iex(2)> Counter.get() 0 iex(3)> Counter.start(15, C2) {:ok, #PID<0.120.0>} iex(4)> Counter.get(C2) 15 iex(5)> Counter.bump(-3, C2) 12 iex(6)> Counter.bump(7) 7
-
Refactoring: To ensure that clients of a library have full control over their systems, regardless of the number of processes used and the lifetime of each one, all processes must be started inside a supervision tree. As shown below, this code uses a
Supervisor
link as a supervision tree. When this Elixir application is started, two different counters (Counter
andC2
) are also started as child processes of theSupervisor
namedApp.Supervisor
. Both are initialized with zero. By means of this supervision tree, it is possible to manage the lifecycle of all child processes (e.g., stopping or restarting each one), improving the visibility of the entire app.defmodule SupervisedProcess.Application do use Application @impl true def start(_type, _args) do children = [ # The counters are Supervisor children started via Counter.start(0). %{ id: Counter, start: {Counter, :start, [0]} }, %{ id: C2, start: {Counter, :start, [0, C2]} } ] opts = [strategy: :one_for_one, name: App.Supervisor] Supervisor.start_link(children, opts) end end #...Use examples... iex(1)> Supervisor.count_children(App.Supervisor) %{active: 2, specs: 2, supervisors: 0, workers: 2} iex(2)> Counter.get(Counter) 0 iex(3)> Counter.get(C2) 0 iex(4)> Counter.bump(7, Counter) 7 iex(5)> Supervisor.terminate_child(App.Supervisor, Counter) iex(6)> Supervisor.count_children(App.Supervisor) %{active: 1, specs: 2, supervisors: 0, workers: 2} #only one active iex(7)> Counter.get(Counter) #Error because it was previously terminated ** (EXIT) no process: the process is not alive... iex(8)> Supervisor.restart_child(App.Supervisor, Counter) iex(9)> Counter.get(Counter) #after the restart, this process can be accessed again 0
These examples are based on codes written in Elixir's official documentation. Source: link
Large messages
-
Category: Design-related smell.
-
Note: Formerly known as "Large messages between processes".
-
Problem: In Elixir, processes run in an isolated manner, often concurrently with other Elixir. Communication between different processes is performed via message passing. The exchange of messages between processes is not a code smell in itself; however, when processes exchange messages, their contents are copied between them. For this reason, if a huge structure is sent as a message from one process to another, the sender can become blocked, compromising performance. If these large message exchanges occur frequently, the prolonged and frequent blocking of processes can cause a system to behave anomalously.
-
Example: The following code is composed of two modules which will each run in a different process. As the names suggest, the
Sender
module has a function responsible for sending messages from one process to another (i.e.,send_msg/3
). TheReceiver
module has a function to create a process to receive messages (i.e.,create/0
) and another one to handle the received messages (i.e.,run/0
). If a huge structure, such as a list with 1_000_000 different values, is sent frequently fromSender
toReceiver
, the impacts of this smell could be felt.defmodule Receiver do @doc """ Function for receiving messages from processes. """ def run do receive do {:msg, msg_received} -> msg_received {_, _} -> "won't match" end end @doc """ Create a process to receive a message. Messages are received in the run() function of Receiver. """ def create do spawn(Receiver, :run, []) end end
defmodule Sender do @doc """ Function for sending messages between processes. pid_receiver: message recipient. msg: messages of any type and size can be sent. id_msg: used by receiver to decide what to do when a message arrives. Default is the atom :msg """ def send_msg(pid_receiver, msg, id_msg \\ :msg) do send(pid_receiver, {id_msg, msg}) end end
Examples of large messages between processes:
iex(1)> pid = Receiver.create #PID<0.144.0> #Simulating a message with large content - List with length 1_000_000 iex(2)> msg = %{from: inspect(self()), to: inspect(pid), content: Enum.to_list(1..1_000_000)} iex(3)> Sender.send_msg(pid, msg) {:msg, %{ content: [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, ...], from: "#PID<0.105.0>", to: "#PID<0.144.0>" }}
This example is based on a original code by Samuel Mullen. Source: link
Unrelated multi-clause function
-
Category: Design-related smell.
-
Note: Formerly known as "Complex multi-clause function".
-
Problem: Using multi-clause functions in Elixir, to group functions of the same name, is not a code smell in itself. However, due to the great flexibility provided by this programming feature, some developers may abuse the number of guard clauses and pattern matches to group unrelated functionality.
-
Example: A recurrent example of abusive use of the multi-clause functions is when we’re trying to mix too much-unrelated business logic into the function definitions. This makes it difficult to read and understand the logic involved in the functions, which may impair code maintainability. Some developers use documentation mechanisms such as
@doc
annotations to compensate for poor code readability, but unfortunately, with a multi-clause function, we can only use these annotations once per function name, particularly on the first or header function. As shown next, all other variations of the function need to be documented only with comments, a mechanism that cannot automate tests, leaving the code prone to bugs.@doc """ Update sharp product with 0 or empty count ## Examples iex> Namespace.Module.update(...) expected result... """ def update(%Product{count: nil, material: material}) when material in ["metal", "glass"] do # ... end # update blunt product def update(%Product{count: count, material: material}) when count > 0 and material in ["metal", "glass"] do # ... end # update animal... def update(%Animal{count: 1, skin: skin}) when skin in ["fur", "hairy"] do # ... end
-
Refactoring: As shown below, a possible solution to this smell is to break the business rules that are mixed up in a single unrelated multi-clause function in several different simple functions. Each function can have a specific
@doc
, describing its behavior and parameters received. While this refactoring sounds simple, it can have a lot of impact on the function's current clients, so be careful!@doc """ Update sharp product ## Parameter struct: %Product{...} ## Examples iex> Namespace.Module.update_sharp_product(%Product{...}) expected result... """ def update_sharp_product(struct) do # ... end @doc """ Update blunt product ## Parameter struct: %Product{...} ## Examples iex> Namespace.Module.update_blunt_product(%Product{...}) expected result... """ def update_blunt_product(struct) do # ... end @doc """ Update animal ## Parameter struct: %Animal{...} ## Examples iex> Namespace.Module.update_animal(%Animal{...}) expected result... """ def update_animal(struct) do # ... end
This example is based on a original code by Syamil MJ (@syamilmj). Source: link
Complex extractions in clauses
-
Category: Design-related smell.
-
Note: This smell was suggested by the community via issues (#9).
-
Problem: When we use multi-clause functions, it is possible to extract values in the clauses for further usage and for pattern matching/guard checking. This extraction itself does not represent a code smell, but when you have too many clauses or too many arguments, it becomes hard to know which extracted parts are used for pattern/guards and what is used only inside the function body. This smell is related to Unrelated multi-clause function, but with implications of its own. It impairs the code readability in a different way.
-
Example: The following code, although simple, tries to illustrate the occurrence of this code smell. The multi-clause function
drive/1
is extracting fields of an%User{}
struct for usage in the clause expression (e.g.age
) and for usage in the function body (e.g.,name
). Ideally, a function should not mix pattern matching extractions for usage in its clauses expressions and also in the function body.def drive(%User{name: name, age: age}) when age >= 18 do "#{name} can drive" end def drive(%User{name: name, age: age}) when age < 18 do "#{name} cannot drive" end
While the example is small and looks like a clear code, try to imagine a situation where
drive/1
was more complex, having many more clauses, arguments, and extractions. This is the really smelly code! -
Refactoring: As shown below, a possible solution to this smell is to extract only pattern/guard related variables in the signature once you have many arguments or multiple clauses:
def drive(%User{age: age} = user) when age >= 18 do %User{name: name} = user "#{name} can drive" end def drive(%User{age: age} = user) when age < 18 do %User{name: name} = user "#{name} cannot drive" end
This example and the refactoring are proposed by José Valim (@josevalim)
Using exceptions for control-flow
-
Category: Design-related smell.
-
Note: Formerly known as "Exceptions for control-flow".
-
Problem: This smell refers to code that forces developers to handle exceptions for control-flow. Exception handling itself does not represent a code smell, but this should not be the only alternative available to developers to handle an error in client code. When developers have no freedom to decide if an error is exceptional or not, this is considered a code smell.
-
Example: An example of this code smell, as shown below, is when a library (e.g.
MyModule
) forces its clients to usetry .. rescue
statements to capture and evaluate errors. This library does not allow developers to decide if an error is exceptional or not in their applications.defmodule MyModule do def janky_function(value) do if is_integer(value) do #... "Result..." else raise RuntimeError, message: "invalid argument. Is not integer!" end end end
defmodule Client do # Client forced to use exceptions for control-flow. def foo(arg) do try do value = MyModule.janky_function(arg) "All good! #{value}." rescue e in RuntimeError -> reason = e.message "Uh oh! #{reason}." end end end #...Use examples... iex(1)> Client.foo(1) "All good! Result...." iex(2)> Client.foo("lucas") "Uh oh! invalid argument. Is not integer!."
-
Refactoring: Library authors should guarantee that clients are not required to use exceptions for control-flow in their applications. As shown below, this can be done by refactoring the library
MyModule
, providing two versions of the function that forces clients to use exceptions for control-flow (e.g.,janky_function
). 1) a version with the raised exceptions should have the same name as the smelly one, but with a trailing!
(i.e.,janky_function!
); 2) Another version, without raised exceptions, should have a name identical to the original version (i.e.,janky_function
), and should return the result wrapped in a tuple.defmodule MyModule do @moduledoc """ Refactored library """ @doc """ Refactored version without exceptions for control-flow. """ def janky_function(value) do if is_integer(value) do #... {:ok, "Result..."} else {:error, "invalid argument. Is not integer!"} end end def janky_function!(value) do case janky_function(value) do {:ok, result} -> result {:error, message} -> raise RuntimeError, message: message end end end
This refactoring gives clients more freedom to decide how to proceed in the event of errors, defining what is exceptional or not in different situations. As shown next, when an error is not exceptional, clients can use specific control-flow structures, such as the
case
statement along with pattern matching.defmodule Client do # Clients now can also choose to use control-flow structures # for control-flow when an error is not exceptional. def foo(arg) do case MyModule.janky_function(arg) do {:ok, value} -> "All good! #{value}." {:error, reason} -> "Uh oh! #{reason}." end end end #...Use examples... iex(1)> Client.foo(1) "All good! Result...." iex(2)> Client.foo("lucas") "Uh oh! invalid argument. Is not integer!."
This example is based on code written by Tim Austin neenjaw and Angelika Tyborska angelikatyborska. Source: link
Untested polymorphic behaviors
-
Category: Design-related smell.
-
Problem: This code smell refers to functions that have protocol-dependent parameters and are therefore polymorphic. A polymorphic function itself does not represent a code smell, but some developers implement these generic functions without accompanying guard clauses, allowing to pass parameters that do not implement the required protocol or that have no meaning.
-
Example: An instance of this code smell happens when a function uses
to_string()
to convert data received by parameter. The functionto_string()
uses the protocolString.Chars
for conversions. Many Elixir data types (e.g.,BitString
,Integer
,Float
,URI
) implement this protocol. However, as shown below, other Elixir data types (e.g.,Map
) do not implement it and can cause an error indasherize/1
function. Depending on the situation, this behavior can be desired or not. Besides that, it may not make sense to dasherize aURI
or a number as shown next.defmodule CodeSmells do def dasherize(data) do to_string(data) |> String.replace("_", "-") end end #...Use examples... iex(1)> CodeSmells.dasherize("Lucas_Vegi") "Lucas-Vegi" iex(2)> CodeSmells.dasherize(10) #<= Makes sense? "10" iex(3)> CodeSmells.dasherize(URI.parse("http://www.code_smells.com")) #<= Makes sense? "http://www.code-smells.com" iex(4)> CodeSmells.dasherize(%{last_name: "vegi", first_name: "lucas"}) ** (Protocol.UndefinedError) protocol String.Chars not implemented for %{first_name: "lucas", last_name: "vegi"} of type Map
-
Refactoring: There are two main alternatives to improve code affected by this smell. 1) You can either remove the protocol use (i.e.,
to_string/1
), by adding multi-clauses ondasherize/1
or just remove it; or 2) You can document thatdasherize/1
uses the protocolString.Chars
for conversions, showing its consequences. As shown next, we refactored using the first alternative, removing the protocol and restrictingdasherize/1
parameter only to desired data types (i.e.,BitString
andAtom
). Besides that, we use@doc
to validatedasherize/1
for desired inputs and to document the behavior to some types that we think don't make sense for the function (e.g.,Integer
andURI
).defmodule CodeSmells do @doc """ Function that converts underscores to dashes. ## Parameter data: only BitString and Atom are supported. ## Examples iex> CodeSmells.dasherize(:lucas_vegi) "lucas-vegi" iex> CodeSmells.dasherize("Lucas_Vegi") "Lucas-Vegi" iex> CodeSmells.dasherize(%{last_name: "vegi", first_name: "lucas"}) ** (FunctionClauseError) no function clause matching in CodeSmells.dasherize/1 iex> CodeSmells.dasherize(URI.parse("http://www.code_smells.com")) ** (FunctionClauseError) no function clause matching in CodeSmells.dasherize/1 iex> CodeSmells.dasherize(10) ** (FunctionClauseError) no function clause matching in CodeSmells.dasherize/1 """ def dasherize(data) when is_atom(data) do dasherize(Atom.to_string(data)) end def dasherize(data) when is_binary(data) do String.replace(data, "_", "-") end end #...Use examples... iex(1)> CodeSmells.dasherize(:lucas_vegi) "lucas-vegi" iex(2)> CodeSmells.dasherize("Lucas_Vegi") "Lucas-Vegi" iex(3)> CodeSmells.dasherize(10) ** (FunctionClauseError) no function clause matching in CodeSmells.dasherize/1
This example is based on code written by José Valim (@josevalim). Source: link
Code organization by process
-
Category: Design-related smell.
-
Problem: This smell refers to code that is unnecessarily organized by processes. A process itself does not represent a code smell, but it should only be used to model runtime properties (e.g., concurrency, access to shared resources, event scheduling). When a process is used for code organization, it can create bottlenecks in the system.
-
Example: An example of this code smell, as shown below, is a library that implements arithmetic operations (e.g., add, subtract) by means of a
GenSever
processlink. If the number of calls to this single process grows, this code organization can compromise the system performance, therefore becoming a bottleneck.defmodule Calculator do use GenServer @moduledoc """ Calculator that performs two basic arithmetic operations. This code is unnecessarily organized by a GenServer process. """ @doc """ Function to perform the sum of two values. """ def add(a, b, pid) do GenServer.call(pid, {:add, a, b}) end @doc """ Function to perform subtraction of two values. """ def subtract(a, b, pid) do GenServer.call(pid, {:subtract, a, b}) end def init(init_arg) do {:ok, init_arg} end def handle_call({:add, a, b}, _from, state) do {:reply, a + b, state} end def handle_call({:subtract, a, b}, _from, state) do {:reply, a - b, state} end end # Start a generic server process iex(1)> {:ok, pid} = GenServer.start_link(Calculator, :init) {:ok, #PID<0.132.0>} #...Use examples... iex(2)> Calculator.add(1, 5, pid) 6 iex(3)> Calculator.subtract(2, 3, pid) -1
-
Refactoring: In Elixir, as shown next, code organization must be done only by modules and functions. Whenever possible, a library should not impose specific behavior (such as parallelization) on its clients. It is better to delegate this behavioral decision to the developers of clients, thus increasing the potential for code reuse of a library.
defmodule Calculator do def add(a, b) do a + b end def subtract(a, b) do a - b end end #...Use examples... iex(1)> Calculator.add(1, 5) 6 iex(2)> Calculator.subtract(2, 3) -1
This example is based on code provided in Elixir's official documentation. Source: link
Large code generation by macros
-
Category: Design-related smell.
-
Note: This smell was suggested by the community via issues (#13).
-
Problem: This code smell is related to
macros
that generate too much code. When amacro
provides a large code generation, it impacts how the compiler or the runtime works. The reason for this is that Elixir may have to expand, compile, and execute a code multiple times, which will make compilation slower. -
Example: The code shown below is an example of this smell. Imagine you are defining a router for a web application, where you could have macros like
get/2
. On every invocation of the macro, which can be hundreds, the code insideget/2
will be expanded and compiled, which can generate a large volume of code in total.defmodule Routes do ... defmacro get(route, handler) do quote do route = unquote(route) handler = unquote(handler) if not is_binary(route) do raise ArgumentError, "route must be a binary" end if not is_atom(handler) do raise ArgumentError, "route must be a module" end @store_route_for_compilation {route, handler} end end end
-
Refactoring: To remove this code smell, the developer must simplify the
macro
, delegating to other functions part of its work. As shown below, by encapsulating in the function__define__/3
the functionality pre-existing inside thequote
, we reduce the code that is expanded and compiled on every invocation of themacro
, and instead we dispatch to a function to do the bulk of the work.defmodule Routes do ... defmacro get(route, handler) do quote do Routes.__define__(__MODULE__, unquote(route), unquote(handler)) end end def __define__(module, route, handler) do if not is_binary(route) do raise ArgumentError, "route must be a binary" end if not is_atom(handler) do raise ArgumentError, "route must be a module" end Module.put_attribute(module, :store_route_for_compilation, {route, handler}) end end
This example and the refactoring are proposed by José Valim (@josevalim)
Data manipulation by migration
-
Category: Design-related smell.
-
Problem: This code smell refers to modules that perform both data and structural changes in a database schema via
Ecto.Migration
link. Migrations must be used exclusively to modify a database schema over time (e.g., by including or excluding columns and tables). When this responsibility is mixed with data manipulation code, the module becomes less cohesive, more difficult to test, and therefore more prone to bugs. -
Example: An example of this code smell is when an
Ecto.Migration
is used simultaneously to alter a table, adding a new column to it, and also to update all pre-existing data in that table, assigning a value to this new column. As shown below, in addition to adding theis_custom_shop
column in theguitars
table, thisEcto.Migration
changes the value of this column for some specific guitar models.defmodule GuitarStore.Repo.Migrations.AddIsCustomShopToGuitars do use Ecto.Migration import Ecto.Query alias GuitarStore.Inventory.Guitar alias GuitarStore.Repo @doc """ A function that modifies the structure of table "guitars", adding column "is_custom_shop" to it. By default, all data pre-stored in this table will have the value false stored in this new column. Also, this function updates the "is_custom_shop" column value of some guitar models to true. """ def change do alter table("guitars") do add :is_custom_shop, :boolean, default: false end create index("guitars", ["is_custom_shop"]) custom_shop_entries() |> Enum.map(&update_guitars/1) end @doc """ A function that updates values of column "is_custom_shop" to true. """ defp update_guitars({make, model, year}) do from(g in Guitar, where: g.make == ^make and g.model == ^model and g.year == ^year, select: g ) |> Repo.update_all(set: [is_custom_shop: true]) end @doc """ Function that defines which guitar models that need to have the values of the "is_custom_shop" column updated to true. """ defp custom_shop_entries() do [ {"Gibson", "SG", 1999}, {"Fender", "Telecaster", 2020} ] end end
You can run this smelly migration above by going to the root of your project and typing the next command via console:
mix ecto.migrate
-
Refactoring: To remove this code smell, it is necessary to separate the data manipulation in a
mix task
link different from the module that performs the structural changes in the database viaEcto.Migration
. This separation of responsibilities is a best practice for increasing code testability. As shown below, the moduleAddIsCustomShopToGuitars
now useEcto.Migration
only to perform structural changes in the database schema:defmodule GuitarStore.Repo.Migrations.AddIsCustomShopToGuitars do use Ecto.Migration @doc """ A function that modifies the structure of table "guitars", adding column "is_custom_shop" to it. By default, all data pre-stored in this table will have the value false stored in this new column. """ def change do alter table("guitars") do add :is_custom_shop, :boolean, default: false end create index("guitars", ["is_custom_shop"]) end end
Furthermore, the new mix task
PopulateIsCustomShop
, shown next, has only the responsibility to perform data manipulation, thus improving testability:defmodule Mix.Tasks.PopulateIsCustomShop do @shortdoc "Populates is_custom_shop column" use Mix.Task import Ecto.Query alias GuitarStore.Inventory.Guitar alias GuitarStore.Repo @requirements ["app.start"] def run(_) do custom_shop_entries() |> Enum.map(&update_guitars/1) end defp update_guitars({make, model, year}) do from(g in Guitar, where: g.make == ^make and g.model == ^model and g.year == ^year, select: g ) |> Repo.update_all(set: [is_custom_shop: true]) end defp custom_shop_entries() do [ {"Gibson", "SG", 1999}, {"Fender", "Telecaster", 2020} ] end end
You can run this
mix task
above by typing the next command via console:mix populate_is_custom_shop
This example is based on code originally written by Carlos Souza. Source: link
Using App Configuration for libraries
-
Category: Design-related smells.
-
Note: Formerly known as "App configuration for code libs".
-
Problem: The
Application Environment
link is a mechanism that can be used to parameterize values that will be used in several different places in a system implemented in Elixir. This parameterization mechanism can be very useful and therefore is not considered a code smell by itself. However, whenApplication Environments
are used as a mechanism for configuring a library's functions, this can make these functions less flexible, making it impossible for a library-dependent application to reuse its functions with different behaviors in different places in the code. Libraries are created to foster code reuse, so this limitation imposed by this parameterization mechanism can be problematic in this scenario. -
Example: The
DashSplitter
module represents a library that configures the behavior of its functions through the globalApplication Environment
mechanism. These configurations are concentrated in theconfig/config.exs
file, shown below:import Config config :app_config, parts: 3 import_config "#{config_env()}.exs"
One of the functions implemented by the
DashSplitter
library issplit/1
. This function has the purpose of separating a string received via parameter into a certain number of parts. The character used as a separator insplit/1
is always"-"
and the number of parts the string is split into is defined globally by theApplication Environment
. This value is retrieved by thesplit/1
function by callingApplication.fetch_env!/2
, as shown next:defmodule DashSplitter do def split(string) when is_binary(string) do parts = Application.fetch_env!(:app_config, :parts) # <= retrieve parameterized value String.split(string, "-", parts: parts) # <= parts: 3 end end
Due to this type of parameterized value used by the
DashSplitter
library, all applications dependent on it can only use thesplit/1
function with identical behavior in relation to the number of parts generated by string separation. Currently, this value is equal to 3, as we can see in the use examples shown below:iex(1)> DashSplitter.split("Lucas-Francisco-Vegi") ["Lucas", "Francisco", "Vegi"] iex(2)> DashSplitter.split("Lucas-Francisco-da-Matta-Vegi") ["Lucas", "Francisco", "da-Matta-Vegi"]
-
Refactoring: To remove this code smell and make the library more adaptable and flexible, this type of configuration must be performed via parameters in function calls. The code shown below performs the refactoring of the
split/1
function by adding a new optional parameter of typeKeyword list
. With this new parameter it is possible to modify the default behavior of the function at the time of its call, allowing multiple different ways of usingsplit/2
within the same application:defmodule DashSplitter do def split(string, opts \\ []) when is_binary(string) and is_list(opts) do parts = Keyword.get(opts, :parts, 2) # <= default config of parts == 2 String.split(string, "-", parts: parts) end end #...Use examples... iex(1)> DashSplitter.split("Lucas-Francisco-da-Matta-Vegi", [parts: 5]) ["Lucas", "Francisco", "da", "Matta", "Vegi"] iex(2)> DashSplitter.split("Lucas-Francisco-da-Matta-Vegi") #<= default config is used! ["Lucas", "Francisco-da-Matta-Vegi"]
These examples are based on code provided in Elixir's official documentation. Source: link
Compile-time global configuration
-
Category: Design-related smells.
-
Note: Formerly known as "Compile-time app configuration".
-
Problem: As explained in the description of Using App Configuration for libraries, the
Application Environment
can be used to parameterize values in an Elixir system. Although it is not a good practice to use this mechanism in the implementation of libraries, sometimes this can be unavoidable. If these parameterized values are assigned tomodule attributes
, it can be especially problematic. Asmodule attribute
values are defined at compile-time, when trying to assignApplication Environment
values to these attributes, warnings or errors can be triggered by Elixir. This happens because, when defining module attributes at compile time, theApplication Environment
is not yet available in memory. -
Example: The
DashSplitter
module represents a library. This module has an attribute@parts
that has its constant value defined at compile-time by callingApplication.fetch_env!/2
. Thesplit/1
function, implemented by this library, has the purpose of separating a string received via parameter into a certain number of parts. The character used as a separator insplit/1
is always"-"
and the number of parts the string is split into is defined by the module attribute@parts
, as shown next:defmodule DashSplitter do @parts Application.fetch_env!(:app_config, :parts) # <= define module attribute # at compile-time def split(string) when is_binary(string) do String.split(string, "-", parts: @parts) #<= reading from a module attribute end end
Due to this compile-time configuration based on the
Application Environment
mechanism, Elixir can raise warnings or errors, as shown next, during compilation:warning: Application.fetch_env!/2 is discouraged in the module body, use Application.compile_env/3 instead... ** (ArgumentError) could not fetch application environment :parts for application :app_config because the application was not loaded nor configured
-
Refactoring: To remove this code smell, when it is really unavoidable to use the
Application Environment
mechanism to configure library functions, this should be done at runtime and not during compilation. That is, instead of callingApplication.fetch_env!(:app_config, :parts)
at compile-time to set@parts
, this function must be called at runtime withinsplit/1
. This will mitigate the risk thatApplication Environment
is not yet available in memory when it is necessary to use it. Another possible refactoring, as shown below, is to replace the use of theApplication.fetch_env!/2
function to define@parts
, with theApplication.compile_env/3
. The third parameter ofApplication.compile_env/3
defines a default value that is returned whenever thatApplication Environment
is not available in memory during the definition of@parts
. This prevents Elixir from raising an error at compile-time:defmodule DashSplitter do @parts Application.compile_env(:app_config, :parts, 3) # <= default value 3 prevents an error! def split(string) when is_binary(string) do String.split(string, "-", parts: @parts) #<= reading from a module attribute end end
These examples are based on code provided in Elixir's official documentation. Source: link
-
Remark: This code smell can be detected by Credo, a static code analysis tool. During its checks, Credo raises this warning when this smell is found.
"Use" instead of "import"
-
Category: Design-related smells.
-
Note: Formerly known as "Dependency with "use" when an "import" is enough".
-
Problem: Elixir has mechanisms such as
import
,alias
, anduse
to establish dependencies between modules. Establishing dependencies allows a module to call functions from other modules, facilitating code reuse. A code implemented with these mechanisms does not characterize a smell by itself; however, while theimport
andalias
directives have lexical scope and only facilitate that a module to use functions of another, theuse
directive has a broader scope, something that can be problematic. Theuse
directive allows a module to inject any type of code into another, including propagating dependencies. In this way, using theuse
directive makes code readability worse, because to understand exactly what will happen when it references a module, it is necessary to have knowledge of the internal details of the referenced module. -
Example: The code shown below is an example of this smell. Three different modules were defined --
ModuleA
,Library
, andClientApp
.ClientApp
is reusing code from theLibrary
via theuse
directive, but is unaware of its internal details. Therefore, whenLibrary
is referenced byClientApp
, it injects intoClientApp
all the content present in its__using__/1
macro. Due to the decreased readability of the code and the lack of knowledge of the internal details of theLibrary
,ClientApp
defines a local functionfoo/0
. This will generate a conflict asModuleA
also has a functionfoo/0
; whenClientApp
referencedLibrary
via theuse
directive, it has a dependency forModuleA
propagated to itself:defmodule ModuleA do def foo do "From Module A" end end
defmodule Library do defmacro __using__(_opts) do quote do import ModuleA # <= propagating dependencies! def from_lib do "From Library" end end end def from_lib do "From Library" end end
defmodule ClientApp do use Library def foo do "Local function from client app" end def from_client_app do from_lib() <> " - " <> foo() end end
When we try to compile
ClientApp
, Elixir will detect the conflict and throw the following error:iex(1)> c("client_app.ex") ** (CompileError) client_app.ex:4: imported ModuleA.foo/0 conflicts with local function
-
Refactoring: To remove this code smell, it may be possible to replace
use
withalias
orimport
when creating a dependency between an application and a library. This will make code behavior clearer, due to improved readability. In the following code,ClientApp
was refactored in this way, and with that, the conflict as previously shown no longer exists:defmodule ClientApp do import Library def foo do "Local function from client app" end def from_client_app do from_lib() <> " - " <> foo() end end #...Uses example... iex(1)> ClientApp.from_client_app() "From Library - Local function from client app"
These examples are based on code provided in Elixir's official documentation. Source: link
Low-level concerns smells
Low-level concerns smells are more simple than design-related smells and affect a small part of the code. Next, all 9 different smells classified as low-level concerns are explained and exemplified:
Working with invalid data
-
Category: Low-level concerns smells.
-
Problem: This code smell refers to a function that does not validate its parameters' types and therefore can produce internal non-predicted behavior. When an error is raised inside a function due to an invalid parameter value, this can confuse the developers and make it harder to locate and fix the error.
-
Example: An example of this code smell is when a function receives an invalid parameter and then passes it to a function from a third-party library. This will cause an error (raised deep inside the library function), which may be confusing for the developer who is working with invalid data. As shown next, the function
foo/1
is a client of a third-party library and doesn't validate its parameters at the boundary. In this way, it is possible that invalid data will be passed fromfoo/1
to the library, causing a mysterious error.defmodule MyApp do alias ThirdPartyLibrary, as: Library def foo(invalid_data) do #...some code... Library.sum(1, invalid_data) #...some code... end end #...Use examples... # with valid data is ok iex(1)> MyApp.foo(2) 3 #with invalid data cause a confusing error deep inside iex(2)> MyApp.foo("Lucas") ** (ArithmeticError) bad argument in arithmetic expression: 1 + "Lucas" :erlang.+(1, "Lucas") library.ex:3: ThirdPartyLibrary.sum/2
-
Refactoring: To remove this code smell, client code must validate input parameters at the boundary with the user, via guard clauses or pattern matching. This will prevent errors from occurring deeply, making them easier to understand. This refactoring will also allow libraries to be implemented without worrying about creating internal protection mechanisms. The next code illustrates the refactoring of
foo/1
, removing this smell:defmodule MyApp do alias ThirdPartyLibrary, as: Library def foo(data) when is_integer(data) do #...some code... Library.sum(1, data) #...some code... end end #...Use examples... #with valid data is ok iex(1)> MyApp.foo(2) 3 # with invalid data errors are easy to locate and fix iex(2)> MyApp.foo("Lucas") ** (FunctionClauseError) no function clause matching in MyApp.foo/1 The following arguments were given to MyApp.foo/1: # 1 "Lucas" my_app.ex:6: MyApp.foo/1
This example is based on code provided in Elixir's official documentation. Source: link
Complex branching
-
Category: Low-level concerns smell.
-
Note: Formerly known as "Complex API error handling".
-
Problem: When a function assumes the responsibility of handling multiple errors alone, it can increase its cyclomatic complexity (metric of control-flow) and become incomprehensible. This situation can configure a specific instance of "Long function", a traditional code smell, but has implications of its own. Under these circumstances, this function could get very confusing, difficult to maintain and test, and therefore bug-proneness.
-
Example: An example of this code smell is when a function uses the
case
control-flow structure or other similar constructs (e.g.,cond
, orreceive
) to handle multiple variations of response types returned by the same API endpoint. This practice can make the function more complex, long, and difficult to understand, as shown next.def get_customer(customer_id) do case get("/customers/#{customer_id}") do {:ok, %Tesla.Env{status: 200, body: body}} -> {:ok, body} {:ok, %Tesla.Env{body: body}} -> {:error, body} {:error, _} = other -> other end end
Although
get_customer/1
is not really long in this example, it could be. Thinking about this more complex scenario, where a large number of different responses can be provided to the same endpoint, is not a good idea to concentrate all on a single function. This is a risky scenario, where a little typo, or any problem introduced by the programmer in handling a response type, could eventually compromise the handling of all responses from the endpoint (if the function raises an exception, for example). -
Refactoring: As shown below, in this situation, instead of concentrating all handlings within the same function, creating a complex branching, it is better to delegate each branch (handling of a response type) to a different private function. In this way, the code will be cleaner, more concise, and readable.
def get_customer(customer_id) when is_integer(customer_id) do case get("/customers/#{customer_id}") do {:ok, %Tesla.Env{status: 200, body: body}} -> success_api_response(body) {:ok, %Tesla.Env{body: body}} -> x_error_api_response(body) {:error, _} = other -> y_error_api_response(other) end end defp success_api_response(body) do {:ok, body} end defp x_error_api_response(body) do {:error, body} end defp y_error_api_response(other) do other end
While this example of refactoring
get_customer/1
might seem quite more verbose than the original code, remember to imagine a scenario whereget_customer/1
is responsible for handling a number much larger than three different types of possible responses. This is the smelly scenario!This example is based on code written by Zack MrDoops and Dimitar Panayotov dimitarvp. Source: link. We got suggestions from José Valim (@josevalim) on the refactoring.
Complex else clauses in with
-
Category: Low-level concerns smell.
-
Note: This smell was suggested by the community via issues (#7).
-
Problem: This code smell refers to
with
statements that flatten all its error clauses into a single complexelse
block. This situation is harmful to the code readability and maintainability because difficult to know from which clause the error value came. -
Example: An example of this code smell, as shown below, is a function
open_decoded_file/1
that read a base 64 encoded string content from a file and returns a decoded binary string. This function uses awith
statement that needs to handle two possible errors, all of which are concentrated in a single complexelse
block.def open_decoded_file(path) do with {:ok, encoded} <- File.read(path), {:ok, value} <- Base.decode64(encoded) do value else {:error, _} -> :badfile :error -> :badencoding end end
-
Refactoring: As shown below, in this situation, instead of concentrating all error handlings within a single complex
else
block, it is better to normalize the return types in specific private functions. In this way, due to its organization, the code will be cleaner and more readable.def open_decoded_file(path) do with {:ok, encoded} <- file_read(path), {:ok, value} <- base_decode64(encoded) do value end end defp file_read(path) do case File.read(path) do {:ok, contents} -> {:ok, contents} {:error, _} -> :badfile end end defp base_decode64(contents) do case Base.decode64(contents) do {:ok, contents} -> {:ok, contents} :error -> :badencoding end end
This example and the refactoring are proposed by José Valim (@josevalim)
Alternative return types
-
Category: Low-level concerns smell.
-
Note: This smell was suggested by the community via issues (#6).
-
Problem: This code smell refers to functions that receive options (e.g.,
keyword list
) parameters that drastically change its return type. Because options are optional and sometimes set dynamically, if they change the return type it may be hard to understand what the function actually returns. -
Example: An example of this code smell, as shown below, is when a library (e.g.
AlternativeInteger
) has a multi-clause functionparse/2
with many alternative return types. Depending on the options received as a parameter, the function will have a different return type.defmodule AlternativeInteger do def parse(string, opts) when is_list(opts) do case opts[:discard_rest] do true -> #only an integer value convert from string parameter _ -> #another return type (e.g., tuple) end end def parse(string, opts \\ :default) do #another return type (e.g., tuple) end end #...Use examples... iex(1)> AlternativeInteger.parse("13") {13, "..."} iex(2)> AlternativeInteger.parse("13", discard_rest: true) 13 iex(3)> AlternativeInteger.parse("13", discard_rest: false) {13, "..."}
-
Refactoring: To refactor this smell, as shown next, it's better to add in the library a specific function for each return type (e.g.,
parse_no_rest/1
), no longer delegating this to an options parameter.defmodule AlternativeInteger do def parse_no_rest(string) do #only an integer value convert from string parameter end def parse(string) do #another return type (e.g., tuple) end end #...Use examples... iex(1)> AlternativeInteger.parse("13") {13, "..."} iex(2)> AlternativeInteger.parse_no_rest("13") 13
This example and the refactoring are proposed by José Valim (@josevalim)
Accessing non-existent Map/Struct fields
-
Category: Low-level concerns smells.
-
Note: Formerly known as "Map/struct dynamic access".
-
Problem: In Elixir, it is possible to access values from
Maps
, which are key-value data structures, either strictly or dynamically. When trying to dynamically access the value of a key from aMap
, if the informed key does not exist, a null value (nil
) will be returned. This return can be confusing and does not allow developers to conclude whether the key is non-existent in theMap
or just has no bound value. In this way, this code smell may cause bugs in the code. -
Example: The code shown below is an example of this smell. The function
plot/1
tries to draw a graphic to represent the position of a point in a cartesian plane. This function receives a parameter ofMap
type with the point attributes, which can be a point of a 2D or 3D cartesian coordinate system. To decide if a point is 2D or 3D, this function uses dynamic access to retrieve values of theMap
keys:defmodule Graphics do def plot(point) do #...some code... # Dynamic access to use point values {point[:x], point[:y], point[:z]} #...some code... end end #...Use examples... iex(1)> point_2d = %{x: 2, y: 3} %{x: 2, y: 3} iex(2)> point_3d = %{x: 5, y: 6, z: nil} %{x: 5, y: 6, z: nil} iex(3)> Graphics.plot(point_2d) {2, 3, nil} # <= ambiguous return iex(4)> Graphics.plot(point_3d) {5, 6, nil}
As can be seen in the example above, even when the key
:z
does not exist in theMap
(point_2d
), dynamic access returns the valuenil
. This return can be dangerous because of its ambiguity. It is not possible to conclude from it whether theMap
has the key:z
or not. If the function relies on the return value to make decisions about how to plot a point, this can be problematic and even cause errors when testing the code. -
Refactoring: To remove this code smell, whenever a
Map
has keys ofAtom
type, replace the dynamic access to its values per strict access. When a non-existent key is strictly accessed, Elixir raises an error immediately, allowing developers to find bugs faster. The next code illustrates the refactoring ofplot/1
, removing this smell:defmodule Graphics do def plot(point) do #...some code... # Strict access to use point values {point.x, point.y, point.z} #...some code... end end #...Use examples... iex(1)> point_2d = %{x: 2, y: 3} %{x: 2, y: 3} iex(2)> point_3d = %{x: 5, y: 6, z: nil} %{x: 5, y: 6, z: nil} iex(3)> Graphics.plot(point_2d) ** (KeyError) key :z not found in: %{x: 2, y: 3} # <= explicitly warns that graphic.ex:6: Graphics.plot/1 # <= the z key does not exist! iex(4)> Graphics.plot(point_3d) {5, 6, nil}
As shown below, another alternative to refactor this smell is to replace a
Map
with astruct
(named map). By default, structs only support strict access to values. In this way, accesses will always return clear and objective results:defmodule Point do @enforce_keys [:x, :y] defstruct [x: nil, y: nil] end #...Use examples... iex(1)> point = %Point{x: 2, y: 3} %Point{x: 2, y: 3} iex(2)> point.x # <= strict access to use point values 2 iex(3)> point.z # <= trying to access a non-existent key ** (KeyError) key :z not found in: %Point{x: 2, y: 3} iex(4)> point[:x] # <= by default, struct does not support dynamic access ** (UndefinedFunctionError) ... (Point does not implement the Access behaviour)
These examples are based on code written by José Valim (@josevalim). Source: link
Speculative Assumptions
-
Category: Low-level concerns smells.
-
Note: Formerly known as "Unplanned value extraction".
-
Problem: Overall, Elixir application’s are composed of many supervised processes, so the effects of an error will be localized in a single process, not propagating to the entire application. A supervisor will detect the failing process, and restart it at that level. For this type of design to behave well, it's important that problematic code crashes when it fails to fulfill its purpose. However, some code may have undesired behavior making many assumptions we have not really planned for, such as being able to return incorrect values instead of forcing a crash. These speculative assumptions can give a false impression that the code is working correctly.
-
Example: The code shown below is an example of this smell. The function
get_value/2
tries to extract a value from a specific key of a URL query string. As it is not implemented using pattern matching,get_value/2
always returns a value, regardless of the format of the URL query string passed as a parameter in the call. Sometimes the returned value will be valid; however, if a URL query string with an unexpected format is used in the call,get_value/2
will extract incorrect values from it:defmodule Extract do @doc """ Extract value from a key in a URL query string. """ def get_value(string, desired_key) do parts = String.split(string, "&") Enum.find_value(parts, fn pair -> key_value = String.split(pair, "=") Enum.at(key_value, 0) == desired_key && Enum.at(key_value, 1) end) end end #...Use examples... # URL query string according to with the planned format - OK! iex(1)> Extract.get_value("name=Lucas&university=UFMG&lab=ASERG", "lab") "ASERG" iex(2)> Extract.get_value("name=Lucas&university=UFMG&lab=ASERG", "university") "UFMG" # Unplanned URL query string format - Unplanned value extraction! iex(3)> Extract.get_value("name=Lucas&university=institution=UFMG&lab=ASERG", "university") "institution" # <= why not "institution=UFMG"? or only "UFMG"?
-
Refactoring: To remove this code smell,
get_value/2
can be refactored through the use of pattern matching. So, if an unexpected URL query string format is used, the function will be crash instead of returning an invalid value. This behavior, shown below, will allow clients to decide how to handle these errors and will not give a false impression that the code is working correctly when unexpected values are extracted:defmodule Extract do @doc """ Extract value from a key in a URL query string. Refactored by using pattern matching. """ def get_value(string, desired_key) do parts = String.split(string, "&") Enum.find_value(parts, fn pair -> [key, value] = String.split(pair, "=") # <= pattern matching key == desired_key && value end) end end #...Use examples... # URL query string according to with the planned format - OK! iex(1)> Extract.get_value("name=Lucas&university=UFMG&lab=ASERG", "name") "Lucas" # Unplanned URL query string format - Crash explaining the problem to the client! iex(2)> Extract.get_value("name=Lucas&university=institution=UFMG&lab=ASERG", "university") ** (MatchError) no match of right hand side value: ["university", "institution", "UFMG"] extract.ex:7: anonymous fn/2 in Extract.get_value/2 # <= left hand: [key, value] pair iex(3)> Extract.get_value("name=Lucas&university&lab=ASERG", "university") ** (MatchError) no match of right hand side value: ["university"] extract.ex:7: anonymous fn/2 in Extract.get_value/2 # <= left hand: [key, value] pair
These examples are based on code written by José Valim (@josevalim). Source: link
Modules with identical names
-
Category: Low-level concerns smells.
-
Problem: This code smell is related to possible module name conflicts that can occur when a library is implemented. Due to a limitation of the Erlang VM (BEAM), also used by Elixir, only one instance of a module can be loaded at a time. If there are name conflicts between more than one module, they will be considered the same by BEAM and only one of them will be loaded. This can cause unwanted code behavior.
-
Example: The code shown below is an example of this smell. Two different modules were defined with identical names (
Foo
). When BEAM tries to load both simultaneously, only the module defined in the file (module_two.ex
) stay loaded, redefining the current version ofFoo
(module_one.ex
) in memory. That makes it impossible to callfrom_module_one/0
, for example:defmodule Foo do @moduledoc """ Defined in `module_one.ex` file. """ def from_module_one do "Function from module one!" end end
defmodule Foo do @moduledoc """ Defined in `module_two.ex` file. """ def from_module_two do "Function from module two!" end end
When BEAM tries to load both simultaneously, the name conflict causes only one of them to stay loaded:
iex(1)> c("module_one.ex") [Foo] iex(2)> c("module_two.ex") warning: redefining module Foo (current version defined in memory) module_two.ex:1 [Foo] iex(3)> Foo.from_module_two() "Function from module two!" iex(4)> Foo.from_module_one() # <= impossible to call due to name conflict ** (UndefinedFunctionError) function Foo.from_module_one/0 is undefined...
-
Refactoring: To remove this code smell, a library must standardize the naming of its modules, always using its own name as a prefix (namespace) for all its module's names (e.g.,
LibraryName.ModuleName
). When a module file is within subdirectories of a library, the names of the subdirectories must also be used in the module naming (e.g.,LibraryName.SubdirectoryName.ModuleName
). In the refactored code shown below, this module naming pattern was used. For this, theFoo
module, defined in the filemodule_two.ex
, was also moved to theutils
subdirectory. This refactoring, in addition to eliminating the internal conflict of names within the library, will prevent the occurrence of name conflicts with client code:defmodule MyLibrary.Foo do @moduledoc """ Defined in `module_one.ex` file. Name refactored! """ def from_module_one do "Function from module one!" end end
defmodule MyLibrary.Utils.Foo do @moduledoc """ Defined in `module_two.ex` file. Name refactored! """ def from_module_two do "Function from module two!" end end
When BEAM tries to load them simultaneously, both will stay loaded successfully:
iex(1)> c("module_one.ex") [MyLibrary.Foo] iex(2)> c("module_two.ex") [MyLibrary.Utils.Foo] iex(3)> MyLibrary.Foo.from_module_one() "Function from module one!" iex(4)> MyLibrary.Utils.Foo.from_module_two() "Function from module two!"
This example is based on the description provided in Elixir's official documentation. Source: link
Unnecessary macros
-
Category: Low-level concerns smells.
-
Problem:
Macros
are powerful meta-programming mechanisms that can be used in Elixir to extend the language. While implementingmacros
is not a code smell in itself, this meta-programming mechanism should only be used when absolutely necessary. Whenever a macro is implemented, and it was possible to solve the same problem using functions or other pre-existing Elixir structures, the code becomes unnecessarily more complex and less readable. Becausemacros
are more difficult to implement and understand, their indiscriminate use can compromise the evolution of a system, reducing its maintainability. -
Example: The code shown below is an example of this smell. The
MyMath
module implements thesum/2
macro to perform the sum of two numbers received as parameters. While this code has no syntax errors and can be executed correctly to get the desired result, it is unnecessarily more complex. By implementing this functionality as a macro rather than a conventional function, the code became less clear and less objective:defmodule MyMath do defmacro sum(v1, v2) do quote do unquote(v1) + unquote(v2) end end end #...Use examples... iex(1)> require MyMath MyMath iex(2)> MyMath.sum(3, 5) 8 iex(3)> MyMath.sum(3+1, 5+6) 15
-
Refactoring: To remove this code smell, the developer must replace the unnecessary macro with structures that are simpler to write and understand, such as named functions. The code shown below is the result of the refactoring of the previous example. Basically, the
sum/2
macro has been transformed into a conventional named function. Note that therequire
command is no longer needed:defmodule MyMath do def sum(v1, v2) do # <= macro became a named function! v1 + v2 end end #...Use examples... # No need to require anymore! iex(1)> MyMath.sum(3, 5) 8 iex(2)> MyMath.sum(3+1, 5+6) 15
This example is based on the description provided in Elixir's official documentation. Source: link
Dynamic atom creation
-
Category: Low-level concerns smells.
-
Note: This smell emerged from a study with mining software repositories (MSR).
-
Problem: An
atom
is a basic data type of Elixir whose value is its own name. They are often useful to identify resources or to express the state of an operation. The creation of anatom
do not characterize a smell by itself; however,atoms
are not collected by Elixir's Garbage Collector, so values of this type live in memory while an application is executing, during its entire lifetime. Also, BEAM limit the number ofatoms
that can exist in an application (1_048_576
) and eachatom
has a maximum size limited to 255 Unicode code points. For these reasons, the dynamic atom creation is considered a code smell, since in this way the developer has no control over how manyatoms
will be created during the execution of the application. This unpredictable scenario can expose an app to unexpected behavior caused by excessive memory usage, or even by reaching the maximum number ofatoms
possible. -
Example: The code shown below is an example of this smell. Imagine that you are implementing a code that performs the conversion of
string
values intoatoms
to identify resources. Thesestrings
can come from user input or even have been received as response from requests to an API. As this is a dynamic and unpredictable scenario, it is possible for identicalstrings
to be converted into newatoms
that are repeated unnecessarily. This kind of conversion, in addition to wasting memory, can be problematic for an application if it happens too often.defmodule Identifier do ... def generate(id) when is_bitstring(id) do String.to_atom(id) #<= dynamic atom creation!! end end #...Use examples... iex(1)> string_from_user_input = "my_id" "my_id" iex(2)> string_from_API_response = "my_id" "my_id" iex(3)> Identifier.generate(string_from_user_input) :my_id iex(4)> Identifier.generate(string_from_API_response) :my_id #<= atom repeated was created!
When we use the
String.to_atom/1
function to dynamically create anatom
, it is created regardless of whether there is already another one with the same value in memory, so when this happens automatically, we will not have control over meeting the limits established by BEAM. -
Refactoring: To remove this smell, as shown below, first you must ensure that all the identifier
atoms
are created statically, only once, at the beginning of an application's execution:# statically created atoms... _ = :my_id _ = :my_id2 _ = :my_id3 _ = :my_id4
Next, you should replace the use of the
String.to_atom/1
function with theString.to_existing_atom/1
function. This will allow string-to-atom conversions to just map the strings to atoms already in memory (statically created at the beginning of the execution), thus preventing repeatedatoms
from being created dynamically. This second part of the refactoring is presented below.defmodule Identifier do ... def generate(id) when is_bitstring(id) do String.to_existing_atom(id) #<= just maps a string to an existing atom! end end #...Use examples... iex(1)> Identifier.generate("my_id") :my_id iex(2)> Identifier.generate("my_id2") :my_id2 iex(3)> Identifier.generate("non_existent_id") ** (ArgumentError) errors were found at the given arguments: * 1st argument: not an already existing atom
Note that in the third use example, when a
string
different from an already existingatom
is given, Elixir shows an error instead of performing the conversion. This demonstrates that this refactoring creates a more controlled and predictable scenario for the application in terms of memory usage.This example and the refactoring are based on the Elixir's official documentation. Sources: 1, 2
About
This catalog was proposed by Lucas Vegi and Marco Tulio Valente, from ASERG/DCC/UFMG.
For more info see the following paper:
- Code Smells in Elixir: Early Results from a Grey Literature Review, International Conference on Program Comprehension (ICPC), 2022. [slides] [video] [podcast (pt-BR) - English subtitles available]
Please feel free to make pull requests and suggestions (Issues tab).
Acknowledgments
We are supported by FinbitsTM, a Brazilian Elixir-based fintech:
Our research is also part of the initiative called Research with Elixir (in portuguese).