Component
The Component Library makes it easy to create simple servers. It is an attempt to make it so easy to write trivial standalone servers that people will just naturally split their applications up that way.
A component is a simple module, containing what look like function definitions. This library generates from it an API module, a GenServer module, and an implementation module.
⚠ Developer Health Warning ⚠
The component library is a work in progress. It seems to work, but it is not yet battle tested. As people play with it, we'll end up making changes to fix problems and add cool facilities. Please experiment with it. But don't bet your business on it.
🗺 README Roadmap
Sometimes you want your palate to be tempted. Sometimes you just want to eat.
The first part of this README is the motivation for this library. It's a quick read, but feel free to skip it if you're looking for the main course.
Still here? Cool. Here's a story…
Let's Grow a Service
Monday starts with a new user story. The UI folks want to keep a list of which users get "page not found" responses from our app. Someone else is modifying the controller chain: our job is to record the data.
You decide to implement a simple map where the keys are the user IDs and the values are a list of the URLs that 404'd for that user.
defmodule FourOhFour do
def create() do
%{}
end
def record_404(history, user, url) do
Map.update(history, user, [ url ], &[ url | &1 ])
end
def for_user(history, user) do
Map.get(history, user, [])
end
end
You're a thoughtful developer: you decided that the users of your module shouldn't have to know about its internal state, so you provided a create function that returns the initial empty map.
You submit the PR, and the reviewers come back with "where's the GenServer?". You refrain from the obvious "you never mentioned it should be a server" and instead modify your module:
defmodule FourOhFour do
use GenServer
def start_link() do
GenServer.start_link(__MODULE__, %{})
end
def record_404(pid, user, url) do
GenServer.cast(pid, { :record_404, user, url })
end
def for_user(pid, user) do
GenServer.call(pid, { :for_user, user })
end
def init(empty_history) do
{ :ok, empty_history }
end
def handle_cast({ :record_404, user, url }, history) do
new_history = Map.update(history, user, [ url ], &[ url | &1 ])
{ :noreply, new_history }
end
def handle_call({ :for_user, user }, _from, history) do
result = Map.get(history, user, [])
{ :reply, result, history }
end
end
This is the canonical Elixir GenServer, drawn straight from the original Erlang. You've always felt uncomfortable with the way it intermixes the API, the implementation, and all the housekeeping, but everyone does it that way....
Another day, another code review. Someone just realized that there's only one instance of this 404 store, so we can make it a named process and stop having to pass the pid around. You sigh and fire up the editor:
defmodule FourOhFour do
use GenServer
@me __MODULE__
def start_link(_) do
GenServer.start_link(__MODULE__, %{}, name: @me)
end
def record_404(user, url) do
GenServer.cast(@me, { :record_404, user, url })
end
def for_user(user) do
GenServer.call(@me, { :for_user, user })
end
def init(empty_history) do
{ :ok, empty_history }
end
def handle_cast({ :record_404, user, url }, history) do
new_history = Map.update(history, user, [ url ], &[ url | &1 ])
{ :noreply, new_history }
end
def handle_call({ :for_user, user }, _from, history) do
result = Map.get(history, user, [])
{ :reply, result, history }
end
end
That's something else that's always bugged you: the way the API code has to change even though we just changed the implementation. Oh well....
A month later the project lead for a different application comes over. "We really like the results that folks are seeing with your 404 logging," she says. "Can you turn it into a standalone Elixir application so we can include it as a dependency?"
You start to work on your resumé.
The Start of a Moral
That's a lot of code churn. And none of it involved the actual logic of the module; it was all the surrounding boilerplate that changed.
Clearly, this is the kind of stuff we do all the time, and the changes are so minor that we just shrug them off as a cost of doing business.
But I think the real cost has nothing to do with writing all those handle_xxx functions. Instead, the cost is in the way we think about our code.
When we come to write something in Elixir, we're forced to answer two questions at the same time: how does it work, and how does it run? What's the logic, and what's the lifecycle? And we have to know both before we start. Switching lifecycle models has a (small) cost, and that means we try to guess it right up front. Changing from a library module to a server is fairly mechanical, but it still doubles the size of the code. And changing from a server to a free-standing component is a fairly big deal.
An aside: Application/Project/Component/Service/...?
Elixir has unfortunately adopted some of the bad naming history from Erlang. As a result, we have words such as project and application that can mean many different things, even within the same codebase.
I'm proposing we clarify things. Let's call the thing created when we run mix new a component. A component is an entity that can be shared and deployed. It has its own set of dependencies and configuration. It can be stored in its own source control repository or hex project (although it needn't be).
When we create something that delivers business value, we package together a number of components. One of these is nominated to be the code's entry point (using mod: in mix.exs). Let's call this thing that we built an assembly.
Back to the story...
We all know that highly coupled code is hard to change, and that the need to accommodate change is why we spend time thinking about good design. If we came from a Rails background, we've heard stories of (or participated in) Monorail projects: single Rails applications with hundreds of classes, tens or hundreds of thousands of lines of code, and a dependency map that looks like the wad of hair you pull out of the shower drain.
Rails apps get that way because it's easier to add new code into the existing code base than to split it out as a separate entity.
It's convenience over conscience.
I see a lot of evidence that we're falling into the same habits in the Elixir world. I've seen many multi-thousand line modules. I've rarely seen a Phoenix app where the developers have implemented the business logic in other, free-standing apps. (I don't count the things in umbrella apps as being free-standing: firstly because the individual components are not sharable, and secondly because the fact that all the code is in one place tempts developers to just call randomly between the child apps.)
So the Component library is an attempt to start an exploration of alternatives. It's a first try at a framework that guides us to think of our code as self-contained components. It does this by making components as easy to write and use as any other code.
Components and the 404 Logger
Let's go back to the original 404 component. The initial implementation stays the same:
defmodule FourOhFour do
def create() do
%{}
end
def record_404(history, user, url) do
Map.update(history, user, [ url ], &[ url | &1 ])
end
def for_user(history, user) do
Map.get(history, user, [])
end
end
Now someone says they want it to be a server. We use the component framework to add all the boilerplate for us:
defmodule FourOhFour do
use Component.Strategy.Dynamic,
state_name: :history,
initial_state: %{}
one_way record_404(history, user, url) do
Map.update(history, user, [ url ], &[ url | &1 ])
end
two_way for_user(history, user) do
Map.get(history, user, [])
end
end
The use Component... stuff says that this module is a GenServer (by default named the same as the module). The variable history is used to pass around the state, and the initial value of the state for each server we create is the empty map. We start its supervisor running with
FourOhFour.initialize()
and create new server processes with
FourOhFour.create()
The only other change to the original is that we changed the def of the record_404 function to be one_way, and the def of for_user to be two_way.
A one-way function's prime job is to update state. Its return value becomes the new state of our server. It is implemented under the covers using a GenServer cast.
A two-way function returns a value (and so is a GenServer call). Its return value is what is given back to the caller of the API. If you don't need to update state, that's all you have to do. If you do need to change the state as well as return a value, you can do that as well.
Now the second code review asks for this to become a singleton named server. We sigh at the magnitude of the request and change the code:
defmodule FourOhFour do
use Component.Strategy.Global,
state_name: :history,
initial_state: %{}
one_way record_404(history, user, url) do
Map.update(history, user, [ url ], &[ url | &1 ])
end
two_way for_user(history, user) do
Map.get(history, user, [])
end
end
Yup: the only change is to use the Global strategy.
Finally, we're asked to make this into an independent component. That's also a simple change:
defmodule FourOhFour do
use Component.Strategy.Global,
state_name: :history,
initial_state: %{},
top_level: true
one_way record_404(history, user, url) do
Map.update(history, user, [ url ], &[ url | &1 ])
end
two_way for_user(history, user) do
Map.get(history, user, [])
end
end
The top_level: true parameter adds Application behaviour to this module and adds a top-level supervisor. Just add mod: FourOhFour to your mix.exs and your 404 logger will be started automatically when it is included in any other assembly.
So...
Using the Component library has changed the way I write Elixir. I now break my code into lots of small components (each an Elixir/Erlang application). I then assemble these together using regular dependencies. (During development, when things are fluid, I use path dependencies. Later I may change these to git dependencies. I could also use hex.)
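To make the path-then-git progression concrete, here is a rough sketch of what the mix.exs of an assembly might look like. The app names, the relative path, and the repository URL are all placeholders I made up for illustration; only the path: and git: dependency options are standard Mix.

```elixir
# mix.exs of a hypothetical assembly that pulls in the 404 logger component
defmodule MyAssembly.MixProject do
  use Mix.Project

  def project do
    [app: :my_assembly, version: "0.1.0", deps: deps()]
  end

  defp deps do
    [
      # During development: the component lives in a sibling directory.
      {:four_oh_four, path: "../four_oh_four"}
      # Later, the same component from source control (placeholder URL):
      # {:four_oh_four, git: "https://example.com/four_oh_four.git"}
    ]
  end
end
```

Switching between the two forms only touches this one line; the code that calls the component stays the same.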
I'd like to encourage you to think about your code the same way, as assemblies of simple components.
I'd also like to hear your feedback. This is just an experiment: it's the starting point of an ongoing discussion. For now, let's use the issues list for this.
I'll consider all this as time well spent if we manage to get people thinking about how they structure applications.
And they all lived happily ever after.
The Details
Component Types
We support a number of component types:
- global: a singleton process
- dynamic: on-demand processes
- pooled: a pool of processes that typically represent limited resources
- hungry: a pool of processes that process a collection in parallel
Global Components
A global component runs as a singleton process, accessed by name. All calls to it are resolved to this single process, and the state is persisted across calls. A logging facility might be implemented as a global component.
Here's a global component that stores a list of words in its state, exporting a function that returns a random word.
defmodule Dictionary do
use Component.Strategy.Global,
state_name: :word_list,
initial_state: read_word_list()
two_way random_word() do # <- this is the externally accessible interface
word_list |> Enum.random()
end
# helper
defp read_word_list() do
"../assets/words.txt"
|> Path.expand(__DIR__)
|> File.read!
|> String.split("\n", trim: true)
end
end
To get it running, you call
Dictionary.create()
Then, anywhere in the application, you can get a random word using
word = Dictionary.random_word()
Dynamic Components
A dynamic component is a factory that creates worker processes on demand. The workers run the code declared in the component's module. Each worker maintains its own state. When you're done with a worker, you destroy it. You could create a worker when someone first connects to your web app, and use it to maintain that person's state for the lifetime of their session.
Here's a dynamic component that implements a set of counters:
defmodule Counter do
use Component.Strategy.Dynamic,
state_name: :count,
initial_state: 0
one_way increment(by \\ 1) do
count + by
end
two_way value() do
count
end
end
Because the dynamic component has multiple workers, you must first initialize the overall component. This is a one-time thing:
Counter.initialize()
Whenever you need a new counter, you first create it. You then call its functions:
acc1 = Counter.create
acc2 = Counter.create
Counter.increment(acc1, 2)
Counter.value(acc1) #=> 2
Counter.value(acc2) #=> 0
Pooled Components
A pooled component represents a pool of worker processes. When you call a pooled worker, it handles your request using its existing state, and any updates to that state are retained: the worker is a resource that is shared on a call-by-call basis. Workers may be automatically created and destroyed as demand dictates. You might use pooled workers to manage access to limited resources (database connections are a common example).
defmodule StockQuoteConnection do
use Component.Strategy.Pooled,
state_name: :quote_connection,
initial_state: Quotes.connect_to_service()
two_way get_quote(symbol) do
Quotes.get_quote(quote_connection, symbol)
end
end
Pooled resources are always called transactionally, so there's no need to create a worker. You still have to initialize the component, though.
StockQuoteConnection.initialize()
values = pmap(symbols, &StockQuoteConnection.get_quote(&1))
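The pmap call above is not part of Elixir's standard library and isn't defined in the snippet. Here is a minimal sketch, assuming it simply means "map in parallel and collect the results in order". The module name ParallelSketch is hypothetical, and String.downcase/1 stands in for the real get_quote call.

```elixir
defmodule ParallelSketch do
  # Hypothetical helper: run `fun` on every element in its own task,
  # then wait for each result, preserving the input order.
  def pmap(collection, fun) do
    collection
    |> Enum.map(fn item -> Task.async(fn -> fun.(item) end) end)
    |> Enum.map(&Task.await/1)
  end
end

# String.downcase/1 is a stand-in for StockQuoteConnection.get_quote/1
ParallelSketch.pmap(["AAPL", "MSFT"], &String.downcase/1)
#=> ["aapl", "msft"]
```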
Hungry Components
A hungry component defines a way to process a collection, where the processing of items in the collection is automatically parallelized.
defmodule FaceRecognizer do
use Component.Strategy.Hungry
def process(%JPeg{ image: image }) do
image |> jpeg_to_bitmap |> Vision.recognize_face()
end
def process(%PNG{ image: image }) do
image |> png_to_bitmap |> Vision.recognize_face()
end
end
Unlike the other components, you define the action to be taken on a member of the collection by writing a function called process. This can use pattern matching and guard clauses to vary the behaviour depending on the value passed in.
You invoke the hungry component using
people = FaceRecognizer.consume(collection_of_images)
By default, the results are returned as a list, where each entry is the value of applying the processing to the corresponding value in the input collection. You can override this by providing an into: parameter.
contacts = ContactCollection.new
people = FaceRecognizer.consume(collection_of_images, into: contacts)
A hungry consumer will normally run a worker process for each of the process schedulers available on the current node (which is normally the number of available CPUs). You can override this globally for a particular consumer with the default_concurrency option:
defmodule FaceRecognizer do
use Component.Strategy.Hungry,
default_concurrency: 10
. . .
You can also override it on a particular call to consume using the concurrency: option.
people = FaceRecognizer.consume(collection_of_images, concurrency: 5)
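If you want a feel for the behaviour without the library, the standard library's Task.async_stream gives a rough analogue. This is not how the Hungry strategy is implemented; it's just a sketch of the same idea (per-item processing, results in input order, concurrency defaulting to the number of schedulers). HungrySketch and its process clauses are made up for illustration.

```elixir
defmodule HungrySketch do
  # Hypothetical stand-ins for a hungry component's `process/1` clauses.
  def process(n) when is_integer(n), do: n * n
  def process(s) when is_binary(s), do: String.upcase(s)

  # Process each item in parallel, keeping results in input order.
  # Concurrency defaults to the number of online schedulers.
  def consume(collection, opts \\ []) do
    concurrency = Keyword.get(opts, :concurrency, System.schedulers_online())

    collection
    |> Task.async_stream(&process/1, max_concurrency: concurrency)
    |> Enum.map(fn {:ok, result} -> result end)
  end
end

HungrySketch.consume([1, 2, "three"], concurrency: 2)
#=> [1, 4, "THREE"]
```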
Choosing a Component Type
It's all about the state. Shared state.
If you don't share your state with anybody then good news, you don't need processes and you don't need this library (for now). You will live a happier life than the rest of us.
Is there a single state shared between all users of your component (for example, it acts as a registry, logger, or other singleton resource)? If so, you need a global component.
Does your component maintain state across multiple calls, and do you need multiple versions of that state? For example, are you representing a user session, or the state of games being played? If so, use a dynamic component, where each component maintains state for the session/game/....
Do you have a limited set of external resources that you need to share across your application (for example, database connections, access to rate-limited services, and so on)? If so, use a pooled component, where each component's state represents one of the external resources, and each time you call a component you claim that resource for the duration of the call.
Do you have work that needs doing against a collection of data (for example, analyzing a bunch of images, reducing a large amount of data statistically, or other large-scale data mapping operations)? If so, use the hungry component, which holds no state between calls.
One and Two Way Functions
A component defines its external interface using the one_way and two_way declarations. These look and behave precisely like functions defined using def, except they do not support guard clauses.
As its name implies, a one way function does not send a response to its caller. It is also asynchronous. (Internally, it is implemented using GenServer.cast.) The return value of a one_way function is the updated state.
A two way function returns a result to its caller, and so is synchronous (yup, it uses GenServer.call).
By default, the value returned by a two way function is the value returned to the caller. In this case, the state is not changed.
You update the state using one of the set_state functions. The first form takes the new state and a block as parameters. It sets the state from the first parameter, and the value returned by the block becomes the value returned by the function. For example:
# return the current value, and increment the state
two_way return_current_and_update(n) do
set_state(tally + n) do
tally
end
end
The second variant is set_state_and_return. This takes a single value and sets both the state and return value from it:
# increment the current state and return the new value
two_way update_and_return(n) do
set_state_and_return(tally + n)
end
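In plain GenServer terms, I'd expect these two forms to correspond roughly to handle_call clauses whose reply and new state either differ or coincide. The following hand-written sketch is my assumption about the equivalent behaviour, not the code the library generates; TallySketch and its message names are hypothetical.

```elixir
defmodule TallySketch do
  use GenServer

  def start_link(initial), do: GenServer.start_link(__MODULE__, initial)

  # API
  def return_current_and_update(pid, n), do: GenServer.call(pid, {:current_and_update, n})
  def update_and_return(pid, n), do: GenServer.call(pid, {:update_and_return, n})

  @impl true
  def init(tally), do: {:ok, tally}

  # Like `set_state(tally + n) do tally end`: reply with the old value,
  # store the new one.
  @impl true
  def handle_call({:current_and_update, n}, _from, tally) do
    {:reply, tally, tally + n}
  end

  # Like `set_state_and_return(tally + n)`: reply with the new value
  # and store it as the state.
  def handle_call({:update_and_return, n}, _from, tally) do
    new_tally = tally + n
    {:reply, new_tally, new_tally}
  end
end

{:ok, pid} = TallySketch.start_link(10)
TallySketch.return_current_and_update(pid, 5) #=> 10 (state is now 15)
TallySketch.update_and_return(pid, 1)         #=> 16
```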
State
With the exception of hungry consumers, all component types run one or more worker processes, and those workers maintain state.
The Component library makes you use the same name for this state in all your one_way and two_way functions. This name is state by default, but can be changed using the state_name: option.
defmodule Dictionary do
use Component.Strategy.Global,
state_name: :word_list, # <- our state is called `word_list`
initial_state: read_word_list()
two_way random_word(word_list) do
word_list |> Enum.random() # <- and we can refer to it by name
end
defp read_word_list() do
"../assets/words.txt"
|> Path.expand(__DIR__)
|> File.read!
|> String.split("\n", trim: true)
end
end
Controversy Trigger Alert!
People with a strong abhorrence of magic should skip the next section.
Because you declare the name to be used as the state variable, you can omit it as a parameter to one_way and two_way and the component library will add it in for you:
defmodule Dictionary do
use Component.Strategy.Global,
state_name: :word_list, # <- our state is called `word_list`
initial_state: read_word_list()
two_way random_word() do # <- no explicit parameter
word_list |> Enum.random() # but we can refer to it by name
end
# ...
end
Why would I even countenance such an evil use of the dark arts? It's because I wanted to be able to write the one- and two-way functions to reflect the way they are called and not the way they're implemented. In a global component you'd call Dictionary.random_word() with no parameter, and I wanted the code in the module to look like this.
The library doesn't mind if you include the state variable or not: it's up to you.
Initial State
The initial state of a component is set by a combination of things.
First, when you write a component, you can specify an initial state as an option. For example, the following code sets the initial state of the component to the result of reading the word list:
use Component.Strategy.Global,
state_name: :word_list,
initial_state: read_word_list() # <- run this each time a worker is created
You can override this initial state when you create a component by passing a value to create().
Second, you can specify the default initial state using a function of arity one. When you call create for such a component, the override value you give will be passed to this function, and the function's value becomes the initial state. If you don't pass an override to create, the function will receive nil.
The following component has a two element map as a state. The initial_state function allows these elements to be individually overwritten by create:
use Component.Strategy.Dynamic,
initial_state: fn overrides ->
Map.merge(
%{ one: :default_one, two: :default_two },
overrides || %{})
end
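To see what that function does with and without an override, you can bind it to a variable and call it directly in iex. This only exercises the function itself; how create passes the override along is as described in the text.

```elixir
# The same function as in the `initial_state:` option, bound to a
# variable so it can be called directly.
initial_state = fn overrides ->
  Map.merge(%{one: :default_one, two: :default_two}, overrides || %{})
end

# No override given: the function receives nil and the defaults win.
initial_state.(nil)
#=> %{one: :default_one, two: :default_two}

# Overriding a single element leaves the other at its default.
initial_state.(%{two: :custom})
#=> %{one: :default_one, two: :custom}
```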
The code associated with the initial_state option is invoked to set the state each time a new worker process is created. This evaluation is lazy. In the Dictionary example above, the read_word_list function is not called when the module is defined. Instead, the code is saved and run when each worker gets started.
The second way to set the state is when you create a worker.
defmodule Counter do
use Component.Strategy.Dynamic,
state_name: :count,
initial_state: 0
one_way increment(by \\ 1) do
count + by
end
two_way value() do
count
end
end
Here, if you call Counter.create(), the initial state will be set to 0, the value in the using clause. If instead you pass a value, such as Counter.create(99), that value will be used to set the state.
Name Scope
You can inspect the code generated by the Component library by adding the show_code: true option. Here's the code for the Counter module:
defmodule Counter do
@name Counter
def initialize() do
Component.Strategy.Dynamic.Supervisor.run(worker_module: __MODULE__.Worker, name: @name)
end
def create(override_state \\ CA.no_overrides()) do
spec = {__MODULE__.Worker, Common.derive_state(override_state, 0)}
Component.Strategy.Dynamic.Supervisor.create(@name, spec)
end
def destroy(worker) do
Component.Strategy.Dynamic.Supervisor.destroy(@name, worker)
end
nil
def increment(worker_pid, by) do
GenServer.cast(worker_pid, {:increment, by})
end
def value(worker_pid) do
GenServer.call(worker_pid, {:value}, 5000)
end
def wrapped_create() do
initialize()
end
defmodule(Worker) do
use(GenServer)
def start_link(args) do
GenServer.start_link(__MODULE__, args)
end
def init(state) do
{:ok, state}
end
def handle_cast({:increment, by}, state) do
count = state
new_state = __MODULE__.Implementation.increment(count, by)
{:noreply, new_state}
end
def handle_call({:value}, _, state) do
count = state
__MODULE__.Implementation.value(count) |> Common.create_genserver_response(state)
end
defmodule(Implementation) do
def increment(count, by) do
_ = var!(count)
count + by
end
def value(count) do
_ = var!(count)
count
end
end
end
end
Notice that we have three modules here. The top-level Counter contains the external API. The nested Worker module is the GenServer code, and the Implementation module contains the code that you wrote inside the one-way and two-way functions.
This structure reflects the way I've been writing GenServers by hand (although I put Worker and Implementation into their own files).
However, it has a side-effect. The code inside your one- and two-way functions actually executes inside its own module. As a result this code won't work:
defmodule SalesTax do
use Component.Strategy.Dynamic,
state_name: :count,
initial_state: 0
two_way calculate_tax(item, quantity) do
sales_tax_calculation(item.price, item.tax_type, quantity)
end
def sales_tax_calculation(price, tax_type, quantity) do
# ...
end
end
The problem is that the call to sales_tax_calculation happens inside the SalesTax.Implementation module and the function itself is defined in SalesTax.
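The underlying rule is plain Elixir and has nothing to do with Component: a nested module does not see the functions of its enclosing module unless the call is qualified. A small sketch, with hypothetical module names:

```elixir
defmodule ScopeSketch do
  def double(x), do: x * 2

  defmodule Implementation do
    # An unqualified `double(x)` here would not compile: functions of the
    # enclosing module are not in scope inside a nested module.
    # A fully qualified call works.
    def run(x), do: ScopeSketch.double(x)
  end
end

ScopeSketch.Implementation.run(21)
#=> 42
```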
Originally I solved this issue by automatically moving all functions defined at the top-level into the Implementation module. But I took that out after I'd used it for a while. The reason is that I found it tempted me into writing large modules containing the entire implementation. I'd add just one more wafer-thin function because it was easy.
Now I simply write all the support code in one or more separate modules. If there are only one or two of these support functions, I might just put them into a Helpers module inside the top-level:
defmodule SalesTax do
use Component.Strategy.Dynamic,
state_name: :count,
initial_state: 0
two_way calculate_tax(item, quantity) do
Helpers.sales_tax_calculation(item.price, item.tax_type, quantity)
end
defmodule Helpers do
def sales_tax_calculation(price, tax_type, quantity) do
# ...
end
end
end
However, as soon as this module threatens to become larger than a handful of lines I'll split it out into its own file.
GenServer Callbacks
Regardless of the component type, the code you write in the one_way and two_way declarations ends up running in a GenServer. The Component library takes care of the housekeeping, so you can normally just ignore all that. However, sometimes you need to be able to add code to the GenServer that Component generates for you. In particular, you may need to implement one or more of the GenServer callbacks (code_change/3, format_status/2, handle_continue/2, handle_info/2, init/1, and terminate/2).
You do this by writing this code in a callbacks block. For example, here's a simple module that reports on how many times its record_event/0 function is called in each 5 second period.
defmodule Callbacks do
use Component.Strategy.Global,
top_level: true,
show_code: true,
state_name: :count,
initial_state: 0
one_way record_event() do
count + 1
end
callbacks do
def init(s) do
:timer.send_interval(5_000, :tick)
{ :ok, s }
end
def handle_info(:tick, count) do
IO.puts "#{count} events in the last 5 seconds"
{ :noreply, 0 }
end
end
end
Component Lifecycle
A global component must be created before use. Once created, it may be accessed by simply calling the functions it contains. There is no need to identify a particular worker, as there is only one per component. A global component may be destroyed, in which case it must be recreated before being used again.
Dynamic and pooled components must be initialized. This process does not necessarily create any worker processes; it simply prepares the component for use.
With dynamic components you gain access to a worker by telling the component to create it. This returns an identifier for that worker process, which you must pass to subsequent calls to functions in the component. You should eventually destroy workers that you create.
Pooled components are automatically created when needed, so there's no need to call their create function.
Type | Initialize | Create/destroy | Call |
---|---|---|---|
Global | | ✓ | |
Dynamic | ✓ | ✓ | |
Pooled | ✓ | | |
Hungry | ✓ | | consume() |
Hungry components have no state, and do not need to be created or destroyed: this is handled automatically.
Components as Top-Level Applications
Part of the impetus for creating this was to encourage folks to write single-responsibility components, one per mix project. To make this even easier, if you have a single component in a mix project, you no longer need an application.ex. Instead:
- Add the option top_level: true to your component definition, and
- Point the mod option in your mix.exs directly at your component's module.
Here's a runnable example that implements a simple event counter: