ZenMonitor
ZenMonitor allows for the efficient monitoring of remote processes with minimal use of ERTS Distribution.
Installation
Add ZenMonitor
to your dependencies
def deps do
[
{:zen_monitor, "~> 2.1.0"}
]
end
Using ZenMonitor
ZenMonitor strives to be a drop-in replacement for Process.monitor/1
. To those ends, the
programming interface and all the complexities of how it carries out its task are simplified by a
simple unified programming interface. All the functions that the caller needs to use have
convenient delegates available in the top-level ZenMonitor
module. The interface is detailed
below.
ZenMonitor.monitor/1
This is a drop-in replacement for Process.monitor/1
when it comes to processes. It is
compatible with the various ways that Process.monitor/1
can establish monitors and will accept
one of a pid
, a name
which is the atom
that a local process is registered under, or a tuple
of {name, node}
for a registered process on a remote node. These are defined as the
ZenMonitor.destination
type.
ZenMonitor.monitor/1
returns a standard reference that can be used to demonitor
and can be
matched against the reference provided in the :DOWN
message.
Similar to Process.monitor/1
, the caller is allowed to monitor the same process multiple times,
each monitor will be provided with a unique reference and all monitors will fire :DOWN
messages
when the monitored process goes down. Even though the caller can establish multiple monitors,
ZenMonitor is designed to handle this efficiently, the only cost is an additional ETS row on the
local node and additional processing time at fan-out.
ZenMonitor.demonitor/2
This is a mostly drop-in replacement for Process.demonitor/2
when it comes to processes. The
first argument is the reference returned by ZenMonitor.monitor/1
. It accepts a list of option
atoms, but only honors the :flush
option at this time. Passing the :info
option is allowed
but has no effect, this function always returns true
.
ZenMonitor.compatibility/1
When operating in a mixed environment where some nodes are ZenMonitor compatible and some are not,
it may be necessary to check the compatibility of a remote node. ZenMonitor.compatibility/1
accepts any ZenMonitor.destination
and will report back one of :compatible
or :incompatible
for the remote's cached compatibility status.
All remotes start off as :incompatible
until a positively acknowledged connection is
established. See the ZenMonitor.connect/1
function for more information on connecting nodes.
ZenMonitor.compatibility_for_node/1
Performs the same operation as ZenMonitor.compatibility/1
but it accepts a node atom instead of
a ZenMonitor.destination
.
ZenMonitor.connect/1
Attempts a positive connection with the provided remote node. Connections are established by
using the @gen_module
's call/4
method to send a :ping
message to the process registered
under the atom ZenMonitor.Proxy
on the remote. If this process responds with a :pong
atom
then the connection is positively established and the node is marked as :compatible
. Any other
response or error condition (timeout / noproc / etc) will be considered negative acknowledgement.
ZenMonitor.connect/1
is actually a delegate for ZenMonitor.Local.Connector.connect/1
see the
documentation there for more information about how connect behaves.
Handling Down Messages
Any :DOWN
message receivers (most commonly GenServer.handle_info/2
callbacks) that match on
the reason should be updated to include an outer {:zen_monitor, original_match}
wrapper.
def handle_info({:DOWN, ref, :process, pid, :specific_reason}, state) do
...
end
Should be updated to the following.
def handle_info({:DOWN, ref, :process, pid, {:zen_monitor, :specific_reason}}, state) do
...
end
Why?
ZenMonitor
was developed at Discord to improve the stability of our
real-time communications infrastructure. ZenMonitor
improves stability in a couple of
different ways.
Traffic Calming
When a process is being monitored by a large number of remote processes, that process going down can cause both the node hosting the downed process and the node hosting the monitoring processes to be suddenly flooded with an large amount of work. This is commonly referred to as a thundering herd and can overwhelm either node depending on the situation.
ZenMonitor relies on interval batching and GenStage
to help calm the deluge into a throttled
stream of :DOWN
messages that may take more wall clock time to process but has more predictable
scheduler utilization and network consumption.
Message Interspersing
In the inverse scenario, a single process monitoring a large number of remote processes, a
systemic failure of a large number of monitored processes can result in blocking the message
queue. This can cause other messages being sent to the process to backup behind the :DOWN
messages.
Here's what a message queue might look like if 100,000 monitors fired due to node failure.
+------------------------------------------------+
| {:DOWN, ref, :process, pid_1, :nodedown} |
+------------------------------------------------+
| {:DOWN, ref, :process, pid_2, :nodedown} |
+------------------------------------------------+
... snip 99,996 messages ...
+------------------------------------------------+
| {:DOWN, ref, :process, pid_99_999, :nodedown} |
+------------------------------------------------+
| {:DOWN, ref, :process, pid_100_000, :nodedown} |
+------------------------------------------------+
| :work |
+------------------------------------------------+
| :work |
+------------------------------------------------+
| :work |
+------------------------------------------------+
... etc ...
The process has to process the 100,000 :DOWN
messages before it can get back to doing work, if
the processing of a :DOWN
message is non-trivial then this could result in the process
effectively appearing unresponsive to callers expecting it to do :work
.
ZenMonitor.Local.Dispatcher
provides a configurable batch sweeping system that dispatches a
fixed demand_amount of :DOWN
messages every demand_interval (See the documentation for
ZenMonitor.Local.Dispatcher
for configuration and defaults). Using ZenMonitor
the message
queue would look like this.
+------------------------------------------------+
| {:DOWN, ref, :process, pid_1, :nodedown} |
+------------------------------------------------+
... snip 4,998 messages ...
+------------------------------------------------+
| {:DOWN, ref, :process, pid_5000, :nodedown} |
+------------------------------------------------+
| :work |
+------------------------------------------------+
... snip messages during demand_interval ...
+------------------------------------------------+
| :work |
+------------------------------------------------+
| {:DOWN, ref, :process, pid_5001, :nodedown} |
+------------------------------------------------+
... snip 4,998 messages ...
+------------------------------------------------+
| {:DOWN, ref, :process, pid_10_000, :nodedown} |
+------------------------------------------------+
| :work |
+------------------------------------------------+
... snip messages during demand_interval ...
+------------------------------------------------+
| :work |
+------------------------------------------------+
... etc ...
This means that the process can continue processing work messages while working through more
manageable batches of :DOWN
messages, this improves the effective responsiveness of the process.
Message Truncation
:DOWN
messages include a reason
field that can include large stack traces and GenServer state
dumps. Large reason
s generally don't pose an issue, but in a scenario where thousands of
processes are monitoring a process that generates a large reason
the cumulative effect of
duplicating the large reason
to each monitoring process can consume all available memory on a
node.
When a :DOWN
message is received for dispatch to remote subscribers, the first step is to
truncate the message using ZenMonitor.Truncator
, see the module documentation for more
information about how truncation is performed and what configuration options are supported.
This prevents the scenario where a single process with a large stack trace or large state gets amplified on the receiving node and consumes an large amount of memory.
Design
ZenMonitor is constructed of two cooperating systems, the Local ZenMonitor System and the Proxy ZenMonitor System. When a process wishes to monitor a remote process, it should inform the Local ZenMonitor System which will efficiently dispatch the monitoring request to the remote node's Proxy ZenMonitor System.
Local ZenMonitor System
The Local ZenMonitor System is composed of a few processes, these are managed by the ZenMonitor.Local.Supervisor. The processes that comprise the Local ZenMonitor System are described in detail in the following section.
ZenMonitor.Local
ZenMonitor.Local is responsible for accepting monitoring and demonitoring requests from local processes. It will send these requests to the Connector processes for efficient transmission to the responsible ZenMonitor.Proxy processes.
When a monitored process dies, the ZenMonitor.Proxy will send this information in a summary message to the ZenMonitor.Local.Connector process which will use the send down_dispatches to ZenMonitor.Local for eventual delivery by the ZenMonitor.Local.Dispatcher.
ZenMonitor.Local is also responsible for monitoring the local interested process and performing clean-up if the local interested process crashes for any reason, this prevents the Local ZenMonitor System from leaking memory.
ZenMonitor.Local.Tables
This is a simple process that is responsible for owning shared ETS tables used by various parts of the Local ZenMonitor System.
It maintains two tables, ZenMonitor.Local.Tables.Nodes
and ZenMonitor.Local.Tables.References
these tables are public and are normally written to and read from by the ZenMonitor.Local and
ZenMonitor.Local.Connector processes.
ZenMonitor.Local.Connector
ZenMonitor.Local.Connector is responsible for batching monitoring requests into summary requests for the remote ZenMonitor.Proxy. The Connector handles the actual distribution connection to the remote ZenMonitor.Proxy including dealing with incompatible and down nodes.
When processes go down on the remote node, the Proxy ZenMonitor System will report summaries of these down processes to the corresponding ZenMonitor.Local.Connector.
There will be one ZenMonitor.Local.Connector per remote node with monitored processes.
ZenMonitor.Local.Dispatcher
When a remote node or remote processes fail, messages will be enqueued for delivery. The ZenMonitor.Local.Dispatcher is responsible for processing these enqueued messages at a steady and controlled rate.
Proxy ZenMonitor System
The Proxy ZenMonitor System is composed of a few processes, these are managed by the
ZenMonitor.Proxy.Supervisor
. The processes that comprise the Proxy ZenMonitor System are
described in detail in the following section.
ZenMonitor.Proxy
ZenMonitor.Proxy
is responsible for handling subscription requests from the
Local ZenMonitor System and for maintaining the ERTS Process Monitors on the processes local to
the remote node.
ZenMonitor.Proxy
is designed to be efficient with local monitors and will guarantee that for any
local process there is, at most, one ERTS monitor no matter the number remote processes and remote
nodes are interested in monitoring that process.
When a local process goes down ZenMonitor.Proxy
will enqueue a new death certificate to the
ZenMonitor.Proxy.Batcher
processes that correspond to the interested remotes.
ZenMonitor.Proxy.Tables
This is a simple process that is responsible for owning shared ETS tables used by various parts of the Proxy ZenMonitor System.
It maintains a single table, ZenMonitor.Proxy.Tables.Subscribers
. This table is used by both
the ZenMonitor.Proxy
and ZenMonitor.Proxy.Batcher
processes.
ZenMonitor.Proxy.Batcher
This process has two primary responsibilities, collecting and summarizing death certificates and monitoring the remote process.
For every remote ZenMonitor.Local.Connector
that is interested in monitoring processes on this
node, a corresponding ZenMonitor.Proxy.Batcher
is spawned that will collect and ultimately
deliver death certificates. The ZenMonitor.Proxy.Batcher
will also monitor the remote
ZenMonitor.Local.Connector
and clean up after it if it goes down for any reason.
Running a Compatible Node
ZenMonitor ships with an Application, ZenMonitor.Application
which will start the overall
supervisor, ZenMonitor.Supervisor
. This creates a supervision tree as outlined below.
-------------------------
+----| ZenMonitor.Local.Tables |
| -------------------------
|
| ------------------
+----| ZenMontior.Local |
----------------------------- | ------------------
+----| ZenMonitor.Local.Supervisor |----|
| ----------------------------- | ------------- ----------------------------
| +----| GenRegistry |--N--| ZenMonitor.Local.Connector |
| | ------------- ----------------------------
| |
| | -----------------------------
| +----| ZenMonitor.Local.Dispatcher |
| -----------------------------
----------------------- |
| ZenMonitor.Supervisor |----|
----------------------- | -------------------------
| +----| ZenMonitor.Proxy.Tables |
| | -------------------------
| |
| ----------------------------- | ------------------
+----| ZenMonitor.Proxy.Supervisor |----+----| ZenMonitor.Proxy |
----------------------------- | ------------------
|
| ------------- --------------------------
+----| GenRegistry |--M--| ZenMonitor.Proxy.Batcher |
------------- --------------------------