eventmachine-tail
This project contains two EventMachine extensions.
First, it adds an event-driven file-following similar to the unix ‘tail -f’
command. For example, you could use it to follow /var/log/messages the same way
tail -f would.
Second, it adds event-driven file patterns allowing you to watch a given file
pattern for new or removed files. For example, you could watch /var/log/*.log
for new/deleted files.
Why?
For logstash, the log agents were
event-driven using EventMachine. The log agents mainly get their data from
logfiles. To that end, we needed a way to treat log files as a stream.
There’s a ruby gem ‘file-tail’ that implements tailing, but not in an
event-driven way. This makes it hard to use in EventMachine programs like
logstash.
Thus, eventmachine-tail was born.
Further, the usage patterns for logstash required the ability to watch a
directory (or a file pattern) for new log files.
Get eventmachine-tail
To install eventmachine-tail, you only need to use gem:
gem install eventmachine-tail
This will also install the rtail tool described below.
Implementation Thoughts
“Don’t block the reactor”
EventMachine::FileTail will only read one chunk (64K by default) from any file
for each tick of the reactor. This helps ensure we don’t spend too much time
reading from the file.
A consequence of not reading the file “as fast as possible until eof” can cause
longer data sets to take longer to read since there will be a small delay
between each read since the reactor only runs about once every 50ms (depending
on the polling features used).
EOF Handling
When we are at the end of file, we rely on EventMachine::FileWatch to notify us
when the file changes. On Linux this uses inotify and on OS X and FreeBSD will
use kqueue to get event-driven notifications of file changes.
File truncation
If the file length ever shortens, we assume this means the file was truncated.
When this happens, we seek to the beginning of the file and continue reading.
File rotation
If we notice that the inode (or underlying device) has changed, we will reopen
the file by pathname. This change is generally an indication that our file has
been rotated as part of some periodic log rotation.
Globs
You can also use file glob patterns
to select files to tail. This pattern is checked periodically to see if new
files are found.
For example, if your java app writes to logfiles with generated filenames that
are hard to predict, you should use EventMachine::FileGlobWatch to match any
generated files as they are generated.
Globs are implemented in a polling manner because of the “non-obviousness” of
implementing a glob notifier with file watches. It is on my todo list to
implement in this way, however.
Example code
For a simple file tailing example, look at tail.rb
For an example of using glob watching file patterns, look at globwatch.rb
For an example of using both glob watching and file tailing, check out glob-tail.rb – this project provides a glue class that helps you easily tail globs.
rtail tool
A script comes with eventmachine-tail called ‘rtail’
This script allows you to tail files similar to tail(1) but allows you to use
the glob watching feature of eventmachine-tail to watch file path patterns
(globs) in a tail-ey way.
For example, if you want to ‘tail -f’ all files, recursively, in /var/log
except ones matching ‘*.gz’, you would use:
rtail -x "*.gz" "/var/log/**/*"
This will follow any existing files and follow newly-created ones that match
the glob given as they are created.
By default, rtail checks the glob pattern every 5 seconds. You can change this
value with the ‘-i’ flag.
You can also tell rtail to not prefix each line with the filename by giving the
‘-n’ flag.