map
Map lines from stdin to commands.
Description
Map lets you process each line from stdin with a command of your choice. For example:
$ ls
LICENSE README.md makefile map.1 map.c
$ ls | map f 'echo $f $f'
LICENSE LICENSE
README.md README.md
makefile makefile
map.1 map.1
map.c map.c
Note that the command must be wrapped in single quotes to prevent the variable from being expanded by the shell.
Installation
Install map into /usr/local/bin
with the following command:
$ make install
You can use make PREFIX=/some/other/directory install
if you wish
to use a different destination. If you want to remove map from
your system, use make uninstall
.
Motivation
There are many ways to accomplish what you can do with map
,
including find
, xargs
, awk
, and shell for-loops. The approach
taken by map
is extremely pragmatic and allows me to express
concisely what I want. Given the fact that it's designed as a filter,
it can operate on any kind of list, not only lists of files.
The problem that prompted me to think about map
was the following:
given a list of files, I wanted to execute two commands on each.
Here's how you can do it with different tools:
With map
:
ls *.c | map f 'foo $f; bar $f'
With xargs
:
ls *.c | xargs -I % sh -c 'foo %; bar %;'
With awk
:
ls *.c | awk '{ system("foo "$0"; bar "$0) }'
With find
:
find . -name \*.c -maxdepth 1 -exec foo {} \; -exec bar {} \;
With a bash
for-loop:
for f in $(ls *.c)
do
foo $f
bar $f
done
With a csh
for-loop:
foreach f (*.c)
foo $f
bar $f
end
Map's modest claim is that it improves on the ergonomics of existing tools. It's not only that sometimes it allows you to type less, but also the conceptual model needed to operate it is simpler.
Let's consider these tasks:
- Execute a command
foo
on each C file:
ls *.c | map f 'foo $f'
ls *.c | xargs foo
find . -name *.c -maxdepth 1 -exec foo {} \;
- Execute commands
foo
andbar
on each C file:
ls *.c | map f 'foo $f; bar $f'
ls *.c | xargs -I % sh -c 'foo %; bar %;'
find . -name *.c -maxdepth 1 -exec foo {} \; -exec bar {} \;
- Download files from a list of URLs in a file:
cat urls | map u 'curl -O $u'
cat urls | xargs -n 1 curl -O
- Sleep three times for one second and say "done" after each elapsed second:
printf "1\n1\n1\n" | map t 'sleep $t && say done'
printf "1\n1\n1\n" | xargs -n 1 -I % sh -c 'sleep % && say done'
- Same as #4, but run the commands in parallel:
printf "1\n1\n1\n" | map t 'sleep $t && say done &'
printf "1\n1\n1\n" | xargs -n 1 -P 3 -I % sh -c 'sleep % && say done'
The last three examples are not possible with find
, because it
only operates on file hierarchies.
When using map
, the commands don't vary much because the second
argument is a template for a well known syntax. On the other hand,
there's more variation in the invocations to xargs
and find
,
which means you may need to remember those command line options if
you want an idiomatic solution.
As with anything in life, familiarity helps and if you use a tool
in a certain way over and over it will seem simple to operate, but
we can still analyze the conceptual models and determine how much
information is needed in each case. I would say the advantage of
map
is that it requires less knowledge.
Of course xargs
and find
are much larger tools, with a bigger
feature set. Map has around 20 lines of code. For comparison, here's
the source code of GNU xargs. No doubt xargs
will offer
a lot more features, but so far with map
I've completely stopped
using xargs
and for-loops. Another way to think about map
vs
xargs
: if map
had been specified in POSIX and xargs
was just
released, I'm not sure I would install it unless map
proved to
be unfit for a given use case.
Something important to keep in mind is that map
works on lines,
not on files. My reasoning was that in the context of a POSIX
environment, a map function can be expected to filter lines from
stdin. In that regard, it is very generic because a line can represent
anything.
Known issues
In the examples above I frequently use the output of ls
. In my
projects, filenames don't contain spaces, newlines, or other special
characters because I find them inconvenient, even though they are
valid. As map
works on lines, filenames that contain newlines
should be handled separately. For files that contain whitespace,
the common solution is to wrap the variable in double quotes, so
instead of $f
you would use "$f"
.
Contributing
If you find a bug, please create an issue detailing the ways to reproduce it. If you have a suggestion, create an issue detailing the use case.