  • Stars: 664
  • Rank: 67,903 (Top 2 %)
  • Language: Python
  • License: MIT License
  • Created almost 13 years ago
  • Updated over 2 years ago

Repository Details

Rake, for Python

Shovel

Status: Production, Team: Big Data, Scope: External, Open Source: MIT, Critical: Yes

Shovel is like Rake for Python. Turn Python functions into tasks simply, and access and invoke them from the command line. 'Nuff said. Shovel also supports invoking, from the browser, the same tasks you'd normally run from the command line, without any modification to your shovel scripts.

Philosophy of Shovel

  • Tasks should be easy to define -- one decorator, no options
  • Turning a function into a task should change as little as possible -- we don't want you to have to change the function's interface at all
  • Arguments are strings -- rather than guess about the type of input parameters, we're just going to pass them into your function as strings. There's one exception, and that's for flags.
  • We'll inspect your tasks as much as we can -- the inspect module is extremely powerful, and we'll glean as much as we can about the arg spec, documentation, name, etc. of your function. You shouldn't be burdened by telling us what we can find out programmatically
  • Tasks should be accessible -- whether it's from the command line, through the browser, or through a chat client, the tasks you define should be easy to reach

Installing Shovel

Like most Python projects, build with the included setup.py:

python setup.py install

Currently there are two dependencies:

# You'll need argparse (already in the standard library as of Python 2.7 and 3.2)
pip install argparse
# If you want to run the web app, you'll need bottle, too
pip install bottle

Using Shovel

Shovel looks for a file in the current working directory called shovel.py and executes it to find tasks you've defined (more on that in a second). If you'd like to modularize your tasks, create a shovel directory and put as many Python files in it as you'd like. Nest subdirectories recursively if you'd like. For example:

shovel/
	foo.py
	bar.py
	testing/
		foo.py
		zod.py
	util/
		hello.py

In this way, you can 'modularize' your tasks. These are not full Python modules (we don't currently examine __init__.py in each directory), but the layout is convenient for organization. Each task name is prefixed with its directory and file names. For example, if your shovel/testing/foo.py defined a task bar, then that task would have the name testing.foo.bar.

If you define tasks in shovel.py instead of using a shovel directory, those tasks will be in the global namespace. If shovel.py defines a task bar, then that task's name would simply be bar.
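The naming rule above can be sketched in a few lines. This is only an illustration of the mapping from file path to task name, not shovel's own code; the helper task_name and its root parameter are hypothetical names introduced for this example.

```python
import os

def task_name(path, func_name, root='shovel'):
    """Build a dotted task name from a path relative to the working directory.

    Hypothetical helper: it mirrors the rule described above, not shovel's internals.
    """
    stem, _ = os.path.splitext(path)
    parts = stem.replace(os.sep, '/').split('/')
    # Drop the top-level shovel directory (or shovel.py itself) from the name
    if parts and parts[0] == root:
        parts = parts[1:]
    return '.'.join(parts + [func_name])

print(task_name('shovel/testing/foo.py', 'bar'))  # testing.foo.bar
print(task_name('shovel.py', 'bar'))              # bar
```

A task defined directly in shovel.py ends up with no prefix at all, which matches the "global namespace" behavior described above.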

In these python files, use shovel by importing shovel's task decorator. Then, apply as needed:

from shovel import task

@task
def hello(name):
    '''Prints hello and the provided name'''
    # print() with a single argument works on both Python 2 and Python 3
    print('Hello, %s' % name)

def not_a_task():
    '''I'm not considered a task in shovel'''
    pass
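To illustrate why decorating a function need not change its interface, here is a minimal sketch of a registering decorator. It is not shovel's implementation: simple_task and REGISTRY are hypothetical names, and shovel's real bookkeeping may differ.

```python
# Hypothetical registry for this sketch; shovel's real bookkeeping may differ
REGISTRY = {}

def simple_task(func):
    """Record func as a task and hand it back untouched."""
    REGISTRY[func.__name__] = func
    return func

@simple_task
def hello(name):
    '''Prints hello and the provided name'''
    print('Hello, %s' % name)

def not_a_task():
    '''Never decorated, so never registered'''
    pass

hello('world')           # still callable as an ordinary function
print(sorted(REGISTRY))  # only the decorated function is listed
```

Because the decorator returns the original function object, callers and tests can keep using hello exactly as before.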

Global Tasks

You can also keep a ~/.shovel.py file or ~/.shovel directory to make tasks globally available.

Command Line Utility

Invoke shovel with the shovel command. To see which tasks shovel knows about, run shovel help or shovel tasks.

If you'd like more information on a specific task or module, run shovel help followed by its name. Shovel can figure out a lot about your tasks: their names, files, line numbers, arguments, default arguments, whether they take a variable number of parameters, and so forth. When you ask for help on a specific task, everything we know about that task will be presented to you.

# List the tasks in the testing directory
shovel help testing
# Get more help on the testing.test task
shovel help testing.test
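As a rough idea of how much the standard library can reveal about a function, the sketch below gathers the details listed above with the inspect module. It assumes Python 3, and describe is a hypothetical helper, not part of shovel.

```python
import inspect

def describe(func):
    """Collect what inspect can tell us about a function (illustrative only)."""
    spec = inspect.getfullargspec(func)
    defaults = spec.defaults or ()
    # Defaults line up with the last len(defaults) positional arguments
    with_defaults = dict(zip(spec.args[len(spec.args) - len(defaults):], defaults))
    return {
        'name': func.__name__,
        'doc': inspect.getdoc(func),
        'file': func.__code__.co_filename,
        'line': func.__code__.co_firstlineno,
        'args': spec.args,
        'defaults': with_defaults,
        'varargs': spec.varargs,
        'kwargs': spec.varkw,
    }

def example(a, b='x', *rest, **options):
    '''An example function to describe'''
    pass

info = describe(example)
for key in ('name', 'doc', 'args', 'defaults', 'varargs', 'kwargs'):
    print(key, '=', info[key])
```

None of this requires any annotation from the task author, which is the point of the "inspect as much as we can" philosophy.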

Execute tasks with shovel followed by the task name:

shovel foo.hello

Arguments are passed in as strings, and we try hard to give you the same semantics as when you'd normally invoke a function in Python. Arguments are considered positional by default, but you can provide a keyword name for specificity. For example, to execute foo.bar in a way equivalent to foo.bar('1', '2', '3', hello='7'), you would invoke it:

shovel foo.bar 1 2 3 --hello 7

Keyword names are merely stripped of the leading dashes when parsed. Also be warned that shovel options (like --verbose and --dry-run) will not be available to your function. Speaking of which, if you would like shovel to be extra talkative (for debugging, perhaps), use the --verbose switch:

shovel --verbose foo.bar 1 2 3 --hello 7

Shovel has a dry-run option that will accept all the parameters you would normally pass into a task, but merely tells you how it would invoke a task. This can be helpful if you want to inspect the arguments that your task would get, to make sure that it's correctly invoked:

shovel --dry-run foo.bar 1 2 3 --hello 7

The one exception to arguments being interpreted as strings is that orphan keyword arguments (those with no value following them) are interpreted as flags meaning 'True.' For example, if we executed the following, then a and b would be passed as True:

shovel foo.bar --a --b

The reason for this is that flags are common for tasks, and it's a relatively unambiguous syntax. To a human, the meaning is clear, and now it is to shovel.
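The argument rules described in this section can be sketched as a small parser. This is an assumption-laden illustration rather than shovel's actual parsing code: parse_args is a hypothetical function, and it treats any token starting with two dashes as a keyword whose value is the next non-keyword token, if there is one.

```python
def parse_args(tokens):
    """Split command-line tokens into positional and keyword arguments.

    Illustrative sketch only; shovel's real parser may handle edge cases differently.
    """
    args, kwargs = [], {}
    i = 0
    while i < len(tokens):
        token = tokens[i]
        if token.startswith('--'):
            # Keyword names are simply stripped of their leading dashes
            key = token.lstrip('-')
            following = tokens[i + 1] if i + 1 < len(tokens) else None
            if following is None or following.startswith('--'):
                # Orphan keyword: treat it as a flag
                kwargs[key] = True
                i += 1
            else:
                # Keyword with a value: the value stays a string
                kwargs[key] = following
                i += 2
        else:
            args.append(token)
            i += 1
    return args, kwargs

print(parse_args(['1', '2', '3', '--hello', '7']))
print(parse_args(['--a', '--b']))
```

With a result like this, a task would be invoked as func(*args, **kwargs), which gives the equivalence with a normal Python call described above.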

Server and Campfire

The shovel utility used to ship with a server for making shovel tasks available through the browser, as well as a Campfire bot. These have now been moved into their own repos for clarity and modularity: shovel-server and shovel-campfire. It's unclear how much updating will need to be done to those projects, but they can now be developed independently.

Command Line Auto-Complete

Because typing is no fun, the completions/ directory has information on how to set up auto-completion with different shells (currently only zsh). Thanks to philadams for starting this set of helpers!

Motivation

We had a project that had a fair number of semi-regularly used operational tasks, and we got sick of copy-and-paste, and we also didn't want to have a standalone script complete with argparse for each and every one. We didn't like the alternatives out there, and so, shovel. The original version constituted a weekend of work, and we've been eating our dog food ever since.

Recently, we realized that a lot of these operational details were intuitive enough that we thought some of our support staff would want to make use of them. Rather than make them keep a copy of the code checked out locally, and use the command line, we figured it would be easiest to make HTTP endpoints for them. That way, we could just add buttons to existing interfaces, and life would be good.

We soon realized that, while HTTP endpoints make a nice interface, it's a pain to maintain both endpoints and command-line tasks. So, why not make an interface that just runs those same tasks and does a little bit of presentation to make it a web interface? So now, without any additional work, you can start up the shovel-server and have access to all of the tasks you've been using from the command line. In this way, as developers we can keep one machine up to date and ready to run code, and still provide access to staff outside of the project.

Contributing

Pull requests and bug reports are welcome. For bugs, please check that the issue exists on the master branch before submitting a bug. Also, please include an example along with the current behavior and the expected behavior. Bonus points for adding a failing test.

For pull requests, you'll need to add or change tests in support of your proposed change. To run the tests:

nose2 -v

This installs all the packages required to run tests, runs the tests and provides coverage information.
