  • Stars: 149
  • Rank 243,584 (Top 5 %)
  • Language: Python
  • Created over 11 years ago
  • Updated about 4 years ago


Repository Details

Tool to easily start up an IPython cluster on different schedulers.

ipython-cluster-helper

Quickly and easily parallelize Python functions using IPython on a cluster, supporting multiple schedulers. Optimizes IPython defaults to handle larger clusters and simultaneous processes.

Example

Let's say you wrote a program that takes several files as arguments and performs a long-running computation on each of them. Your original implementation used a loop, but it was far too slow:

from yourmodule import long_running_function
import sys

if __name__ == "__main__":
    for f in sys.argv[1:]:
        long_running_function(f)

If you have access to one of the supported schedulers, you can parallelize your program across 5 jobs (num_jobs=5) with ipython-cluster-helper:

from cluster_helper.cluster import cluster_view
from yourmodule import long_running_function
import sys

if __name__ == "__main__":
    with cluster_view(scheduler="lsf", queue="hsph", num_jobs=5) as view:
        view.map(long_running_function, sys.argv[1:])

That's it! No setup required.

To run a local cluster for testing purposes, pass run_local as an extra parameter to the cluster_view function:

with cluster_view(scheduler=None, queue=None, num_jobs=5,
                  extra_params={"run_local": True}) as view:
    view.map(long_running_function, sys.argv[1:])
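The two calls above differ only in their keyword arguments, so one script can serve both testing and production. The following is a minimal sketch, not part of the library: cluster_kwargs and run are hypothetical helper names, and the "lsf" scheduler and "hsph" queue are simply the values from the earlier example.

```python
def cluster_kwargs(local, num_jobs=5):
    """Build keyword arguments for cluster_view (hypothetical helper)."""
    if local:
        # Local test cluster, mirroring the run_local example above.
        return {
            "scheduler": None,
            "queue": None,
            "num_jobs": num_jobs,
            "extra_params": {"run_local": True},
        }
    # Scheduler and queue names come from the LSF example; adjust for your site.
    return {"scheduler": "lsf", "queue": "hsph", "num_jobs": num_jobs}


def run(files, local=False):
    """Map long_running_function over files on a local or scheduled cluster."""
    # Imported lazily so this sketch can be loaded without the library installed.
    from cluster_helper.cluster import cluster_view
    from yourmodule import long_running_function

    with cluster_view(**cluster_kwargs(local)) as view:
        view.map(long_running_function, files)
```

Calling run(sys.argv[1:], local=True) would then exercise the same code path on a single machine before submitting to the scheduler.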

How it works

ipython-cluster-helper creates a throwaway parallel IPython profile, launches a cluster and returns a view. On program exit it shuts the cluster down and deletes the throwaway profile.
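The create, use, clean-up lifecycle described above follows the usual context-manager pattern. The sketch below illustrates that pattern only and is not the library's implementation: throwaway_view and SerialView are hypothetical names, a temporary directory stands in for the throwaway profile, and the "view" runs the function in-process instead of on a cluster.

```python
import shutil
import tempfile
from contextlib import contextmanager
from pathlib import Path


class SerialView:
    """Stand-in for a cluster view: applies the function in-process."""

    def map(self, func, *iterables):
        return list(map(func, *iterables))


@contextmanager
def throwaway_view():
    # A temporary directory plays the role of the throwaway profile.
    profile_dir = Path(tempfile.mkdtemp(prefix="throwaway_profile_"))
    try:
        # A real implementation would launch the cluster at this point.
        yield SerialView(), profile_dir
    finally:
        # Runs even if the body raises: a real implementation would stop
        # the cluster here before deleting the profile.
        shutil.rmtree(profile_dir, ignore_errors=True)


with throwaway_view() as (view, profile_dir):
    squares = view.map(lambda x: x * x, [1, 2, 3])
```

Putting the clean-up in a finally clause is what lets the caller skip any explicit setup or teardown.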

Supported schedulers

Platform LSF ("lsf"), Sun Grid Engine ("sge"), Torque ("torque"), SLURM ("slurm").
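If the scheduler name comes from user input, it can help to check it before requesting a cluster. This is a small sketch under the assumption that only the four identifiers listed above are valid; the library may accept others, and check_scheduler is a hypothetical helper, not part of ipython-cluster-helper.

```python
# Identifiers and names taken from the list above.
SUPPORTED_SCHEDULERS = {
    "lsf": "Platform LSF",
    "sge": "Sun Grid Engine",
    "torque": "Torque",
    "slurm": "SLURM",
}


def check_scheduler(name):
    """Return name unchanged if it is a listed scheduler, else raise ValueError."""
    if name not in SUPPORTED_SCHEDULERS:
        valid = ", ".join(sorted(SUPPORTED_SCHEDULERS))
        raise ValueError(f"unknown scheduler {name!r}; expected one of: {valid}")
    return name
```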

Credits

The cool parts of this were ripped from bcbio-nextgen.

Contributors

  • Brad Chapman (@chapmanb)
  • Mario Giovacchini (@mariogiov)
  • Valentine Svensson (@vals)
  • Roman Valls (@brainstorm)
  • Rory Kirchner (@roryk)
  • Luca Beltrame (@lbeltrame)
  • James Porter (@porterjamesj)
  • Billy Ziege (@billyziege)
  • ink1 (@ink1)
  • @mjdellwo
  • @matthias-k
  • Andrew Oler (@oleraj)
  • Alain Péteut (@peteut)
  • Matt De Both (@mdeboth)
  • Vlad Saveliev (@vladsaveliev)
