  • Stars: 142
  • Rank 258,495 (Top 6 %)
  • Language: Python
  • License: Apache License 2.0
  • Created almost 11 years ago
  • Updated about 1 year ago

Repository Details

Easy to use map and starmap python equivalents

parmap

This small Python module implements four functions: map and starmap, and their asynchronous versions map_async and starmap_async.

What does parmap offer?

  • Provide an easy to use syntax for both map and starmap.
  • Parallelize transparently whenever possible.
  • Pass additional positional and keyword arguments to parallelized functions.
  • Show a progress bar (requires tqdm as optional package)

Installation:

pip install tqdm # for progress bar support
pip install parmap

Usage:

Each example below shows ordinary sequential code followed by its parallel equivalent using parmap:

Simple parallelization example:

import parmap
# You want to do:
mylist = [1,2,3]
argument1 = 3.14
argument2 = True
y = [myfunction(x, argument1, mykeyword=argument2) for x in mylist]
# In parallel:
y = parmap.map(myfunction, mylist, argument1, mykeyword=argument2)

Show a progress bar:

Requires pip install tqdm

# You want to do:
y = [myfunction(x) for x in mylist]
# In parallel, with a progress bar
y = parmap.map(myfunction, mylist, pm_pbar=True)
# Passing extra options to the tqdm progress bar
y = parmap.map(myfunction, mylist, pm_pbar={"desc": "Example"})

Passing multiple arguments:

# You want to do:
z = [myfunction(x, y, argument1, argument2, mykey=argument3) for (x,y) in mylist]
# In parallel:
z = parmap.starmap(myfunction, mylist, argument1, argument2, mykey=argument3)

# You want to do:
listx = [1, 2, 3, 4, 5, 6]
listy = [2, 3, 4, 5, 6, 7]
param1 = 3.14
param2 = 42
listz = []
for (x, y) in zip(listx, listy):
    listz.append(myfunction(x, y, param1, param2))
# In parallel:
listz = parmap.starmap(myfunction, zip(listx, listy), param1, param2)
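
The calling convention shown in the examples above can be summarized with a sequential sketch. The helpers sequential_map and sequential_starmap and the body of myfunction are hypothetical names written for illustration only; they mirror the "You want to do" list comprehensions and are not parmap's implementation.

```python
def sequential_map(function, iterable, *args, **kwargs):
    """Sequential reference: call function(item, *args, **kwargs) for each item."""
    return [function(item, *args, **kwargs) for item in iterable]


def sequential_starmap(function, iterable, *args, **kwargs):
    """Sequential reference: unpack each item, then append the shared arguments."""
    return [function(*item, *args, **kwargs) for item in iterable]


def myfunction(x, y, param1, param2, mykey=False):
    # Hypothetical stand-in for the function used in the examples above.
    return (x + y) * param1 + param2 if mykey else x + y


listx = [1, 2, 3]
listy = [2, 3, 4]
# Each (x, y) pair is unpacked; 10 and 1 are shared by every call.
result = sequential_starmap(myfunction, zip(listx, listy), 10, 1, mykey=True)
squares = sequential_map(pow, [1, 2, 3], 2)
```

The extra positional and keyword arguments are the same for every call; only the items of the iterable vary.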

Advanced: Multiple parallel tasks running in parallel

In this example, task1 uses 5 cores, while task2 uses 3 cores. Both tasks start computing simultaneously, and we print a message as soon as either task finishes, retrieving its result.

import parmap
def task1(item):
    return 2*item

def task2(item):
    return 2*item + 1

items1 = range(500000)
items2 = range(500)

with parmap.map_async(task1, items1, pm_processes=5) as result1:
    with parmap.map_async(task2, items2, pm_processes=3) as result2:
        data_task1 = None
        data_task2 = None
        task1_working = True
        task2_working = True
        while task1_working or task2_working:
            result1.wait(0.1)
            if task1_working and result1.ready():
                print("Task 1 has finished!")
                data_task1 = result1.get()
                task1_working = False
            result2.wait(0.1)
            if task2_working and result2.ready():
                print("Task 2 has finished!")
                data_task2 = result2.get()
                task2_working = False
# Further work with data_task1 or data_task2
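
The polling loop above can be generalized to any number of tasks. The following is a minimal sketch that assumes only that each result object offers ready() and get(), as the objects in the example do. It uses the standard library's ThreadPool.map_async in place of parmap.map_async so that it runs without parmap installed; collect_as_ready is a hypothetical helper, not part of parmap.

```python
import time
from multiprocessing.pool import ThreadPool


def collect_as_ready(pending, interval=0.01):
    """Poll named async results; return (name, value) pairs in completion order."""
    pending = dict(pending)  # work on a copy so the caller's dict is untouched
    finished = []
    while pending:
        for name in list(pending):
            if pending[name].ready():
                finished.append((name, pending.pop(name).get()))
        if pending:
            time.sleep(interval)  # avoid a busy loop while tasks are running
    return finished


def task1(item):
    return 2 * item


def task2(item):
    return 2 * item + 1


# Two independent pools, so both tasks run at the same time.
with ThreadPool(2) as pool1, ThreadPool(2) as pool2:
    pending = {
        "task1": pool1.map_async(task1, range(5)),
        "task2": pool2.map_async(task2, range(5)),
    }
    done = dict(collect_as_ready(pending))
```

The order in which the tasks finish is not deterministic, so the sketch stores the results by name.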

map and starmap already exist. Why reinvent the wheel?

The existing functions have some usability limitations:

  • The built-in python function map [1] is not able to parallelize.
  • multiprocessing.Pool().map [3] does not allow any additional argument to the mapped function.
  • multiprocessing.Pool().starmap [2] allows passing multiple arguments, but to pass a constant argument to the mapped function you need to convert it to an iterator using itertools.repeat(your_parameter) [4].
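
For comparison, here is a sketch of the standard-library workarounds these limitations require. The functions scale and add_scaled are hypothetical examples. ThreadPool is used because it exposes the same map and starmap methods as multiprocessing.Pool and lets the sketch run without an `if __name__ == "__main__":` guard; that substitution is an assumption made for portability, not something the limitations depend on.

```python
from functools import partial
from itertools import repeat
from multiprocessing.pool import ThreadPool


def scale(x, factor, offset=0):
    # Hypothetical example function: one varying argument, two constant ones.
    return x * factor + offset


def add_scaled(x, y, factor):
    # Hypothetical example function: two varying arguments, one constant.
    return (x + y) * factor


values = [1, 2, 3]

with ThreadPool(2) as pool:
    # Pool.map passes exactly one argument per call, so the constants
    # have to be bound beforehand with functools.partial.
    mapped = pool.map(partial(scale, factor=10, offset=1), values)

    # Pool.starmap unpacks each tuple, so the constant has to be repeated
    # once per item; zip stops when the finite inputs are exhausted.
    starmapped = pool.starmap(add_scaled, zip(values, values, repeat(10)))
```

With parmap, the same calls would be written by appending the constant arguments after the iterable, as in the usage examples earlier in this document.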

parmap aims to overcome these limitations in the simplest possible way.

Additional features in parmap:

  • Create a pool for parallel computation automatically if possible.
  • parmap.map(..., ..., pm_parallel=False) # disables parallelization
  • parmap.map(..., ..., pm_processes=4) # use 4 parallel processes
  • parmap.map(..., ..., pm_pbar=True) # show a progress bar (requires tqdm)
  • parmap.map(..., ..., pm_pool=multiprocessing.Pool()) # use an existing pool, in this case parmap will not close the pool.
  • parmap.map(..., ..., pm_chunksize=3) # size of chunks (see multiprocessing.Pool().map)

Limitations:

parmap.map() and parmap.starmap() (and their async versions) have their own arguments (pm_parallel, pm_pbar...). Those arguments are never passed to the underlying function. In the following example, myfun will receive myargument, but not pm_parallel. Do not write functions that require keyword arguments starting with pm_, as parmap may need them in the future.

parmap.map(myfun, mylist, pm_parallel=True, myargument=False)
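
One way to picture this separation is the sketch below. It is an illustration only: split_pm_kwargs and demo_map are hypothetical names, the wrapper is sequential, and nothing here is taken from parmap's source code.

```python
def split_pm_kwargs(kwargs):
    """Separate pm_-prefixed options from the keywords meant for the mapped function."""
    options = {k: v for k, v in kwargs.items() if k.startswith("pm_")}
    forwarded = {k: v for k, v in kwargs.items() if not k.startswith("pm_")}
    return options, forwarded


def demo_map(function, iterable, **kwargs):
    # Hypothetical sequential wrapper: keeps the pm_ options for itself
    # and forwards every other keyword argument to the mapped function.
    options, forwarded = split_pm_kwargs(kwargs)
    return [function(item, **forwarded) for item in iterable], options


def myfun(x, myargument=True):
    return x if myargument else -x


results, options = demo_map(myfun, [1, 2], pm_parallel=True, myargument=False)
```

Here myfun receives myargument but never sees pm_parallel, which is why a function that itself expects a pm_ keyword could not receive it through such a wrapper.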

Additionally, to keep parmap backwards compatible with its earlier versions, the functions you write should avoid a few other keyword arguments. The conflicting arguments are: parallel, chunksize, pool, processes, callback, error_callback and parmap_progress.

Acknowledgments:

This package started after this question, when I offered this answer, taking the suggestions of J.F. Sebastian for his answer

Known works using parmap

References

[1] http://docs.python.org/dev/library/functions.html#map
[2] http://docs.python.org/dev/library/multiprocessing.html#multiprocessing.pool.Pool.starmap
[3] http://docs.python.org/dev/library/multiprocessing.html#multiprocessing.pool.Pool.map
[4] http://docs.python.org/dev/library/itertools.html#itertools.repeat
