Dask GPU Dataframes
A partitioned, GPU-backed dataframe built on Dask.
Setup from source
To set up from the source repo:
- Install dependencies into a new conda environment, where CUDA_VERSION is either 9.2 or 10:
    conda create -n dask-cudf \
        -c rapidsai -c numba -c conda-forge -c defaults \
        cudf dask cudatoolkit={CUDA_VERSION}
- Activate the conda environment:
    source activate dask-cudf
- Clone the dask-cudf repo:
    git clone https://github.com/rapidsai/dask-cudf
- Install from source:
    cd dask-cudf
    pip install .
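The environment-creation step above only accepts two CUDA versions, so it can help to check the value before calling conda. The following is a minimal sketch, not part of this repository: the function name `conda_create_cmd` is hypothetical, and it only prints the command (on a single line, without the line continuations) so that it can be reviewed before being run.

```shell
# Hypothetical helper (not part of dask-cudf): print the conda command for a
# given CUDA version, rejecting any version other than 9.2 or 10.
conda_create_cmd() {
    cuda_version="$1"
    case "$cuda_version" in
        9.2|10)
            ;;
        *)
            echo "CUDA_VERSION must be 9.2 or 10, got: '${cuda_version}'" >&2
            return 1
            ;;
    esac
    # Same channels and packages as the install step, with the version filled in.
    echo "conda create -n dask-cudf -c rapidsai -c numba -c conda-forge -c defaults cudf dask cudatoolkit=${cuda_version}"
}
```

Calling `conda_create_cmd 9.2` prints the command for CUDA 9.2. Any value other than 9.2 or 10 writes a message to stderr and returns a non-zero status.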
Test
- Install pytest:
    conda install pytest
- Run all tests:
    py.test dask_cudf
- Or, run individual tests:
    py.test dask_cudf/tests/test_file.py
Style
For style we use black, isort, and flake8. These are available as
pre-commit hooks that will run every time you are about to commit code.
From the root directory of this project run the following:
pip install pre-commit
pre-commit install