  • Stars: 121
  • Rank: 292,212 (Top 6 %)
  • Language: Dockerfile
  • License: MIT License
  • Created over 4 years ago
  • Updated about 1 month ago

Repository Details

Docker Images For Pangeo Jupyter Environment

Pangeo Docker Images

The images defined in this repository capture reproducible computing environments used by Pangeo Cloud. They build on top of the Ubuntu operating system and include conda environments with a curated set of Python packages for geospatial analysis. While initially intended for Pangeo Cloud, they can be used outside of Pangeo infrastructure too!

More details can be found in our documentation.

Images are hosted on DockerHub and Quay.io.

Image             Description
base-image        Foundational Dockerfile for builds
base-notebook     minimally functional image for pangeo hubs
pangeo-notebook   base-notebook + core earth science analysis packages
pytorch-notebook  pangeo-notebook + GPU-enabled pytorch
ml-notebook       pangeo-notebook + GPU-enabled tensorflow2

Click on the image name in the table above for a current list of installed packages and versions
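
The package list and the diff between two tagged images can also be reached through the GitHub link patterns given under "Other notes". The sketch below only builds those URLs from an image name and tags. The helper names `packages_url` and `compare_url` are hypothetical, and it assumes every tag follows the same repository layout as the 2020.10.08 example.

```python
# Base URL of the repository, taken from the example links in "Other notes".
REPO_URL = "https://github.com/pangeo-data/pangeo-docker-images"


def packages_url(image: str, tag: str) -> str:
    """Link to the plain package list of `image` at a tagged release.

    Assumption: each image directory holds a packages.txt at that tag.
    """
    return f"{REPO_URL}/blob/{tag}/{image}/packages.txt"


def compare_url(old_tag: str, new_tag: str) -> str:
    """Link to the GitHub comparison view between two tags."""
    return f"{REPO_URL}/compare/{old_tag}..{new_tag}"


if __name__ == "__main__":
    print(packages_url("pangeo-notebook", "2020.10.08"))
    print(compare_url("2020.10.03", "2020.10.08"))
```

With the tags shown, the two printed links match the examples listed under "Other notes".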

graph TD;
    base-image-->base-notebook;
    base-notebook-->pangeo-notebook;
    pangeo-notebook-->pytorch-notebook;
    pangeo-notebook-->ml-notebook;
    click base-image "https://hub.docker.com/r/pangeo/base-image" "Open this in a new tab" _blank
    click base-notebook "https://hub.docker.com/r/pangeo/base-notebook" "Open this in a new tab" _blank
    click pangeo-notebook "https://hub.docker.com/r/pangeo/pangeo-notebook" "Open this in a new tab" _blank
    click pytorch-notebook "https://hub.docker.com/r/pangeo/pytorch-notebook" "Open this in a new tab" _blank
    click ml-notebook "https://hub.docker.com/r/pangeo/ml-notebook" "Open this in a new tab" _blank
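
The graph above can also be read as a parent lookup: every image has exactly one parent, so the chain of images an image builds on follows from walking upward. The sketch below transcribes the edges and the DockerHub link pattern from the graph. The names `PARENT`, `lineage`, and `dockerhub_url` are hypothetical and exist only for illustration.

```python
from typing import Dict, List, Optional

# Parent of each image, transcribed from the edges of the graph above.
PARENT: Dict[str, Optional[str]] = {
    "base-image": None,
    "base-notebook": "base-image",
    "pangeo-notebook": "base-notebook",
    "pytorch-notebook": "pangeo-notebook",
    "ml-notebook": "pangeo-notebook",
}


def lineage(image: str) -> List[str]:
    """Return the chain of images from base-image down to `image`."""
    chain = []
    current: Optional[str] = image
    while current is not None:
        chain.append(current)
        current = PARENT[current]  # KeyError for an unknown image name
    return chain[::-1]


def dockerhub_url(image: str) -> str:
    """DockerHub page for an image, following the click targets in the graph."""
    return f"https://hub.docker.com/r/pangeo/{image}"


if __name__ == "__main__":
    for name in lineage("ml-notebook"):
        print(name, dockerhub_url(name))
```

For `ml-notebook` this lists base-image, base-notebook, pangeo-notebook, and ml-notebook in build order.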

Other notes

  • Since 2020.10.16, mamba is installed into the base-image and conda-lock environment and is used by default to solve for a compatible environment (see #146)
  • For a simple list of packages for a given image, you can use a link like this: https://github.com/pangeo-data/pangeo-docker-images/blob/2020.10.08/pangeo-notebook/packages.txt
  • To compare changes between two images, you can use a link like this: https://github.com/pangeo-data/pangeo-docker-images/compare/2020.10.03..2020.10.08
  • Our ml-notebook image now contains JAX and TensorFlow with XLA enabled. Due to licensing issues, conda-forge does not ship ptxas, which XLA needs to work correctly. To use JAX and/or TensorFlow with XLA optimization, install ptxas yourself, for example with conda install -c nvidia cuda-nvcc. At the time of writing (October 2022), JAX throws a compilation error if the ptxas version is higher than the driver version. There is no easy solution for K80 GPUs; for T4 GPUs, run conda install -c nvidia cuda-nvcc==11.6.* to be safe. Alternatively, for any GPU, you can work around the JAX error by setting the environment variable XLA_FLAGS="--xla_gpu_force_compilation_parallelism=1". The error is expected to be removed (and likely turned into a warning) in a future version of JAX; see google/jax#12776 (comment).
  • There used to be a pangeo/forge image, built for use with pangeo-forge. It is no longer actively maintained or used, but you can still use the historical tags if you wish.
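
The XLA_FLAGS workaround from the ml-notebook note can be applied from inside a notebook, as long as it happens before JAX or TensorFlow is imported. The sketch below is one possible way to do that: it appends the flag to whatever XLA_FLAGS already contains instead of overwriting it. The helper name `with_xla_workaround` is hypothetical, and the flag itself is copied verbatim from the note above.

```python
import os
from typing import Dict, Mapping

# Flag quoted in the note above; nothing else about XLA is assumed here.
WORKAROUND_FLAG = "--xla_gpu_force_compilation_parallelism=1"


def with_xla_workaround(env: Mapping[str, str]) -> Dict[str, str]:
    """Return a copy of `env` whose XLA_FLAGS includes the workaround flag.

    XLA_FLAGS is treated as a space-separated list, and flags that are
    already set are kept.
    """
    updated = dict(env)
    flags = updated.get("XLA_FLAGS", "").split()
    if WORKAROUND_FLAG not in flags:
        flags.append(WORKAROUND_FLAG)
    updated["XLA_FLAGS"] = " ".join(flags)
    return updated


# Apply to the current process. Run this before `import jax` or
# `import tensorflow`, since the variable is read when XLA starts up.
os.environ["XLA_FLAGS"] = with_xla_workaround(os.environ)["XLA_FLAGS"]
```

Calling the helper twice leaves the variable unchanged, so the cell is safe to re-run.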

Dask-gateway compatibility

The primary use of these Docker images is running on Pangeo Cloud deployments with dask-gateway. Generally, the dask-gateway library version built into the image must match the dask-gateway version deployed in the cloud environment. The following table keeps track of the first time a new dask-gateway version appears in a tagged image:

dask-gateway  Image tag
0.9           2020.11.06
0.8           2020.07.28
0.7           2020.04.22
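
Because image tags are dates in YYYY.MM.DD form, the table can be used as a lookup: an image carries the version from the newest row whose tag is not later than its own. The sketch below is a minimal illustration of that reading. The names `FIRST_TAG_FOR_VERSION`, `parse_tag`, and `dask_gateway_version` are hypothetical, and it only knows the three rows listed here, so for tags newer than the table the result is a lower bound at best.

```python
from typing import List, Optional, Tuple

# Rows of the table above, newest first: (first image tag, dask-gateway version).
FIRST_TAG_FOR_VERSION: List[Tuple[str, str]] = [
    ("2020.11.06", "0.9"),
    ("2020.07.28", "0.8"),
    ("2020.04.22", "0.7"),
]


def parse_tag(tag: str) -> Tuple[int, int, int]:
    """Split a YYYY.MM.DD image tag into a comparable (year, month, day) tuple."""
    year, month, day = (int(part) for part in tag.split("."))
    return (year, month, day)


def dask_gateway_version(tag: str) -> Optional[str]:
    """Look up the dask-gateway version for an image tag, per the table.

    Returns None for tags older than the oldest row, where the table
    gives no information.
    """
    target = parse_tag(tag)
    for first_tag, version in FIRST_TAG_FOR_VERSION:
        if target >= parse_tag(first_tag):
            return version
    return None


if __name__ == "__main__":
    print(dask_gateway_version("2020.10.08"))  # prints 0.8
```

The value should still be checked against the dask-gateway version actually deployed on the target hub, since the two have to match.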
