• Stars: 266
• Rank: 153,243 (Top 4%)
• Language: Python
• License: MIT License
• Created: almost 5 years ago
• Updated: 8 months ago


Repository Details

experiment-impact-tracker

The experiment-impact-tracker is meant to be a simple drop-in method to track the energy usage, carbon emissions, and compute utilization of your system. Currently, on Linux systems with Intel chips (that support the RAPL or PowerGadget interfaces) and NVIDIA GPUs, we record: power draw from the CPU and GPU, hardware information, Python package versions, estimated carbon emissions information, etc. In California we even support real-time carbon emission information by querying caiso.com!

Once all this information is logged, you can generate an online appendix which shows off this information, as seen here:

https://breakend.github.io/RL-Energy-Leaderboard/reinforcement_learning_energy_leaderboard/pongnoframeskip-v4_experiments/ppo2_stable_baselines,_default_settings/0.html

Installation

To install:

pip install experiment-impact-tracker

Usage

Please go to the docs page for detailed info on the design, usage, and contributing: https://breakend.github.io/experiment-impact-tracker/

If you think the docs aren't helpful or need more expansion, let us know with a GitHub issue!

Below we will walk through an example together.

Add Tracking

We include a simple example in the project, which can be found in examples/my_experiment.py.

As shown in my_experiment.py, you just need to add a few lines of code!

from experiment_impact_tracker.compute_tracker import ImpactTracker
tracker = ImpactTracker(<your log directory here>)
tracker.launch_impact_monitor()

This will launch a separate Python process that gathers compute/energy/carbon information in the background.

NOTE: Because of the way Python multiprocessing works, the monitoring process will not interrupt the main one even if it errors out. To address this, you can add the following call to periodically read the latest info from the log file and check for any errors that might have occurred in the tracking process. If you have a better idea of how to handle exceptions in the tracking thread, please open an issue or submit a pull request!

info = tracker.get_latest_info_and_check_for_errors()
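For example, here is a minimal sketch of how that check might sit inside a training loop. The log directory and train_one_epoch are illustrative placeholders, not part of the library:

from experiment_impact_tracker.compute_tracker import ImpactTracker

def train_one_epoch():
    # Hypothetical placeholder for your own training step.
    pass

tracker = ImpactTracker("logs/my_experiment")  # illustrative log directory
tracker.launch_impact_monitor()

for epoch in range(10):
    train_one_epoch()
    # Surfaces any exception raised in the background monitoring process.
    info = tracker.get_latest_info_and_check_for_errors()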

Alternatively, you can use context management!

import tempfile

from experiment_impact_tracker.compute_tracker import ImpactTracker

experiment1 = tempfile.mkdtemp()
experiment2 = tempfile.mkdtemp()

with ImpactTracker(experiment1):
    do_something()

with ImpactTracker(experiment2):
    do_something_else()
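Used this way, the tracker presumably starts the monitor when the with block is entered and shuts it down on exit, so each block's logs end up in its own directory.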

To kick off our simple experiment, run python my_experiment.py. You will see training start, and at the end the script will output something like: Please find your experiment logs in: /var/folders/n_/9qzct77j68j6n9lh0lw3vjqcn96zxl/T/tmpcp7sfese

Now let's go over to that temp dir, where we can see our logs:

$ log_path=/var/folders/n_/9qzct77j68j6n9lh0lw3vjqcn96zxl/T/tmpcp7sfese
$ cd $log_path
$ tree 
.
└── impacttracker
    ├── data.json
    ├── impact_tracker_log.log
    └── info.pkl

You can then access the information via the DataInterface:

from experiment_impact_tracker.data_interface import DataInterface

data_interface1 = DataInterface([experiment1_logdir])
data_interface2 = DataInterface([experiment2_logdir])

data_interface_both = DataInterface([experiment1_logdir, experiment2_logdir])

assert data_interface1.kg_carbon + data_interface2.kg_carbon == data_interface_both.kg_carbon
assert data_interface1.total_power + data_interface2.total_power == data_interface_both.total_power
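Note that these totals are floating-point sums, so in practice an approximate comparison (for example, math.isclose) is safer than strict equality; the asserts above are only meant to illustrate that the combined interface aggregates across log directories.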

Creating a carbon impact statement

We can also use a script to automatically generate a carbon impact statement for your paper! Just call the command below; it will find all the log files generated by the tool and calculate emissions information. Specify your ISO3 country code as well to get a dollar amount based on the per-country cost of carbon.

generate-carbon-impact-statement my_directories that_contain all_my_experiments "USA"

Custom PUE

Some people may know the PUE of their data center; by default, we use a PUE of 1.58 in our calculations. To set a different PUE, do:

OVERRIDE_PUE=1.1 generate-carbon-impact-statement my_directories that_contain all_my_experiments "USA"
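As a quick sanity check of what the multiplier does (this assumes, as is standard, that PUE simply scales the energy measured at the hardware up to total facility energy; the numbers are illustrative):

# Illustrative only: PUE scales measured hardware energy to total facility energy.
measured_kwh = 10.0   # energy measured at the CPU/GPU
default_pue = 1.58    # the default PUE used in our calculations
custom_pue = 1.1      # e.g., a well-optimized data center
print(measured_kwh * default_pue)  # 15.8 kWh attributed by default
print(measured_kwh * custom_pue)   # 11.0 kWh attributed with OVERRIDE_PUE=1.1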

Generating an HTML appendix

After logging all your experiments into a directory, we can automatically search for the impact tracker's logs and generate an HTML appendix.

First, create a JSON file with the structure of the website you'd like to see (this lets you create hierarchies of experiments as web pages).

For an example of all the capabilities of the tool you can see the json structure here: https://github.com/Breakend/RL-Energy-Leaderboard/blob/master/leaderboard_generation_format.json

Basically, you can group several runs together and specify variables to summarize. You should probably just copy and paste the example above and remove what you don't need. Note that the # comments in the snippet below are explanatory only and must be removed for the file to be valid JSON. Here are descriptions of what is being specified:

"Comparing Translation Methods" : {
  # filter: a regex used to search the directory you specify and find
  # experiments with this pattern in their directory structure
  "filter" : "(translation)",
 
  # Use this to talk about your experiment
  "description" : "An experiment on translation.", 
  
  # executive_summary_variables: this will aggregate the sums and averages across these metrics.
  # you can see available metrics to summarize here: 
  # https://github.com/Breakend/experiment-impact-tracker/blob/master/experiment_impact_tracker/data_info_and_router.py
  "executive_summary_variables" : ["total_power", "exp_len_hours", "cpu_hours", "gpu_hours", "estimated_carbon_impact_kg"],   
  
  # The child experiments to group together
  "child_experiments" : 
        {
            "Transformer Network" : {
                                "filter" : "(transformer)",
                                "description" : "A subset of experiments for transformer experiments"
                            },
            "Conv Network" : {
                                "filter" : "(conv)",
                                "description" : "A subset of experiments for conv experiments"
                            }
                   
        }
}

Then you just run this script, pointing it at your data, the JSON file, and an output directory.

create-compute-appendix ./data/ --site_spec leaderboard_generation_format.json --output_dir ./site/
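When the command finishes, the output directory (./site/ in this example) should contain the generated HTML appendix, which you can open locally or host as a static site; the RL Energy Leaderboard below is published this way via GitHub Pages.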

To see this in action, take a look at our RL Energy Leaderboard.

The specs are here: https://github.com/Breakend/RL-Energy-Leaderboard

And the output looks like this: https://breakend.github.io/RL-Energy-Leaderboard/reinforcement_learning_energy_leaderboard/

Looking up cloud provider emission info

Based on energy grid locations, we can estimate emissions from cloud providers using our tools. A script to do that is here:

lookup-cloud-region-info aws

Or you can look up emissions information for your own address!

% get-region-emissions-info address --address "Stanford, California"

({'geometry': <shapely.geometry.multipolygon.MultiPolygon object at 0x1194c3b38>,
  'id': 'US-CA',
  'properties': {'zoneName': 'US-CA'},
  'type': 'Feature'},
 {'_source': 'https://github.com/tmrowco/electricitymap-contrib/blob/master/config/co2eq_parameters.json '
             '(ElectricityMap Average, 2019)',
  'carbonIntensity': 250.73337617853463,
  'fossilFuelRatio': 0.48888711737336304,
  'renewableRatio': 0.428373256377554})
  

Asserting certain hardware

It may be the case that you're trying to run two sets of experiments and compare their emissions, energy usage, etc. In this case, you generally want to ensure that there's parity between the two sets of experiments. If you're running on a cluster, you might not want to accidentally use a different GPU/CPU pair. To get around this, we provide an assertion check that you can add to your code, which will kill a job if it's running on the wrong hardware combo. For example:

from experiment_impact_tracker.gpu.nvidia import assert_gpus_by_attributes
from experiment_impact_tracker.cpu.common import assert_cpus_by_attributes

assert_gpus_by_attributes({ "name" : "GeForce GTX TITAN X"})
assert_cpus_by_attributes({ "brand": "Intel(R) Xeon(R) CPU E5-2640 v3 @ 2.60GHz" })
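A minimal sketch of how these checks might be combined with the tracker at the top of a training script, so that a job scheduled on the wrong node fails before any mismatched logs are written (the log directory is an illustrative placeholder):

from experiment_impact_tracker.compute_tracker import ImpactTracker
from experiment_impact_tracker.cpu.common import assert_cpus_by_attributes
from experiment_impact_tracker.gpu.nvidia import assert_gpus_by_attributes

# Fail fast if the job landed on unexpected hardware.
assert_gpus_by_attributes({"name": "GeForce GTX TITAN X"})
assert_cpus_by_attributes({"brand": "Intel(R) Xeon(R) CPU E5-2640 v3 @ 2.60GHz"})

# Only start tracking (and training) once the hardware checks have passed.
tracker = ImpactTracker("logs/hardware_checked_run")  # illustrative path
tracker.launch_impact_monitor()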

Building docs

sphinx-build -b html docsrc docs

Compatible Systems

Right now, we're only compatible with Linux and Mac OS X systems running NVIDIA GPUs and Intel processors (which support RAPL or PowerGadget).

If you'd like support for your use case or encounter missing/broken functionality on your system specs, please open an issue or, better yet, submit a pull request! It's almost impossible to cover every combination on our own!

Mac OS X Support

Currently, we support only CPU and memory-related metrics on Mac OS X for Intel-based CPUs. However, these require the Intel PowerGadget driver and the Intel PowerGadget tool. The easiest way to install this is:

$ brew cask install intel-power-gadget
$ which "/Applications/Intel Power Gadget/PowerLog"

or for newer versions of OS X

$ brew install intel-power-gadget
$ which "/Applications/Intel Power Gadget/PowerLog"

For more details, see: https://software.intel.com/content/www/us/en/develop/articles/intel-power-gadget.html

This will install a tool called PowerLog that we rely on to get power measurements on Mac OS X systems.

Tested Successfully On

GPUs:

  • NVIDIA Titan X
  • NVIDIA Titan V

CPUs:

  • Intel(R) Xeon(R) CPU E5-2640 v3 @ 2.60GHz
  • Intel(R) Xeon(R) CPU E5-2620 v3 @ 2.40GHz
  • 2.7 GHz Quad-Core Intel Core i7

OS:

  • Ubuntu 16.04.5 LTS
  • Mac OS X 10.15.6

Testing

To test, run:

pytest 

Citation

If you use this work, please cite our paper:

@misc{henderson2020systematic,
    title={Towards the Systematic Reporting of the Energy and Carbon Footprints of Machine Learning},
    author={Peter Henderson and Jieru Hu and Joshua Romoff and Emma Brunskill and Dan Jurafsky and Joelle Pineau},
    year={2020},
    eprint={2002.05651},
    archivePrefix={arXiv},
    primaryClass={cs.CY}
}

Also, we rely on a number of other packages and prior work to make this project possible. For carbon accounting, we relied on open source code from https://www.electricitymap.org/ as an initial base. psutil provides many of the compute metrics we use, and nvidia-smi and Intel RAPL provide the energy metrics.

More Repositories

1. gym-extensions (Python, 213 stars): This repo is intended as an extension for OpenAI Gym for auxiliary tasks (multitask learning, transfer learning, inverse reinforcement learning, etc.).
2. DeepReinforcementLearningThatMatters (Python, 153 stars): Accompanying code for "Deep Reinforcement Learning that Matters".
3. PileOfLaw (Jupyter Notebook, 113 stars): A dataset for pretraining language models targeted for legal tasks.
4. DialogDatasets (HTML, 66 stars): A repository linking to publicly available dialog datasets. Feel free to send pull requests.
5. MotionDetection (C++, 47 stars): A project on motion detection in a noisy environment (shaky or moving camera), through background subtraction with single Gaussian models.
6. OptionGAN (Python, 43 stars): Code accompanying the OptionGAN paper.
7. echo (Java, 36 stars): Android mesh networking chat with WiFi Direct.
8. RLSSContinuousControlTutorial (Python, 34 stars): Tutorial on continuous control at the Reinforcement Learning Summer School 2017.
9. ReproducibilityInContinuousPolicyGradientMethods (Python, 18 stars): Experiments examining reproducibility of policy gradient RL algorithms in continuous domains, mainly using the rllab implementation.
10. EthicsInDialogue (OpenEdge ABL, 15 stars)
11. MultiStepBootstrappingInRL (Python, 14 stars): A comparison of Q(\sigma) learning, as presented by Sutton and Barto in [1], to Tree-Backup, n-step Expected Sarsa, and n-step Sarsa.
12. SocraticSwarm (C#, 14 stars): A simulator and algorithms using decentralized receding horizon control for coordinating autonomous UAV systems in completing a search task.
13. SelfDestructingModels (Python, 12 stars)
14. SarsaVsExpectedSarsa (Jupyter Notebook, 8 stars): An analysis of the bias-variance tradeoff of Sarsa vs. Expected Sarsa, with experiments.
15. BayesianPolicyGradients (Python, 7 stars)
16. CMACvTileCode (Python, 7 stars)
17. ValuePolicyIterationVariations (Jupyter Notebook, 5 stars): Experiments testing variants of Value and Policy Iteration.
18. ExperimentsInIRL (Python, 4 stars)
19. TemporalYolo (Python, 4 stars): Experiments on temporal YOLO.
20. WhatShouldICite (4 stars): An informal record of the original citations I'm aware of for key terms in the scientific literature. It started because I didn't know the original work to cite for eligibility traces, and it seems important to do proper credit assignment.
21. orion-pytorch-ppo-acktr-a2c (Python, 3 stars): An adapted version of the ikostrikov RL algorithm implementation for use with the Oríon hyperparameter optimization framework.
22. DeepMultiObjectTracking (Python, 2 stars)
23. ClimateChangeFromMachineLearningResearch (Python, 2 stars)
24. drqawrapper (Python, 2 stars)
25. AdversarialGain (Python, 2 stars)
26. echo-laptop (Java, 1 star): The laptop client used to connect to echo nodes.
27. LLM-Tuning-Safety.github.io (CSS, 1 star)
28. TARProtocols (HTML, 1 star): Dataset of Discovery Validation Protocols.
29. NeurIPS (HTML, 1 star): A mirror for some of the NeurIPS website content with a new acronym.
30. Option-Critic-Turing-Machines (Jupyter Notebook, 1 star): A development toybox and pitch for integrating the option-critic architecture with neural Turing machines.
31. RL-Energy-Leaderboard (Python, 1 star)
32. AquaBoxDataset (1 star): A dataset for bounding box prediction in underwater environments of the Aqua family of hexapod robots.
33. Vulnerabilities-In-Discovery-Tech-Experiment-1 (Python, 1 star)
34. NLPAssignment1 (Python, 1 star): Code for Comp599 Assignment 1 (TAC document classification using simple algorithms and uni/bigram models).
35. TemporalDeepQLearning (Python, 1 star): Experiments in temporal deep Q-learning.