  • Stars: 2,936
  • Rank: 15,436 (Top 0.4 %)
  • Language: Python
  • License: MIT License
  • Created over 6 years ago
  • Updated 3 months ago

Repository Details

Python scripts for ETL (extract, transform and load) jobs for Ethereum blocks, transactions, ERC20 / ERC721 tokens, transfers, receipts, logs, contracts, internal transactions. Data is available in Google BigQuery https://goo.gl/oY5BCQ

Ethereum ETL

Ethereum ETL lets you convert blockchain data into convenient formats like CSVs and relational databases.

Do you just want to query Ethereum data right away? Use the public dataset in BigQuery.

Full documentation available here.

Quickstart

Install Ethereum ETL:

pip3 install ethereum-etl

Export blocks and transactions (Schema, Reference):

> ethereumetl export_blocks_and_transactions --start-block 0 --end-block 500000 \
--blocks-output blocks.csv --transactions-output transactions.csv \
--provider-uri https://mainnet.infura.io/v3/7aef3f0cd1f64408b163814b22cc643c
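
Once the export finishes, the resulting CSV can be inspected with plain Python. The following is a minimal sketch and not part of Ethereum ETL itself: the helper name `summarize_blocks` is hypothetical, and the column names `number` and `transaction_count` are assumptions based on the blocks schema, so check them against the header row of your own blocks.csv.

```python
import csv
import io


def summarize_blocks(csv_file):
    """Return (block_count, total_transactions, first_block, last_block).

    csv_file is an open text file (or file-like object) holding blocks.csv.
    Assumes the columns 'number' and 'transaction_count' are present.
    """
    reader = csv.DictReader(csv_file)
    block_count = 0
    total_transactions = 0
    first_block = None
    last_block = None
    for row in reader:
        number = int(row["number"])
        block_count += 1
        total_transactions += int(row["transaction_count"])
        # Track the range without assuming the rows are sorted.
        first_block = number if first_block is None else min(first_block, number)
        last_block = number if last_block is None else max(last_block, number)
    return block_count, total_transactions, first_block, last_block


if __name__ == "__main__":
    # Tiny in-memory sample standing in for a real blocks.csv export.
    sample = io.StringIO(
        "number,hash,transaction_count\n"
        "0,0xaaa,0\n"
        "1,0xbbb,2\n"
        "2,0xccc,5\n"
    )
    print(summarize_blocks(sample))
```

For a real export, pass an open file instead of the sample, e.g. `with open("blocks.csv", newline="") as f: summarize_blocks(f)`.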

Export ERC20 and ERC721 transfers (Schema, Reference):

> ethereumetl export_token_transfers --start-block 0 --end-block 500000 \
--provider-uri file://$HOME/Library/Ethereum/geth.ipc --output token_transfers.csv
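
As a quick sanity check on the exported transfers, it can help to count rows per token contract. This is a sketch under assumptions: `count_transfers_by_token` is a hypothetical helper, and the `token_address` column name is assumed from the token transfers schema. Only rows are counted; the `value` column is deliberately not summed, because its meaning may differ between token standards.

```python
import csv
import io
from collections import Counter


def count_transfers_by_token(csv_file):
    """Count transfer rows per token contract address.

    csv_file is an open text file (or file-like object) holding
    token_transfers.csv. Assumes a 'token_address' column is present.
    """
    counts = Counter()
    for row in csv.DictReader(csv_file):
        # Normalise case so checksummed and lowercase addresses match.
        counts[row["token_address"].lower()] += 1
    return counts


if __name__ == "__main__":
    # In-memory sample standing in for a real token_transfers.csv export.
    sample = io.StringIO(
        "token_address,from_address,to_address,value\n"
        "0xA,0x1,0x2,10\n"
        "0xa,0x2,0x3,5\n"
        "0xb,0x1,0x3,1\n"
    )
    for address, count in count_transfers_by_token(sample).most_common():
        print(address, count)
```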

Export traces (Schema, Reference):

> ethereumetl export_traces --start-block 0 --end-block 500000 \
--provider-uri file://$HOME/Library/Ethereum/parity.ipc --output traces.csv

Stream blocks, transactions, logs, token_transfers continually to console (Reference):

> pip3 install ethereum-etl[streaming]
> ethereumetl stream --start-block 500000 -e block,transaction,log,token_transfer --log-file log.txt \
--provider-uri https://mainnet.infura.io/v3/7aef3f0cd1f64408b163814b22cc643c
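
Console output from the stream command can be post-processed by another program. The sketch below assumes each streamed item is printed as one JSON object per line with a `type` field such as `block` or `transaction`; that format is an assumption to verify against your own output, and `count_items_by_type` is a hypothetical helper. Lines that are not JSON objects are skipped.

```python
import json
from collections import Counter


def count_items_by_type(lines):
    """Count streamed items per 'type' field, ignoring non-JSON lines."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            item = json.loads(line)
        except json.JSONDecodeError:
            # Not JSON (for example a log message); skip it.
            continue
        if isinstance(item, dict) and "type" in item:
            counts[item["type"]] += 1
    return counts


if __name__ == "__main__":
    # Hand-written sample lines standing in for real stream output.
    sample = [
        '{"type": "block", "number": 500000}',
        '{"type": "transaction", "block_number": 500000}',
        "some log message",
        '{"type": "block", "number": 500001}',
    ]
    print(dict(count_items_by_type(sample)))
```

To use it on live output, the same function could be fed `sys.stdin` while the stream command is piped into the script.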

Find other commands here.

For the latest version, check out the repo and call

> pip3 install -e . 
> python3 ethereumetl.py

Running Tests

> pip3 install -e .[dev,streaming]
> export ETHEREUM_ETL_RUN_SLOW_TESTS=True
> export PROVIDER_URL=<your_provider_uri>
> pytest -vv

Running Tox Tests

> pip3 install tox
> tox

Running in Docker

  1. Install Docker: https://docs.docker.com/get-docker/

  2. Build a docker image

     > docker build -t ethereum-etl:latest .
     > docker image ls
    
  3. Run a container out of the image

     > docker run -v $HOME/output:/ethereum-etl/output ethereum-etl:latest export_all -s 0 -e 5499999 -b 100000 -p https://mainnet.infura.io
     > docker run -v $HOME/output:/ethereum-etl/output ethereum-etl:latest export_all -s 2018-01-01 -e 2018-01-01 -p https://mainnet.infura.io
    
  4. Run streaming to console or Pub/Sub

     > docker build -t ethereum-etl:latest .
     > echo "Stream to console"
     > docker run ethereum-etl:latest stream --start-block 500000 --log-file log.txt
     > echo "Stream to Pub/Sub"
     > docker run -v /path_to_credentials_file/:/ethereum-etl/ --env GOOGLE_APPLICATION_CREDENTIALS=/ethereum-etl/credentials_file.json ethereum-etl:latest stream --start-block 500000 --output projects/<your-project>/topics/crypto_ethereum
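
The export_all command in step 3 is given a block range (-s 0 -e 5499999) and a batch size (-b 100000). To get a feel for how many batches that implies, the arithmetic can be sketched as below. This is not the tool's own partitioning code: `split_into_batches` is a hypothetical helper, and it assumes both range bounds are inclusive.

```python
def split_into_batches(start_block, end_block, batch_size):
    """Yield inclusive (batch_start, batch_end) pairs covering the range."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    if start_block > end_block:
        raise ValueError("start_block must not exceed end_block")
    batch_start = start_block
    while batch_start <= end_block:
        # The last batch may be shorter than batch_size.
        batch_end = min(batch_start + batch_size - 1, end_block)
        yield batch_start, batch_end
        batch_start = batch_end + 1


if __name__ == "__main__":
    batches = list(split_into_batches(0, 5499999, 100000))
    print(len(batches), batches[0], batches[-1])
```

Under these assumptions the range 0 to 5499999 splits into 55 batches, from (0, 99999) to (5400000, 5499999).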
    

If you are running on an Apple M1 chip, add the --platform linux/x86_64 option to the build and run commands, e.g.:

docker build --platform linux/x86_64 -t ethereum-etl:latest .
docker run --platform linux/x86_64 ethereum-etl:latest stream --start-block 500000

Projects using Ethereum ETL

  • Google - Public BigQuery Ethereum datasets
  • Nansen - Analytics platform for Ethereum

More Repositories

  1. bitcoin-etl (Python, 405 stars): ETL scripts for Bitcoin, Litecoin, Dash, Zcash, Doge, Bitcoin Cash. Available in Google BigQuery https://goo.gl/oY5BCQ
  2. ethereum-etl-airflow (Python, 405 stars): Airflow DAGs for exporting, loading, and parsing the Ethereum blockchain data. How to get any Ethereum smart contract into BigQuery https://towardsdatascience.com/how-to-get-any-ethereum-smart-contract-into-bigquery-in-8-mins-bab5db1fdeee
  3. awesome-bigquery-views (205 stars): Useful SQL queries for Blockchain ETL datasets in BigQuery.
  4. public-datasets (187 stars): The list of public blockchain datasets in BigQuery
  5. ethereum-etl-postgres (Shell, 137 stars): ETL for moving Ethereum data to PostgreSQL database
  6. polygon-etl (Python, 100 stars): ETL (extract, transform and load) tools for ingesting Polygon blockchain data to Google BigQuery and Pub/Sub
  7. blockchain-etl-streaming (Python, 77 stars): Streaming Ethereum and Bitcoin blockchain data to Google Pub/Sub or Postgres in Kubernetes
  8. ethereum2-etl (Python, 68 stars): Python scripts for ETL (extract, transform and load) jobs for Ethereum 2.0 beacon blocks, attestations, deposits, slashings, validators, committees. Data is available in Google BigQuery
  9. blockchain-etl-architecture (44 stars): Blockchain ETL Architecture
  10. ethers.js-bigquery (JavaScript, 39 stars): ethers.js library, compiled for use in Google BigQuery
  11. solana-etl-airflow (Python, 32 stars): ETL for Solana. Contributions are welcome. Join the Telegram channel https://t.me/joinchat/GsMpbA3mv1OJ6YMp3T5ORQ
  12. bitcoin-etl-airflow (Python, 31 stars): Airflow DAGs for https://github.com/blockchain-etl/bitcoin-etl
  13. blockchain-kubernetes (Smarty, 26 stars): Kubernetes manifests for running blockchain nodes
  14. ethereum-etl-neo4j (Shell, 20 stars): ETL for moving Ethereum data to Neo4j database
  15. bigquery-to-pubsub (Python, 16 stars): A tool for streaming time series data from a BigQuery table to a Pub/Sub topic
  16. bitcoin-etl-airflow-neo4j (Python, 14 stars): Airflow DAGs for ingesting Bitcoin blockchain data to Neo4j
  17. tezos-etl (Python, 13 stars): Python scripts for ETL (extract, transform and load) jobs for Tezos blocks, balance updates, and operations
  18. blockchain-etl-table-definition-cli (Python, 12 stars): CLI for generating table definitions for https://github.com/blockchain-etl/ethereum-etl-airflow
  19. hedera-etl (Java, 11 stars): ETL scripts for Hedera Hashgraph
  20. blockchain-streaming-analytics (Java, 9 stars): Blockchain streaming analytics
  21. eos-etl (Python, 9 stars): ETL scripts for EOS.
  22. ethereum-export-pipeline (Python, 8 stars): UNMAINTAINED! AWS CloudFormation scripts for Ethereum ETL export pipeline
  23. blockchain-etl-dataflow (Java, 8 stars): Dataflow pipelines for Blockchain ETL. Connects Pub/Sub topics with BigQuery tables.
  24. data-studio-connectors (JavaScript, 7 stars): Connect Google BigQuery crypto public datasets to Google Data Studio
  25. abi-functions (7 stars)
  26. blockchain-terraform-deployment (HCL, 6 stars): Template repository for deploying https://github.com/blockchain-etl/blockchain-terraform
  27. icon-etl (Python, 6 stars): Python scripts for ETL (extract, transform and load) jobs for ICON blocks, transactions, receipts, and logs.
  28. ethereum2-etl-airflow (Python, 5 stars): Airflow DAGs for exporting Ethereum 2.0 blockchain data to Google BigQuery
  29. blockchain-terraform (HCL, 5 stars): Terraform configuration files for running blockchain nodes
  30. blockchain-etl-common (Python, 5 stars): Common utils for blockchain-etl
  31. abi (5 stars): EVM public good - pull requests welcome for any ABI from any EVM
  32. band-etl (Python, 5 stars): ETL (extract, transform and load) tools for ingesting Band Protocol blockchain data to Google BigQuery and Pub/Sub
  33. tezos-kubernetes (Shell, 5 stars): Kubernetes manifests for running Tezos node
  34. abi-parser (JavaScript, 5 stars): Web app which parses smart contracts and outputs queries and tables for Ethereum-ETL
  35. iotex-etl (Python, 5 stars): ETL (extract, transform and load) tools for ingesting IoTeX blockchain data to Google BigQuery and Pub/Sub
  36. ordinals-etl (Python, 4 stars)
  37. anomalous-transactions-detector-dataflow (Java, 4 stars): Dataflow pipeline for detecting anomalous transactions on the Ethereum and Bitcoin blockchains
  38. solana-etl (Rust, 4 stars)
  39. etl-rust (Rust, 4 stars)
  40. twitter-bot-cloud-function (JavaScript, 3 stars): Google Cloud Function for tweeting Blockchain ETL alerts
  41. zilliqa-etl (Python, 3 stars): Python scripts for ETL (extract, transform and load) jobs for Zilliqa blockchain data
  42. pubsub-to-firestore-dataflow (Java, 2 stars): Dataflow pipeline that pulls messages from a Pub/Sub topic and saves them in a Firestore collection
  43. eos-etl-airflow (Python, 2 stars): Airflow DAGs for https://github.com/blockchain-etl/eos-etl
  44. bitcoin-rpc (2 stars): Bitcoin JSON RPC client in Python
  45. icon-etl-airflow (Python, 2 stars): Airflow DAGs for exporting, loading, and parsing the ICON blockchain data.
  46. tezos-etl-airflow (Python, 2 stars): Airflow DAGs for exporting and loading the Tezos blockchain data to Google BigQuery
  47. throttle-pubsub-cloud-function (JavaScript, 1 star): Google Cloud Function that can throttle messages in a Pub/Sub topic
  48. theta-etl (Python, 1 star)
  49. iotex-kubernetes (Shell, 1 star): Helm charts for running IoTeX node