A collection of public data sets for testing visualization methods. The data sets are at various stages of preparation: some are raw data, some are CSV files, and some are exposed as AMD modules. The collection is messy, but with some digging you may find hidden gems.
Interesting Datasets
Most recently added at the top.
- Data sources by EfoxMaps
- RawGraphs Sample Datasets
- nomis - UK Census data from 2021 (via Twitter)
- Cantometrics Data, from The Global Jukebox, music database
- Observable Curated Datasets | Old Observable Curated Datasets
- MFRED Electricity Usage Dataset
- BBC Shared Data Unit
- Most Popular Operating Systems
- Remaking Figures from Semiology of Graphics
- Awesome Public Datasets
- FMA: A Dataset For Music Analysis
- Food Nutrition Data
- Historical Weather Warnings
- PM2.5 Air Quality by Country over Time
- US Energy Information Administration Data (see also Analysis & Projections)
- Climate.gov Datasets
- The Economist Graphic Detail data
- Dataset collection: Sports Data Sets for Data Modeling, Visualization, Predictions, Machine Learning
- Dataset collection: information is beautiful - Data
- Dataset collection: R for Data Science Tidy Tuesdays
- Stranger Things Ratings
- SIPRI Arms Transfers Database
- CWUR - World University Rankings 2019-2020
- TopoJSON Collection World countries and subdivisions
- Classic datasets from Petra Isenberg et al.
- Soul of the Community (American Statistical Association)
- World Population Prospects (United Nations)
- Employment (Bureau of Labor Statistics)
- Healthy People (Centers for Disease Control)
- GapMinder Data
- NASA Satellite-Derived Environmental Indicators
- IMF Public Finances in Modern History Database
- Executions in the US by type over time
- Datasets used in the book, An Introduction to Categorical Data Analysis
- Energy Information Administration Open Data
- Data sets from Five Thirty Eight
- Data sets in the Infovis Wiki
- Data sets from Andy Kirk's Link Archive
- Makeover Monday Datasets
- SOCR Datasets
- UCI Machine Learning Repository Datasets
- BrightKite User Check-ins (57.2 MB)
- ACLED (Armed Conflict Location and Event Data Project) (35 MB)
- Safecast (3.2 GB)
- Statistical Computing Statistical Graphics Data Expo: Airline on-time performance (12 GB)
- The GDELT Data Set (~100 GB)
- The Indian Census 2011
- Best Buy Developer API
Leads
These are "leads" to find interesting datasets. They have teasers of cool data, but it will take some work to find the data behind them.