• Stars
    star
    152
  • Rank 244,685 (Top 5 %)
  • Language
    Python
  • Created over 3 years ago
  • Updated over 3 years ago

Reviews

There are no reviews yet. Be the first to send feedback to the community and the maintainers!

Repository Details

Small script to import the CSV files from https://datacatalog.worldbank.org/dataset/world-development-indicators into an SQLite database.

The dataset is licensed under CC-BY 4.0.

  1. Download the CSV zip (in the above link, go to Data & Resources, then select CSV)
  2. run create_db.py

The resulting database is around 600MB (7 million indicator values).

Examples

select count(*) as num_countries from wdi_country

num_countries
0 263

select country_code, long_name, currency_unit from wdi_country limit 5

country_code long_name currency_unit
0 ABW Aruba Aruban florin
1 AFG Islamic State of Afghanistan Afghan afghani
2 AGO People's Republic of Angola Angolan kwanza
3 ALB Republic of Albania Albanian lek
4 AND Principality of Andorra Euro

30 random indictaors

select series_code, indicator_name, substr(long_definition, 0, 500)||'...' as definition from wdi_series order by random() limit 30
series_code indicator_name definition
0 NY.GNP.MKTP.PP.KD GNI, PPP (constant 2017 international $) PPP GNI (formerly PPP GNP) is gross national income (GNI) converted to international dollars using purchasing power parity rates. An international dollar has the same purchasing power over GNI as a U.S. dollar has in the United States. Gross national income is the sum of value added by all resident producers plus any product taxes (less subsidies) not included in the valuation of output plus net receipts of primary income (compensation of employees and property income) from abroad. Data are in ...
1 TM.VAL.MRCH.CD.WT Merchandise imports (current US$) Merchandise imports show the c.i.f. value of goods received from the rest of the world valued in current U.S. dollars....
2 SE.XPD.CPRM.ZS Current education expenditure, primary (% of total expenditure in primary public institutions) Current expenditure is expressed as a percentage of direct expenditure in public educational institutions (instructional and non-instructional) of the specified level of education. Financial aid to students and other transfers are excluded from direct expenditure. Current expenditure is consumed within the current year and would have to be renewed if needed in the following year. It includes staff compensation and current expenditure other than for staff compensation (ex. on teaching materials,...
3 DT.ODA.ODAT.GI.ZS Net ODA received (% of gross capital formation) Net official development assistance (ODA) consists of disbursements of loans made on concessional terms (net of repayments of principal) and grants by official agencies of the members of the Development Assistance Committee (DAC), by multilateral institutions, and by non-DAC countries to promote economic development and welfare in countries and territories in the DAC list of ODA recipients. It includes loans with a grant element of at least 25 percent (calculated at a rate of discount of 10 per...
4 SE.SEC.ENRL.GC.FE.ZS Secondary education, general pupils (% female) Secondary general pupils are the number of secondary students enrolled in general education programs, including teacher training....
5 SE.SEC.CUAT.UP.ZS Educational attainment, at least completed upper secondary, population 25+, total (%) (cumulative) The percentage of population ages 25 and over that attained or completed upper secondary education....
6 SP.URB.TOTL Urban population Urban population refers to people living in urban areas as defined by national statistical offices. It is calculated using World Bank population estimates and urban ratios from the United Nations World Urbanization Prospects. Aggregation of urban and rural population may not add up to total population because of different country coverages....
7 SP.M15.2024.FE.ZS Women who were first married by age 15 (% of women ages 20-24) Women who were first married by age 15 refers to the percentage of women ages 20-24 who were first married by age 15....
8 SP.RUR.TOTL.ZG Rural population growth (annual %) Rural population refers to people living in rural areas as defined by national statistical offices. It is calculated as the difference between total population and urban population....
9 NE.GDI.FTOT.CN Gross fixed capital formation (current LCU) Gross fixed capital formation (formerly gross domestic fixed investment) includes land improvements (fences, ditches, drains, and so on); plant, machinery, and equipment purchases; and the construction of roads, railways, and the like, including schools, offices, hospitals, private residential dwellings, and commercial and industrial buildings. According to the 1993 SNA, net acquisitions of valuables are also considered capital formation. Data are in current local currency....
10 IC.FRM.OUTG.ZS Value lost due to electrical outages (% of sales for affected firms) Average losses due to electrical outages, as percentage of total annual sales. The value represents average losses for all firms which reported outages (please see indicator IC.ELC.OUTG.ZS)....
11 EP.PMP.SGAS.CD Pump price for gasoline (US$ per liter) Fuel prices refer to the pump prices of the most widely sold grade of gasoline. Prices have been converted from the local currency to U.S. dollars....
12 EG.ELC.LOSS.ZS Electric power transmission and distribution losses (% of output) Electric power transmission and distribution losses include losses in transmission between sources of supply and points of distribution and in the distribution to consumers, including pilferage....
13 IQ.SCI.PRDC Periodicity and timeliness assessment of statistical capacity (scale 0 - 100) The periodicity and timeliness indicator assesses the availability and periodicity of key socioeconomic indicators. It measures the extent to which data are made accessible to users through transformation of source data into timely statistical outputs. The periodicity score is calculated as the weighted average of 10 underlying indicator scores. The final periodicity score contributes 1/3 of the overall Statistical Capacity Indicator score....
14 SE.SEC.CUAT.UP.FE.ZS Educational attainment, at least completed upper secondary, population 25+, female (%) (cumulative) The percentage of population ages 25 and over that attained or completed upper secondary education....
15 IC.TAX.METG Number of visits or required meetings with tax officials (average for affected firms) Average number of visits or required meetings with tax officials during the year. The value represents the average number of visits for all firms which reported being visited or required to meet with tax officials (please see indicator IC.FRM.METG.ZS)....
16 SL.TLF.0714.WK.MA.TM Average working hours of children, working only, male, ages 7-14 (hours per week) Average working hours of children working only refers to the average weekly working hours of those children who are involved in economic activity and not attending school....
17 SP.POP.TOTL.FE.ZS Population, female (% of total population) Female population is the percentage of the population that is female. Population is based on the de facto definition of population, which counts all residents regardless of legal status or citizenship....
18 SP.DYN.TO65.MA.ZS Survival to age 65, male (% of cohort) Survival to age 65 refers to the percentage of a cohort of newborn infants that would survive to age 65, if subject to age specific mortality rates of the specified year....
19 EG.ELC.PETR.ZS Electricity production from oil sources (% of total) Sources of electricity refer to the inputs used to generate electricity. Oil refers to crude oil and petroleum products....
20 SE.ADT.1524.LT.FM.ZS Literacy rate, youth (ages 15-24), gender parity index (GPI) Gender parity index for youth literacy rate is the ratio of females to males ages 15-24 who can both read and write with understanding a short simple statement about their everyday life....
21 CM.MKT.TRAD.GD.ZS Stocks traded, total value (% of GDP) The value of shares traded is the total number of shares traded, both domestic and foreign, multiplied by their respective matching prices. Figures are single counted (only one side of the transaction is considered). Companies admitted to listing and admitted to trading are included in the data. Data are end of year values....
22 EN.CO2.MANF.ZS CO2 emissions from manufacturing industries and construction (% of total fuel combustion) CO2 emissions from manufacturing industries and construction contains the emissions from combustion of fuels in industry. The IPCC Source/Sink Category 1 A 2 includes these emissions. However, in the 1996 IPCC Guidelines, the IPCC category also includes emissions from industry autoproducers that generate electricity and/or heat. The IEA data are not collected in a way that allows the energy consumption to be split by specific end-use and therefore, autoproducers are shown as a separate item (Un...
23 SH.TBS.DTEC.ZS Tuberculosis case detection rate (%, all forms) Tuberculosis case detection rate (all forms) is the number of new and relapse tuberculosis cases notified to WHO in a given year, divided by WHO's estimate of the number of incident tuberculosis cases for the same year, expressed as a percentage. Estimates for all years are recalculated as new information becomes available and techniques are refined, so they may differ from those published previously....
24 FX.OWN.TOTL.FE.ZS Account ownership at a financial institution or with a mobile-money-service provider, female (% of population ages 15+) Account denotes the percentage of respondents who report having an account (by themselves or together with someone else) at a bank or another type of financial institution or report personally using a mobile money service in the past 12 months (female, % age 15+)....
25 SE.SEC.UNER.LO.FE.ZS Adolescents out of school, female (% of female lower secondary school age) Adolescents out of school are the percentage of lower secondary school age adolescents who are not enrolled in school....
26 FX.OWN.TOTL.YG.ZS Account ownership at a financial institution or with a mobile-money-service provider, young adults (% of population ages 15-24) Account denotes the percentage of respondents who report having an account (by themselves or together with someone else) at a bank or another type of financial institution or report personally using a mobile money service in the past 12 months (young adults, % of population ages 15-24)....
27 per_lm_alllm.ben_q1_tot Benefit incidence of unemployment benefits and ALMP to poorest quintile (% of total U/ALMP benefits) Benefit incidence of unemployment benefits and active labor market programs (ALMP) to poorest quintile shows the percentage of total unemployment and active labor market programs benefits received by the poorest 20% of the population. Unemployment benefits and active labor market programs include unemployment compensation, severance pay, and early retirement due to labor market reasons, labor market services (intermediation), training (vocational, life skills, and cash for training), job rotati...
28 HD.HCI.OVRL.FE Human capital index (HCI), female (scale 0-1) The HCI calculates the contributions of health and education to worker productivity. The final index score ranges from zero to one and measures the productivity as a future worker of child born today relative to the benchmark of full health and complete education....
29 BX.GSR.CCIS.CD ICT service exports (BoP, current US$) Information and communication technology service exports include computer and communications services (telecommunications and postal and courier services) and information services (computer data and news-related service transactions). Data are in current U.S. dollars....

Here's a more complex query: Get the newest data for the youth literacy rate indicator (from 2010 or later), then show the 20 countries with the lowest rate.

with newest_data as (
  select country_code, indicator_code, max(year) as year from wdi_data
  where
    indicator_code = (select series_code from wdi_series where indicator_name = 'Literacy rate, youth total (% of people ages 15-24)')
    and year > 2010
    group by country_code
)
select c.long_name as country, printf('%.1f %%', value) as "Youth Literacy Rate"
from wdi_data, newest_data
  join wdi_country c on c.country_code = wdi_data.country_code
where wdi_data.indicator_code = newest_data.indicator_code and wdi_data.country_code = newest_data.country_code and wdi_data.year = newest_data.year
order by value asc limit 20
country Youth Literacy Rate
0 Republic of Chad 30.8 %
1 Central African Republic 38.3 %
2 Republic of Niger 43.5 %
3 Republic of Guinea 46.3 %
4 Republic of South Sudan 47.9 %
5 Republic of Mali 50.1 %
6 Republic of Liberia 55.4 %
7 Burkina Faso 58.3 %
8 Republic of CΓ΄te d'Ivoire 58.4 %
9 Republic of Guinea-Bissau 60.4 %
10 Republic of Benin 60.9 %
11 Islamic Republic of Mauritania 63.9 %
12 Islamic State of Afghanistan 65.4 %
13 Republic of Sierra Leone 66.6 %
14 Republic of The Gambia 67.2 %
15 Republic of Senegal 69.5 %
16 Republic of Mozambique 70.9 %
17 Federal Democratic Republic of Ethiopia 72.8 %
18 Republic of Malawi 72.9 %
19 Republic of the Sudan 73.0 %

Table Definitions

select * from sqlite_master

type name tbl_name rootpage sql
0 table wdi_data wdi_data 2 CREATE TABLE "wdi_data" (
"indicator_code" TEXT,
"country_code" TEXT,
"year" INTEGER,
"value" REAL
)
1 table wdi_country wdi_country 3 CREATE TABLE "wdi_country" (
"country_code" TEXT,
"short_name" TEXT,
"table_name" TEXT,
"long_name" TEXT,
"2-alpha_code" TEXT,
"currency_unit" TEXT,
"special_notes" TEXT,
"region" TEXT,
"income_group" TEXT,
"wb-2_code" TEXT,
"national_accounts_base_year" TEXT,
"national_accounts_reference_year" REAL,
"sna_price_valuation" TEXT,
"lending_category" TEXT,
"other_groups" TEXT,
"system_of_national_accounts" TEXT,
"alternative_conversion_factor" TEXT,
"balance_of_payments_manual_in_use" TEXT,
"external_debt_reporting_status" TEXT,
"system_of_trade" TEXT,
"government_accounting_concept" TEXT,
"imf_data_dissemination_standard" TEXT,
"latest_population_census" TEXT,
"latest_household_survey" TEXT,
"source_of_most_recent_income_and_expenditure_data" TEXT,
"vital_registration_complete" TEXT,
"latest_agricultural_census" TEXT,
"latest_industrial_data" REAL,
"latest_trade_data" REAL
)
2 table wdi_country_series wdi_country_series 4 CREATE TABLE "wdi_country_series" (
"countrycode" TEXT,
"seriescode" TEXT,
"description" TEXT
)
3 table wdi_footnote wdi_footnote 5 CREATE TABLE "wdi_footnote" (
"countrycode" TEXT,
"seriescode" TEXT,
"year" TEXT,
"description" TEXT
)
4 table wdi_series wdi_series 6 CREATE TABLE "wdi_series" (
"series_code" TEXT,
"topic" TEXT,
"indicator_name" TEXT,
"short_definition" TEXT,
"long_definition" TEXT,
"unit_of_measure" TEXT,
"periodicity" TEXT,
"base_period" TEXT,
"other_notes" TEXT,
"aggregation_method" TEXT,
"limitations_and_exceptions" TEXT,
"notes_from_original_source" TEXT,
"general_comments" TEXT,
"source" TEXT,
"statistical_concept_and_methodology" TEXT,
"development_relevance" TEXT,
"related_source_links" TEXT,
"related_indicators" TEXT,
"license_type" TEXT
)
5 table wdi_series_time wdi_series_time 7 CREATE TABLE "wdi_series_time" (
"seriescode" TEXT,
"year" TEXT,
"description" TEXT
)
6 index ix_wdi_data_indicator_code_country_code_year wdi_data 8 CREATE INDEX "ix_wdi_data_indicator_code_country_code_year"ON "wdi_data" ("indicator_code","country_code","year")

More Repositories

1

ripgrep-all

rga: ripgrep, but also search in PDFs, E-Books, Office documents, zip, tar.gz, etc.
Rust
8,147
star
2

sql.js-httpvfs

Hosting read-only SQLite databases on static file hosters like Github Pages
TypeScript
3,469
star
3

sqlite-zstd

Transparent dictionary-based row-level compression for SQLite
Rust
1,447
star
4

blog

Source code of my personal blog
TypeScript
339
star
5

procedural-cities

Information about procedural city generation
TeX
128
star
6

nmap-log-parse

Logs which devices are in your local network and draws graphs
TypeScript
127
star
7

pandoc-url2cite

Effortlessly and transparently add correctly styled citations to your markdown paper given only a URL
TypeScript
122
star
8

youtube-sponsorship-stats

TypeScript
121
star
9

timetrackrs

An automated time tracker (WIP)
Rust
113
star
10

tv-show-ratings

Compare the episode ratings of TV shows
TypeScript
75
star
11

neural-network-demo

Demonstration and visualization of feed-forward neural networks running in the browser
TypeScript
60
star
12

tuxguitar

unofficial mirror from sourceforge svn
Java
57
star
13

backchannel-prediction

Yeah, Right, Uh-Huh: A Deep Learning Backchannel Predictor
Python
54
star
14

tantivy-wasm

TypeScript
42
star
15

fbstats

generate facebook messaging statistics
TypeScript
38
star
16

levenshtein-demo

TypeScript
35
star
17

typed-socket.io

A library for fully typed client-server communication with socket.io and TypeScript.
TypeScript
30
star
18

encrypted-gist

Storing files in github gists, with client side transparent encryption and authentication
TypeScript
28
star
19

webrtc-remote-touch-pen-input

JavaScript
24
star
20

ts-histdbimport

Imports a `.zsh_history` file into a https://github.com/larkery/zsh-histdb sqlite database.
JavaScript
24
star
21

convolution-demo

Visualization of convolution on pairs of simple functions
TypeScript
18
star
22

thought-forge-ai

Generate 30-60 second "deep thought" TikTok-style video including a monologue, moving video scenes, music, and subtitles.
TypeScript
16
star
23

ebv-theta-to-mqtt

Perl
15
star
24

emojidome

Interactive viewer of the results of the Emojidome XKCD
TypeScript
14
star
25

dupegone

small fast duplicate file finder in c++
C++
11
star
26

redis-remotify

A tiny TypeScript library for fully typed remote calls via Redis.
TypeScript
9
star
27

mima

Mima Compiler and Interpreter for the browser
JavaScript
9
star
28

Gelddruckmaschine

Telegram bot that finds arbitrage opportunities between multiple crypto trading sites to print money mostly risk-free
JavaScript
8
star
29

warc-sqlite

POC of converting a set of WARC web archive files to a SQLite database and querying it
Python
7
star
30

phiresky.github.io

Lazy Github homepage generator
TypeScript
7
star
31

deep-intellisense

IntelliSense based on deep learning using char-rnn
TypeScript
6
star
32

endoh1-ts

Deobfuscating the ASCII fluid simulater and converting it to TypeScript
JavaScript
6
star
33

lemmy-federation-state

Quick visualization of the lemmy federation state
TypeScript
6
star
34

youtube-watch-history-parse

TypeScript
5
star
35

rpi-autousbupload

Automatically uploads photos from usb devices to an ftp server. Optimized for robustness on an Raspberry Pi.
Python
5
star
36

dfa2regex

Converts Deterministic finite automata to regular expressions.
JavaScript
5
star
37

minecraft-stats

Parses Vanilla Minecraft Server Statistics and displays graphs
JavaScript
4
star
38

fix-messy-movie-folder

Try to identify movie files in messy folders and sort them into a predefined structure
Java
4
star
39

rust-brotli-wasm

Two experiments of the brotli encoder / decoder compiled from Rust to WebAssembly.
Rust
4
star
40

RobinHood-TheLegendOfSherwood-Resolution-Patcher

Patch the resolution of Robin Hood: The Legend of Sherwood to support any resolution (e.g. 1920x1080)
C++
4
star
41

socket.io-distributor

Simple load leveling for socket.io
TypeScript
3
star
42

portfolio-bot

Telegram Bot that allows you to get information on your investments (stocks, ETFs, etc)
TypeScript
3
star
43

coronavirus-reproduction-analysis

Jupyter Notebook
3
star
44

qalc-react

Unit Calculator
TypeScript
3
star
45

gaussian-mixtures-demo

TypeScript
2
star
46

pythoven

Python
2
star
47

joint-multilingual-speech-recognition-and-language-id

TeX
2
star
48

tree-magic-cli

Rust
2
star
49

how-long-am-i-working

use google location history to graph how much time you've spent at a location (e.g. at work)
TypeScript
2
star
50

pingplot

plot a ping log
R
2
star
51

csv-cooccurrence-graph

Rust
2
star
52

RaspberrySmartScaleReceiver

Program to receive the body weight and body composition analysis data from a Soehnle 63760 BB smart scale using a 433 Mhz receiver on the Raspberry Pi.
C++
2
star
53

phiresky

1
star
54

rga-windows-test

Rust
1
star
55

bachelor-thesis

TeX
1
star
56

bayesian-aggregation-for-swarm-reinforcement-learning

Python
1
star
57

score-voting-tool

Rust
1
star
58

masters-thesis

HTML
1
star
59

algo2-summary

compact summary of stuff from the algorithms 2 lecture. Zusammenfassung der Algorithmen II Vorlesung am KIT
TeX
1
star
60

ocr-pdf-via-document-ai

OCR a set of images via the Google Cloud API
JavaScript
1
star
61

plangraph-impl

Implementation of some algorithms for planar graphs
JavaScript
1
star
62

guitar-tabs

Static hosted version of the guitar tabs portion of my old (<2010) website
CSS
1
star
63

plangraph

Mitschrieb zur Vorlesung planare Graphen KIT SS2015
Python
1
star
64

kogsys-demos

overview page for all the lecture demonstrations
HTML
1
star
65

ts-boilerplates

just some simple boilerplates for me to use
TypeScript
1
star
66

nushell-history-skim

Rust
1
star
67

prosem-proto

Simple procedural town generation (for Proseminar)
TypeScript
1
star
68

fourier-series-demo

TypeScript
1
star