  • Stars
    364
  • Rank 117,101 (Top 3 %)
  • Language
    Python
  • License
    MIT License
  • Created over 7 years ago
  • Updated over 1 year ago

Repository Details

Easily view PyPI download statistics via Google's BigQuery.

pypinfo: View PyPI download statistics with ease.

pypinfo is a simple CLI to access PyPI download statistics via Google's BigQuery.

Table of contents

  1. Usage
  2. Installation
  3. Credits

Usage

$ pypinfo
Usage: pypinfo [OPTIONS] [PROJECT] [FIELDS]... COMMAND [ARGS]...
        
    Valid fields are:

    project | version | file | pyversion | percent3 | percent2 | impl | impl-version |
    openssl | date | month | year | country | installer | installer-version |
    setuptools-version | system | system-release | distro | distro-version | cpu |
    libc | libc-version
    
Options:
    -a, --auth TEXT         Path to Google credentials JSON file.
    --run / --test          --test simply prints the query.
    -j, --json              Print data as JSON, with keys `rows` and `query`.
    -i, --indent INTEGER    JSON indentation level.
    -t, --timeout INTEGER   Milliseconds. Default: 120000 (2 minutes)
    -l, --limit INTEGER     Maximum number of query results. Default: 10
    -d, --days INTEGER      Number of days in the past to include. Default: 30
    -sd, --start-date TEXT  Must be negative or YYYY-MM[-DD]. Default: -31
    -ed, --end-date TEXT    Must be negative or YYYY-MM[-DD]. Default: -1
    -m, --month TEXT        Shortcut for -sd & -ed for a single YYYY-MM month.
    -w, --where TEXT        WHERE conditional. Default: file.project = "project"
    -o, --order TEXT        Field to order by. Default: download_count
    --all                   Show downloads by all installers, not only pip.
    -pc, --percent          Print percentages.
    -md, --markdown         Output as Markdown.
    -v, --verbose           Print debug messages to stderr.
    --version               Show the version and exit.
    -h, --help              Show this message and exit.

pypinfo accepts 0 or more options, followed by exactly 1 project, followed by 0 or more fields. By default only the last 30 days are queried. Let's take a look at some examples!

Tip: If queries fail with NoneType errors, increase the timeout with -t/--timeout.
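The argument order described above (options, then exactly one project, then fields) can be sketched as a small helper that builds the command line. This is only an illustration: build_pypinfo_command is a hypothetical name, and it covers just a few of the flags from the usage text.

```python
def build_pypinfo_command(project, fields=(), *, days=None, limit=None,
                          json_output=False, test=False):
    """Return an argv list: options first, then one project, then fields.

    Hypothetical helper for illustration; only flags listed in the usage
    text above are emitted.
    """
    argv = ["pypinfo"]
    if days is not None:
        argv += ["--days", str(days)]
    if limit is not None:
        argv += ["--limit", str(limit)]
    if json_output:
        argv.append("--json")
    if test:
        argv.append("--test")
    argv.append(project)   # exactly one project; "" is used for all projects
    argv.extend(fields)    # zero or more fields
    return argv
```

The resulting list could be passed to subprocess.run, which avoids shell quoting issues with the empty project string.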

Downloads for a project

$ pypinfo requests
Served from cache: False
Data processed: 2.83 GiB
Data billed: 2.83 GiB
Estimated cost: $0.02

| download_count |
| -------------- |
|    116,353,535 |
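To consume a result like this from a script, the --json option prints the data with the keys rows and query. The sketch below parses such output. The shape of each row (a mapping with a download_count entry) is an assumption, and SAMPLE_OUTPUT is a hand-written stand-in rather than real pypinfo output.

```python
import json

# Hypothetical payload: only the top-level keys `rows` and `query` are
# documented; the structure of each row is assumed for illustration.
SAMPLE_OUTPUT = '{"rows": [{"download_count": 116353535}], "query": {}}'


def total_downloads(json_text):
    """Sum the download_count of every row in a --json payload."""
    data = json.loads(json_text)
    return sum(row["download_count"] for row in data["rows"])
```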

All downloads

$ pypinfo ""
Served from cache: False
Data processed: 116.15 GiB
Data billed: 116.15 GiB
Estimated cost: $0.57

| download_count |
| -------------- |
|  8,642,447,168 |
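The "Estimated cost" lines in these outputs are consistent with the $5 per TB rate quoted in the installation section. The sketch below reproduces them under two assumptions that are not confirmed by pypinfo itself: a TB is treated as 2**40 bytes, and the result is rounded up to the next cent. The free first TB per month is not modelled.

```python
from decimal import Decimal, ROUND_CEILING

PRICE_PER_TIB = Decimal("5")      # USD per TB, from the installation notes
BYTES_PER_TIB = Decimal(2) ** 40  # assumption: TB is treated as TiB


def estimate_cost(bytes_billed):
    """Estimate the query cost in USD, rounded up to the next cent."""
    cost = Decimal(bytes_billed) / BYTES_PER_TIB * PRICE_PER_TIB
    return cost.quantize(Decimal("0.01"), rounding=ROUND_CEILING)


def gib(value):
    """Convert a GiB figure such as 2.83 into a whole number of bytes."""
    return int(Decimal(str(value)) * 2 ** 30)
```

With these assumptions, estimate_cost(gib(2.83)) gives 0.02 and estimate_cost(gib(116.15)) gives 0.57, matching the two outputs above.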

Downloads for a project by Python version

$ pypinfo django pyversion
Served from cache: False
Data processed: 967.33 MiB
Data billed: 968.00 MiB
Estimated cost: $0.01

| python_version | download_count |
| -------------- | -------------- |
| 3.8            |      1,735,967 |
| 3.6            |      1,654,871 |
| 3.7            |      1,326,423 |
| 2.7            |        876,621 |
| 3.9            |        524,570 |
| 3.5            |        258,609 |
| 3.4            |         12,769 |
| 3.10           |          3,050 |
| 3.3            |            225 |
| 2.6            |            158 |
| Total          |      6,393,263 |
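The table layout above (left-aligned text, right-aligned counts with thousands separators) is easy to reproduce when post-processing results. The following sketch mimics that layout. It is not pypinfo's own renderer, and render_table is a hypothetical name.

```python
def render_table(headers, rows):
    """Render rows as a pipe-delimited table like the ones shown above."""
    def fmt(value):
        # Integers get thousands separators; everything else is stringified.
        return f"{value:,}" if isinstance(value, int) else str(value)

    cells = [[fmt(value) for value in row] for row in rows]
    widths = [
        max([len(header)] + [len(row[i]) for row in cells])
        for i, header in enumerate(headers)
    ]
    lines = [
        "| " + " | ".join(h.ljust(w) for h, w in zip(headers, widths)) + " |",
        "| " + " | ".join("-" * w for w in widths) + " |",
    ]
    for raw, row in zip(rows, cells):
        # Numbers are right-aligned, text is left-aligned.
        padded = [
            cell.rjust(w) if isinstance(value, int) else cell.ljust(w)
            for value, cell, w in zip(raw, row, widths)
        ]
        lines.append("| " + " | ".join(padded) + " |")
    return "\n".join(lines)
```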

All downloads by country code

$ pypinfo "" country
Served from cache: False
Data processed: 150.40 GiB
Data billed: 150.40 GiB
Estimated cost: $0.74

| country | download_count |
| ------- | -------------- |
| US      |  6,614,473,568 |
| IE      |    336,037,059 |
| IN      |    192,914,402 |
| DE      |    186,968,946 |
| NL      |    182,691,755 |
| None    |    141,753,357 |
| BE      |    111,234,463 |
| GB      |    109,539,219 |
| SG      |    106,375,274 |
| FR      |     86,036,896 |
| Total   |  8,068,024,939 |

Downloads for a project by system and distribution

$ pypinfo cryptography system distro
Served from cache: False
Data processed: 2.52 GiB
Data billed: 2.52 GiB
Estimated cost: $0.02

| system_name | distro_name                     | download_count |
| ----------- | ------------------------------- | -------------- |
| Linux       | Ubuntu                          |     19,524,538 |
| Linux       | Debian GNU/Linux                |     11,662,104 |
| Linux       | Alpine Linux                    |      3,105,553 |
| Linux       | Amazon Linux AMI                |      2,427,975 |
| Linux       | Amazon Linux                    |      2,374,869 |
| Linux       | CentOS Linux                    |      1,955,181 |
| Windows     | None                            |      1,522,069 |
| Linux       | CentOS                          |        568,370 |
| Darwin      | macOS                           |        489,859 |
| Linux       | Red Hat Enterprise Linux Server |        296,858 |
| Total       |                                 |     43,927,376 |

Most popular projects in the past year

$ pypinfo --days 365 "" project
Served from cache: False
Data processed: 1.69 TiB
Data billed: 1.69 TiB
Estimated cost: $8.45

| project         | download_count |
| --------------- | -------------- |
| urllib3         |  1,382,528,406 |
| six             |  1,172,798,441 |
| botocore        |  1,053,169,690 |
| requests        |    995,387,353 |
| setuptools      |    992,794,567 |
| certifi         |    948,518,394 |
| python-dateutil |    934,709,454 |
| idna            |    929,781,443 |
| s3transfer      |    877,565,186 |
| chardet         |    854,744,674 |
| Total           | 10,141,997,608 |
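The option defaults (30 days, start date -31, end date -1) and the generated query shown later in this section (365 days becoming -366 and -1) suggest that --days N maps to a window from N+1 days ago to 1 day ago. The sketch below encodes that inferred relationship. It is an assumption drawn from those examples, not pypinfo's actual code.

```python
def timestamp_filter(days=30):
    """Build the timestamp condition for a window of `days` days.

    Inferred rule: the window ends 1 day ago and starts days + 1 days ago.
    """
    start, end = -(days + 1), -1
    return (
        "timestamp BETWEEN "
        f"TIMESTAMP_ADD(CURRENT_TIMESTAMP(), INTERVAL {start} DAY) AND "
        f"TIMESTAMP_ADD(CURRENT_TIMESTAMP(), INTERVAL {end} DAY)"
    )
```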

Downloads between two YYYY-MM-DD dates

$ pypinfo --start-date 2018-04-01 --end-date 2018-04-30 setuptools
Served from cache: False
Data processed: 571.37 MiB
Data billed: 572.00 MiB
Estimated cost: $0.01

| download_count |
| -------------- |
|      8,972,826 |

Downloads between two YYYY-MM dates

  • A YYYY-MM --start-date defaults to the first day of the month
  • A YYYY-MM --end-date defaults to the last day of the month

$ pypinfo --start-date 2018-04 --end-date 2018-04 setuptools
Served from cache: False
Data processed: 571.37 MiB
Data billed: 572.00 MiB
Estimated cost: $0.01

| download_count |
| -------------- |
|      8,972,826 |

Downloads for a single YYYY-MM month

$ pypinfo --month 2018-04 setuptools
Served from cache: False
Data processed: 571.37 MiB
Data billed: 572.00 MiB
Estimated cost: $0.01

| download_count |
| -------------- |
|      8,972,826 |
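The last three invocations return the same count because a YYYY-MM start date expands to the first day of the month, a YYYY-MM end date expands to the last day, and --month is a shortcut for both. A minimal sketch of that expansion follows. The function names are hypothetical and this is not pypinfo's implementation.

```python
import calendar


def expand_start(date_text):
    """Expand YYYY-MM to the first day of the month; pass YYYY-MM-DD through."""
    if len(date_text) == 7:
        return date_text + "-01"
    return date_text


def expand_end(date_text):
    """Expand YYYY-MM to the last day of the month; pass YYYY-MM-DD through."""
    if len(date_text) == 7:
        year, month = map(int, date_text.split("-"))
        last_day = calendar.monthrange(year, month)[1]
        return f"{date_text}-{last_day:02d}"
    return date_text


def month_range(month_text):
    """Return the (start, end) dates equivalent to --month YYYY-MM."""
    return expand_start(month_text), expand_end(month_text)
```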

Percentage of Python 3 downloads of the top 100 projects in the past year

Let's use --test to only see the query instead of sending it.

$ pypinfo --test --days 365 --limit 100 "" project percent3
SELECT
    file.project as project,
    ROUND(100 * SUM(CASE WHEN REGEXP_EXTRACT(details.python, r"^([^\.]+)") = "3" THEN 1 ELSE 0 END) / COUNT(*), 1) as percent_3,
    COUNT(*) as download_count,
FROM `bigquery-public-data.pypi.file_downloads`
WHERE timestamp BETWEEN TIMESTAMP_ADD(CURRENT_TIMESTAMP(), INTERVAL -366 DAY) AND TIMESTAMP_ADD(CURRENT_TIMESTAMP(), INTERVAL -1 DAY)
    AND details.installer.name = "pip"
GROUP BY
    project
ORDER BY
    download_count DESC
LIMIT 100
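The percent_3 expression in the query above extracts everything before the first dot of the Python version, counts the rows where that part equals "3", and divides by the total row count. The same calculation can be sketched in Python for a list of version strings. Treating a missing version as "not Python 3" mirrors the ELSE 0 branch, and the empty-input behaviour is a choice made for this sketch.

```python
import re

MAJOR = re.compile(r"^([^\.]+)")  # same pattern as in the query


def major_version(python_version):
    """Return the part before the first dot, or None if unavailable."""
    if python_version is None:
        return None
    match = MAJOR.match(python_version)
    return match.group(1) if match else None


def percent_3(python_versions):
    """Percentage of downloads made with Python 3, rounded to 1 decimal."""
    total = len(python_versions)
    if total == 0:
        return None  # sketch-only choice; the query never sees an empty group
    py3 = sum(1 for version in python_versions if major_version(version) == "3")
    return round(100 * py3 / total, 1)
```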

Downloads for a given version

pypinfo supports PEP 440 version matching.

We can use it to query stats on a given major version.

$ pypinfo -pc 'pip==21.*' pyversion version
Served from cache: False
Data processed: 34.45 MiB
Data billed: 35.00 MiB
Estimated cost: $0.01

| python_version | version | percent | download_count |
| -------------- | ------- | ------- | -------------- |
| 3.6            | 21.3.1  |  78.74% |         10,430 |
| 3.8            | 21.3.1  |   7.81% |          1,034 |
| 3.7            | 21.2.1  |   3.59% |            476 |
| 3.7            | 21.3.1  |   2.60% |            345 |
| 3.7            | 21.0.1  |   2.25% |            298 |
| 3.8            | 21.0.1  |   1.58% |            209 |
| 3.8            | 21.2.1  |   1.42% |            188 |
| 3.7            | 21.1.2  |   0.81% |            107 |
| 3.9            | 21.3.1  |   0.69% |             92 |
| 3.8            | 21.1.1  |   0.51% |             67 |
| Total          |         |         |         13,246 |
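In the sample above, the Total equals the sum of the ten rows shown, and each percentage is that row's count divided by the Total. A short sketch of the arithmetic (with_percentages is a hypothetical helper, not part of pypinfo):

```python
def with_percentages(counts):
    """Return (percent strings, total) for a list of download counts."""
    total = sum(counts)
    return [f"{count / total:.2%}" for count in counts], total
```

For the counts in the table, the first row works out to 10,430 / 13,246, which formats as 78.74%.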

We can also use it to query stats on an exact version:

$ pypinfo -pc 'numpy==1.23rc3' pyversion version
Served from cache: False
Data processed: 34.01 MiB
Data billed: 35.00 MiB
Estimated cost: $0.01

| python_version | version   | percent | download_count |
| -------------- | --------- | ------- | -------------- |
| 3.9            | 1.23.0rc3 |  63.33% |             38 |
| 3.8            | 1.23.0rc3 |  28.33% |             17 |
| 3.10           | 1.23.0rc3 |   8.33% |              5 |
| Total          |           |         |             60 |
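The two examples rely on two PEP 440 behaviours: a trailing .* matches any version with that release prefix, and an exact match ignores trailing zeros in the release, which is why 1.23rc3 matches 1.23.0rc3. The sketch below implements only this small subset (numeric releases with optional a/b/rc pre-release tags) to illustrate the idea. It is not how pypinfo performs the match, and a complete implementation such as the packaging library also handles epochs, post and dev releases, and local versions.

```python
import re

_VERSION = re.compile(r"^(\d+(?:\.\d+)*)((?:a|b|rc)\d+)?$")


def _parse(version):
    """Split a version into (release tuple without trailing zeros, pre tag)."""
    match = _VERSION.match(version)
    if match is None:
        raise ValueError(f"unsupported version: {version!r}")
    release = [int(part) for part in match.group(1).split(".")]
    while len(release) > 1 and release[-1] == 0:
        release.pop()  # 1.23 and 1.23.0 compare equal
    return tuple(release), match.group(2)


def matches(spec, candidate):
    """Check `candidate` against the version part of an `==` specifier.

    Simplified subset: assumes `candidate` starts with a numeric release.
    """
    if spec.endswith(".*"):
        prefix = tuple(int(part) for part in spec[:-2].split("."))
        release = re.match(r"\d+(?:\.\d+)*", candidate).group(0)
        parts = [int(part) for part in release.split(".")]
        parts += [0] * (len(prefix) - len(parts))  # pad short releases
        return tuple(parts[:len(prefix)]) == prefix
    return _parse(spec) == _parse(candidate)
```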

Installation

pypinfo is distributed on PyPI as a universal wheel. It runs on Linux, macOS, and Windows and supports Python 3.7+.

This is relatively painless, I swear.

Create project

  1. Go to https://bigquery.cloud.google.com.

  2. Sign up if you haven't already. The first TB of queried data each month is free. Each additional TB is $5.

  3. Sign in to your account if you are not already signed in.

  4. Go to https://console.developers.google.com/cloud-resource-manager and click CREATE PROJECT if you don't already have a project.

  5. This takes you to https://console.developers.google.com/projectcreate. Fill out the form and click CREATE. Any name is fine, but I recommend choosing something related to PyPI, like pypinfo, so that you know what the project is for.

  6. Once the project has been created (this can take a moment), select it from the project dropdown in the top-left corner, then open the top-left "Navigation Menu" and select "Cloud Overview > Dashboard".

Enable BigQuery API

  7. Click the top-left "Navigation Menu" button and select "API and services > Library".

  8. Search for "big query api" in the search field.

  9. Enable the BigQuery API by pressing the "Enable" button.

  10. After enabling the API, click CREATE CREDENTIALS.

Note: A later step asks you to go back to the BigQuery API panel. To get there, click the top-left "Navigation Menu" button, select "API and services > Enabled APIs and services", and on the resulting page click "BigQuery API".

  11. On the page that opens after you click "CREATE CREDENTIALS", choose "BigQuery API", "Application Data" and "No, I'm not using them".

  12. Fill in the account details and press "Create and Continue".

  13. Select the role "BigQuery User" (option path "BigQuery > BigQuery User") and press "Done".

  14. On the BigQuery API panel (see the note after item 10), click the "CREDENTIALS" tab. In the "Service accounts" section, click the credentials you created in items 11, 12 and 13.

  15. On the page that opens, click the "KEYS" tab. In the "ADD KEY" dropdown menu, select "Create new key".

  16. In the dialog that appears, select "JSON" and press "CREATE". This downloads the credentials as a JSON file named {name}-{credentials_hash}.json.

Installation and authentication

  1. Run python -m pip install pypinfo in the terminal.
  2. Run pypinfo --auth path/to/your_credentials.json, or set the GOOGLE_APPLICATION_CREDENTIALS environment variable to the path of the file.
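Before running a query, it can help to check that the credentials file is where you expect it to be. The sketch below looks for the path in an explicit argument first and then in GOOGLE_APPLICATION_CREDENTIALS, and verifies that the file is a service-account key. The function name and the precedence order are assumptions made for this example, not a description of how pypinfo resolves credentials.

```python
import json
import os


def resolve_credentials(auth_path=None, environ=None):
    """Return the credentials path after a basic sanity check.

    Assumed precedence for this sketch: explicit path, then the
    GOOGLE_APPLICATION_CREDENTIALS environment variable.
    """
    environ = os.environ if environ is None else environ
    path = auth_path or environ.get("GOOGLE_APPLICATION_CREDENTIALS")
    if not path:
        raise FileNotFoundError("no credentials path was provided")
    with open(path, encoding="utf-8") as handle:
        key = json.load(handle)
    # Service-account key files downloaded as JSON carry this type marker.
    if key.get("type") != "service_account":
        raise ValueError(f"{path} does not look like a service-account key")
    return path
```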

Credits
