• Stars
    star
    831
  • Rank 53,333 (Top 2 %)
  • Language
    Python
  • License
    MIT License
  • Created about 8 years ago
  • Updated about 2 months ago

Reviews

There are no reviews yet. Be the first to send feedback to the community and the maintainers!

Repository Details

A CLI tool to convert CSV / Excel / HTML / JSON / Jupyter Notebook / LDJSON / LTSV / Markdown / SQLite / SSV / TSV / Google-Sheets to a SQLite database file.

Summary

sqlitebiter is a CLI tool to convert CSV / Excel / HTML / JSON / Jupyter Notebook / LDJSON / LTSV / Markdown / SQLite / SSV / TSV / Google-Sheets to a SQLite database file.

PyPI package version Supported Python versions CI status of Linux/macOS/Windows CodeQL

Features

Usage

Create SQLite database from files

Create SQLite database from URL

Following is an example that converts HTML table tags within a web page to SQLite tables by the web page URL.

Example:
$ sqlitebiter url "https://en.wikipedia.org/wiki/Comparison_of_firewalls"
[INFO] sqlitebiter url: convert 'Comparison_of_firewalls' to 'Comparison_of_firewalls_Wikipedia_html1' table
[INFO] sqlitebiter url: convert 'Comparison_of_firewalls' to 'Comparison_of_firewalls_Wikipedia_html2' table
[INFO] sqlitebiter url: convert 'Comparison_of_firewalls' to 'Comparison_of_firewalls_Wikipedia_html3' table
[INFO] sqlitebiter url: convert 'Comparison_of_firewalls' to 'Comparison_of_firewalls_Wikipedia_html4' table
[INFO] sqlitebiter url: convert 'Comparison_of_firewalls' to 'Comparison_of_firewalls_Wikipedia_html5' table
[INFO] sqlitebiter url: convert 'Comparison_of_firewalls' to 'Comparison_of_firewalls_Wikipedia_html6' table
[INFO] sqlitebiter url: convert 'Comparison_of_firewalls' to 'Comparison_of_firewalls_Wikipedia_html7' table
[INFO] sqlitebiter url: convert 'Comparison_of_firewalls' to 'Comparison_of_firewalls_Wikipedia_html8' table
[INFO] sqlitebiter url: convert 'Comparison_of_firewalls' to 'Comparison_of_firewalls_Wikipedia_html9' table
[INFO] sqlitebiter url: convert 'Comparison_of_firewalls' to 'Comparison_of_firewalls_Wikipedia_html10' table
[INFO] sqlitebiter url: convert 'Comparison_of_firewalls' to 'Comparison_of_firewalls_Wikipedia_html11' table
[INFO] sqlitebiter url: converted results: source=1, success=11, created-table=11
[INFO] sqlitebiter url: database path: out.sqlite
Output:
$ sqlite3 out.sqlite .schema
CREATE TABLE IF NOT EXISTS '_source_info_' ("source_id" INTEGER NOT NULL, "dir_name" TEXT, "base_name" TEXT NOT NULL, "format_name" TEXT NOT NULL, "dst_table" TEXT NOT NULL, size INTEGER, mtime INTEGER);
CREATE TABLE IF NOT EXISTS 'Comparison_of_firewalls_Wikipedia_html1' (Firewall TEXT, License TEXT, [Cost and usage limits] TEXT, OS TEXT);
CREATE TABLE IF NOT EXISTS 'Comparison_of_firewalls_Wikipedia_html2' (Firewall TEXT, License TEXT, Cost TEXT, OS TEXT);
CREATE TABLE IF NOT EXISTS 'Comparison_of_firewalls_Wikipedia_html3' ([Can Target:] TEXT, [Changing default policy to accept/reject (by issuing a single rule)] TEXT, [IP destination address(es)] TEXT, [IP source address(es)] TEXT, [TCP/UDP destination port(s)] TEXT, [TCP/UDP source port(s)] TEXT, [Ethernet MAC destination address] TEXT, [Ethernet MAC source address] TEXT, [Inbound firewall (ingress)] TEXT, [Outbound firewall (egress)] TEXT);
CREATE TABLE IF NOT EXISTS 'Comparison_of_firewalls_Wikipedia_html4' ([Can:] TEXT, [work at OSI Layer 4 (stateful firewall)] TEXT, [work at OSI Layer 7 (application inspection)] TEXT, [Change TTL? (Transparent to traceroute)] TEXT, [Configure REJECT-with answer] TEXT, [DMZ (de-militarized zone) - allows for single/several hosts not to be firewalled.] TEXT, [Filter according to time of day] TEXT, [Redirect TCP/UDP ports (port forwarding)] TEXT, [Redirect IP addresses (forwarding)] TEXT, [Filter according to User Authorization] TEXT, [Traffic rate-limit / QoS] TEXT, Tarpit TEXT, Log TEXT);
CREATE TABLE IF NOT EXISTS 'Comparison_of_firewalls_Wikipedia_html5' ([Features:] TEXT, "Configuration: GUI_ text or both modes?" TEXT, "Remote Access: Web (HTTP)_ Telnet_ SSH_ RDP_ Serial COM RS232_ ..." TEXT, [Change rules without requiring restart?] TEXT, [Ability to centrally manage all firewalls together] TEXT);
CREATE TABLE IF NOT EXISTS 'Comparison_of_firewalls_Wikipedia_html6' ([Features:] TEXT, [Modularity: supports third-party modules to extend functionality?] TEXT, [IPS : Intrusion prevention system] TEXT, [Open-Source License?] TEXT, [supports IPv6 ?] TEXT, [Class: Home / Professional] TEXT, [Operating Systems on which it runs?] TEXT);
CREATE TABLE IF NOT EXISTS 'Comparison_of_firewalls_Wikipedia_html7' ([Can:] TEXT, "NAT44 (static_ dynamic w/o ports_ PAT)" TEXT, "NAT64_ NPTv6" TEXT, [IDS (Intrusion Detection System)] TEXT, [VPN (Virtual Private Network)] TEXT, [AV  (Anti-Virus)] TEXT, Sniffer TEXT, [Profile selection] TEXT);
CREATE TABLE IF NOT EXISTS 'Comparison_of_firewalls_Wikipedia_html8' ([vteFirewall software] TEXT);
CREATE TABLE IF NOT EXISTS 'Comparison_of_firewalls_Wikipedia_html9' (A TEXT, B TEXT);
CREATE TABLE IF NOT EXISTS 'Comparison_of_firewalls_Wikipedia_html10' (A TEXT, B TEXT);
CREATE TABLE IF NOT EXISTS 'Comparison_of_firewalls_Wikipedia_html11' (A TEXT, B TEXT);

The attributes within the converted SQLite database may include symbols as the above. Symbols within attributes can be replaced by using --replace-symbol option. The following example shows replacing symbols with underscores.

Example:
$ sqlitebiter --replace-symbol _ -q url "https://en.wikipedia.org/wiki/Comparison_of_firewalls"
Output:
$ sqlite3 out.sqlite .schema
CREATE TABLE IF NOT EXISTS '_source_info_' ("source_id" INTEGER NOT NULL, "dir_name" TEXT, "base_name" TEXT NOT NULL, "format_name" TEXT NOT NULL, "dst_table" TEXT NOT NULL, size INTEGER, mtime INTEGER);
CREATE TABLE IF NOT EXISTS 'Comparison_of_firewalls_Wikipedia_html1' (Firewall TEXT, License TEXT, "Cost_and_usage_limits" TEXT, OS TEXT);
CREATE TABLE IF NOT EXISTS 'Comparison_of_firewalls_Wikipedia_html2' (Firewall TEXT, License TEXT, Cost TEXT, OS TEXT);
CREATE TABLE IF NOT EXISTS 'Comparison_of_firewalls_Wikipedia_html3' ("Can_Target" TEXT, "Changing_default_policy_to_accept_reject_by_issuing_a_single_rule" TEXT, "IP_destination_address_es" TEXT, "IP_source_address_es" TEXT, "TCP_UDP_destination_port_s" TEXT, "TCP_UDP_source_port_s" TEXT, "Ethernet_MAC_destination_address" TEXT, "Ethernet_MAC_source_address" TEXT, "Inbound_firewall_ingress" TEXT, "Outbound_firewall_egress" TEXT);
CREATE TABLE IF NOT EXISTS 'Comparison_of_firewalls_Wikipedia_html4' (Can TEXT, "work_at_OSI_Layer_4_stateful_firewall" TEXT, "work_at_OSI_Layer_7_application_inspection" TEXT, "Change_TTL_Transparent_to_traceroute" TEXT, "Configure_REJECT_with_answer" TEXT, "DMZ_de_militarized_zone_allows_for_single_several_hosts_not_to_be_firewalled" TEXT, "Filter_according_to_time_of_day" TEXT, "Redirect_TCP_UDP_ports_port_forwarding" TEXT, "Redirect_IP_addresses_forwarding" TEXT, "Filter_according_to_User_Authorization" TEXT, "Traffic_rate_limit_QoS" TEXT, Tarpit TEXT, Log TEXT);
CREATE TABLE IF NOT EXISTS 'Comparison_of_firewalls_Wikipedia_html5' (Features TEXT, "Configuration_GUI_text_or_both_modes" TEXT, "Remote_Access_Web_HTTP_Telnet_SSH_RDP_Serial_COM_RS232" TEXT, "Change_rules_without_requiring_restart" TEXT, "Ability_to_centrally_manage_all_firewalls_together" TEXT);
CREATE TABLE IF NOT EXISTS 'Comparison_of_firewalls_Wikipedia_html6' (Features TEXT, "Modularity_supports_third_party_modules_to_extend_functionality" TEXT, "IPS _Intrusion_prevention_system" TEXT, "Open_Source_License" TEXT, "supports_IPv6" TEXT, "Class_Home_Professional" TEXT, "Operating_Systems_on_which_it_runs" TEXT);
CREATE TABLE IF NOT EXISTS 'Comparison_of_firewalls_Wikipedia_html7' (Can TEXT, "NAT44_static_dynamic_w_o_ports_PAT" TEXT, "NAT64_NPTv6" TEXT, "IDS_Intrusion_Detection_System" TEXT, "VPN_Virtual_Private_Network" TEXT, "AV_Anti_Virus" TEXT, Sniffer TEXT, "Profile_selection" TEXT);
CREATE TABLE IF NOT EXISTS 'Comparison_of_firewalls_Wikipedia_html8' ("vteFirewall_software" TEXT);
CREATE TABLE IF NOT EXISTS 'Comparison_of_firewalls_Wikipedia_html9' (A TEXT, B TEXT);
CREATE TABLE IF NOT EXISTS 'Comparison_of_firewalls_Wikipedia_html10' (A TEXT, B TEXT);
CREATE TABLE IF NOT EXISTS 'Comparison_of_firewalls_Wikipedia_html11' (A TEXT, B TEXT);

Create SQLite database from stdin

Examples

A data format is a mandatory argument for sqlitebiter stdin subcommand:

Example:
$ echo '[{"hoge": 4, "foo": "abc"}, {"hoge": 12, "foo": "xyz"}]' | sqlitebiter stdin json
[INFO] convert 'stdin' to 'json1' table
[INFO] converted results: source=1, success=1, created-table=1
[INFO] database path: out.sqlite

Command help

Usage: sqlitebiter [OPTIONS] COMMAND [ARGS]...

Options:
  -o, --output-path PATH          Output path of the SQLite database file.
                                  Defaults to 'out.sqlite'.
  -a, --append                    Append table(s) to existing database.
  --add-primary-key PRIMARY_KEY_NAME
                                  Add 'PRIMARY KEY AUTOINCREMENT' column to a
                                  converted table with the specified name.
  --convert-config TEXT           [experimental] Configurations for data
                                  conversion. The option can be used only for
                                  url subcommand.
  -i, --index INDEX_ATTR          Comma separated attribute names to create
                                  indices.
  --no-type-inference             All of the columns assume as TEXT data type
                                  in creating tables.
  --type-hint-header              Use header suffixes as type hints. If there
                                  are type hints, convert columns by datatype
                                  corresponding with type hints. The following
                                  suffixes can be recognized as type hints
                                  (case insensitive): "text": TEXT datatype.
                                  "integer": INTEGER datatype. "real": REAL
                                  datatype.
  --matrix-formatting [header_aligned|trim]
                                  header_aligned: fitting table data to header
                                  columns. trim: fitting table data to minimum
                                  column size. Defaults to header_aligned.
  --replace-symbol TEXT           Replace symbols in attributes.
  -v, --verbose
  --max-workers WORKERS           Specify the maximum number of workers that
                                  the command may use. Defaults to 1.
  --debug                         For debug print.
  -q, --quiet                     Suppress execution log messages.
  -h, --help                      Show this message and exit.

Commands:
  completion  A helper command to setup command completion.
  configure   Configure the following application settings:
  file        Convert tabular data within CSV/Excel/HTML/JSON/Jupyter...
  gs          Convert a spreadsheet in Google Sheets to a SQLite database...
  stdin       Convert tabular data within CSV/HTML/JSON/Jupyter...
  url         Scrape tabular data from a URL and convert data to a SQLite...
  version     Show version information

For more information

More examples are available at https://sqlitebiter.rtfd.io/en/latest/pages/usage/index.html

Installation

Installation: pip (Python package manager)

pip install sqlitebiter

Installation: apt

You can install the package by apt via a Personal Package Archive (PPA):

sudo add-apt-repository ppa:thombashi/ppa
sudo apt update
sudo apt install sqlitebiter

Installation: dpkg (.deb package)

The following commands will download the latest .deb package from the release page, and install it.

curl -sSL https://raw.githubusercontent.com/thombashi/sqlitebiter/master/scripts/installer.sh | sudo bash

Installation: Windows

sqlitebiter can be used in Windows environments without Python installation as follows:

  1. Navigate to https://github.com/thombashi/sqlitebiter/releases
  2. Download the latest version of the sqlitebiter_win_x64.zip
  3. Unzip the file
  4. Execute sqlitebiter.exe in either Command Prompt or PowerShell

Installation: Windows (PowerShell)

The following commands will download the latest execution binary from the release page to the current directory.

wget https://github.com/thombashi/sqlitebiter/raw/master/scripts/get-sqlitebiter.ps1 -OutFile get-sqlitebiter.ps1
Set-ExecutionPolicy Unrestricted -Scope Process -Force; .\get-sqlitebiter.ps1

Installation: brew for macOS

$ brew tap thombashi/sqlitebiter
$ brew install sqlitebiter

Command Completion (bash/zsh)

setup command completion for bash:

    sqlitebiter completion bash >> ~/.bashrc

setup command completion for zsh:

    sqlitebiter completion zsh >> ~/.zshrc

Dependencies

Python 3.7+

Python package dependencies

Google Sheets dependencies (Optional)

Extra Python packages are required to install to use the Google Sheets feature (gs subcommand):

The extra packages can be installed with the following pip command;

$ pip install sqlitebiter[gs]

note: binary packages include these dependencies

Misc dependencies (Optional)

Dependencies other than Python packages (Optional)

  • libxml2 (faster HTML/Markdown conversion)
  • pandoc (required when converting MediaWiki files)

Documentation

https://sqlitebiter.rtfd.io/

Sponsors

Charles Becker (chasbecker) onetime: Arturi0 onetime: Dmitry Belyaev (b4tman)

Become a sponsor

More Repositories

1

tcconfig

A tc command wrapper. Make it easy to set up traffic control of network bandwidth/latency/packet-loss/packet-corruption/etc. to a network-interface/Docker-container(veth).
Python
767
star
2

pytablewriter

pytablewriter is a Python library to write a table in various formats: AsciiDoc / CSV / Elasticsearch / HTML / JavaScript / JSON / LaTeX / LDJSON / LTSV / Markdown / MediaWiki / NumPy / Excel / Pandas / Python / reStructuredText / SQLite / TOML / TSV.
Python
594
star
3

allpairspy

A python library for test combinations generator. The generator allows one to create a set of tests using "pairwise combinations" method, reducing a number of combinations of variables into a lesser set that covers most situations.
Python
251
star
4

pathvalidate

A Python library to sanitize/validate a string such as filenames/file-paths/etc.
Python
207
star
5

SimpleSQLite

SimpleSQLite is a Python library to simplify SQLite database operations: table creation, data insertion and get data as other data formats. Simple ORM functionality for SQLite.
Python
128
star
6

pytablereader

A Python library to load structured table data from files/strings/URL with various data format: CSV / Excel / Google-Sheets / HTML / JSON / LDJSON / LTSV / Markdown / SQLite / TSV.
Python
105
star
7

DateTimeRange

DateTimeRange is a Python library to handle a time range. e.g. check whether a time is within the time range, get the intersection of time ranges, truncate a time range, iterate through a time range, and so forth.
Python
100
star
8

pingparsing

pingparsing is a CLI-tool/Python-library parser and transmitter for ping command ↪️
Python
75
star
9

pytest-md-report

A pytest plugin to generate test outcomes reports with markdown table format.
Python
28
star
10

subprocrunner

A Python wrapper library for subprocess module.
Python
21
star
11

typepy

A Python library for variable type checker/validator/converter at a run time.
Python
16
star
12

DataProperty

A Python library for extract property from data.
Python
15
star
13

humanreadable

humanreadable is a Python library to convert human-readable values to other units.
Python
15
star
14

cleanpy

cleanpy is a CLI tool to remove caches and temporary files related to Python.
Python
11
star
15

sqliteschema

sqliteschema is a Python library to dump table schema of a SQLite database file.
Python
10
star
16

elasticsearch-faker

elasticsearch-faker is a CLI tool to generate fake data for Elasticsearch.
Python
10
star
17

tcolorpy

tcolopy is a Python library to apply true color for terminal text.
Python
10
star
18

excelrd

excelrd is a modified version of xlrd to work for the latest Python versions.
Python
9
star
19

cmakew

cmakew is a CMake wrapper CLI tool.
Python
9
star
20

ghscard

A JavaScript widget to generate interactive GitHub user/repository/organization cards for static web pages (like GitHub pages).
JavaScript
9
star
21

tabledata

tabledata is a Python library to represent tabular data.
Python
7
star
22

thank-you-stars

thank-you-stars is a CLI tool to stars to a PyPI package and its dependencies hosted on GitHub ⭐
Python
7
star
23

mbstrdecoder

Python library for multi-byte character string decoder.
Python
6
star
24

pytest-discord

A pytest plugin to notify test results to a Discord channel.
Python
5
star
25

tblfaker

tblfaker is a Python library to generate fake tabular data.
Python
5
star
26

CriterionSample

📔 Examples of Criterion (https://github.com/Snaipe/Criterion)
C
5
star
27

python-lib-project-template

A project template for a Python library with CI configurations
Python
5
star
28

envinfopy

envinfopy is a Python Library to get execution environment information.
Python
4
star
29

NFStest

NFStest mirror with command help wiki.
Python
4
star
30

readmemaker

A Python utility library to help make a README file from document files.
Python
4
star
31

retryrequests

A Python library that make HTTP requests with exponential back-off retry by using requests package.
Python
4
star
32

gtest-project-template

C++ test project template using Google Test.
C++
4
star
33

appconfigpy

A Python library to create/load an application configuration file.
Python
3
star
34

msgfy

msgfy is a Python library for convert Exception instance to a human-readable error message.
Python
3
star
35

python-cli-project-template

A project template for a Python CLI tool with CI configurations
Python
2
star
36

releasecmd

releasecmd is a release subcommand for setup.py (setuptools.setup). The subcommand create a git tag and push, and upload packages to PyPI.
Python
2
star
37

homebrew-sqlitebiter

Homebrew Formula for sqlitebiter
Ruby
2
star
38

docker-opengrok

Docker image for OpenGrok
Dockerfile
2
star
39

vscode-snippets

User defined snippets for Visual Studio Code
2
star
40

pytablewriter-altrow-theme

pytablewriter-altrow-theme is a pytablewriter plugin to provide a theme that colored rows alternatively.
Python
1
star
41

thutils

Personal python utility library (may delete in the future)
Python
1
star
42

install-gh

Simple one-liner installer of gh (cli/cli) from the release assets.
Shell
1
star
43

Connectathon_README

Connectathon test suite README
HTML
1
star
44

docker-alias

Docker aliases and functions 🐳
Shell
1
star
45

universal-ctags-installer

Build and installation script for Universal Ctags
Shell
1
star
46

df-diskcache

df-diskcache is a Python library for caching pandas.DataFrame objects to local disk.
Python
1
star
47

pathvalidate-cli

pathvalidate-cli is a command line interface for pathvalidate library. The tool can do sanitize/validate a string such as file-names/file-paths.
Python
1
star
48

gh-upgrade

gh-upgrade is a gh extension that updates the gh and its extensions to the latest version.
Shell
1
star
49

PythonExamples

📔 Learning Python libraries by examples.
Jupyter Notebook
1
star
50

dotfiles

♻️ dotfiles
Python
1
star