trdsql
CLI tool that can execute SQL queries on CSV, LTSV, JSON and TBLN.
It is a tool like q, textql and others.
The difference from these tools is that the syntax of PostgreSQL or MySQL can be used.
Please refer to godoc and _example for usage as a library.
-
- 1.1. go get
- 1.1.1. Requirements
- 1.2. Download binary
- 1.3. Homebrew
- 1.4. MacPorts
- 1.5. FreeBSD
- 1.1. go get
-
- 2.1. Docker pull
- 2.2. image build
- 2.3. Docker Run
-
- 3.1. global options
- 3.2. Input formats
- 3.2.1. Input options
- 3.3. Output formats
- 3.3.1. Output options
- 3.4. Handling of NULL
-
- 4.1. STDIN input
- 4.2. Multiple files
- 4.3. Compressed files
- 4.4. Output file
- 4.5. Output compression
- 4.6. Guess by output file name
- 4.7. Columns is not constant
- 4.8. TSV (Tab Separated Value)
- 4.9. LTSV (Labeled Tab-separated Values)
- 4.10. JSON
- 4.10.1. jq expression
- 4.11. JSONL
- 4.12. TBLN
- 4.13. WIDTH
- 4.14. Raw output
- 4.15. ASCII Table & MarkDown output
- 4.16. Vertical format output
- 4.17. SQL function
- 4.18. JOIN
- 4.19. PostgreSQL
- 4.19.1. Function
- 4.19.2. Join table and CSV file is possible
- 4.20. MySQL
- 4.21. Analyze
- 4.22. configuration
INSTALL
1.go get
1.1.$ go get -d github.com/noborus/trdsql
$ cd $GOPATH/src/github.com/noborus/trdsql
$ make
$ make install
Requirements
1.1.1.go 1.18 or higher.
Download binary
1.2.Download binary from the releases page(Linux/Windows/macOS).
Homebrew
1.3.brew install noborus/tap/trdsql
MacPorts
1.4.sudo port selfupdate
sudo port install trdsql
FreeBSD
1.5.pkg install trdsql
Docker
2.Docker pull
2.1.Pull the latest image from the Docker hub.
docker pull noborus/trdsql
image build
2.2.Or build it yourself.
docker build -t trdsql .
Docker Run
2.3.Docker run.
docker run --rm -it -v $(pwd):/tmp trdsql [options and commands]
Usage
3.trdsql [options] SQL
global options
3.1.-a
filename analyze the file and suggest SQL.-A
filename analyze the file but only suggest SQL.-config
filename configuration file location.-db
db name specify db name of the setting.-dblist
display db list of configure.-driver
string database driver. [ mysql | postgres | sqlite3 ]-dsn
string database driver specific data source name.-debug
debug print.-help
display usage information.-version
display version information.-q
filename read query from the specified file.
Input formats
3.2.-ig
guess format from extension. (default)-icsv
CSV format for input.-ijson
JSON format for input.-iltsv
LTSV format for input.-itbln
TBLN format for input.-iwidth
width specification format for input.
Input options
3.2.1.-
-ih
the first line is interpreted as column names(CSV only). -
-id
character field delimiter for input. (default ",")(CSV only) -
-ijq
string jq expression string for input(JSON/JSONL only). -
-ilr
int limited number of rows to read. -
-inull
string value(string) to convert to null on input. -
-ir
int number of rows to preread. (default 1) -
-is
int skip header row.
Output formats
3.3.-ocsv
CSV format for output. (default)-ojson
JSON format for output.-ojsonl
JSONL(JSON Lines) format for output.-oltsv
LTSV format for output.-oat
ASCII Table format for output.-omd
Markdown format for output.-oraw
Raw format for output.-ovf
Vertical format for output.-otbln
TBLN format for output.
Or, guess the output format by file name.
Output options
3.3.1.-out
filename output file name.-out-without-guess
output without guessing (when using -out).-oh
output column name as header.-od
character field delimiter for output. (default ",")(CSV and RAW only).-oq
character quote character for output. (default """)(CSV only).-oaq
enclose all fields in quotes for output(CSV only).-ocrlf
use CRLF for output. End each output line with '\r\n' instead of '\n'.")(CSV only).-onowrap
do not wrap long columns(AT and MD only).-onull
value(string) to convert from null on output.
Handling of NULL
3.4.NULL is undecided in many text formats.
JSON null
is considered the same as SQL NULL
.
For formats other than JSON, you must specify a string that is considered NULL.
In most cases you will need to specify an empty string ("").
If -inull ""
is specified, an empty string will be treated as SQL NULL.
SQL NULL is an empty string by default. Specify the -onull "(NULL)" option if you want a different string.
$ echo "1,,v" | trdsql -inull "" -onull "(NULL)" "SELECT * FROM -"
1,(NULL),v
In the case of JSON, null is NULL as it is, and the specified string is converted to NULL.
$ echo '[1,null,""]' | trdsql -inull "" -ojson -ijson "SELECT * FROM -"
[
{
"c1": "1"
},
{
"c1": null
},
{
"c1": null
}
]
Example
4.test.csv file.
1,Orange
2,Melon
3,Apple
Please write a file name like a table name.
trdsql "SELECT * FROM test.csv"
-q filename
can execute SQL from file
trdsql -q test.sql
STDIN input
4.1."-
" or "stdin
" is received from standard input instead of file name.
cat test.csv | trdsql "SELECT * FROM -"
or
cat test.csv | trdsql "SELECT * FROM stdin"
Multiple files
4.2.Multiple matched files can be executed as one table.
$ trdsql -ih "SELECT * FROM tt*.csv"
1,test1
2,test2
3,test3
Note: It is not possible to mix different formats (ex: CSV and LTSV).
Compressed files
4.3.If the file is compressed with gzip, bz2, zstd, lz4, xz, it will be automatically uncompressed.
trdsql "SELECT * FROM testdata/test.csv.gz"
trdsql "SELECT * FROM testdata/test.csv.zst"
It is possible to mix uncompressed and compressed files using wildcards.
trdsql "SELECT * FROM testdata/test.csv*"
Output file
4.4.-out filename
option to output the file to a file.
trdsql -out result.csv "SELECT * FROM testdata/test.csv ORDER BY c1"
Output compression
4.5.-oz compression type
to compress and output.
trdsql -oz gz "SELECT * FROM testdata/test.csv ORDER BY c1" > result.csv.gz
Guess by output file name
4.6.The filename of -out filename
option determines
the output format(csv, ltsv, json, tbln, raw, md, at, vf, jsonl)
and compression format(gzip, bz2, zstd,lz4, xz) by guess.
Guess by extension output format + output compression (eg .csv.gz, .ltsv.lz4, .md.zst ...).
The following is an LTSV file compressed in zstd.
trdsql -out result.ltsv.zst "SELECT * FROM testdata/test.csv"
Columns is not constant
4.7.If the number of columns is not a constant, read and decide multiple rows.
$ trdsql -ir 3 -iltsv "SELECT * FROM test_indefinite.ltsv"
1,Orange,50,,
2,Melon,500,ibaraki,
3,Apple,100,aomori,red
TSV (Tab Separated Value)
4.8.-id "\\t"
is input from TSV (Tab Separated Value)
1 Orange
2 Melon
3 Apple
trdsql -id "\t" "SELECT * FROM test-tab.csv"
-od "\\t"
is TSV (Tab Separated Value) output.
$ trdsql -od "\t" "SELECT * FROM test.csv"
1 Orange
2 Melon
3 Apple
LTSV (Labeled Tab-separated Values)
4.9.-iltsv
is input from LTSV(Labeled Tab-separated Values).
sample.ltsv
id:1 name:Orange price:50
id:2 name:Melon price:500
id:3 name:Apple price:100
trdsql -iltsv "SELECT * FROM sample.ltsv"
1,Orange,50
2,Melon,500
3,Apple,100
Note: Only the columns in the first row are targeted.
-oltsv
is LTSV(Labeled Tab-separated Values) output.
$ trdsql -iltsv -oltsv "SELECT * FROM sample.ltsv"
id:1 name:Orange price:50
id:2 name:Melon price:500
id:3 name:Apple price:100
JSON
4.10.-ijson
is input from JSON.
sample.json
[
{
"id": "1",
"name": "Orange",
"price": "50"
},
{
"id": "2",
"name": "Melon",
"price": "500"
},
{
"id": "3",
"name": "Apple",
"price": "100"
}
]
$ trdsql -ijson "SELECT * FROM sample.json"
1,Orange,50
2,Melon,500
3,Apple,100
JSON can contain structured types, but trdsql is stored as it is as JSON string.
sample2.json
[
{
"id": 1,
"name": "Drolet",
"attribute": { "country": "Maldives", "color": "burlywood" }
},
{
"id": 2,
"name": "Shelly",
"attribute": { "country": "Yemen", "color": "plum" }
},
{
"id": 3,
"name": "Tuck",
"attribute": { "country": "Mayotte", "color": "antiquewhite" }
}
]
$ trdsql -ijson "SELECT * FROM sample2.json"
1,Drolet,"{""color"":""burlywood"",""country"":""Maldives""}"
2,Shelly,"{""color"":""plum"",""country"":""Yemen""}"
3,Tuck,"{""color"":""antiquewhite"",""country"":""Mayotte""}"
Please use SQL function.
- SQLite3 - JSON Functions And Operators
- PostgreSQL - JSON Functions and Operators
- MySQL - Functions That Search JSON Values
$ trdsql -ijson "SELECT id, name, JSON_EXTRACT(attribute,'$.country'), JSON_EXTRACT(attribute,'$.color') FROM sample2.json"
1,Drolet,Maldives,burlywood
2,Shelly,Yemen,plum
3,Tuck,Mayotte,antiquewhite
jq expression
4.10.1.If json has a hierarchy, you can filter by jq expression.
The jq expression is implemented using gojq.
menu.json
{
"menu": {
"id": "file",
"value": "File",
"popup": {
"menuitem": [
{
"value": "New",
"onclick": "CreateDoc()"
},
{
"value": "Open",
"onclick": "OpenDoc()"
},
{
"value": "Save",
"onclick": "SaveDoc()"
}
]
}
}
}
You can write a jq expression by connecting :: after the json file name. Enclose the jq expression in double quotes if needed.
trdsql -oat 'SELECT value, onclick FROM menu.json::".menu.popup.menuitem"'
Or specify with the -ijq
option.
$ trdsql -oat -ijq ".menu.popup.menuitem" "SELECT * FROM menu.json"
+-------+-------------+
| value | onclick |
+-------+-------------+
| New | CreateDoc() |
| Open | OpenDoc() |
| Save | SaveDoc() |
+-------+-------------+
Example to use instead of gojq.
$ echo '{"foo": 128}' | trdsql -ijson "SELECT * FROM -::'.foo'"
128
$ echo '{"a": {"b": 42}}' | trdsql -ijson "SELECT * FROM -::'.a.b'"
42
$ echo '{"id": "sample", "10": {"b": 42}}' | trdsql -ijson "SELECT * FROM -::'{(.id): .[\"10\"].b}'"
42
$ echo '[{"id":1},{"id":2},{"id":3}]' | trdsql -ijson "SELECT * FROM -::'.[] | .id'"
1
2
3
$ echo '{"a":1,"b":2}' | trdsql -ijson "SELECT * FROM -::'.a += 1 | .b *= 2'"
4,2
$ echo '{"a":1} [2] 3' | trdsql -ijson "SELECT * FROM -::'. as {\$a} ?// [\$a] ?// \$a | \$a'"
1
2
3
JSONL
4.11.Another json format. JSONL(JSON Lines).
sample2.json
{"id": "1","name": "Orange","price": "50"}
{"id": "2","name": "Melon","price": "500"}
{"id": "3","name": "Apple","price": "100"}
-ojson
is JSON Output.
$ trdsql -ojson "SELECT * FROM test.csv"
[
{
"c1": "1",
"c2": "Orange"
},
{
"c1": "2",
"c2": "Melon"
},
{
"c1": "3",
"c2": "Apple"
}
]
To output in JSONL, specify -ojsonl
.
$ trdsql -ojsonl "SELECT * FROM test.csv"
{"c1":"1","c2":"Orange"}
{"c1":"2","c2":"Melon"}
{"c1":"3","c2":"Apple"}
TBLN
4.12.-itbln
is input from TBLN.
sample.tbln
; name: | id | name |
; type: | int | text |
| 1 | Bob |
| 2 | Alice |
$ trdsql -itbln "SELECT * FROM sample.tbln"
1,Bob
2,Alice
TBLN file reflects extras name and type.
-otbln
is TBLN Output.
$ trdsql -otbln "SELECT c1::int as id, c2::text as name FROM test.csv"
; created_at: 2019-03-22T13:20:31+09:00
; name: | id | name |
; type: | int | text |
| 1 | Orange |
| 2 | Melon |
| 3 | Apple |
TBLN can contain column names and type definitions. Please refer to https://tbln.dev/ for details of TBLN.
WIDTH
4.13.-iwidth
inputs the format specifying the width.
This is used when the header column width represents the body column width.
$ ps | trdsql -oh -iwidth "SELECT * FROM -"
PID,TTY,TIME,CMD
302965,pts/3,00:00:12,zsh
733211,pts/3,00:00:00,ps
733212,pts/3,00:00:00,tee
733213,pts/3,00:00:00,guesswidth
-id " "
for CSV
also works in many cases.
But -id " "
does not recognize spaces in columns very well.
-iwidth
recognizes column widths and space separators.
Raw output
4.14.-oraw
is Raw Output.
It is used when "escape processing is unnecessary" in CSV output.
(For example, when outputting JSON in the database).
$ trdsql -oraw "SELECT row_to_json(t,TRUE) FROM test.csv AS t"
{"c1":"1",
"c2":"Orange"}
{"c1":"2",
"c2":"Melon"}
{"c1":"3",
"c2":"Apple"}
Multiple delimiter characters can be used for raw.
$ trdsql -oraw -od "\t|\t" -db pdb "SELECT * FROM test.csv"
1 | Orange
2 | Melon
3 | Apple
ASCII Table & MarkDown output
4.15.-oat
is ASCII table output.
$ trdsql -oat "SELECT * FROM test.csv"
+----+--------+
| C1 | C2 |
+----+--------+
| 1 | Orange |
| 2 | Melon |
| 3 | Apple |
+----+--------+
-omd
is Markdown output.
$ trdsql -omd "SELECT * FROM test.csv"
| C1 | C2 |
|----|--------|
| 1 | Orange |
| 2 | Melon |
| 3 | Apple |
The -onowrap
option does not wrap long columns in at
or md
output.
Vertical format output
4.16.-ovf is Vertical format output("column name | value" vertically).
$ trdsql -ovf "SELECT * FROM test.csv"
---[ 1]--------------------------------------------------------
c1 | 1
c2 | Orange
---[ 2]--------------------------------------------------------
c1 | 2
c2 | Melon
---[ 3]--------------------------------------------------------
c1 | 3
c2 | Apple
SQL function
4.17.$ trdsql "SELECT count(*) FROM test.csv"
3
The default column names are c1, c2,...
$ trdsql "SELECT c2,c1 FROM test.csv"
Orange,1
Melon,2
Apple,3
Note: the available functions and their syntax depend on the driver you have chosen (mysql or postgres or sqlite). The default one is sqlite.
JOIN
4.18.The SQL JOIN can be used.
user.csv
1,userA
2,userB
hist.csv
1,2017-7-10
2,2017-7-10
2,2017-7-11
$ trdsql "SELECT u.c1,u.c2,h.c2 FROM user.csv as u LEFT JOIN hist.csv as h ON(u.c1=h.c1)"
1,userA,2017-7-10
2,userB,2017-7-10
2,userB,2017-7-11
PostgreSQL
4.19.When using PostgreSQL, specify postgres for driver and driver-specific data source name for dsn.
trdsql -driver postgres -dsn "dbname=test" "SELECT count(*) FROM test.csv "
Function
4.19.1.The PostgreSQL driver can use the window function.
$ trdsql -driver postgres -dsn "dbname=test" "SELECT row_number() OVER (ORDER BY c2),c1,c2 FROM test.csv"
1,3,Apple
2,2,Melon
3,1,Orange
For example, the generate_series function can be used.
$ trdsql -driver postgres -dsn "dbname=test" "SELECT generate_series(1,3);"
1
2
3
Join table and CSV file is possible
4.19.2.Test database has a colors table.
$ psql test -c "SELECT * FROM colors"
id | name
----+--------
1 | orange
2 | green
3 | red
(3 rows)
Join table and CSV file.
$ trdsql -driver postgres -dsn "dbname=test" "SELECT t.c1,t.c2,c.name FROM test.csv AS t LEFT JOIN colors AS c ON (t.c1::int = c.id)"
1,Orange,orange
2,Melon,green
3,Apple,red
To create a table from a file, use "CREATE TABLE ... AS SELECT...".
trdsql -driver postgres -dns "dbname=test" "CREATE TABLE fruits (id, name) AS SELECT c1::int, c2 FROM fruits.csv "
$ psql -c "SELECT * FROM fruits;"
id | name
----+--------
1 | Orange
2 | Melon
3 | Apple
(3 rows)
MySQL
4.20.When using MySQL, specify mysql for driver and connection information for dsn.
$ trdsql -driver mysql -dsn "user:password@/test" "SELECT GROUP_CONCAT(c2 ORDER BY c2 DESC) FROM testdata/test.csv"
"g,d,a"
$ trdsql -driver mysql -dsn "user:password@/test" "SELECT c1, SHA2(c2,224) FROM test.csv"
1,a063876767f00792bac16d0dac57457fc88863709361a1bb33f13dfb
2,2e7906d37e9523efeefb6fd2bc3be6b3f2991678427bedc296f9ddb6
3,d0b8d1d417a45c7c58202f55cbb617865f1ef72c606f9bce54322802
MySQL can join tables and CSV files as well as PostgreSQL.
Analyze
4.21.The -a filename option parses the file and outputs table information and SQL examples.
$ trdsql -a testdata/test.ltsv
The table name is testdata/header.csv.
The file type is CSV.
Data types:
+-------------+------+
| column name | type |
+-------------+------+
| id | text |
| \`name\` | text |
+-------------+------+
Data samples:
+----+----------+
| id | \`name\` |
+----+----------+
| 1 | Orange |
+----+----------+
Examples:
trdsql -db sdb -ih "SELECT id, \`name\` FROM testdata/header.csv"
trdsql -db sdb -ih "SELECT id, \`name\` FROM testdata/header.csv WHERE id = '1'"
trdsql -db sdb -ih "SELECT id, count(id) FROM testdata/header.csv GROUP BY id"
trdsql -db sdb -ih "SELECT id, \`name\` FROM testdata/header.csv ORDER BY id LIMIT 10"
Other options(-id,-ih,-ir,-is,icsv,iltsv,-ijson,-itbln...) are available.
trdsql -ih -a testdata/header.csv
Similarly, with -A filename option, only Examples (SQL) is output.
$ trdsql -ih -A testdata/header.csv
trdsql -ih "SELECT id, \`name\` FROM testdata/header.csv"
trdsql -ih "SELECT id, \`name\` FROM testdata/header.csv WHERE id = '1'"
trdsql -ih "SELECT id, count(id) FROM testdata/header.csv GROUP BY id"
trdsql -ih "SELECT id, \`name\` FROM testdata/header.csv ORDER BY id LIMIT 10"
configuration
4.22.You can specify driver and dsn in the configuration file.
Unix like.
${HOME}/.config/trdsql/config.json
Windows (ex).
C:\Users\{"User"}\AppData\Roaming\trdsql\config.json
Or use the -config file option.
$ trdsql -config config.json "SELECT * FROM test.csv"
sample: config.json
{
"db": "pdb",
"database": {
"sdb": {
"driver": "sqlite3",
"dsn": ""
},
"pdb": {
"driver": "postgres",
"dsn": "user=test dbname=test"
},
"mdb": {
"driver": "mysql",
"dsn": "user:password@/dbname"
}
}
}
The default database is an entry of "db".
If you put the setting in you can specify the name with -db.
$ trdsql -debug -db pdb "SELECT * FROM test.csv"
2017/07/18 02:27:47 driver: postgres, dsn: user=test dbname=test
2017/07/18 02:27:47 CREATE TEMPORARY TABLE "test.csv" ( c1 text,c2 text );
2017/07/18 02:27:47 INSERT INTO "test.csv" (c1,c2) VALUES ($1,$2);
2017/07/18 02:27:47 SELECT * FROM "test.csv"
1,Orange
2,Melon
3,Apple
Library
5.Example of use as a library.
package main
import (
"log"
"github.com/noborus/trdsql"
)
func main() {
trd := trdsql.NewTRDSQL(
trdsql.NewImporter(trdsql.InDelimiter(":")),
trdsql.NewExporter(trdsql.NewWriter()),
)
err := trd.Exec("SELECT c1 FROM /etc/passwd")
if err != nil {
log.Fatal(err)
}
}
Please refer to godoc and _example for usage as a library.
See also psutilsql, which uses trdsql as a library.
License
6.MIT
Please check each license of SQL driver.