• This repository has been archived on 26/Jul/2020

Repository Details

High-performance log search engine.

NOTE: This project is under development. Please do not depend on it yet, as things may break.

MinSQL

MinSQL is a log search engine designed with simplicity in mind, to the extent that no SDK is needed to interact with it. Most programming languages and tools have some form of HTTP request capability (e.g. curl), and that is all you need to interact with MinSQL.

To build

docker build . -t minio/minsql
docker run --rm minio/minsql --help

OR

make
./minsql --help

Running the project

An instance of MinIO is needed as the storage engine for MinSQL. To keep things simple, we provide a docker-compose example that runs both MinIO and MinSQL.

To run the project you need to provide the access details for a Meta Bucket, which stores the configuration shared between multiple MinSQL instances. Its location and credentials are configured via environment variables when starting MinSQL.

Binary:
export MINSQL_METABUCKET_NAME=minsql-meta
export MINSQL_METABUCKET_ENDPOINT=http://localhost:9000
export MINSQL_METABUCKET_ACCESS_KEY=minio
export MINSQL_METABUCKET_SECRET_KEY=minio123
export MINSQL_ROOT_ACCESS_KEY=minsqlaccesskeyx
export MINSQL_ROOT_SECRET_KEY=minsqlsecretkeypleasechangexxxx
./minsql

Then go to http://127.0.0.1:9999/ui/ and log in with the MINSQL_ROOT_ACCESS_KEY and MINSQL_ROOT_SECRET_KEY you provided.

Docker

Create the compose file

cat > docker-compose.yml <<EOF
version: '2'

services:
 minio-engine:
  image: minio/minio
  volumes:
   - data:/data
  environment:
   MINIO_ACCESS_KEY: minio
   MINIO_SECRET_KEY: minio123
  command: server /data
 mc:
  image: minio/mc
  depends_on:
   - minio-engine
  entrypoint: >
    /bin/sh -c "
    /usr/bin/mc config host add a http://minio-engine:9000 minio minio123;
    /usr/bin/mc mb a/minsql-meta;
    "
 minsql:
  image: minio/minsql
  depends_on:
   - minio-engine
   - mc
  ports:
   - "9999:9999"
  environment:
   MINSQL_METABUCKET_NAME: minsql-meta
   MINSQL_METABUCKET_ENDPOINT: http://minio-engine:9000
   MINSQL_METABUCKET_ACCESS_KEY: minio
   MINSQL_METABUCKET_SECRET_KEY: minio123
   MINSQL_ROOT_ACCESS_KEY: minsqlaccesskeyx
   MINSQL_ROOT_SECRET_KEY: minsqlsecretkeypleasechangexxxx

volumes:
  data:
EOF
docker-compose up

Environment variables

Environment                     Description
MINSQL_METABUCKET_NAME          Name of the meta bucket
MINSQL_METABUCKET_ENDPOINT      Endpoint URL of the meta bucket, e.g. http://localhost:9000
MINSQL_METABUCKET_ACCESS_KEY    Meta bucket access key
MINSQL_METABUCKET_SECRET_KEY    Meta bucket secret key
MINSQL_PKCS12_CERT              Optional: location of a PKCS12 certificate
MINSQL_PKCS12_PASSWORD          Optional: password to unlock the certificate
MINSQL_ROOT_ACCESS_KEY          Optional: 16-character access key to bootstrap MinSQL
MINSQL_ROOT_SECRET_KEY          Optional: 32-character secret key to bootstrap MinSQL
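
As a rough pre-flight check, the sketch below reports which of the variables not marked Optional above are missing before MinSQL is started. The function name missing_variables is hypothetical and not part of MinSQL; the variable names are taken from the table.

```python
import os

# Variables listed in the table above without an "Optional" marker.
REQUIRED = (
    "MINSQL_METABUCKET_NAME",
    "MINSQL_METABUCKET_ENDPOINT",
    "MINSQL_METABUCKET_ACCESS_KEY",
    "MINSQL_METABUCKET_SECRET_KEY",
)


def missing_variables(env):
    """Return the required variable names that are unset or empty in `env`."""
    return [name for name in REQUIRED if not env.get(name)]


# Check the current process environment and report anything missing.
for name in missing_variables(os.environ):
    print(f"{name} is not set")
```

Running this in the same shell that will launch ./minsql gives a quick indication of whether the exports from the Binary section are in place.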

Configuring

To start storing logs you need to set up a DataStore, a Log, a Token and an Authorization on MinSQL. This can be done using the admin REST APIs.

To get our sample code going we are going to:

  1. add a datastore named minioplay
  2. add a log named mylog
  3. create a token
  4. authorize the token to access our log

Add a sample datastore

Our sample datastore will be pointing to play, a demo instance of MinIO.

curl -X POST \
  http://127.0.0.1:9999/api/datastores \
  -H 'Content-Type: application/json' \
  -d '{
  "bucket" : "play-minsql",
  "endpoint" : "https://play.minio.io:9000",
  "prefix" : "",
  "name" : "minioplay",
  "access_key" : "Q3AM3UQ867SPQQA43P2F",
  "secret_key" : "zuf+tfteSlswRu7BJ86wekitnifILbZam1KYY3TG"
}'

Add a Sample log

We are going to add a log, mylog, that stores its contents on the minioplay datastore.

curl -X POST \
  http://127.0.0.1:9999/api/logs \
  -H 'Content-Type: application/json' \
  -d '{
  "name" : "mylog",
  "datastores" : [
    "minioplay"
  ],
  "commit_window" : "5s"
}'

Create a sample token

We are going to create a token from a hardcoded access key and secret key. The resulting token, abcdefghijklmnopabcdefghijklmnopabcdefghijklmnop, is the access key followed by the secret key.

curl -X POST \
  http://127.0.0.1:9999/api/tokens \
  -H 'Content-Type: application/json' \
  -d '{
  "access_key" : "abcdefghijklmnop",
  "secret_key" : "abcdefghijklmnopabcdefghijklmnop",
  "description" : "test",
  "is_admin" : true,
  "enabled" : true
}'

Authorize token to log

Finally, we are going to authorize our new token to access mylog. The access key of the token is the last segment of the URL.

curl -X POST \
  http://127.0.0.1:9999/api/auth/abcdefghijklmnop \
  -H 'Content-Type: application/json' \
  -d '{
  "log_name" : "mylog",
  "api" : ["search","store"]
}'
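
The four configuration calls above can also be prepared from a script. The following is a minimal sketch using only the Python standard library; the helper names admin_post and setup_requests are hypothetical, the base URL is the local instance assumed throughout this README, and the payloads are passed in by the caller so they mirror whatever you would send with curl. The requests are only built here, not sent.

```python
import json
import urllib.request

BASE_URL = "http://127.0.0.1:9999"  # assumed local MinSQL instance


def admin_post(path, payload):
    """Build (but do not send) a JSON POST request for an admin endpoint."""
    return urllib.request.Request(
        BASE_URL + path,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def setup_requests(datastore, log, token, apis=("search", "store")):
    """Return the datastore, log, token and authorization requests, in order."""
    return [
        admin_post("/api/datastores", datastore),
        admin_post("/api/logs", log),
        admin_post("/api/tokens", token),
        # The authorization endpoint is keyed by the token's access key.
        admin_post(
            "/api/auth/" + token["access_key"],
            {"log_name": log["name"], "api": list(apis)},
        ),
    ]
```

Each returned request can be sent with urllib.request.urlopen(request). Using json.dumps to serialize the payloads also avoids hand-written JSON mistakes such as trailing commas.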

Storing logs

For a log mylog defined in the configuration, we can store logs on MinSQL by performing a PUT to the MinSQL instance, passing the token in the MINSQL-TOKEN header (TOKEN1 below is a placeholder for your token).

curl -X PUT \
  http://127.0.0.1:9999/mylog/store \
  -H 'MINSQL-TOKEN: TOKEN1' \
  -d '10.8.0.1 - - [16/May/2019:23:02:56 +0000] "GET / HTTP/1.1" 400 256 "-" "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:52.0) Gecko/20100101 Firefox/52.0"'

You can send multiple log lines in a single request by separating them with newlines.
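
As an illustration of batching, the sketch below joins several log lines with newlines and builds a single PUT request for them. The function name store_request is hypothetical; the /LOGNAME/store path and the MINSQL-TOKEN header follow the curl example above. The request is built but not sent.

```python
import urllib.request


def store_request(base_url, log_name, token, lines):
    """Build one PUT request that stores all `lines` in the given log."""
    # Strip any trailing newline so lines are separated by exactly one "\n".
    body = "\n".join(line.rstrip("\n") for line in lines).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/{log_name}/store",
        data=body,
        headers={"MINSQL-TOKEN": token},
        method="PUT",
    )


request = store_request(
    "http://127.0.0.1:9999",
    "mylog",
    "TOKEN1",  # placeholder, replace with your token
    ["first log line\n", "second log line"],
)
print(request.get_method(), request.full_url)
```

Sending the request is a matter of calling urllib.request.urlopen(request) against a running instance.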

Querying logs

To get data out of MinSQL you can use SQL. Note that MinSQL is a data layer and not a computation layer, so SQL statements that require computation (SUM, MAX, GROUP BY, JOIN, etc.) are not supported.

All the query statements must be sent via POST to your MinSQL instance.

SELECT

To select all the entries of a particular log, you can perform a simple SELECT statement

SELECT * FROM mylog

And send that to MinSQL via POST

curl -X POST \
  http://127.0.0.1:9999/search \
  -H 'MINSQL-TOKEN: TOKEN1' \
  -d 'SELECT * FROM mylog'

This returns all the raw log lines stored for that log.

67.164.164.165 - - [24/Jul/2017:00:16:46 +0000] "GET /info.php HTTP/1.1" 200 24564 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12_4) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/59.0.3071.115 Safari/537.36"
67.164.164.165 - - [24/Jul/2017:00:16:48 +0000] "GET /favicon.ico HTTP/1.1" 404 209 "http://104.236.9.232/info.php" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12_4) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/59.0.3071.115 Safari/537.36"
24.26.204.22 - - [24/Jul/2017:00:17:16 +0000] "GET /info.php HTTP/1.1" 200 24579 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12_4) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/59.0.3071.115 Safari/537.36"
45.23.126.92 - - [24/Jul/2017:00:16:18 +0000] "GET /info.php HTTP/1.1" 200 24589 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12_4) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/59.0.3071.115 Safari/537.36"
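
The same query can be issued from code. This is a minimal sketch with the Python standard library: search_request and run_search are hypothetical helper names, and run_search assumes the response body is plain text with one log line per row, as in the output shown above.

```python
import urllib.request


def search_request(base_url, token, sql):
    """Build a POST request carrying the SQL statement as the raw body."""
    return urllib.request.Request(
        f"{base_url}/search",
        data=sql.encode("utf-8"),
        headers={"MINSQL-TOKEN": token},
        method="POST",
    )


def run_search(request):
    """Send the request and return the response as a list of lines."""
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8").splitlines()


# Building the request does not contact the server; run_search does.
request = search_request("http://127.0.0.1:9999", "TOKEN1", "SELECT * FROM mylog")
print(request.get_method(), request.full_url)
```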

Select parts of the data

We can get only parts of the data by using any of the supported MinSQL entities, which start with a $ sign.

Positional

We can select from the data by position. For example, to get the first and the fourth columns we can use $1 and $4

SELECT $1, $4 FROM mylog;

To which MinSQL will reply

67.164.164.165 [24/Jul/2017:00:16:46
67.164.164.165 [24/Jul/2017:00:16:48
24.26.204.22 [24/Jul/2017:00:17:16
45.23.126.92 [24/Jul/2017:00:16:18

You can see that the data was selected as is. However, the selected date column is not clean enough, so MinSQL provides other entities to deal with this.
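
To make the positional behaviour concrete, the sketch below reproduces the output above locally by splitting a line on whitespace and picking 1-based columns. This only illustrates the semantics under the assumption that columns are whitespace-delimited; it is not MinSQL's implementation, and select_positions is a hypothetical name.

```python
def select_positions(line, positions):
    """Return the 1-based whitespace-delimited columns of `line`, space-joined."""
    fields = line.split()
    # Positions outside the line are skipped rather than raising an error.
    return " ".join(fields[p - 1] for p in positions if 0 < p <= len(fields))


line = '67.164.164.165 - - [24/Jul/2017:00:16:46 +0000] "GET /info.php HTTP/1.1" 200 24564'
print(select_positions(line, [1, 4]))  # 67.164.164.165 [24/Jul/2017:00:16:46
```

The fourth column keeps its leading bracket and loses the timezone because the split happens on the space inside the bracketed timestamp, which is why the typed entities are usually a better fit for dates.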

By Type

MinSQL provides a nice list of entities that make the extraction of data chunks from your raw data easy thanks to our powerful Schema on Read approach. For example we can select any ip in our data by using the entity $ip and any date using $date.

SELECT $ip, $date FROM mylog

To which MinSQL will reply

67.164.164.165 24/Jul/2017
67.164.164.165 24/Jul/2017
24.26.204.22 24/Jul/2017
45.23.126.92 24/Jul/2017

If your data contains more than one IP address, you can access the subsequent IPs using positional entities.

SELECT $ip, $ip2, $ip3, $date FROM mylog

Please note that if no positional number is specified on an entity, it defaults to the first position. In this case $ip == $ip1.
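
The numbering rule can be sketched as follows. The regular expression is a simple approximation of an IPv4 address and resolve_ip_entity is a hypothetical helper; both are assumptions made for illustration and do not reflect how MinSQL itself extracts entities.

```python
import re

# Rough IPv4 pattern: four dot-separated groups of 1 to 3 digits.
IPV4 = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")


def resolve_ip_entity(entity, line):
    """Resolve $ip, $ip1, $ip2, ... against `line`; None if there is no such match."""
    match = re.fullmatch(r"\$ip(\d*)", entity)
    if match is None:
        raise ValueError(f"not an ip entity: {entity}")
    index = int(match.group(1) or 1)  # a bare $ip means the first position
    found = IPV4.findall(line)
    return found[index - 1] if 1 <= index <= len(found) else None


line = '67.164.164.165 - - [24/Jul/2017:00:16:48 +0000] "GET /favicon.ico HTTP/1.1" 404 209 "http://104.236.9.232/info.php"'
print(resolve_ip_entity("$ip", line))   # 67.164.164.165
print(resolve_ip_entity("$ip2", line))  # 104.236.9.232
```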

Filtering

Using the powerful select engine of MinSQL you can also filter the data so only the relevant information that you need to extract from your logs is returned.

For example, to return only the lines for a single IP address you could filter by $ip

SELECT * FROM mylog WHERE $ip = '67.164.164.165'

To which MinSQL will reply only with the matched lines

67.164.164.165 - - [24/Jul/2017:00:16:46 +0000] "GET /info.php HTTP/1.1" 200 24564 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12_4) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/59.0.3071.115 Safari/537.36"
67.164.164.165 - - [24/Jul/2017:00:16:48 +0000] "GET /favicon.ico HTTP/1.1" 404 209 "http://104.236.9.232/info.php" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12_4) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/59.0.3071.115 Safari/537.36"

By value

You can select log lines that contain a value by using the LIKE operator, or IS NOT NULL, on any entity.

SELECT * FROM mylog WHERE $line LIKE 'Intel' AND $email IS NOT NULL

This query would return all the log lines containing the word Intel that also contain an email address.
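
Read as a predicate over each line, the query above behaves roughly like the sketch below. It treats LIKE 'Intel' as a substring test, as the description suggests, and uses a simplified email pattern; both choices and the name matches_query are assumptions for illustration only.

```python
import re

# Simplified email pattern, good enough for illustration.
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def matches_query(line, needle="Intel"):
    """True when `line` contains `needle` and at least one email address."""
    return needle in line and EMAIL.search(line) is not None


lines = [
    "Intel Mac OS X request from bob@example.com",
    "Intel Mac OS X request without a contact",
    "request from alice@example.com on Linux",
]
print([line for line in lines if matches_query(line)])
```

Only the first sample line satisfies both conditions, so it is the only one kept.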

Entities

A list of supported entities by MinSQL :

  • $line: Represents the whole log line
  • $ip: Any IPv4 address
  • $date: Any format of date containing day, month and year
  • $email: Any email address
  • $quoted: any text that is within single quotes (') or double quotes (")
  • $url: any url starting with http
  • $phone: any valid 10 digit phone.
  • $user_agent: A quoted user agent found in the logs
    • $user_agent.name: Browser name
    • $user_agent.category: type of machine (pc, mac)
    • $user_agent.os: Operating system name
    • $user_agent.os_version: Operating system version
    • $user_agent.browser_type: Type of browser
    • $user_agent.version: version of browser
    • $user_agent.vendor: browser vendor
