  • Language
    Go
  • License
    Apache License 2.0
  • Created about 7 years ago
  • Updated almost 2 years ago

Repository Details

Simple nginx logs parser & transporter to ClickHouse database.

nginx-clickhouse

How to build from sources

1. Install helpers

make install-helpers

2. Install dependencies

make dependencies

3. Build binary file

make build

How to build Docker image

To build the image, run the command below. It compiles the binary from source and creates a Docker image. You don't need Go development tools installed; the whole build runs inside Docker.

make docker

How to run

1. Pull the image from Docker Hub (or build it from source)

docker pull mintance/nginx-clickhouse

The image on Docker Hub is always the latest stable version; it is built automatically whenever a release is created.

2. Run Docker container

In this example, we mount the /var/log/nginx directory, where our logs are stored, and the config directory, where the config.yml file is stored.

docker run --rm --net=host --name nginx-clickhouse -v /var/log/nginx:/logs -v config:/config -d mintance/nginx-clickhouse

How it works

This section walks through a complete setup example.

NGINX log format description

nginx includes the ngx_http_log_module module, which writes request logs in a specified format.

Log formats are defined in the /etc/nginx/nginx.conf file. For example, we create a log format named main.

http {
    ...
    log_format main '$remote_addr - $remote_user [$time_local] "$request" $status $bytes_sent "$http_referer" "$http_user_agent"';
    ...
}

Once the format is defined, we can use it in our site config, /etc/nginx/sites-enabled/my-site.conf, inside the server section:

server {
  ...
  access_log /var/log/nginx/my-site-access.log main;
  ...
}
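
To make the link between the log_format string and the data in each log line concrete, here is a minimal Go sketch that turns a format into a regular expression with one named group per variable and applies it to a line. The function names compileLogFormat and parseLine and the sample log line are illustrative assumptions; this is not necessarily how nginx-clickhouse parses lines internally.

```go
package main

import (
	"fmt"
	"regexp"
)

// varPattern matches nginx variables such as $remote_addr.
var varPattern = regexp.MustCompile(`\$([a-z_][a-z0-9_]*)`)

// compileLogFormat (hypothetical helper) converts an nginx log_format string
// into a regular expression: literal text is escaped and every $variable
// becomes a lazy named capture group.
func compileLogFormat(format string) (*regexp.Regexp, error) {
	pattern := "^"
	last := 0
	for _, m := range varPattern.FindAllStringSubmatchIndex(format, -1) {
		pattern += regexp.QuoteMeta(format[last:m[0]])
		pattern += "(?P<" + format[m[2]:m[3]] + ">.*?)"
		last = m[1]
	}
	pattern += regexp.QuoteMeta(format[last:]) + "$"
	return regexp.Compile(pattern)
}

// parseLine returns a map of variable name to value, or nil when the line
// does not match the format.
func parseLine(re *regexp.Regexp, line string) map[string]string {
	match := re.FindStringSubmatch(line)
	if match == nil {
		return nil
	}
	fields := make(map[string]string)
	for i, name := range re.SubexpNames() {
		if name != "" {
			fields[name] = match[i]
		}
	}
	return fields
}

func main() {
	format := `$remote_addr - $remote_user [$time_local] "$request" $status $bytes_sent "$http_referer" "$http_user_agent"`
	re, err := compileLogFormat(format)
	if err != nil {
		panic(err)
	}
	// Sample line invented for illustration.
	line := `203.0.113.7 - - [10/Oct/2023:13:55:36 +0000] "GET /index.html HTTP/1.1" 200 612 "-" "Mozilla/5.0"`
	fields := parseLine(re, line)
	fmt.Println(fields["remote_addr"], fields["status"], fields["bytes_sent"])
}
```

Lazy groups separated by the literal text of the format are enough for simple formats like main; values that themselves contain the separator text would need a stricter per-variable pattern.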

All that remains is to create a config.yml file that describes our log format, the log file path, and the ClickHouse credentials. Environment variables can also be used for this.

ClickHouse table schema example

This is the table schema for our example.

CREATE TABLE metrics.nginx (
    RemoteAddr String,
    RemoteUser String,
    TimeLocal DateTime,
    Date Date DEFAULT toDate(TimeLocal),
    Request String,
    RequestMethod String,
    Status Int32,
    BytesSent Int64,
    HttpReferer String,
    HttpUserAgent String,
    RequestTime Float32,
    UpstreamConnectTime Float32,
    UpstreamHeaderTime Float32,
    UpstreamResponseTime Float32,
    Https FixedString(2),
    ConnectionsWaiting Int64,
    ConnectionsActive Int64
) ENGINE = MergeTree(Date, (Status, Date), 8192)
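
The TimeLocal column is a DateTime, while nginx writes $time_local in Common Log Format (for example 10/Oct/2023:13:55:36 +0000). The sketch below shows one way to convert between the two in Go. The helper name toClickHouseDateTime is hypothetical, and normalizing to UTC is an assumption; the right time zone depends on how the ClickHouse server is configured.

```go
package main

import (
	"fmt"
	"time"
)

const (
	// Go reference layout for nginx's $time_local.
	nginxTimeLocal = "02/Jan/2006:15:04:05 -0700"
	// Text form accepted by ClickHouse for DateTime values.
	clickhouseDateTime = "2006-01-02 15:04:05"
)

// toClickHouseDateTime (hypothetical helper) parses a $time_local value and
// renders it as a DateTime string, normalized to UTC (an assumption).
func toClickHouseDateTime(timeLocal string) (string, error) {
	t, err := time.Parse(nginxTimeLocal, timeLocal)
	if err != nil {
		return "", err
	}
	return t.UTC().Format(clickhouseDateTime), nil
}

func main() {
	out, err := toClickHouseDateTime("10/Oct/2023:13:55:36 +0200")
	if err != nil {
		panic(err)
	}
	fmt.Println(out)
}
```

Because the schema declares Date with DEFAULT toDate(TimeLocal), only TimeLocal needs to be supplied; ClickHouse derives Date on insert.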

Config file description

1. Log path & flushing interval

settings:
  interval: 5 # in seconds
  log_path: /var/log/nginx/my-site-access.log # path to logfile
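
The interval setting implies that parsed lines are collected and written in batches rather than one by one. A minimal Go sketch of that pattern follows, assuming a mutex-protected buffer drained by a ticker; lineBuffer and flushEvery are hypothetical names, and the project's actual implementation may differ.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// lineBuffer collects log lines between flushes and is safe for concurrent use.
type lineBuffer struct {
	mu    sync.Mutex
	lines []string
}

// Add appends one line to the buffer.
func (b *lineBuffer) Add(line string) {
	b.mu.Lock()
	defer b.mu.Unlock()
	b.lines = append(b.lines, line)
}

// Drain returns everything buffered so far and empties the buffer.
func (b *lineBuffer) Drain() []string {
	b.mu.Lock()
	defer b.mu.Unlock()
	out := b.lines
	b.lines = nil
	return out
}

// flushEvery calls flush with the buffered lines once per interval, skipping
// empty batches, until stop is closed.
func flushEvery(interval time.Duration, buf *lineBuffer, flush func([]string), stop <-chan struct{}) {
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	for {
		select {
		case <-ticker.C:
			if batch := buf.Drain(); len(batch) > 0 {
				flush(batch)
			}
		case <-stop:
			return
		}
	}
}

func main() {
	buf := &lineBuffer{}
	stop := make(chan struct{})
	done := make(chan struct{})
	go func() {
		defer close(done)
		// A short interval keeps the demo quick; the config above uses 5 seconds.
		flushEvery(50*time.Millisecond, buf, func(batch []string) {
			fmt.Printf("flushing %d lines\n", len(batch))
		}, stop)
	}()
	buf.Add("line 1")
	buf.Add("line 2")
	time.Sleep(120 * time.Millisecond)
	close(stop)
	<-done
}
```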

2. ClickHouse credentials and table schema

clickhouse:
 db: metrics # Database name
 table: nginx # Table name
 host: localhost # ClickHouse host (cluster support will be added later)
 port: 8123 # ClickHouse HTTP port
 credentials:
  user: default # User name
  password: # User password
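
Port 8123 is ClickHouse's HTTP interface, which accepts an INSERT statement in the query parameter and the rows in the request body. The Go sketch below builds such a request from the settings above without sending it. The type ClickHouseConfig and the helpers encodeRows and newInsertRequest are hypothetical, and the choice of the TabSeparated format and the X-ClickHouse-User / X-ClickHouse-Key headers is an assumption about one reasonable way to talk to ClickHouse, not a description of what nginx-clickhouse does.

```go
package main

import (
	"fmt"
	"net"
	"net/http"
	"net/url"
	"strconv"
	"strings"
)

// ClickHouseConfig mirrors the clickhouse section of config.yml (hypothetical type).
type ClickHouseConfig struct {
	DB, Table, Host string
	Port            int
	User, Password  string
}

// tsvEscaper escapes the characters that are special in TabSeparated data.
var tsvEscaper = strings.NewReplacer("\\", "\\\\", "\t", "\\t", "\n", "\\n")

// encodeRows renders rows as TabSeparated text, one row per line.
func encodeRows(rows [][]string) string {
	var b strings.Builder
	for _, row := range rows {
		for i, v := range row {
			if i > 0 {
				b.WriteByte('\t')
			}
			b.WriteString(tsvEscaper.Replace(v))
		}
		b.WriteByte('\n')
	}
	return b.String()
}

// newInsertRequest builds (but does not send) an HTTP POST that inserts rows.
// Database, table, and column names are assumed to be trusted identifiers.
func newInsertRequest(cfg ClickHouseConfig, columns []string, rows [][]string) (*http.Request, error) {
	query := fmt.Sprintf("INSERT INTO %s.%s (%s) FORMAT TabSeparated",
		cfg.DB, cfg.Table, strings.Join(columns, ", "))
	endpoint := url.URL{
		Scheme:   "http",
		Host:     net.JoinHostPort(cfg.Host, strconv.Itoa(cfg.Port)),
		Path:     "/",
		RawQuery: url.Values{"query": {query}}.Encode(),
	}
	req, err := http.NewRequest(http.MethodPost, endpoint.String(), strings.NewReader(encodeRows(rows)))
	if err != nil {
		return nil, err
	}
	req.Header.Set("X-ClickHouse-User", cfg.User)
	if cfg.Password != "" {
		req.Header.Set("X-ClickHouse-Key", cfg.Password)
	}
	return req, nil
}

func main() {
	cfg := ClickHouseConfig{DB: "metrics", Table: "nginx", Host: "localhost", Port: 8123, User: "default"}
	req, err := newInsertRequest(cfg, []string{"Status", "BytesSent"}, [][]string{{"200", "612"}})
	if err != nil {
		panic(err)
	}
	fmt.Println(req.Method, req.URL.String())
}
```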

The columns block maps each ClickHouse column (key) to the nginx log variable (value) that fills it. In the full config file it is nested under the clickhouse section.

columns:
    RemoteAddr: remote_addr
    RemoteUser: remote_user
    TimeLocal: time_local
    Request: request
    Status: status
    BytesSent: bytes_sent
    HttpReferer: http_referer
    HttpUserAgent: http_user_agent
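
As an illustration of how this mapping can be applied, the Go sketch below takes the columns map and the variables parsed from one log line and produces a column list with matching values. buildRow is a hypothetical helper, and sorting the column names is a design choice made here because Go map iteration order is not stable.

```go
package main

import (
	"fmt"
	"sort"
)

// buildRow (hypothetical helper) resolves each ClickHouse column to the value
// of its log variable. columns maps column name to variable name; fields maps
// variable name to the value parsed from one log line.
func buildRow(columns, fields map[string]string) ([]string, []string, error) {
	names := make([]string, 0, len(columns))
	for name := range columns {
		names = append(names, name)
	}
	sort.Strings(names) // stable column order for every row

	values := make([]string, len(names))
	for i, name := range names {
		value, ok := fields[columns[name]]
		if !ok {
			return nil, nil, fmt.Errorf("log variable %q for column %q not found", columns[name], name)
		}
		values[i] = value
	}
	return names, values, nil
}

func main() {
	columns := map[string]string{
		"RemoteAddr": "remote_addr",
		"Status":     "status",
		"BytesSent":  "bytes_sent",
	}
	// Values invented for illustration.
	fields := map[string]string{
		"remote_addr": "203.0.113.7",
		"status":      "200",
		"bytes_sent":  "612",
	}
	names, values, err := buildRow(columns, fields)
	if err != nil {
		panic(err)
	}
	fmt.Println(names, values)
}
```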

3. NGINX log type & format

In log_format, we simply copy the format string from nginx.conf.

nginx:
  log_type: main
  log_format: $remote_addr - $remote_user [$time_local] "$request" $status $bytes_sent "$http_referer" "$http_user_agent"

4. Full config file example

settings:
    interval: 5
    log_path: /var/log/nginx/my-site-access.log
    seek_from_end: false
clickhouse:
    db: metrics
    table: nginx
    host: localhost
    port: 8123
    credentials:
        user: default
        password:
    columns:
        RemoteAddr: remote_addr
        RemoteUser: remote_user
        TimeLocal: time_local
        Request: request
        Status: status
        BytesSent: bytes_sent
        HttpReferer: http_referer
        HttpUserAgent: http_user_agent
nginx:
    log_type: main
    log_format: $remote_addr - $remote_user [$time_local] "$request" $status $bytes_sent "$http_referer" "$http_user_agent"
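
A column can only be filled if its log variable actually appears in log_format. The Go sketch below is a small sanity check for that, assuming the config has already been loaded into plain Go values; missingVariables is a hypothetical helper and not part of nginx-clickhouse.

```go
package main

import (
	"fmt"
	"regexp"
	"sort"
)

// logVariable matches nginx variables such as $status.
var logVariable = regexp.MustCompile(`\$([a-z_][a-z0-9_]*)`)

// missingVariables (hypothetical helper) returns, sorted and without
// duplicates, the log variables referenced in columns that do not occur in
// logFormat.
func missingVariables(logFormat string, columns map[string]string) []string {
	defined := make(map[string]bool)
	for _, m := range logVariable.FindAllStringSubmatch(logFormat, -1) {
		defined[m[1]] = true
	}
	seen := make(map[string]bool)
	var missing []string
	for _, variable := range columns {
		if !defined[variable] && !seen[variable] {
			seen[variable] = true
			missing = append(missing, variable)
		}
	}
	sort.Strings(missing)
	return missing
}

func main() {
	format := `$remote_addr - $remote_user [$time_local] "$request" $status $bytes_sent "$http_referer" "$http_user_agent"`
	columns := map[string]string{
		"RemoteAddr":  "remote_addr",
		"Status":      "status",
		"RequestTime": "request_time", // not present in the format above
	}
	fmt.Println(missingVariables(format, columns))
}
```

With the example config above, every mapped variable is present in the format. Mapping a schema column such as RequestTime would also require adding $request_time to log_format in both nginx.conf and config.yml.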

Grafana Dashboard

After completing these steps, you can build your own Grafana dashboards.