• Stars
    star
    283
  • Rank 146,031 (Top 3 %)
  • Language
    Go
  • License
    Apache License 2.0
  • Created about 7 years ago
  • Updated almost 2 years ago

Reviews

There are no reviews yet. Be the first to send feedback to the community and the maintainers!

Repository Details

MySQL random data loader

Random data generator for MySQL

Build Status

Many times in my job I need to generate random data for a specific table in order to reproduce an issue.
After writing many random generators for every table, I decided to write a random data generator, able to get the table structure and generate random data for it.
Plase take into consideration that this is the first version and it doesn't support all field types yet!

NOTICE
This is an early stage project.

Supported fields:

Field type Generated values
tinyint 0 ~ 0xFF
smallint 0 ~ 0XFFFF
mediumint 0 ~ 0xFFFFFF
int - integer 0 ~ 0xFFFFFFFF
bigint 0 ~ 0xFFFFFFFFFFFFFFFF
float 0 ~ 1e8
decimal(m,n) 0 ~ 10^(m-n)
double 0 ~ 1000
char(n) up to n random chars
varchar(n) up to n random chars
date NOW() - 1 year ~ NOW()
datetime NOW() - 1 year ~ NOW()
timestamp NOW() - 1 year ~ NOW()
time 00:00:00 ~ 23:59:59
year Current year - 1 ~ current year
tinyblob up to 100 chars random paragraph
tinytext up to 100 chars random paragraph
blob up to 100 chars random paragraph
text up to 100 chars random paragraph
mediumblob up to 100 chars random paragraph
mediumtext up to 100 chars random paragraph
longblob up to 100 chars random paragraph
longtext up to 100 chars random paragraph
varbinary up to 100 chars random paragraph
enum A random item from the valid items list
set A random item from the valid items list

How strings are generated

  • If field size < 10 the program generates a random "first name"
  • If the field size > 10 and < 30 the program generates a random "full name"
  • If the field size > 30 the program generates a "lorem ipsum" paragraph having up to 100 chars.

The program can detect if a field accepts NULLs and if it does, it will generate NULLs ramdomly (~ 10 % of the values).

Usage

mysql_random_data_load <database> <table> <number of rows> [options...]

Options

Option Description
--bulk-size Number of rows per INSERT statement (Default: 1000)
--debug Show some debug information
--fk-samples-factor Percentage used to get random samples for foreign keys fields. Default 0.3
--host Host name/ip
--max-fk-samples Maximum number of samples for fields having foreign keys constarints. Default: 100
--max-retries Maximum number of rows to retry in case of errors. See duplicated keys. Deafult: 100
--no-progressbar Skip showing the progress bar. Default: false
--password Password
--port Port number
--Print Print queries to the standard output instead of inserting them into the db
--user Username
--version Show version and exit

Foreign keys support

If a field has Foreign Keys constraints, random-data-load will get up to --max-fk-samples random samples from the referenced tables in order to insert valid values for the field.
The number of samples to get follows this rules:
1. Get the aproximate number of rows in the referenced table using the rows field in:

EXPLAIN SELECT COUNT(*) FROM <referenced schema>.<referenced table>

1.1 If the number of rows is less than max-fk-samples, all rows are retrieved from the referenced table using this query:

SELECT <referenced field> FROM <referenced schema>.<referenced table>

1.2 If the number of rows is greater than max-fk-samples, samples are retrieved from the referenced table using this query:

SELECT <referenced field> FROM <referenced schema>.<referenced table> WHERE RAND() <= <fk-samples-factor> LIMIT <max-fk-samples>

Example

CREATE DATABASE IF NOT EXISTS test;

CREATE TABLE `test`.`t3` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `tcol01` tinyint(4) DEFAULT NULL,
  `tcol02` smallint(6) DEFAULT NULL,
  `tcol03` mediumint(9) DEFAULT NULL,
  `tcol04` int(11) DEFAULT NULL,
  `tcol05` bigint(20) DEFAULT NULL,
  `tcol06` float DEFAULT NULL,
  `tcol07` double DEFAULT NULL,
  `tcol08` decimal(10,2) DEFAULT NULL,
  `tcol09` date DEFAULT NULL,
  `tcol10` datetime DEFAULT NULL,
  `tcol11` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
  `tcol12` time DEFAULT NULL,
  `tcol13` year(4) DEFAULT NULL,
  `tcol14` varchar(100) DEFAULT NULL,
  `tcol15` char(2) DEFAULT NULL,
  `tcol16` blob,
  `tcol17` text,
  `tcol18` mediumtext,
  `tcol19` mediumblob,
  `tcol20` longblob,
  `tcol21` longtext,
  `tcol22` mediumtext,
  `tcol23` varchar(3) DEFAULT NULL,
  `tcol24` varbinary(10) DEFAULT NULL,
  `tcol25` enum('a','b','c') DEFAULT NULL,
  `tcol26` set('red','green','blue') DEFAULT NULL,
  `tcol27` float(5,3) DEFAULT NULL,
  `tcol28` double(4,2) DEFAULT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB;

To generate 100K random rows, just run:

mysql_random_data_load test t3 100000 --user=root --password=root
mysql> select * from t3 limit 1\G
*************************** 1. row ***************************
    id: 1
tcol01: 10
tcol02: 173
tcol03: 1700
tcol04: 13498
tcol05: 33239373
tcol06: 44846.4
tcol07: 5300.23
tcol08: 11360967.75
tcol09: 2017-09-04
tcol10: 2016-11-02 23:11:25
tcol11: 2017-03-03 08:11:40
tcol12: 03:19:39
tcol13: 2017
tcol14: repellat maxime nostrum provident maiores ut quo voluptas.
tcol15: Th
tcol16: Walter
tcol17: quo repellat accusamus quidem odi
tcol18: esse laboriosam nobis libero aut dolores e
tcol19: Carlos Willia
tcol20: et nostrum iusto ipsa sunt recusa
tcol21: a accusantium laboriosam voluptas facilis.
tcol22: laudantium quo unde molestiae consequatur magnam.
tcol23: Pet
tcol24: Richard
tcol25: c
tcol26: green
tcol27: 47.430
tcol28: 6.12
1 row in set (0.00 sec)

How to download the precompiled binaries

There are binaries available for each version for Linux and Darwin. You can find compiled binaries for each version in the releases tab:

https://github.com/Percona-Lab/mysql_random_data_load/releases

To do

  • Add suport for all data types.
  • Add supporrt for foreign keys.
  • Support config files to override default values/ranges.
  • Support custom functions via LUA plugins.

Version history

0.1.10

  • Fixed argument validations
  • Fixed ~/.my.cnf loading

0.1.10

  • Fixed connection parameters for MySQL 5.7 (set driver's AllowNativePasswords: true)

0.1.9

  • Added support for bunary and varbinary columns
  • By default, read connection params from ${HOME}/.my.cnf

0.1.8

  • Fixed error for triggers created with MySQL 5.6
  • Added Travis-CI
  • Code clean up

0.1.7

  • Support for MySQL 8.0
  • Added --print parameter
  • Added --version parameter
  • Removed qps parameter

0.1.6

  • Improved generation speed (up to 50% faster)
  • Improved support for TokuDB (Thanks Agustin Gallego)
  • Code refactored
  • Improved debug logging
  • Added Query Per Seconds support (experimental)

0.1.5

  • Fixed handling of NULL collation for index parser

0.1.4

  • Fixed handling of time columns
  • Improved support of GENERATED columns

0.1.3

  • Fixed handling of nulls

0.1.2

  • New table parser able to retrieve all the information for fields, indexes and foreign keys constraints.
  • Support for foreign keys constraints
  • Added some tests

0.1.1

  • Fixed random data generation

0.1.0

  • Initial version

More Repositories

1

tpcc-mysql

C
455
star
2

sysbench-tpcc

Sysbench scripts to generate a tpcc-like workload for MySQL and PostgreSQL
Lua
292
star
3

mongodb_consistent_backup

A tool for performing consistent backups of MongoDB Clusters or Replica Sets
Python
273
star
4

PromHouse

PromHouse is a long-term remote storage with built-in clustering and downsampling for Prometheus 2.x on top of ClickHouse.
Go
259
star
5

clickhousedb_fdw

PostgreSQL's Foreign Data Wrapper For ClickHouse
C
202
star
6

query-playback

Query Playback
C++
95
star
7

ontime-airline-performance

Shell
95
star
8

pacemaker-replication-agents

Repository of the Percona Pacemaker resource agents
Shell
77
star
9

percona-openshift

Set of scripts to run Percona software in OpenShift / Kubernetes
Shell
54
star
10

proxysql-docker

Shell
44
star
11

pxc-docker

All docker related code for PXC
Shell
39
star
12

coredumper

Google coredumper library
C
38
star
13

tuned-percona-mongodb

A performance-focused tuned profile for MongoDB on CentOS/Redhat Linux
Makefile
37
star
14

pxc-proxysql-k8s

Shell
32
star
15

benchmark-results

23
star
16

jenkins-pipelines

Groovy
21
star
17

percona-xtradb-cluster-docker

Shell
21
star
18

tpce-mysql

C++
20
star
19

grafana_mongodb_dashboards

Dashboard for using Grafana and prometheus_mongodb_exporter
17
star
20

prom-migrate

prom-migrate reads all data from Prometheus 1.8 via API and creates a new Prometheus 2.0 storage directory.
Go
16
star
21

sysbench-mongodb-lua

Lua
14
star
22

serverless-postgresql-build

Shell
13
star
23

pmm-submodules

A repo dedicated to building Percona Monitoring and Management (PMM)
Python
12
star
24

pt-mysql-config-diff

A tool like pt-config-diff written in Go
Go
11
star
25

terraform-provider-percona

Terraform modules to deploy Percona Server and Percona XtraDB Cluster
Go
11
star
26

pt-pmp

Shell
10
star
27

go-tpcc

Go
10
star
28

pmm-custom-queries

Custom queries for Percona Monitoring and Management (PMM)
Shell
9
star
29

proxysql-ha-experiments

Shell
8
star
30

slowlog2clickhouse

Parse MySQL Slow log and save into ClickHouse table
Go
8
star
31

tokumx2_to_psmdb3_migration

Instructions and scripts to facilitate migration from TokuMX 2.0.x to PSMDB 3.0.x
JavaScript
7
star
32

percona-images

Packer config to build Percona base boxes
Shell
7
star
33

libeatmydata

libeatmydata library and packaging
M4
7
star
34

k8s-lab

HCL
6
star
35

autotokubackup

AutoTokBackup: A tokubackup commandline tool for running Percona TokuBackup written in Python3
Python
6
star
36

ognom-toolkit

Go
5
star
37

mysql-group-replication-docker

Shell
5
star
38

mysql-configs

5
star
39

percona-dbaas-cli

Go
5
star
40

percona-binlog-server

Percona Binary Log Server
C++
5
star
41

percona-millipede

Multi-host, sub-second replication delay monitor
Python
4
star
42

ps-build

Collection of MySQL build scripts
Shell
4
star
43

proxysql-scheduler

Shell
4
star
44

pmm-ruled

Rules Daemon
Go
4
star
45

group_replication_tools

4
star
46

mnogo_exporter

Moved to https://github.com/percona/mongodb_exporter (branch exporter_v2)
Go
4
star
47

percona-version-service

Go
3
star
48

pt-mongodb-summary

pt-mongodb-summary
Go
3
star
49

toolkit-tests

Docker container to run Percona Toolkit tests
Shell
3
star
50

minimum_permissions

Get the minimum set of permissions needed to run a particular query
Go
3
star
51

MetricBench

C++
3
star
52

serverless-postgresql-ansible

Ansible playbook to deploy serverless PostgreSQL
Jinja
3
star
53

codeceptjs-saucehelper

CodeceptJS Sauce Labs helpers, to update Test Names, Test Results after test execution
JavaScript
2
star
54

codeceptjs-influxdbhelper

CodeceptJS helper to collect Test Execution Metrics with the help of CodeceptJS test events
JavaScript
2
star
55

text2json

POC for a text (pt-summary / pt-mysql-summary) reports to json converter
Go
2
star
56

sysbench-blob

Lua
2
star
57

PLG

Recording and replaying exporters
Go
2
star
58

redo_log_dumper

POC for a innodb redo log dumper
Go
2
star
59

mongodb-fingerprint

Go
2
star
60

pmm-dashboards

PMM compatible dashboards or dashboards done during webinars or presentations
Shell
2
star
61

pmm-api

Moved to https://github.com/percona/pmm
HTML
2
star
62

benchmark_automation

Scripts to help automate the running of repeatable benchmarks
Shell
2
star
63

mongodb_systemd_multi

Setup of script for multiple instances in a single host for mongodb (designed for testing) or systemd mongo for mongos, arbiters, config servers and mongod's
Shell
2
star
64

wsrpc

Early prototype; dead end
Go
2
star
65

pmm-workloads

Various Workloads to Test and Demo Percona Monitoring and Management (PMM)
PHP
2
star
66

star-schema-benchmark

Scala
1
star
67

sanitizer

POC for a log sanitizer for pt-stalk, pt-mysql-summary, pt-summary
Go
1
star
68

operator-env

Go
1
star
69

sst-bench

Shell
1
star
70

vitess-tpcc-lab

Lua
1
star
71

percona-server-mongodb-openshift

Shell
1
star
72

wikistat-data

Python
1
star
73

mysql-kubernetes-openshift

1
star
74

pmm-client-docker

Shell
1
star
75

install-repo-pmm-server

Shell
1
star
76

pmm-api-tests

API tests for PMM 2.x.
Go
1
star
77

eng-scripts

Scripts from MySQL Engineering Team
Shell
1
star
78

pmm-build

Early prototype; dead end
Go
1
star
79

percona-mixins

Jsonnet
1
star
80

percona-on-arm

Unofficial builds
Shell
1
star
81

sysbench-mongodb-loop

A wrapper to run sysbench-mongodb forever
Shell
1
star
82

procfs

C++
1
star
83

visualize-mysql-queries

visualize mysql queries (based on Performance Schema)
JavaScript
1
star
84

rdsosmetrics_exporter

PROOF OF CONCEPT! export RDS Enhanced Monitoring metrics from CloudWatch Logs for prometheus
Go
1
star
85

MyRocks-benchmark

Scripts used too perform MyRocks specific benchmarks, forked from mdcallag/sysbench
1
star
86

single_install

This repo provides a single click installation script to Percona Products
Shell
1
star
87

qa-integration

Integration Tests Repo for Percona QA cross product common tests for release Integration testing
Shell
1
star