Polo travels through your database and creates sample snapshots so you can work with real-world data in any environment.
Polo takes an ActiveRecord::Base seed object and traverses every whitelisted ActiveRecord::Association, generating SQL INSERTs along the way.
You can then save those SQL INSERTs to a .sql file and import the data into your favorite environment.
Read our blog post or check out this presentation.
Given the following data model:
class Chef < ActiveRecord::Base
  has_many :recipes
  has_many :ingredients, through: :recipes
end
class Recipe < ActiveRecord::Base
  has_many :recipes_ingredients
  has_many :ingredients, through: :recipes_ingredients
end
class Ingredient < ActiveRecord::Base
end
class RecipesIngredient < ActiveRecord::Base
  belongs_to :recipe
  belongs_to :ingredient
end
inserts = Polo.explore(Chef, 1)
INSERT INTO `chefs` (`id`, `name`) VALUES (1, 'Netto')
Where Chef is the seed object class, and 1 is the seed object id.
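To make the mapping from a record to its INSERT statement concrete, here is a minimal sketch in plain Ruby. It is an illustration only: `quote_value` and `insert_for` are hypothetical helpers, not part of Polo, which builds its statements through ActiveRecord.

```ruby
# Toy illustration only: these helpers are hypothetical and are not Polo's API.
def quote_value(value)
  case value
  when nil     then "NULL"
  when Numeric then value.to_s
  else "'#{value.to_s.gsub("'", "''")}'" # double any single quotes, SQL style
  end
end

# Builds one INSERT statement from a table name and a hash of column values.
def insert_for(table, attributes)
  columns = attributes.keys.map { |name| "`#{name}`" }.join(", ")
  values  = attributes.values.map { |value| quote_value(value) }.join(", ")
  "INSERT INTO `#{table}` (#{columns}) VALUES (#{values})"
end

chef_row = { id: 1, name: "Netto" }
puts insert_for("chefs", chef_row)
```

For the chef row this prints the same statement as the example output above.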
inserts = Polo.explore(Chef, 1, :recipes)
INSERT INTO `chefs` (`id`, `name`) VALUES (1, 'Netto')
INSERT INTO `recipes` (`id`, `title`, `num_steps`, `chef_id`) VALUES (1, 'Turkey Sandwich', NULL, 1)
INSERT INTO `recipes` (`id`, `title`, `num_steps`, `chef_id`) VALUES (2, 'Cheese Burger', NULL, 1)
inserts = Polo.explore(Chef, 1, :recipes => :ingredients)
INSERT INTO `chefs` (`id`, `name`) VALUES (1, 'Netto')
INSERT INTO `recipes` (`id`, `title`, `num_steps`, `chef_id`) VALUES (1, 'Turkey Sandwich', NULL, 1)
INSERT INTO `recipes` (`id`, `title`, `num_steps`, `chef_id`) VALUES (2, 'Cheese Burger', NULL, 1)
INSERT INTO `recipes_ingredients` (`id`, `recipe_id`, `ingredient_id`) VALUES (1, 1, 1)
INSERT INTO `recipes_ingredients` (`id`, `recipe_id`, `ingredient_id`) VALUES (2, 1, 2)
INSERT INTO `recipes_ingredients` (`id`, `recipe_id`, `ingredient_id`) VALUES (3, 2, 3)
INSERT INTO `recipes_ingredients` (`id`, `recipe_id`, `ingredient_id`) VALUES (4, 2, 4)
INSERT INTO `ingredients` (`id`, `name`, `quantity`) VALUES (1, 'Turkey', 'a lot')
INSERT INTO `ingredients` (`id`, `name`, `quantity`) VALUES (2, 'Cheese', '1 slice')
INSERT INTO `ingredients` (`id`, `name`, `quantity`) VALUES (3, 'Patty', '1')
INSERT INTO `ingredients` (`id`, `name`, `quantity`) VALUES (4, 'Cheese', '2 slices')
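Saving the generated statements to a .sql file can be sketched with Ruby's standard library alone. The sketch assumes that `Polo.explore` returns an array of SQL strings, and uses a hard-coded stand-in array so it runs without a database. The file name `chef_1.sql` is a hypothetical choice.

```ruby
require "tmpdir"

# Stand-in for the result of Polo.explore(Chef, 1, :recipes);
# assumed here to be an array of SQL strings.
inserts = [
  "INSERT INTO `chefs` (`id`, `name`) VALUES (1, 'Netto')",
  "INSERT INTO `recipes` (`id`, `title`, `num_steps`, `chef_id`) VALUES (1, 'Turkey Sandwich', NULL, 1)"
]

# Hypothetical output path; a temp directory keeps the sketch free of side effects.
path = File.join(Dir.mktmpdir, "chef_1.sql")

# Terminate each statement with a semicolon so a SQL client can run the file.
File.write(path, inserts.map { |sql| "#{sql};\n" }.join)

puts File.read(path)
```

A file written this way could then be loaded into a local MySQL database with something like `mysql my_database < chef_1.sql`, where `my_database` is a placeholder name.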
Occasionally, you might want to refresh a dataset: for example, a production database may hold data that would be useful in your local copy of the database. Polo doesn't have an opinion about your data; if you try to import data with a key that's already in your local database, Polo doesn't necessarily know how you want to handle that conflict.
Advanced users will find the on_duplicate option helpful in this context. It gives Polo instructions on how to handle collisions.
Note: This feature is currently only supported for MySQL databases. (PRs for other databases are welcome!)
There are two possible values for the on_duplicate key: :ignore and :override. :ignore keeps the old data. :override keeps the new data. If there's a collision and the on_duplicate param is not set, Polo will simply stop importing the data.
A.K.A the Ostrich Approach: stick your head in the sand and pretend nothing happened.
Polo.configure do
  on_duplicate :ignore
end
Polo.explore(Chef, 1, :recipes)
INSERT IGNORE INTO `chefs` (`id`, `name`) VALUES (1, 'Netto')
INSERT IGNORE INTO `recipes` (`id`, `title`, `num_steps`, `chef_id`) VALUES (1, 'Turkey Sandwich', NULL, 1)
INSERT IGNORE INTO `recipes` (`id`, `title`, `num_steps`, `chef_id`) VALUES (2, 'Cheese Burger', NULL, 1)
Use the option on_duplicate: :override to override your local data with new data from your Polo script.
Polo.configure do
  on_duplicate :override
end
Polo.explore(Chef, 1, :recipes)
INSERT INTO `chefs` (`id`, `name`) VALUES (1, 'Netto')
ON DUPLICATE KEY UPDATE id = VALUES(id), name = VALUES(name)
...
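The difference between the two modes comes down to the shape of the generated MySQL statement. The sketch below reproduces that shape with plain string handling. `apply_on_duplicate` is a hypothetical helper written for illustration, not Polo's implementation.

```ruby
# Hypothetical helper, not part of Polo: rewrites a plain INSERT into the
# statement shapes shown for the :ignore and :override modes.
def apply_on_duplicate(insert_sql, columns, mode)
  case mode
  when :ignore
    # Existing rows win: MySQL skips rows whose key already exists.
    insert_sql.sub(/\AINSERT INTO/, "INSERT IGNORE INTO")
  when :override
    # Incoming rows win: every listed column is overwritten on a key collision.
    updates = columns.map { |column| "#{column} = VALUES(#{column})" }.join(", ")
    "#{insert_sql} ON DUPLICATE KEY UPDATE #{updates}"
  else
    insert_sql
  end
end

sql = "INSERT INTO `chefs` (`id`, `name`) VALUES (1, 'Netto')"
puts apply_on_duplicate(sql, %w[id name], :ignore)
puts apply_on_duplicate(sql, %w[id name], :override)
```

With no mode set the statement is left unchanged, so a key collision surfaces as a database error during import.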
You can use the obfuscate option to obfuscate sensitive fields like emails or user logins.
Polo.configure do
  obfuscate :email, :credit_card
end
Polo.explore(Chef, 1)
INSERT INTO `chefs` (`id`, `name`, `email`) VALUES (1, 'Netto', 'eahorctmaagfo.nitm@l')
Warning: This is not a security feature. Fields can still easily be rearranged back to their original format. Polo will simply scramble the order of strings so you don't accidentally end up causing side effects when using production data in development. It is not a good practice to use highly sensitive data in development.
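A short sketch shows why scrambling offers no real protection. It assumes the default obfuscation behaves like a plain character shuffle, as the warning describes. `scramble` is a hypothetical stand-in and the address is made up.

```ruby
# Assumption: the default obfuscation behaves like a character shuffle.
# `scramble` is a hypothetical stand-in, not Polo's own method.
def scramble(value)
  value.chars.shuffle.join
end

original  = "netto@example.com" # made-up address for illustration
scrambled = scramble(original)

puts scrambled

# The scrambled value is a permutation of the original: same characters,
# same length, only the order differs.
puts scrambled.chars.sort == original.chars.sort
```

Because every original character is still present, a short or predictable value can often be reconstructed by inspection.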
For more advanced obfuscation, you can pass in a custom obfuscation strategy. Polo will take in a lambda that can be used to transform sensitive data.
Using a :symbol as an obfuscate key targets all columns of that name. Passing an SQL selector as a String will target columns within the specified table.
Polo.configure do
  email_strategy = lambda do |email|
    first_part = email.split("@")[0]
    "#{first_part}@test.com"
  end
  credit_card_strategy = lambda do |credit_card|
    "4123 4567 8910 1112"
  end
  # If you need the context of the record for its fields, it is accessible
  # in the second argument of the strategy
  social_security_strategy = lambda do |ssn, instance|
    sprintf("%09d", instance.id)
  end
  obfuscate({
    'chefs.email' => email_strategy, # This only applies to the "email" column in the "chefs" table
    :credit_card => credit_card_strategy, # This applies to any column named "credit_card" across every table
    :ssn => social_security_strategy # This applies to any column named "ssn" across every table
  })
end
Polo.explore(Chef, 1)
INSERT INTO `chefs` (`id`, `name`, `email`) VALUES (1, 'Netto', 'nettofarah@test.com')
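Since the strategies are ordinary lambdas, they can be tried in isolation before wiring them into the configuration. The sketch below calls two of them directly and illustrates the Symbol versus String key rule described above. `ChefRecord` and `key_matches?` are hypothetical names used for illustration only, and the record values are made up.

```ruby
# Stand-in for an ActiveRecord instance; only used to show the second argument.
ChefRecord = Struct.new(:id, :email, :ssn)

email_strategy = lambda do |email|
  "#{email.split("@")[0]}@test.com"
end

# The second argument gives the strategy access to the whole record.
social_security_strategy = lambda do |_ssn, instance|
  format("%09d", instance.id)
end

# Hypothetical matcher: a Symbol key matches a column name in any table,
# a "table.column" String matches only that table's column.
def key_matches?(key, table, column)
  case key
  when Symbol then key.to_s == column
  when String then key == "#{table}.#{column}"
  else false
  end
end

chef = ChefRecord.new(1, "chef@example.com", "123-45-6789")

puts email_strategy.call(chef.email)
puts social_security_strategy.call(chef.ssn, chef)
puts key_matches?("chefs.email", "chefs", "email")
puts key_matches?("chefs.email", "users", "email")
puts key_matches?(:credit_card, "orders", "credit_card")
```

Testing strategies this way makes it easy to confirm that the transformed values still satisfy any format constraints in the target schema.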
Add this line to your application's Gemfile:
gem 'polo'
And then execute:
$ bundle
Or install it yourself as:
$ gem install polo
Bug reports and pull requests are welcome on GitHub at https://github.com/IFTTT/polo. This project is intended to be a safe, welcoming space for collaboration, and contributors are expected to adhere to the Code of Conduct.
To run the specs across all supported versions of Rails, check out the repo and follow these steps:
$ bundle install
$ bundle exec appraisal install
$ bundle exec appraisal rake
The gem is available as open source under the terms of the MIT License.