  • License
    MIT License
  • Created almost 11 years ago
  • Updated 12 months ago



Repository Details

Oplog-based data sync tool that synchronizes data from a replica set to another deployment, e.g., a standalone instance, a replica set, or a sharded cluster.


It synchronizes data from a replica set to another MongoDB deployment, e.g., a standalone instance, a replica set, or a sharded cluster.

It is oplog-based and provides real-time data synchronization.

It's written in Python 2.7.


  • MongoDB 2.4
  • MongoDB 2.6
  • MongoDB 3.0
  • MongoDB 3.2
  • MongoDB 3.4


  • initial sync and oplog based incremental sync
  • sync the specified databases and collections
  • concurrent oplog replaying
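
The README does not say how oplog entries are replayed concurrently. One common approach, shown here only as a sketch and not confirmed to be this tool's method, is to hash each document's `_id` to a worker so that all operations on the same document stay in order. The function names are hypothetical, and `zlib.crc32` is a stand-in for whatever hash the tool really uses.

```python
import zlib


def worker_for(doc_id, num_workers):
    """Map a document id to a worker index so ops on one document stay ordered."""
    # crc32 is a stand-in hash (assumption); unlike hash(), it is stable across runs.
    key = repr(doc_id).encode("utf-8")
    return (zlib.crc32(key) & 0xffffffff) % num_workers


def partition_oplog(entries, num_workers):
    """Split oplog-like entries into per-worker queues, keeping relative order."""
    queues = [[] for _ in range(num_workers)]
    for entry in entries:
        # Updates carry the target _id in 'o2'; inserts and deletes carry it in 'o'.
        doc = entry.get("o2") or entry.get("o") or {}
        queues[worker_for(doc.get("_id"), num_workers)].append(entry)
    return queues
```

Each queue can then be drained by its own worker without reordering operations on any single document.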


See requirements for details.

  • gevent

  • toml

  • mmh3

  • pymongo

    Always use pymongo 3.5.1.

    Refer to the PyMongo 3.6 changelog (https://api.mongodb.com/python/3.6.0/changelog.html), which states:

    "Version 3.6 adds support for MongoDB 3.6, drops support for CPython 3.3 (PyPy3 is still supported), and drops support for MongoDB versions older than 2.6. If connecting to a MongoDB 2.4 server or older, PyMongo now throws a ConfigurationError."
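
The version constraint above can be checked before a sync starts. This is a minimal sketch with hypothetical helper names; it handles only plain numeric version strings such as `3.5.1`.

```python
def parse_version(version):
    """Turn a dotted version string such as '3.5.1' into a tuple of ints."""
    return tuple(int(part) for part in version.split(".")[:3])


def pymongo_supports_mongodb_24(pymongo_version):
    """PyMongo 3.6 and later refuse MongoDB 2.4 servers, per the quoted changelog."""
    return parse_version(pymongo_version) < (3, 6)
```

In a real script the argument would typically be `pymongo.version`, the version string exposed by the installed driver.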


  • the source MUST be a replica set
  • system databases are ignored
    • admin
    • local
  • system collections are ignored
    • system.*
  • create users on the destination manually if necessary
  • authenticating as an administrator is recommended if authentication is enabled on the source
  • geospatial indexes are not supported
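
The namespace rules in this list (skip `admin`, `local`, and `system.*`) can be expressed as a small filter. The function name is hypothetical and the sketch illustrates the rule rather than the tool's actual code.

```python
SYSTEM_DATABASES = ("admin", "local")


def should_sync(db_name, coll_name):
    """Return True unless the namespace is a system database or system collection."""
    if db_name in SYSTEM_DATABASES:
        return False
    if coll_name.startswith("system."):
        return False
    return True
```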

If the source is a sharded cluster:

  • first, stop the balancer
  • then, start a separate sync process for each shard
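
The second step can be sketched as one `sync.py -f <config>` process per shard. The per-shard config file names and the `interpreter` default are assumptions for illustration. The balancer must already be stopped before these processes start, for example with `sh.stopBalancer()` in the mongo shell.

```python
import subprocess


def build_sync_commands(shard_configs, script="sync.py", interpreter="python"):
    """Build one command line per shard; each shard has its own config file."""
    return [[interpreter, script, "-f", path] for path in shard_configs]


def start_sync_processes(commands):
    """Start every command as a separate OS process and return the handles."""
    return [subprocess.Popen(cmd) for cmd in commands]
```

Each config file would point `src.hosts` at a member of that shard's replica set.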


The configuration file uses the TOML format.

Refer to mongo_conf.toml.


Source config items.

  • src.hosts - hostportstr of a member of the replica set
  • src.username - username
  • src.password - password
  • src.authdb - authentication database


Destination config items.

  • dst.mongo.hosts
  • dst.mongo.authdb
  • dst.mongo.username
  • dst.mongo.password
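
To show how the four items (hosts, authdb, username, password) relate to each other, here is a sketch that maps them onto a standard MongoDB connection URI. The tool may pass credentials to the driver differently. The function name is hypothetical, and the `admin` default follows the command-line help.

```python
try:
    from urllib.parse import quote_plus  # Python 3
except ImportError:
    from urllib import quote_plus  # Python 2.7


def build_mongo_uri(hosts, username=None, password=None, authdb="admin"):
    """Build a MongoDB connection URI from hosts and optional credentials."""
    if not username:
        return "mongodb://%s/" % hosts
    # Credentials are percent-encoded so characters such as '@' stay unambiguous.
    return "mongodb://%s:%s@%s/?authSource=%s" % (
        quote_plus(username), quote_plus(password or ""), hosts, quote_plus(authdb))
```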


Custom options for synchronization.

sync.dbs specifies the databases to sync. sync.dbs.colls specifies the collections to sync.

  • sync.dbs - databases to sync; all databases are synced if not specified
    • sync.dbs.db - source database name
    • sync.dbs.rename_db - destination database name; same as the source name if not specified
    • sync.dbs.colls - collections to sync; all collections are synced if not specified

In each sync.dbs.colls element, coll specifies the collection to sync and fields specifies which fields of that collection to sync.
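
The nesting of these options is easier to see as data. The dictionary below is a hypothetical parsed form of the sync section, with example database and collection names, and `resolve` sketches the defaulting rules described above. The key name `fields` is an assumption based on the description, so check mongo_conf.toml for the exact spelling.

```python
# Hypothetical parsed form of the sync section; key names follow the items above.
sync_conf = {
    "dbs": [
        {
            "db": "app",
            "rename_db": "app_copy",
            "colls": [
                {"coll": "users", "fields": ["name", "email"]},
                {"coll": "orders"},
            ],
        },
        {"db": "logs"},
    ]
}


def resolve(sync, db_name, coll_name):
    """Return (dst_db, fields) for a namespace, or None if it is not selected.

    fields is None when every field of the collection should be synced.
    """
    dbs = sync.get("dbs")
    if not dbs:
        return db_name, None  # no sync.dbs: all databases are synced
    for db in dbs:
        if db["db"] != db_name:
            continue
        dst_db = db.get("rename_db") or db_name
        colls = db.get("colls")
        if not colls:
            return dst_db, None  # no colls: all collections are synced
        for coll in colls:
            if coll["coll"] == coll_name:
                return dst_db, coll.get("fields")
        return None
    return None
```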


  • log.filepath - log file path, write to stdout if empty or not set
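
The fallback to stdout can be sketched with the standard `logging` module. The function name is hypothetical and the tool's own logging setup may differ.

```python
import logging
import sys


def make_log_handler(filepath=None):
    """Return a file handler if a path is given, otherwise a stdout handler."""
    if filepath:
        return logging.FileHandler(filepath)
    # Empty string or None both mean "not set", so log to stdout.
    return logging.StreamHandler(sys.stdout)
```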


Command-line options have functional limitations. Using a config file is strongly recommended.


usage: sync.py [-h] [-f [CONFIG]] [--src [SRC]] [--src-authdb [SRC_AUTHDB]]
               [--src-username [SRC_USERNAME]] [--src-password [SRC_PASSWORD]]
               [--dst [DST]] [--dst-authdb [DST_AUTHDB]]
               [--dst-username [DST_USERNAME]] [--dst-password [DST_PASSWORD]]
               [--start-optime [START_OPTIME]]
               [--optime-logfile [OPTIME_LOGFILE]] [--logfile [LOGFILE]]

Sync data from a replica-set to another MongoDB/Elasticsearch.

optional arguments:
  -h, --help            show this help message and exit
  -f [CONFIG], --config [CONFIG]
                        configuration file, note that command options will
                        override items in config file
  --src [SRC]           source should be hostportstr of a replica-set member
  --src-authdb [SRC_AUTHDB]
                        src authentication database, default is 'admin'
  --src-username [SRC_USERNAME]
                        src username
  --src-password [SRC_PASSWORD]
                        src password
  --dst [DST]           destination should be hostportstr of a mongos or
                        mongod instance
  --dst-authdb [DST_AUTHDB]
                        dst authentication database, default is 'admin', for
  --dst-username [DST_USERNAME]
                        dst username, for MongoDB
  --dst-password [DST_PASSWORD]
                        dst password, for MongoDB
  --start-optime [START_OPTIME]
                        timestamp in second, indicates oplog based increment
  --optime-logfile [OPTIME_LOGFILE]
                        optime log file path, use this as start optime if
                        without '--start-optime'
  --logfile [LOGFILE]   log file path
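
The help text states two precedence rules: command options override config file items, and the optime log file is used as the start optime only when `--start-optime` is absent. A minimal sketch of both rules follows. It assumes flat option dictionaries and uses hypothetical function names, so it is not the tool's actual parsing code.

```python
def merge_options(file_conf, cli_args):
    """Overlay command-line values on config-file values; unset options are skipped."""
    merged = dict(file_conf)
    for key, value in cli_args.items():
        if value is not None:
            merged[key] = value
    return merged


def resolve_start_optime(start_optime=None, optime_from_logfile=None):
    """Prefer --start-optime; otherwise fall back to the optime log file value."""
    if start_optime is not None:
        return start_optime
    return optime_from_logfile
```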


  • command options tuning
  • config file format tuning
  • sync sharding config (enableSharding & shardCollection)