cmix version 19
Released by Byron Knoll on August 29, 2021.
http://www.byronknoll.com/cmix.html
https://github.com/byronknoll/cmix

cmix is a lossless data compression program aimed at optimizing compression ratio at the cost of high CPU/memory usage. cmix is free software distributed under the GNU General Public License.

cmix runs on Linux, Windows, and Mac OS X. At least 32 GB of RAM is recommended to run cmix. Feel free to contact me at [email protected] if you have any questions.

Use "make" to compile cmix. In Windows, cmix can be compiled with MinGW (http://nuwen.net/mingw.html) or Cygwin (https://www.cygwin.com).

cmix can only compress/decompress single files. To compress multiple files or directories, create an archive file using "tar" (or some similar tool).
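As a minimal sketch of that workflow, the commands below bundle a directory into a single uncompressed tar archive and unpack it again. The directory and file names (my_project, my_project.tar) are hypothetical, and the cmix invocation itself is left out. The archive is deliberately created without tar's own compression options so that all compression is left to cmix.

```shell
# Hypothetical example data: a small directory tree to archive.
workdir=$(mktemp -d)
mkdir -p "$workdir/my_project/docs"
printf 'hello\n' > "$workdir/my_project/readme.txt"
printf 'notes\n' > "$workdir/my_project/docs/notes.txt"

# Bundle the directory into one uncompressed archive file.
# This single file is what would be given to cmix.
tar -cf "$workdir/my_project.tar" -C "$workdir" my_project

# After decompression, unpack the restored archive into a new folder.
mkdir "$workdir/restored"
tar -xf "$workdir/my_project.tar" -C "$workdir/restored"

# Check that the unpacked tree matches the original.
diff -r "$workdir/my_project" "$workdir/restored/my_project" && echo "archive round trip OK"
```

Running diff on the restored tree is a cheap way to confirm that nothing was lost between archiving and unpacking.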

For some files, preprocessing using "precomp" may improve compression: https://github.com/schnaader/precomp-cpp

Compile with "-Ofast -march=native" for fastest performance. These compiler options can produce floating-point results that differ between computers, so a file compressed on one machine might not decompress correctly on another. Compile with "-O3" to avoid these compatibility issues (at the cost of slower performance).
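The floating-point issue comes from the fact that floating-point addition is not associative. In GCC, "-Ofast" turns on "-ffast-math", which allows the compiler to regroup such operations, and "-march=native" selects instructions for the build machine's CPU. The sketch below is not cmix code; it only assumes awk is available and uses it to show that the same three numbers summed in two groupings give different double-precision results.

```shell
# Sum the same three numbers with two different groupings, in double precision.
left=$(awk 'BEGIN { printf "%.17g", (0.1 + 0.2) + 0.3 }')
right=$(awk 'BEGIN { printf "%.17g", 0.1 + (0.2 + 0.3) }')

echo "grouped left:  $left"
echo "grouped right: $right"

# A difference in the last digits is enough to make two builds disagree.
if [ "$left" != "$right" ]; then
  echo "grouping changed the result"
fi
```

A compressor whose decoder must reproduce the encoder's predictions exactly is sensitive to even a last-digit difference of this kind, which is why the more conservative "-O3" build is the portable choice.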

Instructions for enwik9 preprocessing:
1) Copy enwik9-preproc and dictionary/.new_article_order to a new folder.
2) Run: ./enwik9-preproc c path_to_enwik9
3) .ready4cmix is the preprocessed file. To reproduce enwik9, run: ./enwik9-preproc d .ready4cmix

Changes from version 18 to version 19:
- Improvements from STARLIT and cmix-hp

Changes from version 17 to version 18:
- LSTM improvements

Changes from version 16 to version 17:
- New dictionary from phda 2017 November release
- Rewrote WRT dictionary preprocessor
- Improvements ported from paq8px
- Some other minor improvements, bug fixes, and refactoring

Changes from version 15 to version 16:
- Improvements ported from paq8px
- Context mixer improvements
- Removed some redundant models
- LSTM improvements

Changes from version 14 to version 15:
- Improvements ported from paq8px and paq8pxd
- Enabled pretraining
- LSTM tuning
- Context mixer improvements
- Memory tuning

Changes from version 13 to version 14:
- LSTM performance optimizations and tuning
- Changes from Márcio Pais: improved preprocessing and many PAQ8 model improvements
- Disabled pretraining
- New dictionary (from phda)

Changes from version 12 to version 13:
- Improved LSTM and SSE

Changes from version 11 to version 12:
- Replaced byte mixer with LSTM

Changes from version 10 to version 11:
- Added ppmd model
- Memory tuning and bug fixes

Changes from version 9 to version 10:
- Learning rate decay tuning
- Added "interval" context (adapted from MCM https://github.com/mathieuchartier/mcm)
- Added "only preprocessing" compression option
- Refactoring and memory tuning

Changes from version 8 to version 9:
- Improved Windows compatibility
- Learning rate decay
- New nonstationary state table
- Bracket matching models
- Minor bug fixes and memory tuning

Changes from version 7 to version 8:
- Added a recurrent backpropagation neural network to mix byte-level models
- Added some PPM models
- Memory tuning
- Minor refactoring

Changes from version 6 to version 7:
- Fixed bug that caused cmix to crash in Windows
- Tweaked word model
- Added some image models (JPEG/BMP/TIFF)
- Improved preprocessor (JPEG/BMP/text/EXE)

Changes from version 5 to version 6:
- Memory tuning
- Removed PPM model

Changes from version 4 to version 5:
- Added dictionary pretraining
- Removed dictionary from cmix binary
- Integrated some code from paq8pxd_v12 into paq8l
- Minor refactoring

Changes from version 3 to version 4:
- Added an additional mixer context (longest match length)
- Implemented an additional PPM model
- Integrated some code from paq8pxd_v11 into paq8l
- Integrated mixer contexts from paqar into paq8l
- Minor refactoring and tuning

Changes from version 2 to version 3:
- Merged paq8pxd code into paq8l
- Optimized speed of cmix and paq8 mixers
- Minor refactoring and tuning

Changes from version 1 to version 2:
- Memory tuning
- Increased PPM order to 7
- Replaced dictionary with the one used by decmprs8
- Added paq8pxd_v7 models
- Removed pic models
- Added "facade" models to copy the paq8 model predictions to the cmix mixer