• Stars
    star
    37
  • Rank 720,807 (Top 15 %)
  • Language
    C++
  • License
    GNU Affero Genera...
  • Created over 13 years ago
  • Updated almost 13 years ago

Reviews

There are no reviews yet. Be the first to send feedback to the community and the maintainers!

Repository Details

AutoCorpus is a set of utilities that enable automatic extraction of language corpora and language models from publicly available datasets. Autocorpus utilities follow the Unix design philosophy and integrate easily into custom data processing pipelines.