  • Stars
    2
  • Language
    Jupyter Notebook
  • License
    MIT License
  • Created almost 4 years ago
  • Updated almost 4 years ago


Repository Details

This repository presents an approach to predicting the language in which a document is written. The approach transforms a text into character n-gram features and feeds them to a machine-learned classifier. Experimental results show that it identifies 14 languages with high accuracy and outperforms some of the most popular language identification libraries in the Python ecosystem.
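To make the idea of character n-gram features concrete, here is a minimal sketch in plain Python. It is not the repository's code: the description does not say which n-gram lengths or which classifier are used, so the 1 to 3 character range, the multinomial Naive Bayes model, the helper names (`char_ngrams`, `NgramNaiveBayes`), and the three-language toy corpus are all assumptions chosen for illustration.

```python
import math
from collections import Counter, defaultdict


def char_ngrams(text, n_min=1, n_max=3):
    """Return every character n-gram of length n_min..n_max (hypothetical helper)."""
    text = f" {text.lower()} "  # pad so that word boundaries become features too
    return [text[i:i + n]
            for n in range(n_min, n_max + 1)
            for i in range(len(text) - n + 1)]


class NgramNaiveBayes:
    """Toy multinomial Naive Bayes over character n-gram counts (an assumed model)."""

    def fit(self, texts, labels):
        self.counts = defaultdict(Counter)  # label -> n-gram frequency table
        self.docs = Counter(labels)         # label -> number of training texts
        for text, label in zip(texts, labels):
            self.counts[label].update(char_ngrams(text))
        self.vocab = {g for table in self.counts.values() for g in table}
        self.totals = {lab: sum(t.values()) for lab, t in self.counts.items()}
        return self

    def predict(self, text):
        n_docs = sum(self.docs.values())
        vocab_size = len(self.vocab)
        best_label, best_score = None, -math.inf
        for label in self.docs:
            # Log prior plus the sum of Laplace-smoothed log likelihoods.
            score = math.log(self.docs[label] / n_docs)
            for gram in char_ngrams(text):
                score += math.log((self.counts[label][gram] + 1)
                                  / (self.totals[label] + vocab_size))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Tiny illustrative corpus; a real experiment needs far more text per language.
texts = [
    "the quick brown fox jumps over the lazy dog",
    "this is a short sentence written in english",
    "el rápido zorro marrón salta sobre el perro perezoso",
    "esta es una frase corta escrita en español",
    "der schnelle braune fuchs springt über den faulen hund",
    "dies ist ein kurzer satz auf deutsch geschrieben",
]
labels = ["en", "en", "es", "es", "de", "de"]

model = NgramNaiveBayes().fit(texts, labels)
for sample in ["the dog is lazy", "el perro salta", "der hund springt"]:
    print(sample, "->", model.predict(sample))
```

The sample sentences reuse words from the toy corpus, so this is only a smoke test of the pipeline and says nothing about the accuracy reported for the repository's 14-language experiments.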