Awesome Java Document and Text Processing Libraries

  • updated about 2 months ago, GNU Lesser Genera...

    Style and Grammar Checker for 25+ Languages

  • tika (1,860 stars)
    updated over 1 year ago, Apache License 2.0

    The Apache Tika toolkit detects and extracts metadata and text from over a thousand different file types (such as PPT, XLS, and PDF).
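To give a feel for what "detecting" a file type can involve, here is a minimal sketch of content-based detection by leading "magic" bytes. It uses only the Java standard library and is not Tika's API or its implementation; the class name `MagicSniffer`, the method names, and the returned labels are hypothetical and chosen only for illustration. It recognizes two of the formats mentioned above: PDF, and the OLE2 container used by legacy PPT and XLS files.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical illustration class; not part of Apache Tika.
class MagicSniffer {
    // A PDF file begins with the ASCII text "%PDF-".
    private static final byte[] PDF = "%PDF-".getBytes(StandardCharsets.US_ASCII);

    // An OLE2 compound file (the container behind legacy PPT and XLS)
    // begins with this fixed 8-byte signature.
    private static final byte[] OLE2 = {
        (byte) 0xD0, (byte) 0xCF, 0x11, (byte) 0xE0,
        (byte) 0xA1, (byte) 0xB1, 0x1A, (byte) 0xE1
    };

    // True if data is at least as long as prefix and starts with it.
    static boolean startsWith(byte[] data, byte[] prefix) {
        return data.length >= prefix.length
            && Arrays.equals(Arrays.copyOf(data, prefix.length), prefix);
    }

    // Returns an illustrative label ("pdf", "ole2", or "unknown"),
    // not an official media type.
    static String detect(byte[] data) {
        if (startsWith(data, PDF)) {
            return "pdf";
        }
        if (startsWith(data, OLE2)) {
            return "ole2";
        }
        return "unknown";
    }

    public static void main(String[] args) {
        byte[] sample = "%PDF-1.7\n".getBytes(StandardCharsets.US_ASCII);
        System.out.println(detect(sample));
    }
}
```

A signature check like this only identifies the container. Telling a PPT from an XLS inside an OLE2 file, or extracting text and metadata, requires parsing the file's internal structure, which is the kind of work a toolkit such as Tika takes on.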