  • Stars: 156
  • Rank: 238,667 (Top 5%)
  • Language: C#
  • License: Apache License 2.0
  • Created: over 8 years ago
  • Updated: almost 8 years ago

Repository Details

.NET based webcrawler

NCrawler

A simple, very efficient multithreaded web crawler written in C#, with pipeline-based processing. It includes HTML, text, PDF, and IFilter document processors, plus language detection (Google). Pipeline steps that extract, use, or alter information are easy to add.
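
To illustrate what pipeline-based processing can look like, here is a minimal sketch in C#. Every type and member name in it (PageContext, IPageStep, TitleStep, LinkStep) is hypothetical and chosen for illustration; none of it is NCrawler's actual API, and the regex-based parsing stands in for a real HTML processor. The idea it shows is only the general pattern: each step receives the same page object, reads what earlier steps produced, and adds or alters information.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical types for illustration only; these are NOT NCrawler's API.
public sealed class PageContext
{
    public PageContext(Uri uri, string html) { Uri = uri; Html = html; }

    public Uri Uri { get; }
    public string Html { get; }
    public string Title { get; set; } = "";
    public List<Uri> Links { get; } = new List<Uri>();
}

// One unit of work in the pipeline.
public interface IPageStep
{
    void Process(PageContext page);
}

// Extracts the <title> text (naive regex, good enough for a sketch).
public sealed class TitleStep : IPageStep
{
    private static readonly Regex TitleRegex = new Regex(
        @"<title>(.*?)</title>",
        RegexOptions.IgnoreCase | RegexOptions.Singleline);

    public void Process(PageContext page)
    {
        Match match = TitleRegex.Match(page.Html);
        if (match.Success)
        {
            page.Title = match.Groups[1].Value.Trim();
        }
    }
}

// Collects double-quoted href values and resolves them against the page URI.
public sealed class LinkStep : IPageStep
{
    private static readonly Regex HrefRegex = new Regex(
        "href\\s*=\\s*\"([^\"]+)\"",
        RegexOptions.IgnoreCase);

    public void Process(PageContext page)
    {
        foreach (Match match in HrefRegex.Matches(page.Html))
        {
            if (Uri.TryCreate(page.Uri, match.Groups[1].Value, out var resolved))
            {
                page.Links.Add(resolved);
            }
        }
    }
}

public static class Program
{
    public static void Main()
    {
        // A hard-coded page stands in for a downloaded document.
        var page = new PageContext(
            new Uri("https://example.com/docs/"),
            "<html><head><title> Demo </title></head>" +
            "<body><a href=\"a.html\">A</a></body></html>");

        // Steps run in order; adding behaviour means adding a step.
        var steps = new List<IPageStep> { new TitleStep(), new LinkStep() };
        foreach (IPageStep step in steps)
        {
            step.Process(page);
        }

        Console.WriteLine(page.Title);
        foreach (Uri link in page.Links)
        {
            Console.WriteLine(link);
        }
    }
}
```

In a design like this, new behaviour (for example a language-detection step) would be added by implementing the step interface and appending it to the list, without touching the existing steps. Consult the NCrawler source for the real interfaces and registration mechanism.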

A total rewrite of NCrawler from 2010, using more modern programming techniques. Now on v4.