  • License
    MIT License
  • Created over 5 years ago
  • Updated over 1 year ago

Awesome Dataset Tools

A curated list of awesome dataset tools

Labeling Tools

Images

  • LabelFlow - Open image annotation tool for machine learning projects
  • CVAT - Online, interactive video and image annotation tool for computer vision
  • COCO Annotator - Web-based image segmentation tool for object detection, localization and keypoints
  • VoTT - Visual Object Tagging Tool: An electron app for building end to end object detection models from images and videos.
  • Scalabel - Versatile and scalable tool that supports various kinds of annotations
  • EVA - Web-based tool for efficient annotation of videos and image sequences, with additional tracking capabilities
  • LOST - Design your own smart Image Annotation process in a web-based environment
  • Boobs - Fast and efficient BBox annotation for your images in YOLO, VOC/COCO formats
  • MuViLab - Tool to help you label videos for computer vision
  • Turkey - Web UI on Amazon Mechanical Turk to crowd-source image segmentation
  • React Image Annotation - An infinitely customizable image tool built on React
  • Point Cloud Annotation Tool - Annotate 3D boxes in point clouds
  • ImageTagger - Open source online platform for collaborative image labeling
  • DeepLabel - A cross-platform image annotation tool for machine learning
  • Visual Object Tagging Tool - An electron app for building end to end Object Detection Models
  • VGG Image Annotator - Standalone image annotator application packaged as a single HTML file
  • SMART - Efficiently build labeled training datasets for supervised machine learning tasks
  • Pixel Annotation Tool - Uses OpenCV's marker-based watershed algorithm to annotate images in directories
  • Pixie - GUI annotation tool that provides bounding box, polygon, and semantic segmentation annotation
  • Turktool - Modern React app for scalable bounding box annotation of images
  • LabelD - Simple image annotation tool for streamlining the overall process
  • Comma Coloring - Adult coloring book for image segmentation
  • LabelImg - Graphical image annotation tool for labeling object bounding boxes in images
  • LCs Finder - Image annotation and object detection tool written in C
  • js-segment-annotator - Javascript image annotation tool based on image segmentation
  • Cytomine - Analysis of multi-gigapixel images
  • labelme - Image Polygonal Annotation with Python (polygon, rectangle, circle, line, point and image-level flag annotation)
  • SimpleAnnotate - Open source video and image annotation software, currently only for OSX
  • Sloth - Labeling image and video data for computer vision research
  • Fast Annotation Tool - Online platform for collaborative image annotation
  • Anno-Mage - Helps you annotate images by suggesting annotations for 80 object classes
  • MedTagger - Collaborative framework for annotating medical datasets using crowdsourcing
  • OpenLabeling - Labeling in multiple annotation formats
  • Alturos.ImageAnnotation - Collaborative tool for labeling image data for YOLO
  • Yolo_mark - GUI for marking bounding boxes of objects in images
  • imglab - Speed up and simplify the image labeling/annotation process with multiple supported formats
  • OpenLabeler - Open source desktop application for annotating objects
  • UltimateLabeling - A multi-purpose Video Labeling GUI with integrated SOTA detector and tracker
  • DataGym.ai - Open source annotation and labeling tool for image and video assets

Closed Source

  • DataTorch - Platform for creating and sharing datasets
  • Labelbox - Platform for data labeling, data management, and data science. Its features include image annotation, bounding boxes, text classification, and more
  • Supervise.ly - Image annotation and data management tool that you can use to create image and video datasets
  • Prodigy - Annotation tool for various machine learning tasks such as image classification, entity recognition and intent detection
  • RectLabel - Label images for bounding box object detection and segmentation
  • Lionbridge AI - Quickly annotate thousands of images and videos with relevant tags
  • TrainingData.io - Medical image annotation tool for data labeling. Supports the DICOM image format for radiology AI
  • Spare5 - Crowdsourcing service for tasks such as data and image annotation, language assessment, and more
  • Hive - Text and image annotation service that helps you create training datasets
  • Figure Eight - Supports audio, computer vision, natural language processing, and other data tasks
  • Dataturks - Image segmentation, named entity recognition (NER) tagging in documents, and POS tagging
  • UBIAI - Easy-to-use text annotation tool for teams with comprehensive auto-annotation features. Supports NER, relations, and document classification, as well as OCR annotation for invoice labeling
  • Playment - Services offered include bounding boxes, points and lines, polygons, semantic segmentation, and more
  • Cogito Tech - Image annotation, content moderation, sentiment analysis, chatbot training
  • OCLAVI - Annotate Bounding Box, Polygon, Circle, Point and Cuboidal annotations with precision
  • Humans in the Loop - Use cases include face recognition, autonomous vehicles, and figure detection
  • WorkAround - Host and annotate data, manage projects, and build datasets alongside top companies
  • TaQadam - On-demand annotation with agents-in-the-loop
  • Zillin - Image annotation service for classification, object detection and segmentation with API access and georeferenced images support.
  • IBM Cloud Annotations - Simple and collaborative image annotation tool for teams and individuals inside the IBM Cloud environment
  • TrainingSet.AI - Platform for the data labeling step of AI development for images, video, and point cloud data (automatic labeling, ground truth, annotation tools, web dataset creation, S3, teams and statistics tools)
  • MedSeg - Free online medical annotation (segmentation) with AI models.
  • MVTec Deep Learning Tool - Provides labeling functionalities for HALCON's deep-learning-based object detection and classification.
  • Amazon SageMaker Ground Truth - Annotate data using MTurk, vendor workforces, or your own private work teams. Use Ground Truth's built-in UIs (video, point cloud, image, text, document processing) or bring your own custom UI

Audio

  • Audio Annotator - JavaScript interface for annotating and labeling audio files
  • Dynitag - Web-based collaborative audio annotator tool
  • EchoML - Play, visualize, and annotate your audio files for machine learning

Closed Source

  • Figure Eight - Supports audio, computer vision, natural language processing, and other data tasks

Time Series

  • Curve - An integrated experimental platform for time series data anomaly detection
  • TagAnomaly - Anomaly detection analysis and labeling tool, specifically for multiple time series
  • time-series-annotator - Implements classification tasks for time series.
  • WDK - Tools to facilitate the development of activity recognition applications with wearable devices

Text

  • brat - For all your textual annotation needs
  • doccano - Open source text annotation tool for machine learning practitioners
  • Inception - A semantic annotation platform offering intelligent annotation assistance
  • NeuroNER - Named-entity recognition using neural networks
  • YEDDA - For annotating chunks, entities, and events in text, symbols, and even emoji
  • TALEN - Web-based tool for annotating word sequences
  • WebAnno - Web-based annotation tool for a wide range of linguistic annotations
  • MAE - Lightweight, general-purpose natural language annotation tool
  • Anafora - Web-based raw text annotation tool
  • TagEditor - Label dependencies, parts of speech, named entities, and text categories
  • ML-Annotate - Supports binary, multi-label and multi-class labeling of text

Closed Source

  • Hive - Text and image annotation service that helps you create training datasets
  • Figure Eight - Supports audio, computer vision, natural language processing, and other data tasks
  • LightTag - Text annotation tool for teams

Libraries

Audio

  • Muda - Python library for augmenting annotated audio data

Text

  • DataProfiler - A Python library to facilitate data analysis, monitoring, and data identification