MinerU
A one-stop, open-source, high-quality data extraction tool that supports PDF, webpage, and multi-format e-book extraction.
PDF-Extract-Kit
A Comprehensive Toolkit for High-Quality PDF Content Extraction
labelU
A data annotation toolbox that supports image, audio, and video data.
WanJuan1.0
WanJuan 1.0 multimodal corpus
LabelLLM
The Open-Source Data Annotation Platform
magic-doc
magic-html
UniMERNet
UniMERNet: A Universal Network for Real-World Mathematical Expression Recognition
VIGC
AAAI 2024: Visual Instruction Generation and Correction
CLIP-Parrot-Bias
ECCV2024_Parrot Captions Teach CLIP to Spot Text
opendatalab-python-sdk
SDK of OpenDataLab - https://opendatalab.org.cn
labelU-Kit
Data annotation component library, provided as NPM packages
H2RSVLM
H2RSVLM: Towards Helpful and Honest Remote Sensing Large Vision Language Model
dsdl-docs
Data Set Description Language Specification (DSDL, a next-generation AI dataset description language)
MLLM-DataEngine
MLLM-DataEngine: An Iterative Refinement Approach for MLLM
image-downloader
dsdl-sdk
labelU-frontend
LabelU front-end library
allz
A universal command line tool for compression and decompression
laion5b-downloader
HA-DPO
Beyond Hallucinations: Enhancing LVLMs through Hallucination-Aware Direct Preference Optimization
MLS-BRN
[CVPR 2024] 3D Building Reconstruction from Monocular Remote Sensing Images with Multi-level Supervisions
Miner-PDF-Benchmark
MPB (Miner-PDF-Benchmark) is an end-to-end PDF document comprehension evaluation suite designed for large-scale model data scenarios.
labelU-ML
s3_browser
A tool built with Streamlit for browsing S3 storage contents online.