• Stars
    star
    3
  • Rank 3,941,167 (Top 79 %)
  • Language
    Python
  • Created over 1 year ago
  • Updated over 1 year ago

Repository Details

More Repositories

1

MinerU

A one-stop, open-source, high-quality data extraction tool that supports PDF, webpage, and multi-format e-book extraction.
Python
10,891
star
2

PDF-Extract-Kit

A Comprehensive Toolkit for High-Quality PDF Content Extraction
Python
4,609
star
3

labelU

Data annotation toolbox that supports image, audio, and video data.
Python
747
star
4

WanJuan1.0

WanJuan 1.0 multimodal corpus
525
star
5

LabelLLM

The Open-Source Data Annotation Platform
TypeScript
384
star
6

VIGC

AAAI 2024: Visual Instruction Generation and Correction
Python
73
star
7

opendatalab-datasets

datasets resource
65
star
8

labelU-Kit

Data annotation component library, provided as NPM packages
TypeScript
53
star
9

opendatalab-python-sdk

SDK of OpenDataLab - https://opendatalab.org.cn
Python
52
star
10

CLIP-Parrot-Bias

ECCV 2024: Parrot Captions Teach CLIP to Spot Text
Python
52
star
11

magic-doc

Python
49
star
12

dsdl-docs

Data Set Description Language (DSDL) Specification, a next-generation description language for AI datasets
HTML
43
star
13

H2RSVLM

H2RSVLM: Towards Helpful and Honest Remote Sensing Large Vision Language Model
37
star
14

MLLM-DataEngine

MLLM-DataEngine: An Iterative Refinement Approach for MLLM
Python
27
star
15

image-downloader

Python
24
star
16

magic-html

Python
20
star
17

dsdl-sdk

Jupyter Notebook
13
star
18

labelU-frontend

LabelU front-end library
TypeScript
7
star
19

allz

A universal command line tool for compression and decompression
Python
4
star
20

HA-DPO

Beyond Hallucinations: Enhancing LVLMs through Hallucination-Aware Direct Preference Optimization
Python
2
star
21

MLS-BRN

[CVPR 2024] 3D Building Reconstruction from Monocular Remote Sensing Images with Multi-level Supervisions
1
star
22

Miner-PDF-Benchmark

MPB (Miner-PDF-Benchmark) is an end-to-end PDF document comprehension evaluation suite designed for large-scale model data scenarios.
Python
1
star
23

labelU-ML

Python
1
star
24

s3_browser

A Streamlit-based tool for viewing S3 storage contents online.
Python
1
star