• This repository has been archived on 19/Aug/2023
  • Stars: 825
  • Rank: 55,281 (Top 2 %)
  • Language: Python
  • License: Other
  • Created over 1 year ago
  • Updated over 1 year ago

Repository Details

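🌟 ChatGenTitle: a paper-title generation model fine-tuned from LLaMA on metadata from millions of arXiv papers

Note

ChatPaper has summarized 50,000 top-conference papers from the past five years to help your research go more smoothly: https://chatpaper.org/

A new medical question-answering large model fine-tuned from ChatGLM has been released: https://github.com/WangRongsheng/MedQA-ChatGLM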

One-stop service / Simple / Fast / Efficient / Intelligent
Video tutorial / Installation and deployment / Online demo

News

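  • 🎉🎉 The training dataset is available at Cornell-University/arxiv and can be used directly;
  • 🎉🎉 All models are open-sourced on HuggingFace and can be used directly;
  • 🎉🎉 Anyone can try ChatGenTitle online for free (Open In Colab);
  • 🎉🎉 Because we lack GPU compute resources, we have released all code and weights of the online deployment version, which can be deployed in any environment;
  • 🎉🎉 A large amount of LLM-related work appears on arXiv every day. This repository automatically pushes 30 LLM-related papers daily for everyone to study (click to read today's LLM papers);
  • 🎉🎉 Officially released the LoRA model weights LLaMa-Lora-7B-3 and LLaMa-Lora-7B-3-new, which can be deployed locally;
  • 🎉🎉 Finished fine-tuning the LLaMa-Lora-7B-3 and LLaMa-Lora-13B-3 models on top of alpaca-lora;
  • 🎉🎉 Started a long-running task that periodically crawls cs.AI, cs.CV, and cs.LG papers from arXiv, in order to support research in CS-related fields;
  • 🎉🎉 Compiled metadata for more than 2.2 million arXiv papers. The metadata includes title and abstract, plus id, submitter, authors, comments, journal-ref, doi, categories, and versions.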
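
The metadata fields listed above can be turned into title-generation training pairs roughly as follows. This is a minimal sketch, not the project's actual preprocessing: it assumes the metadata is stored as one JSON object per line, that `categories` is a space-separated string, and that training records use an Alpaca-style `instruction` / `input` / `output` layout. The two sample records and the helper names (`normalize`, `to_examples`) are hypothetical; the prompt string is the one quoted in the Comparison section.

```python
import json

# Prompt quoted in the Comparison section of this README.
PROMPT = ("If you are an expert in writing papers, please generate a good paper title "
          "for this paper based on other authors' descriptions of their abstracts.")

def normalize(text):
    """Collapse hard line breaks and repeated spaces found in raw metadata."""
    return " ".join(text.split())

def to_examples(lines, wanted_prefix=None):
    """Turn JSON-lines metadata into instruction/input/output records.

    wanted_prefix (e.g. "cs.") keeps only papers with at least one matching category.
    The record layout is an assumption, not the project's documented format.
    """
    examples = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        record = json.loads(line)
        categories = record.get("categories", "").split()
        if wanted_prefix and not any(c.startswith(wanted_prefix) for c in categories):
            continue
        examples.append({
            "instruction": PROMPT,
            "input": normalize(record["abstract"]),
            "output": normalize(record["title"]),
        })
    return examples

# Hypothetical records, serialized one JSON object per line.
sample_lines = [
    json.dumps({"id": "0000.00001", "title": "A Toy  Title\n  About Graphs",
                "abstract": "  We study a toy problem\non graphs.\n",
                "categories": "cs.LG cs.AI"}),
    json.dumps({"id": "0000.00002", "title": "Another Toy Title",
                "abstract": "A purely mathematical note.",
                "categories": "math.CO"}),
]

cs_examples = to_examples(sample_lines, wanted_prefix="cs.")
print(len(cs_examples))          # 1
print(cs_examples[0]["output"])  # A Toy Title About Graphs
```

Filtering by category prefix mirrors the split between the "all" and "cs" training sets named in the Release table, although the exact selection rule used there is not stated.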

TODO

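  • Finish the tutorial on fine-tuning large models with LoRA. Enjoy it!!!
  • Release arXiv (coming soon...)
  • Complete the comparison of ChatGenTitle, ChatGPT, and GPT4
  • Release the online version, LLaMa-Lora-7B-cs-6-new-app (Open In Colab)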

Release

Note

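The LLaMA model released by Meta may not be used commercially, so what we open-source here are the LoRA weights. A LoRA model only works together with the matching version of the LLaMA model; for details, see the "Merging models" (合并模型) section of Chinese-LLaMA-Alpaca.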

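Model name | Fine-tuning data | Base model | Model size | Fine-tuning time | Results
LLaMa-Lora-7B-3 | arXiv-50-all | LLaMa-7B | -MB | 9 hours | Click to view
LLaMa-Lora-7B-3-new | arXiv-50-all | LLaMa-7B | -MB | 12.5 hours | Click to view
LLaMa-Lora-7B-cs-3-new | arXiv-cs | LLaMa-7B | -MB | 20.5 hours | Click to view
LLaMa-Lora-7B-cs-6-new | arXiv-cs | LLaMa-7B | -MB | 34 hours | Click to view
LLaMa-Lora-13B-3 | arXiv-100-all | LLaMa-13B | -MB | 26 hours | Click to view

Training setup

* Experiments were run on A100 GPUs (4X, 80GB)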

Comparison

Titles produced by each model for the same prompt and abstract (the prompts and abstracts are listed after the results):

Prompt ① / Abstract ①
  • Original title: Focus-RCNet: A lightweight recyclable waste classification algorithm based on Focus and knowledge distillation
  • ChatGenTitle: Focus-RCNet: A Lightweight Convolutional Neural Network for Recyclable Waste Image Classification
  • ChatGPT (GPT3.5): Focus-RCNet: A lightweight deep learning model for automated waste classification with enhanced recyclable waste image feature recognition
  • GPT4: Efficient Waste Classification with Focus-RCNet: A Lightweight Deep Learning Architecture Employing Sandglass Structure, SimAM Attention Mechanism, and Knowledge Distillation for Real-Time Embedded Applications
  • ChatGLM (130B): exceeded the token length limit

Prompt ② / Abstract ②
  • Original title: ShuffleNet: An Extremely Efficient Convolutional Neural Network for Mobile Devices
  • ChatGenTitle: ShuffleNet: Efficient Convolutional Neural Networks for Mobile Devices
  • ChatGPT (GPT3.5): ShuffleNet: A Computation-Efficient CNN Architecture for Mobile Devices with Superior Performance in Object Detection and ImageNet Classification while Maintaining Accuracy
  • GPT4: ShuffleNet: A Computationally Efficient CNN Architecture for Mobile Devices with Enhanced Performance in ImageNet Classification and MS COCO Object Detection
  • ChatGLM (130B): ShuffleNet: An Extremely Computation-Efficient CNN Architecture for Mobile Devices

Prompt ③ / Abstract ③
  • Original title: Segment Anything
  • ChatGenTitle: Segment Anything
  • ChatGPT (GPT3.5): Segment Anything: Introducing a New Task, Model, and Dataset for Promptable Image Segmentation with Superior Zero-Shot Performance
  • GPT4: Exploring the Segment Anything Project: A Promptable Image Segmentation Model and Extensive Dataset with Impressive Zero-Shot Performance
  • ChatGLM (130B): Segment Anything (SA) Project: A New Task, Model, and Dataset for Image Segmentation

1. Prompt ① and Abstract ①
  • Prompt ①: If you are an expert in writing papers, please generate a good paper title for this paper based on other authors' descriptions of their abstracts.
  • Abstract ①: Waste pollution is one of the most important environmental problems in the modern world. With the continuous improvement of the living standard of the population and the increasing richness of the consumption structure, the amount of domestic waste generated has increased dramatically and there is an urgent need for further waste treatment of waste. The rapid development of artificial intelligence provides an effective solution for automated waste classification. However, the large computational power and high complexity of algorithms make convolutional neural networks (CNNs) unsuitable for real-time embedded applications. In this paper, we propose a lightweight network architecture, Focus-RCNet, designed with reference to the sandglass structure of MobileNetV2, which uses deeply separable convolution to extract features from images. The Focus module is introduced into the field of recyclable waste image classification to reduce the dimensionality of features while retaining relevant information. In order to make the model focus more on waste image features while keeping the amount of parameters computationally small, we introduce the SimAM attention mechanism. Additionally, knowledge distillation is used to further compress the number of parameters in the model. By training and testing on the TrashNet dataset, the Focus-RCNet model not only achieves an accuracy of 92%, but also has high mobility of deployment.
2. Prompt ② and Abstract ②
  • Prompt ②: If you are an expert in writing papers, please generate a good paper title for this paper based on other authors' descriptions of their abstracts.
  • Abstract ②: We introduce an extremely computation-efficient CNN architecture named ShuffleNet, which is designed specially for mobile devices with very limited computing power (e.g., 10-150 MFLOPs). The new architecture utilizes two new operations, pointwise group convolution and channel shuffle, to greatly reduce computation cost while maintaining accuracy. Experiments on ImageNet classification and MS COCO object detection demonstrate the superior performance of ShuffleNet over other structures, e.g. lower top-1 error (absolute 7.8%) than recent MobileNet on ImageNet classification task, under the computation budget of 40 MFLOPs. On an ARM-based mobile device, ShuffleNet achieves ~13x actual speedup over AlexNet while maintaining comparable accuracy.
3. Prompt ③ and Abstract ③
  • Prompt ③: If you are an expert in writing papers, please generate a good paper title for this paper based on other authors' descriptions of their abstracts.
  • Abstract ③: We introduce the Segment Anything (SA) project: a new task, model, and dataset for image segmentation. Using our efficient model in a data collection loop, we built the largest segmentation dataset to date (by far), with over 1 billion masks on 11M licensed and privacy respecting images. The model is designed and trained to be promptable, so it can transfer zero-shot to new image distributions and tasks. We evaluate its capabilities on numerous tasks and find that its zero-shot performance is impressive -- often competitive with or even superior to prior fully supervised results. We are releasing the Segment Anything Model (SAM) and corresponding dataset (SA-1B) of 1B masks and 11M images.
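
A prompt and abstract pair like the ones above has to be wrapped into a single text prompt before it reaches the model. The sketch below uses the standard Alpaca instruction-with-input template. It is an assumption that ChatGenTitle uses exactly this template (the News section only says the models were fine-tuned on top of alpaca-lora), and `build_prompt` and `extract_title` are hypothetical helper names. No model is called: a hand-written string stands in for the generated text.

```python
# Standard Alpaca template for examples that carry an input field.
ALPACA_TEMPLATE = (
    "Below is an instruction that describes a task, paired with an input that "
    "provides further context. Write a response that appropriately completes "
    "the request.\n\n"
    "### Instruction:\n{instruction}\n\n"
    "### Input:\n{input}\n\n"
    "### Response:\n"
)

PROMPT = ("If you are an expert in writing papers, please generate a good paper title "
          "for this paper based on other authors' descriptions of their abstracts.")

def build_prompt(instruction, abstract):
    """Fill the template with the instruction and the abstract."""
    return ALPACA_TEMPLATE.format(instruction=instruction.strip(), input=abstract.strip())

def extract_title(generated_text):
    """Keep only the first line after the last '### Response:' marker."""
    answer = generated_text.split("### Response:")[-1].strip()
    return answer.splitlines()[0].strip() if answer else ""

abstract = ("We introduce the Segment Anything (SA) project: a new task, model, "
            "and dataset for image segmentation.")
prompt = build_prompt(PROMPT, abstract)

# Stand-in for model output: causal LMs typically echo the prompt before the answer.
simulated_output = prompt + "Segment Anything\n"
print(extract_title(simulated_output))  # Segment Anything
```

Taking only the first line after the response marker is a simple guard against a model that keeps generating past the title.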

Reference

Note

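Times move on, and so do large language models (LLMs), so come back every day to read 30 of the latest LLM papers and keep your knowledge from falling behind!

👉👉👉 View today's LLM papers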

Knowledge

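1. About Instruct fine-tuning and LoRA fine-tuning

Instruct fine-tuning and LoRA fine-tuning are two different techniques. Fine-tuning in general means taking a pretrained model as the base and continuing to adjust its parameters on a new dataset to improve its performance. Instruct fine-tuning, as the term is used here, updates all parameters of the pretrained model so that it can serve multiple downstream applications. LoRA (Low-Rank Adaptation) instead freezes the pretrained weights and injects small trainable low-rank layers into each Transformer block. Because gradients do not need to be computed for most of the model weights, the number of trainable parameters drops sharply and so does the GPU memory requirement. Studies have found that the quality of LoRA fine-tuning is comparable to full-model fine-tuning, while being faster and requiring less compute. If you have low-latency or low-memory requirements, LoRA fine-tuning is therefore the recommended choice.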
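
To make the parameter savings concrete, here is a small pure-Python sketch of a LoRA layer. For a frozen weight matrix W of shape d_out × d_in, LoRA trains two small matrices A (r × d_in) and B (d_out × r) and computes y = W x + (alpha / r) · B (A x). The rank r = 8 and alpha = 16 are illustrative assumptions, not this project's confirmed settings; 4096 is the hidden size of LLaMA-7B. The function names are hypothetical.

```python
import random

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def lora_param_counts(d_out, d_in, rank):
    """Trainable parameters: full fine-tuning of W versus a rank-r LoRA adapter."""
    full = d_out * d_in
    lora = rank * (d_in + d_out)
    return full, lora

def lora_forward(W, A, B, x, alpha, rank):
    """y = W x + (alpha / rank) * B (A x), with x as a column vector."""
    base = matmul(W, x)
    delta = matmul(B, matmul(A, x))
    scale = alpha / rank
    return [[b[0] + scale * d[0]] for b, d in zip(base, delta)]

# One 4096 x 4096 projection with an assumed rank of 8.
full, lora = lora_param_counts(4096, 4096, 8)
print(full, lora, f"{lora / full:.2%}")  # 16777216 65536 0.39%

# Tiny example: B starts at zero, so the adapter initially leaves the output unchanged.
random.seed(0)
d_out, d_in, rank = 3, 4, 2
W = [[random.uniform(-1, 1) for _ in range(d_in)] for _ in range(d_out)]  # frozen
A = [[random.gauss(0, 0.01) for _ in range(d_in)] for _ in range(rank)]   # trainable
B = [[0.0] * rank for _ in range(d_out)]                                  # trainable
x = [[1.0], [2.0], [3.0], [4.0]]
print(lora_forward(W, A, B, x, alpha=16, rank=rank) == matmul(W, x))  # True
```

Only A and B receive gradients during training, which is where the memory saving described above comes from.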

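2. Why are there two kinds of models, LLaMA and LoRA?

As described in item 1, a model can be fine-tuned in many ways. LoRA-based fine-tuning produces and saves a new, separate set of weights, which can be thought of as a patch on top of the original LLaMA weights. The LLaMA weights themselves are the pretrained large-model weights released by Meta.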
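
The "patch" view can be illustrated by merging an adapter into its base matrix: W_merged = W + (alpha / r) · B A. The sketch below uses tiny hand-picked integer matrices so the result is easy to verify by hand. It shows the general LoRA merge rule, not the specific merge script referenced in the Release note, and `merge_lora` is a hypothetical name.

```python
def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def merge_lora(W, A, B, alpha, rank):
    """Return W + (alpha / rank) * B A, i.e. the base weights with the patch applied."""
    scale = alpha / rank
    BA = matmul(B, A)
    return [[w + scale * d for w, d in zip(w_row, d_row)] for w_row, d_row in zip(W, BA)]

# Base weights (2 x 2) and a rank-1 adapter: A is 1 x 2, B is 2 x 1.
W = [[1, 2], [3, 4]]
A = [[1, 0]]
B = [[2], [1]]
alpha, rank = 2, 1

merged = merge_lora(W, A, B, alpha, rank)
print(merged)  # [[5.0, 2.0], [5.0, 4.0]]

# The merged matrix gives the same output as running base and adapter separately.
x = [[1], [1]]
separate = [[b[0] + (alpha / rank) * d[0]]
            for b, d in zip(matmul(W, x), matmul(B, matmul(A, x)))]
print(matmul(merged, x), separate)  # [[7.0], [9.0]] [[7.0], [9.0]]
```

This is also why the LoRA weights are useless on their own: without the matching W there is nothing to apply the patch to.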

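3. About vocabulary expansion

Adding tokens to the vocabulary is somewhat destructive: first, it disrupts the original tokenization scheme; second, it adds untrained weights. Without sufficient training, this can cause fairly serious problems. In my view, unless the domain is highly specialized (for example biomedicine, which involves a lot of specialist terminology), there is little need to expand the English vocabulary. Source: Chinese-LLaMA-Alpaca/issues/16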
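
Both effects can be seen in a toy example. The sketch below uses a greedy longest-match tokenizer over a hand-made vocabulary. This is a deliberate simplification and not the tokenizer LLaMA actually uses, and all names and values are hypothetical. Adding one token changes how an already-covered word is segmented, and the new token gets an embedding row that has never been trained.

```python
import random

def tokenize(text, vocab):
    """Greedy longest-match segmentation; unknown characters become single-char tokens."""
    tokens, i = [], 0
    max_len = max(len(t) for t in vocab)
    while i < len(text):
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            if piece in vocab or size == 1:
                tokens.append(piece)
                i += size
                break
    return tokens

base_vocab = {"seg", "ment", "ation", "s", "e", "g", "m", "n", "t", "a", "i", "o"}
expanded_vocab = base_vocab | {"segment"}

# Effect 1: the segmentation of existing text changes.
print(tokenize("segmentation", base_vocab))      # ['seg', 'ment', 'ation']
print(tokenize("segmentation", expanded_vocab))  # ['segment', 'ation']

# Effect 2: every new token needs an embedding row that starts out untrained.
random.seed(0)
dim = 4
embeddings = {tok: [0.1] * dim for tok in base_vocab}  # stand-in for trained vectors
new_tokens = sorted(expanded_vocab - base_vocab)
for tok in new_tokens:
    embeddings[tok] = [random.gauss(0.0, 0.02) for _ in range(dim)]  # random init
print(new_tokens)  # ['segment']
```

After the expansion, the model sees 'segment' as a token it has no learned representation for, where before it saw the familiar 'seg' and 'ment'.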

LICENSE

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.

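Usage and license notice: ChatGenTitle may only be used for research, and only with permission. It is intended solely for scientific research and must not be used in actual paper writing; otherwise, the user bears all resulting consequences!!!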

CC BY-NC-SA 4.0


Go for it!

Feel free to ask any questions, open a PR if you feel something can be done differently!

🌟 Star this repository 🌟

Created by WangRongsheng
