This repository collects research papers and valuable open-source resources on knowledge-driven autonomous driving (AD). It will be continuously updated to track the frontier of knowledge-driven AD.
🌟 You are welcome to star this repository and contribute via pull requests (PRs)! 🌟
- [2023.12.08] New: We released the survey 'Towards Knowledge-driven Autonomous Driving'!
- [2023.10.24] New: We released the awesome knowledge-driven AD repository!
The autonomous driving community has seen substantial growth in approaches that embrace a knowledge-driven paradigm. Here, we examine knowledge-driven autonomous driving: its motivations, components, challenges, and prospects. More details can be found in our paper.
Key components in knowledge-driven AD.
Knowledge-aug. Dataset | Sensors | Knowledge Form | Tasks | Metrics |
---|---|---|---|---|
BDD-X | C | Explanation | Vehicle Control, Explanation Generation, Scene Captioning | MAE, MDC, BLEU-4, METEOR, CIDEr-D |
Cityscapes-Ref | C | Object Referral, Gaze Heatmap | Object Referring | Acc@1 |
DR(eye)VE | C | Gaze Heatmap | Gaze Prediction | CC, KLD, IG |
HAD | C | Advice | Vehicle Control | MAE, MDC |
Talk2Car | C+L+R | Object Referral | Object Referring | [email protected] |
DADA-2000 | C | Gaze Heatmap, Crash Objects, Accident Window | Gaze Prediction | CC, KLD, NSS, SIM |
HDBD | C | Gaze Heatmap, Takeover Intention | Driver Takeover Detection | AUC |
Refer-KITTI | C+L | Object Referral | Object Referring, Object Tracking | HOTA |
DRAMA | C | Advice, Risk Localization | Motion Planning | L2 Error, Collision Rate |
Rank2Tell | C+L | Object Referral, Importance Ranking | Importance Estimation, Scene Captioning | F1 Score, Accuracy, BLEU-4, METEOR, ROUGE, CIDEr |
DriveLM | C | Scene Captioning, Question Answering | Scene Captioning, Question Answering, Vehicle Control | ADE, FDE, Accuracy, Collision Rate, SPICE, GPT-Score |
NuScenes-QA | C+L+R | Question Answering | Question Answering | Exist, Count, Object, Status, Comparison, Acc |
DESIGN | C+L+R | Scene Captioning, Question Answering | Question Answering, Motion Planning | BLEU-4, METEOR, ROUGE, L2 Error, Collision Rate |
Reason2Drive | C+L | Question Answering | Question Answering | BLEU-4, METEOR, ROUGE, CIDEr |
NuScenes-MQA | C+L+R | Question Answering | Question Answering | BLEU-4, METEOR, ROUGE |
LangAuto | C+L | Navigation Instructions, Notice Instructions | Vehicle Control | RC, IS, DS |
DriveMLM | C+L | Question Answering, User Instructions | Vehicle Control, Decision Explanation | RC, IS, DS, BLEU-4, METEOR, CIDEr |
NuInstruct | C | Scene-, Frame-, Ego-, Instance Information, Question Answering | Question Answering, Scene Captioning | MAE, Accuracy, BLEU-4, mAP |
- UniSim: A Neural Closed-Loop Sensor Simulator [CVPR 2023, Project]
- NuPlan: A closed-loop ML-based planning benchmark for autonomous vehicles [arxiv 2023, Github]
- DrivingDiffusion: Layout-Guided multi-view driving scene video generation with latent diffusion model [arxiv 2023, Project]
- OccWorld: Learning a 3D Occupancy World Model for Autonomous Driving [arxiv 2023, Project]
- ADriver-I: A General World Model for Autonomous Driving [arxiv 2023]
- Driving into the Future: Multiview Visual Forecasting and Planning with World Model for Autonomous Driving [arxiv 2023, Project, Github]
- WoVoGen: World Volume-aware Diffusion for Controllable Multi-camera Driving Scene Generation [arxiv 2023, Github]
- DriveDreamer: Towards Real-world-driven World Models for Autonomous Driving [arxiv 2023]
- MagicDrive: Street View Generation with Diverse 3D Geometry Control [arxiv 2023]
- GAIA-1: A Generative World Model for Autonomous Driving [arxiv 2023]
- Waymax: An Accelerated, Data-Driven Simulator for Large-Scale Autonomous Driving Research [NeurIPS 2023, Github]
- MUVO: A Multimodal Generative World Model for Autonomous Driving with Geometric Representations [arxiv 2023]
- Natural-language-driven Simulation Benchmark and Copilot for Efficient Production of Object Interactions in Virtual Road Scenes [arxiv 2023]
- LaMPilot: An Open Benchmark Dataset for Autonomous Driving with Language Model Programs [arxiv 2023]
- DrivingGaussian: Composite Gaussian Splatting for Surrounding Dynamic Autonomous Driving Scenes [arxiv 2023]
- OccNeRF: Self-Supervised Multi-Camera Occupancy Prediction with Neural Radiance Fields [arxiv 2023]
- Neural Lighting Simulation for Urban Scenes [NeurIPS 2023, Project]
- Street Gaussians for Modeling Dynamic Urban Scenes [arxiv 2024, Github, Project]
- Panacea: Panoramic and Controllable Video Generation for Autonomous Driving [arxiv 2023, Github, Project]
- LimSim++: A Closed-Loop Platform for Deploying Multimodal LLMs in Autonomous Driving [arxiv 2024, Github, Project]
- ChatSim: Editable Scene Simulation for Autonomous Driving via LLM-Agent Collaboration [arxiv 2024, Github, Project]
- Neural Rendering based Urban Scene Reconstruction for Autonomous Driving [arxiv 2024]
- OASim: an Open and Adaptive Simulator based on Neural Rendering for Autonomous Driving [arxiv 2024, Github, Project]
- Grounding human-to-vehicle advice for self-driving vehicles [CVPR 2019]
- ADAPT: Action-aware Driving Caption Transformer [ICRA 2023, Github]
- Talk to the Vehicle: Language Conditioned Autonomous Navigation of Self Driving Cars [IROS 2019]
- Talk2Car: Taking Control of Your Self-Driving Car [EMNLP-IJCNLP 2019, Project]
- Textual explanations for self-driving vehicles [ECCV 2018, Github]
- Drive Like a Human: Rethinking Autonomous Driving with Large Language Models [arxiv 2023, Github]
- DriveGPT4: Interpretable End-to-end Autonomous Driving via Large Language Model [arxiv 2023, Project]
- DiLu: A Knowledge-Driven Approach to Autonomous Driving with Large Language Models [ICLR 2024, Github]
- GPT-Driver: Learning to Drive with GPT [arxiv 2023, Github]
- Driving with LLMs: Fusing Object-Level Vector Modality for Explainable Autonomous Driving [arxiv 2023, Github]
- LanguageMPC: Large Language Models as Decision Makers for Autonomous Driving [arxiv 2023, Project]
- Receive, Reason, and React: Drive as You Say with Large Language Models in Autonomous Vehicles [arxiv 2023]
- Drive as You Speak: Enabling Human-Like Interaction with Large Language Models in Autonomous Vehicles [arxiv 2023]
- SurrealDriver: Designing Generative Driver Agent Simulation Framework in Urban Contexts based on Large Language Model [arxiv 2023]
- Language-Guided Traffic Simulation via Scene-Level Diffusion [arxiv 2023]
- Language Prompt for Autonomous Driving [arxiv 2023, Github]
- Talk2BEV: Language-Enhanced Bird's Eye View (BEV) Maps [arxiv 2023, Project, Github]
- BEVGPT: Generative Pre-trained Large Model for Autonomous Driving Prediction, Decision-Making, and Planning [AAAI 2024]
- HiLM-D: Towards High-Resolution Understanding in Multimodal Large Language Models for Autonomous Driving [arxiv 2023]
- Can you text what is happening? Integrating pre-trained language encoders into trajectory prediction models for autonomous driving [arxiv 2023]
- OpenAnnotate3D: Open-Vocabulary Auto-Labeling System for Multi-modal 3D Data [arxiv 2023, Github]
- LangProp: A Code Optimization Framework Using Language Models Applied to Driving [arxiv 2024, Github]
- Learning Unsupervised World Models for Autonomous Driving via Discrete Diffusion [openreview 2023]
- Planning with an Ensemble of World Models [openreview 2023]
- Large Language Models Can Design Game-Theoretic Objectives for Multi-Agent Planning [openreview 2023]
- TrafficBots: Towards World Models for Autonomous Driving Simulation and Motion Prediction [arxiv 2023]
- BEV-CLIP: Multi-Modal BEV Retrieval Methodology for Complex Scene in Autonomous Driving [arxiv 2023]
- Semantic Anomaly Detection with Large Language Models [arxiv 2023]
- Driving through the Concept Gridlock: Unraveling Explainability Bottlenecks in Automated Driving [arxiv 2023]
- DRAMA: Joint risk localization and captioning in driving [WACV 2023]
- 3D Dense Captioning Beyond Nouns: A Middleware for Autonomous Driving [openreview 2023]
- SwapTransformer: Highway Overtaking Tactical Planner Model via Imitation Learning on OSHA Dataset [openreview 2023]
- NuScenes-QA: A Multi-modal Visual Question Answering Benchmark for Autonomous Driving Scenario [arxiv 2023, Github]
- Drive Anywhere: Generalizable End-to-end Autonomous Driving with Multi-modal Foundation Models [arxiv 2023]
- Addressing Limitations of State-Aware Imitation Learning for Autonomous Driving [arxiv 2023]
- A Language Agent for Autonomous Driving [arxiv 2023]
- Human-Centric Autonomous Systems With LLMs for User Command Reasoning [WACVW 2024]
- On the Road with GPT-4V(ision): Early Explorations of Visual-Language Model on Autonomous Driving [arxiv 2023]
- Reason2Drive: Towards Interpretable and Chain-based Reasoning for Autonomous Driving [arxiv 2023, Github]
- GPT-4 Enhanced Multimodal Grounding for Autonomous Driving: Leveraging Cross-Modal Attention with Large Language Models [arxiv 2023, Github]
- ChatGPT as Your Vehicle Co-Pilot: An Initial Attempt [IEEE TIV 2023]
- DriveLLM: Charting The Path Toward Full Autonomous Driving with Large Language Models [IEEE TIV 2023]
- NuScenes-MQA: Integrated Evaluation of Captions and QA for Autonomous Driving Datasets using Markup Annotations [WACVW 2024, Github]
- Evaluation of Large Language Models for Decision Making in Autonomous Driving [arxiv 2023]
- LMDrive: Closed-Loop End-to-End Driving with Large Language Models [arxiv 2023, Github]
- DriveMLM: Aligning Multi-Modal Large Language Models with Behavioral Planning States for Autonomous Driving [arxiv 2023, Github]
- Large Language Models for Autonomous Driving: Real-World Experiments [arxiv 2023]
- LingoQA: Video Question Answering for Autonomous Driving [arxiv 2023, Github]
- DriveLM: Driving with Graph Visual Question Answering [arxiv 2023, Github]
- LLM-Assist: Enhancing Closed-Loop Planning with Language-Based Reasoning [arxiv 2024, Project]
- Holistic Autonomous Driving Understanding by Bird’s-Eye-View Injected Multi-Modal Large Models [arxiv 2024, Github]
- BEV-CLIP: Multi-modal BEV Retrieval Methodology for Complex Scene in Autonomous Driving [arxiv 2024]
- DME-Driver: Integrating Human Decision Logic and 3D Scene Perception in Autonomous Driving [arxiv 2024]
- VLP: Vision Language Planning for Autonomous Driving [arxiv 2024]
- Driving Everywhere with Large Language Model Policy Adaptation [arxiv 2024, Github, Project]
- RAG-Driver: Generalisable Driving Explanations with Retrieval-Augmented In-Context Learning in Multi-Modal Large Language Model [arxiv 2024, Github, Project]
- DriveVLM: The Convergence of Autonomous Driving and Large Vision-Language Models [arxiv 2024, Project]
- Applications of Large Scale Foundation Models for Autonomous Driving [arxiv 2023]
- A Survey on Multimodal Large Language Models for Autonomous Driving [arxiv 2023]
- A Survey of Large Language Models for Autonomous Driving [arxiv 2023]
- Vision Language Models in Autonomous Driving and Intelligent Transportation Systems [arxiv 2023]
- Choose Your Simulator Wisely: A Review on Open-source Simulators for Autonomous Driving [arxiv 2023]
- Towards Knowledge-driven Autonomous Driving [arxiv 2023]
- Forging Vision Foundation Models for Autonomous Driving: Challenges, Methodologies, and Opportunities [arxiv 2024]
- A Survey for Foundation Models in Autonomous Driving [arxiv 2024]
- [WACV2024 Workshop] MAPLM: A Large-Scale Vision-Language Dataset for Map and Traffic Scene Understanding
- [Blog] LINGO-1: Exploring Natural Language for Autonomous Driving
- [Blog] Introducing GAIA-1: A Cutting-Edge Generative AI Model for Autonomy
- [Blog] Ghost Gym: A Neural Simulator for Autonomous Driving
If you find our paper useful, please cite it as:
@article{li2023knowledgedriven,
  title={Towards Knowledge-driven Autonomous Driving},
  author={Li, Xin and Bai, Yeqi and Cai, Pinlong and Wen, Licheng and Fu, Daocheng and Zhang, Bo and Yang, Xuemeng and Cai, Xinyu and Ma, Tao and Guo, Jianfei and Gao, Xing and Dou, Min and Shi, Botian and Liu, Yong and He, Liang and Qiao, Yu},
  journal={arXiv preprint arXiv:2312.04316},
  year={2023}
}
Awesome Knowledge-driven Autonomous Driving is released under the Apache 2.0 license.