• Stars: 669
• Rank: 64,937 (top 2%)
• Language: C++
• License: Apache License 2.0
• Created: about 1 year ago
• Updated: about 2 months ago

Repository Details

ChatDBG - AI-assisted debugging. Uses AI to answer 'why'

ChatDBG

by Emery Berger

ChatDBG is an experimental debugger for Python and native C/C++ code that integrates large language models into standard debuggers (pdb, lldb, and gdb) to help you debug your code. With ChatDBG, you can ask your debugger "why" your program failed, and it will explain the likely root cause and suggest a fix.

As far as we are aware, ChatDBG is the first debugger to automatically perform root cause analysis and to provide suggested fixes. This is an alpha release; we greatly welcome feedback and suggestions!

Installation

NOTE: To use ChatDBG, you must first set up an OpenAI API key. If you already have an API key, you can set it as an environment variable called OPENAI_API_KEY. If you do not have one yet, you can get a key here: https://platform.openai.com/account/api-keys

export OPENAI_API_KEY=<your-api-key>
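Before starting a debugging session, it can help to confirm that the key is actually visible to Python. The following is a minimal sketch, not part of ChatDBG itself; the helper name `openai_key_is_set` is hypothetical, and it only checks that the variable is non-empty, not that the key is valid.

```python
import os


def openai_key_is_set(env=None):
    """Return True if OPENAI_API_KEY is present and non-empty.

    `env` defaults to the process environment; passing a dict makes the
    check easy to exercise without touching real environment variables.
    """
    if env is None:
        env = os.environ
    return bool(env.get("OPENAI_API_KEY", "").strip())


if __name__ == "__main__":
    if openai_key_is_set():
        print("OPENAI_API_KEY is set")
    else:
        print("OPENAI_API_KEY is missing; export it before running ChatDBG")
```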

Install ChatDBG using pip (you need to do this whether you are debugging Python, C, or C++ code):

python3 -m pip install chatdbg

If you are using ChatDBG to debug Python programs, you are done. If you want to use ChatDBG to debug native code with gdb or lldb, follow the installation instructions below.

Installing as an lldb extension

Install ChatDBG into the lldb debugger by running the commands for your platform:

Linux

python3 -m pip install ChatDBG
python3 -c 'import chatdbg; print(f"command script import {chatdbg.__path__[0]}/chatdbg_lldb.py")' >> ~/.lldbinit

Mac

xcrun python3 -m pip install ChatDBG
xcrun python3 -c 'import chatdbg; print(f"command script import {chatdbg.__path__[0]}/chatdbg_lldb.py")' >> ~/.lldbinit

This will install ChatDBG as an lldb extension.

Installing as a gdb extension

Install ChatDBG into the gdb debugger by running the following commands:

python3 -m pip install ChatDBG
python3 -c 'import chatdbg; print(f"source {chatdbg.__path__[0]}/chatdbg_gdb.py")' >> ~/.gdbinit

This will install ChatDBG as a GDB extension.
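The lldb and gdb one-liners above do the same thing: they locate the installed chatdbg package and append a single line to the debugger's init file. The sketch below spells that step out so you can preview the line before writing it. The helper names (`init_line`, `chatdbg_package_dir`) are hypothetical; the init commands and script names are taken from the one-liners above, and the package is located with `importlib.util.find_spec` rather than `chatdbg.__path__` so the script also runs when chatdbg is not installed.

```python
import importlib.util

# Debugger -> (init-file command, ChatDBG script name), as used in the
# installation one-liners above.
INIT_COMMANDS = {
    "lldb": ("command script import", "chatdbg_lldb.py"),
    "gdb": ("source", "chatdbg_gdb.py"),
}


def init_line(debugger, package_dir):
    """Build the line that would be appended to ~/.lldbinit or ~/.gdbinit."""
    command, script = INIT_COMMANDS[debugger]
    return f"{command} {package_dir}/{script}"


def chatdbg_package_dir():
    """Locate the installed chatdbg package without importing it.

    Returns None when the package is not installed.
    """
    spec = importlib.util.find_spec("chatdbg")
    if spec is None or not spec.submodule_search_locations:
        return None
    return list(spec.submodule_search_locations)[0]


if __name__ == "__main__":
    package_dir = chatdbg_package_dir()
    if package_dir is None:
        print("chatdbg is not installed; run: python3 -m pip install chatdbg")
    else:
        for debugger in INIT_COMMANDS:
            print(init_line(debugger, package_dir))
```

Printing the line first, instead of appending it directly, makes it easy to check the path and to avoid adding duplicate entries to the init file.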

Usage

Debugging Python

To use ChatDBG to debug Python programs, run your script through the chatdbg module using Python's -m flag:

python3 -m chatdbg -c continue yourscript.py

or just

chatdbg -c continue yourscript.py

ChatDBG is an extension of the standard Python debugger pdb. Like pdb, when your script encounters an uncaught exception, ChatDBG will enter post mortem debugging mode.

Unlike in other debuggers, you can then use the why command to ask ChatDBG why your program failed and get a suggested fix.
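To make "post mortem" concrete: after an uncaught exception, the traceback still holds the stack frames, so a debugger can inspect the state at the point of failure. The sketch below does this non-interactively with the standard library. It is an illustration of the general mechanism, not ChatDBG's implementation; `divide` and `innermost_frame_info` are hypothetical names.

```python
import traceback


def innermost_frame_info(exc):
    """Return (function name, line number, locals) for the frame that raised exc."""
    # walk_tb yields (frame, lineno) pairs from the outermost to the innermost frame.
    frame, lineno = list(traceback.walk_tb(exc.__traceback__))[-1]
    return frame.f_code.co_name, lineno, dict(frame.f_locals)


def divide(a, b):
    # Hypothetical failing function used only for this illustration.
    return a / b


if __name__ == "__main__":
    try:
        divide(1, 0)
    except ZeroDivisionError as exc:
        name, lineno, local_vars = innermost_frame_info(exc)
        print(f"failed in {name}() at line {lineno} with locals {local_vars}")
```

The interactive equivalent in plain pdb is `pdb.post_mortem(exc.__traceback__)`, which opens a debugger prompt on the same frame.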

ChatDBG example with Python
Traceback (most recent call last):
  File "yourscript.py", line 9, in <module>
    print(tryme(100))
  File "yourscript.py", line 4, in tryme
    if x / i > 2:
ZeroDivisionError: division by zero
Uncaught exception. Entering post mortem debugging
Running 'cont' or 'step' will restart the program
> yourscript.py(4)tryme()
-> if x / i > 2:
(ChatDBG Pdb) why

ChatDBG will then provide a helpful explanation of why your program failed and a suggested fix:

The root cause of the error is that the code is attempting to
divide by zero in the line "if x / i > 2". As i ranges from 0 to 99,
it will eventually reach the value of 0, causing a ZeroDivisionError.

A possible fix for this would be to add a check for i being equal to
zero before performing the division. This could be done by adding an
additional conditional statement, such as "if i == 0: continue", to
skip over the iteration when i is zero. The updated code would look
like this:

def tryme(x):
    count = 0
    for i in range(100):
        if i == 0:
            continue
        if x / i > 2:
            count += 1
    return count

if __name__ == '__main__':
    print(tryme(100))
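For reference, the failing script is not shown above, but it can be reconstructed from the traceback (the division on line 4, the call on line 9) and from the suggested fix. The sketch below places that reconstruction next to the fixed version under a different name, `tryme_fixed`, which is hypothetical, so the two can be compared directly.

```python
def tryme(x):
    """Reconstructed failing version: divides by i, which starts at 0."""
    count = 0
    for i in range(100):
        if x / i > 2:
            count += 1
    return count


def tryme_fixed(x):
    """Version with the suggested fix: skip the iteration where i is zero."""
    count = 0
    for i in range(100):
        if i == 0:
            continue
        if x / i > 2:
            count += 1
    return count


if __name__ == "__main__":
    try:
        tryme(100)
    except ZeroDivisionError as exc:
        print(f"original fails: {exc}")
    # 100 / i > 2 holds for i = 1..49, so the fixed version counts 49.
    print(tryme_fixed(100))  # prints 49
```

Note that the original fails on the very first iteration, since range(100) starts at 0.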

Debugging native code (lldb / gdb)

To use ChatDBG with lldb or gdb, just run native code (compiled with -g for debugging symbols) with your choice of debugger; when it crashes, ask why. This also works for post mortem debugging (when you load a core with the -c option).

Example of using why in lldb
(ChatDBG lldb) run
Process 85494 launched: '/Users/emery/git/ChatDBG/test/a.out' (arm64)
TEST 1
TEST -422761288
TEST 0
TEST 0
TEST 0
TEST 0
TEST 0
TEST 0
Process 85494 stopped
* thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=1, address=0x100056200)
    frame #0: 0x0000000100002f64 a.out`foo(n=8, b=1) at test.cpp:7:22
   4     int x[] = { 1, 2, 3, 4, 5 };
   5     
   6     void foo(int n, float b) {
-> 7       cout << "TEST " << x[n * 10000] << endl;
   8     }
   9     
   10    int main()
Target 0: (a.out) stopped.

Now you can ask why:

(ChatDBG lldb) why
The root cause of this error is accessing an index of the array `x`
that is out of bounds. In `foo()`, the index is calculated as `n *
10000`, which can be much larger than the size of the array `x` (which
is only 5 elements). In the given trace, the program is trying to
access the memory address `0x100056200`, which is outside of the range
of allocated memory for the array `x`.

To fix this error, we need to ensure that the index is within the
bounds of the array. One way to do this is to check the value of `n`
before calculating the index, and ensure that it is less than the size
of the array divided by the size of the element. For example, we can
modify `foo()` as follows:

    ```
    void foo(int n, float b) {
      if (n < 0 || n >= sizeof(x)/sizeof(x[0])) {
        cout << "ERROR: Invalid index" << endl;
        return;
      }
      cout << "TEST " << x[n] << endl;
    }
    ```

This code checks if `n` is within the valid range, and prints an error
message if it is not. If `n` is within the range, the function prints
the value of the element at index `n` of `x`. With this modification,
the program will avoid accessing memory outside the bounds of the
array, and will print the expected output for valid indices.
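The faulting address in the lldb trace can be checked against the out-of-bounds index with a little arithmetic. The sketch below assumes a 4-byte int on the arm64 target; the base address of `x` that it derives is an inference from the trace, not a value shown in the output above.

```python
INT_SIZE = 4                  # assumption: sizeof(int) on the arm64 target
ARRAY_LENGTH = 5              # int x[] = { 1, 2, 3, 4, 5 };

n = 8                         # from the stopped frame: foo(n=8, b=1)
index = n * 10000             # index used in x[n * 10000]
byte_offset = index * INT_SIZE

fault_address = 0x100056200   # from the EXC_BAD_ACCESS stop reason
implied_base = fault_address - byte_offset  # inferred address of x[0]

print(f"index {index} vs. valid range 0..{ARRAY_LENGTH - 1}")
print(f"byte offset from x: {hex(byte_offset)}")
print(f"implied base address of x: {hex(implied_base)}")
```

Under these assumptions the access lands 320,000 bytes (0x4e200) past the start of a 20-byte array, which is why the earlier iterations print garbage or zeros before the process finally touches unmapped memory.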

More Repositories

1. scalene (Python, 11,180 stars): Scalene: a high-performance, high-precision CPU, GPU, and memory profiler for Python with AI-powered optimization proposals
2. coz (C, 3,753 stars): Coz: Causal Profiling
3. browsix (JavaScript, 3,117 stars): Browsix is a Unix-like operating system for the browser.
4. doppio (TypeScript, 2,150 stars): Breaks the browser language barrier (includes a plugin-free JVM).
5. Mesh (C++, 1,618 stars): A memory allocator that automatically reduces the memory footprint of C/C++ applications.
6. BLeak (TypeScript, 409 stars): BLeak: Automatically Debugging Memory Leaks in Web Applications
7. slipcover (Python, 405 stars): Near Zero-Overhead Python Code Coverage
8. cwhy (C++, 266 stars): "See why!" Explains and suggests fixes for compile-time errors for C, C++, C#, Go, Java, LaTeX, PHP, Python, Ruby, Rust, and TypeScript
9. sqlwrite (C, 106 stars): SQLwrite: AI in your DBMS! Automatically converts natural language queries to SQL.
10. NextDoor (Cuda, 47 stars): Graph Sampling using GPU
11. DataDebug (C#, 46 stars): Excel 2010/2013 add-in that automatically finds errors in spreadsheets
12. systemgo (Go, 31 stars): Init system in Go, intended to run on Browsix and other Unix-like OS. Part of GSoC 2016 project.
13. sheriff (C++, 29 stars): Sheriff consists of two tools: Sheriff-Detect, a false-sharing detector, and Sheriff-Protect, a false-sharing eliminator that you can link with your code to eliminate false sharing.
14. DoubleTake (C, 21 stars): Evidence-based dynamic analysis: a fast checker for memory errors.
15. commentator (Python, 19 stars): Automatically comments Python code, adding docstrings and type annotations, with optional translation to other languages.
16. Predator (C, 19 stars): Predator: Predictive False Sharing Detection
17. memory-landscape (16 stars): The space of memory management research and systems produced by the PLASMA lab (https://plasma-umass.org).
18. snakefish (Python, 13 stars): parallel Python
19. parcel (C#, 12 stars): An Excel formula parser
20. entroprise (C++, 12 stars): measure entropy of memory allocators
21. Rehearsal (Scala, 12 stars): Rehearsal: A Configuration Verification Tool for Puppet
22. coverup (Python, 12 stars): Automatic AI-powered test suite generator
23. Hound (C++, 11 stars): Hound memory leak detector
24. smash-project (C++, 10 stars): Smash compressing allocator project
25. browsix-spec (JavaScript, 9 stars)
26. Archipelago (C, 8 stars): Archipelago memory allocator
27. simplesocket (C++, 8 stars): A simple socket wrapper for C++.
28. pythoness (Python, 7 stars): Pythoness: use natural language to define Python functions.
29. compsci631 (OCaml, 7 stars): Support code for Programming Languages (COMPSCI631)
30. Tortoise (Scala, 6 stars): Tortoise: Interactive System Configuration Repair
31. scalene-gui (JavaScript, 5 stars): Scalene web GUI
32. transparentFS (C, 5 stars): TransparentFS code, paper, and slides
33. homebrew-scalene (Ruby, 4 stars): Homebrew tap for Scalene (emeryberger/scalene)
34. GSoC (4 stars): Description of our Google Summer of Code projects for 2015
35. llm-utils (Python, 4 stars): Utilities for our LLM projects (CWhy, ChatDBG, ...).
36. HeapToss (3 stars): HeapToss is an LLVM compiler pass that moves stack variables that may escape their declaring function's context into the heap.
37. GSoC-2013 (2 stars): Google Summer of Code 2013
38. jsvm (JavaScript, 2 stars)
39. plasma-umass.github.io (HTML, 2 stars): home page
40. spl (Rust, 2 stars)
41. doppio_jcl (TypeScript, 2 stars): Scripts that produce a version of the Java Class Library and Java Home in a way that is compatible with DoppioJVM.
42. nextdoor-eurosys21 (HTML, 1 star)
43. mesh-testsuite (C, 1 star)
44. proto (C, 1 star): probabilistic race tolerance
45. ChatSheet (Python, 1 star)
46. custom-public (Jupyter Notebook, 1 star)
47. wasm-gc-template (C++, 1 star)
48. typissed (C#, 1 star): Generates MTurk typo jobs
49. scalene-benchmarks (Python, 1 star): Benchmarks comparing Scalene with other commonly-used profilers
50. emcc_control (C, 1 star)
51. transparentMM (1 star): Transparent memory management