• Stars
    star
    250
  • Rank 162,397 (Top 4 %)
  • Language
    Python
  • Created almost 5 years ago
  • Updated over 4 years ago

Reviews

There are no reviews yet. Be the first to send feedback to the community and the maintainers!

Repository Details

Automatic Discrepancy Discovery for DPI Elusion

SymTCP - Automatic Discrepancy Discovery for DPI (Deep Packet Inspection) Elusion

SymTCP is a tool used to automatically discover subtle discrepancies between two TCP implementations, e.g., how they accept and drop packets. Specifically, it can find the discrepancies between a server and a DPI, and use them to elude the DPI, e.g., a packet accepted by the server but ignored by the DPI. It first runs symbolic execution on the server's TCP implementation (whitebox) and collect program execution paths labeled as either "accept" path or "drop" path. Symbolic execution will generate the input packet sequence for each execution paths. Then it probes the DPI (blackbox) with the packet sequences generated and finds if they are processed the same by the DPI as the server. You can find more information in our NDSS paper.

Source

β”œβ”€β”€ bin                    Binaries used to send/receive packets
β”œβ”€β”€ data                   Sampled test cased generated using Linux kernel v4.9.3
β”œβ”€β”€ docker                 Docker files used in early stage testing
β”œβ”€β”€ kernel_info            Debug info extracted from Linux kernel v4.9.3
β”œβ”€β”€ patches                Changes made to S2E symbolic execution engine
β”œβ”€β”€ scripts                Python scripts used for data analysis, DPI probing, ...
β”‚   └── eval               Scripts used in evaluation
β”œβ”€β”€ static                 Some static analysis scripts not used in final version
└── tools
    β”œβ”€β”€ aws_remote_ctl     Scripts used to control AWS instances 
    β”œβ”€β”€ data_processing    Result analysis scripts used in early stage
    β”œβ”€β”€ dpi_sys_confs      Configuration files of Bro and Snort
    β”œβ”€β”€ sym_packet_sender  Symbolic packet sender
    └── web_addr2line      Web application used to translate memory address to source code line number

Usage

Requirement

  • Ubuntu 16.04/18.04 (Other Linux-based system might work as well)
  • Z3 Solver with Python library (v4.8.5)
  • root privilege (for sending raw packets)

Setup S2E

We are using a S2E 2.0 version fetched in Apr, 2019 (can be found in release). Using a newer version of S2E requires porting the code in the patches folder to the newer version.

To set up the S2E environment, you may use the s2e-env tool (by following the instructions here). If you want to reuse the exact S2E version used in this project, you may consider to replace the source code downloaded by s2e-env with the one we provided.

The S2E project we created can be also found in the release. And it should be put in the projects folder of the S2E environment.

Configure network interfaces in the guest and host OS

Because we need to run the guest OS in QEMU with bridge mode, we also need to add a tun/tap interface in the host OS.

mkdir /dev/net (if it doesn't exist already)
mknod /dev/net/tun c 10 200

ip link add name qemubr0 type bridge
ip addr add 172.20.0.1/24 dev qemubr0
ip link set qemubr0 up

And use setuid to give qemu-bridge-helper root privilege:

sudo chown root:root $S2E_DIR/install/libexec/qemu-bridge-helper 

sudo chmod u+s  $S2E_DIR/install/libexec/qemu-bridge-helper

In the guest OS, we need to configure the network interface to use a static IP address 172.20.0.2.

Run symbolic execution

To launch the S2E project, please run the launch-s2e.sh shell script in the tcp project folder. Please also check the path variables before that.

The results are stored in the debug.log file in the s2e output folder, e.g., s2e-out-0. And it will be used later by the Python analysis scripts.

Test case generation

You may use the scripts/get_concrete_examples.py to extract test cases from symbolic execution results.

Probe the DPI

We provide two datasets with 1,000 and 10,000 test cases respectively in our repository, which are both sampled from the same run of symbolic execution with 3 symbolic packets of 40-byte length (20-byte header + 20-byte TCP options and payload). The smaller 1,000 dataset is in the data folder, and the larger 10,000 dataset is in the release. You may also use your own dataset generated by running symbolic execution.

To probe a DPI with the test cases, you may use scripts/probe_dpi.py.

Probe DPI with test cases generated from symbolic execution.

positional arguments:
  test_case_file        test case file

optional arguments:
  -h, --help            show this help message and exit
  -P, --dump-pcaps      dump pcap files for each test case
  -G, --gfw             probing the gfw
  -I INT, --int INT     interface to listen on
  -F, --tcp-flags-fuzzing
                        switch of tcp flags fuzzing
  --tcp-flags TCP_FLAGS
                        Use specific TCP flags for testing
  -D, --debug           turn on debug mode
  -p PAYLOAD_LEN        TCP payload length (because header len is symbolic,
                        payload may be counted as optional header
  -N NUM_INSTS
  -S SPLIT_ID
  -t TEST_CASE_IDX      test case index
  --packet-idx PACKET_IDX
                        packet index in the test case
  --replay REPLAY       replay a list of successful cases

Examples:

./probe_dpi.py -p 20 -P -F             (20-byte options and payload, dump packets, try different flags)
./probe_dpi.py -p 20 -P -F -N 50 -S 0  (Split the dataset into 50 chunks and use the first chunk)

The results can be found in the probe_dpi_result file under current folder. And detailed logs will be output to the probe_dpi.log file.

probe_dpi.py needs to read DPI logs from local paths in order to check whether it succeeds or not. You will need to configure the path variables in the script.

probe_dpi.py also loads a list of server IP addresses from a server_list file (for probing the GFW) or a server_list.local file (for probing DPIs) in order to balance work loads. In such a file, each line is a server's IP address.

Applying SymTCP to another version of Linux kernel

In our research, we used Linux kernel v4.9.3 as the targeted server implementation. Our tool can also be applied to other version of Linux kernel as well (or even other OSes), and it can also be used to probe other DPI systems with little additional efforts.

To apply to another version of Linux kernel, you will need to have the binary of the Linux kernel (the vmlinux file) in order to label drop points and/or accept points on it. The drop points used in S2E are at binary level, configured in s2e-config.lua in the S2E project folder. A typicial approach to label drop points and/or accept points is to label them in the source code first, by backtracing from sinks such as tcp_drop, kfree_skb, etc, and then map the source code line to binary address with tools such as objdump. There are also some hard coded memory address in the S2E plugin we wrote, you will also need to update those addresses according to the new version of Linux kernel.

Publication

Check our NDSS'20 paper for more technical details[PDF]

SymTCP: Eluding Stateful Deep Packet Inspection with Automated Discrepancy Discovery. Zhongjie Wang, Shitong Zhu, Yue Cao, Zhiyun Qian, Chengyu Song, Srikanth Krishnamurthy, Tracy D. Braun, Kevin S. Chan. DOI:https://dx.doi.org/10.14722/ndss.2020.24083

@inproceedings{wang2020symtcp,
  title={SymTCP: Eluding Stateful Deep Packet Inspection with Automated Discrepancy Discovery},
  author={Wang, Zhongjie and Zhu, Shitong and Cao, Yue and Qian, Zhiyun and Song, Chengyu and Krishnamurthy, Srikanth V and Chan, Kevin S and Braun, Tracy D},
  booktitle={NDSS},
  year={2020}
}

More Repositories

1

INTANG

C
2,858
star
2

tcp_exploit

Off-Path TCP Exploit: How Wireless Routers Can Jeopardize Your Secret
JavaScript
105
star
3

KOOBE

Towards Facilitating Exploit Generation of Kernel Out-Of-Bounds Write Vulnerabilities
82
star
4

SUTURE

Precise and high-order static points-to/taint analysis based on LLVM IR.
C++
69
star
5

SADDNS

SADDNS: Side Channel Based DNS Cache Poisoning Attack
C
51
star
6

SyzDescribe

C++
50
star
7

LLift

The source code of project "LLift" (Enhancing static analysis with LLM)
Python
46
star
8

SyzGen_setup

Go
42
star
9

UBITect

C++
38
star
10

IncreLux

Progressive Scrutiny: Incremental Detection of UBI bugs in the Linux Kernel
C++
29
star
11

GPT-Expr

Assisting Static Analysis with Large Language Models: A ChatGPT Experiment
27
star
12

SyzBridge

SyzBridge is a research project that adapts Linux upstream PoCs to downstream distributions. It provides rich interfaces that allow you to do a lot of cool things with Syzbot bugs
Python
24
star
13

Themis

Themis: Ambiguity-Aware Network Intrusion Detection based on Symbolic Model Comparison
C++
20
star
14

ShadowBlock

Code release for our WWW 2019 paper entitled "ShadowBlock: A Lightweight and Stealthy Adblocking Browser".
C++
18
star
15

Unias

A Hybrid Alias Analysis
C++
18
star
16

K-LEAK

C++
14
star
17

SADDNS2.0

Go
11
star
18

CLAP

This repository hosts the implementation of CLAP (Context Learning-based Adversarial Protection) that is proposed in our CoNEXT 2020 paper titled "You Do (Not) Belong Here: Detecting DPI Evasion Attacks with Context Learning". The code here can be used to reproduce the main results in the paper.
Python
9
star
19

SyzGenPlusPlus

Python
8
star
20

SCENT

TCP Side Channel Excavation Tool
C++
7
star
21

CCS24Mesh

7
star
22

Patchlocator

An Investigation of the Android Kernel Patch Ecosystem Usenix security 21
Python
6
star
23

PAPP

Prefetcher-Aware Prime+Probe
C
5
star
24

A4

Code and dataset release for our ACSAC 2021 paper titled "Eluding ML-based Adblockers With Actionable Adversarial Examples".
HTML
5
star