Deep Packet
Details in blog post: https://blog.munhou.com/2020/04/05/Pytorch-Implementation-of-Deep-Packet-A-Novel-Approach-For-Encrypted-Tra%EF%AC%83c-Classi%EF%AC%81cation-Using-Deep-Learning/
Changelog
EDIT: 2022-11-30
- Add the ResNet model. Kudos to Taehyun for implementing ResNet.
EDIT: 2022-09-27
- Update dataset and model
- Update dependencies
- Add more data to chat, file_transfer, voip, streaming and vpn_voip
- Remove Tor- and torrent-related data, as they are no longer available
EDIT: 2022-01-18
- Update dataset and model
EDIT: 2022-01-17
- Update code and model
- Drop petastorm, use Hugging Face's datasets instead for the data loader
How to Use
- Clone the project
- Create environment via conda
- For Mac
conda env create -f env_mac.yaml
- For Linux (CPU only)
conda env create -f env_linux_cpu.yaml
- For Linux (CUDA 10.2)
conda env create -f env_linux_cuda102.yaml
- For Linux (CUDA 11.3)
conda env create -f env_linux_cuda113.yaml
- Download the train and test sets I created here, or download the full dataset if you want to process the data from scratch.
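Which environment file to pass to conda depends on the operating system and, on Linux, on the CUDA version. The sketch below is one way to express that choice in Python. The helper names `pick_env_file` and `conda_command` are hypothetical and not part of this project; only the four YAML file names come from the steps above.

```python
import platform

# Environment files listed in the setup steps above.
ENV_FILES = {
    "mac": "env_mac.yaml",
    "linux_cpu": "env_linux_cpu.yaml",
    "linux_cuda102": "env_linux_cuda102.yaml",
    "linux_cuda113": "env_linux_cuda113.yaml",
}


def pick_env_file(system, cuda_version=None):
    """Map an OS name (as returned by platform.system()) and an optional
    CUDA version string such as "11.3" to one of the env files above."""
    if system == "Darwin":
        return ENV_FILES["mac"]
    if system == "Linux":
        if cuda_version is None:
            return ENV_FILES["linux_cpu"]
        key = "linux_cuda" + cuda_version.replace(".", "")
        if key in ENV_FILES:
            return ENV_FILES[key]
        raise ValueError(f"no environment file for CUDA {cuda_version}")
    raise ValueError(f"no environment file for {system}")


def conda_command(system=None, cuda_version=None):
    """Build the `conda env create` command as an argument list."""
    system = system or platform.system()
    return ["conda", "env", "create", "-f", pick_env_file(system, cuda_version)]


# Example: a Linux machine with CUDA 11.3
print(" ".join(conda_command("Linux", "11.3")))
```

The function only builds the command; it does not detect the installed CUDA version, so that value has to be supplied by hand.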
Data Pre-processing
python preprocessing.py -s /path/to/CompletePcap/ -t processed_data
Create Train and Test
python create_train_test_set.py -s processed_data -t train_test_data
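The two steps above are chained: the target directory (-t) of preprocessing.py is the source directory (-s) of create_train_test_set.py. A minimal sketch of a wrapper that keeps the two in sync is shown here. The function names and the dry_run switch are illustrative assumptions; only the script names, flags, and directory names are taken from the commands above.

```python
import shlex
import subprocess


def data_pipeline_commands(pcap_dir,
                           processed_dir="processed_data",
                           split_dir="train_test_data"):
    """Return the two data-preparation commands in execution order.

    The output directory of the first step is reused as the input
    directory of the second step.
    """
    return [
        ["python", "preprocessing.py", "-s", pcap_dir, "-t", processed_dir],
        ["python", "create_train_test_set.py", "-s", processed_dir, "-t", split_dir],
    ]


def run_pipeline(commands, dry_run=True):
    """Print each command and, unless dry_run is set, execute it.

    check=True stops the pipeline at the first failing step.
    """
    for cmd in commands:
        print(shlex.join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)


# Dry run: only prints the commands that would be executed.
run_pipeline(data_pipeline_commands("/path/to/CompletePcap/"))
```

Calling run_pipeline(..., dry_run=False) from the project root would actually launch both scripts, which takes several hours on the full dataset according to the timings in this README.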
Train Model
Application Classification
For CNN model
python train_cnn.py -d train_test_data/application_classification/train.parquet -m model/application_classification.cnn.model -t app
For ResNet model
python train_resnet.py -d train_test_data/application_classification/train.parquet -m model/application_classification.resnet.model -t app
Traffic Classification
For CNN model
python train_cnn.py -d train_test_data/traffic_classification/train.parquet -m model/traffic_classification.cnn.model -t traffic
For ResNet model
python train_resnet.py -d train_test_data/traffic_classification/train.parquet -m model/traffic_classification.resnet.model -t traffic
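The four training commands above differ only in the script, the task name, and the output path, so they can be generated from two small tables. The sketch below does that. The function name `training_commands` is hypothetical, and the `.resnet.model` suffix for ResNet outputs is an assumption made here so that a ResNet run does not overwrite a CNN model file.

```python
def training_commands(data_root="train_test_data", model_dir="model"):
    """Build one training command per (task, architecture) pair."""
    # Value passed to -t, mapped to the matching data sub-directory.
    tasks = {
        "app": "application_classification",
        "traffic": "traffic_classification",
    }
    # Architecture name mapped to the training script from this README.
    scripts = {
        "cnn": "train_cnn.py",
        "resnet": "train_resnet.py",
    }
    commands = []
    for task_flag, task_name in tasks.items():
        for arch, script in scripts.items():
            commands.append([
                "python", script,
                "-d", f"{data_root}/{task_name}/train.parquet",
                # Assumption: one output file per architecture.
                "-m", f"{model_dir}/{task_name}.{arch}.model",
                "-t", task_flag,
            ])
    return commands


for cmd in training_commands():
    print(" ".join(cmd))
```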
Evaluation Result (CNN)
Application Classification
Traffic Classification
Model Files
Download the pre-trained CNN models here.
Elapsed Time
Preprocessing
Code ran on AWS c5.4xlarge
7:01:32 elapsed
Train and Test Creation
Code ran on AWS c5.4xlarge
2:55:46 elapsed
Traffic Classification Model Training (CNN)
Code ran on AWS g5.xlarge
24:41 elapsed
Application Classification Model Training (CNN)
Code ran on AWS g5.xlarge
7:55 elapsed
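To get a feel for the end-to-end cost, the four timings above can be added up. The sketch below assumes that the two-field values (24:41 and 7:55) are minutes:seconds and the three-field values are hours:minutes:seconds. The steps ran on different instance types, so the sum is only a rough wall-clock estimate.

```python
from datetime import timedelta

# Elapsed times as listed above.
ELAPSED = {
    "Preprocessing": "7:01:32",
    "Train and Test Creation": "2:55:46",
    "Traffic Classification Model Training (CNN)": "24:41",
    "Application Classification Model Training (CNN)": "7:55",
}


def parse_elapsed(text):
    """Parse "h:mm:ss" or "mm:ss" into a timedelta.

    Assumption: a value with two fields is minutes:seconds.
    """
    parts = [int(p) for p in text.split(":")]
    while len(parts) < 3:
        parts.insert(0, 0)  # pad missing hours with zero
    hours, minutes, seconds = parts
    return timedelta(hours=hours, minutes=minutes, seconds=seconds)


total = sum((parse_elapsed(t) for t in ELAPSED.values()), timedelta())
print(total)  # 10:29:54
```

Under that reading, data preparation on the c5.4xlarge accounts for almost all of the roughly ten and a half hours, while CNN training on the g5.xlarge takes about half an hour in total.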