This repository is a collection of the following knowledge tracing algorithms:
- Deep Knowledge Tracing (DKT)
- Deep Knowledge Tracing + (DKT+)
- Dynamic Key-Value Memory Networks for Knowledge Tracing (DKVMN)
- Knowledge Query Network for Knowledge Tracing (KQN)
- A Self-Attentive model for Knowledge Tracing (SAKT)
- Graph-based Knowledge Tracing (GKT)
More algorithms will be added to this repository soon.
In this repository, the ASSISTment2009 "skill-builder" dataset is used. You need to download the dataset into the following path:
datasets/ASSIST2009/
You can also use the ASSISTment2015 "skill-builder" dataset. Similarly, you need to download it into the following path:
datasets/ASSIST2015/
Other datasets, the Algebra 2005-2006 and Statics 2011 datasets, can also be used to train your knowledge tracing model. The paths to download each dataset are as follows:
datasets/Algebra2005
datasets/Statics2011
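If you want to verify that the datasets are placed correctly before training, a quick check such as the following can help. This script is only illustrative and is not part of the repository:

```python
# Illustrative check (not part of this repository): verify that the expected
# dataset directories listed above exist before starting training.
import os

dataset_dirs = [
    "datasets/ASSIST2009",
    "datasets/ASSIST2015",
    "datasets/Algebra2005",
    "datasets/Statics2011",
]

for path in dataset_dirs:
    status = "found" if os.path.isdir(path) else "missing"
    print(f"{path}: {status}")
```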
- Install Python 3.
- Install the Python packages in `requirements.txt`. If you are using a virtual environment for Python package management, you can install all of the required packages with the following bash command:

  ```bash
  $ pip install -r requirements.txt
  ```
- Install PyTorch. The version of PyTorch should be 1.7.0 or higher. This repository supports CUDA usage.

  Note: There are some bugs in the `torch.utils.data` module in PyTorch 1.9.0. If you want to run this repository safely, you need to install PyTorch 1.7.0 or 1.8.0. You can check the details of the bugs in the following links:
- Modify `config.json` according to your machine settings. The following explanations describe the `train_config` section of `config.json` (a minimal sketch of how this block can be read appears after this list):
  - `batch_size`: The batch size of the training process. Default: 256
  - `num_epochs`: The number of epochs of the training process. Default: 100
  - `train_ratio`: The ratio of the whole dataset to use as the training split. Default: 0.9
  - `learning_rate`: The learning rate of the optimizer for the training process. Default: 0.001
  - `optimizer`: The optimizer to use in the training process. The possible optimizers are ["sgd", "adam"]. Default: "adam"
  - `seq_len`: The sequence length for the dataset to use in the training process. Default: 100
- Execute the training process with `train.py`. An example of the usage of `train.py` is as follows:

  ```bash
  $ python train.py --model_name=dkvmn
  ```

  The following bash command will show the available options:

  ```bash
  $ python train.py -h
  ```
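As mentioned in the configuration step above, the following is a minimal sketch of how the `train_config` block can be read from `config.json`. The exact key layout of `config.json` is assumed here, not verified, so check the actual file in the repository:

```python
# Hypothetical sketch: reads config.json and prints the train_config options
# described above. The key layout ("train_config" as a top-level key) is assumed.
import json

with open("config.json") as f:
    config = json.load(f)

train_config = config["train_config"]
print(train_config["batch_size"])     # e.g. 256
print(train_config["num_epochs"])     # e.g. 100
print(train_config["train_ratio"])    # e.g. 0.9
print(train_config["learning_rate"])  # e.g. 0.001
print(train_config["optimizer"])      # "sgd" or "adam"
print(train_config["seq_len"])        # e.g. 100
```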

The training configurations used for each dataset are as follows:

Dataset | Configurations |
---|---|
ASSISTment2009 | batch_size : 256, num_epochs : 100, train_ratio : 0.9, learning_rate : 0.001, optimizer : "adam", seq_len : 100 |
ASSISTment2015 | batch_size : 256, num_epochs : 100, train_ratio : 0.9, learning_rate : 0.001, optimizer : "adam", seq_len : 50 |
Algebra 2005-2006 | batch_size : 256, num_epochs : 200, train_ratio : 0.9, learning_rate : 0.001, optimizer : "adam", seq_len : 200 |
Statics 2011 | batch_size : 256, num_epochs : 200, train_ratio : 0.9, learning_rate : 0.001, optimizer : "adam", seq_len : 200 |

ASSISTment2009 results:

Model | Maximum Test AUC (%) | Hyperparameters |
---|---|---|
DKT | 82.15 ± 0.05 | emb_size : 100, hidden_size : 100 |
DKT+ | 82.25 ± 0.06 | emb_size : 100, hidden_size : 100, lambda_r : 0.01, lambda_w1 : 0.03, lambda_w2 : 0.3 |
DKVMN | 81.18 ± 0.16 | dim_s : 50, size_m : 20 |
KQN | 79.82 ± 0.11 | dim_v : 100, dim_s : 100, hidden_size : 100 |
SAKT | 81.06 ± 0.08 | n : 100, d : 100, num_attn_heads : 5, dropout : 0.2 |
GKT (PAM) | 82.12 ± 0.08 | hidden_size : 30 |
GKT (MHA) | 81.88 ± 0.17 | hidden_size : 30 |

ASSISTment2015 results:

Model | Maximum Test AUC (%) | Hyperparameters |
---|---|---|
DKT | 72.99 ± 0.04 | emb_size : 50, hidden_size : 50 |
DKT+ | 72.78 ± 0.06 | emb_size : 50, hidden_size : 50, lambda_r : 0.01, lambda_w1 : 0.03, lambda_w2 : 0.3 |
DKVMN | 72.29 ± 0.05 | dim_s : 50, size_m : 10 |
KQN | 71.97 ± 0.14 | dim_v : 50, dim_s : 50, hidden_size : 50 |
SAKT | 72.80 ± 0.05 | n : 50, d : 50, num_attn_heads : 5, dropout : 0.3 |
GKT (PAM) | 73.02 ± 0.13 | hidden_size : 30 |
GKT (MHA) | 73.14 ± 0.07 | hidden_size : 30 |

Algebra 2005-2006 results:

Model | Maximum Test AUC (%) | Hyperparameters |
---|---|---|
DKT | 82.29 ± 0.06 | emb_size : 100, hidden_size : 100 |
DKT+ | 82.53 ± 0.06 | emb_size : 100, hidden_size : 100, lambda_r : 0.01, lambda_w1 : 0.03, lambda_w2 : 1.0 |
DKVMN | 81.20 ± 0.14 | dim_s : 50, size_m : 20 |
KQN | 77.08 ± 0.14 | dim_v : 100, dim_s : 100, hidden_size : 100 |
SAKT | 81.28 ± 0.07 | n : 200, d : 100, num_attn_heads : 5, dropout : 0.2 |

Statics 2011 results:

Model | Maximum Test AUC (%) | Hyperparameters |
---|---|---|
DKT | 82.56 ± 0.09 | emb_size : 50, hidden_size : 50 |
DKT+ | 83.36 ± 0.08 | emb_size : 50, hidden_size : 50, lambda_r : 0.01, lambda_w1 : 0.03, lambda_w2 : 3.0 |
DKVMN | 81.80 ± 0.08 | dim_s : 50, size_m : 10 |
KQN | 81.10 ± 0.13 | dim_v : 50, dim_s : 50, hidden_size : 50 |
SAKT | 80.90 ± 0.13 | n : 200, d : 50, num_attn_heads : 5, dropout : 0.3 |

The fact that the Adam optimizer performs better than SGD for training DKT and DKVMN can easily be verified by running this repository. SAKT appears to suffer from over-fitting, so additional regularization techniques would likely improve its performance. In fact, the results show that dropout can relieve the over-fitting of SAKT.
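To illustrate where the dropout hyperparameter reported in the tables above acts, here is a hypothetical sketch of a self-attentive encoder layer built with PyTorch's `nn.TransformerEncoderLayer` (see the SAKT implementation reference below). It is not the repository's actual SAKT module:

```python
import torch
import torch.nn as nn

# Hypothetical sketch, not the repository's SAKT implementation: it only shows
# where the dropout hyperparameter from the result tables enters a
# self-attentive encoder layer.
encoder_layer = nn.TransformerEncoderLayer(
    d_model=100,  # corresponds to the "d" hyperparameter of SAKT
    nhead=5,      # corresponds to "num_attn_heads"
    dropout=0.2,  # the regularization knob discussed above
)

# By default the layer expects inputs of shape (seq_len, batch_size, d_model).
x = torch.randn(100, 256, 100)
out = encoder_layer(x)
print(out.shape)  # torch.Size([100, 256, 100])
```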
- Fixed some critical errors in DKT.
- Modified the initialization of some parameters in DKVMN and SAKT.
- Refactored `models.utils.py`.
- Implemented DKT+.
- Implemented PAM and MHA of GKT.
- Implemented KQN.
- Updated the performance results of KQN.
- Implement SKVMN (future work).
- Attention Is All You Need
- DKT: Deep Knowledge Tracing
- DKT+: Addressing Two Problems in Deep Knowledge Tracing via Prediction-Consistent Regularization
- DKVMN: Dynamic Key-Value Memory Networks for Knowledge Tracing
- SKVMN: Knowledge Tracing with Sequential Key-Value Memory Networks
- SAKT: A Self-Attentive model for Knowledge Tracing
- For the implementation of SAKT: PyTorch Transformer Encoder Layer
- GKT: Graph-based Knowledge Tracing: Modeling Student Proficiency Using Graph Neural Network
- KQN: Knowledge Query Network for Knowledge Tracing
- AKT: Context-Aware Attentive Knowledge Tracing
- CKT: Convolutional Knowledge Tracing: Modeling Individualization in Student Learning Process