This repository is a collection of the following knowledge tracing algorithms:
- Deep Knowledge Tracing (DKT)
- Deep Knowledge Tracing + (DKT+)
- Dynamic Key-Value Memory Networks for Knowledge Tracing (DKVMN)
- Knowledge Query Network for Knowledge Tracing (KQN)
- A Self-Attentive model for Knowledge Tracing (SAKT)
- Graph-based Knowledge Tracing (GKT)
More algorithms will be added to this repository soon.
In this repository, the ASSISTment2009 "skill-builder" dataset is used. You need to download the dataset into the following path:
datasets/ASSIST2009/
You can also use the ASSISTment2015 "skill-builder" dataset. Similarly, you need to download it into the following path:
datasets/ASSIST2015/
Other datasets, the Algebra 2005-2006 and Statics 2011 datasets, can also be used to train your knowledge tracing model. The paths to download each dataset are as follows:
datasets/Algebra2005
datasets/Statics2011
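If you want to verify that the datasets are placed correctly before training, a quick check such as the following can help. This script is only illustrative and is not part of the repository:

```python
# Illustrative check (not part of this repository): verify that the expected
# dataset directories listed above exist before starting training.
import os

dataset_dirs = [
    "datasets/ASSIST2009",
    "datasets/ASSIST2015",
    "datasets/Algebra2005",
    "datasets/Statics2011",
]

for path in dataset_dirs:
    status = "found" if os.path.isdir(path) else "missing"
    print(f"{path}: {status}")
```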
- Install Python 3.
- Install the Python packages in `requirements.txt`. If you are using a virtual environment for Python package management, you can install all of the required packages with the following bash command:

  ```bash
  $ pip install -r requirements.txt
  ```
- Install PyTorch. The version of PyTorch should be 1.7.0 or higher. This repository supports CUDA usage.

  Note: There are some bugs in the `torch.utils.data` module in PyTorch 1.9.0. If you want to run this repository safely, you need to install PyTorch 1.7.0 or 1.8.0. You can check the details of the bugs in the following links:
- Modify `config.json` according to your machine settings. The following explanations describe the `train_config` section of `config.json` (a minimal sketch of how this block can be read appears after this list):
  - `batch_size`: The batch size of the training process. Default: 256
  - `num_epochs`: The number of epochs of the training process. Default: 100
  - `train_ratio`: The ratio of the whole dataset to use as the training split. Default: 0.9
  - `learning_rate`: The learning rate of the optimizer for the training process. Default: 0.001
  - `optimizer`: The optimizer to use in the training process. The possible optimizers are ["sgd", "adam"]. Default: "adam"
  - `seq_len`: The sequence length for the dataset to use in the training process. Default: 100
- Execute the training process with `train.py`. An example of the usage of `train.py` is as follows:

  ```bash
  $ python train.py --model_name=dkvmn
  ```

  The following bash command will show the available options:

  ```bash
  $ python train.py -h
  ```
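As mentioned in the configuration step above, the following is a minimal sketch of how the `train_config` block can be read from `config.json`. The exact key layout of `config.json` is assumed here, not verified, so check the actual file in the repository:

```python
# Hypothetical sketch: reads config.json and prints the train_config options
# described above. The key layout ("train_config" as a top-level key) is assumed.
import json

with open("config.json") as f:
    config = json.load(f)

train_config = config["train_config"]
print(train_config["batch_size"])     # e.g. 256
print(train_config["num_epochs"])     # e.g. 100
print(train_config["train_ratio"])    # e.g. 0.9
print(train_config["learning_rate"])  # e.g. 0.001
print(train_config["optimizer"])      # "sgd" or "adam"
print(train_config["seq_len"])        # e.g. 100
```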

The training configurations used for each dataset are as follows:

Dataset | Configurations |
---|---|
ASSISTment2009 | batch_size : 256, num_epochs : 100, train_ratio : 0.9, learning_rate : 0.001, optimizer : "adam", seq_len : 100 |
ASSISTment2015 | batch_size : 256, num_epochs : 100, train_ratio : 0.9, learning_rate : 0.001, optimizer : "adam", seq_len : 50 |
Algebra 2005-2006 | batch_size : 256, num_epochs : 200, train_ratio : 0.9, learning_rate : 0.001, optimizer : "adam", seq_len : 200 |
Statics 2011 | batch_size : 256, num_epochs : 200, train_ratio : 0.9, learning_rate : 0.001, optimizer : "adam", seq_len : 200 |

ASSISTment2009 results:

Model | Maximum Test AUC (%) | Hyperparameters |
---|---|---|
DKT | 82.15 ± 0.05 | emb_size : 100, hidden_size : 100 |
DKT+ | 82.25 ± 0.06 | emb_size : 100, hidden_size : 100, lambda_r : 0.01, lambda_w1 : 0.03, lambda_w2 : 0.3 |
DKVMN | 81.18 ± 0.16 | dim_s : 50, size_m : 20 |
KQN | 79.82 ± 0.11 | dim_v : 100, dim_s : 100, hidden_size : 100 |
SAKT | 81.06 ± 0.08 | n : 100, d : 100, num_attn_heads : 5, dropout : 0.2 |
GKT (PAM) | 82.12 ± 0.08 | hidden_size : 30 |
GKT (MHA) | 81.88 ± 0.17 | hidden_size : 30 |

ASSISTment2015 results:

Model | Maximum Test AUC (%) | Hyperparameters |
---|---|---|
DKT | 72.99 ± 0.04 | emb_size : 50, hidden_size : 50 |
DKT+ | 72.78 ± 0.06 | emb_size : 50, hidden_size : 50, lambda_r : 0.01, lambda_w1 : 0.03, lambda_w2 : 0.3 |
DKVMN | 72.29 ± 0.05 | dim_s : 50, size_m : 10 |
KQN | 71.97 ± 0.14 | dim_v : 50, dim_s : 50, hidden_size : 50 |
SAKT | 72.80 ± 0.05 | n : 50, d : 50, num_attn_heads : 5, dropout : 0.3 |
GKT (PAM) | 73.02 ± 0.13 | hidden_size : 30 |
GKT (MHA) | 73.14 ± 0.07 | hidden_size : 30 |

Algebra 2005-2006 results:

Model | Maximum Test AUC (%) | Hyperparameters |
---|---|---|
DKT | 82.29 ± 0.06 | emb_size : 100, hidden_size : 100 |
DKT+ | 82.53 ± 0.06 | emb_size : 100, hidden_size : 100, lambda_r : 0.01, lambda_w1 : 0.03, lambda_w2 : 1.0 |
DKVMN | 81.20 ± 0.14 | dim_s : 50, size_m : 20 |
KQN | 77.08 ± 0.14 | dim_v : 100, dim_s : 100, hidden_size : 100 |
SAKT | 81.28 ± 0.07 | n : 200, d : 100, num_attn_heads : 5, dropout : 0.2 |

Statics 2011 results:

Model | Maximum Test AUC (%) | Hyperparameters |
---|---|---|
DKT | 82.56 ± 0.09 | emb_size : 50, hidden_size : 50 |
DKT+ | 83.36 ± 0.08 | emb_size : 50, hidden_size : 50, lambda_r : 0.01, lambda_w1 : 0.03, lambda_w2 : 3.0 |
DKVMN | 81.80 ± 0.08 | dim_s : 50, size_m : 10 |
KQN | 81.10 ± 0.13 | dim_v : 50, dim_s : 50, hidden_size : 50 |
SAKT | 80.90 ± 0.13 | n : 200, d : 50, num_attn_heads : 5, dropout : 0.3 |

The fact that the Adam optimizer performs better than SGD for training DKT and DKVMN can easily be verified by running this repository. SAKT appears to suffer from over-fitting, so additional regularization techniques would likely improve its performance. In fact, the results show that dropout can relieve the over-fitting of SAKT.
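To illustrate where the dropout hyperparameter reported in the tables above acts, here is a hypothetical sketch of a self-attentive encoder layer built with PyTorch's `nn.TransformerEncoderLayer` (see the SAKT implementation reference below). It is not the repository's actual SAKT module:

```python
import torch
import torch.nn as nn

# Hypothetical sketch, not the repository's SAKT implementation: it only shows
# where the dropout hyperparameter from the result tables enters a
# self-attentive encoder layer.
encoder_layer = nn.TransformerEncoderLayer(
    d_model=100,  # corresponds to the "d" hyperparameter of SAKT
    nhead=5,      # corresponds to "num_attn_heads"
    dropout=0.2,  # the regularization knob discussed above
)

# By default the layer expects inputs of shape (seq_len, batch_size, d_model).
x = torch.randn(100, 256, 100)
out = encoder_layer(x)
print(out.shape)  # torch.Size([100, 256, 100])
```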
- Fixed some critical errors in DKT.
- Modified the initialization of some parameters in DKVMN and SAKT.
- Refactored `models.utils.py`.
- Implemented DKT+.
- Implemented PAM and MHA of GKT.
- Implemented KQN.
- Updated the performance results of KQN.
- Implement SKVMN (future work).
- Attention Is All You Need
- DKT: Deep Knowledge Tracing
- DKT+: Addressing Two Problems in Deep Knowledge Tracing via Prediction-Consistent Regularization
- DKVMN: Dynamic Key-Value Memory Networks for Knowledge Tracing
- SKVMN: Knowledge Tracing with Sequential Key-Value Memory Networks
- SAKT: A Self-Attentive model for Knowledge Tracing
- For the implementation of SAKT: PyTorch Transformer Encoder Layer
- GKT: Graph-based Knowledge Tracing: Modeling Student Proficiency Using Graph Neural Network
- KQN: Knowledge Query Network for Knowledge Tracing
- AKT: Context-Aware Attentive Knowledge Tracing
- CKT: Convolutional Knowledge Tracing: Modeling Individualization in Student Learning Process