Welcome to Generalizable Mixture-of-Experts for Domain Generalization
Wondering why GMoE achieves such strong performance on domain generalization?
Preparation
pip3 install torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/cu116
python3 -m pip uninstall tutel -y
python3 -m pip install --user --upgrade git+https://github.com/microsoft/tutel@main
pip3 install -r requirements.txt
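After installing, a quick sanity check can confirm that the CUDA build of PyTorch and Tutel are importable. This is a minimal sketch, assuming a CUDA-capable GPU is present:
# Sanity check (a sketch): verify the CUDA build of PyTorch and the Tutel install.
import torch
from tutel import moe  # Tutel MoE layer module that the GMoE implementation builds on
print("PyTorch:", torch.__version__, "CUDA:", torch.version.cuda)
print("CUDA available:", torch.cuda.is_available())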
Datasets
python3 -m domainbed.scripts.download \
--data_dir=./domainbed/data
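Once the download completes, listing the data directory shows which datasets are in place. A small sketch, assuming the default --data_dir used above:
# List dataset folders under the data directory used by the download command above.
import os
data_dir = "./domainbed/data"
print(sorted(os.listdir(data_dir)))  # e.g. OfficeHome, PACS, ... depending on what was downloaded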
Environments
Environment details used in the paper for the main experiments, run on an NVIDIA V100 GPU.
Environment:
Python: 3.9.12
PyTorch: 1.12.0+cu116
Torchvision: 0.13.0+cu116
CUDA: 11.6
CUDNN: 8302
NumPy: 1.19.5
PIL: 9.2.0
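To compare your local setup against the versions listed above, a short script can print the same fields. A minimal sketch, assuming the listed packages are installed:
# Print local environment versions to compare with the versions listed above.
import sys, PIL, numpy, torch, torchvision
print("Python:", sys.version.split()[0])
print("PyTorch:", torch.__version__)
print("Torchvision:", torchvision.__version__)
print("CUDA:", torch.version.cuda)
print("CUDNN:", torch.backends.cudnn.version())
print("NumPy:", numpy.__version__)
print("PIL:", PIL.__version__)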
Start Training
Train a model:
python3 -m domainbed.scripts.train \
--data_dir=./domainbed/data/OfficeHome/ \
--algorithm GMOE \
--dataset OfficeHome \
--test_env 2
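To train one model per held-out domain, the same command can be repeated with different --test_env values. A minimal sketch using subprocess; range(4) reflects OfficeHome's four domains and should be adjusted for other datasets:
# Run one training job per held-out test environment (OfficeHome has 4 domains).
import subprocess
for test_env in range(4):
    subprocess.run([
        "python3", "-m", "domainbed.scripts.train",
        "--data_dir=./domainbed/data/OfficeHome/",
        "--algorithm", "GMOE",
        "--dataset", "OfficeHome",
        "--test_env", str(test_env),
    ], check=True)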
Hyper-params
We put the hyperparameters for each dataset in ./domainbed/hparams_registry.py. Basically, you only need to choose --algorithm and --dataset; the optimal hyperparameters will be loaded accordingly.
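To see which hyperparameters will be loaded for a given pair, you can query the registry directly. This is a sketch that assumes the registry keeps DomainBed's default_hparams(algorithm, dataset) helper:
# Inspect the hyperparameters that would be loaded for a given algorithm/dataset pair.
# Assumes hparams_registry exposes DomainBed's default_hparams(algorithm, dataset) helper.
from domainbed import hparams_registry
hparams = hparams_registry.default_hparams("GMOE", "OfficeHome")
for name, value in sorted(hparams.items()):
    print(f"{name}: {value}")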
License
This source code is released under the MIT license, included in this repository.
Acknowledgement
The MoE module is built on Tutel MoE.