  • Stars: 362
  • Rank: 117,671 (Top 3 %)
  • Language: C++
  • License: Apache License 2.0
  • Created almost 4 years ago
  • Updated 6 months ago

Repository Details

Security and Privacy Risk Simulator for Machine Learning (arXiv:2312.17667)

AIJack: Security and Privacy Risk Simulator for Machine Learning

❤️ If you like AIJack, please consider becoming a GitHub Sponsor ❤️

What is AIJack?

AIJack is an easy-to-use open-source simulation tool for testing the security of your AI system against hijackers. It provides advanced security techniques such as Differential Privacy, Homomorphic Encryption, K-anonymity, and Federated Learning to help protect your AI. With AIJack, you can simulate attacks such as Poisoning, Model Inversion, Backdoor, and Free-Rider, and test defenses against them. We support more than 30 state-of-the-art methods. For more information, check our documentation and start securing your AI today with AIJack.

Installation

You can install AIJack with pip. AIJack requires Boost and pybind11.

apt install -y libboost-all-dev
pip install -U pip
pip install "pybind11[global]"

pip install aijack

If you want to use the latest version, you can install it directly from GitHub.

pip install git+https://github.com/Koukyosyumei/AIJack

We also provide a Dockerfile.
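
After installing, a quick check from Python can confirm that the package is visible in the current environment. The sketch below uses only the standard library (importlib.metadata, Python 3.8 or later). The distribution name aijack comes from the pip command above; the helper name installed_version is made up for this example.

```python
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if it is missing."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_version("aijack")
    if version is None:
        print("aijack is not installed in this environment")
    else:
        print(f"aijack {version} is installed")
```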

Quick Start

This section gives a brief overview of AIJack.

Features

  • All-around abilities for both attack & defense
  • PyTorch-friendly design
  • Compatible with scikit-learn
  • Fast Implementation with C++ backend
  • MPI-Backend for Federated Learning
  • Extensible modular APIs

Basic Interface

Python API

For standard machine learning algorithms, AIJack lets you simulate attacks against machine learning models with Attacker APIs. AIJack mainly supports PyTorch and scikit-learn models.

# abstract code

attacker = Attacker(target_model)
result = attacker.attack()
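
To make the Attacker pattern concrete, here is a minimal, self-contained sketch that does not use AIJack. ThresholdModel and BoundaryProbingAttacker are hypothetical classes invented for this example. The attacker wraps a target model and recovers its secret decision threshold by binary search over black-box predict() queries.

```python
class ThresholdModel:
    """Hypothetical target model: predicts 1 when the input exceeds a secret threshold."""

    def __init__(self, threshold):
        self._threshold = threshold

    def predict(self, x):
        return 1 if x > self._threshold else 0


class BoundaryProbingAttacker:
    """Hypothetical attacker that only needs query access to the target model."""

    def __init__(self, target_model, low=0.0, high=1.0, n_queries=20):
        self.target_model = target_model
        self.low = low
        self.high = high
        self.n_queries = n_queries

    def attack(self):
        # Binary search: each query halves the interval that contains the threshold.
        low, high = self.low, self.high
        for _ in range(self.n_queries):
            mid = (low + high) / 2
            if self.target_model.predict(mid) == 1:
                high = mid
            else:
                low = mid
        return (low + high) / 2


target_model = ThresholdModel(threshold=0.37)
attacker = BoundaryProbingAttacker(target_model)
result = attacker.attack()
print(f"estimated threshold: {result:.6f}")
```

The shape mirrors the abstract code above: the attacker is constructed from the target model, and attack() returns the result. Real attacks in AIJack take additional, attack-specific arguments.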

For distributed learning such as Federated Learning and Split Learning, AIJack offers four basic APIs: Client, Server, API, and Manager. Client and Server represent the clients and the server in each distributed learning scheme. You run training by registering the clients and servers with API and calling its run method. Manager adds abilities such as attack, defense, or parallel computing to Client, Server, or API via the attach method.

# abstract code

client = [Client(), Client()]
server = Server()
api = API(client, server)
api.run() # execute training

c_manager = ClientManagerForAdditionalAbility(...)
s_manager = ServerManagerForAdditionalAbility(...)
ExtendedClient = c_manager.attach(Client)
ExtendedServer = s_manager.attach(Server)

extended_client = [ExtendedClient(...), ExtendedClient(...)]
extended_server = ExtendedServer(...)
api = API(extended_client, extended_server)
api.run() # execute training
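
One way to picture what attach does is a manager that takes a class and returns a subclass with an extra ability mixed in. The sketch below is an assumption about the general pattern, not AIJack's implementation; Server and LoggingServerManager are toy classes defined only for this example.

```python
class Server:
    """Toy server that averages scalar client updates."""

    def __init__(self):
        self.received = []

    def receive(self, update):
        self.received.append(update)

    def aggregate(self):
        return sum(self.received) / len(self.received)


class LoggingServerManager:
    """Toy manager: attach() returns a subclass that also records every update."""

    def __init__(self, log):
        self.log = log

    def attach(self, cls):
        log = self.log

        class Wrapper(cls):
            def receive(self, update):
                log.append(update)       # the added ability
                super().receive(update)  # the original behavior is kept

        Wrapper.__name__ = f"Logging{cls.__name__}"
        return Wrapper


log = []
manager = LoggingServerManager(log)
LoggingServer = manager.attach(Server)

server = LoggingServer()
for update in (1.0, 2.0, 3.0):
    server.receive(update)

print(server.aggregate())  # 2.0
print(log)                 # [1.0, 2.0, 3.0]
```

Because attach returns a class rather than an instance, the extended class can be instantiated and passed to an API object exactly like the original one.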

For example, the code below implements a scenario where the server in Federated Learning tries to steal the training data with a gradient-based model inversion attack.

from aijack.collaborative.fedavg import FedAVGAPI, FedAVGClient, FedAVGServer
from aijack.attack.inversion import GradientInversionAttackServerManager

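# input_shape, model_1, model_2, model_3, criterion, optimizers, and dataloaders
# are placeholders that you define for your own task.
manager = GradientInversionAttackServerManager(input_shape)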
FedAVGServerAttacker = manager.attach(FedAVGServer)

clients = [FedAVGClient(model_1), FedAVGClient(model_2)]
server = FedAVGServerAttacker(clients, model_3)

api = FedAVGAPI(server, clients, criterion, optimizers, dataloaders)
api.run()
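
Why can a server learn anything about training data from gradients? A simple special case gives the intuition: for a single sample passing through a linear layer with a bias, the weight gradient is the bias gradient multiplied by the input, so the input can be read off by division. The pure-Python sketch below demonstrates this analytic case. It is not how AIJack implements the attack, and all function names are hypothetical.

```python
def linear_layer_gradients(weights, bias, x, target):
    """Gradients of 0.5 * ||W x + b - target||^2 w.r.t. W and b for one sample."""
    residual = [
        sum(w * xj for w, xj in zip(row, x)) + b - t
        for row, b, t in zip(weights, bias, target)
    ]
    grad_w = [[r * xj for xj in x] for r in residual]  # dL/dW[i][j] = residual[i] * x[j]
    grad_b = residual                                  # dL/db[i]    = residual[i]
    return grad_w, grad_b


def recover_input(grad_w, grad_b, eps=1e-12):
    """Recover x from shared gradients using x[j] = grad_w[i][j] / grad_b[i]."""
    for row, gb in zip(grad_w, grad_b):
        if abs(gb) > eps:
            return [gw / gb for gw in row]
    raise ValueError("all bias gradients are zero; the input cannot be recovered")


# A client computes gradients on its private sample and shares them.
weights = [[0.2, -0.5, 0.1], [0.7, 0.3, -0.4]]
bias = [0.1, -0.2]
private_x = [1.5, -2.0, 0.25]
target = [1.0, 0.0]
grad_w, grad_b = linear_layer_gradients(weights, bias, private_x, target)

# The server sees only the gradients, yet reconstructs the sample.
recovered_x = recover_input(grad_w, grad_b)
print(recovered_x)
```

Deeper models and batched updates do not admit this closed form, which is why practical gradient inversion attacks search for an input whose gradients match the observed ones.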

AIValut: A simple DBMS for debugging ML Models

We also provide a simple DBMS named AIValut, designed specifically for SQL-based algorithms. AIValut currently supports Rain, a SQL-based debugging system for ML models. In the future, we plan to integrate additional advanced features from AIJack, including K-Anonymity, Homomorphic Encryption, and Differential Privacy.

AIValut has its own storage engine and query parser, and you can train and debug ML models with SQL-like queries. For example, the Complaint query automatically removes problematic records given the specified constraint.

# We train an ML model to classify whether each customer will go bankrupt, based on their age and debt.
# We want the trained model to classify a customer as positive when their debt is greater than or equal to 100.
# The 10th record seems to violate the above constraint.
>>Select * From bankrupt
id age debt y
1 40 0 0
2 21 10 0
3 22 10 0
4 32 30 0
5 44 50 1
6 30 100 1
7 63 310 1
8 53 420 1
9 39 530 1
10 49 1000 0

# Train Logistic Regression with 100 iterations and a learning rate of 1.
# The name of the target feature is `y`, and we use all other features as training data.
>>Logreg lrmodel id y 100 1 From Select * From bankrupt
Trained Parameters:
 (0) : 2.771564
 (1) : -0.236504
 (2) : 0.967139
AUC: 0.520000
Prediction on the training data is stored at `prediction_on_training_data_lrmodel`

# Remove one record so that the model will predict `positive (class 1)` for the samples with `debt` greater than or equal to 100.
>>Complaint comp Shouldbe 1 Remove 1 Against Logreg lrmodel id y 100 1 From Select * From bankrupt Where debt Geq 100
Fixed Parameters:
 (0) : -4.765492
 (1) : 8.747224
 (2) : 0.744146
AUC: 1.000000
Prediction on the fixed training data is stored at `prediction_on_training_data_comp_lrmodel`

For more detailed information and usage instructions, please refer to aivalut/README.md.
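
To build intuition for what the Complaint query asks for, the sketch below brute-forces the same question in pure Python: which single record, when removed before retraining, makes the model most confident that rows with debt of at least 100 are positive? This leave-one-out retraining is a stand-in chosen for clarity; it is not the algorithm that AIValut or Rain uses. The feature scaling, iteration count, and learning rate are assumptions of this sketch and differ from the session above.

```python
import math

# (id, age, debt, y) rows copied from the `bankrupt` table above.
ROWS = [
    (1, 40, 0, 0), (2, 21, 10, 0), (3, 22, 10, 0), (4, 32, 30, 0),
    (5, 44, 50, 1), (6, 30, 100, 1), (7, 63, 310, 1), (8, 53, 420, 1),
    (9, 39, 530, 1), (10, 49, 1000, 0),
]


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict(w, row):
    """Positive-class probability; age and debt are rescaled (a sketch assumption)."""
    return sigmoid(w[0] + w[1] * row[1] / 100.0 + w[2] * row[2] / 1000.0)


def train(rows, n_iter=2000, lr=0.5):
    """Fit logistic regression (bias, age, debt) with batch gradient descent."""
    w = [0.0, 0.0, 0.0]
    for _ in range(n_iter):
        grad = [0.0, 0.0, 0.0]
        for row in rows:
            err = predict(w, row) - row[3]
            grad[0] += err
            grad[1] += err * row[1] / 100.0
            grad[2] += err * row[2] / 1000.0
        w = [wi - lr * gi / len(rows) for wi, gi in zip(w, grad)]
    return w


def complaint_score(w, rows):
    """Sum of positive-class probabilities over the rows with debt >= 100."""
    return sum(predict(w, row) for row in rows if row[2] >= 100)


def best_single_removal(rows):
    """Retrain once per candidate removal; return the best id and all scores."""
    scores = {}
    for removed in rows:
        kept = [r for r in rows if r[0] != removed[0]]
        scores[removed[0]] = complaint_score(train(kept), rows)
    return max(scores, key=scores.get), scores


best_id, scores = best_single_removal(ROWS)
print("record to remove:", best_id)
```

On this table the search is expected to single out record 10, in line with the comment in the session above. Retraining once per record scales poorly beyond toy tables, so the sketch is only an illustration of the goal, not a practical method.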

Please use AIValut for research purposes only.

Resources

You can also find more examples in our tutorials and documentation.

Supported Algorithms

Category      | Type                   | Algorithms
Collaborative | Horizontal FL          | FedAVG, FedProx, FedKD, FedGEMS, FedMD, DSFL
Collaborative | Vertical FL            | SplitNN, SecureBoost
Attack        | Model Inversion        | MI-FACE, DLG, iDLG, GS, CPL, GradInversion, GAN Attack
Attack        | Label Leakage          | Norm Attack
Attack        | Poisoning              | History Attack, Label Flip, MAPF, SVM Poisoning
Attack        | Backdoor               | DBA
Attack        | Free-Rider             | Delta-Weight
Attack        | Evasion                | Gradient-Descent Attack
Attack        | Membership Inference   | Shadow Attack
Defense       | Homomorphic Encryption | Paillier
Defense       | Differential Privacy   | DPSGD, AdaDPS
Defense       | Anonymization          | Mondrian
Defense       | Debugging              | Model Assertions, Rain, Neuron Coverage
Defense       | Others                 | Soteria, FoolsGold, MID, Sparse Gradient

Contact

welcome2aijack[@]gmail.com

Citation

@software{Hideaki_AIJack_2023,
author = {Hideaki, Takahashi},
month = jun,
title = {{AIJack}},
url = {https://github.com/Koukyosyumei/AIJack},
year = {2023}
}