mpc4j
Introduction
Multi-Party Computation for Java (mpc4j) is an efficient and easy-to-use Secure Multi-Party Computation (MPC) and Differential Privacy (DP) library mainly written in Java.
mpc4j aims to provide an academic library for researchers to study and develop MPC/DP in a unified manner. Because mpc4j tries to provide state-of-the-art MPC/DP implementations, researchers can use the library to compare the new algorithms and protocols they propose against existing ones fairly and quickly.
Note that mpc4j is mainly focused on research and assumes a very strong system model. Specifically, mpc4j assumes never-crash nodes with a fully synchronized network. In practice, crash-recovery nodes with a partially synchronized network would be a more reasonable system model. Aside from the system model, mpc4j tries to integrate tools that are suitable for use in production environments. We emphasize that additional engineering problems need to be solved if you want to develop your own MPC/DP applications. A reasonable approach is to implement the communication APIs on your own, develop protocols by calling tools in mpc4j, and refer to the protocol implementations in mpc4j as prototypes.
Features
mpc4j has the following features:
- aarch64 support: mpc4j can run on both x86_64 and aarch64. Researchers can develop and test protocols on a MacBook M1 (aarch64) and then run experiments on Linux (x86_64).
- SM series support: Developers may want to use SM series algorithms (SM2 for public-key operations, SM3 for hashing, and SM4 for block cipher operations) instead of regular algorithms (like secp256k1 for public-key operations, SHA256 for hashing, and AES for block cipher operations). Also, the SM series algorithms are accepted by ISO/IEC, so it may be necessary to support SM series algorithms under MPC settings. mpc4j leverages Bouncy Castle to support SM series algorithms.
Contact
mpc4j is mainly developed by Weiran Liu. Feel free to contact me at [email protected].
- The submodules involving Fully Homomorphic Encryption (FHE) are mainly developed by Liqiang Peng and Qixian Zhou.
- The submodules involving Vector Oblivious Linear Evaluation (VOLE) are mainly developed by Hanwen Feng.
- The components of TFHE are developed by Zhen Gu of Computing Technology Lab (CTL) in Damo, Alibaba. The rest of their TFHE implementation, which extends SEAL, will be released later in their FHE library.
- The FourQ-related implementations and mobile PSI-friendly OPRF (i.e., single-query OPRF) are developed by Qixian Zhou.
- The submodules for circuits and operations based on the Boolean/arithmetic circuits are mainly developed by Li Peng.
Who Uses mpc4j
Currently, DataTrust is powered by mpc4j. If your project uses mpc4j and you do not mind it appearing here, don't hesitate to get in touch with me.
Academic Implementations
Some Implementations of our Works
If you want to test and evaluate our protocol implementations, compile and run the corresponding jar file with the config file. For example, if you want to run implementations related to PSU in the package mpc4j-s2pc-pso, you can first find example config files located in conf/psu in mpc4j-s2pc-pso, and then run java -jar mpc4j-s2pc-pso-X.X.X-jar-with-dependencies.jar conf_file_name.txt separately on two platforms with direct network connections (using the network channel assigned in the config files) or in two terminals on one platform (using the local network address 127.0.0.1). Note that **you need to run the server first and then run the client.** The server and the client implicitly synchronize before running the protocol, and the first step is that the client sends something like "hello" to the server. If the server is offline at that time, the program will get stuck.
- Our paper "Efficient Private Multiset ID Protocols" was accepted to ICICS 2023. Package pmid in mpc4j-s2pc-pso contains the implementation of this paper. The configuration files are under conf/pmid in mpc4j-s2pc-pso.
- Our paper "Linear Private Set Union from Multi-Query Reverse Private Membership Test" was accepted to USENIX Security 2023. Package psu in mpc4j-s2pc-pso contains the implementation of this paper. The configuration files are under conf/psu in mpc4j-s2pc-pso.
- Our paper "OpBoost: A Vertical Federated Tree Boosting Framework Based on Order-Preserving Desensitization" was accepted to VLDB 2023. Module mpc4j-sml-opboost contains the implementation of this paper. The configuration files are under conf in mpc4j-sml-opboost.
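The start order described above (server first, then a client that opens with something like "hello") can be illustrated with a small sketch. This is not the mpc4j communication API; the class name HelloHandshakeSketch, the method runHandshake, and the use of plain TCP sockets on the loopback address are assumptions made purely for illustration.

```java
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.charset.StandardCharsets;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

// Hypothetical illustration only: this is NOT how mpc4j implements its channels.
class HelloHandshakeSketch {
    /** Starts the server first, then lets a client send "hello"; returns what the server read. */
    static String runHandshake() throws Exception {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        // Port 0 asks the OS for a free port; the socket is listening as soon as it is created.
        try (ServerSocket serverSocket = new ServerSocket(0, 1, InetAddress.getLoopbackAddress())) {
            // Server side: wait for one connection and read a single line.
            Future<String> received = executor.submit(() -> {
                try (Socket connection = serverSocket.accept();
                     BufferedReader reader = new BufferedReader(
                         new InputStreamReader(connection.getInputStream(), StandardCharsets.UTF_8))) {
                    return reader.readLine();
                }
            });
            // Client side: connect only after the server socket exists, then say hello.
            try (Socket client = new Socket(InetAddress.getLoopbackAddress(), serverSocket.getLocalPort());
                 Writer writer = new OutputStreamWriter(client.getOutputStream(), StandardCharsets.UTF_8)) {
                writer.write("hello\n");
                writer.flush();
            }
            return received.get(5, TimeUnit.SECONDS);
        } finally {
            executor.shutdownNow();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("server received: " + runHandshake());
    }
}
```

In this sketch the client connects only after the server socket is listening. If the order were reversed, the connection attempt would have no listening peer, which is the situation the note above warns about.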
Some Implementations of Existing Works
mpc4j contains some implementations of existing works. See PAPERS.md for more details.
References
mpc4j includes some implementation ideas and code from the following open-source libraries.
Included Libraries
Here are some libraries that are included in mpc4j.
- smile: A fast and comprehensive machine learning, NLP, linear algebra, graph, interpolation, and visualization system in Java and Scala. We learned many details of implementing machine learning tasks from this library. We also introduce some of its code into mpc4j for dataset management and our privacy-preserving federated GBDT implementation. See package edu.alibaba.mpc4j.common.data in mpc4j-common-data and package edu.alibaba.mpc4j.sml.smile in mpc4j-sml-opboost for details. Note that we introduce source code that is released only under the GNU Lesser General Public License v3.0 (LGPLv3).
- Javallier: A Java library for Paillier partially homomorphic encryption based on python-paillier, with modifications to additionally support other schemes and optimizations. See mpc4j-crypto-phe for details.
- JNA GMP project: A JNA wrapper around the GNU Multiple Precision Arithmetic Library. We modified the code to support aarch64 systems. See mpc4j-common-jna-gmp for details.
- Bouncy Castle: A Java implementation of cryptographic algorithms, developed by the Legion of the Bouncy Castle, a registered Australian Charity. We learned many details of how to efficiently implement cryptographic algorithms in Java. We introduce its X25519 and Ed25519 implementations in mpc4j to support efficient Elliptic Curve Cryptography (ECC) operations. See package edu.alibaba.mpc4j.common.tool.crypto.ecc.bc in mpc4j-common-tool for details.
- Rings: An efficient, lightweight library for commutative algebra. We learned how to do algebraic operations efficiently from this library. We wrap its polynomial interpolation implementations in mpc4j. See package edu.alibaba.mpc4j.common.tool.polynomial in mpc4j-common-tool for details. We also provide JdkIntegersZp, which uses JNA GMP to implement operations in $\mathbb{Z}_p$. See JdkIntegersZp in mpc4j-common-tool for details.
- blake2: Faster cryptographic hash function implementations. We introduce its original implementations and compare the efficiency with Java counterparts provided by Bouncy Castle and other hash functions (e.g., blake3). See crypto/blake2 in mpc4j-native-tool for details.
- blake3: Much faster cryptographic hash function implementations. We introduce its original implementations and compare the efficiency with Java counterparts provided by Bouncy Castle and other hash functions (e.g., blake2). See crypto/blake3 in mpc4j-native-tool for details.
- emp-toolkit: Efficient bit-matrix transpose (see bit_matrix_trans in mpc4j-native-tool), AES-NI implementations (see crypto/aes.h in mpc4j-native-tool), and efficient $GF(2^\kappa)$ operations (see gf2k in mpc4j-native-tool).
- KyberJCE: Kyber is an IND-CCA2-secure key encapsulation mechanism (KEM), whose security is based on the hardness of solving the learning-with-errors (LWE) problem over module lattices. KyberJCE is a pure-Java implementation of Kyber. We introduce its Kyber implementation in mpc4j to support post-quantum secure oblivious transfer. See crypto/kyber in mpc4j-native-tool for details.
- xgboost-predictor: Pure Java implementation of XGBoost predictor for online prediction tasks. This work is released under the Apache License 2.0. We learned the format of the XGBoost model from this library. We also introduce some of its code in mpc4j for our privacy-preserving federated XGBoost implementation. See packages ai.h2o.algos.tree and biz.k11i.xgboost in mpc4j-sml-opboost for details.
- curve25519-elisabeth: A pure-Java implementation of group operations on Curve25519. We introduce its Ed25519 and Ristretto implementations in mpc4j. See package crypto/ecc/cafe for details.
- FourQlib: A library that implements essential elliptic curve and cryptographic functions based on FourQ, a high-security, high-performance elliptic curve that targets the 128-bit security level. We rewrote the makefile so that FourQ can now run on MacBook.
Inspired Libraries
Here are some libraries that inspire our implementations.
- mobile_psi_cpp: A C++ library implementing several OPRF protocols and using them for Private Set Intersection. We introduce its LowMC parameters and encryption implementations in mpc4j. See edu.alibaba.mpc4j.common.tool.crypto.prp.JdkBytesLowMcPrp and edu.alibaba.mpc4j.common.tool.crypto.prp.JdkLongsLowMcPrp in mpc4j-common-tool for details.
- emp-toolkit: We follow the implementation of the Silent OT protocol presented in the paper "Ferret: Fast Extension for coRRElated oT with Small Communication," accepted at CCS 2020 (see cot in mpc4j-s2pc-pcg).
- Kunlun: A C++ wrapper for OpenSSL, making it handy to use without worrying about cumbersome memory management and memorizing complex interfaces. Based on this wrapper, Kunlun builds an efficient and modular crypto library. We introduce its OpenSSL wrapper for Elliptic Curve and the Window Method implementation in mpc4j. See ecc_openssl in mpc4j-native-tool for details.
- PSI-analytics: The implementation of the protocols presented in the paper "Private Set Operations from Oblivious Switching," accepted at PKC 2021. We introduce its switching network implementations in mpc4j. See package benes_network in mpc4j-native-tool for details.
- Diffprivlib: A general-purpose library for experimenting with, investigating, and developing applications in differential privacy. We learned how to organize source code for implementing differential privacy mechanisms. See mpc4j-dp-cdp for details.
- b2_exponential_mechanism: An exponential mechanism implementation with base-2 differential privacy. We re-implement the base-2 exponential mechanism in mpc4j. See package edu.alibaba.mpc4j.dp.cdp.nomial for details.
- libOTe: Implementations for many Oblivious Transfer (OT) protocols, especially the Silent OT protocol presented in the paper "Silver: Silent VOLE and Oblivious Transfer from Hardness of Decoding Structured LDPC Codes" accepted at CRYPTO 2021 (see package cot in mpc4j-s2pc-pcg).
- PSU: The implementation of the paper "Scalable Private Set Union from Symmetric-Key Techniques," published in ASIACRYPT 2019. We introduce its fast polynomial interpolation implementations in mpc4j. See package ntl_poly in mpc4j-native-tool for details. The PSU implementation is in package psu of mpc4j-s2pc-pso.
- PSU: The implementation of the paper "Shuffle-based Private Set Union: Faster and More," published in USENIX Security 2022. We introduce the idea of how to concurrently run the Oblivious Switching Network (OSN) in mpc4j. See package psu in mpc4j-s2pc-pso for details.
- SpOT-PSI: The implementation of the paper "SpOT-Light: Lightweight Private Set Intersection from Sparse OT Extension," published in CRYPTO 2019. We introduce many ideas for fast polynomial interpolations in mpc4j. See package polynomial in mpc4j-common-tool for details.
- OPRF-PSI: The implementation of the paper "Private Set Intersection in the Internet Setting From Lightweight Oblivious PRF," published in CRYPTO 2020. We introduce its OPRF implementations in mpc4j. See oprf in mpc4j-s2pc-pso for details.
- APSI: The implementation of the paper "Labeled PSI from Homomorphic Encryption with Reduced Computation and Communication," published in CCS 2021. From its source code, we learned how to use the Fully Homomorphic Encryption (FHE) library SEAL. Most of the code for Unbalanced Private Set Intersection (UPSI) is adapted from APSI. We also adapt the encoding part of 6857-private-categorization to support arbitrary bit-length elements. See mpc4j-native-fhe and upsi in mpc4j-s2pc-pso for details.
- MiniPSI: The implementation of the paper "Compact and Malicious Private Set Intersection for Small Sets," published in CCS 2021. We learned how to implement Elligator encoding/decoding functions on Curve25519. See package crypto/ecc/bc/X25519BcByteMulElligatorEcc in mpc4j-common-tool for details.
- Ed25519: Ed25519 for Go. We learned how to implement Elligator in Ed25519. See package crypto/ecc/bc/X25519BcByteMulElligatorEcc in mpc4j-common-tool for details.
- dgs: Discrete Gaussians over the Integers. We learned many ways of discrete Gaussian sampling. See package common/sampler/integral/gaussian in mpc4j-common-sampler for details.
- Pure-DP: A Python package that provides simple implementations of various state-of-the-art LDP algorithms (both Frequency Oracles and Heavy Hitters) with the main goal of providing a single, simple interface to benchmark and experiment with these algorithms. We learned many efficient LDP implementation details.
- PantheonPIR, SimplePIR, MulPIR, Constant-weight PIR, FastPIR, Onion-PIR, SealPIR, and XPIR: We learned many details for implementing PIR schemes. We re-implement some protocols based on SEAL instead of NFLlib, since we found that we cannot compile NFLlib on MacBook M1 with aarch64.
Acknowledge
We thank Prof. Benny Pinkas and Dr. Avishay Yanai for many discussions on implementing Private Set Intersection protocols. They also greatly helped our Java implementations of Oblivious Key-Value Storage (OKVS) presented in the paper "Oblivious Key-Value Stores and Amplification for Private Set Intersection," accepted at CRYPTO 2021. See package okve/okvs in mpc4j-common-tool for more details.
We thank Dr. Stanislav Poslavsky and Prof. Benny Pinkas for many discussions on implementations of fast polynomial interpolations when we tried to implement the PSI protocol presented in the paper "SpOT-Light: Lightweight Private Set Intersection from Sparse OT Extension."
We thank Prof. Mike Rosulek for the discussions about the implementation of Private Set Union (PSU). Their implementation for the paper "Private Set Operations from Oblivious Switching" helped us greatly in understanding how to implement PSU.
We thank Prof. Xiao Wang for discussions about fast bit-matrix transpose. From the discussion, we learned that the basic idea of fast bit-matrix transpose comes from the blog post "The Full SSE2 Bit Matrix Transpose Routine." He also helped me realize that there exists an efficient polynomial operation implementation. See galoisfield/gf2k in mpc4j-common-tool for more details.
We thank Prof. Peihan Miao for discussions about the implementation of the paper "Private Set Intersection in the Internet Setting From Lightweight Oblivious PRF." From the discussion, we learned that there is a special case for the lightweight OPRF. See oprf in mpc4j-s2pc-pso for more details.
We thank Prof. Yu Chen for many discussions on various MPC protocols. Here we recommend his open-source library Kunlun, a modern crypto library. We thank Minglang Dong for her example code for implementing the Window Method for fixed-base multiplication in ECC.
We thank Dr. Bolin Ding for many discussions on introducing MPC into the database field. Here we recommend the open-source library FederatedScope, an easy-to-use federated learning package, from his team.
We thank the anonymous USENIX Security 2023 Artifact Evaluation (AE) reviewers for many suggestions for the mpc4j documentation and for mpc4j-native-tool. These suggestions helped us fix many memory leaks. Their comments also helped us remove a lot of duplicated code.
License
This library is licensed under Apache License 2.0.
Specifications
C/C++ Modules
Most of the code is written in Java, except for very efficient implementations in C/C++. You need OpenSSL, GMP, NTL, MCL, libsodium, and our rewritten FourQ (in mpc4j-native-fourq) to compile mpc4j-native-tool, and SEAL 4.0.0 to compile mpc4j-native-fhe. Please see README.md in mpc4j-native-tool and mpc4j-native-fhe for how to install the C/C++ dependencies.
After successfully obtaining the compiled C/C++ libraries (named libmpc4j-native-tool and libmpc4j-native-fhe, respectively), you need to specify the native library location using -Djava.library.path when running mpc4j.
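Before launching, it can help to confirm that the directories passed via -Djava.library.path actually contain the two native libraries. The following is a minimal sketch under stated assumptions: the class NativeLibraryLocator and its locate method are hypothetical helpers, not part of mpc4j. It relies only on the standard System.mapLibraryName call to obtain the platform-specific file name (for example, a lib prefix and a .so or .dylib suffix).

```java
import java.io.File;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Optional;

// Hypothetical helper for illustration; it is not shipped with mpc4j.
class NativeLibraryLocator {
    /**
     * Returns the first file in the given library path (directories joined by the
     * platform path separator) that matches the platform-specific name of libName.
     */
    static Optional<Path> locate(String libName, String libraryPath) {
        String fileName = System.mapLibraryName(libName);
        for (String dir : libraryPath.split(File.pathSeparator)) {
            if (dir.isEmpty()) {
                continue;
            }
            Path candidate = Paths.get(dir, fileName);
            if (Files.isRegularFile(candidate)) {
                return Optional.of(candidate);
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        String libraryPath = System.getProperty("java.library.path", "");
        // Library names taken from the paragraph above.
        for (String libName : new String[]{"mpc4j-native-tool", "mpc4j-native-fhe"}) {
            String result = locate(libName, libraryPath)
                .map(Path::toString)
                .orElse("NOT FOUND in java.library.path");
            System.out.println(libName + ": " + result);
        }
    }
}
```

Running the class with the same -Djava.library.path value that you plan to use for mpc4j prints where each library was found, or reports that it is missing.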
Tests
mpc4j has been tested on macOS (x86_64 / aarch64), Ubuntu 20.04 (x86_64 / aarch64), and CentOS 8 (x86_64). We welcome developers to run tests on other platforms.
Note that you may need to run the test cases in mpc4j-s2pc-pir separately, especially those in IndexPirTest and KwPirTest. The reason is that PIR and related implementations consume a lot of main memory, and running all test cases directly may trigger frequent full GC, which can cause problems.
Performances
We have received a lot of suggestions and some performance reports from users. We thank Dr. Yongha Son for providing performance reports for Private Set Union (PSU) on his development platform (Intel Xeon 3.5GHz) under the Unit Test. He reported that:
Well, I tested other protocols, particularly JSZ22 SFC, GMR21, and KRTW19, from unit tests.
JSZ22 takes 4x faster time.
KRTW19 and GMR21 take 1.5x slower.
ZCL22 takes 2.5-3x slower time.
than the reported numbers in ZCL22.
We had an in-depth discussion about the performance gap. The likely reasons are as follows:
- In the unit tests, we use an optimized way of implementing JSZ22. Roughly speaking, we use the batched related-key OPRF proposed by Kolesnikov et al. instead of the more general multi-point OPRF proposed by Chase and Miao to speed up the underlying OPRF. This works because JSZ22 uses cuckoo hashing to bin the input elements, which suits the batched related-key OPRF. See our paper "Private Set Operations from Multi-Query Reverse Private Membership Test" for more details.
- As far as we know, server CPUs (like Intel Xeon 3.5GHz) provide more efficient instructions than desktop CPUs (like Intel i9900k). Note that NTL and GMP automatically detect the underlying platform and choose the most efficient configuration. We suspect these instructions help the NTL and GMP libraries run faster, while they seem to bring little help to ECC operations. As a comparison, Dr. Yongha Son ran EccEfficiencyTest on his platform. The result shows that ECC operations on his platform with asm are much slower (about 5x) than on our MacBook M1 platform without asm.
We have to say that we underestimated the performance gap between different platforms. The comparison also shows that having fair comparisons between different protocols is very challenging. Nevertheless, we still try to provide a unified library that enables relatively fair comparisons.
Notes for Running on aarch64
When using or developing mpc4j on aarch64 systems (like MacBook M1), you may get java.lang.UnsatisfiedLinkError with a description like "no mpc4j-native-tool / mpc4j-native-fhe in java.library.path", even if you have correctly compiled the native libraries and configured the native library paths using -Djava.library.path. The reason is that some Java Virtual Machines (JVMs) with versions lower than 17 do not fully support aarch64. The JDK 17 Release Notes state (in JEP 391: macOS / AArch64 Port):
macOS 11.0 now supports the AArch64 architecture. This JEP implements support for the macos-aarch64 platform in the JDK. One of the features added is support for the W^X (write xor execute) memory. It is enabled only for macos-aarch64 and can be extended to other platforms at some point. The JDK can be either cross-compiled on an Intel machine or compiled on an Apple M1-based machine.
We recommend using Java 17 (or higher) to run or develop mpc4j on aarch64 systems. If you still want to use a Java version lower than 17, we tested many JVMs and found that Azul Zulu fully supports aarch64.
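The recommendation above can be turned into a quick self-check at startup. The sketch below is a heuristic under stated assumptions: the class Aarch64JvmCheck and its methods are hypothetical and not part of mpc4j, and the check assumes the JVM reports aarch64 in the standard os.arch property. It only prints a hint, since some JVMs below 17 (such as Azul Zulu, mentioned above) work fine.

```java
// Hypothetical startup check for illustration; it is not shipped with mpc4j.
class Aarch64JvmCheck {
    /** Maps java.specification.version to a feature number, e.g. "1.8" -> 8 and "17" -> 17. */
    static int featureVersion(String specVersion) {
        String[] parts = specVersion.split("\\.");
        int first = Integer.parseInt(parts[0]);
        // Versions up to Java 8 are reported as 1.x; later ones as a plain number.
        return (first == 1 && parts.length > 1) ? Integer.parseInt(parts[1]) : first;
    }

    /** Returns true when running on aarch64 with a Java version lower than 17. */
    static boolean shouldWarn(String osArch, String specVersion) {
        return "aarch64".equals(osArch) && featureVersion(specVersion) < 17;
    }

    public static void main(String[] args) {
        String osArch = System.getProperty("os.arch");
        String specVersion = System.getProperty("java.specification.version");
        if (shouldWarn(osArch, specVersion)) {
            System.err.println("Hint: aarch64 with Java " + specVersion
                + " detected; if native libraries fail to load, try Java 17 or higher.");
        } else {
            System.out.println("os.arch=" + osArch + ", Java " + specVersion + ": no hint needed.");
        }
    }
}
```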
Development
Development Guideline
We develop mpc4j using IntelliJ IDEA and CLion. After successfully compiling mpc4j-native-tool and mpc4j-native-fhe, you need to configure IDEA with the following steps so that IDEA can link to these native libraries.
- Open Run->Edit Configurations...
- Open Edit Configuration templates...
- Select JUnit.
- Add the following command into VM Options:
-Djava.library.path=/YOUR_MPC4J_ABSOLUTE_PATH/mpc4j-native-tool/cmake-build-release:/YOUR_MPC4J_ABSOLUTE_PATH/mpc4j-native-fhe/cmake-build-release
Demonstration
We thank Qixian Zhou for writing a guideline that demonstrates how to configure the development environment on macOS (x86_64). We believe this guideline also applies to other platforms, e.g., macOS (M1), Ubuntu, and CentOS. Here are the steps:
- Follow any guideline to install JDK 8 and IntelliJ IDEA. If JDK 8 is installed successfully, executing java -version in the terminal prints information similar to the following:
java version "1.8.0_301"
Java(TM) SE Runtime Environment (build 1.8.0_301-b09)
Java HotSpot(TM) 64-Bit Server VM (build 25.301-b09, mixed mode)
- Use git clone https://github.com/alibaba-edu/mpc4j.git to clone the mpc4j source code.
- Follow the documentation in https://github.com/alibaba-edu/mpc4j/tree/main/mpc4j-native-tool to compile mpc4j-native-tool. If all steps are correct, you will see:
[100%] Linking CXX shared library libmpc4j-native-tool.dylib
[100%] Built target mpc4j-native-tool
- Follow the documentation in https://github.com/alibaba-edu/mpc4j/tree/main/mpc4j-native-fhe to compile mpc4j-native-fhe. If all steps are correct, you will see:
[100%] Linking CXX shared library libmpc4j-native-fhe.dylib
[100%] Built target mpc4j-native-fhe
- Use IntelliJ IDEA to open mpc4j.
- Open Run->Edit Configurations...
- Open Edit Configuration templates...
- Select JUnit, and add the following command into VM Options (note that you must replace /YOUR_MPC4J_ABSOLUTE_PATH with your own absolute path for libmpc4j-native-tool.dylib and libmpc4j-native-fhe.dylib):
-Djava.library.path=/YOUR_MPC4J_ABSOLUTE_PATH/mpc4j-native-tool/cmake-build-release:/YOUR_MPC4J_ABSOLUTE_PATH/mpc4j-native-fhe/cmake-build-release
- Now you can run the tests of any submodule by clicking the green arrows shown to the left of the source code in the test packages.
TODO List
Possible Missions
- Provide more documentation.
- Translate JavaDoc and comments into English.
- We are still adjusting our implementations of many Private Set Intersection protocols. We will release the source code as soon as it is available.
- More secure two-party computation (2PC) protocol implementations.
- More secure three-party computation (3PC) protocol implementations. Specifically, release the source code of our paper "Scape: Scalable Collaborative Analytics System on Private Database with Malicious Security" accepted at ICDE 2022.
- More differentially private algorithms and protocols, especially for the Shuffle Model implementations of our paper "Privacy Enhancement via Dummy Points in the Shuffle Model."
Impossible Missions, but We Will Try
- What about implementing "Deep Learning with Differential Privacy" and its following works using Java, e.g., based on Deep Java Library?
- (Suggested by Prof. Joe Near) What about implementing Distributed Noise Generation protocols, like "Our Data, Ourselves: Privacy via Distributed Noise Generation"?