HH-suite3 for sensitive sequence searching

The HH-suite is an open-source software package for sensitive protein sequence searching based on the pairwise alignment of hidden Markov models (HMMs).

Documentation

We provide an extensive user guide with many usage examples, frequently asked questions and guides to build your own databases.

Installation

HH-suite3 can be installed via conda, via Docker, or by downloading a statically compiled version. HH-suite3 requires a 64-bit system (check with uname -a | grep x86_64). On AMD/Intel CPUs, it requires at least support for the SSE2 instruction set (check by executing cat /proc/cpuinfo | grep sse2 on Linux or sysctl -a | grep machdep.cpu.features | grep SSE2 on macOS). AVX2 is roughly 2x faster than SSE2. HH-suite3 also works on Linux systems with ARM64 and PPC64LE CPUs. Precompiled binaries for all supported systems can be found at mmseqs.com/hhsuite.

# install via conda
conda install -c conda-forge -c bioconda hhsuite 
# install via Docker
docker pull soedinglab/hh-suite
# static SSE2 build
wget https://github.com/soedinglab/hh-suite/releases/download/v3.3.0/hhsuite-3.3.0-SSE2-Linux.tar.gz; tar xvfz hhsuite-3.3.0-SSE2-Linux.tar.gz; export PATH="$(pwd)/bin:$(pwd)/scripts:$PATH"
# static AVX2 build
wget https://github.com/soedinglab/hh-suite/releases/download/v3.3.0/hhsuite-3.3.0-AVX2-Linux.tar.gz; tar xvfz hhsuite-3.3.0-AVX2-Linux.tar.gz; export PATH="$(pwd)/bin:$(pwd)/scripts:$PATH"
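
The SSE2/AVX2 check described above can be scripted to suggest which static build to download. This is a minimal sketch for x86_64 Linux only: the function name pick_hhsuite_build is hypothetical, and it assumes the CPU flags are listed in /proc/cpuinfo (ARM64 and PPC64LE systems will report no match here even though HH-suite3 supports them).

```shell
# Hypothetical helper: print "AVX2" or "SSE2" depending on the CPU flags.
# $1 is an optional cpuinfo-style file (defaults to /proc/cpuinfo).
pick_hhsuite_build() {
    cpuinfo="${1:-/proc/cpuinfo}"
    if grep -qw avx2 "$cpuinfo" 2>/dev/null; then
        echo "AVX2"
    elif grep -qw sse2 "$cpuinfo" 2>/dev/null; then
        echo "SSE2"
    else
        # Neither flag found (or the file is missing)
        return 1
    fi
}

if build="$(pick_hhsuite_build)"; then
    echo "Suggested archive: hhsuite-3.3.0-${build}-Linux.tar.gz"
else
    echo "No SSE2/AVX2 flags found; check mmseqs.com/hhsuite for other platforms"
fi
```

The archive name follows the pattern of the two wget commands above; the download itself is left to those commands.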

❗ Only the self-compiled HH-suite3 version includes MPI support, since MPI configuration is specific to the local environment.

Available Databases

List of available databases for HH-suite3:

Uniclust30 [pub]
BFD (consists of 2.5 billion, mostly environmental, protein sequences) [pub]
Pfam/SCOP/PDB70/dbCAN

Also check out the databases (COG/ECOG/CD/...) maintained by the MPI Bioinformatics Toolkit [pub].

Compilation

To compile from source, you will need a recent C/C++ compiler (at least GCC 4.8 or Clang 3.6) and CMake 2.8.12 or later.
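
Before compiling, it can help to confirm that the installed CMake meets the minimum version. The following is a rough sketch, not part of HH-suite: version_ge is a hypothetical helper, and it assumes a sort that supports -V (GNU coreutils and recent macOS) and that cmake --version prints the version as the third word of its first line.

```shell
# Hypothetical helper: succeeds if version $1 is >= version $2.
# Relies on sort -V (version sort) putting the smaller version first.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

if command -v cmake >/dev/null 2>&1; then
    # First line looks like: "cmake version 3.22.1"
    cmake_version="$(cmake --version | head -n 1 | awk '{print $3}')"
    if version_ge "$cmake_version" "2.8.12"; then
        echo "cmake $cmake_version is recent enough"
    else
        echo "cmake $cmake_version is older than 2.8.12"
    fi
else
    echo "cmake not found in PATH"
fi
```

The same helper could be pointed at the compiler version, but extracting it portably from gcc or clang output is left out here.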

To download the source code and compile the HH-suite, execute the following commands:

git clone https://github.com/soedinglab/hh-suite.git
mkdir -p hh-suite/build && cd hh-suite/build
cmake -DCMAKE_INSTALL_PREFIX=. ..
make -j 4 && make install
export PATH="$(pwd)/bin:$(pwd)/scripts:$PATH"

❗ The default macOS clang compiler does not support OpenMP, so HH-suite3 built with it can only use a single thread. To compile HH-suite3 on macOS with multi-threading, first install the gcc compiler from Homebrew, then replace the cmake call above with the following one:

CC="$(brew --prefix)/bin/gcc-10" CXX="$(brew --prefix)/bin/g++-10" cmake -DCMAKE_INSTALL_PREFIX=. ..

Usage

To perform a single search iteration, run HHblits with the following command:

hhblits -i <input-file> -o <result-file> -n 1 -d <database-basename>

To generate an alignment of homologous sequences:

hhblits -i <input-file> -o <result-file> -oa3m <result-alignment> -d <database-basename>

A detailed list of options for HHblits is available by running HHblits with the -h parameter.
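
The two commands above can be combined and applied to several query files in a loop. This is only a sketch: run_hhblits_batch is a hypothetical helper, the .hhr and .a3m output names are an assumed naming convention derived from each input name, and only the options shown above (-i, -o, -oa3m, -n, -d) are used.

```shell
# Hypothetical helper: run one HHblits iteration per query file.
# $1 is the database basename; remaining arguments are query files.
run_hhblits_batch() {
    db="$1"
    shift
    for query in "$@"; do
        # Strip the last extension, e.g. query.fasta -> query
        base="${query%.*}"
        hhblits -i "$query" -o "${base}.hhr" -oa3m "${base}.a3m" -n 1 -d "$db"
    done
}

# Example call (placeholders, as in the commands above):
# run_hhblits_batch <database-basename> <input-file-1> <input-file-2>
```

Each query writes its result file and alignment next to the input, so outputs from different queries do not overwrite each other as long as the input names differ.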

Reference

Steinegger M, Meier M, Mirdita M, Vöhringer H, Haunsberger S J, and Söding J (2019) HH-suite3 for fast remote homology detection and deep protein annotation, BMC Bioinformatics, 473. doi: 10.1186/s12859-019-3019-7
