VCF visualization interface

Analyze VCFs and collaborate on solving rare diseases quicker

What is Scout?

  • Simple - Analyze variants in a simple to use web interface.
  • Aggregation - Combine results from multiple analyses and VCFs into a centralized database.
  • Collaboration - Write comments and share cases between users and institutes.

Documentation

This README only gives a brief overview of Scout. For a more complete reference, please check out our docs: https://clinical-genomics.github.io/scout

Runnable demo image - does not require installing software or a database

A simple demo instance of Scout only requires Docker to be installed. It can be launched with either of these commands: docker-compose up -d or make up.

The repository includes a Makefile with common shortcuts to simplify setting up and working with Scout. To see a full list and description of these shortcuts run: make help.

The demo consists of 3 containers:

  • a MongoDB instance, on the default port 27017 in the container, mapped to host port 27013
  • scout-cli --> the Scout command line, connected to the database. Populates the database with demo data
  • scout-web --> the Scout web app, that serves the app on localhost, port 8000.

Once the server has started, you can open the app in a web browser at the following address: http://localhost:8000/
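
If you would rather confirm from a script that the demo web app is answering, a small check along these lines may help. It is only a sketch using the Python standard library: the function name scout_web_is_up is made up for illustration and is not part of Scout, and the default URL is simply the demo address given above.

```python
import urllib.error
import urllib.request


def scout_web_is_up(url: str = "http://localhost:8000/", timeout: float = 5.0) -> bool:
    """Return True if an HTTP request to `url` receives any HTTP response."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The server answered, just not with a success status.
        return True
    except (urllib.error.URLError, OSError, ValueError):
        # Connection refused, timeout, DNS failure or malformed URL.
        return False


if __name__ == "__main__":
    print("Scout demo reachable:", scout_web_is_up())
```

The check treats any HTTP response as "up", since it only aims to tell whether the container is serving requests at all.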

To stop the demo, run either docker-compose down or make down.

Instructions on how to run a Scout image connected to your local database or a custom database are present on this page.

Installation

git clone https://github.com/Clinical-Genomics/scout
cd scout
pip install --editable .

Scout PDF reports are created using Flask-WeasyPrint. This library requires external dependencies (namely Cairo and Pango) that need to be installed separately. See the platform-specific instructions for Linux, macOS and Windows on the WeasyPrint installation pages.

NB: in order to convert HTML reports into PDF reports, we have recently switched from the WeasyPrint lib to python-pdfkit. For this reason, when upgrading to a Scout version >4.47, you need to install an additional wkhtmltopdf system library.

You also need to have an instance of MongoDB running. I've found that it's easiest to do using the official Docker image:

docker run --name mongo -p 27017:27017 mongo
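
Before moving on, it can be handy to verify that something is actually listening on the MongoDB port. The following is a minimal sketch with the Python standard library; port_is_open is a hypothetical helper name, and the check only confirms that a TCP connection can be opened, not that the service behind it is MongoDB.

```python
import socket


def port_is_open(host: str = "localhost", port: int = 27017, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts and name resolution errors.
        return False


if __name__ == "__main__":
    print("MongoDB port reachable:", port_is_open())
```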

Usage

Demo - requires pip-installing the app in a container and a running instance of mongodb

Once installed, you can set up Scout by running a few commands using the included command line interface. Given that you have a MongoDB server listening on the default port (27017), this is how you would set up a fully working Scout demo:

scout setup demo

This will set up an instance of Scout with a database called scout-demo. Now run

scout --demo serve

And play around with the interface. A user has been created with email [email protected] so use that address to get access

Initialize scout

To initialize a working instance with all genes, diseases etc run

scout setup database

For more info, run scout --help

The previous command initializes the database with a curated collection of gene definitions with links to OMIM along with HPO phenotype terms. Now we will load some example data. Scout expects the analysis to be accomplished using various gene panels so let's load one and then our first analysis case:

scout load panel scout/demo/panel_1.txt
scout load case scout/demo/643594.config.yaml

Integration with chanjo for coverage report visualization

Scout may be configured to visualize coverage reports produced by Chanjo. Instructions on how to enable this feature can be found in the document chanjo_coverage_integration.

Integration with loqusdb for integrating local variant frequencies

Scout may be configured to visualize local variant frequencies monitored by Loqusdb. Instructions on how to enable this feature can be found in the document loqusdb integration.

Integration with Gens for displaying copy number profiles for variants

Scout may be configured to link to a local Gens installation. Instructions on how to enable this feature can be found in the document Gens integration.

Server setup

Scout needs a server config to know which databases to connect to etc. Depending on which information you provide you activate different parts of the interface automatically, including user authentication, coverage, and local observations.

This is an example of the config file:

# scoutconfig.py

# list of email addresses to send errors to in production
ADMINS = ['[email protected]']

MONGO_HOST = 'localhost'
MONGO_PORT = 27017
MONGO_DBNAME = 'scout'
MONGO_USERNAME = 'testUser'
MONGO_PASSWORD = 'testPass'

# enable user authentication using Google OAuth 2.0
GOOGLE = dict(
   client_id="client_id_string.apps.googleusercontent.com",
   client_secret="client_secret_string",
   discovery_url="https://accounts.google.com/.well-known/openid-configuration"
)

# enable Phenomizer gene predictions from phenotype terms
PHENOMIZER_USERNAME = '???'
PHENOMIZER_PASSWORD = '???'

# enable Chanjo coverage integration
SQLALCHEMY_DATABASE_URI = '???'
REPORT_LANGUAGE = 'en'  # or 'sv'

# other interesting settings
SQLALCHEMY_TRACK_MODIFICATIONS = False  # this is essential in production
TEMPLATES_AUTO_RELOAD = False           # consider turning off in production
SECRET_KEY = 'secret key'               # override in production!

Most of the config settings are optional. A minimal config would consist of SECRET_KEY and MONGO_DBNAME.
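
As an illustration of such a minimal config, the sketch below sets only those two settings. Reading the secret key from an environment variable is a suggestion rather than something the Scout docs prescribe, and the variable name SCOUT_SECRET_KEY is invented for this example.

```python
# config.py: a minimal sketch with only the two settings named above
import os

# Name of the MongoDB database Scout should use
MONGO_DBNAME = "scout"

# Assumption: SCOUT_SECRET_KEY is a hypothetical environment variable chosen
# for this example. The fallback value is only acceptable for local testing.
SECRET_KEY = os.environ.get("SCOUT_SECRET_KEY", "change-me-for-local-testing-only")
```

With no MONGO_HOST or MONGO_PORT given, this presumably relies on a MongoDB server at the default location, as in the demo setup.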

Starting the server is now really easy. For the demo and local development we will use the CLI:

scout --flask-config config.py serve

Hosting a production server

When running the server in production you will likely want to use a proper Python server solution such as Gunicorn. This is also how we can multiprocess the server and use encrypted HTTPS connections.

SCOUT_CONFIG=./config.py gunicorn --workers 4 --bind 0.0.0.0:8080 scout.server.auto:app
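
The same options can also be kept in a Gunicorn configuration file, which is plain Python. The sketch below simply mirrors the command line above. The filename gunicorn.conf.py is a choice made for this example, and using raw_env assumes that Scout picks up SCOUT_CONFIG from the environment exactly as it does in the command above.

```python
# gunicorn.conf.py: sketch of a Gunicorn config mirroring the command above

# Address and port to listen on
bind = "0.0.0.0:8080"

# Number of worker processes
workers = 4

# Environment variables to set for the server; same value as on the command line
raw_env = ["SCOUT_CONFIG=./config.py"]
```

It would then be started with: gunicorn -c gunicorn.conf.py scout.server.auto:app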

For added security and flexibility, we recommend a reverse proxy solution like NGINX.

Setting up a user login system

Scout currently supports 3 mutually exclusive types of login:

  • Google authentication via OpenID Connect (OAuth 2.0)
  • LDAP authentication
  • Simple authentication using userid and password

The first 2 solutions are both suitable for a production server. A description of how to set up an advanced login system is available in the admin guide.

Integration with Matchmaker Exchange

Starting from release 4.4, Scout offers integration for patient data sharing via Matchmaker Exchange. General info about Matchmaker and patient matching can be found in this paper. For a technical guideline of our implementation of Matchmaker Exchange at Clinical Genomics and its integration with Scout, check Scout's Matchmaker docs. A user-oriented guide describing how to share case and variant data to Matchmaker using Scout can be found here.

Development

To keep the code base consistent, formatting with Black is always applied as part of the PR submission process via GitHub Actions. While not strictly required, developers are encouraged to apply Black locally to avoid confusion. Black defaults to 88 characters per line; we use 100.

To format all the files in the project run:

black --line-length 100 .

We recommend using Black with pre-commit. In .pre-commit-config.yaml you can find the pre-commit configuration. To enable this configuration run:

pre-commit install

Test

To run unit tests:

pytest

Contributing to Scout

If you want to contribute and make Scout better, your help is very much appreciated! Bug reports and feature requests are really helpful and can be submitted via GitHub issues. Feel free to open a pull request to add new functionality or fix a bug. We welcome any help, regardless of the amount of code provided or your skills as a programmer. More info on how to contribute to the project and a description of the Scout branching workflow can be found in CONTRIBUTING.
