• Stars: 110
• Rank: 316,770 (Top 7%)
• Language: JavaScript
• License: MIT License
• Created: almost 12 years ago
• Updated: 28 days ago


Repository Details

Metadata database for ENCODE project

ENCODE Metadata Database

Running the application locally using Docker

Install

  1. Download and install Docker.

  2. Start Docker.

  3. Open Docker preferences and find the Advanced tab under Resources. Make sure the engine has at least 8 GB of memory.

Build

All the following commands should be run in the root of this repository.

  1. Clean up possible previous build artifacts:
$ make clean
  2. Build the Docker image (the first time you run this it will take up to 15 minutes):
$ docker build -t encoded-devcontainer:latest -f .devcontainer/Dockerfile .
  3. Start the container with the appropriate ports forwarded and this directory mounted at /workspaces/encoded in the container:
$ docker run --rm -it -p 6378:6378 -p 6543:6543 -p 8000:8000 -p 9201:9201 -v $(pwd):/workspaces/encoded --workdir /workspaces/encoded --name encode-container encoded-devcontainer:latest bash
  4. In the shell that opens within the container you started in step 3, run the following commands:
$ make devcontainer
$ dev-servers development.ini --app-name app --clear --init --load
  5. In another terminal, open a shell in the running container:
$ docker exec -it encode-container bash
  6. In the shell that you opened in step 5, run:
$ pserve development.ini
  7. Browse the app at localhost:6543

  8. Closing both terminals will cause the container to exit. You do not need to do anything else to stop the app.

Running the application in Github Codespaces

  1. In this repository, click the green Code button, choose the Codespaces tab, and then click the ... button to create a new Codespace.

  2. In the options you can choose the branch (you can also check out your branch later), and the machine size (the second smallest with 4 cores and 8GB of memory is enough).

  3. Click Create codespace

  4. Building the image (and specifically the npm ci command) will take about 15 minutes.

  5. Once the build completes you will be taken to a VS Code editor running in your browser. Wait for the postCreateCommand to finish.

  6. Choose the branch you want to run the app from (if you did not do it in step 2).

  7. In the terminal run dev-servers development.ini --app-name app --clear --init --load

  8. Open a new terminal tab (the button with a + symbol).

  9. Run pserve development.ini.

  10. You can now browse the app via the pop-up, or via the address shown in the Local Address column next to pserve (6543) in the Ports tab above the terminal window.

Deploying an AWS demo

Building the application is not necessary to deploy a demo. All you need is a Python virtual environment with the boto3 package installed. In the root of this repository, run:

$ python src/encoded/commands/deploy.py <options>
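
Since deploy.py talks to AWS through boto3, a quick way to confirm your credentials are visible before deploying is a small check like the sketch below (a hypothetical helper, not part of this repository):

    # check_aws_credentials.py (hypothetical helper, not part of this repository)
    # Confirms that boto3 can locate AWS credentials before you run deploy.py.
    import boto3

    session = boto3.Session()
    credentials = session.get_credentials()
    if credentials is None:
        raise SystemExit("No AWS credentials found; configure them first (for example with aws configure).")
    print("Found AWS credentials for access key", credentials.access_key[:4] + "...")
    print("Default region:", session.region_name or "not set")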

System Installation (Deprecated method, using Docker or Codespaces recommended)

See Snovault OSX System Installation. ENCODE installs Snovault as it is a dependency. The System Installation is the same for both. However, you do not need to set up a running Snovault instance yourself.

Application Installation

For issues see Snovault OSX App Installation first.

  1. Create a virtual env in your work directory.

    This example uses the Python module venv. Other options, such as conda or pyenv, would also work. Please note that older versions of pip may cause issues when updating the application. On macOS, pip 21.0.1 is known to work.

    $ cd your-work-dir
    $ python3 -m venv encoded-venv
    $ source encoded-venv/bin/activate
    $ pip install -U pip==21.0.1
  2. Clone the repo and cd into it

    $ git clone git@github.com:ENCODE-DCC/encoded.git
    $ cd encoded
  3. Build Application

    $ make clean && make install

    If you need to develop snovault side by side, you can use the following commands instead, assuming encoded and snovault are present at the same level in your filesystem and your virtual environment is activated.

    $ cd .. && pip install -e snovault
    $ cd encoded && make clean && make install
  4. Run Application

    $ dev-servers development.ini --app-name app --clear --init --load
    # In a separate terminal, make sure you are in the encoded-venv
    $ pserve development.ini
  5. Browse to the interface at http://localhost:6543

  6. Run Tests

    # Make sure you are in the encoded-venv
    ./circle-tests.sh bdd
    ./circle-tests.sh indexing
    ./circle-tests.sh indexer
    ./circle-tests.sh not-bdd-non-indexing
    ./circle-tests.sh npm

    You can also invoke pytest directly if you need more granular control over which Python tests to run.

    # Make sure you are in your venv
    # Run a specific test in a specific file
    $ pytest TEST_FILE_PATH::TEST_NAME
    # Run tests with the given mark
    $ pytest -m $PYTEST_MARK
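
    For reference, marks are plain pytest decorators; the hypothetical test below (not from this repository) shows what pytest -m slow would select:

    # test_marks_example.py (hypothetical file, for illustration only)
    import pytest

    @pytest.mark.slow  # selected by: pytest -m slow
    def test_example_slow_path():
        assert sum(range(10)) == 45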

Working on the Pyramid configuration

The Pyramid INI files are templated out using Jsonnet. To update these configurations, install the jsonnet executable with brew install jsonnet. Running make config will generate the new configuration and format the Jsonnet files; make sure to run this before pushing, or CircleCI will fail.

The Jsonnet files and generated config are located in conf/pyramid/. The file sections.libsonnet is a library of functions that each returns a representation of a single section of an INI file. The file config.jsonnet assembles these sections and outputs a concrete INI file.
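
As a rough analogy in Python (not the repository's actual code; the section names and values below are placeholders), the idea is that each helper returns one INI section and an assembler writes them out as a concrete INI file:

    # ini_assembly_sketch.py: a Python analogy of the Jsonnet layout described
    # above; section names and values here are placeholders, not the real config.
    import configparser
    import sys

    def app_section():
        # Analogue of a sections.libsonnet function returning a single INI section.
        return ("app:app", {"placeholder_setting": "placeholder_value"})

    def server_section():
        # Port 6543 matches the development server used elsewhere in this README.
        return ("server:main", {"port": "6543"})

    def assemble(sections):
        # Analogue of config.jsonnet: combine the sections into one concrete INI file.
        config = configparser.ConfigParser()
        for name, values in sections:
            config[name] = values
        return config

    if __name__ == "__main__":
        assemble([app_section(), server_section()]).write(sys.stdout)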

More Repositories

1. atac-seq-pipeline: ENCODE ATAC-seq pipeline (Python, 385 stars)
2. chip-seq-pipeline2: ENCODE ChIP-seq pipeline (Python, 244 stars)
3. kentUtils: UCSC command line bioinformatic utilities (C, 167 stars)
4. rna-seq-pipeline (Python, 141 stars)
5. chip-seq-pipeline: ENCODE Uniform processing pipeline for ChIP-seq (Python, 120 stars)
6. long-rna-seq-pipeline: STAR based ENCODE Long RNA-Seq processing pipeline (Python, 92 stars)
7. hic-pipeline: HiC uniform processing pipeline (WDL, 56 stars)
8. dna-me-pipeline: DCC/DAC methylation pipeline source (Perl, 55 stars)
9. caper: Cromwell/WDL wrapper for Python (Python, 54 stars)
10. long-read-rna-pipeline: ENCODE long read RNA-seq pipeline (WDL, 44 stars)
11. wgbs-pipeline: ENCODE whole-genome bisulfite sequencing (WGBS) pipeline (Python, 29 stars)
12. mirna-seq-pipeline (WDL, 17 stars)
13. snovault: The SnoVault general purpose hybrid object-relational database (Python, 16 stars)
14. croo: Cromwell output organizer (Python, 13 stars)
15. dnase_pipeline: ENCODE DNase-seq pipeline essentials for running on dnanexus (Shell, 12 stars)
16. demo-pipeline (Python, 11 stars)
17. pyencoded-tools (Jupyter Notebook, 10 stars)
18. encode-data-usage-examples (Jupyter Notebook, 9 stars)
19. uniformAnalysis: Uniform analysis pipeline work at UCSC for ENCODE (Python, 8 stars)
20. submission_sample_scripts: Scripts to demonstrate the ENCODE REST API for metadata submission (Python, 8 stars)
21. dnase-seq-pipeline: ENCODE DNase-seq pipeline (WDL, 6 stars)
22. WranglerScripts: Collection of scripts used by the wranglers to interact with the servers (Python, 5 stars)
23. Bismark-ENCODE-WGBS: DNANexus Whole Genome Bisulphite Analysis Pipeline (Perl, 5 stars)
24. encValData (AngelScript, 5 stars)
25. encodeOntologies: Python module to download, parse and index ontology files (Python, 4 stars)
26. genomic-data-service: Flask based web service providing genomic region search, based on regulomedb.org (Python, 4 stars)
27. qc_metrics: Module to grab QC metrics from ENCODE uniform processing pipelines (Python, 3 stars)
28. accession: Python module to upload experiment files and metadata to the ENCODE Portal (Python, 3 stars)
29. geo-submission (Python, 3 stars)
30. qc2tsv: Converts multiple QC objects (JSON/TSV/CSV) into a spreadsheet (Python, 3 stars)
31. s3-md5-hash: Lambda function to compute MD5 hashes of S3 objects (Python, 3 stars)
32. pipeline-container: Containerization infrastructure for ENCODE analysis pipelines (Python, 3 stars)
33. ucscGb: Python code out of the UCSC Genome Browser "kent/src" tree (Python, 2 stars)
34. file-validation-pipeline: ENCODE / DNA nexus pipeline for file validation (HTML, 2 stars)
35. segway-pipeline (Python, 2 stars)
36. ENCODE-DAC-pipelines: Hub for data analysis pipelines and software in ENCODE 3 (2 stars)
37. users-meeting-2020-workshop (2 stars)
38. atac-seq-pipeline-test-data: Test data for ENCODE atac-seq-pipeline (HTML, 2 stars)
39. dxencode: Utility module to interface encoded metadatabase, AWS, and DNANexus api for Universal Pipelines (Python, 2 stars)
40. snovault-search (Python, 2 stars)
41. imputation_challenge: ENCODE Imputation Challenge scoring & validation scripts (Python, 2 stars)
42. trackhub_example: Simple examples of common data and organization types used to visualize data in the UCSC Genome Browser using Track Hubs (2 stars)
43. checkfiles: Files are checked to see if the MD5 sum (both for gzipped and ungzipped) is identical to the submitted metadata, as well as run through the validateFiles program from jksrc (Python, 2 stars)
44. cvDjango: Repository to try out Django in the implementation of controlled vocabulary and experimental metadata (Python, 2 stars)
45. chromhmm-pipeline: WDL pipeline for chromhmm (WDL, 1 star)
46. dna-nexus-collaboration: ENCODE-DNANexus collaboration (R, 1 star)
47. wgot: Performant parallel GET extracted from aws-cli (Python, 1 star)
48. Mappings: Official ENCODE mappings for a variety of terms (1 star)
49. dccMetadataImport: Tables of metadata to import to the DCC (HTML, 1 star)
50. encoded-walkme (CSS, 1 star)
51. modencode: modENCODE temporary site (HTML, 1 star)
52. ptools_bin: Ptools to pypi (Python, 1 star)
53. metadata-to-pipelines: Repository for code used in translating encoded metadata to pipelines (1 star)
54. regulome-encoded: Temp repo for regulome development (JavaScript, 1 star)
55. encodemouseportal (Python, 1 star)
56. gcs-s3-transfer-service: A Flask service on Google App Engine to upload files from Google Cloud Storage to AWS S3 (Python, 1 star)
57. ptools: Pipeline to convert bams into pbams (Python, 1 star)
58. encode_slims (1 star)