  • Stars: 121
  • Rank: 293,924 (Top 6%)
  • Language: Python
  • License: MIT License
  • Created over 10 years ago
  • Updated 3 months ago

Repository Details

VCF-kit: Assorted utilities for the variant call format

VCF-kit - Documentation

VCF-kit is a collection of command-line utilities for analyzing Variant Call Format (VCF) files. The commands are summarized below.

Command  Description
calc     Obtain frequency/count of genotypes and alleles.
call     Compare variants identified from sequences obtained through alternative methods against a VCF.
filter   Filter variants with a minimum or maximum number of REF, HET, ALT, or missing calls.
geno     Various operations at the genotype level.
genome   Reference genome processing and management.
hmm      Hidden Markov model for imputing genotypes from parental genotypes in linkage studies.
phylo    Generate dendrograms from a VCF.
primer   Generate primers for variant validation.
rename   Add a prefix, suffix, or substitute a string in sample names.
tajima   Calculate Tajima’s D.
vcf2tsv  Convert a VCF to TSV.
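
To make the REF, HET, ALT, and missing categories used by calc and filter concrete, here is a minimal Python sketch that tallies genotype calls for a single VCF record. It is an illustration only, not VCF-kit's implementation: the function names classify_gt and count_genotypes and the example record are hypothetical, and the sketch assumes that GT is the first entry of each sample field, as the VCF specification requires when GT is present.

```python
from collections import Counter


def classify_gt(sample_field):
    """Classify one VCF sample field (e.g. '0/1:35:99') as REF, HET, ALT, or MISSING."""
    # GT is assumed to be the first colon-separated entry of the sample field.
    gt = sample_field.split(":")[0]
    # Treat phased ('|') and unphased ('/') genotypes the same way.
    alleles = gt.replace("|", "/").split("/")
    if "." in alleles:
        return "MISSING"
    if len(set(alleles)) > 1:
        return "HET"
    # All alleles are identical: homozygous reference or homozygous alternate.
    return "REF" if alleles[0] == "0" else "ALT"


def count_genotypes(vcf_line):
    """Count genotype classes across all samples of one tab-delimited VCF record."""
    fields = vcf_line.rstrip("\n").split("\t")
    samples = fields[9:]  # columns 10 onward hold per-sample data
    return Counter(classify_gt(sample) for sample in samples)


# Hypothetical record with five samples.
record = "I\t1000\t.\tA\tG\t50\tPASS\t.\tGT\t0/0\t0/1\t1/1\t./.\t1|1"
counts = count_genotypes(record)
for label in ("REF", "HET", "ALT", "MISSING"):
    print(label, counts[label])
```

A filter in the spirit of the filter command could then keep a record only when, for example, counts["MISSING"] stays below a chosen threshold.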

Installation

VCF-kit has been upgraded to Python 3

VCF-kit has been tested with Python 3.6. It relies on additional software for a variety of tasks:

  • bwa (v 0.7.12)
  • samtools (v 1.3)
  • bcftools (v 1.3)
  • blast (v 2.2.31+)
  • muscle (v 3.8.31)
  • primer3 (v 2.5.0)

You can install these dependencies and VCF-kit using conda, or you can use a Docker image.

Conda

conda config --add channels bioconda
conda config --add channels conda-forge
conda create -n vcf-kit \
  danielecook::vcf-kit=0.2.6 \
  "bwa>=0.7.17" \
  "samtools>=1.10" \
  "bcftools>=1.10" \
  "blast>=2.2.31" \
  "muscle>=3.8.31" \
  "primer3>=2.5.0"

conda activate vcf-kit
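
Once the environment is active, it can be useful to confirm that the external tools are actually on PATH. The sketch below is one hedged way to do that from Python with the standard library; it is not part of VCF-kit. The executable names are assumptions: BLAST+ is checked through blastn and Primer3 through primer3_core, their usual command-line entry points, and missing_tools is a hypothetical helper name.

```python
import shutil

# Assumed executable names for the dependencies listed above.
# BLAST+ and Primer3 are checked via `blastn` and `primer3_core`.
REQUIRED_TOOLS = ["bwa", "samtools", "bcftools", "blastn", "muscle", "primer3_core"]


def missing_tools(tools=REQUIRED_TOOLS, which=shutil.which):
    """Return the entries of `tools` that cannot be located on PATH."""
    return [tool for tool in tools if which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Missing from PATH: " + ", ".join(missing))
    else:
        print("All external tools found.")
```

The which parameter exists only so the lookup can be swapped out when testing; in normal use the default shutil.which is enough.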

Docker

You can also run VCF-kit with all dependencies preinstalled using Docker:

docker run -it andersenlab/vcf-kit vk

More Repositories

1. IBiS-Bootcamp (Shell, 40 stars)
2. cegwas2-nf (R, 8 stars): GWA mapping with C. elegans
3. Genetic-Analysis (8 stars)
4. pyPipeline (Python, 6 stars): Andersen Lab Python-based pipeline for variant calling.
5. liftover-utils (Perl, 6 stars): Utilities for lifting over genome coordinates in C. elegans
6. CeNDR (HTML, 5 stars)
7. CAENDR (HTML, 4 stars)
8. CellProfiler (Shell, 3 stars): CellProfiler Image Analysis Tools
9. cegwas (R, 3 stars): Pipeline for performing GWAS mappings with C. elegans phenotype data
10. dry-guide (2 stars): The Guide to Computing in the Andersen Lab
11. andersenlab.github.io (HTML, 2 stars): Andersen Lab Website
12. Ce-328pop-div (HTML, 2 stars)
13. HawaiiMS (OpenEdge ABL, 2 stars)
14. easyXpress (HTML, 2 stars): easyXpress is an R package for the analysis and visualization of high-throughput image-based nematode data
15. Coding_Resources (2 stars)
16. NemaScan (R, 2 stars): GWA Mapping and Simulation with C. elegans, C. tropicalis, and C. briggsae
17. BZRNA-seq-nf (Python, 2 stars): Benzimidazole RNA-Seq and smRNA-Seq Pipeline
18. post-gatk-nf (Nextflow, 2 stars): Subset isotype-only vcf, build tree etc. steps after variant calling and isotype assignment
19. code_club (HTML, 2 stars)
20. C.-elegans-Benzimidazole-Resistance-Manuscript (R, 2 stars): This repo contains all data, scripts, and outputs (figures and tables) associated with this manuscript.
21. protocols (1 star): Andersen Lab protocols
22. NOAA (R, 1 star): Integrating NOAA with C.elegans wild isolate data
23. hisat2-pipeline (Shell, 1 star): Pipeline for generating hisat2 reference
24. wormbase-api (R, 1 star)
25. impute-nf (Nextflow, 1 star)
26. bam-toolbox (Python, 1 star): A bam toolbox
27. noaa-nf (R, 1 star)
28. easyfulcrum (R, 1 star): easyfulcrum
29. Ce-PopGen (OpenEdge ABL, 1 star)
30. denovo-concordance (1 star): De novo concordance pipeline using minihash
31. Hawaii_Manuscript (OpenEdge ABL, 1 star)
32. alignment-nf (Nextflow, 1 star): A nextflow pipeline for genome sequences alignment
33. Transposons2 (Shell, 1 star)
34. microPub_chemotaxis (Python, 1 star): Data, scripts, and figures for chemotaxis micro publication
35. chemotaxis-cli (R, 1 star)
36. sv-nf (Nextflow, 1 star): A nextflow pipeline to call structural variants using short or long read data
37. wi-gatk (Nextflow, 1 star): The new GATK-based pipeline for wild isolate C. elegans strains
38. annotation-nf (Nextflow, 1 star): Annotate VCF with snpeff and bcsq
39. GMVK (R, 1 star): Gene Model Visualization Tools (GMVT) is a suite of R functions used to visualize gene models and variant annotations
40. ril-nf (R, 1 star)
41. Andersen-Knowledge-Base (1 star)
42. sfs-dfe (R, 1 star)
43. mmp-telseq (Python, 1 star): Code for analyzing telomere length from million mutations project using telseq
44. DauerSRG3637 (1 star): Repository for dauf-1 QTL paper
45. cellprofiler-nf (Nextflow, 1 star): A nextflow pipeline to run CellProfiler pipelines on raw images and process output
46. CePopulationGenetics-nf (Roff, 1 star): archive of Stefan Z. CePopulationGenetics-nf repo with Tim C. edits