HBCR - Human BP Codon Resource Variant Annotation

Overview

The MBC of Massachusetts Eye and Ear Infirmary, and Dr. Marni Falk at the Children's Hospital of Philadelphia. We have used it extensively to discover multiple novel disease genes, as well as many novel mutations in "known" disease genes, for both inherited retinal diseases and mitochondrial diseases.

The annotation module of our pipeline is our custom Human BP Codon Resource (HBCR), which maps each base position in the human reference genome, based on Ensembl gene annotations (GRCh37, release 75), to its corresponding transcripts, genes, codons, and translation frames, if any. Additional annotations for each variant call come from data sets downloaded from the 1000 Genomes Project website, the NHLBI Exome Sequencing Project Exome Variant Server, and the UCSC Genome Browser. These annotations include allele frequencies, SIFT and PolyPhen-2 predictions, and phastCons conservation scores. HBCR is thus a fairly comprehensive annotation resource, indexed by chromosomal position, that we compile by extracting and parsing annotations from a diverse set of existing genomic databases and annotation resources.
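To make the position-to-codon mapping concrete, here is a minimal sketch of how a genomic position could be mapped to a codon number and translation frame for a single plus-strand transcript. It is only an illustration under stated assumptions, not HBCR's actual implementation: the function name `codon_and_frame`, the exon coordinates, and the convention that frame is the 0-based offset of the base within its codon are all hypothetical.

```python
from typing import List, Optional, Tuple


def codon_and_frame(
    pos: int, cds_exons: List[Tuple[int, int]]
) -> Optional[Tuple[int, int]]:
    """Return (codon_number, frame) for a 1-based genomic position, or None.

    cds_exons holds the coding segments of one plus-strand transcript as
    inclusive (start, end) pairs sorted by start. codon_number is 1-based;
    frame is 0, 1, or 2 (offset of the base within its codon).
    """
    offset = 0  # coding bases that precede the current segment
    for start, end in cds_exons:
        if start <= pos <= end:
            cds_index = offset + (pos - start)  # 0-based index within the CDS
            return cds_index // 3 + 1, cds_index % 3
        offset += end - start + 1
    return None  # position is not in the coding sequence of this transcript


# Hypothetical transcript with two coding segments.
exons = [(100, 104), (200, 210)]
print(codon_and_frame(100, exons))  # first coding base
print(codon_and_frame(200, exons))  # first base of the second segment
print(codon_and_frame(150, exons))  # intronic position
```

A minus-strand transcript would need the segments walked in reverse order, and a position covered by several transcripts would yield one result per transcript.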

Once the variants are annotated, end users can easily develop custom scripts to identify candidate variants that fit different filtering criteria, such as genetic models. We make this annotation resource publicly available through the MSeqDR effort via a web-based query interface.
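As an illustration of such a custom script, the sketch below filters annotated variants under a simple autosomal recessive model: a gene is kept if it carries at least one rare homozygous variant or at least two rare heterozygous variants. The record keys (`gene`, `zygosity`, `allele_frequency`), the gene names, and the 1% frequency cutoff are assumptions made for this example; they are not HBCR's documented output columns, so adapt them to the actual result file.

```python
from collections import defaultdict
from typing import Dict, Iterable, List


def recessive_candidates(
    variants: Iterable[dict], max_af: float = 0.01
) -> Dict[str, List[dict]]:
    """Group rare variants by gene and keep genes fitting a recessive model."""
    by_gene = defaultdict(list)
    for variant in variants:
        if variant["allele_frequency"] <= max_af:  # keep rare variants only
            by_gene[variant["gene"]].append(variant)

    candidates = {}
    for gene, hits in by_gene.items():
        n_hom = sum(1 for v in hits if v["zygosity"] == "HOM")
        n_het = sum(1 for v in hits if v["zygosity"] == "HET")
        # One homozygous hit, or two heterozygous hits (possible compound het).
        if n_hom >= 1 or n_het >= 2:
            candidates[gene] = hits
    return candidates


# Hypothetical annotated records.
annotated = [
    {"gene": "GENE_A", "zygosity": "HOM", "allele_frequency": 0.001},
    {"gene": "GENE_B", "zygosity": "HET", "allele_frequency": 0.002},
    {"gene": "GENE_C", "zygosity": "HET", "allele_frequency": 0.002},
    {"gene": "GENE_C", "zygosity": "HET", "allele_frequency": 0.0},
]
print(sorted(recessive_candidates(annotated)))
```

Two heterozygous variants in one gene are only compound heterozygous if they lie on different parental alleles, which this sketch does not check.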

Usage

HBCR supports 3 input file formats: Text, HGVS, and VCF.  Sample input: Text, VCF, HGVS

The text input format (HBCR's native format) for variants requires 8 fields:
Chromosome
Position
Reference Allele
Variant Allele
Strand
Homozygosity Type
Total Depth
Variant Depth.

Strand must always be 1: variants are assumed to be on the + strand, so convert them first if they are not. The last 3 columns are placeholders and are currently not used by HBCR.
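If your variant calls report alleles on the - strand, a conversion along the following lines may help. This is a minimal sketch, assuming that the position is already a + strand genomic coordinate and that the source data encodes the minus strand as "-1" or "-" (both encodings are assumptions about your data, not something HBCR accepts). The helper name `to_plus_strand` is hypothetical.

```python
# Translation table for complementing DNA bases (case preserved).
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def to_plus_strand(ref: str, alt: str, strand: str) -> tuple:
    """Return (ref, alt, "1") with both alleles expressed on the + strand."""
    if strand in ("1", "+"):
        return ref, alt, "1"
    if strand in ("-1", "-"):
        # Reverse-complement each allele to express it on the + strand.
        return (
            ref.translate(_COMPLEMENT)[::-1],
            alt.translate(_COMPLEMENT)[::-1],
            "1",
        )
    raise ValueError(f"unrecognized strand value: {strand!r}")


print(to_plus_strand("A", "G", "-1"))
```

For single-base substitutions this is sufficient; insertions and deletions may also need their anchor position adjusted, which this sketch does not attempt.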

#Chromosome*  Position  Reference Allele  Variant Allele  Strand  Homozygosity Type  Total Depth  Variant Depth
1             1007203   A                 G               1       HOM                16           16
1             1007222   G                 T               1       HOM                23           19
1             1007432   G                 A               1       HET                104          47
1             1017341   G                 T               1       HOM                121          120

* Start the header line with "#" to make it a comment line, or omit the header line.
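Before uploading, it can be useful to check a file against the rules above. The following is a minimal sketch of such a check, assuming tab-delimited fields; the names `Variant` and `parse_hbcr_text` are hypothetical and are not part of HBCR. The last three fields are kept as plain strings because HBCR currently treats them as placeholders.

```python
from typing import Iterable, Iterator, NamedTuple


class Variant(NamedTuple):
    chromosome: str
    position: int
    ref: str
    alt: str
    strand: str
    zygosity: str
    total_depth: str
    variant_depth: str


def parse_hbcr_text(lines: Iterable[str]) -> Iterator[Variant]:
    """Yield one Variant per data line; skip blank and '#' comment lines."""
    for number, line in enumerate(lines, start=1):
        line = line.rstrip("\r\n")
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.split("\t")
        if len(fields) != 8:
            raise ValueError(
                f"line {number}: expected 8 tab-separated fields, got {len(fields)}"
            )
        if fields[4] != "1":
            raise ValueError(f"line {number}: strand must be 1, got {fields[4]!r}")
        yield Variant(fields[0], int(fields[1]), *fields[2:])


sample = [
    "#Chromosome\tPosition\tReference Allele\tVariant Allele\tStrand\t"
    "Homozygosity Type\tTotal Depth\tVariant Depth\n",
    "1\t1007203\tA\tG\t1\tHOM\t16\t16\n",
    "1\t1007432\tG\tA\t1\tHET\t104\t47\n",
]
for variant in parse_hbcr_text(sample):
    print(variant.chromosome, variant.position, variant.ref, variant.alt)
```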

How to submit your variant list input

Option 1: Input Your Variant List

Using the specified format

Option 2: Upload Input File

Use the specified format. Files are assumed to be in the tab-delimited format shown in the example above, or in VCF format. Uploaded files are listed in the drop-down list. Click "Refresh" after the upload finishes to show the file in the list, then click the "Annotate" button to annotate the selected file.


**To keep your data private from public users, please register and log in. Your uploaded data will then be saved in your own folder and hidden from other users. In addition, annotations run by public users may overwrite each other if they use the service at the same time, so we strongly recommend registering an account.

Additional benefits: Registered users can ask data owners for access to more controlled data from the MSeqDR community.

The current HBCR configuration can process 1 million rows in one hour. We recommend limiting each uploaded file to 500,000 rows to avoid time-outs.
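A larger variant list can be split into several files that each stay within the recommended limit. The sketch below is one way to do that for the native text format; the function name `split_variant_file` and the `.partN` naming scheme are assumptions for illustration. It drops "#" comment lines, which is acceptable for the text format but would discard the required header of a VCF file.

```python
from itertools import islice
from pathlib import Path
from typing import List


def split_variant_file(path: str, max_rows: int = 500_000) -> List[Path]:
    """Write the data rows of `path` into files of at most `max_rows` rows."""
    src = Path(path)
    parts = []
    with src.open() as handle:
        # Lazily iterate over data rows, skipping blank and comment lines.
        rows = (line for line in handle if line.strip() and not line.startswith("#"))
        index = 1
        while True:
            chunk = list(islice(rows, max_rows))
            if not chunk:
                break
            part = src.with_name(f"{src.stem}.part{index}{src.suffix}")
            part.write_text("".join(chunk))
            parts.append(part)
            index += 1
    return parts
```

For example, `split_variant_file("variants.txt")` on a file with 1.2 million data rows would write `variants.part1.txt`, `variants.part2.txt`, and `variants.part3.txt` next to the original, each of which can then be uploaded separately.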

Provide an email address so that you receive the result links by email in case the web page times out on a large input file. You will receive 2 emails: one when HBCR annotation starts and one when it finishes. Please do not resubmit if the page times out, because a submitted job always runs to completion. Just follow the links in the email, or on this page if you did not provide an email address.

We expect an annotation speed of up to 1.5 million variants per hour. Please check back for results accordingly.