MSeqDR/Genesis Tutorial

Presentation: Marni Falk, MD, MSeqDR Organizer; Member, SIMD falkm@email.chop.edu

                        Lishuang Shen, PhD, MSeqDR Developer        Lishen@chla.usc.edu

Assistants:      Colleen Clarke Muraresku, MS, LCGC clarkec@email.chop.edu

                        Elizabeth McCormick, MS, LCGC mccormicke@email.chop.edu

                        Zolkipli Cunningham, Zarazuela  ZolkipliZ@email.chop.edu

                        Xiaowu Gai, PhD , MSeqDR Co-Organizer  xgai@chla.usc.edu

Date:               Saturday, June 18, 2016

Time:               2:30-4:00 pm

Location:         DoubleTree by Hilton, 18740 International Boulevard, Seattle, WA 98188

Welcome! Today we are going to walk you through a hands-on tutorial for using MSeqDR and Genesis/GEM.app. These databases can be used by researchers and clinicians for real-time queries, data organization, and more, all for FREE to academic users! After this tutorial you will be able to log in, navigate within the database, and submit, explore, and share data.

Today's focus is on variant and clinical data submission, and on phenotype-guided analysis of exome sequencing data.

MSeqDR is described in more detail:

MSeqDR: A Centralized Knowledge Repository and Bioinformatics Web Resource to Facilitate Genomic Investigations in Mitochondrial Disease. Shen L, et al. Hum Mutat. 2016 Jun;37(6):540-8. PubMed 26919060

Mitochondrial Disease Sequence Data Resource (MSeqDR): A global grass-roots consortium to facilitate deposition, curation, annotation, and integrated analysis of genomic data for the mitochondrial disease clinical and research communities. Falk MJ, et al. Mol Genet Metab. 2015 Mar;114(3):388-396. PubMed 25542617

 

MSeqDR WEB PORTAL: https://mseqdr.org

 

 

MSeqDR Workflow from Clinics to Interpretation


 

Demo Account 1 (both MSeqDR and its PhenoTips server):

User ID: UMDF15  , Password: Mito15

Demo Account 2 (MSeqDR PhenoTips server only):

User ID: UMDF16  , Password: umdf16

Demo Account 3 (MSeqDR PhenoTips server only):

User ID: DemoMSeqDR, Password: DemoMSeqDR

LinkedIn Login:  Use your LinkedIn Account to login

 

MSeqDR server https://mseqdr.org  and  https://mseqdr.org/clinical

Backup site: https://mseqdr.org/demo  and  https://mseqdr.org/demo/clinical

MSeqDR PhenoTips server:

MSeqDR PhenoTips  http://mseqdr.org:8080/phenotips/

Or

MSeqDR PhenoTips Demo1:  http://mseqdr.org:8090/

 

Part I Variant Submission Example, in HGVS format, for mtDNA, genomic DNA, genes and transcript notations:

m.15800C>T

1:g.215821999G>A

AARS2:c.1774C>T

NM_178857.5:c.134G>A

19:g.54621653_54621662del
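Before pasting variants into the submission form, it can help to check which of the notations above each line uses. The following is a minimal sketch in Python. The regular expressions and the function name classify_hgvs are illustrative assumptions that cover only the shapes shown in the examples above (substitutions, deletions, duplications); this is not a full HGVS parser and not the validation that MSeqDR itself performs.

```python
import re

# Illustrative patterns only: they cover the example shapes listed above,
# not the full HGVS grammar.
CHANGE = r"\d+(?:_\d+)?(?:[ACGT]>[ACGT]|del|dup)"
PATTERNS = [
    ("mtDNA", re.compile(rf"^m\.{CHANGE}$")),
    ("genomic", re.compile(rf"^(?:[1-9]|1\d|2[0-2]|X|Y):g\.{CHANGE}$")),
    ("transcript", re.compile(rf"^N[MR]_\d+\.\d+:c\.{CHANGE}$")),
    ("gene", re.compile(rf"^[A-Z][A-Z0-9-]*:c\.{CHANGE}$")),
]


def classify_hgvs(text):
    """Return a label for the notation style, or 'unrecognized'."""
    candidate = text.strip()
    for label, pattern in PATTERNS:
        if pattern.match(candidate):
            return label
    return "unrecognized"


if __name__ == "__main__":
    examples = [
        "m.15800C>T",
        "1:g.215821999G>A",
        "AARS2:c.1774C>T",
        "NM_178857.5:c.134G>A",
        "19:g.54621653_54621662del",
    ]
    for example in examples:
        print(example, "->", classify_hgvs(example))
```

A line reported as "unrecognized" is not necessarily invalid; it only means the sketch does not cover that notation.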

 

Phenotype tools and patient input example:

Mutations in FBXL4, Encoding a Mitochondrial Protein, Cause Early-Onset Mitochondrial Encephalomyopathy

http://www.sciencedirect.com/science/article/pii/S0002929713003364#tblfn1

Subject 5 (S5):

 

"Subject 5 (S5) is a 9 yo girl, the only child from nonconsanguineous parents of European origin. She was born at the 39th week, small for gestational age (2,410 g). Apgar scores were 7 - 9. She has chronic lactic acidosis and RTA, global developmental delay (she is nonverbal) with truncal hypotonia, ataxia, and choreoathetoid movements, neutropenia, frequent infections, and severe GI dysmotility and swallowing difficulty, requiring PEG. She had several generalized seizures. She has dismorphic features of face and limb malformations (Figures 1A and S1A), with moderate microcephaly; her BMI is 17.4 (Underweight BMI < 18.5)

 

Finding:

 

Subjects 5 and 8 were compound heterozygous carrying both a nonsense mutation together with a missense variant. Subject 5 harbored a c.1067del (p.Gly356Alafs*15) nonsense mutation in the maternal FBXL4 allele and a c.1790A>C (p.Gln597Pro) missense mutation in the paternal FBXL4 allele; 

 

Input VCF from WES (whole exome sequencing): FBXL4_patient_exome.vcf  (~142K variants)

Input HPO from above: HP:0003593,HP:0001518,HP:0001263,HP:0001999,HP:0003128,HP:0002059,HP:0002134,HP:0007042,HP:0000297,HP:0001250,HP:0001251,HP:0001266,HP:0030195

Run analysis at: https://mseqdr.org/exomiser.php

For demo purposes, the FBXL4 VCF file is not run live because of the computational burden on the web server. Pre-computed results are shown instead.

Save files: AR phenix 1. Original 2. Result: HTML, Genes, Variants, Enhanced Annotation VCF 3. log

Save files: AR hiphive 1. Original 2. Result: HTML, Genes, Variants, Enhanced Annotation VCF 3. log
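The comma-separated HPO list above can be checked before it is pasted into the "HPO ID" box. The sketch below relies only on the published HPO identifier format (the prefix "HP:" followed by seven digits). The function name parse_hpo_ids is a hypothetical helper and is not part of MSeqDR or Exomiser; it does not check that an ID actually exists in the ontology.

```python
import re

# HPO identifiers are "HP:" followed by exactly seven digits.
HPO_ID = re.compile(r"^HP:\d{7}$")


def parse_hpo_ids(text):
    """Split a comma-separated string into (valid, malformed) ID lists.

    Duplicates are dropped from the valid list; order is preserved.
    """
    valid, malformed = [], []
    for token in text.split(","):
        token = token.strip()
        if not token:
            continue
        if HPO_ID.match(token):
            if token not in valid:
                valid.append(token)
        else:
            malformed.append(token)
    return valid, malformed


TUTORIAL_IDS = (
    "HP:0003593,HP:0001518,HP:0001263,HP:0001999,HP:0003128,HP:0002059,"
    "HP:0002134,HP:0007042,HP:0000297,HP:0001250,HP:0001251,HP:0001266,"
    "HP:0030195"
)

if __name__ == "__main__":
    valid, malformed = parse_hpo_ids(TUTORIAL_IDS)
    print(len(valid), "well-formed IDs;", len(malformed), "malformed")
```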


 

Part I. MSeqDR Genomic Data Submission

 

Part I-a. Pathogenic Variant Submission: Quick Addition and ClinVar Style Refinement

1.      Click "Submission" Menu, go to "Submit Variant" at https://mseqdr.org/ms_variant.php.

2.      Input one or more pathogenic variants in HGVS format or VCF format. The first five columns of VCF input are required (Figure 1a).

3.      Select one of your own studies, or one of MSeqDR's default Pathogenic Variant Submission Public Stations: M8 for GRCh37/hg19 coordinates, or M9 for GRCh38/hg38 coordinates.

4.      Click "Submit" button. Review the result report.

 

Figure 1a. MSeqDR Pathogenic Variant Submission Part I-a

 

5.      After submission, refine, update and complete ClinVar style annotations as below (Figure 1b) at https://mseqdr.org/ms_variant_submit_clinvar.php .

6.      Select one of your raw variant submissions from the list under "Saved Variants".

7.      Update the information in textbox fields. Mouse over the textbox to view instructions.

8.      Note: fields marked with * are required according to ClinVar's submission template. Pay special attention to "Clinical significance", "Mode of inheritance", and "Phenotype ID value".

9.      Click on the "Save to MSeqDR" or "Update" button.

10.   If you receive the warning message "You already have 1 variants at same position in same study: xxxx. Please check here and press 'Save' button again if you still want to add it.", follow the instructions to add it as a new variant, or leave it alone to discard the changes.

 

Figure 1b. Pathogenic Variant Submission Part I-b. ClinVar Style Refine Annotation
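Step 2 of Part I-a requires the first five columns of VCF input. In the VCF specification these are CHROM, POS, ID, REF and ALT. The sketch below checks pasted lines for those columns before submission. It assumes tab-separated input, and the function name check_vcf_lines is hypothetical; the checks that the MSeqDR form itself applies are not documented here and may differ.

```python
def check_vcf_lines(lines):
    """Return (line_number, message) pairs for data lines that lack
    usable CHROM, POS, ID, REF and ALT columns.

    Header lines starting with '#' and blank lines are skipped.
    """
    problems = []
    for number, line in enumerate(lines, start=1):
        line = line.rstrip("\r\n")
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        if len(fields) < 5:
            problems.append(
                (number, f"expected at least 5 columns, found {len(fields)}")
            )
            continue
        chrom, pos, _variant_id, ref, alt = fields[:5]
        if not chrom:
            problems.append((number, "CHROM is empty"))
        if not pos.isdigit():
            problems.append((number, f"POS is not an integer: {pos!r}"))
        if not ref or not alt:
            problems.append((number, "REF or ALT is empty"))
    return problems


if __name__ == "__main__":
    # The chromosome label 'MT' is an assumption made for this example.
    pasted = [
        "#CHROM\tPOS\tID\tREF\tALT",
        "MT\t15800\t.\tC\tT",
        "1\t215821999\t.\tG",
    ]
    for number, message in check_vcf_lines(pasted):
        print(f"line {number}: {message}")
```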

 

Part I-b. Expert Panel Recruiting: MSeqDR is organizing an expert panel to review mitochondrial diseases and pathogenic variants.

1.      Select MSeqDR "Tools" -> "Expert" menu, or type the URL into the address bar: http://mseqdr.org/expert.php (Figure 2).

2.      Review the goals and requirements of the expert panel. More documents are available at ClinGen.

3.      Click the "Genes" tab and use the checkboxes to select the genes on the page that you are willing to work on.

4.      Or type other genes manually in the text box if they are not on this page. You can also type "mtDNA" if you are not limited to coding mtDNA variants.

5.      Fill in your personal information and a description of your expertise to aid in reviewing your qualifications.

6.      Click the "Go" button, and your offer will be sent to MSeqDR. A decision will be sent to you after internal review.

7.      Alternatively, click the "Diseases" tab, select the diseases in which you have expertise, and submit your offer of help. You can type in other OMIM IDs or MeSH IDs, with additional free-text comments.

8.      Your entry numbers for genes, disease, variants are shown at their tabs.

Figure 2. MSeqDR Expert Panel Signup

 

Part I-c. Advanced Exercise (not covered today): Full Study (Data Set) Submission to MSeqDR: https://mseqdr.org/mb.php?site=sub&url=ms_experiment_design.php 

1.      Select "Full Study: Create" under the Submission tab.

2.      Select an example beginning with "Template"

3.      Click "use as a template" underneath the selection

4.      REVISE study name for the template

5.      Click "Save/Continue"

6.      On the upload page, upload the VCF file for variants. Other files can be added if you consider them helpful.

7.      Then click "FINALIZE" for the submission; a study accession number will be generated.


 

Part II. MSeqDR Phenotype-Guided Tools for Clinical Sequencing Data

 

Part II-a. Phenotype-Guided Exome Variant Prioritization with HPO and Exomiser

1.      Upload Variant File (VCF v4, the sample column with genotype details is required) using the "Add files" button, or drag-n-drop input vcf file (Figure 3a).

2.      Click the "Refresh" button to show newly uploaded VCF files

3.      For multi-sample VCF files, prepare a PED file and click the "Choose File" button to open it.

4.      Input clinical symptoms and diagnosis description as free text, or as HPO terms. One phenotype per line, or separated by commas, periods, semicolons.

5.      Click the "HPO Annotator" button twice to find HPO IDs for the input. Check the page bottom to review the HPO hits ranked according to semantic similarity to HPO dictionary for each input term. Use check box to select most specific matches.

6.      The "HPO ID" text box is the final phenotype list.

7.      Alternatively, directly type HPO IDs in the "HPO ID" text box, separated by commas.

8.      Fill in email address to receive result report and notice.

9.      Optional: If you have a targeted Mendelian disease, input it into "OMIM ID".

10.  Click the "Annotate" button to submit your job.

11.  This tool runs Exomiser in batch mode by running all 6 possible combinations (Prioritization method x Inheritance model).

12.  The run may take about 6-15 minutes for an input of 10,000 variants. Use the email function to receive the result. Please limit input size to avoid time-outs. If possible, use pre-filtered VCF files containing only meaningful variants.

13.  Review the result from the report page. Each combination has 7 links to different result files: "1. Original 2. Result: HTML, Genes, Variants, Enhanced Annotation, VCF, Log"

14.  Click on "Enhanced Annotation" link to get the full VariantOneStop annotations (genomic, external link, population frequency, dbNSFP) for top 25 candidate variants.
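Step 3 above asks for a PED file when the VCF holds more than one sample. A standard PED file has six whitespace-separated columns per individual: family ID, individual ID, father ID, mother ID, sex (1 = male, 2 = female) and phenotype (1 = unaffected, 2 = affected), with "0" for an unknown parent. The sketch below writes such lines for a trio. The family and sample names are placeholders, the helper ped_lines is hypothetical, and it is an assumption here that the individual IDs must match the sample column names in your VCF.

```python
def ped_lines(family_id, members):
    """Build tab-separated PED lines from a list of member dicts.

    Each member needs 'id', 'sex' (1 male, 2 female) and 'affected'
    (1 unaffected, 2 affected). 'father' and 'mother' default to '0',
    meaning the parent is not in the file.
    """
    lines = []
    for member in members:
        lines.append("\t".join([
            family_id,
            member["id"],
            member.get("father", "0"),
            member.get("mother", "0"),
            str(member["sex"]),
            str(member["affected"]),
        ]))
    return lines


if __name__ == "__main__":
    # Placeholder trio: an affected daughter and two unaffected parents.
    trio = [
        {"id": "proband", "father": "father", "mother": "mother",
         "sex": 2, "affected": 2},
        {"id": "father", "sex": 1, "affected": 1},
        {"id": "mother", "sex": 2, "affected": 1},
    ]
    with open("trio.ped", "w") as handle:
        handle.write("\n".join(ped_lines("FAM1", trio)) + "\n")
```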

 

Figure 3a. Phenotype-Guided Exome Variant Annotation and Prioritization

Figure 3b. Phenotype-Guided Exome Variant Annotation and Prioritization


Part II-b. Clinical Phenotype-Guided Tools Using HPO, OMIM, ClinVar

1.      Click the "Tools" menu then select "Phenotype Mining" menu, or type the URL: https://mseqdr.org/clinical/pa.php (Figure 4a).

2.      Clear the text area, input a list of free text of clinical descriptions, one phenotype per line or separated with semicolons

3.      Double click on "MSeqDR HPO Annotator" button to search HPO terms that are first matched by term name, then by synonyms and definitions.

4.      In the result table display below, the HPO top matches for each input line are ranked semantically. Use the checkbox to select/unselect the most specific matches to input.

5.      Click on the HPO IDs to check the term detail and associated diseases in ontology tree browser.

6.      Click on "Primary Genes" to display the diseases and genes matching any of the selected HPO terms. Save the result files (Figure 4b).

7.      Click on "Phenotype Profiler" and check the semantically matched and ranked diseases in Monarch Initiative site (Figure 4c).

8.      Click on the "Monarch Text Annotator" button to search for HPO terms using the free text annotation tool at Monarch Initiative.
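Step 2 above expects one phenotype per line, or several separated by semicolons. If your clinical notes are in a single block of text, a short script can produce a clean list for pasting. This is a minimal sketch; the function name split_phenotypes is hypothetical, and how the MSeqDR HPO Annotator tokenizes input internally is not shown here.

```python
import re


def split_phenotypes(text):
    """Split free text on semicolons and line breaks, trimming blanks."""
    parts = re.split(r"[;\n]+", text)
    return [part.strip() for part in parts if part.strip()]


if __name__ == "__main__":
    notes = (
        "chronic lactic acidosis; global developmental delay\n"
        "truncal hypotonia; ataxia;\n"
    )
    # Print one phenotype per line, ready to paste into the text area.
    print("\n".join(split_phenotypes(notes)))
```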

Figure 4a. Phenotype Semantic Mapping and Profiling at MSeqDR and Monarch Initiative

Figure 4b. HPO-based Primary Gene List and Associated Diseases

Figure 4c. HPO-based Diagnosis Aided by Phenotype Profiling at Monarch Initiative


 

Part III. MSeqDR Patient and Clinical Data Submission

 

Part III-a. General Usage: Single Case Direct Clinical Data Submission through MSeqDR PhenoTips Server

 

1.1. Important Disclaimer: Please do not input any PHI or PII information.

1.1.1.   Consent is required as follows: MSeqDR only accepts de-identified patient clinical descriptions and phenotype information. In compliance with HIPAA regulations, MSeqDR does not report Personally Identifiable Information (PII) or Protected Health Information (PHI) unless it is de-identified. No PII should be entered in PhenoTips at MSeqDR. Please discuss the handling of such data with the MSeqDR team.

1.2.   Open a web browser (minimum requirement: Internet Explorer v.11, Mozilla Firefox v.43, Google Chrome v.49). JavaScript must be enabled in the web browser.

1.3.   Select the MSeqDR PhenoTips menu, or type the URL for the MSeqDR PhenoTips Server into the address bar: http://mseqdr.org:8080/phenotips/

1.4.   Create an account in the standalone PhenoTips server with a user ID and password. User IDs and passwords must be kept secure and updated periodically.

1.4.1.   For this workshop, demo account: User ID: DemoMSeqDR, Password: DemoMSeqDR (password is case-sensitive)

1.5.   Log in to the MSeqDR PhenoTips Server using your username and password.

 

Figure 5. Registration Screen for Standalone PhenoTips Server

 

1.6.   In PhenoTips, select a record using the "BROWSE ALL RECORD" link and review the patient information against your record. Update personal information (Figure 6). For instructions, read the PhenoTips documentation at: https://phenotips.org/UserGuide/GettingStarted.

Figure 6. MSeqDR PhenoTips Patient Information, Consent



1.7.   Click on the "Clinical symptoms and physical findings" tab, then traverse the HPO ontology tree to locate and select the matching term, or type a medical symptom description in the "Quick phenotype search" textbox and select matching terms from the autocomplete hints (Figure 7).

 

Figure 7. MSeqDR PhenoTips Phenotype Annotation

 

1.8.   In PhenoTips, click the "Family history and pedigree" tab to create the pedigree if information from other family members is available. Then click on "Draw Pedigree" and, in the popup window, select a pedigree template, click on each node to input personal and clinical information, and click the top "Save" and "Close" buttons to finish drawing the pedigree (Figure 8).

 

Figure 8. PhenoTips Family and Pedigree Editor

 

1.9.   After each revision, click the bottom "Quick Save" or "Save and View Summary" button to save your work.

 

1.10.   Case Ownership, Visibility and Adding Collaborator: Click the top right "Modify Permission" menu, and control the access rights for the patient record. Add a collaboration group, select "Can view and modify patient data" and press "Update". This allows other designated people to access and work with the same record (Figure 9).

 

Figure 9. PhenoTips Access Right Setting: Case Ownership, Visibility, Collaborator


 

Part III-b. Advanced Usage: Clinical Data Batch Submission Using Sample Sheet

 

2.      Select the "Import Sample Sheet" menu, or type the URL into the address bar: https://mseqdr.org/clinical/cpmupload.php 

3.      Login with your account username and password (Figure 2). LinkedIn account can be used. For this workshop, demo account: User ID: DemoMSeqDR, Password: DemoMSeqDR (password is case-sensitive)

4.      Specify input flat file source and input format from drop-down list, as one of: CMA, MSeqDR Default, OTHER.

5.      If you choose "OTHER" as the input source, please provide a brief name for the input source in the bottom right textbox. 

6.      First data input method: "Import CSV file",

6.1.   Click "Browse" at top left and

6.2.   Locate and select your input file from your computer, and click "Open".

6.3.   File content is automatically processed for review before saving to MSeqDR.

7.      Second data input method: Copy or Type in the Text Box.

7.1.   Open the input file in Excel or a similar tool. It is recommended that your input use EXACTLY the same column headers as the "Core_Sample_Column", which can be automatically matched by keywords.

7.2.   Copy the content from the input file, including the header row as the first row.

7.3.   Paste into top text area. File content is automatically processed for review before saving to MSeqDR.

8.      Check the formatted result in the box under "Review your input", and make sure it is consistent with your input.

9.      Map your input data columns to "Core_Sample_Column"

9.1.   Click "Review Header Mapping" Button in the middle to check automatic direct text match result. This is also required to enable drag-and-drop column header mapping, as described below.

9.2.   Use mouse to move to the left or right to show the columns you will work on, from the tables under "MSeqDR Core Sample Columns" or "Review your input".

9.3.   Required or important "MSeqDR Core Sample Columns" are highlighted in yellow. For clinical samples, your information must have at least one of: (1) a unique "PATIENT_MRN" that does not conflict with other records in the database, and/or (2) a unique combination of "PATIENT_NAME" and "PATIENT_DOB". The information must allow a patient to be uniquely identified in the MSeqDR registry.

9.4.   For research samples, the information must allow a subject or sample to be uniquely identified in the MSeqDR registry. This may be achieved by providing unique sample name, and/or accession number that is not in conflict with other records in the database.

9.5.   Use mouse to drag and drop column headers from "Review your input" table to the table under  "MSeqDR Core Sample Columns", and place it to cells in the "User Input Column" under the "Core_Sample_Column" it matches (Figure 10, blue arrow).

9.6.   Click "Review Header Mapping" Button in the middle again to review the column match result. Repeat mapping if necessary.

10.  Add your "Data Description" in the text box below. Multiple lines are allowed. Standard data information is automatically generated if you do not specify your own description. Revise it with your description if necessary.

11.  Check the "Create MSEQ_SID  " checkbox if you want to assign MSEQ_SID  to the records. Check the "Research Sample" checkbox if you are working on research samples or validation samples with limited personal identification information.

12.  When you finish column mapping, proceed to finish data import by clicking the "Import" button.

13.  Review and manage the submitted dataset. An existing MSEQ_SID will be used for newly added duplicate records from the same person.

14.  If import is not successful, check the input file for the record and try again. Save the warning message and contact the website administrator for assistance using the "feedback" link at the bottom of the page.
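Step 9.3 above requires each clinical record to carry a unique "PATIENT_MRN" and/or a unique "PATIENT_NAME" plus "PATIENT_DOB" combination. The sketch below runs that check on a CSV sample sheet before upload. It only detects conflicts inside the file itself and cannot see records already in the MSeqDR database. The function name rows_without_unique_id and the example rows are hypothetical placeholders, and the file is assumed to use the core column headers exactly as named in step 9.3.

```python
import csv
import io
from collections import Counter


def _cell(row, key):
    """Return a trimmed cell value, treating a missing column as empty."""
    return (row.get(key) or "").strip()


def rows_without_unique_id(csv_text):
    """Return file line numbers of records that have neither a unique
    PATIENT_MRN nor a unique PATIENT_NAME + PATIENT_DOB pair."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    mrn_counts = Counter(_cell(r, "PATIENT_MRN") for r in rows)
    pair_counts = Counter(
        (_cell(r, "PATIENT_NAME"), _cell(r, "PATIENT_DOB")) for r in rows
    )
    flagged = []
    # Line 1 is the header, so data rows start at line 2.
    for line_no, row in enumerate(rows, start=2):
        mrn = _cell(row, "PATIENT_MRN")
        name = _cell(row, "PATIENT_NAME")
        dob = _cell(row, "PATIENT_DOB")
        unique_mrn = bool(mrn) and mrn_counts[mrn] == 1
        unique_pair = bool(name) and bool(dob) and pair_counts[(name, dob)] == 1
        if not (unique_mrn or unique_pair):
            flagged.append(line_no)
    return flagged


# Placeholder values only; never use real patient identifiers in examples.
SHEET = "\n".join([
    "PATIENT_MRN,PATIENT_NAME,PATIENT_DOB",
    "A001,Example One,2000-01-01",
    "A001,Example Two,2000-01-02",
    ",Example Three,2000-01-03",
    ",,2000-01-04",
    "A002,Example Four,2000-01-05",
    "A002,Example Four,2000-01-05",
])

if __name__ == "__main__":
    print("Lines needing attention:", rows_without_unique_id(SHEET))
```

In this sketch a repeated MRN is tolerated when the name and date of birth still identify the record, which follows the "at least one of" wording in step 9.3.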

 


 

Figure 10. MSeqDR Sample Portal Flat File Import

Drag-and-Drop Column Matching


 

14.1.   Push Imported Samples into PhenoTips (Figure 11).

14.1.1.   In web browser, click the "Push2PhenoTips" menu, or type the URL: https://phenotips.chla.usc.edu/ca.php .

14.1.2.   Select a patient record from drop-down list, click the generated link to PhenoTips, and annotate patient information in PhenoTips. MSEQ_SID is automatically generated when selected from the drop-down list, and used in creating the PhenoTips record. MSEQ_SID may have already been generated in step 7.4.9 if "Create MSEQ_SID" was selected when importing.

 

Figure 11. Push Patient Records into PhenoTips and Synchronize with PhenoTips

 

 

 

14.1.3.   In PhenoTips, review the patient information against your record. Update personal information. For instructions, read PhenoTips documentation at part one, and at: https://phenotips.org/UserGuide/GettingStarted .

 


 

Part IV. MSeqDR Other Newly Added Tools Since UMDF 2015.

1.      LinkedIn Integration: OAuth2 based authorization and automatic account creation for LinkedIn users. https://mseqdr.org/bblogin.php

2.      Gene Panel Examiner: Input RefSeq Gene Symbols, Entrez Gene IDs, or Ensembl Gene IDs and obtain a panel-style summary of all resources per gene in MSeqDR, including diseases and pathogenic variants. https://mseqdr.org/genepanel_design.php

3.      MSeqDR-CHOP-NAMDC Core Mitochondrial Disease Phenotypes, integrated into Mito-PhenoTips as "Phenotypes displayed by default" and to be used as MSeqDR clinical phenotype data capture dictionary to get standardized mitochondrial disease patient description.  http://mseqdr.org:8080/phenotips/view/PhenoTips/MSEQDR-CHOP-NAMDC

4.      MSeqDR Private Patient Exome Variant and Sample Name for Rare Disease Patient Lookup: https://mseqdr.org/portal_gene.php?gene=POLG&exp=M1   from ~1600 mainly patient exomes

5.      Important Diagnosis Gene Panels, such as Transgenomic NuclearMitome Test Panel for which MSeqDR holds aggregated variant data: https://mseqdr.org/genepanel_nuclearmitome.php
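The lookup link in item 4 above carries its inputs as query parameters (gene=POLG and exp=M1). To bookmark the same page for several genes, the links can be generated with a short script. This is a sketch only: the parameter names are read off the example URL, and treating "exp" as a study identifier is an assumption, not documented behavior.

```python
from urllib.parse import urlencode

BASE_URL = "https://mseqdr.org/portal_gene.php"


def gene_lookup_url(gene, study="M1"):
    """Build a lookup link shaped like the example in item 4.

    'study' fills the 'exp' parameter; its meaning is assumed here.
    """
    return f"{BASE_URL}?{urlencode({'gene': gene, 'exp': study})}"


if __name__ == "__main__":
    for symbol in ["POLG", "FBXL4", "AARS2"]:
        print(gene_lookup_url(symbol))
```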

 

CONTACT AND FEEDBACK:

Interested in joining MSeqDR consortium and collaboration, including depositing your aggregate data sets from your clinical or research laboratory, adding central access to additional data resources, or contributing to curation of specific gene(s)?

Please contact Dr. Marni Falk at falkm@email.chop.edu, or Dr. Xiaowu Gai at xgai@chla.usc.edu

 

Technical questions about the tutorial, data, tools and website?

Please contact Dr. Lishuang Shen at lishen@chla.usc.edu, Shen_Lishuang@yahoo.com, or feedback