Bina Technologies, now part of Roche Sequencing

Trio Analysis on GiaB High-Confidence SVs
Bina Technologies, Roche Sequencing
Marghoob Mohiyuddin, Jian Li, Hugo Lam

Motivation
• GiaB recently released high-confidence SVs for NA12878 for validating SV methods
  ○ 2,676 deletions, 68 insertions
• Trio sequences are available from Illumina
• MetaSV, recently published, calls SVs with high confidence using multiple methods
• Validation of the GiaB SV gold set using trio analysis with MetaSV assures quality

For Research Use Only. Not for use in diagnostic procedures.

Methodology
• Validation using trio analysis (50x coverage)
  ○ MetaSV calls on parents (NA12891, NA12892)
  ○ MetaSV calls on NA12878
• Criteria
  ○ Deletions ≥ 100bp considered (2,348, or 88% of GiaB deletions)
  ○ Reciprocal overlap of 50% used
• A GiaB deletion is validated if
  ○ Detected by MetaSV in any parent, or
  ○ Detected by MetaSV as a PASS in NA12878, or
  ○ Reported in previous literature with validation
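The validation rule above can be sketched in Python. This is a minimal illustration, not MetaSV's own code: the `Deletion` type, the half-open coordinate convention, and the function names are assumptions made for the example, and the ≥ 100bp size filter is assumed to have been applied beforehand.

```python
from typing import Iterable, NamedTuple


class Deletion(NamedTuple):
    """Hypothetical record for one deletion call (half-open interval assumed)."""
    chrom: str
    start: int
    end: int


def reciprocal_overlap(a: Deletion, b: Deletion) -> float:
    """Return the smaller of the two overlap fractions, or 0.0 if disjoint."""
    if a.chrom != b.chrom:
        return 0.0
    shared = min(a.end, b.end) - max(a.start, b.start)
    if shared <= 0:
        return 0.0
    return min(shared / (a.end - a.start), shared / (b.end - b.start))


def matches_any(target: Deletion, calls: Iterable[Deletion], min_ro: float = 0.5) -> bool:
    """True if any call reaches the reciprocal-overlap threshold with target."""
    return any(reciprocal_overlap(target, c) >= min_ro for c in calls)


def is_validated(
    giab: Deletion,
    parent_calls: Iterable[Deletion],      # all MetaSV calls in either parent
    child_pass_calls: Iterable[Deletion],  # MetaSV PASS calls in NA12878
    in_literature: bool = False,           # previously reported with validation
    min_ro: float = 0.5,
) -> bool:
    """Apply the three 'validated if' criteria in the order listed above."""
    return (
        matches_any(giab, parent_calls, min_ro)
        or matches_any(giab, child_pass_calls, min_ro)
        or in_literature
    )
```

A call that overlaps a GiaB deletion with a reciprocal overlap of 0.48 would fail the 0.5 threshold in this sketch unless one of the other two criteria applies.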


What is MetaSV?

An Ensemble Approach


MetaSV Workflow
● Ensemble SV calling
  ○ Merge SVs from multiple methods and tools
  ○ SVs detected by multiple methods are high-confidence
● Enhanced insertion detection
  ○ Existing tools are weak at detecting insertions
  ○ Use a combination of soft-clip analysis and assembly
● Assembly and alignment to refine breakpoints
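The ensemble idea can be illustrated with a small Python sketch. It is a deliberately simplified stand-in, not MetaSV's actual merging logic: the `Call` and `MergedCall` types, the greedy comparison against only the most recent cluster, the 0.5 threshold, and the "two or more tools" rule are all assumptions for illustration.

```python
from dataclasses import dataclass, field
from typing import Iterable, List, Set


@dataclass
class Call:
    """Hypothetical single-tool deletion call."""
    tool: str
    chrom: str
    start: int
    end: int


@dataclass
class MergedCall:
    """Cluster of overlapping calls; keeps the first call's coordinates."""
    chrom: str
    start: int
    end: int
    tools: Set[str] = field(default_factory=set)

    @property
    def high_confidence(self) -> bool:
        # Assumed rule: support from at least two distinct tools.
        return len(self.tools) >= 2


def _reciprocal_overlap(s1: int, e1: int, s2: int, e2: int) -> float:
    shared = min(e1, e2) - max(s1, s2)
    if shared <= 0:
        return 0.0
    return min(shared / (e1 - s1), shared / (e2 - s2))


def merge_calls(calls: Iterable[Call], min_ro: float = 0.5) -> List[MergedCall]:
    """Greedily cluster position-sorted calls by reciprocal overlap."""
    merged: List[MergedCall] = []
    for c in sorted(calls, key=lambda c: (c.chrom, c.start, c.end)):
        last = merged[-1] if merged else None
        if (
            last is not None
            and last.chrom == c.chrom
            and _reciprocal_overlap(last.start, last.end, c.start, c.end) >= min_ro
        ):
            last.tools.add(c.tool)  # same event seen by another tool
        else:
            merged.append(MergedCall(c.chrom, c.start, c.end, {c.tool}))
    return merged
```

Breakpoint refinement by assembly and alignment, which the workflow lists as a separate step, is not modeled here.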

MetaSV Accuracy
● VarSim simulation of 50x Illumina 2x100bp reads for NA12878
● Reciprocal overlap of 100% and wiggle of 100bp to assess both breakpoint precision and accuracy
● Performance varies for tools/methods across sizes
● MetaSV has the best stable accuracy across all SV sizes
● Achieved 90.2% sensitivity against Complete Genomics high-confidence SVs for NA12878


Are GiaB SVs of high quality?

GiaB Deletion Validation

                                             Total Validated   Total not validated   Additionally Validated
GiaB HC                                      0 (0%)            2,348 (100%)          0 (0%)
GiaB HC Validated by Parents (MetaSV ALL)    2,302 (98.0%)     46 (2.0%)             2,302 (98.0%)
GiaB HC Validated by Child (MetaSV PASS)     2,306 (98.2%)     42 (1.8%)             4 (0.2%)
GiaB HC Validated by Child (curated)         2,342 (99.7%)     6 (0.3%)              36 (1.5%)


Manual Inspection of Unvalidated Calls Using IGV & SVVIZ (1:19151770-19152035)

Paired-end support: only reported by BreakDancer

Manual Inspection of Unvalidated Calls (2:233764771-233765484)

Split-read support: only reported by Pindel

Manual Inspection of Unvalidated Calls (7:89737335-89738051)

Not reported by any tool. Read-depth support is weak. Read-pair support is present.

Manual Inspection of Unvalidated Calls (11:29000583-29012888)

PASS imprecise call with reciprocal overlap of 0.48, just below the 0.5 threshold. Seems to be misaligned.

Manual Inspection of Unvalidated Calls (14:106798247-106822961)

Both paired-end and read-depth support, but BreakDancer reports a much larger 298 kb deletion

Manual Inspection of Unvalidated Calls (16:83984084-83984359)

Coverage around the region is unusually high (> 200x, versus the expected 50x).

Is GiaB SV missing anything?

GiaB Deletion Validation using Mendelian Rule

[Venn diagram: GiaB HC deletions vs. MetaSV trio deletions]
• GiaB HC Dels: 2,348
• GiaB Private: 222
• Common: 2,126
• MetaSV Private: 456
• MetaSV Trio Dels: 2,582
• (142 not in literature)

MetaSV PASS Dels: 2,671; 96.7% are Mendelian consistent (98.7% if ignoring genotypes)

MetaSV High Quality Trio Deletions:
○ Mendelian inheritance consistency with genotypes
○ PASS in child and ALL in parents
○ Considering no-call as reference call
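A Mendelian consistency check of the kind described here can be sketched as follows. The VCF-style genotype strings (`0/0`, `0/1`, `1/1`, `./.`), the function names, and the reading of "ignoring genotypes" as a presence-only check are assumptions for illustration; the slides do not spell out the exact rule used.

```python
from itertools import product
from typing import Optional, Tuple


def alleles(gt: Optional[str]) -> Tuple[int, ...]:
    """Parse a diploid genotype string; a no-call is treated as reference (0, 0)."""
    if gt is None or "." in gt:
        return (0, 0)
    return tuple(int(a) for a in gt.replace("|", "/").split("/"))


def mendelian_consistent(child: str, father: str, mother: str) -> bool:
    """True if the child's genotype can be formed from one allele of each parent."""
    target = sorted(alleles(child))
    return any(
        sorted((f, m)) == target
        for f, m in product(alleles(father), alleles(mother))
    )


def presence_consistent(child: str, father: str, mother: str) -> bool:
    """Genotype-agnostic check (assumed meaning of 'ignoring genotypes'):
    a deletion carried by the child must be carried by at least one parent."""
    def carries(gt: str) -> bool:
        return any(alleles(gt))

    return (not carries(child)) or carries(father) or carries(mother)
```

For example, a child genotyped `1/1` with a `0/1` father and a no-call mother fails the strict check, because the no-call is counted as reference, but passes the presence-only check.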

Manual Inspection of MetaSV Private Calls (1:95690510-95690829)

Reported by all 4 tools, genotyped as 1/1

Manual Inspection of MetaSV Private Calls (1:153215877-153216600)

Reported by CNVnator, BreakDancer, genotyped as 0/1

Manual Inspection of MetaSV Private Calls (12:50272565-50273865)

Reported by BreakDancer, Pindel, genotyped as 0/1

What’s next?

Conclusions
• GiaB SVs have a high validation rate using MetaSV trio analysis
• Only 6 unvalidated SVs do not have strong support in IGV or SVVIZ
• ⇒ GiaB SVs are of high quality
• Almost all (up to 98.7%) MetaSV PASS calls are Mendelian consistent, making them high quality
• A significant number (456) of MetaSV trio calls are not in the GiaB set, possibly missed due to stringent GiaB requirements, since 321 of those are reported in the literature
• MetaSV trio validation can help validate and extend the existing gold set

Future Work

• Using MetaSV genotype information on the two additional trios from GiaB
• Four levels of quality classification for the child
  ○ HighQual (validated by strict Mendelian inheritance)
  ○ PASSII (validated by presence in parents)
  ○ PASSI (validated by multiple methods)
  ○ LowQual (none of the above)
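The four proposed quality levels could be assigned with a simple precedence rule, sketched below. The precedence order (taken from the order in which the levels are listed), the "two or more methods" threshold, and all names are assumptions; the slides describe this only as future work.

```python
from enum import Enum


class Quality(Enum):
    HIGH_QUAL = "HighQual"  # validated by strict Mendelian inheritance
    PASS_II = "PASSII"      # validated by presence in parents
    PASS_I = "PASSI"        # validated by multiple methods
    LOW_QUAL = "LowQual"    # none of the above


def classify(mendelian_ok: bool, present_in_parent: bool, n_methods: int) -> Quality:
    """Assign the highest level whose condition holds (assumed precedence)."""
    if mendelian_ok:
        return Quality.HIGH_QUAL
    if present_in_parent:
        return Quality.PASS_II
    if n_methods >= 2:  # assumed meaning of "multiple methods"
        return Quality.PASS_I
    return Quality.LOW_QUAL
```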


Acknowledgement
• Genome in a Bottle
  ○ Hemang Parikh
  ○ Justin Zook
  ○ The SV Team
• Bina Technologies
  ○ Jian Li
  ○ Hugo Lam
  ○ The Science Team
