Structural variations are genomic alterations that affect large portions of chromosomes, and they play an important role in human genomics and precision medicine. Despite intense efforts over the years, discovering structural variations remains challenging because the human genome is diploid and highly repetitive. In this context, we proposed SVDSS, a novel approach for characterizing structural variations from reads produced by PacBio Sequel II, a sequencing system launched in 2019 that produces long reads (tens of thousands of base pairs) with a relatively low error rate (>99.9% accuracy). At its core, SVDSS relies on specific strings, a computer science notion we introduced to capture the differences between two sets of strings. By combining mapping-free, mapping-based, and assembly-based methodologies, SVDSS outperforms state-of-the-art approaches, especially when characterizing structural variations in repetitive regions and at heterozygous loci.
In this talk, I will introduce the notion of specific strings and present SVDSS. Then, I will summarize the challenges in structural variation characterization that I am currently investigating.
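To give a rough intuition for what "differences between two sets of strings" can mean, here is a minimal sketch. It assumes, purely for illustration, that a specific string is a shortest substring of a target string that occurs in none of the reference strings; the abstract does not state the formal definition, and the function name `specific_strings` is hypothetical. This brute-force version is not how SVDSS computes anything, and it would not scale to sequencing data.

```python
def specific_strings(target, references):
    """Return the minimal substrings of `target` absent from all `references`.

    Assumption (not taken from the abstract): a substring is "specific" when it
    occurs in no reference string while every shorter substring of it does.
    """
    def occurs(s):
        # The empty string is treated as occurring everywhere.
        return s == "" or any(s in ref for ref in references)

    result = set()
    n = len(target)
    for start in range(n):
        for end in range(start + 1, n + 1):
            candidate = target[start:end]
            if occurs(candidate):
                continue
            # If both maximal proper substrings occur in the references,
            # then every proper substring does, so the candidate is minimal.
            if occurs(candidate[1:]) and occurs(candidate[:-1]):
                result.add(candidate)
            # Any longer substring from this start contains an absent one.
            break
    return sorted(result)


if __name__ == "__main__":
    # A single substitution (A -> T) in the middle of the target.
    print(specific_strings("ACGTTCGT", ["ACGTACGT"]))
```

In this toy case the reported strings overlap the substituted position, which illustrates how such strings can point at where a target differs from a reference set.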
For internal attendees
Symbiose seminars: https://www.cesgo.org/symbiose/seminars/structural-variations-discovery…