Yu Lin joined the Research School of Computer Science in September 2016. Prior to this, he was a postdoctoral fellow at the Department of Computer Science and Engineering, University of California, San Diego. His research focuses on computational biology and bioinformatics, and he has been working on algorithms for genome assembly, the analysis of genome rearrangements, and phylogenetic reconstruction.
He received his PhD in Computer Science from École Polytechnique Fédérale de Lausanne (EPFL), Switzerland. He also holds a master's degree in Computer Science from the Chinese Academy of Sciences and a bachelor's degree in Computer Science from the University of Science and Technology of China (USTC).
He has received two awards from the Swiss National Science Foundation. He is also the recipient of the Chinese Government Award for Outstanding Self-Financed Students Abroad (2012), the Director's Award from the Institute of Computing Technology, Chinese Academy of Sciences (2007), and the Guo Moruo Presidential Award from the University of Science and Technology of China (2004).
My research focuses on computational models and algorithms for the study of genome assembly and genome evolution.
De Bruijn graph approaches dominated genome assembly in the last decade and resulted in many software tools for assembling short and accurate reads (e.g., Illumina reads). The recent breakthroughs in assembling long error-prone reads (e.g., PacBio SMRT and Oxford Nanopore reads) were all based on the overlap-layout-consensus approach, which requires all-against-all comparison of reads and remains computationally challenging.
We generalized de Bruijn graphs and designed the A-Bruijn assembler to assemble long error-prone reads. The A-Bruijn assembler builds the A-Bruijn graph directly from long error-prone reads and produces assemblies from that graph, and thus avoids error-correcting individual reads through extensive pairwise alignments. The A-Bruijn assembler also utilizes a new error correction approach and generates highly accurate genome reconstructions.
The A-Bruijn graph model also benefits the classic de Bruijn graph model for assembling short and accurate reads. The A-Bruijn assembler, for the first time, allows one to automatically choose larger values of the k-mer size in high-coverage regions to reduce repeat collapsing and smaller values of the k-mer size in the low-coverage regions to avoid fragmentation of the de Bruijn graph. Moreover, in the case of error-free reads, the A-Bruijn graph does not require any parameter setup and can be constructed in linear time and space.
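The effect of the k-mer size on repeat collapsing can be illustrated with a minimal sketch of the classic de Bruijn graph (not the A-Bruijn assembler itself). The function names and the toy sequence below are hypothetical and chosen only for illustration; the sketch assumes error-free reads and ignores reverse complements and coverage.

```python
from collections import defaultdict


def de_bruijn_graph(reads, k):
    """Build a classic de Bruijn graph from error-free reads.

    Nodes are (k-1)-mers; each k-mer in a read adds an edge from its
    prefix (k-1)-mer to its suffix (k-1)-mer.
    """
    graph = defaultdict(set)
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            graph[kmer[:-1]].add(kmer[1:])
    return graph


def count_branching_nodes(graph):
    """Count nodes with more than one distinct successor."""
    return sum(1 for successors in graph.values() if len(successors) > 1)


# Toy sequence (an assumption for illustration) in which the
# substring "TGCC" occurs twice, i.e. a short repeat.
toy_genome = "ATGCCGTGCCAA"

# With a small k the two repeat copies collapse into shared nodes,
# which creates a branching node in the graph.
small_k_branches = count_branching_nodes(de_bruijn_graph([toy_genome], 3))

# With k - 1 longer than the repeat, every node is unique and the
# graph is a single unbranched path.
large_k_branches = count_branching_nodes(de_bruijn_graph([toy_genome], 6))

print(small_k_branches, large_k_branches)
```

In this toy case the larger k resolves the repeat. The sketch shows only the repeat-collapsing side of the trade-off: with real data, a larger k also requires higher coverage, since otherwise consecutive k-mers are missing from the reads and the graph fragments.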
Comparative and phylogenetic studies have proved useful in a myriad of applications ranging from novel medicines to outbreak analysis. As whole genomes are sequenced at increasing rates, using genome rearrangements for phylogenetic analyses is attracting increasing interest, especially as researchers uncover links between genome rearrangements and various diseases. However, previous phylogenetic studies of genome rearrangements were limited to small collections of genomes, low-resolution data (i.e., small numbers of synteny blocks), and oversimplified models (e.g., no segmental duplications), and lacked an effective assessment of robustness.
We proposed various evolutionary models for genome rearrangements and designed algorithms for analyzing the process of whole-genome evolution. Combining these models and algorithms, we designed the first phylogenetic reconstruction tool that overcomes all of the above difficulties: it supports a general model of genomic evolution, is very accurate, scales as well as sequence-based approaches, is robust against typical assembly errors, and supports standard bootstrapping methods. Moreover, this tool allows one to go beyond the inference of evolutionary relationships, and to infer both large-scale changes and local changes in the phylogeny as well as to reconstruct ancestral genomes.
The 22nd Annual International Conference on Intelligent Systems for Molecular Biology (ISMB'14)
The joint 23rd Annual International Conference on Intelligent Systems for Molecular Biology and 14th European Conference on Computational Biology (ISMB/ECCB'15)
The 14th Asia Pacific Bioinformatics Conference (APBC'16)
The 24th Annual International Conference on Intelligent Systems for Molecular Biology (ISMB'16)
The 15th European Conference on Computational Biology (ECCB'16)
The 15th Asia Pacific Bioinformatics Conference (APBC'17)