New QC Pipeline for Iso-Seq Data Increases Confidence in Transcript Results
Thursday, April 13, 2017
A preprint from scientists at the University of Florida, Centro de Investigaciones Principe Felipe, and other institutes describes a new analysis tool to help boost the quality of transcriptome studies. “SQANTI: extensive characterization of long read transcript sequences for quality control in full-length transcriptome identification and quantification” comes from lead author Manuel Tardaguila, senior author Ana Conesa, and collaborators.
The automated pipeline for Structural and Quality Annotation of Novel Transcript Isoforms (SQANTI) was developed as a quality-assessment tool for transcripts discovered with SMRT Sequencing. SQANTI “calculates up to 35 different descriptors of transcript quality and creates a wide range of summary graphs to aid in the interpretation of the sequencing output,” the authors report.
Development of this new pipeline was spurred by the realization that different transcript analysis tools yielded different results, even for the same data set. “As an example, sequencing the mouse neural transcriptome with PacBio long reads, we obtained ~ 80,000, 12,000 and 16,000 different transcripts when applying Tapis, IDP or the ToFU pipeline, respectively,” the scientists write. “Implementing a comprehensive, quality aware analysis of PacBio reads is fundamental at a time when long read transcriptome sequencing is becoming more popular and important conclusions on transcriptome diversity will be drawn from these data.”
SQANTI consists of tools to classify transcripts by comparison to a reference annotation, analyze data by more than 30 metrics, and generate graphs to report results. The team tested it using neural tissue from mice, performing extensive RT-PCR validation to measure transcript expression. PacBio sequencing of the tissue identified many novel transcripts, but “an important fraction of the novel sequences are presumably bioinformatics or retrotranscription artifacts that can be removed by using SQANTI descriptors,” the scientists report.
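To make the idea of classifying transcripts against a reference annotation more concrete, here is a minimal Python sketch. It is not SQANTI's implementation: the function names, the category labels, and the representation of a transcript as an ordered chain of splice junctions are all assumptions made for illustration, and the sketch ignores chromosome, strand, and transcript ends.

```python
from typing import Iterable, Tuple

Junction = Tuple[int, int]      # (donor position, acceptor position); hypothetical representation
Chain = Tuple[Junction, ...]    # ordered splice junctions of one transcript


def is_contiguous_subchain(query: Chain, ref: Chain) -> bool:
    """Return True if query appears as a consecutive run of junctions inside ref."""
    n = len(query)
    return any(ref[i:i + n] == query for i in range(len(ref) - n + 1))


def classify(query: Chain, reference: Iterable[Chain]) -> str:
    """Assign an illustrative category label to a query transcript.

    The labels are hypothetical and simplified; they are not the
    categories or descriptors reported by SQANTI itself.
    """
    reference = list(reference)
    if not query:
        # No splice junctions at all, so there is nothing to compare.
        return "mono_exon"
    if query in reference:
        # Every junction matches one annotated transcript, in the same order.
        return "full_match"
    if any(is_contiguous_subchain(query, ref) for ref in reference):
        # The query covers only a consecutive part of an annotated transcript.
        return "partial_match"
    known = {junction for ref in reference for junction in ref}
    if all(junction in known for junction in query):
        # All junctions are annotated, but not in this combination.
        return "novel_known_junctions"
    # At least one junction is absent from the reference.
    return "novel_new_junctions"


if __name__ == "__main__":
    reference = [
        ((100, 200), (300, 400), (500, 600)),
        ((100, 200), (350, 400)),
    ]
    print(classify(((300, 400), (500, 600)), reference))
    print(classify(((100, 250),), reference))
```

In a sketch like this, transcripts that land in the novel categories would be the natural candidates for the closer inspection that quality descriptors are meant to support, since some of them may be artifacts rather than real isoforms.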
They also evaluated results against data from short-read sequencing. “A comparison of Iso-Seq over the classical RNA-seq approaches solely based on short-reads demonstrates that the PacBio transcriptome not only succeeds in capturing the most robustly expressed fraction of transcripts, but also avoids quantification errors caused by unaccounted 3’ end variability in the reference,” Tardaguila et al. write. “SQANTI allows the user to maximize the analytical outcome of long read technologies by providing the tools to deliver quality-evaluated and curated full-length transcriptomes.”
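As the exclusive distributor for Pacific Biosciences in China, Gene Company Limited (基因有限公司) has been bringing PacBio's third-generation single-molecule real-time sequencing technology to the country since 2011. It has provided users in China with professional installation training, technical support, application training, and after-sales maintenance for third-generation sequencing systems, earning the praise and trust of its customers. Gene Company Limited will continue to support the growing number of PacBio users.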

