In-depth analysis of interrelation between quality scores and real errors in Illumina reads

Sunyoung Kwon, Seunghyun Park, Byunghan Lee, Sungroh Yoon

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

27 Scopus citations

Abstract

In sequencing results, a quality score is reported for each base, representing the probability that the base is called incorrectly. The notion of quality scores was initially developed for conventional Sanger sequencing, but is now widely used for next-generation sequencing techniques, including Illumina. In this paper, we carry out an in-depth analysis of the quality scores reported for Illumina reads and show how they are related to real errors in the reads. We confirmed a strong interrelation between quality scores and real errors in Illumina reads, and observed that, in paired-end sequencing, reverse reads tend to have lower quality scores than forward reads. In addition, we discovered other interesting patterns from quality score analysis. Our hope is that the findings in this paper will be helpful for designing error-correction and/or filtering methods for next-generation sequencing.
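As an illustration of the first sentence of the abstract, the relation between a quality score and an error probability can be sketched as follows. This is a minimal sketch, not the authors' code: it assumes the standard Phred definition Q = -10 log10(P) and the Phred+33 ASCII encoding used in recent Illumina FASTQ files (older pipelines used an offset of 64). The function names are hypothetical and chosen for this example only.

```python
import math
from typing import List


def phred_to_error_prob(q: float) -> float:
    """Convert a Phred quality score Q to the error probability P = 10^(-Q/10)."""
    return 10 ** (-q / 10)


def error_prob_to_phred(p: float) -> float:
    """Convert an error probability P (0 < P <= 1) to a Phred score Q = -10*log10(P)."""
    return -10 * math.log10(p)


def decode_quality_string(qual: str, offset: int = 33) -> List[int]:
    """Decode a FASTQ quality string into integer Phred scores.

    The default offset of 33 is an assumption (Phred+33 encoding);
    pass offset=64 for data from older Illumina pipelines.
    """
    return [ord(c) - offset for c in qual]


if __name__ == "__main__":
    # Per-base error probabilities implied by a short, made-up quality string.
    for q in decode_quality_string("I5!"):
        print(q, phred_to_error_prob(q))
```

Under this definition, a score of 20 corresponds to a 1 in 100 chance that the base call is wrong, and a score of 30 to a 1 in 1000 chance.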

Original language: English
Title of host publication: 2013 35th Annual International Conference of the IEEE Engineering in Medicine and Biology Society, EMBC 2013
Pages: 635-638
Number of pages: 4
State: Published - 2013
Event: 2013 35th Annual International Conference of the IEEE Engineering in Medicine and Biology Society, EMBC 2013 - Osaka, Japan
Duration: 3 Jul 2013 → 7 Jul 2013

Publication series

Name: Proceedings of the Annual International Conference of the IEEE Engineering in Medicine and Biology Society, EMBS
ISSN (Print): 1557-170X

Conference

Conference: 2013 35th Annual International Conference of the IEEE Engineering in Medicine and Biology Society, EMBC 2013
Country/Territory: Japan
City: Osaka
Period: 3/07/13 → 7/07/13
