Background
HIV mutates frequently from one viral generation to the next, resulting in high genetic diversity of the HIV population (termed “quasispecies”) within a given infected host over time [1, 2]. Under selective pressure (e.g. antiretroviral treatment) in particular, HIV quasispecies with specific characteristics (e.g. drug resistance, high transmissibility) can be propagated [1, 3]. Sequencing and analysis of HIV quasispecies is therefore important for improving personalized treatment plans, developing early prevention measures, and designing more effective vaccines for patients [1, 3, 4].
Although Illumina shotgun sequencing, a next-generation sequencing (NGS) technology, offers improved variant detection (~1% detection limit) over the traditional Sanger-based sequencing method (~20% detection limit) [4, 5], both share a common weakness when sequencing HIV strains: they detect mutations individually, and the important linkage relationship among mutations is lost during the procedure [6]. In contrast, the recently emerging third-generation sequencing technology, PacBio, can continuously sequence up to 10 kb per read, which in theory not only provides high detection sensitivity but also preserves the linkage among mutations [6, 7]. PacBio technology therefore has the potential to shift HIV genetic studies from individual mutation detection to explicit quasispecies detection [6, 7]. However, the high-noise nature of PacBio reads and the lack of effective data analysis tools still pose a barrier to fully utilizing its power in the field of HIV genetics [6, 8]. Several studies have addressed these challenges with different data analysis strategies, such as tag-focusing, error correction, and clustering [4, 5]. The shortcomings of these works include, but are not limited to, the following: 1. The error correction methods rely heavily on mathematical assumptions that the errors follow a certain statistical distribution, which may not always hold in practice because of complicated noise sources. 2. Most of the tools are reference-based, which can be a problem if the sequenced sample differs significantly (e.g. large deletions; different HIV types) from the wild-type reference. 3. Most importantly, the error correction methods might be over-trained on a simpler artificial training dataset without further cross-testing on real clinical patient samples, which can be much more complex and challenging than the artificial dataset.
This unmet need in bioinformatics analysis calls for further improvement of the algorithms.
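The loss of linkage information in per-mutation detection can be illustrated with a toy example. The sketch below is not part of the procedure described in this report; the function names and the two hypothetical populations (encoded as strings over two sites, where "1" marks a mutation and "0" the reference base) are assumptions made purely for illustration. Both populations show the same per-site mutation frequencies, yet they consist of different quasispecies, which only reads spanning both sites can reveal.

```python
from collections import Counter


def site_frequencies(reads):
    """Per-site mutation frequency, i.e. what per-mutation detection reports."""
    n = len(reads)
    length = len(reads[0])
    return [sum(r[i] == "1" for r in reads) / n for i in range(length)]


def haplotype_frequencies(reads):
    """Frequency of each full-length haplotype, i.e. what linked reads reveal."""
    n = len(reads)
    return {hap: count / n for hap, count in Counter(reads).items()}


# Hypothetical population A: both mutations always occur on the same genome.
linked = ["11"] * 50 + ["00"] * 50
# Hypothetical population B: each genome carries exactly one of the two mutations.
unlinked = ["10"] * 50 + ["01"] * 50

# Per-site summaries are identical for the two populations ...
print(site_frequencies(linked), site_frequencies(unlinked))
# ... while the haplotype-level summaries differ.
print(haplotype_frequencies(linked))
print(haplotype_frequencies(unlinked))
```

In this toy case, a double mutant is present at 50% in one population and absent in the other, although each individual mutation is observed at 50% in both.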
In this report, we describe an improved de novo assembly procedure that accurately constructs HIV quasispecies with high sensitivity. A FASTA file of PacBio reads is the only required input; no reference sequence or other prior knowledge of the sequences is needed. The procedure was successfully applied not only to HIV benchmark datasets but also to real-life samples from HIV relapse patients, leading to early detection of the dynamics of drug-resistant HIV strains.