Secondary Analysis Algorithms Provided by Pacific Biosciences

Following are descriptions of the secondary analysis algorithms provided by Pacific Biosciences.

AHA

AHA ("A Hybrid Assembler") is the Pacific Biosciences hybrid assembly algorithm. It is based on the open source assembly software package AMOS, with additional software components tailored to Pacific Biosciences' long reads and error profile.

AmpliconAssembly

Calls phased consensus sequences from pooled amplicon sequence data. Up to 20 distinct amplicons can be pooled.
Reads are clustered into high-level groups, then each group is phased and consensus is called using the Quiver algorithm.
If the sample is barcoded, separate calls are made for each barcode.

Base Modification Detection with Motif Finding

Identifies putative sites of base modification as well as common bacterial base modifications (6-mA, 4-mC, and optionally TET-converted 5-mC), and then analyzes the methyltransferase recognition motifs.
Detection can use either a control sample or an in silico control consisting of expected kinetic signals.

BLASR

Maps reads to genomes by finding the highest scoring local alignment or set of local alignments between the read and the genome. The initial set of candidate alignments is found by querying a rapidly searched pre-computed index of the reference genome, and then refining until only high scoring alignments are retained. The base assignment in alignments is optimized and scored using all available quality information, such as insertion and deletion quality values.
Because alignment approximates an exhaustive search, alignment significance is computed by comparing the optimal alignment score to the distribution of all other significant alignment scores.
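
The candidate-finding step can be illustrated with a minimal sketch. It assumes a plain k-mer dictionary as a stand-in for the pre-computed reference index; this is not the index structure BLASR itself uses. The function names and the k and min_hits parameters are hypothetical. Each exact k-mer match votes for the reference offset at which the read would start, and offsets with enough votes become candidates for later refinement.

```python
from collections import Counter, defaultdict

def build_kmer_index(reference, k):
    """Pre-compute a lookup from every k-mer to its start positions in the reference."""
    index = defaultdict(list)
    for i in range(len(reference) - k + 1):
        index[reference[i:i + k]].append(i)
    return index

def candidate_offsets(read, index, k, min_hits):
    """Return reference offsets supported by at least min_hits exact k-mer matches."""
    votes = Counter()
    for j in range(len(read) - k + 1):
        # A k-mer at read position j and reference position i implies
        # that the read starts at reference offset i - j.
        for i in index.get(read[j:j + k], ()):
            votes[i - j] += 1
    return sorted(offset for offset, n in votes.items() if n >= min_hits)

reference = "TTTTACGTACGGATCCAAAA"
index = build_kmer_index(reference, k=4)
print(candidate_offsets("ACGGATCC", index, k=4, min_hits=3))  # [8]
```

In a real mapper the surviving candidates would then be refined and scored with quality values, as described above; the sketch stops at candidate selection.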

BridgeMapper

Reports when unmapped portions of reads that exceed a threshold map significantly to a specified reference.
Visualizes split alignments of Pacific Biosciences subreads by displaying reads with portions mapped to separate locations.
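
The idea of selecting unmapped portions of a read can be sketched as follows. This is an illustration only, not BridgeMapper's implementation: it assumes the threshold is a minimum length in bases and that alignments are given as half-open (start, end) intervals in read coordinates. The function name and parameters are hypothetical.

```python
def unmapped_portions(read_length, aligned_intervals, min_length):
    """Return half-open (start, end) read intervals that no alignment covers
    and that are at least min_length bases long."""
    gaps = []
    cursor = 0  # end of the read region covered so far
    for start, end in sorted(aligned_intervals):
        if start - cursor >= min_length:
            gaps.append((cursor, start))
        cursor = max(cursor, end)
    # Check the tail of the read after the last alignment.
    if read_length - cursor >= min_length:
        gaps.append((cursor, read_length))
    return gaps

# A 1000-base read with bases 100-600 mapped; only the 400-base tail
# meets the assumed 200-base threshold.
print(unmapped_portions(1000, [(100, 600)], min_length=200))  # [(600, 1000)]
```

Each interval returned would then be a candidate for mapping separately against the specified reference.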

Genomic Consensus (Quiver)

Identifies haploid SNPs and single-base indels by comparing a multiple sequence alignment of mapped reads against a reference sequence.
Variant calls are made using a simple plurality algorithm.
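
A plurality call can be sketched in a few lines. The example assumes the multiple sequence alignment has already been reduced to one string of observed bases per reference position, with "-" marking a deleted base. The function name and input layout are hypothetical, and insertions and tie handling are left out for brevity.

```python
from collections import Counter

def plurality_calls(reference, columns):
    """Return (position, ref_base, called_base) wherever the most frequent
    observed base differs from the reference base."""
    calls = []
    for pos, (ref_base, observed) in enumerate(zip(reference, columns)):
        if not observed:  # no coverage at this position: make no call
            continue
        called_base, _ = Counter(observed).most_common(1)[0]
        if called_base != ref_base:
            calls.append((pos, ref_base, called_base))
    return calls

reference = "ACGT"
columns = ["AAAA", "CTTT", "GGG-", "----"]  # one string of read bases per position
print(plurality_calls(reference, columns))  # [(1, 'C', 'T'), (3, 'T', '-')]
```

Position 1 is reported as a C-to-T substitution and position 3 as a single-base deletion; position 2 is not reported because the reference base G still holds the plurality.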

GMAP

Third-party application that maps Pacific Biosciences reads onto a reference as if they were cDNA, allowing for large insertions corresponding to putative introns.

HGAP (Hierarchical Genome Assembly Process)

Performs high quality de novo assembly using a single PacBio library preparation.
HGAP consists of pre-assembly, de novo assembly with Celera® Assembler, and assembly polishing with Quiver.

PacBioToCA/CeleraAssembler

ReadsofInserts

Computes single-molecule consensus including Reads of Insert and Circular Consensus Sequences (CCS).
Provides DNA barcode analysis on these reads when samples have been multiplexed.
