Use this protocol to classify, cluster, and map cDNA sequences.
It supports applications in transcriptomics and enables the functional characterization of transcripts and splice variants sequenced on the PacBio platform.
Note: This protocol replaces the RS_cDNA_Mapping protocol provided in previous releases.
The protocol includes three main steps:
Classify: Extract reads of insert from PacBio movies, remove cDNA primers and poly(A) tails, then classify each read of insert as chimeric or non-chimeric and as full-length or non-full-length.
Cluster: Predict de novo consensus isoforms of transcripts from the classified reads using the ICE (Iterative Clustering and Error Correction) algorithm.
Map: Using GMAP, align the classified reads and predicted consensus isoforms to the user-specified reference sequence.
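As a rough illustration of what the Classify step produces, the sketch below labels each read of insert along the two axes described above and tallies the four resulting categories. The ReadOfInsert class and its is_chimeric and is_full_length flags are hypothetical stand-ins for whatever the protocol computes internally; this is not the protocol's own code.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class ReadOfInsert:
    """Hypothetical record for one read of insert (not a PacBio API type)."""
    name: str
    is_chimeric: bool      # assumed to be decided by an upstream chimera check
    is_full_length: bool   # assumed to be decided by an upstream primer/poly(A) check


def classify_label(read: ReadOfInsert) -> str:
    """Return the two-axis label for a read, e.g. 'non-chimeric, full-length'."""
    chimera = "chimeric" if read.is_chimeric else "non-chimeric"
    length = "full-length" if read.is_full_length else "non-full-length"
    return f"{chimera}, {length}"


def tally(reads) -> Counter:
    """Count how many reads fall into each of the four categories."""
    return Counter(classify_label(r) for r in reads)


if __name__ == "__main__":
    # Illustrative reads only; names and flags are made up.
    reads = [
        ReadOfInsert("read/1", is_chimeric=False, is_full_length=True),
        ReadOfInsert("read/2", is_chimeric=False, is_full_length=False),
        ReadOfInsert("read/3", is_chimeric=True, is_full_length=True),
    ]
    for label, count in sorted(tally(reads).items()):
        print(f"{label}: {count}")
```

The two flags are kept independent because the protocol describes two separate distinctions, so a read can be, for example, both chimeric and full-length.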
Filtering Parameters (IsoSeq Reads of Insert)
Isoseq_classify Parameters (IsoSeq Classify v1):
Maximum Number Of Paths Per Isoform or Read: The maximum number of GMAP paths to show for each isoform or read. If set to 0, GMAP outputs two paths when a chimera is detected and one path otherwise. (This is the same as the GMAP --npaths option.)
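The special handling of 0 can be easy to misread, so here is a minimal sketch of the rule as stated above. The function name path_limit and its arguments are hypothetical and only model the documented behavior; the function does not call GMAP.

```python
def path_limit(max_paths: int, chimera_detected: bool) -> int:
    """Return the upper bound on paths shown for one isoform or read.

    Models the documented rule only: a setting of 0 means two paths when a
    chimera is detected and one path otherwise; any positive setting is
    used directly as the cap.
    """
    if max_paths < 0:
        raise ValueError("max_paths must be non-negative")
    if max_paths == 0:
        return 2 if chimera_detected else 1
    return max_paths


if __name__ == "__main__":
    for setting in (0, 1, 5):
        for chimera in (False, True):
            print(setting, chimera, path_limit(setting, chimera))
```

Note that the returned value is only a cap: fewer paths may be shown if fewer alignments exist.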
Isoseq_cluster Parameters (IsoSeqCluster v1):
Predict Consensus Isoforms Using The ICE Algorithm: Specify whether or not to predict consensus isoforms using the Iterative Clustering and Error Correction (ICE) algorithm.
Parallel Tasks: The number of tasks to run in parallel while performing iterative clustering.
Estimated cDNA Size: The estimated cDNA sequence size, ranging from under 1 kbp to over 3 kbp.
Call Quiver to Polish Consensus Isoforms: Specify whether or not to use the Quiver algorithm to polish the resulting consensus isoforms.
Minimum Quiver Accuracy to Classify an Isoform as HQ: The minimum Quiver accuracy needed to classify an isoform as high-quality.
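To make the effect of the minimum-accuracy setting concrete, the sketch below splits polished isoforms into those that meet the threshold and those that do not. It assumes accuracy is expressed as a fraction between 0 and 1 and that the threshold is inclusive (suggested by the word "minimum"); the function name, the isoform names, and the accuracy values are all illustrative, not taken from the protocol.

```python
def split_by_accuracy(accuracies: dict, min_accuracy: float):
    """Split isoform names into (high_quality, remaining) lists.

    accuracies maps an isoform name to its polished accuracy (assumed 0..1).
    An isoform counts as high-quality when its accuracy is at least
    min_accuracy (inclusive comparison is an assumption).
    """
    if not 0.0 <= min_accuracy <= 1.0:
        raise ValueError("min_accuracy must be between 0 and 1")
    high_quality = sorted(n for n, acc in accuracies.items() if acc >= min_accuracy)
    remaining = sorted(n for n, acc in accuracies.items() if acc < min_accuracy)
    return high_quality, remaining


if __name__ == "__main__":
    # Made-up isoform names and accuracies, for illustration only.
    example = {"isoform_a": 0.995, "isoform_b": 0.98, "isoform_c": 0.99}
    hq, rest = split_by_accuracy(example, min_accuracy=0.99)
    print("high-quality:", hq)
    print("remaining:", rest)
```

Raising the threshold moves isoforms from the high-quality list into the remaining list; it does not discard them.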