Getting Started

Recommended bulk Iso-Seq workflow

Command	Description	Output format
lima	Remove cDNA primers	`fl.bam`
isoseq refine	Remove polyA tail and artificial concatemers	`flnc.bam`
isoseq cluster2	De novo isoform-level clustering that scales to large numbers of reads (e.g. 40-100M FLNC reads)	`clustered.bam`
pbmm2	Align to the genome	`mapped.bam`
isoseq collapse	Collapse redundant transcripts based on exonic structures	`collapsed.gff`
pigeon classify	Classify transcripts against annotation	GFF and TXT files
pigeon filter	Filter transcripts for potential artifacts	GFF and TXT files

Begin with the bulk workflow, which ends at isoseq cluster2, then continue to the pigeon workflow for transcript mapping, collapse, and classification.
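As a rough illustration of the two stages described above, the sketch below transcribes the bulk table into a Python list and splits it after isoseq cluster2, the clustering step listed in the table. The names `BULK_STEPS`, `split_workflow`, and `format_plan` are hypothetical helpers introduced for this example only. The code does not run any tool and models no command-line flags; it only prints the order of steps and their outputs.

```python
# Steps of the bulk workflow, transcribed from the table above as
# (command, output) pairs. No command-line flags are modelled here.
BULK_STEPS = [
    ("lima", "fl.bam"),
    ("isoseq refine", "flnc.bam"),
    ("isoseq cluster2", "clustered.bam"),
    ("pbmm2", "mapped.bam"),
    ("isoseq collapse", "collapsed.gff"),
    ("pigeon classify", "GFF and TXT files"),
    ("pigeon filter", "GFF and TXT files"),
]


def split_workflow(steps, last_command):
    """Split steps into (up to and including last_command, the rest)."""
    commands = [command for command, _ in steps]
    cut = commands.index(last_command) + 1  # raises ValueError if absent
    return steps[:cut], steps[cut:]


def format_plan(title, steps):
    """Return a human-readable listing of one stage."""
    lines = [title]
    for number, (command, output) in enumerate(steps, start=1):
        lines.append(f"  {number}. {command} -> {output}")
    return "\n".join(lines)


# Hypothetical split point: the clustering step named in the table.
bulk_stage, pigeon_stage = split_workflow(BULK_STEPS, "isoseq cluster2")

if __name__ == "__main__":
    print(format_plan("Bulk workflow", bulk_stage))
    print(format_plan("Pigeon workflow", pigeon_stage))
```

Keeping the steps as plain data makes it easy to check the order before wiring up real invocations, which need additional inputs (primer sequences, a reference genome, an annotation) that this sketch deliberately leaves out.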

Recommended single-cell Iso-Seq workflow

Command	Description	Output format
lima	Remove cDNA primers	`fl.bam`
isoseq tag	Extract UMI and cell barcodes	`flt.bam`
isoseq refine	Remove polyA tail and artificial concatemers	`flnc.bam`
isoseq correct	Correct cell barcodes and tag reads that are real cells	`corrected.bam`
isoseq bcstats	Summarize barcode statistics for real/non-real cells	`bcstats_report.tsv`
isoseq groupdedup	Deduplicate reads	`dedup.bam`
pbmm2	Align to the genome	`mapped.bam`
isoseq collapse	Collapse redundant transcripts based on exonic structures	`collapsed.gff`
pigeon classify	Classify transcripts against annotation	GFF and TXT files
pigeon filter	Filter transcripts for potential artifacts	GFF and TXT files
pigeon make-seurat	Make gene- and isoform-level matrices	MTX and TSV files

Begin with the single-cell-specific workflow, which ends at isoseq groupdedup, then continue to the pigeon workflow for transcript mapping, collapse, and classification.
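To make the difference between the two tables concrete, the following sketch compares their command columns and reports which steps appear in only one workflow. The lists are copied from the tables; the names `BULK`, `SINGLE_CELL`, and `only_in` are hypothetical and exist only for this illustration.

```python
# Command columns of the two workflow tables, in table order.
BULK = [
    "lima",
    "isoseq refine",
    "isoseq cluster2",
    "pbmm2",
    "isoseq collapse",
    "pigeon classify",
    "pigeon filter",
]
SINGLE_CELL = [
    "lima",
    "isoseq tag",
    "isoseq refine",
    "isoseq correct",
    "isoseq bcstats",
    "isoseq groupdedup",
    "pbmm2",
    "isoseq collapse",
    "pigeon classify",
    "pigeon filter",
    "pigeon make-seurat",
]


def only_in(first, second):
    """Commands in `first` that do not appear in `second`, order preserved."""
    excluded = set(second)
    return [command for command in first if command not in excluded]


single_cell_only = only_in(SINGLE_CELL, BULK)
bulk_only = only_in(BULK, SINGLE_CELL)
shared = [command for command in SINGLE_CELL if command in set(BULK)]

if __name__ == "__main__":
    print("Single-cell only:", ", ".join(single_cell_only))
    print("Bulk only:", ", ".join(bulk_only))
    print("Shared:", ", ".join(shared))
```

By this comparison, the single-cell table adds the barcode and UMI handling steps and the matrix export, while the bulk table is the only one that lists the clustering step; the mapping, collapse, and classification steps are common to both.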