Protocol for processing full-length 16S sequencing data from PacBio

Dear all,
Do you have a protocol for processing PacBio sequencing data using mothur?

I don’t have a web-based SOP, but you can read through my paper to get an idea of what to do…


Hi Pat, so you think that this SOP will handle the near-full-length PacBio HiFi reads we are getting?



P.S. Would you be willing to share the logfile?

Here is the entire codebase for that paper:

Getting your file now, thanks so much :grin:

Sorry to be dumb, but I looked at the GitHub page and can’t find the mothur commands for running PacBio seqs.

I tried running mothur using these commands:
…pacbio=T)
summary.seqs(fasta=PB_16S_group.fasta, count=PB_16S_group.count_table, processors=6)
screen.seqs(fasta=PB_16S_group.fasta, count=PB_16S_group.count_table, maxambig=0, maxlength=1493, maxhomop=8)
summary.seqs(fasta=current, count=current)
align.seqs(fasta=current, reference=silva.nr_v138_1.align, processors=10)
summary.seqs(fasta=current, count=current)
screen.seqs(fasta=current, count=current, start=1044, end=43116)
summary.seqs(fasta=current, count=current)
filter.seqs(fasta=current, vertical=T, trump=.)

and then it crashed with this error message:

Using PB_16S_group.good.unique.good.good.align as input file for the fasta parameter.

Using 8 processors.
Creating Filter…
[ERROR]: Sequences are not all the same length, please correct.
It took 63 secs to create filter for 202895 sequences.

These files work in DADA2, but I also want to test a good old-fashioned 97% OTU approach. Any help welcome.
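One way to look into the "not all the same length" error outside of mothur is to tally the sequence lengths in the aligned file. The Python below is only a rough sketch, not part of mothur: the function name `alignment_lengths` is made up for illustration, and it assumes the file is plain FASTA with `>` header lines. In a properly aligned file, every sequence should have the same length, so more than one entry in the tally points to unaligned or truncated records.

```python
from collections import Counter


def alignment_lengths(path):
    """Return a Counter mapping sequence length -> number of sequences.

    Hypothetical helper: reads a FASTA file in which a record may span
    several lines, and counts alignment characters (bases, '-' and '.').
    """
    lengths = Counter()
    current = None  # length of the record being read; None before the first header
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if line.startswith(">"):
                # A new header closes the previous record, if there was one.
                if current is not None:
                    lengths[current] += 1
                current = 0
            elif current is not None:
                current += len(line)
    # Close the final record.
    if current is not None:
        lengths[current] += 1
    return lengths
```

Calling `alignment_lengths("PB_16S_group.good.unique.good.good.align")` (the file named in the log) and printing the result would show whether a handful of sequences or most of the file deviate from the expected alignment width.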

It looks like you have too many "good"s in the file name for the pipeline you gave. At what command did it start complaining? You might go back to the last command that ran without an error message and then step through each of the commands from there.
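The mismatch can be spelled out by counting the tags in the file name from the log and comparing them with the commands that were posted. This is a small illustrative sketch in Python, and it assumes that each `screen.seqs` run adds one `good` tag and each `unique.seqs` run adds one `unique` tag to the output name; the helper `tag_counts` is made up for this example.

```python
from collections import Counter


def tag_counts(filename):
    """Count the dot-separated tags between the root name and the extension."""
    parts = filename.split(".")
    return Counter(parts[1:-1])


# File name reported in the log.
counts = tag_counts("PB_16S_group.good.unique.good.good.align")

# Command names from the posted pipeline, in order (truncated first command omitted).
posted_commands = [
    "summary.seqs", "screen.seqs", "summary.seqs", "align.seqs",
    "summary.seqs", "screen.seqs", "summary.seqs", "filter.seqs",
]
screen_runs = posted_commands.count("screen.seqs")
unique_runs = posted_commands.count("unique.seqs")

print("good tags:", counts["good"], "screen.seqs runs posted:", screen_runs)
print("unique tags:", counts["unique"], "unique.seqs runs posted:", unique_runs)
```

Under those assumptions, the file name carries three `good` tags and one `unique` tag, while the posted commands include only two `screen.seqs` calls and no `unique.seqs`, so the file that `filter.seqs` picked up as "current" probably came from a different sequence of commands than the one shown.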