Phylogenetic tree count_table error

Hello,

I am currently analyzing 16S rRNA data as ASVs with mothur and processing the results in R with the phyloseq package. Right now, the .shared and .taxonomy files from my batch script come out fine and form a correctly formatted phyloseq object in R. However, when I try to add my .tre file, which is generated from a PHYLIP-formatted distance matrix, it doesn’t work. I think the issue is that the .tre file contains only the sequence names, with no ASV numbers or count information. Previous workflows on the internet use “get.oturep” to extract the representative sequence for each ASV, along with its count, and then build a phylip.dist file from those representatives. That phylip.dist file is then passed to clearcut(phylip=current) to make the .tre file I need. However, when I run “get.oturep”, it gives an error stating that my “count_table” file has more than one sequence with the same name. I am unsure why this is happening, and it seems odd because my full batch script runs through mothur with the count_table file without any errors. Do you have any suggestions?

#mothur batch script that worked correctly
make.contigs(file=stability.files, trimoverlap=t)
summary.seqs(fasta=current, processors=128)
screen.seqs(fasta=current, count=current, maxambig=0, maxlength=294)
unique.seqs(fasta=current, count=current)
align.seqs(fasta=current, reference=~/silva.nr_v138.align, flip=t)
summary.seqs(fasta=current, count=current, processors=128)
screen.seqs(fasta=current, count=current, start=11895, end=25318, maxhomop=5)
summary.seqs(fasta=current, count=current, processors=128)
filter.seqs(fasta=current, vertical=T, trump=.)
unique.seqs(fasta=current, count=current)
pre.cluster(fasta=current, count=current, diffs=2)
chimera.uchime(fasta=current, count=current, dereplicate=t)
remove.seqs(fasta=current, accnos=current)
classify.seqs(fasta=current, count=current, reference=~/silva.nr_v138.align, taxonomy=~/silva.nr_v138.tax, cutoff=60, probs=F)
remove.lineage(fasta=current, count=current, taxonomy=current, taxon='Chloroplast;-Eukaryota;-unknown;-Mitochondria;-Oxyphotobacteria_unclassified;')
cluster.split(fasta=current, count=current, taxonomy=current, splitmethod=classify, taxlevel=4, cutoff=0.01)
make.shared(list=current, count=current, label=0.03)
classify.otu(list=current, count=current)
make.shared(count=current)
classify.otu(list=current, count=current, label=asv)
dist.seqs(fasta=current, output=lt, processors=24)
clearcut(phylip=current) #this worked: dist.seqs created the phylip.dist file and clearcut built the .tre from it


#running mothur interactively to see if I can get the "get.oturep" command to work.


get.oturep(column=stability.trim.contigs.good.unique.good.filter.unique.precluster.denovo.uchime.pick.pick.dist, list=stability.trim.contigs.good.unique.good.filter.unique.precluster.denovo.uchime.pick.asv.list, fasta=stability.trim.contigs.good.unique.good.filter.unique.precluster.denovo.uchime.pick.pick.fasta, count=stability.trim.contigs.good.unique.good.filter.unique.precluster.count_table, label=0.03)


Hi there,

I’d suggest running get.oturep and then dist.seqs/clearcut on the output of that. You’re running clearcut on all of the sequences, not just the representative sequences.
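
As a rough sketch of that order of operations, the last two lines of the batch script could be swapped for something like the following. It assumes the "current" list, count and fasta files are still the ones left behind by the make.shared/classify.otu steps, that label=asv matches the label in the ASV list file, and that method=abundance is an acceptable way to pick representatives (it avoids needing a distance matrix up front). None of this has been run against the files in this thread.

```
# sketch only: pick one representative sequence per ASV first
# assumption: list, count and fasta "current" files come from the steps above
get.oturep(list=current, count=current, fasta=current, method=abundance, label=asv)
# assumption: get.oturep leaves its *.rep.fasta as the current fasta;
# if it does not, pass that file name to dist.seqs explicitly
dist.seqs(fasta=current, output=lt, processors=24)
# build the tree from the representative sequences only
clearcut(phylip=current)
```

If get.oturep still complains about duplicate names, it may be worth checking that the count file it receives is the one that matches the fasta and list files from the same stage of the pipeline.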

Pat