I am trying to remove singletons and then continue with alpha and beta diversity analysis. Can anyone advise at which step this command should be inserted?
Here is what I did, following the mothur SOP:
mothur > sub.sample(shared=stability.an.shared, size=6105)
Sampling 6105 from each group.
0.03
I then got the minimal number of OTUs that were present in all of these samples, and went on from there with:
collect.single(shared=stability1.an.0.03.subsample.0.03.pick.shared, calc=chao-invsimpson, freq=100)
Does this insertion make sense? I mean, if I want to compare the results with and without removing the singletons.
Thank you!
taxonomy=current does not work (see below). If I use the taxonomy file produced before removing singletons, will it mess up the classification?
mothur > classify.otu(list=current, count=current, taxonomy=current, label=0.03)
Using stability1.trim.contigs.good.unique.good.filter.unique.precluster.uchime.pick.pick.pick.count_table as input file for the count parameter.
Using stability1.trim.contigs.good.unique.good.filter.unique.precluster.pick.pick.an.unique_list.0.03.pick.list as input file for the list parameter.
[WARNING]: no file was saved for taxonomy parameter.
You have no current taxonomy file and the taxonomy parameter is required.
reftaxonomy is not required, but if given will keep the rankIDs in the summary file static.
[ERROR]: did not complete classify.otu.
I used the most current taxonomy file in the workflow, which is after the remove.lineage() for me. That was before removing the singletons, so it should be fine.
FWIW, I strongly discourage the removal of singletons for alpha and beta-diversity analysis. As in, if I got your manuscript, I would raise a red flag. All of the metrics are dependent on the distribution of sequences. Removing singletons will disproportionately affect samples with higher sequencing depths. Instead, the better choice is to rarefy your data to a common number of sequences.
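To make the point about metrics depending on the distribution of sequences concrete, here is a minimal Python sketch. It is not mothur's code. It assumes the usual bias-corrected Chao1 and inverse Simpson formulas, and the OTU counts in `sample` are made up for illustration. The helper names (`chao1`, `inverse_simpson`, `remove_singletons`, `rarefy`) are hypothetical.

```python
import random
from collections import Counter


def chao1(counts):
    """Bias-corrected Chao1: S_obs + F1*(F1-1) / (2*(F2+1))."""
    observed = sum(1 for c in counts if c > 0)
    singletons = sum(1 for c in counts if c == 1)
    doubletons = sum(1 for c in counts if c == 2)
    return observed + singletons * (singletons - 1) / (2 * (doubletons + 1))


def inverse_simpson(counts):
    """Inverse Simpson: 1 / (sum n_i*(n_i-1) / (N*(N-1)))."""
    total = sum(counts)
    dominance = sum(c * (c - 1) for c in counts) / (total * (total - 1))
    return 1 / dominance


def remove_singletons(counts):
    """Drop every OTU that is represented by exactly one sequence."""
    return [c for c in counts if c != 1]


def rarefy(counts, size, seed=0):
    """Randomly draw `size` sequences without replacement (rarefaction)."""
    reads = [otu for otu, c in enumerate(counts) for _ in range(c)]
    picked = Counter(random.Random(seed).sample(reads, size))
    return [picked[otu] for otu in range(len(counts)) if picked[otu] > 0]


# Made-up OTU abundances for one sample (an assumption, not real data).
sample = [50, 30, 10, 5, 2, 2, 1, 1, 1, 1]
trimmed = remove_singletons(sample)

print("chao1 with singletons:   ", chao1(sample))
print("chao1 without singletons:", chao1(trimmed))
print("invsimpson with:   ", inverse_simpson(sample))
print("invsimpson without:", inverse_simpson(trimmed))
print("rarefied to 50 reads:", rarefy(sample, 50))
```

Once the singletons are gone, the Chao1 correction term is zero and the estimate collapses to the number of observed OTUs, so the estimator no longer says anything about unseen richness. Rarefying, by contrast, keeps the shape of the abundance distribution and only equalizes the number of sequences per sample.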