Hello
I have used your pipeline on my 16S rRNA data. I am new to this field.
I would very much appreciate feedback if I have done something wrong.
When I run this command:
mothur > pre.cluster(fasta=stability.file.trim.contigs.good.unique.good.filter.unique.fasta, count=stability.file.trim.contigs.good.unique.good.filter.count_table, diffs=2)
I lose about 90 % of the sequences.
An example is shown here (from my mothur.logfile):
Processing group cDNA1:
2759 165 2594
Total number of sequences before pre.cluster was 2759.
pre.cluster removed 2594 sequences.
It took 0 secs to cluster 2759 sequences.
Processing group cDNA10:
5236 378 4858
Total number of sequences before pre.cluster was 5236.
pre.cluster removed 4858 sequences.
It took 0 secs to cluster 5236 sequences.
Processing group cDNA11:
1851 221 1630
Total number of sequences before pre.cluster was 1851.
pre.cluster removed 1630 sequences.
It took 0 secs to cluster 1851 sequences.
Processing group cDNA12:
10333 673 9660
Total number of sequences before pre.cluster was 10333.
pre.cluster removed 9660 sequences.
It took 1 secs to cluster 10333 sequences.
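To put a number on the loss, here is a minimal Python sketch that works out the percentage removed per group from the excerpt above. I am assuming the three numbers on each summary line are the total before pre.cluster, the number remaining, and the number removed, which is consistent with the "Total" and "removed" lines in the log. The function name `parse_precluster_log` is just a name I made up for this illustration; it is not part of mothur.

```python
import re

# Excerpt copied from the mothur.logfile lines quoted in this post.
LOG = """\
Processing group cDNA1:
2759 165 2594
Processing group cDNA10:
5236 378 4858
Processing group cDNA11:
1851 221 1630
Processing group cDNA12:
10333 673 9660
"""


def parse_precluster_log(log):
    """Return {group: (total, remaining, removed)} from a log excerpt.

    Assumption: each 'Processing group X:' line is followed by a line of
    three integers meaning total, remaining, removed.
    """
    results = {}
    group = None
    for raw in log.splitlines():
        line = raw.strip()
        header = re.match(r"Processing group (\S+):$", line)
        if header:
            group = header.group(1)
            continue
        parts = line.split()
        if group and len(parts) == 3 and all(p.isdigit() for p in parts):
            results[group] = tuple(int(p) for p in parts)
    return results


def percent_removed(total, removed):
    """Percentage of sequences removed, rounded to one decimal place."""
    return round(100.0 * removed / total, 1)


counts = parse_precluster_log(LOG)
for name, (total, remaining, removed) in counts.items():
    print(f"{name}: {percent_removed(total, removed)} % removed "
          f"({remaining} of {total} remain)")
```

For the four groups shown, this gives between about 88 % and 94 % removed.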
Why have so many sequences been removed?
Thank you very much!