First, here is why I am using this command.
I have Salmonella in a positive control.
Using the traditional SOP, I still see Salmonella after classify.seqs.
But when I classify my OTUs, they are not there.
The only way to find them is to cluster at 0.005 (they get merged at 0.01).
I am not a fan of these near-ASVs, and not a fan of dada2 either: it loses almost all my sequences and gives a perfect positive control, which is technically impossible, since a positive control is never pure without any contamination at all.
So I went for cluster.split to make OTUs after splitting by taxonomy at the genus level, so that I can keep my Salmonella (well, I love them).
These are the commands I ran and what I got.
"
mothur > dist.seqs(fasta=phytosynthese2021.trim.contigs.good.unique.good.filter.unique.precluster.pick.pick.fasta,cutoff=0.03)
mothur > cluster.split(column=current, count=current, taxonomy=current, splitmethod=classify, taxlevel=6, delta=0, iters=300,cutoff=0.03)
label | cutoff | numotus | tp | tn | fp | fn | sensitivity | specificity | ppv | npv | fdr | accuracy | mcc | f1score |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0.03 | 0.03 | 163578 | 441 | 1.33565e+10 | 29 | 4.90239e+07 | 8.996e-06 | 1 | 0.9383 | 0.9963 | 0.9383 | 0.9963 | 0.002899 | 1.799e-05 |
"
So there is essentially no clustering, as I get 163,578 OTUs from 163,741 unique sequences.
The weird thing is that when I run this command on my positive control only, I go from 350 uniques to 179 OTUs…
Moreover, when I run make.shared, I get a weird output (why do the numbers of sequences not match?), and even though the log says it made a .shared file, I do not see it!
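To make those numbers concrete, here is a small sketch that recomputes a few values from the row above. It assumes the usual definitions (sensitivity = tp / (tp + fn), ppv = tp / (tp + fp)); the variable names are mine, and the inputs are simply copied from the cluster.split output.

```python
# Values copied from the 0.03 row of the cluster.split output above.
tp = 441            # pairs within the cutoff and placed in the same OTU
fp = 29             # pairs beyond the cutoff but placed in the same OTU
fn = 4.90239e7      # pairs within the cutoff but left in different OTUs
n_unique = 163741   # unique sequences going in
n_otus = 163578     # OTUs coming out

# Assumed standard definitions, not taken from the mothur source code.
sensitivity = tp / (tp + fn)
ppv = tp / (tp + fp)

# How many fewer OTUs than unique sequences there are.
merged = n_unique - n_otus

print(f"sensitivity = {sensitivity:.3e}")
print(f"ppv         = {ppv:.4f}")
print(f"uniques - OTUs = {merged}")
```

Under those definitions, almost every pair of sequences within 0.03 of each other ends up in different OTUs, which matches the near-zero sensitivity in the table.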
"
mothur > make.shared(list=current,count=current,label=0.03)
Using phytosynthese2021.trim.contigs.good.unique.good.filter.unique.precluster.denovo.vsearch.pick.pick.count_table as input file for the count parameter.
Using phytosynthese2021.trim.contigs.good.unique.good.filter.unique.precluster.pick.pick.opti_mcc.list as input file for the list parameter.
[ERROR]: M03992_423_000000000-CK27V_1_1113_10438_12959 is in your groupfile and not your listfile. Please correct.
[ERROR]: M03992_423_000000000-CK27V_1_2104_13401_20983 is in your groupfile and not your listfile. Please correct.
[ERROR]: M03992_423_000000000-CK27V_1_2107_21552_19489 is in your groupfile and not your listfile. Please correct.
[ERROR]: M03992_423_000000000-CK27V_1_2112_18236_19445 is in your groupfile and not your listfile. Please correct.
Your group file contains 163741 sequences and list file contains 163737 sequences. Please correct.
Output File Names:
phytosynthese2021.trim.contigs.good.unique.good.filter.unique.precluster.pick.pick.opti_mcc.shared
"
So, any thoughts?
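For the name mismatch reported by make.shared, here is a rough sketch of how the missing sequences could be listed outside mothur. It assumes an uncompressed, tab-delimited count_table whose header line starts with `Representative_Sequence`, and a list file where each row is `label`, `numOtus`, then one comma-separated column of names per OTU. The function names and the tiny demo data are hypothetical, for illustration only.

```python
def names_in_count_table(lines):
    """Collect sequence names from the first column of a count_table."""
    names = set()
    for line in lines:
        line = line.rstrip("\n")
        # Skip blank lines, comment lines and the header (assumed layout).
        if not line or line.startswith("#") or line.startswith("Representative_Sequence"):
            continue
        names.add(line.split("\t")[0])
    return names


def names_in_list(lines, label="0.03"):
    """Collect every sequence name from the list-file row with this label."""
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if fields[0] == label:
            names = set()
            # Columns after label and numOtus are OTUs: comma-separated names.
            for otu in fields[2:]:
                names.update(n for n in otu.split(",") if n)
            return names
    return set()


def missing_from_list(count_lines, list_lines, label="0.03"):
    """Names present in the count_table but absent from the list file."""
    return sorted(names_in_count_table(count_lines) - names_in_list(list_lines, label))


# Made-up demo data, not taken from my real files.
count_demo = [
    "Representative_Sequence\ttotal\tctrl\n",
    "seqA\t5\t5\n",
    "seqB\t2\t2\n",
    "seqC\t1\t1\n",
]
list_demo = [
    "label\tnumOtus\tOtu1\tOtu2\n",
    "0.03\t2\tseqA\tseqB\n",
]
print(missing_from_list(count_demo, list_demo))
```

On real data, the two line lists would come from `open(...)` on the count_table and the opti_mcc.list file named in the log. In my case I would expect the result to be the four names from the [ERROR] lines.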