and later on I also clustered at a cutoff of 0.20 using the command
cluster(column=goodtrim.good.good.filter.unique.dist,name=goodtrim.good.good.filter.names,cutoff=0.20)
(Note that after reading my matrix, mothur reports “changed cutoff to 0.0903472”.)
Finally I used the following command to generate data for the rarefaction curve
One possible solution and one possible explanation:
(1) Solution: increase your cutoff value from 0.20 to 0.30 or even 0.40 in the “dist.seqs” and “cluster” commands, but keep the cutoff value of 0.20 with the “make.shared” command.
(2) Explanation: it is possible that none of your sequences differ from each other by more than 0.0903472, that is, they are closely related phylogenetically.
**Are these sequences functional genes?** Maybe you are working with very conserved functional genes.
Hope this helps.
The cutoff is changing because you are using average neighbor with a cutoff. When you use a cutoff, mothur ignores distances above the cutoff. Then, when the averaging occurs, there may be attempts to average two numbers, one above and one below the cutoff. Clearly it can’t average with the number above the cutoff because it’s gone, and so the cluster command adjusts the cutoff down. If you want to get 0.20, then you can do as Vicente suggests and adjust the cutoff up. Alternatively, you can run dist.seqs(output=phylip); cluster.classic(phylip=…, name=…).
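To make the cutoff adjustment described above concrete, here is a toy Python sketch of a single average-neighbor merge on a matrix that has already lost its distances above the cutoff. It is only an illustration, not mothur's code: the function names, the three made-up sequences A, B and C, and in particular the rule of lowering the cutoff to the surviving distance are all assumptions chosen for the example.

```python
def drop_above(dist, cutoff):
    """Keep only pairwise distances at or below the cutoff (a sparse matrix)."""
    return {pair: d for pair, d in dist.items() if d <= cutoff}


def merge_and_adjust(sparse, a, b, others, cutoff):
    """Merge clusters a and b, averaging their distances to each other cluster.

    If only one of the two distances survived the cutoff, the average cannot
    be formed. The rule used here (an assumption, not mothur's actual code)
    lowers the cutoff to that surviving distance.
    """
    merged = {}
    for c in others:
        d_ac = sparse.get(frozenset((a, c)))
        d_bc = sparse.get(frozenset((b, c)))
        if d_ac is not None and d_bc is not None:
            merged[c] = (d_ac + d_bc) / 2
        elif d_ac is not None or d_bc is not None:
            survivor = d_ac if d_ac is not None else d_bc
            cutoff = min(cutoff, survivor)
    return merged, cutoff


# Made-up distances between three hypothetical sequences.
full = {
    frozenset(("A", "B")): 0.05,
    frozenset(("A", "C")): 0.15,
    frozenset(("B", "C")): 0.25,  # above 0.20, so it is dropped below
}

sparse = drop_above(full, cutoff=0.20)
merged, effective_cutoff = merge_and_adjust(sparse, "A", "B", ["C"], cutoff=0.20)
print(merged, effective_cutoff)
```

With the requested cutoff of 0.20 the B-C distance is gone, so the merged A+B cluster has no usable average distance to C and the effective cutoff drops. Rerunning the same toy with a cutoff of 0.30 keeps all three distances, which is why asking for a higher cutoff lets the clustering reach 0.20.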
As an aside, I’m personally not so convinced anymore that these high cutoffs are really useful. It may be more useful to run phylotype and classify sequences into OTUs at the phylum, class, order levels and perform rarefaction analysis on those data.
Thanks. I will increase the cutoff as you have suggested, and as you predicted, I am working on some highly conserved functional genes.
Best regards,
Tanvir
Thanks, Pat, for making things clear to me. I am using a high cutoff simply to compare the findings with those generated at lower cutoffs.
Best regards,
Tanvir
Regarding rarefaction analysis, I would like to ask what cutoff is suitable at the phylum, class, and order levels for 3 samples that have around 100+ sequences each.
How should I choose the best cutoff for the graph, and how should I explain that choice in a journal article on bacterial diversity at soil sites?
If you want to do rarefaction at the phylum, class, etc. levels, the best thing to do is to use the phylotype command to cluster sequences into those taxonomic levels. There is no way to map distance-based cutoffs to taxonomic levels.
Thank you for your advice. I have already used the phylotype command to cluster my samples at the taxonomic level. Now I have data in tx.list, tx.rabund and tx.sabund. What is the next step to make a rarefaction curve graph? Should I use the sabund data directly, or do I need to pass this file to the rarefaction.single command? Please enlighten me.
If your data are all from the same sample then you can run rarefaction.single(sabund=whatever.sabund). If they are from multiple samples, then you need to run make.shared(list=, group=) and then rarefaction.single(shared=).
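For anyone wondering what a rarefaction curve is actually measuring, here is a small Python sketch that computes the expected number of OTUs (or phylotypes) in a random subsample of n sequences from per-OTU counts. It uses the standard analytical rarefaction formula rather than whatever procedure rarefaction.single runs internally, and the counts and the function name are hypothetical, so treat it as an illustration of the idea and not as a replacement for the mothur command.

```python
from math import comb


def expected_richness(abundances, n):
    """Expected number of OTUs observed in a random subsample of n sequences.

    abundances: number of sequences in each OTU (rabund-style counts).
    """
    total = sum(abundances)
    if not 0 <= n <= total:
        raise ValueError("n must be between 0 and the total number of sequences")
    # An OTU with a sequences is missed with probability C(total - a, n) / C(total, n).
    return sum(1 - comb(total - a, n) / comb(total, n) for a in abundances)


# Hypothetical counts for one sample: sequences per OTU, 100 sequences in total.
counts = [50, 30, 10, 5, 3, 1, 1]

# One point on the curve for every 10 sequences sampled.
curve = [(n, expected_richness(counts, n)) for n in range(0, sum(counts) + 1, 10)]
for n, richness in curve:
    print(n, round(richness, 2))
```

Plotting the sampled sequence count against the expected richness gives the rarefaction curve; with only about 100 sequences per sample, a curve that is still climbing at the right-hand end suggests the sample has not been sequenced deeply enough to capture its diversity at that level.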