Which steps need to be repeated to analyze a subset?

Hello everyone,

I want to pick a subset of a larger dataset that I have already profiled (clustering, OTU annotation, etc.) for another analysis. Should I simply take the profiles of the subset from the existing profile file of the larger dataset, or should I redo the profiling for the subset samples? If I need to redo it, from which step should I start: clustering, chimera removal, or all the way back at the make.contigs step?

Thank you

I’m not 100% sure what you mean, but if you are going to compare these sequences/OTUs to another set of sequences/OTUs, then you’ll want to go back to before the align.seqs step.


Hi Pat,
Thank you for the reply. Let’s say I have 6 profiled samples in total, separated into two groups: A, B, C versus D, E, F. Now I want to compare A, B versus D, E, F, or A, B versus E, F. All the new comparisons use samples from the initially profiled set; no new samples are introduced.

By the way, do you think the results from mothur 1.40 differ much from those from mothur 1.48?

As I understand it, you’ve already processed samples A, B, C, D, E, and F together and want to make comparisons between those samples. There shouldn’t be a problem extracting the information from summary.single, dist.shared, etc. when those functions are performed on the full dataset.
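
As a rough illustration of pulling a subset out of results that were computed on the full dataset, here is a minimal Python sketch. It assumes a tab-separated table laid out like a mothur shared file (label, Group, numOtus, then one column per OTU). The counts and the helper name `subset_shared` are made up for illustration and are not part of mothur.

```python
import csv
import io

# Made-up table in the layout of a mothur shared file
# (tab-separated: label, Group, numOtus, then one column per OTU).
SHARED_TEXT = (
    "label\tGroup\tnumOtus\tOtu0001\tOtu0002\tOtu0003\n"
    "0.03\tA\t3\t10\t0\t5\n"
    "0.03\tB\t3\t8\t2\t4\n"
    "0.03\tC\t3\t7\t1\t6\n"
    "0.03\tD\t3\t1\t9\t3\n"
    "0.03\tE\t3\t0\t12\t2\n"
    "0.03\tF\t3\t2\t11\t1\n"
)


def subset_shared(shared_text, keep):
    """Return the rows (as dicts) whose Group value is in `keep`.

    Hypothetical helper: it only filters rows and does not recompute
    anything, so the OTU definitions stay those of the full dataset.
    """
    reader = csv.DictReader(io.StringIO(shared_text), delimiter="\t")
    return [row for row in reader if row["Group"] in keep]


# Keep A, B versus E, F, one of the comparisons described above.
subset = subset_shared(SHARED_TEXT, {"A", "B", "E", "F"})
for row in subset:
    print(row["Group"], row["Otu0001"], row["Otu0002"], row["Otu0003"])
```

Within mothur itself, the get.groups command (with its shared and groups parameters) is meant for this kind of selection; whether it is exposed in a given Galaxy installation is something to check.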

I always encourage people to update their versions of mothur. The one you’re using is a few years old at this point.


Thank you a lot. This is very helpful.

I was using the Galaxy platform, which is very nice for organizing jobs and files. That is why I used the old version, which is the one available in Galaxy.

Good day

Well, I would simply add a grouping variable to the samples you want to compare. In R this is relatively easy, but I do not know how to do it in Galaxy.
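
To make the grouping-variable idea concrete, here is a small sketch in Python rather than R. The phylum counts, the group labels `group1`/`group2`, and the function name `mean_relative_abundance` are all hypothetical; the point is only that the grouping is a label attached to already-profiled samples, so nothing has to be reprocessed.

```python
from collections import defaultdict

# Made-up phylum counts per sample (hypothetical values).
counts = {
    "A": {"Firmicutes": 60, "Bacteroidetes": 40},
    "B": {"Firmicutes": 50, "Bacteroidetes": 50},
    "E": {"Firmicutes": 20, "Bacteroidetes": 80},
    "F": {"Firmicutes": 30, "Bacteroidetes": 70},
}

# The grouping variable: one label per sample to be compared.
grouping = {"A": "group1", "B": "group1", "E": "group2", "F": "group2"}


def mean_relative_abundance(counts, grouping):
    """Average each taxon's relative abundance over the samples of a group."""
    sums = defaultdict(lambda: defaultdict(float))
    n_samples = defaultdict(int)
    for sample, group in grouping.items():
        total = sum(counts[sample].values())
        n_samples[group] += 1
        for taxon, count in counts[sample].items():
            sums[group][taxon] += count / total
    return {
        group: {taxon: s / n_samples[group] for taxon, s in taxa.items()}
        for group, taxa in sums.items()
    }


profile = mean_relative_abundance(counts, grouping)
for group, taxa in profile.items():
    print(group, taxa)
```

Changing the comparison (for example A, B versus D, E, F) then only means editing the `grouping` dictionary.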

Hi Alexandre,

Thank you for your reply. You are right. If one only considers taxonomic abundance, the profiles of the samples do not differ much when the whole process (starting from make.contigs) is repeated for the smaller set (there was a 0.0001 difference at the phylum level when I tried). So one can just add a grouping variable to compare taxonomic profiles between different groups. But the alpha diversity differed significantly. Right now, I am trying to figure out which step is responsible for the difference in the alpha diversity profiles.
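
One way to track such a difference down is to compute the same index on the per-sample OTU counts from both runs and see which samples shift. The sketch below does this for the natural-log Shannon index. It is only a comparison aid under assumed inputs: the counts and the names `full_run`, `subset_run`, and `diversity_shift` are made up, and it does not say which processing step causes a shift.

```python
import math


def shannon(counts):
    """Shannon index H = -sum(p * ln p) over OTUs with non-zero counts."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


# Made-up OTU counts per sample from the two runs (hypothetical values).
full_run = {"A": [10, 10, 0], "B": [5, 3, 2]}
subset_run = {"A": [10, 9, 1], "B": [5, 3, 2]}


def diversity_shift(run1, run2):
    """Per-sample change in Shannon index for samples present in both runs."""
    return {s: shannon(run2[s]) - shannon(run1[s]) for s in run1 if s in run2}


shift = diversity_shift(full_run, subset_run)
for sample, delta in shift.items():
    print(sample, round(delta, 4))
```

Running the same comparison on the intermediate outputs of each step, where they are available, would narrow down where the two runs start to diverge.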


