I use an OTU-based method for 16S analysis. I have a question about which input file should be used to calculate taxonomy. Should I use the shared file or the subsample.shared file to calculate taxonomy in each group?
Also, which is better for calculating relative abundance in each group, the mean or the median? I found that some genera give different values depending on whether I use the mean or the median.
Could you please advise?

Thank you very much.

I'd like to add to this question. I also wonder about relative abundances… If we have the subsampled shared file, is there any difference between using absolute abundances and relative abundances for alpha and beta diversity? In which cases is it advisable to use relative abundances, and what is the most recommended way to calculate them? I was told TSS, but I have read about several ways to scale and transform data and can't make up my mind about when I should use one or the other, or both scaling and transformation.
Thank you for sharing your thoughts/criteria!

Use the subsampled shared file for all analyses. The difference between a sample having 100,000 reads and one that has 5,000 reads (using current PCR-based sequencing) is purely technical; it does not reflect anything about the biology of the samples.
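
To make the idea of subsampling to a common depth concrete, here is a minimal Python sketch. It is only an illustration, not mothur's implementation: the `subsample` helper and the two count vectors are hypothetical, and it assumes reads are drawn at random without replacement.

```python
import random

def subsample(counts, depth, seed=0):
    """Draw `depth` reads at random, without replacement, from a vector of OTU counts."""
    total = sum(counts)
    if depth > total:
        raise ValueError("sample has fewer reads than the requested depth")
    rng = random.Random(seed)
    # Expand the counts into one OTU label per read, then draw from that pool.
    reads = [otu for otu, n in enumerate(counts) for _ in range(n)]
    out = [0] * len(counts)
    for otu in rng.sample(reads, depth):
        out[otu] += 1
    return out

# Hypothetical samples: same community, very different sequencing depth.
deep = [60000, 30000, 9000, 1000]   # 100,000 reads
shallow = [3000, 1500, 450, 50]     # 5,000 reads

depth = min(sum(deep), sum(shallow))
deep_sub = subsample(deep, depth)
shallow_sub = subsample(shallow, depth)
print(sum(deep_sub), sum(shallow_sub))  # 5000 5000
```

After this step both samples contain 5,000 reads, so differences between them are no longer driven by sequencing depth.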

@svazquez what types of scaling and transformations are you thinking about doing? It’s easier to answer a specific case than to give a general guideline.

Hi, thank you!
First, I’d like to know whether using the subsampled shared file is the same as working with absolute abundances (reads) or with relative abundances (after all, the samples will all have the same number of sequences, so in a way the data are scaled, aren’t they?).
Then, if I am wrong and for some reason we should work with relative abundances for alpha and beta diversity, is the best way to scale the samples to normalize each sample by its total sum, or should I normalize in another way?
Then, whether using absolute or relative abundances, should I CLR-transform the abundances before running beta diversity tests or constrained ordinations? Or use any other transformation?
This is one of my weakest points; I have never been good at statistics :frowning:
And if this no longer belongs in this thread, sorry, I can open a new question :wink:
Thank you!!
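
For readers unfamiliar with the two terms in this post, here is a minimal Python sketch of what TSS and CLR compute. It is not a recommendation of either. The function names, the sample counts, and the pseudocount of 0.5 (needed because CLR cannot take the log of zero) are all assumptions made for illustration.

```python
import math

def tss(counts):
    """Total sum scaling: divide each count by the sample's total."""
    total = sum(counts)
    return [c / total for c in counts]

def clr(counts, pseudocount=0.5):
    """Centered log-ratio: log of each value minus the mean log of the sample.

    The pseudocount is an assumption; it only exists to avoid log(0).
    """
    logs = [math.log(c + pseudocount) for c in counts]
    mean_log = sum(logs) / len(logs)
    return [v - mean_log for v in logs]

# Hypothetical samples that were already subsampled to 5,000 reads each.
sample_a = [2500, 1500, 900, 100, 0]
sample_b = [1000, 1000, 2000, 500, 500]

print(tss(sample_a))   # every count divided by 5000
print(clr(sample_a))   # values sum to (approximately) zero
```

When every sample has the same total, as after subsampling, TSS divides all samples by the same constant, so counts and relative abundances differ only by that fixed factor.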

Thank you very much for your suggestion @Kendra.
This is my first project on the human gut microbiome. I found that when I used the mean, the genus Escherichia had a slightly higher relative abundance in the normal group (7.080%) than in the disease group (7.000%). However, when I used the median, Escherichia had a lower relative abundance in the normal group (0.875%) than in the disease group (1.8%). This confuses me. I have not yet tested whether the difference between the groups is statistically significant; I may check that next.
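
One possible reason the mean and the median can point in opposite directions is a skewed distribution, where one or two samples with very high abundance pull the mean up but barely move the median. The small Python sketch below shows the effect with made-up numbers; these are hypothetical values, not the data from this project.

```python
from statistics import mean, median

# Hypothetical per-sample relative abundances (%) of one genus in each group.
normal = [0.5, 0.8, 0.9, 1.0, 32.0]    # one sample with a very high value
disease = [1.5, 1.7, 1.8, 2.0, 28.0]

for name, values in (("normal", normal), ("disease", disease)):
    print(f"{name}: mean={mean(values):.2f} median={median(values):.2f}")
```

Here the normal group has the higher mean but the lower median, because its mean is dominated by a single sample. Looking at the per-sample values in each group would show whether something similar is happening in the real data.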

I don’t transform further. Use mothur to calculate alpha (summary.single) and beta (dist.shared) diversity measures, because you can repeatedly subsample and get an average for each index.
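
The idea of repeatedly subsampling and averaging an index can be sketched in Python as follows. This is only an illustration of the concept, not how mothur implements it: the helper names, the Shannon index as the example metric, the count vector, and the choice of 100 iterations are all assumptions.

```python
import math
import random

def shannon(counts):
    """Shannon diversity, H = -sum(p * ln p), over the non-zero OTUs."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)

def mean_subsampled_shannon(counts, depth, iters=100, seed=1):
    """Average the Shannon index over `iters` random subsamples of `depth` reads."""
    rng = random.Random(seed)
    # One OTU label per read; each iteration draws from it without replacement.
    reads = [otu for otu, n in enumerate(counts) for _ in range(n)]
    values = []
    for _ in range(iters):
        sub = [0] * len(counts)
        for otu in rng.sample(reads, depth):
            sub[otu] += 1
        values.append(shannon(sub))
    return sum(values) / len(values)

# Hypothetical sample with 10,000 reads, subsampled to 1,000 reads each time.
counts = [6000, 3000, 900, 100]
avg_h = mean_subsampled_shannon(counts, depth=1000)
print(round(avg_h, 3))
```

Averaging over many subsamples reduces the noise that comes from any single random draw, which is the benefit the reply above points to.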

