variable number of sequences _ betadiversity measures

Srini_UGA · November 24, 2011, 5:02pm

Hi
My rRNA sequence dataset has 4 independent replicates x 2 locations x 3 seasons x 2 host plant species. However, the number of sequences per sample varies from 500 to 7000. The UniFrac analysis showed biologically meaningful clustering, but I am wondering whether my analysis is right or not.
Do I have to extract a subset with an equal number of sequences from each sample and repeat the analysis?
Thanks
Srini

pschloss · November 28, 2011, 1:37pm

I would suggest running sub.sample to 500 sequences per sample and repeating the analysis. Not only will things get thrown off by a >10-fold difference in sampling effort, but you will also have a >10-fold difference in the number of erroneous sequences that show up in your datasets. In general, when you are comparing communities, by any metric, you want them to have the same number of sequences.
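
To make the idea concrete, here is a minimal Python sketch of sub-sampling every sample to a common depth. It is only an illustration of the principle, not mothur's sub.sample implementation: the function name `subsample_groups`, the toy sequence names, and the choice to drop samples that have fewer sequences than the requested depth are all assumptions made for this example.

```python
import random
from collections import defaultdict

def subsample_groups(seq_to_group, depth, seed=0):
    """Randomly keep `depth` sequences per group (hypothetical helper).

    Groups with fewer than `depth` sequences are dropped entirely,
    which is an assumption of this sketch.
    """
    rng = random.Random(seed)  # fixed seed so the draw is repeatable
    by_group = defaultdict(list)
    for seq, group in seq_to_group.items():
        by_group[group].append(seq)

    kept = {}
    for group, seqs in sorted(by_group.items()):
        if len(seqs) < depth:
            continue  # not enough sequences to reach the common depth
        # Sample without replacement so no sequence is counted twice.
        for seq in rng.sample(sorted(seqs), depth):
            kept[seq] = group
    return kept

# Toy data: sample A has 7 sequences, sample B has 5.
toy = {f"s{i}": "A" for i in range(7)}
toy.update({f"t{i}": "B" for i in range(5)})

# After sub-sampling, both samples contribute the same number of sequences.
balanced = subsample_groups(toy, depth=5)
```

In the thread's case the depth would be 500, the size of the smallest sample, so that no sample has to be discarded.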

Srini_UGA · November 30, 2011, 4:13pm

Hi Patrick
Thanks very much for your suggestion. I made sure that I am now dealing with an equal number of sequences per sample. The output looks more meaningful.
Thanks
Srini

bansal_raman · January 6, 2012, 2:40pm

Hi Srini
Similar to your data, I have sequencing data for 2 treatments with 3 independent replicates each, so I have 6 samples in my group file. I want to compare the two treatments as a whole (say, samples A, B, C against D, E, F) rather than the individual replicates. May I ask which UniFrac command you used to compare treatments?
Thanks
Raman

gwidmer · January 10, 2012, 4:12pm

Raman,
I suppose you could merge the replicates (A with B with C) and create a new group file with 2 groups (ABC, DEF), then repeat the analysis.
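
A rough sketch of that relabeling in Python, assuming the usual two-column, tab-separated group file (sequence name, then group name). The helper `merge_groups`, the toy sequence names, and the mapping dictionary are hypothetical and only illustrate the idea of pooling replicates under one treatment label.

```python
import io

def merge_groups(group_lines, mapping):
    """Return (sequence_name, merged_group) rows (hypothetical helper).

    `group_lines` is any iterable of "name<TAB>group" lines.
    Groups missing from `mapping` keep their original label.
    """
    merged = []
    for line in group_lines:
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        name, group = fields[0], fields[1]
        merged.append((name, mapping.get(group, group)))
    return merged

# Replicates A, B, C belong to one treatment and D, E, F to the other.
replicate_to_treatment = {
    "A": "ABC", "B": "ABC", "C": "ABC",
    "D": "DEF", "E": "DEF", "F": "DEF",
}

# Toy stand-in for a group file on disk.
toy_group_file = io.StringIO("seq1\tA\nseq2\tB\nseq3\tD\nseq4\tF\n")
merged_rows = merge_groups(toy_group_file, replicate_to_treatment)

# Print the rows in the same two-column layout as the input.
for name, group in merged_rows:
    print(f"{name}\t{group}")
```

The merged groups will generally differ in size, so the earlier advice about comparing equal numbers of sequences still applies to ABC and DEF.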

Giovanni
