When to normalize by number of sequences?

cgimpel · November 20, 2016, 8:27pm

Hi dear Mothur community,

I have a question regarding the QC for paired-end MiSeq Illumina sequencing of the 18S rRNA V4 region.

I followed the MiSeq SOP very successfully. Thanks! It is a great help.
Now I have a question about normalizing my data set.
Why do we cluster sequences into OTUs before normalizing to, in my case, the number of sequences per sample? I have 15 samples with between 70 K and 120 K seqs each.
Could somebody please explain this to me? I tend to think the sequences should be subsampled before clustering them; otherwise, don't we cluster sequences into OTUs that might end up outside the subsample?

I was looking for a way to normalize the number of sequences before clustering, but I need a list file, which is only produced when we cluster.

Please help me with my reasoning,

Have a nice day,

Carla

Kendra · November 21, 2016, 2:17pm

I follow the SOP and cluster all sequences, then subsample/rarefy/normalize the number of sequences when calculating alpha and beta diversity. I also create one subsampled OTU table (but I don't calculate diversity from that table).
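
To make the "cluster first, subsample afterwards" order concrete, here is a minimal Python sketch of rarefying an OTU table that has already been built from all sequences. It is only an illustration of the idea, not mothur's own code: the toy table, the sample and OTU names, and the `rarefy` helper are all made up, and in mothur itself this step would normally be done on the shared file (for example with `sub.sample`). Each sample is drawn down to the size of the smallest sample, so an OTU whose reads all fall outside the draw simply ends up with a count of zero.

```python
import random
from collections import Counter


def rarefy(otu_counts, depth, rng):
    """Draw `depth` reads without replacement from one sample's OTU counts.

    `otu_counts` maps OTU label -> read count (hypothetical toy format).
    """
    # Expand the counts into one label per read, then sample from that pool.
    reads = [otu for otu, n in otu_counts.items() for _ in range(n)]
    if depth > len(reads):
        raise ValueError("depth exceeds the number of reads in the sample")
    return Counter(rng.sample(reads, depth))


# Hypothetical OTU table, produced by clustering ALL sequences first.
table = {
    "sampleA": {"Otu1": 50, "Otu2": 30, "Otu3": 20},
    "sampleB": {"Otu1": 10, "Otu2": 40, "Otu3": 5, "Otu4": 5},
}

# Normalize every sample to the size of the smallest sample.
depth = min(sum(counts.values()) for counts in table.values())

rng = random.Random(1)  # fixed seed so the draw is repeatable
rarefied = {name: rarefy(counts, depth, rng) for name, counts in table.items()}

for name, counts in rarefied.items():
    print(name, sum(counts.values()), dict(counts))
```

Because a single random draw can differ from the next, diversity calculators are usually run over many such draws and averaged, which fits with keeping the one subsampled OTU table separate from the diversity calculations.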
