Hi,
I would like to normalize the number of sequences in each of my samples. The number of reads varies (4000-20000) across 11 samples, and I would like to normalize every sample to the one with the fewest reads (4000). The problem is that unless you have a .shared file, the sub.sample command by default selects only 10% of the sequences in the original file, and I am currently working with a fasta and a group file. Is there any way of selecting a specified number of sequences for each of my samples?
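To make the request concrete, here is a rough Python sketch of the behaviour I am after. It is not mothur code, and the function name `subsample_per_group` is made up for illustration. It assumes the group file has already been read into a dictionary mapping each sequence name to its sample (group) name, and it draws the same number of names from every sample, defaulting to the size of the smallest one.

```python
import random
from collections import defaultdict


def subsample_per_group(seq_to_group, size=None, seed=0):
    """Return {group: [sequence names]} with `size` names drawn per group.

    seq_to_group: dict mapping sequence name -> group (sample) name.
    size: sequences to keep per group; defaults to the smallest group size.
    seed: fixed seed so the random draw is repeatable.
    """
    # Collect the sequence names that belong to each group.
    by_group = defaultdict(list)
    for name, group in seq_to_group.items():
        by_group[group].append(name)

    # Default to the depth of the shallowest sample (e.g. 4000 reads).
    if size is None:
        size = min(len(names) for names in by_group.values())

    rng = random.Random(seed)
    kept = {}
    for group in sorted(by_group):
        names = by_group[group]
        if len(names) < size:
            raise ValueError(f"group {group!r} has only {len(names)} sequences")
        # Draw without replacement so no sequence is picked twice.
        kept[group] = rng.sample(names, size)
    return kept
```

The names returned for each sample could then be used to filter the fasta and group files down to the same depth.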
Any help is appreciated. Thanks.