Sub-sampling sequences

blairesteven · September 14, 2010, 3:00pm

Hey

Do you know what would be really awesome for mothur to do?
Sub-sample FASTA files to the same number of sequences, so that comparisons between samples aren’t biased by different sampling efforts.
Currently I use daisychopper (http://www.genomics.ceh.ac.uk/GeneSwytch), but it would be great if I could just add this to my mothur pipeline.
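
To make the request concrete, here is a minimal Python sketch of that kind of sub-sampling: every sample is cut down, at random and without replacement, to the size of the smallest sample. It is not how daisychopper or mothur do it; the function names (`read_fasta`, `subsample_to_smallest`) and the dict-of-samples layout are assumptions made up for illustration.

```python
import random


def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new record starts; emit the previous one, if any.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def subsample_to_smallest(samples, seed=None):
    """Cut every sample down to the size of the smallest one.

    `samples` maps a (hypothetical) sample name to a list of records.
    Records are drawn at random without replacement.
    """
    depth = min(len(records) for records in samples.values())
    rng = random.Random(seed)  # a fixed seed makes the draw repeatable
    return {name: rng.sample(records, depth)
            for name, records in samples.items()}
```

Usage would look something like `subsample_to_smallest({"soil": soil_records, "water": water_records}, seed=42)`, where each value is `list(read_fasta(open(path)))`. Fixing the seed is a design choice so the same sub-sample can be regenerated later.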

Just a thought…

Keep up the great work

Blaire

pschloss · September 15, 2010, 10:27am

Blaire,

Yeah, it’s in the works. However, I think that if people stay away from beta-diversity metrics based on counts (e.g. Bray-Curtis), this shouldn’t be such a problem. The next version of mothur will have something close to what you want: a command that converts the output of get.relabund to a shared file where every line has about the same number of sequences. If you’d be interested in giving this a whirl pre-release, let us know (mothur.bugs@gmail.com).
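
As a rough illustration of the idea (not the actual mothur command, whose details aren't given here), relative abundances can be scaled back to integer counts at a common depth. The function name `relabund_to_counts` and the dict layout are assumptions for this sketch. Rounding is why each sample ends up with "about" rather than exactly the same number of sequences.

```python
def relabund_to_counts(relabund, depth):
    """Scale relative abundances (fractions summing to ~1) to integer counts.

    `relabund` maps an OTU label to its fraction of the sample;
    `depth` is the target number of sequences per sample.
    Each count is rounded independently, so the total may be off
    from `depth` by a few sequences.
    """
    return {otu: round(fraction * depth) for otu, fraction in relabund.items()}


# Hypothetical sample with three equally abundant OTUs.
sample = {"Otu1": 1 / 3, "Otu2": 1 / 3, "Otu3": 1 / 3}
counts = relabund_to_counts(sample, 100)
total = sum(counts.values())  # close to, but not exactly, 100
```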

Pat
