Rationale behind sub.sample with persample=f

Tris · August 13, 2012, 6:18am

Hi Pat and Sarah,

I was wondering if you could explain what’s happening during the sub.sample command when you don’t specify persample=t. I supplied a unique fasta file, a names file, a groups file, and a size, and got back a different number of sequences in each group when persample was not set to true. I am not sure why the per-group counts vary: they are all within around 10-15% of the specified size, but none is exactly on the mark. The analysis was done in version 1.24.1. I’m not sure of the exact commands entered and don’t have the logfile, as this was done a while back and I just assumed the groups would each have equal numbers of sequences.

Thanks,
Tris

pschloss · August 13, 2012, 1:23pm

With persample=f, mothur randomly draws however many sequences it needs from the pool of all sequences, ignoring group membership, so the percentage of sequences in each group stays about the same (give or take) in the subsampled files as in the original files. It really doesn’t make sense to use persample=f if you have multiple groups.
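
To make the difference concrete, here is a rough Python sketch of the two sampling strategies. It is only an illustration of the idea, not mothur's actual code: the function names, the group sizes, and the `total` and `size` values are all made up, and how mothur turns the size parameter into the total number of pooled draws is not modeled here.

```python
import random
from collections import Counter

# Hypothetical group sizes, chosen only for illustration.
groups = {"A": 1000, "B": 1100, "C": 900}


def subsample_pooled(groups, total, seed=0):
    """Draw `total` sequences from the combined pool, ignoring groups.

    Returns how many of the drawn sequences came from each group.
    """
    rng = random.Random(seed)
    # One (group, index) entry per sequence in the pool.
    pool = [(g, i) for g, n in groups.items() for i in range(n)]
    picked = rng.sample(pool, total)  # sampling without replacement
    return Counter(g for g, _ in picked)


def subsample_per_group(groups, size, seed=0):
    """Draw exactly `size` sequences from each group separately."""
    rng = random.Random(seed)
    counts = {}
    for g, n in groups.items():
        if n < size:
            raise ValueError(f"group {g} has only {n} sequences")
        counts[g] = len(rng.sample(range(n), size))
    return counts


if __name__ == "__main__":
    # Pooled draw: counts track each group's share of the pool.
    print(dict(subsample_pooled(groups, total=1500)))
    # Per-group draw: every group ends up with exactly `size`.
    print(subsample_per_group(groups, size=500))
```

In the pooled case the per-group counts scatter around each group's share of the pool (here roughly 500, 550, and 450), which is the kind of near-miss you saw. In the per-group case every group gets exactly the requested number.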

hth,
pat
