I have recently run into trouble continuing an analysis downstream of subsampling.
I need to subsample both a fasta file and a list file, but the command will not accept them together along with a group file. (It says it wants to build a new group file from either one or the other.)
But if I want to continue with an analysis that forks in one direction from the fasta file and in another from the list file, how can I keep the two synchronous? As I understand it, if I run sub.sample(size=170) on the fasta file and then sub.sample(size=170) on the list file, each will contain 170 randomly selected sequences per group, which means they will not be the same 170 sequences?
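To illustrate the worry, here is a minimal sketch in plain Python (not mothur). The sequence names, the seeds and the `subsample` helper are made up for illustration and say nothing about how mothur draws its random sample internally. It only shows that two independent draws of 170 generally pick different sequences, while reusing one draw to filter both files keeps them in sync.

```python
import random

# Hypothetical sequence names for a single group; not from the real dataset.
names = [f"seq{i}" for i in range(1000)]


def subsample(items, size, seed):
    """Draw `size` items without replacement using a seeded generator."""
    rng = random.Random(seed)
    return set(rng.sample(items, size))


# Two independent draws, as if the subsampling were run once per file.
draw_for_fasta = subsample(names, 170, seed=1)
draw_for_list = subsample(names, 170, seed=2)
overlap = len(draw_for_fasta & draw_for_list)
print(f"independent draws share {overlap} of 170 sequences")

# One draw reused to filter both files keeps them synchronous.
shared_draw = subsample(names, 170, seed=1)
fasta_kept = [n for n in names if n in shared_draw]
list_kept = [n for n in names if n in shared_draw]
print("same sequences in both files:", fasta_kept == list_kept)
```

In other words, what I am after is the second pattern: one random selection that is then applied to every file.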
To elaborate: the trouble creeps in at this step of my work.
dist.seqs(fasta=CNP.final.fasta, cutoff=0.10, processors=7)
make.shared(list=CNP.final.fn.list, group=CNP.proovikaupa.groups, label=0.05)
#in the count, the smallest group is 17904
sub.sample(shared=CNP.final.fn.shared, fasta=CNP.final.fasta, name=CNP.final.names, group=CNP.proovikaupa.groups, persample=T, size=17904)
dist.seqs(fasta=CNP.final.subsample.unique.fasta, output=lt, processors=7)
#then a lot of collect, rarefaction, heatmap etc. on the subsampled shared file (omitted here)
#but next, I want to look at the relabund file to find out the classification of the most abundant OTUs
#now I would like to run classify.otu, which requires a list file. How can I get a list file that has been subsampled in the same way as the shared file (and the others) were before?
I tried to …
to get a list file from the dist file that comes from the resampled fasta
classify.otu(taxonomy=CNP.final.taxonomy, list=CNP.final.subsample.unique.phylip.fn.list, name=CNP.final.names)
#but it turned out the number of OTUs given by this (0.05 2032) was different from the number of OTUs in CNP.final.0.05.subsample.shared (2240)