get rep seqs from subsampled shared file

Could anybody give me a hint on how to get a fasta file of the representative sequences for ONLY the OTUs in my subsampled shared file?
(I have the rep seq file for ALL OTUs, and I have the shared file from the subsampling.)
Best, Sonja


Maybe there is another way, but you can subsample .fasta, .names, .list files etc. with the get.seqs() command, using an accnos file that lists the sequences you want to keep.


list.seqs( … this gives you
get.seqs(, fasta=final.fasta, name=final.names, list=final.phylip.list, dups=f) … this picks the listed sequences out of the fasta, names and list files, creating final.pick.extension files. dups=f prevents the whole line of a subsampled sequence in the .names file from being kept, which would cause a mismatch between e.g. the .fasta and the .names file. The .fasta file contains only the unique (representative) sequences, and the .names file keeps track of their abundance.
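
To make the accnos idea concrete, here is a minimal Python sketch of filtering a fasta file by a list of sequence names. It only illustrates the concept and is not how mothur implements get.seqs. The function names read_accnos and filter_fasta are hypothetical, and it assumes the sequence name is the first whitespace-delimited token after ">".

```python
def read_accnos(lines):
    """Return the set of sequence names, one name per non-empty line."""
    return {line.strip() for line in lines if line.strip()}


def filter_fasta(fasta_lines, keep):
    """Yield the lines of every FASTA record whose name is in `keep`."""
    writing = False
    for line in fasta_lines:
        if line.startswith(">"):
            # Assumption: the name is the first token after '>'.
            parts = line[1:].split()
            name = parts[0] if parts else ""
            writing = name in keep
        if writing:
            yield line


if __name__ == "__main__":
    # Tiny in-memory example; real files would be opened with open().
    accnos = ["seqA\n", "seqC\n"]
    fasta = [">seqA\n", "ACGT\n", ">seqB\n", "GGGG\n", ">seqC\n", "TTTT\n"]
    print("".join(filter_fasta(fasta, read_accnos(accnos))), end="")
```

Both functions accept any iterable of lines, so open file handles can be passed directly.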

Then you can use make.shared(list=.pick.list, label=, groups=) to regenerate your subsampled shared file if you want to make sure that it matches the other files.

This is one possible way. There is probably another, quicker way?

You can’t go back from the subsampled shared file to the sequences in the fasta, names and group files, because the shared file contains only counts, not sequence names.
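
To illustrate the point about counts, here is a small Python sketch that reads a shared file and reports which OTU labels still have reads. It assumes a tab-delimited file with a header row whose first three columns are label, Group and numOtus, followed by one column per OTU, which is the layout of recent mothur versions; older files may differ. The function name otus_with_reads is hypothetical. Only OTU labels and per-group counts are available here, with no sequence names anywhere.

```python
def otus_with_reads(shared_lines):
    """Return OTU labels whose count summed over all groups is above zero."""
    rows = [line.strip().split("\t") for line in shared_lines if line.strip()]
    header, data = rows[0], rows[1:]
    # Assumption: columns 0-2 are label, Group, numOtus; the rest are OTUs.
    otu_labels = header[3:]
    totals = [0] * len(otu_labels)
    for row in data:
        for i, count in enumerate(row[3:3 + len(otu_labels)]):
            totals[i] += int(count)
    return [otu for otu, total in zip(otu_labels, totals) if total > 0]


if __name__ == "__main__":
    # Made-up two-group example, not real data.
    shared = [
        "label\tGroup\tnumOtus\tOtu001\tOtu002\tOtu003\n",
        "0.03\tA\t3\t5\t0\t2\n",
        "0.03\tB\t3\t1\t0\t0\n",
    ]
    print(otus_with_reads(shared))
```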

sub.sample(fasta=yourFastaFile, name=yourNameFile, group=yourGroupFile, persample=t, size=sizeYouWant) - sample the same number of seqs from each group.
list.seqs(group=current) - list the seqs in the subsample.
get.seqs(list=yourListFile, accnos=current) - select the subsampled sequences from the list file, preserving the OTUs already created.
make.shared(list=current, group=current) - make a shared file with the same number of seqs in each group.