How can I get representative sequences from a filtered shared file? For example, I ran filter.shared(shared=current, mintotal=2, minnumsamples=2, makerare=F, label=0.03) to remove all doubleton OTUs from my table at the 0.03 level, which removed a total of 1211 OTUs. However, when I run the get.oturep command with label=0.03 (I only want this level), it returns representative sequences for all the OTUs that were originally present at the 0.03 level, including the 1211 OTUs that were removed. In other words, instead of containing only the sequences from the OTUs that remain after filtering with mintotal=2 and minnumsamples=2, the rep fasta contains all of them.
Is there a way in which I can use the labels of the filtered shared file to get representative sequences of only those OTUs that are left in the filtered shared file?
The get.oturep command uses a list file to find the representative sequence. You can use the remove.rare command, http://www.mothur.org/wiki/Remove.rare, to filter your list file and then create the shared file. Here are the commands to run:
mothur > remove.rare(list=yourListFile, nseqs=2) - remove OTUs with abundance 2 or less. NOTE: if you have a count file associated with the list file you must include it to get the accurate OTU abundances.
mothur > remove.rare(list=yourListFile, count=yourCountFile, nseqs=2) - remove OTUs with abundance 2 or less. NOTE: with count file.
mothur > list.seqs(list=current) - list sequences in filtered list file.
mothur > get.seqs(fasta=yourFasta, taxonomy=yourTaxonomy, count or group file …other files) - select filtered sequences from all files.
mothur > make.shared(list=current, groupOrCount=current) - create filtered shared file
mothur > dist.seqs(fasta=current, cutoff=yourCutoff) - create filtered distance file. NOTE: you can use the complete distance matrix, but it will be larger, use more memory and make get.oturep run slower.
mothur > get.oturep(list=current, fasta=current, column=current, nameOrCount=current) - get representative sequences for filtered OTUs.
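For convenience, the steps above can be collected into a batch file. This is only a sketch under some assumptions: the file names (final.list, final.count_table, final.fasta, final.taxonomy) are hypothetical placeholders, a count file is assumed rather than a group/name file pair, and label=0.03 and cutoff=0.03 are example values matching the level mentioned in the question. The accnos=current argument to get.seqs assumes that list.seqs has just written the accnos file of the sequences kept in the filtered list.

```
# Sketch only: file names are hypothetical placeholders, adjust to your own data.
# Remove rare OTUs from the list file; the count file is included so abundances are accurate.
remove.rare(list=final.list, count=final.count_table, nseqs=2, label=0.03)
# Write the names of the sequences that remain in the filtered list file.
list.seqs(list=current)
# Keep only those sequences in the other files.
get.seqs(accnos=current, fasta=final.fasta, count=final.count_table, taxonomy=final.taxonomy)
# Build the filtered shared file from the filtered list and count files.
make.shared(list=current, count=current, label=0.03)
# Recompute distances for the filtered sequences only (cutoff is an example value).
dist.seqs(fasta=current, cutoff=0.03)
# Pick representative sequences for the OTUs that remain.
get.oturep(list=current, fasta=current, column=current, count=current, label=0.03)
```

If this is saved as, say, filter_oturep.batch (again a hypothetical name), it can be run with mothur filter_oturep.batch from the command line.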
Regarding the note "you can use the complete distance matrix, but it will be larger, use more memory and make get.oturep run slower":
Actually, mothur returns an error if you use the complete distance matrix, because it cannot find the sequences from the complete distance matrix in the new count file. So you do have to run dist.seqs again on the filtered fasta file. Anyway, thanks for the method.