How do I create OTUs and bin sequences, starting from a FASTA file and going through to the end?

I have sequence data from a sample and would like to divide the sequences into bins based on similarity to references: 97% or higher as the same species, 95%-97% similarity as the same genus, and so on. With mothur, I would like to start with the sequence data as a FASTA file and continue from there with trimming, alignment, and so on. Please help me with this. As it probably sounds, I am very inexperienced and only just beginning to be a beginner :)

Hi, welcome to mothur! You really can’t use a distance-based threshold to split sequences into taxonomic levels like family, genus, or species. If you want family-level data, then I would use classify.seqs and phylotype. If you want OTUs that are below the genus level, I’d use the cluster.split approach. Try following the SOPs on the example analysis page and go from there.
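To illustrate the idea behind the phylotype route, here is a minimal Python sketch of binning sequences by their classification rather than by distance. It is only a conceptual illustration, not how mothur implements phylotype; the function name `bin_by_taxon`, the sequence names, and the lineages are all made up for the example.

```python
from collections import defaultdict


def bin_by_taxon(taxonomy, level):
    """Group sequence names by their lineage truncated to `level` ranks.

    `taxonomy` maps a sequence name to a semicolon-separated lineage.
    This is a hypothetical helper, not part of mothur.
    """
    bins = defaultdict(list)
    for name, lineage in taxonomy.items():
        # Drop empty pieces left by a trailing semicolon.
        ranks = [rank for rank in lineage.split(";") if rank]
        label = ";".join(ranks[:level])
        bins[label].append(name)
    return dict(bins)


# Hypothetical classifications (invented for illustration only).
example_taxonomy = {
    "seq1": "Bacteria;Firmicutes;Clostridia;Clostridiales;Lachnospiraceae;Blautia;",
    "seq2": "Bacteria;Firmicutes;Clostridia;Clostridiales;Lachnospiraceae;Roseburia;",
    "seq3": "Bacteria;Bacteroidetes;Bacteroidia;Bacteroidales;Bacteroidaceae;Bacteroides;",
}

# Five ranks deep corresponds to family in this made-up lineage layout.
family_bins = bin_by_taxon(example_taxonomy, level=5)
```

Here seq1 and seq2 land in the same family-level bin even though they belong to different genera, which is the kind of grouping a classification-based approach gives you and a fixed distance cutoff does not guarantee.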


Thanks for the quick answer. Which SOP do you recommend I follow to get down to the species level?

I would start with this:

Also, keep in mind that few experts think you can really get to the species level with 16S rRNA gene sequences, much less 250 nt from the gene. An OTU at the 97% threshold is about as close as you’re going to get.
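To make the 97% threshold concrete: it corresponds to a pairwise distance of at most 0.03 between aligned sequences. The Python sketch below shows that arithmetic for a single pair. It is a simplified illustration under stated assumptions (equal-length aligned sequences, positions where both have a gap are skipped, a gap against a base counts as a mismatch), and it is not mothur's distance calculation or its clustering algorithm, which works on all pairwise distances at once. The function names are hypothetical.

```python
def pairwise_distance(a, b):
    """Fraction of compared alignment columns where two sequences differ.

    Assumes `a` and `b` are already aligned to the same length.
    Columns where both sequences have a gap ('-' or '.') are ignored.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    mismatches = 0
    for x, y in zip(a.upper(), b.upper()):
        if x in "-." and y in "-.":
            continue  # shared gap carries no information
        compared += 1
        if x != y:
            mismatches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return mismatches / compared


def within_cutoff(a, b, cutoff=0.03):
    """True if the pair is at least (1 - cutoff) similar, e.g. 97% for 0.03."""
    return pairwise_distance(a, b) <= cutoff


# Made-up 100 nt sequences for illustration.
reference = "ACGT" * 25
two_changes = "TT" + reference[2:]     # 2 mismatches out of 100
four_changes = "TTAA" + reference[4:]  # 4 mismatches out of 100
```

With these toy sequences, `two_changes` is 98% similar to `reference` and passes the 0.03 cutoff, while `four_changes` is 96% similar and does not.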


I have tried the MiSeq_SOP but ran into some errors. For example, I couldn’t create one of the output files after the chimera.uchime command, the one whose name ends with “denovo.unique…count_table”…

Other than that, I have the following files. Are they enough for the analysis?

myfile-full.fasta, myfile-full.qual <= contain the raw sequence data, still with barcodes and primers

myfile-pr.fasta, myfile-pr.qual <= contain the sequence data without primers and barcodes

myfile-mapping.txt, myfile-mapping2.txt <= contain sample ID, barcode, primer, barcode/primer name

I’m sorry, but I’m not sure what you’re asking. Can you show the exact command you are running and the error message you are getting?