Mothur needs a distance matrix to assign OTUs. With about 2,500 sequences I can build an alignment without problems and get a distance matrix for assigning OTUs with Mothur. However, for a large data set of more than 18,000 sequences (each longer than 500 bp), I cannot get an alignment with ClustalX, MEGA, or MUSCLE, and therefore cannot produce a distance matrix for Mothur. Do you have any suggestions on how to use Mothur for OTU assignment on such a large data set? Or is it impossible to assign OTUs with Mothur for a data set of this size? Thanks!
Have you tried using the aligner in mothur? http://www.mothur.org/wiki/Align.seqs Then you can use the dist.seqs command to get the distance matrix ( http://www.mothur.org/wiki/Dist.seqs ) and either the cluster or hcluster command to assign OTUs: http://www.mothur.org/wiki/Cluster http://www.mothur.org/wiki/Hcluster
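To make the align, distance, cluster sequence above a bit more concrete, here is a rough Python sketch of what the last two steps do conceptually once the sequences are aligned. It is not mothur's code: the function names (`pairwise_distance`, `distance_matrix`, `furthest_neighbor_otus`), the toy sequences, and the gap handling are assumptions made for illustration, and furthest-neighbor linkage is used only as one of the clustering methods mothur offers. The naive loops are fine for a handful of sequences but would not scale to 18,000.

```python
from itertools import combinations


def pairwise_distance(a, b):
    """Fraction of mismatched columns between two aligned sequences.

    Assumption for this sketch: columns where both sequences have a
    gap ('-' or '.') are skipped; every other column is compared.
    """
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    compared = mismatches = 0
    for x, y in zip(a, b):
        if x in "-." and y in "-.":
            continue
        compared += 1
        if x != y:
            mismatches += 1
    return mismatches / compared if compared else 1.0


def distance_matrix(aligned):
    """Return {frozenset({name1, name2}): distance} for every pair."""
    return {
        frozenset((p, q)): pairwise_distance(aligned[p], aligned[q])
        for p, q in combinations(sorted(aligned), 2)
    }


def furthest_neighbor_otus(aligned, cutoff=0.03):
    """Merge clusters while the largest pairwise distance inside the
    merged cluster stays at or below the cutoff."""
    dist = distance_matrix(aligned)
    clusters = [{name} for name in sorted(aligned)]

    def linkage(c1, c2):
        # Furthest neighbor: the worst-case distance between members.
        return max(dist[frozenset((a, b))] for a in c1 for b in c2)

    while len(clusters) > 1:
        d, i, j = min(
            (linkage(c1, c2), i, j)
            for (i, c1), (j, c2) in combinations(enumerate(clusters), 2)
        )
        if d > cutoff:
            break
        clusters[i] |= clusters[j]
        del clusters[j]  # j > i, so index i is unaffected
    return clusters


if __name__ == "__main__":
    # Hypothetical, already-aligned toy sequences (all the same length).
    toy = {
        "seqA": "ACGTACGTACGTACGTACGT",
        "seqB": "ACGTACGTACGTACGTACGA",
        "seqC": "TTTTACGTACGTACGTTTTT",
    }
    for otu in furthest_neighbor_otus(toy, cutoff=0.10):
        print(sorted(otu))
```

The point of the sketch is only that distances are computed column by column, which is why every sequence has to sit in the same alignment before dist.seqs and the clustering commands can do their work.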
I think that is the part I didn’t understand, so I used ClustalX to get an alignment, PHYLIP to get a distance matrix, and then Mothur to assign OTUs. For “align.seqs” in Mothur, I need to use “template=core_set_align.imputed.fasta”. Where do I get the template? I only have my sequence file in FASTA format. I would appreciate your help.
You can download the various template files from here: http://www.mothur.org/wiki/Alignment_database