Aligning sequences prior to phylip distance and

I want to apply a clustering algorithm to the datasets from http://www.mothur.org/MiSeqDevelopmentData/

So let's consider one of the datasets (the human dataset from 121203.tar). First, I performed a few preprocessing steps (pre.cluster, …), which reduced the size of the data to 21575, and it looks like this:

>M00967_15_000000000-A2G1J_1_1101_16462_1801
TACGGAAGGTCCGGGCGTTATCCGGATTTATTGGGTTTAAAGGGAGCGTAGGCCGGAGATTAAGCGTGTTGTGAAATGTAGATGCTCAACATCTGAACTGCAGCGCGAACTGGTTTCCTTGAGTACGCACAAAGTGGGCGGAATTCGTGGTGTAGCGGTGAAATGCTTAGATATCACGAAGAACTCCGATTGCGAAGGCAGCTCACTGGAGCGCAACTGACGCTGAAGCTCGAAAGTGCGGGTATCGAACAGG
>M00967_15_000000000-A2G1J_1_1101_16752_1822
TACGGAAGGTCCGGGCGTTATCCGGATTTATTGGGTTTAAAGGGAGCGTAGGCCGGAGATTAAGCGTGTTGTGAAATGTAGACGCTCAACGTCTGCACTGCAGCGCGAACTGGTTTCCTTGAGTACGCACAAAGTGGGCGGAATTCGTGGTGTAGCGGTGAAATGCTTAGATATCACGAAGAACTCCGATTGCGAAGGCAGCTCACTGGAGCGCAACTGACGCTGAAGCTCGAAAGTGCGGGTATCGAACAGG
>M00967_15_000000000-A2G1J_1_1101_14896_1912

So, for the next step, before computing the phylip distance matrix file, I have to align my sequences to a reference dataset (SILVA or GreenGenes). I tried GreenGenes, but mothur gave an error indicating that the sequences in GreenGenes are not aligned. When I tried SILVA, it worked!
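As a rough way to see why one reference file is accepted and another is rejected, here is a minimal Python sketch that checks whether every sequence in a FASTA file has the same length, which is what an aligned file should look like. The helper names `read_fasta` and `looks_aligned` are hypothetical, and this is only an assumption about what "not aligned" means here, not a description of mothur's actual check.

```python
def read_fasta(lines):
    """Yield (name, sequence) pairs from an iterable of FASTA lines."""
    name, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if name is not None:
                yield name, "".join(chunks)
            name, chunks = line[1:], []
        elif name is not None:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def looks_aligned(lines):
    """Return True if there is at least one record and all records share one length."""
    lengths = {len(seq) for _, seq in read_fasta(lines)}
    return len(lengths) == 1


if __name__ == "__main__":
    aligned = [">a", "AC-G", ">b", "A-CG"]
    unaligned = [">a", "ACG", ">b", "ACGTT"]
    print(looks_aligned(aligned))
    print(looks_aligned(unaligned))
```

To try it on a real reference, pass an open file handle instead of a list, e.g. `looks_aligned(open(path))`, where `path` is wherever your reference FASTA lives.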

So my question is: do SILVA and GreenGenes have different biological applications?

Here is the aligned data:


>M00967_15_000000000-A2G1J_1_2114_9838_26291
......................................................................................................................................
......................................................................................................................................
......................................................................................................................................
......................................................................................................................................
......................................................................................................................................
......................................................................................................................................
......................................................................................................................................
......................................................................................................................................
......................................................................................................................................
......................................................................................................................................
......................................................................................................................................
......................................................................................................................................
......................................................................................................................................
......................................................................................................................................
...........................................................................................T-------AC---GG-AA-GGT---------------------
--------------------------------------------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------------------------------------------
----------------------------------------------------------------------------------------------------------------CCG-G-G---------------
--------------------------------------------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------------------------------------------
----------------------------------------------------------------------------------------------------C--G--T---T--AT-C-CGG--T------TT-A
--T-T--GG-GT--------------------------------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------------------------------------------
---------------------------------------------------------------------------------------------------------------------TT--A-----AA-GG-G
A-GC-------G-TA-G-G-C-C---------------G--G-AG-A-T-T-----------------------------------------------------------------------------------
----------------------------------------------------------------------------------------------AA----G-C-G-T---------------------------
--------------------------G-T-T--G--TG--A-AA-TG--T-A-GA-T-G---------------------------------------------------------------------------
-------------------------------------------------------------------CT-C-AA------------------------------------------------------------
----------------------------------------------------------------------------------------------------C-A-T-C-T-G-A--A-C----T-G--C-A---G
--C----------------------------G--CG-A-A---C----------------------------------------------------T--G-G--TT--T-C-C---------------------
--------------------------------------------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------------------------------------------
---------------------------------------------------------------------------------------------------------
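One quick sanity check on output like the above is to strip the padding and gap characters and confirm that the original read comes back. The sketch below assumes that both `.` and `-` are non-base placeholder characters in the aligned file (my understanding is that `.` pads the ends and `-` marks gaps, but treat that as an assumption). `degap` is a hypothetical helper name, and the fragment is a shortened piece of the alignment shown above.

```python
def degap(aligned):
    """Remove '.' and '-' placeholder characters from an aligned sequence."""
    return aligned.replace(".", "").replace("-", "")


if __name__ == "__main__":
    # Shortened fragment taken from the start of the aligned read above.
    fragment = "....T-------AC---GG-AA-GGT----"
    print(degap(fragment))  # prints TACGGAAGGT
```

The result matches the first ten bases of the unaligned reads shown earlier (TACGGAAGGT), so the alignment only inserted placeholders and did not change the bases.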

I'm not sure I follow the wording of your question, but we strongly discourage the use of greengenes for aligning sequences since the alignment is pretty poor: it's almost random within the variable region. Please save yourself some headache and use SILVA. You can see more about this here:

For classification, feel free to use SILVA, greengenes, or RDP.