I want to apply a clustering algorithm to the datasets from http://www.mothur.org/MiSeqDevelopmentData/
so let's consider one of the datasets (the human dataset from 121203.tar). First, I performed a few preprocessing steps (pre.cluster, …), which reduced the size of the data to 21575 sequences, and it looks like this:
>M00967_15_000000000-A2G1J_1_1101_16462_1801
TACGGAAGGTCCGGGCGTTATCCGGATTTATTGGGTTTAAAGGGAGCGTAGGCCGGAGATTAAGCGTGTTGTGAAATGTAGATGCTCAACATCTGAACTGCAGCGCGAACTGGTTTCCTTGAGTACGCACAAAGTGGGCGGAATTCGTGGTGTAGCGGTGAAATGCTTAGATATCACGAAGAACTCCGATTGCGAAGGCAGCTCACTGGAGCGCAACTGACGCTGAAGCTCGAAAGTGCGGGTATCGAACAGG
>M00967_15_000000000-A2G1J_1_1101_16752_1822
TACGGAAGGTCCGGGCGTTATCCGGATTTATTGGGTTTAAAGGGAGCGTAGGCCGGAGATTAAGCGTGTTGTGAAATGTAGACGCTCAACGTCTGCACTGCAGCGCGAACTGGTTTCCTTGAGTACGCACAAAGTGGGCGGAATTCGTGGTGTAGCGGTGAAATGCTTAGATATCACGAAGAACTCCGATTGCGAAGGCAGCTCACTGGAGCGCAACTGACGCTGAAGCTCGAAAGTGCGGGTATCGAACAGG
>M00967_15_000000000-A2G1J_1_1101_14896_1912
So, for the next step, before computing the phylip distance matrix file, I have to align my sequences to a reference dataset (SILVA or Greengenes). I tried Greengenes, but mothur gave an error indicating that the sequences in the Greengenes file are not aligned. When I tried SILVA, it worked!
So, my question is: do SILVA and Greengenes have different biological applications, i.e. does the choice of reference matter?
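Regarding the "not aligned" error: a quick way to sanity-check a reference file before handing it to mothur is to test whether every sequence has the same length and whether gap characters are present. The Python sketch below does that. Note that this is only a heuristic of my own and not necessarily the check mothur performs internally; the function names `read_fasta` and `looks_aligned` are made up for illustration.

```python
def read_fasta(lines):
    """Yield (name, sequence) pairs from an iterable of FASTA lines."""
    name, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if name is not None:
                yield name, "".join(chunks)
            name, chunks = line[1:], []
        elif name is not None:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def looks_aligned(records):
    """Heuristic (an assumption, not mothur's own test): treat the file as
    aligned if all sequences share one length and gap characters occur."""
    lengths = set()
    has_gaps = False
    for _, seq in records:
        lengths.add(len(seq))
        if "-" in seq or "." in seq:
            has_gaps = True
    return len(lengths) == 1 and has_gaps


if __name__ == "__main__":
    # Tiny in-memory examples; real use would pass an open file handle instead.
    aligned = [">a", "AC-GT", ">b", "A--GT"]
    unaligned = [">a", "ACGT", ">b", "ACG"]
    print(looks_aligned(read_fasta(aligned)))
    print(looks_aligned(read_fasta(unaligned)))
```

To run it on a real file, pass an open file handle, e.g. `looks_aligned(read_fasta(open("reference.fasta")))`, where `reference.fasta` is a placeholder for whichever reference file you downloaded.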
Here is the aligned data:
>M00967_15_000000000-A2G1J_1_2114_9838_26291
......................................................................................................................................
......................................................................................................................................
......................................................................................................................................
......................................................................................................................................
......................................................................................................................................
......................................................................................................................................
......................................................................................................................................
......................................................................................................................................
......................................................................................................................................
......................................................................................................................................
......................................................................................................................................
......................................................................................................................................
......................................................................................................................................
......................................................................................................................................
...........................................................................................T-------AC---GG-AA-GGT---------------------
--------------------------------------------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------------------------------------------
----------------------------------------------------------------------------------------------------------------CCG-G-G---------------
--------------------------------------------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------------------------------------------
----------------------------------------------------------------------------------------------------C--G--T---T--AT-C-CGG--T------TT-A
--T-T--GG-GT--------------------------------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------------------------------------------
---------------------------------------------------------------------------------------------------------------------TT--A-----AA-GG-G
A-GC-------G-TA-G-G-C-C---------------G--G-AG-A-T-T-----------------------------------------------------------------------------------
----------------------------------------------------------------------------------------------AA----G-C-G-T---------------------------
--------------------------G-T-T--G--TG--A-AA-TG--T-A-GA-T-G---------------------------------------------------------------------------
-------------------------------------------------------------------CT-C-AA------------------------------------------------------------
----------------------------------------------------------------------------------------------------C-A-T-C-T-G-A--A-C----T-G--C-A---G
--C----------------------------G--CG-A-A---C----------------------------------------------------T--G-G--TT--T-C-C---------------------
--------------------------------------------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------------------------------------------
---------------------------------------------------------------------------------------------------------