Hello Pat,
I would like to discuss the output of classified seqs with you again. When I run create.database, I see that the taxonomic identification at the class (and sometimes phylum) level is often lower than 80%. That doesn't make sense to me, because there is a cutoff of 80% similarity to genus level, right? I also wonder how much noise poorly identified seqs create in the analysis. Any ideas?
OTUNumber com519Fbar1 com519Fbar2 com519Fbar3 com519Fbar5 com519Fbar6 com519bar10 com519bar11 com519bar12 com519bar15 com519bar16 com519bar19 com519bar20 com519bar4 com519bar9 repSeqName repSeq OTUConTaxonomy Phylum Class Order Family Genus
Otu00325 0 10 0 5 2 0 0 0 8 0 0 0 0 0 IYW8NF002F75BE GT-AG-GGG-GCG-A-G-CG-TTGT-CC-GG-AT-TT-A–T-T-G-GGC-GTA—AA-GAGC-TC-G-TA-G-G-C-G-G–C-TC-A-A-C-AA—G-T-C-G—G-CCG-TG-A-AA-GC–CC-GAG-G–CT-C-AA—CC-T-C-GG-GA-C—G-C-C-G-G-T–C--GA-A-A-C-T-G-TTGT-G-G-C–T-A-G-G-G-T-C–C-GG–TA-G-A—G-GA-G-AG-T—GG–AATT-CCC-G-GT–GT-A-GCG-GTGAAA-TG-CGC-AGAT-A-TC-G-GGA–GG-A-AC-A-CC-AG–T--A–GC-GAA-G–G-C–G--G–C-T-CTCTG----G-GC-CG-----GC-A-C-C–GA-CG----CT-GA-GG–A-G-CGA–AA-G-C–TA–GGG-GAG-C-A-AACA–GG-ATTA-G-ATA-C-CC-T-G-GTA-G-T Bacteria(100) “Actinobacteria”(76) Actinobacteria(76) unclassified(76) unclassified(76) unclassified(76)
Otu00339 0 0 0 0 4 0 0 0 12 4 0 0 0 4 IYW8NF002IZ54A GA-AC-CGT-ACG-A-A-CG-TTAT-T-CGG-AA-TC-A–C-T–GGGC-TTA—AA-GAGT-GC-G-TA-G-G-C-G-G–C-TT-G-G-C-AA—G-T-T-G—G-GTG-TG-A-AA-TC–CC-TCG-G–CT-C-AA—CC-G-A-GG-AA-T—T-G-C-G-C-T–C--AA-A-A-C-T-G-CTA–A-G-C–T-T-G-A-G-G-G–A-GA–TA-G-G—G-GT-G-AG-C—GG–AACT-AAT-G-GT–GG-A-GCG-GTGAAA-TG-CGTTG-AT-A-TC-A-TTA–GG-A-AC-A-CC-GG–A--G–GC-GAA-A–G-C–G--G–C-T-CACTG----G-GT-CT-----TT-T-C-T–GA-CG----CT-GA-GG–C-A-CGA–AA-G-C—T-AGGG-GAG-C-G-AACG–GG-ATTA-G-ATA-C-CC-C-G-GTA-G-T Bacteria(100) unclassified unclassified unclassified unclassified unclassified
Otu00378 3 10 0 1 2 0 0 3 1 0 0 0 1 0 IYW8NF002ICOTU GT-AG-GGG-GCA-A-G-CG-TTGT-CC-GG-AT-TC-A–T-T-G-GGC-GTA—AA-GAGC-TC-G-TA-G-G-C-G-G–C-TC-A-G-T-AA—G-T-C-G—G-CCG-TG-A-AA-GC–CC-GAG-G–CT-C-AA—CC-T-C-GG-GA-C—G-C-C-G-G-T–C--GA-T-A-C-T-G-CTGT-G-G-C–T-A-G-G-G-T-C–C-GG–TA-G-A—G-GA-G-AG-T—GG–AATT-CCC-G-GT–GT-A-GCG-GTGAAA-TG-CGC-AGAT-A-TC-G-GGA–GG-A-AC-A-CC-AG–T--A–GC-GAA-G–G-C–G--G–C-T-CTCTG----G-GC-CG-----GT-A-C-C–GA-CG----CT-GA-GG–A-G-CGA–AA-G-C–TA–GGG-GAG-C-A-AACA–GG-ATTA-G-ATA-C-CC-T-G-GTA-G-T Bacteria(100) “Actinobacteria”(67) Actinobacteria(67) unclassified(67) unclassified(67) unclassified(67)
Otu00412 0 0 0 1 0 0 0 0 0 0 1 4 0 13 IYW8NF002G4JOU AG-AG-GGC-TCA-A-G-CG-TTAA-T-CGG-AA-TC-A–C-T–GGGC-TTA—AA-GGGT-CC-G-CA-G-G-C-G-G–G-TT-G-G-C-AA—G-T-A-T—C-GAG-TG-A-AA-TA–CC-ACG-G–CT-C-AA—CC-G-T-GG-AA-C—T-G-C-T-C-G–G--TA-A-A-C-T-G-CCA–A-C-C–T-T-G-A-A-C-A–C-GG–TA-G-G—G-GC-C-AT-C—GG–AACT-CTA-G-GT–GG-A-GCG-GTGAAA-TG-CGT-AGAT-A-TC-T-AGA–GG-A-AC-G-CC-AG–A--G–GC-GAA-G–G-C–G--G–A-T-GGCTG----G-GC-CG-----TT-G-T-T–GA-CG----CT-CA-GG–G-A-CGA–AA-G-C—G-TGGG-TAG-C-G-AACG–GG-ATTA-G-ATA-C-CC-C-G-GTA-G-T Bacteria(100) “Planctomycetes”(79) Phycisphaerae(79) Phycisphaerales(79) Phycisphaeraceae(79) Phycisphaera(79)
Otu00426 0 1 0 10 6 1 1 0 0 0 0 0 0 0 IYW8NF002G89C3 GA-AC-CGT-CCA-A-A-CG-TTAT-T-CGG-AA-TC-A–C-T–GGGC-TTA—AA-GGGT-GC-G-TA-G-G-C-G-G–C-CC-T-G-T-AA—G-T-T-G—G-GTG-TG-A-AA-TC–CC-TCG-G–CT-C-AA—CC-G-A-GG-AA-T—T-G-C-G-C-C–C--AA-T-A-C-T-G-CAG–G-G-C–T-A-G-A-G-G-G–A-GA–CA-G-A—G-GT-G-AG-C—GG–AACT-TGT-G-GT–GG-A-GCG-GTGAAA-TG-CGT-TGAT-A-TC-A-CAA–GG-A-AC-A-CC-TG–T--G–GC-GAA-A----G-CG–G--C-T-CACTG----G-GT-CT-----TT-T-C-T–GA-CG----CT-GA-GG–C-A-CGA–AA-G-C—T-GGGG-GAG-C-G-AACG–GG-ATTA-G-ATA-C-CC-C-G-GTA-G-T Bacteria(100) “Planctomycetes”(58) “Planctomycetacia”(58) Planctomycetales(58) Planctomycetaceae(58) unclassified(58)
Otu00447 0 1 0 4 0 3 2 0 0 0 0 2 6 0 IYW8NF002G6DIG GA-AC-CGT-ACG-A-A-CG-TTAT-T-CGG-AA-TC-A–C-T–GGGC-TTA—AA-GAGT-GC-G-TA-G-G-C-G-G–C-TT-G-G-C-AG—G-T-T-G—G-GTG-TG-A-AA-GC–CC-TCG-G–CT-C-AA—CC-G-A-GG-AA-T—T-G-C-G-C-C–C--AA-A-A-C-C-G-CCA–A-G-C–T-T-G-A-G-G-G–A-GA–TA-G-A—G-GT-G-AG-C—GG–AACT-AAT-G-GT–GG-A-GCG-GTGAAA-TG-CGT-TGAT-A-TC-A-TTA–GG-A-AC-A-CC-GG–T--G–GC-GAA-A----G-CG–G--C-T-CACTG----G-GT-CT-----CT-T-C-T–GA-CG----CT-GA-GG–C-A-CGA–AA-G-C—T-AGGG-GAG-C-G-AACG–GG-ATTA-G-ATA-C-CC-C-G-GTA-G-T Bacteria(100) unclassified(67) unclassified(67) unclassified(67) unclassified(67) unclassified(67)
Otu00561 0 9 0 3 2 0 0 0 0 0 0 0 0 0 IYW8NF002HK82E GG-AG-GGT-GCG-A-G-CG-TTAA-T-CGG-AA-TC-A–C-T–GGGC-GTA—AA-GAGC-GC-G-TA-G-G-T-G-G–T-CT-G-A-T-TA—G-T-C-G—G-ATG-TG-A-AA-GC–CC-TAG-G–CT-C-AA—CC-T-A-GG-AA-C—T-G-C-A-T-T–C--GA-T-A-C-T-G-TCA–G-G-C–T-T-G-A-G-T-A–T-GG–GA-G-A—G-GG-A-AG-C—GG–AATT-CCC-G-GT–GT-A-GCG-GTGAAA-TG-CGT-AGAT-A-TC-G-GGA–GG-A-AC-A-CC-AG–T--G–GC-GAA-G–G-C–G--G–C-T-TCCTG----G-CC-CA-----AT-A-C-T–GA-CA----CT-GA-GG–C-G-CGA–AA-G-C—G-TGGG-GAG-C-A-AACA–GG-ATTA-G-ATA-C-CC-T-G-GTA-G-T Bacteria(100) “Proteobacteria”(79) unclassified(58) unclassified(58) unclassified(58) unclassified(58)
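To check which ranks fall under 80 without reading every row by eye, something like the rough Python sketch below might help. It assumes each taxonomy column looks like `Name(number)` as in the rows above. The function name `low_confidence_ranks` and its `threshold` argument are made up for illustration and are not part of mothur.

```python
import re

# Matches one taxonomy label with a trailing number, e.g. Phycisphaerae(79)
LABEL = re.compile(r'^(?P<name>.+)\((?P<pct>\d+)\)$')


def low_confidence_ranks(labels, threshold=80):
    """Return (name, pct) for every label whose number is below threshold.

    Hypothetical helper, not a mothur function. Labels without a number
    (a bare "unclassified") are skipped because there is nothing to compare.
    """
    flagged = []
    for label in labels:
        match = LABEL.match(label.strip())
        if match is None:
            continue
        pct = int(match.group("pct"))
        if pct < threshold:
            # Drop the straight or curly quotes that wrap some taxon names
            flagged.append((match.group("name").strip('"“”'), pct))
    return flagged


# Taxonomy columns copied from the Otu00412 row above
otu00412 = [
    "Bacteria(100)",
    "“Planctomycetes”(79)",
    "Phycisphaerae(79)",
    "Phycisphaerales(79)",
    "Phycisphaeraceae(79)",
    "Phycisphaera(79)",
]

for name, pct in low_confidence_ranks(otu00412):
    print(name, pct)
```

For Otu00412 this lists every rank from phylum to genus at 79, which is the kind of row I am asking about.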
When we first talked about it, you asked me this:
[quote="pschloss"]
What command syntax are you running for classify.seqs and the database command?
[/quote]
To classify the seqs after I remove the lineage, I run this:
mothur > classify.seqs(fasta=3sites.shhh.trim.unique.good.filter.unique.precluster.pick.fasta, name=3sites.shhh.trim.unique.good.filter.unique.precluster.pick.names, group=3sites.shhh.good.pick.groups, template=/home/zm1/Mothur.cen/Trainset9_032012.pds/trainset9_032012.pds.fasta, taxonomy=/home/zm1/Mothur.cen/Trainset9_032012.pds/trainset9_032012.pds.tax, cutoff=80)
To create the database:
mothur > get.oturep(list=3sitesall.final.woCyano.an.list, label=0.03, fasta=3sitesall.final.woCyano.fasta, column=3sitesall.final.woCyano.dist, name=3sitesall.final.woCyano.names)
********************###########
Reading matrix: |||||||||||||||||||||||||||||||||||||||||||||||||||
0.03 5860
Output File Names:
3sitesall.final.woCyano.an.0.03.rep.names
3sitesall.final.woCyano.an.0.03.rep.fasta
mothur > classify.otu(list=3sitesall.final.woCyano.an.list, name=3sitesall.final.woCyano.names, taxonomy=3sitesall.final.woCyano.taxonomy, label=0.03)
reftaxonomy is not required, but if given will keep the rankIDs in the summary file static.
0.03 5860
Output File Names:
3sitesall.final.woCyano.an.0.03.cons.taxonomy
3sitesall.final.woCyano.an.0.03.cons.tax.summary
mothur > create.database(list=3sitesall.final.woCyano.an.list, label=0.03, repfasta=3sitesall.final.woCyano.an.0.03.rep.fasta, repname=3sitesall.final.woCyano.an.0.03.rep.names, constaxonomy=3sitesall.final.woCyano.an.0.03.cons.taxonomy, group=3sitesall.final.woCyano.groups)
Output File Names:
3sitesall.final.woCyano.an.database
THANKS!!!