remove.lineage: Name of the sequences removed?

Hi
How can I find out which sequences this command removed as contaminants, and whether they were chloroplasts, archaea, etc.? The summary only tells me that some sequences were removed, nothing more.
Thanks!

Just open the original taxonomy file and search for the relevant lineages to see which sequences would be removed.
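If the file is too large to scan by eye, the same search can be sketched in a few lines of Python. This is only a sketch under assumptions: it expects the usual two-column taxonomy layout (sequence name, a tab, then a semicolon-separated lineage with confidence scores in parentheses), and it matches a taxon only when it equals a whole rank name, which is a simplification and not necessarily how remove.lineage itself matches. The function name, the example records, and the taxon list are placeholders; use the taxa you actually passed to remove.lineage.

```python
import re

def find_lineage_matches(lines, taxa):
    """Return {taxon: [sequence names]} for lineages that contain the taxon."""
    hits = {taxon: [] for taxon in taxa}
    for line in lines:
        name, sep, lineage = line.strip().partition("\t")
        if not sep:
            continue  # skip blank or malformed lines
        # Drop confidence scores such as "(99)", then split into rank names.
        ranks = [r for r in re.sub(r"\(\d+\)", "", lineage).split(";") if r]
        for taxon in taxa:
            if taxon in ranks:
                hits[taxon].append(name)
    return hits

# Made-up records in the assumed layout, for illustration only.
example = [
    "seq1\tBacteria(100);Cyanobacteria(100);Chloroplast(98);unclassified;",
    "seq2\tArchaea(100);Euryarchaeota(95);unclassified;unclassified;",
    "seq3\tBacteria(99);Chloroflexi(80);unclassified;unclassified;",
]
taxa = ["Chloroplast", "Mitochondria", "Archaea", "Eukaryota", "unknown"]
hits = find_lineage_matches(example, taxa)
for taxon, names in hits.items():
    print(taxon, len(names), names)
```

To run it on a real file, pass an open file handle instead of the example list, e.g. `with open(path) as handle: find_lineage_matches(handle, taxa)`, where `path` is your own taxonomy file.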

Pat

Hi
Great! :)

I did what you said and found several sequences whose assignment to the Bacteria kingdom had a confidence score below 100:

1 Bacteria(99) Actinobacteria(93) Actinobacteria(93) unclassified unclassified unclassified
1 Bacteria(99) Chloroflexi(80) unclassified unclassified unclassified unclassified
1 Bacteria(97) Chloroflexi(85) unclassified unclassified unclassified unclassified
1 Bacteria(99) Chloroflexi(87) Thermomicrobia(87) unclassified unclassified unclassified
1 Bacteria(99) Chloroflexi(89) unclassified unclassified unclassified unclassified
1 Bacteria(99) Chloroflexi(89) unclassified unclassified unclassified unclassified
1 Bacteria(99) Chloroflexi(90) unclassified unclassified unclassified unclassified
1 Bacteria(99) Chloroflexi(94) Anaerolineae(94) Anaerolineales(94) Anaerolineaceae(94) unclassified
1 Bacteria(99) Planctomycetes(80) unclassified unclassified unclassified unclassified
1 Bacteria(98) Planctomycetes(81) Planctomycetacia(81) Planctomycetales(81) Planctomycetaceae(81) unclassified
1 Bacteria(99) Planctomycetes(87) unclassified unclassified unclassified unclassified
1 Bacteria(99) Planctomycetes(99) Planctomycetacia(99) Planctomycetales(99) Planctomycetaceae(99) Planctomyces(82)
1 Bacteria(99) Proteobacteria(85) Deltaproteobacteria(85) unclassified unclassified unclassified
1 Bacteria(99) Verrucomicrobia(99) Verrucomicrobiae(94) Verrucomicrobiales(94) Verrucomicrobiaceae(94) unclassified
1 Bacteria(88) unclassified unclassified unclassified unclassified unclassified
1 Bacteria(92) unclassified unclassified unclassified unclassified unclassified
1 Bacteria(94) unclassified unclassified unclassified unclassified unclassified
3 Bacteria(95) unclassified unclassified unclassified unclassified unclassified
8 Bacteria(96) unclassified unclassified unclassified unclassified unclassified
14 Bacteria(97) unclassified unclassified unclassified unclassified unclassified
34 Bacteria(98) unclassified unclassified unclassified unclassified unclassified
58 Bacteria(99) unclassified unclassified unclassified unclassified unclassified

The first column is the number of sequences that share the same classification.
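For anyone who wants to build the same kind of tally without counting by hand, here is a minimal Python sketch. It assumes the two-column taxonomy layout (sequence name, a tab, then the lineage string) and counts lineage strings exactly as written, confidence scores included, so Bacteria(99) and Bacteria(98) stay on separate rows. The function name and the sample records are made up for illustration.

```python
from collections import Counter

def tally_lineages(lines):
    """Count how many sequences share each exact lineage string."""
    counts = Counter()
    for line in lines:
        name, sep, lineage = line.strip().partition("\t")
        if sep:  # ignore blank or malformed lines
            counts[lineage] += 1
    return counts

# Made-up records in the assumed layout, for illustration only.
sample = [
    "seqA\tBacteria(99);unclassified;unclassified;",
    "seqB\tBacteria(99);unclassified;unclassified;",
    "seqC\tBacteria(98);unclassified;unclassified;",
]
counts = tally_lineages(sample)
# Print the most common lineages first, count in the first column.
for lineage, n in counts.most_common():
    print(n, lineage)
```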

I noticed that the 14 sequences with some information at the phylum level were the 14 detected as contaminants and then removed.
My question now is: what happens to the other 120 sequences, which were assigned to the Bacteria kingdom with some level of confidence but are unclassified at the phylum level? Are they removed at a later step of the SOP, or do they show up later as unclassified OTUs? Can these sequences be trusted as real sequences from rare populations, or should they be treated as undetected chimeras or artifacts?

Thanks!