mothur

Remove.lineages; Removing Cyanobacteria->chloroplast sequences

#1

Dear all,

I am working with prokaryotic sequences from the marine environment that contain a large amount of chloroplast and mitochondria classification hits.

I am trying to find a way to remove the sequences within Cyanobacteria that classify as chloroplasts/plastids at higher taxonomic levels (using the SINA ref database), without removing every read that classifies as Cyanobacteria, because cyanobacteria are important members of the microbial community.

Any suggestions on how I would be able to accomplish this?

#2

In the SILVA reference the taxonomy string for chloroplasts is “Bacteria;Cyanobacteria;Chloroplast;”. So if you just use taxon=Chloroplast you should be fine.
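
As a minimal sketch, assuming the fasta, count, and taxonomy files from the previous steps are already set as "current" in the session (the file arguments here are assumptions, not part of the advice above), the call would look something like this:

remove.lineage(fasta=current, count=current, taxonomy=current, taxon=Chloroplast)

Because only the Chloroplast taxon is named, sequences assigned to other cyanobacterial lineages should stay in the dataset.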

Pat

#3

Hi Pat.

I just ran this over the weekend in my batch file:

remove.lineage(fasta=current, count=current, taxonomy=current, taxon=Bacteria;Cyanobacteria_Chloroplast;Chloroplast-Mitochondria-unknown-Eukaryota)

but my final taxonomy file still included chloroplasts. Am I missing something obvious? This is a script we’ve been using for a long time, so I’m confused as to why I’m suddenly seeing chloroplasts again.

EDIT: So it appears that it IS obvious. We were seeing chloroplasts SPECIFICALLY in the cyanobacteria_chloroplast hybrid phylum before, so that’s why our code was written the way it was. I mistakenly thought I was also following Pat’s advice and selecting for chloroplasts in general, but in my code, you can see that the only chloroplasts I was removing were in that “Class.” Once I added an additional bit to include chloroplasts in general, it was fixed.

remove.lineage(fasta=current, count=current, taxonomy=current, taxon=Bacteria;Cyanobacteria_Chloroplast;Chloroplast-Chloroplast-Mitochondria-unknown-Eukaryota)

#4

Hi,
If I am not mistaken, you can make your life even simpler when you want to get rid of chloroplasts (or anything else). You don't have to give the full taxonomy when using the remove.lineage command: using just “…taxon=Chloroplast…” will get rid of every sequence that contains the word “Chloroplast” anywhere in its taxonomy.
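
Several short taxon names can be combined with hyphens, as in the command in #3. A sketch of that form, again assuming the current fasta, count, and taxonomy files are the ones you want to filter:

remove.lineage(fasta=current, count=current, taxonomy=current, taxon=Chloroplast-Mitochondria-unknown-Eukaryota)

It is worth checking the resulting taxonomy file afterwards to confirm that no Chloroplast entries remain, since the exact taxonomy strings depend on the reference used for classification.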

Best wishes,

René
