ChimeraSlayer flags reference sequences

Hi all,

I’m currently testing ChimeraSlayer on a functional gene dataset with a self-compiled reference database. However, ChimeraSlayer is flagging some of the reference sequences as chimeras. How is this possible? (It might seem redundant, but I need to include the reference sequences in the dataset to be analysed.)
The sequences in both datasets have exactly the same length and are aligned in exactly the same way (same number of columns, gap positions, etc.), since the gene in question is quite conserved. I’m running the application with default values plus the trim and split options.
This does not happen with UCHIME, even though the two programs otherwise largely agree on which sequences are chimeric.

Thanks in advance!

Ricardo

Well, one possibility is that there was recombination in your gene of interest. If you look at the original ChimeraSlayer paper you’ll see that they also detected a low level of recombination in 16S rRNA gene sequences. You might try using the de novo-based approach instead.

Pat

Dear Patrick, thank you for the reply.

Recombination is certainly a possibility, although I cannot verify it at the current stage.
However, my question was more about how ChimeraSlayer fundamentally works. In this case, within a single run, it flags sequences that are present in both the analysed dataset and the reference database. This seems odd, since one would assume that sequences in the template database are treated as non-chimeric by definition, so a query that matches a reference sequence 100% should not be flagged.
Am I just misunderstanding how ChimeraSlayer works, or could this somehow be an issue with the reference database compilation itself, or something else?
I already double-checked for possible mistakes in the alignments/sequence headers due to sequence manipulations.

This is part of a meta-analysis of genes in public databases, and unfortunately I cannot apply de novo-based methods without going through all the hundreds of individual datasets.

Ricardo

Do the sequences have the same name in both files? We might need you to send us your sequences and the database, along with the command you are running.
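
If it helps to check the names before sending anything, here is a minimal Python sketch. It assumes both files are plain aligned FASTA and that the sequence name is the first whitespace-delimited token of each header line. The function names and the file names in the comment are hypothetical, and this is not part of ChimeraSlayer or mothur; it only reports which names occur in both files, whether the sequences under those names are identical, and which alignment lengths occur.

```python
def read_fasta(lines):
    """Parse FASTA lines into {name: sequence}.

    Assumption: the name is the first whitespace-delimited token
    after '>' on each header line.
    """
    seqs = {}
    name = None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            tokens = line[1:].split()
            name = tokens[0] if tokens else ""
            seqs[name] = []
        elif name is not None:
            seqs[name].append(line)
    return {n: "".join(parts) for n, parts in seqs.items()}


def compare(query, reference):
    """Report shared names, identical/differing sequences, and lengths."""
    shared = sorted(set(query) & set(reference))
    # Compare case-insensitively; gap characters are compared as-is.
    identical = [n for n in shared if query[n].upper() == reference[n].upper()]
    same = set(identical)
    differing = [n for n in shared if n not in same]
    lengths = {len(s) for s in query.values()} | {len(s) for s in reference.values()}
    return {
        "shared": shared,
        "identical": identical,
        "differing": differing,
        "lengths": sorted(lengths),
    }


# Example use (hypothetical file names):
#   with open("query.fasta") as q, open("reference.fasta") as r:
#       print(compare(read_fasta(q), read_fasta(r)))
```

A name listed under "differing" would mean the query and reference copies are not actually the same sequence, and more than one value under "lengths" would mean the alignments do not have the same number of columns.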