chimera.uchime output

Hi,

I have two questions (for now) about the interpretation of the uchime output:

  1. Columns 2-4 in the *.uchime.chimera file have an “ab=number” tag next to each sequence name. What does it mean? Is it relevant to the interpretation of the results?

  2. If chimeras are formed during PCR (which means two independently amplified samples would never see each other’s amplicons), what is the biological significance of a hybrid sequence from sample A whose parents are sourced from samples B and C?

Thanks much,
Pedro.

Pedro,

  1. I suspect ab=number refers to the abundance, that is, the number of times that sequence was found in your dataset.

  2. That would be hard to interpret. Perhaps it argues for keeping the datasets separate until after chimera checking. How frequently do you think this occurs?

Hello Pat,

Thank you for your reply.

  1. That’s what I suspected, but if I add up the ab numbers, the total does not match the total number of reads before the unique.seqs step (I let mothur do that automatically when I run chimera.uchime using the reference=self option).

  2. I would say that over 50% of the chimeric sequences have parents from different samples, with no apparent relationship between sample origin and minh score value. Although only 2% of my non-unique sequences are flagged as chimeric, I get a 40% reduction in the number of estimated OTUs, which, as somewhat expected, does not change the ecological groupings. I re-assessed some samples with a high number of reads (>24K) representing closely related taxonomic groups and found that only a few (1 or 2) of the sequences in each sample were chimeric. All this makes me wonder whether, for both practical and biological reasons, it might make more sense to search for chimeras in each sample independently using the default settings than to fine-tune the parameters until the results match what we believe is correct. This may be especially true when the categories being compared have highly dissimilar microbial compositions. But I’m not sure if these considerations are necessary or even make sense…

Thanks much.
Pedro.
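
For anyone wanting to repeat the check in point 1, here is a minimal Python sketch that sums the ab values for the query column. It assumes the *.uchime.chimera file is tab-separated, that column 2 holds the query label, and that labels look like name/ab=123/; none of that is confirmed in this thread, and the rows below are made up for illustration.

```python
import re

# Assumption: abundance is written as "ab=<integer>" somewhere in the label.
AB_PATTERN = re.compile(r"ab=(\d+)")

def abundance(label):
    """Return the integer after 'ab=' in a label, or None if there is none."""
    match = AB_PATTERN.search(label)
    return int(match.group(1)) if match else None

def sum_query_abundance(lines):
    """Sum the ab values found in column 2 (the query) of tab-separated rows."""
    total = 0
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 2:
            continue  # skip blank or malformed rows
        ab = abundance(fields[1])
        if ab is not None:
            total += ab
    return total

# Made-up rows, not real mothur output.
example = [
    "0.0000\tseqA/ab=120/\t*\t*",
    "1.2500\tseqB/ab=3/\tseqA/ab=120/\tseqC/ab=45/",
    "0.0000\tseqC/ab=45/\t*\t*",
]
print(sum_query_abundance(example))
```

Only column 2 is summed here because parent labels in columns 3 and 4 repeat sequences that also appear as queries, so adding them in would count those reads more than once.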

So to follow up, are you sure there aren’t parents in those samples? If you take a parent listed in the output and look it up in the name file, are you sure there isn’t another sequence name under it from the same group as the putative chimera?
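
The lookup described above could be sketched in Python roughly as follows. It relies on the usual mothur layouts (name file: representative name, then a comma-separated list of every read it stands for; group file: read name, then group). The helper names and the sample data are hypothetical, and any /ab=N/ suffix on the labels from the uchime output is assumed to have been stripped before the lookup.

```python
def parse_names(lines):
    """Map each representative name to the list of read names it stands for."""
    names = {}
    for line in lines:
        parts = line.split()
        if len(parts) == 2:
            names[parts[0]] = parts[1].split(",")
    return names

def parse_groups(lines):
    """Map each read name to its group (sample)."""
    groups = {}
    for line in lines:
        parts = line.split()
        if len(parts) == 2:
            groups[parts[0]] = parts[1]
    return groups

def groups_of(rep, names, groups):
    """Set of groups containing at least one read collapsed under rep."""
    return {groups[m] for m in names.get(rep, [rep]) if m in groups}

def parents_share_group(chimera, parents, names, groups):
    """For each parent, report whether it occurs in any group of the chimera."""
    chimera_groups = groups_of(chimera, names, groups)
    return {p: bool(chimera_groups & groups_of(p, names, groups)) for p in parents}

# Made-up data: seqB is the putative chimera, found only in sample A.
name_lines = ["seqA\tseqA,seqA2,seqA3", "seqB\tseqB", "seqC\tseqC,seqC2"]
group_lines = ["seqA\tB", "seqA2\tA", "seqA3\tB", "seqB\tA", "seqC\tC", "seqC2\tC"]

names = parse_names(name_lines)
groups = parse_groups(group_lines)
print(parents_share_group("seqB", ["seqA", "seqC"], names, groups))
```

In this made-up case the representative seqA is assigned to sample B, but one of its duplicates (seqA2) comes from sample A, so that parent is actually present in the chimera's own sample even though the label in the output suggests otherwise.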