Create "fake" contigs from short reads

Hi,
I was wondering whether the data would still be processed correctly if I know that some read pairs may be too short to overlap.

Let’s say I have a read pair:
read1: AAACCC(...)TTTGGG
read2: CCCAAA(...)CCCGGG (already reversed)

I know they are too short to overlap, so make.contigs will clearly merge them incorrectly. To avoid losing this data, I could “glue” them manually to create a contig like this:
AAACCC(...)TTTGGGNNNNNNNNNNNNNNNNNNNNCCCAAA(...)CCCGGG

and then I’d mix such contigs with the other “correct” contigs created by make.contigs.
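To make the idea concrete, here is a minimal Python sketch of the manual "gluing" described above. The function name `glue_pair` and the default spacer length of 20 are illustrative assumptions, not part of mothur, and the second read is assumed to be already reverse-complemented.

```python
def glue_pair(read1: str, read2_rc: str, spacer_len: int = 20) -> str:
    """Join two non-overlapping reads with a run of N placeholders.

    read2_rc is assumed to be the reverse read, already reverse-complemented.
    spacer_len is an arbitrary choice; it does not reflect the true gap size.
    """
    return read1 + "N" * spacer_len + read2_rc


# Toy reads (the "(...)" parts of the example are simply left out here).
glued = glue_pair("AAACCCTTTGGG", "CCCAAACCCGGG")
print(glued)
```

Note that the spacer only marks the junction; it says nothing about how many bases are actually missing between the two reads.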

Is it really a bad idea to use the data like this?

I don’t think that this is a good idea. For one thing, if you include Ns, your sequences should be kicked out again by a proper “screen.seqs” step that eliminates every sequence with ambiguities.
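As a rough illustration of why the glued sequences would not survive, here is a small Python sketch of an ambiguity filter. It is not mothur's implementation; it only mimics the effect of screening with a maximum of zero ambiguous bases (mothur's screen.seqs exposes this as the maxambig parameter). The function names and the toy sequences are hypothetical.

```python
def count_ambiguous(seq: str) -> int:
    """Count bases that are not A, C, G or T (alignment gaps '-' and '.' are ignored)."""
    return sum(1 for base in seq.upper() if base not in "ACGT-.")


def screen(seqs: dict[str, str], max_ambig: int = 0) -> dict[str, str]:
    """Keep only the sequences with at most max_ambig ambiguous bases."""
    return {name: s for name, s in seqs.items() if count_ambiguous(s) <= max_ambig}


# Toy input: one ordinary contig and one manually glued contig with an N spacer.
seqs = {
    "normal_contig": "AAACCCTTTGGG",
    "glued_contig": "AAACCC" + "N" * 20 + "CCCGGG",
}
kept = screen(seqs)
print(sorted(kept))
```

With a threshold of zero, every glued sequence is removed, so the gluing step would gain nothing unless the threshold were relaxed for all sequences.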

So there is no option to make a kind of “split alignment”?
Let’s say I have the V3 and V4 regions and forward/reverse reads:
-------3333333333333333------------------44444444444444444444--------
-------FFFFFFFFFFFFFFFFnnnnnnnnnRRRRRRRRRRRRRRRRR------
--------FFFFFFFFFFFFFFFnnnnnnnnnnRRRRRRRRRRRRRRRRRR------

I suspect some of my reads won’t overlap to form a contig, but I don’t need to cover the full non-variable region between V3 and V4. It would be enough to keep track that the 1st read aligned to V3 and the 2nd read aligned to V4. Is that a bad idea?
