How to use the screen command

Hi,
This is the output of my summary file after aligning my sequences to the SILVA database. Is it necessary to screen my alignment, and if so, what should my start and end positions be?
I don't understand why this step is needed.
             Start    End      NBases   Ambigs  Polymer  NumSeqs
Minimum:     0        0        0        0       1        1
2.5%-tile:   1044     1044     1        0       1        1595
25%-tile:    1044     1106     4        0       1        15947
Median:      42621    43116    14       0       2        31893
75%-tile:    43096    43116    36       0       3        47839
97.5%-tile:  43116    43116    151      0       5        62190
Maximum:     43116    43117    295      0       9        63784
Mean:        25081.6  26126    33.9767  0       2.15349

# of unique seqs: 63784

total # of seqs: 63784

When I used the flip option in the alignment, I get this from the summary command:
             Start    End      NBases   Ambigs  Polymer  NumSeqs
Minimum:     1044     1046     2        0       1        1
2.5%-tile:   1125     6457     151      0       3        1595
25%-tile:    1729     6460     255      0       4        15947
Median:      1789     6460     267      0       5        31893
75%-tile:    2062     7694     276      0       5        47839
97.5%-tile:  13862    34446    552      0       6        62190
Maximum:     43097    43116    581      0       9        63784
Mean:        4412.6   11434.1  280.863  0       4.4837

# of Seqs: 63784

I can't work out which start and end I should use for screening.
I am also still not clear on what exactly the screen command does.


Thanks!!!

You need to run screen.seqs so that your sequences all overlap the same region of the 16S rRNA gene.
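
As a rough illustration of the idea (a sketch in Python, not mothur's own code), a start/end screen keeps only the reads that begin at or before a chosen alignment column and finish at or after another. The read names, coordinates, and cutoffs below are made up for the example and are not a recommendation for this dataset.

```python
# Illustrative sketch only; this is not how mothur is implemented.
# Each record is (name, start, end): the alignment columns where a read begins and ends.

def screen(records, start, end):
    """Keep reads that begin at or before `start` and finish at or after `end`."""
    kept, removed = [], []
    for name, read_start, read_end in records:
        if read_start <= start and read_end >= end:
            kept.append(name)
        else:
            removed.append(name)
    return kept, removed

# Hypothetical reads with made-up alignment coordinates.
reads = [
    ("read_a", 100, 500),
    ("read_b", 120, 510),
    ("read_c", 700, 900),  # sits in a different part of the alignment
]

# Hypothetical cutoffs: every kept read must span columns 120 through 500.
kept, removed = screen(reads, start=120, end=500)
print("kept:", kept)
print("removed:", removed)
```

Every read that survives covers the same stretch of the alignment, which is what makes distances between reads meaningful.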

What region of the gene did you amplify? What were your primers?

Thanks for the explanation; I now understand the screen command.
I actually have three datasets. I pooled all the reads together and aligned them to the SILVA database. The three datasets cover different regions of the 16S rRNA gene, and I want to compare them. Is this the right way to do it?


Thanks

Sorry, but if the sequences don’t overlap the same region, then it’s pretty hard to compare them. I would process the 3 regions separately. Really, the only way to compare 3 regions is to build phylotypes and then compare them at a very broad taxonomic level. Every primer set has its own biases and each region evolves at a different rate. So although people think this is a great way to do microbial ecology, it’s really just a headache.

Pat

Thanks for the answer. What if I keep all the regions of the alignment, meaning I take everything in the alignment file and skip the screening? Would that be incorrect?

You need to separate the reads into separate files so that each file represents a different 16S region.
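
If the reads are already pooled, one possible way to sort them back out is by where they sit in the alignment. The Python sketch below assumes you have each read's start and end column (for example, from a summary file); the region names and windows are hypothetical placeholders, since the real ones depend on the primers used.

```python
from collections import defaultdict

# Hypothetical region windows in alignment columns; real values depend on your primers.
REGION_WINDOWS = {
    "region_A": (100, 500),
    "region_B": (600, 900),
}

def assign_region(start, end, windows=REGION_WINDOWS):
    """Return the first region whose window the read fully covers, else 'unassigned'."""
    for region, (win_start, win_end) in windows.items():
        if start <= win_start and end >= win_end:
            return region
    return "unassigned"

def split_by_region(records):
    """Group read names by the region they cover."""
    groups = defaultdict(list)
    for name, start, end in records:
        groups[assign_region(start, end)].append(name)
    return dict(groups)

# Made-up reads for illustration.
reads = [("read_1", 90, 510), ("read_2", 590, 905), ("read_3", 300, 700)]
groups = split_by_region(reads)
print(groups)
```

Each group could then be written to its own file and processed on its own. If you still have the three original datasets, keeping them unpooled from the start is simpler.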

So I need to do the alignment and everything else separately, but then how will I know whether the OTUs overlap among groups?

You can’t. If sequences don’t overlap, you can’t see whether the OTUs overlap. Like I said, it’s a headache.