Hi,
This is the output of my summary file after aligning my sequences to the SILVA database. I want to know whether it is necessary to screen my alignment and, if so, what my start and end should be.
I don't understand why we need to do this step.
             Start    End      NBases   Ambigs  Polymer  NumSeqs
Minimum:     0        0        0        0       1        1
2.5%-tile:   1044     1044     1        0       1        1595
25%-tile:    1044     1106     4        0       1        15947
Median:      42621    43116    14       0       2        31893
75%-tile:    43096    43116    36       0       3        47839
97.5%-tile:  43116    43116    151      0       5        62190
Maximum:     43116    43117    295      0       9        63784
Mean:        25081.6  26126    33.9767  0       2.15349
Thanks for the explanation; I now understand the screen command.
Actually, I have three datasets. I pooled all the reads together and aligned them to the SILVA database. The three datasets are different and cover different regions of the 16S rRNA gene, and I want to compare them. Is this the right way to do it?
Sorry, but if the sequences don’t overlap the same region, then it’s pretty hard to compare them. I would process the 3 regions separately. Really, the only way to compare 3 regions is to build phylotypes and then compare them at a very broad taxonomic level. Every primer set has its own biases and each region evolves at a different rate. So although people think this is a great way to do microbial ecology, it’s really just a headache.
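If the pooled reads have already been aligned, one way to get back to processing the regions separately is to bin the sequences by where they sit in the alignment. The Python below is only a rough sketch of that idea, not part of mothur: the function `split_by_region`, the sequence names, and the region boundaries are all made up for illustration, and it assumes you can read a name, start and end for each sequence from the per-sequence summary file.

```python
from collections import defaultdict


def split_by_region(rows, regions):
    """Group sequence names by the first region that fully contains their span.

    rows    -- iterable of (name, start, end) alignment coordinates
    regions -- dict mapping a region label to an inclusive (low, high) window
    Sequences that fit no window are collected under "unassigned".
    """
    groups = defaultdict(list)
    for name, start, end in rows:
        label = "unassigned"
        for region, (low, high) in regions.items():
            if low <= start and end <= high:
                label = region
                break
        groups[label].append(name)
    return dict(groups)


# Hypothetical per-sequence coordinates (assumed, not taken from the real data).
rows = [
    ("seqA", 1044, 1106),
    ("seqB", 42621, 43116),
    ("seqC", 1044, 1044),
    ("seqD", 0, 0),
]

# Hypothetical windows; real boundaries depend on the primers used.
regions = {"region_1": (1000, 1200), "region_2": (42000, 43200)}

groups = split_by_region(rows, regions)
for label, names in groups.items():
    print(label, names)
```

Each list of names could then be saved and used to pull out that region's reads, so that each region is screened and analyzed on its own.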
Thanks for the answer. If I keep all the regions of the alignment, meaning I take everything in the alignment file and skip the screening, would that be incorrect?