Dear all:
I am analyzing some 454 pyrosequencing data, following the steps used in the Costello Stool analysis example.
I have about 20k reads. The summaries after “align” and the subsequent “screen” both show that all of my reads are >300 bp. But after the following filter step to remove gaps and “.”, I found that the beginnings of the sequences were truncated and the lengths had dropped to 245-288 bp. The command I used was “filter.seqs(fasta=sample.trim.unique.good.align, vertical=T, trump=., processors=2)”. I used the same method to analyze a subset of the sequences from the same samples (about 10k reads in total), and there the lengths did not change after filtering. I am not sure what happened, and I am worried that this truncation will significantly change the final results. Please advise on a solution. Many thanks!
Below is an example of the same sequence before and after filtering. The underlined, italic letters are the part that was trimmed by the filter:



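That sounds about right. Remember that trump=. removes every alignment column in which any sequence has a “.”, so the whole alignment gets trimmed to the region covered by the shortest sequence in alignment space. So you presumably have a sequence that is about 300 bp, and it sets the window that every other sequence is trimmed to. Then, because sequences have insertions and deletions within that window, the trimmed sequences will be somewhat shorter or longer than that. If you want to be sure that all of the sequences are >300 bp after trump=., then you might specify a minimum length of 350 or so in screen.seqs.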
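To make the effect concrete, here is a small Python sketch of the column filtering described above. It is only an illustration of the idea, not mothur's actual code: the function names (`filter_columns`, `ungapped_length`) and the toy alignment are made up for this example, and it assumes vertical=T drops columns that contain only gap characters while trump=. drops any column in which at least one sequence has a “.”.

```python
def filter_columns(aligned, trump=".", vertical=True):
    """Toy imitation of column filtering on an alignment (hypothetical helper).

    aligned  -- list of equal-length aligned sequences
    trump    -- drop a column if ANY sequence has this character there
    vertical -- drop a column if EVERY sequence has a gap ('-' or '.') there
    """
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("all aligned sequences must have the same length")

    keep = []
    for col in range(length):
        column = [seq[col] for seq in aligned]
        if trump is not None and trump in column:
            continue  # one short sequence is enough to remove the column
        if vertical and all(base in "-." for base in column):
            continue  # column carries no information at all
        keep.append(col)

    return ["".join(seq[i] for i in keep) for seq in aligned]


def ungapped_length(seq):
    """Number of real bases, ignoring '-' and '.' characters."""
    return sum(1 for base in seq if base not in "-.")


# Made-up alignment: the second read starts late, the third ends early.
toy = [
    "ACGTAC-GTA",
    "..GTACAGTA",
    "ACGTAC-G..",
]

filtered = filter_columns(toy)
for before, after in zip(toy, filtered):
    print(before, ungapped_length(before), "->", after, ungapped_length(after))
```

In this toy case every read loses the columns where some other read has a “.”, and the reads still differ in length afterwards because one of them carries an insertion inside the shared window. That is the same pattern as reads of >300 bp ending up at 245-288 bp.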

Hope this helps,