Dear all:
I am working on a 454 pyrosequencing analysis, following the steps used in the Costello Stool Analysis example.
I have about 20k reads. The summaries after the “align” and subsequent “screen” steps both show that all of my reads are >300 bp. But after the following filter step to remove gaps and “.”, I found that the beginnings of the sequences were truncated and the lengths had dropped to 245-288 bp. The command I used was “filter.seqs(fasta=sample.trim.unique.good.align, vertical=T, trump=., processors=2)”. I previously used the same method on part of the sequences from the same samples (about 10k reads in total), and there was no change in length before and after the filter. I am not sure what is happening, and I am worried that this truncation will make a significant difference to the final results. Please advise on a solution. Many thanks!
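As a toy illustration of the two options in that command, here is a minimal Python sketch. It assumes the behaviour described in the mothur documentation: vertical=T drops columns that are gaps in every sequence, and trump=. drops any column in which at least one sequence has a “.”. The function name filter_columns and the three short reads are made up for illustration; this is not mothur's actual code.

```python
def filter_columns(aligned, trump=".", vertical=True):
    """Toy column filter over equal-length aligned sequences (not mothur's code)."""
    width = len(aligned[0])
    assert all(len(s) == width for s in aligned), "sequences must be aligned"
    keep = []
    for col in range(width):
        column = [s[col] for s in aligned]
        # Assumed trump rule: drop the column if ANY sequence has the trump character.
        if trump is not None and trump in column:
            continue
        # Assumed vertical rule: drop the column if EVERY sequence has a gap there.
        if vertical and all(c in "-." for c in column):
            continue
        keep.append(col)
    return ["".join(s[c] for c in keep) for s in aligned]


# Hypothetical reads: the third starts two columns late, the first ends one early.
reads = [
    "ATTG-ACGT.",
    "ATTG-ACGTA",
    "..TG-ACGTA",
]
filtered = filter_columns(reads)
print(filtered)
```

Under these assumptions, the columns where only one read has a “.” are removed from every read, so all three come out the same length.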
Here is an example of the same sequence before and after the filter; the underlined and italic letters are the part trimmed by the filter:
before
ATTGAACGCTGGCGGCATGCCTTACACATGCAAGTCGAACGGCAGCACGGGAGCTTGCTCCTGGTGGCGAGTGGCGAACGGGTGAGTAATATATCGGAACGTACCCAGTGGTGGGGGATAGCCCGGCGAAAGCCGGATTAATACCGCATACGATCTACGGATGAAAGCGGGGGATCGCAAGACCCCGCGCTATTGGAGCGGCCGATATCTGATTAGCTAGTTGGTAGGGTAAAAGCCTACCAAGGCTACGATCAGTAGCTGGTCTGAGAGGACGACCAGCCACACTGGAACTGAGACACGGTCCAGACTCCTACGGAAGGCA[b]GCAGACGCAC[/b]
after
CTTGCTCCTGGTGGCGAGTGGCGAACGGGTGAGTAATATATCGGAACGTACCCAGTGGTGGGGGATAGCCCGGCGAAAGCCGGATTAATACCGCATACGATCTACGGATGAAAGCGGGGGATCGCAAGACCCCGCGCTATTGGAGCGGCCGATATCTGATTAGCTAGTTGGTAGGGTAAAAGCCTACCAAGGCTACGATCAGTAGCTGGTCTGAGAGGACGACCAGCCACACTGGAACTGAGACACGGTCCAGACTCCTACGGAAGGCA
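To put numbers on a before/after pair like this one, a small Python sketch along the following lines could report how many bases were lost at each end. The helper name trimmed_ends and the short demo strings are hypothetical. Gap characters (“-” and “.”) are stripped before comparing, and any forum markup such as [b] tags would need to be removed from the pasted sequences first.

```python
def trimmed_ends(before, after):
    """Return (bases lost at start, bases lost at end).

    Returns None if 'after' does not occur inside 'before'.
    """
    strip_gaps = str.maketrans("", "", "-.")
    b = before.translate(strip_gaps).upper()
    a = after.translate(strip_gaps).upper()
    start = b.find(a)  # first position where the filtered read matches
    if start == -1:
        return None
    return start, len(b) - start - len(a)


# Short made-up strings standing in for the full reads above.
before = "ATTGAACG-CTTGCTCCTGG..GCAGAC"
after = "CTTGCTCCTGG"
print(trimmed_ends(before, after))
```

Running the same function over every read in the two fasta files would show whether all reads lose the same number of leading bases.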