Hi there,
I am trying to determine the start and end positions for customizing the SILVA reference database to my region, following these two tutorials:
- https://github.com/mothur/mothur/issues/235
- http://blog.mothur.org/2016/07/07/Customization-for-your-region/
In the first one, the pcr.seqs command is used to generate the product from the primer sequences. The product doesn’t include the primer sequences.
In the second one, no program is used. The product is generated manually, and it does include the primer sequences.
Obviously, when the subsequent align.seqs(fasta=product.fasta, reference=silva.bacteria.fasta) and summary.seqs(fasta=product.align) steps are run, a product without primers (first approach) and a product with primers (second approach) will give different results, i.e., the start and end positions will differ.
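To make the difference concrete, here is a toy sketch in Python of what I mean. It is not mothur and not a real alignment: the sequences are made up, the names (`reference`, `locate`, etc.) are my own, and I use a plain substring search as a stand-in for align.seqs + summary.seqs. It only shows that the coordinates shift by the primer lengths.

```python
# Toy illustration only: made-up sequences, and a plain substring search
# standing in for a real alignment against the reference.
forward_primer = "ACGTAC"
reverse_primer_site = "CATGCA"  # reverse primer as it would appear on the reference strand
insert = "GGGCCCAAATTT"         # the region between the two primers

# Hypothetical reference: flank + forward primer + insert + reverse primer site + flank
reference = "TTTT" + forward_primer + insert + reverse_primer_site + "AAAA"


def locate(product, ref):
    """Return the 1-based (start, end) position of product within ref."""
    idx = ref.find(product)
    if idx < 0:
        raise ValueError("product not found in reference")
    return idx + 1, idx + len(product)


# Second approach: manually built product, primers included
with_primers = forward_primer + insert + reverse_primer_site
# First approach: product as I understand pcr.seqs to give it, primers removed
without_primers = insert

print("with primers:   ", locate(with_primers, reference))
print("without primers:", locate(without_primers, reference))
```

In this toy case the start moves forward by the length of the forward primer and the end moves back by the length of the reverse primer, which is the kind of difference I expect to see in the summary.seqs output.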
Does this difference in coordinates affect downstream steps such as pcr.seqs(fasta=silva.bacteria.fasta, start=start, end=end, keepdots=F, processors=8) and align.seqs(fasta=stability.trim.contigs.good.unique.fasta, reference=silva.v4.fasta)?
Which approach do you recommend: a product that includes the primer sequences, or one that doesn’t?
Many thanks,
Dapeng