pcr.seqs removes the first base with keepdots=f

Hello all,

I am running mothur v.1.46.1, downloaded today. I was trying to clean an alignment with pcr.seqs, and if keepdots=f is included, it removes the first base of the alignment. If keepdots is not included, the first base is kept. Below is the log of the same step run on the same file both ways, each followed by summary.seqs; you can see the end position change from 692 to 691. I checked the alignments produced by both runs, and the keepdots=f run removes the base at position 1. Happy to share the alignments.

This is the Windows version, but the problem also affected a student using a Linux server (who came to me after a day of banging her head against the computer because my script was not working…).

Windows version

Using Boost
mothur v.1.46.1
Last updated: 9/1/21
by
Patrick D. Schloss

Department of Microbiology & Immunology

University of Michigan
http://www.mothur.org

When using, please cite:
Schloss, P.D., et al., Introducing mothur: Open-source, platform-independent, community-supported software for describing and comparing microbial communities. Appl Environ Microbiol, 2009. 75(23):7537-41.

Distributed under the GNU General Public License

Type 'help()' for information on the commands that are available

For questions and analysis support, please visit our forum at https://forum.mothur.org

Type 'quit()' to exit program

[NOTE]: Setting random seed to 19760620.

Interactive Mode



mothur > pcr.seqs(fasta=carolina.align, start=1, end=692)

Using 8 processors.
It took 14 secs to screen 411075 sequences.

Output File Names: 
carolina.pcr.align
carolina.bad.accnos
carolina.scrap.pcr.align



mothur > summary.seqs(fasta=current)
Using carolina.pcr.align as input file for the fasta parameter.

Using 8 processors.

		Start	End	NBases	Ambigs	Polymer	NumSeqs
Minimum:	1	688	95	0	3	1
2.5%-tile:	1	692	129	0	4	9950
25%-tile:	1	692	130	0	4	99496
Median: 	1	692	130	0	4	198991
75%-tile:	1	692	134	0	4	298486
97.5%-tile:	1	692	143	0	6	388032
Maximum:	1	692	186	0	9	397981
Mean:	1	691	132	0	4
# of Seqs:	397981

It took 7 secs to summarize 397981 sequences.

Output File Names:
carolina.pcr.summary


mothur > pcr.seqs(fasta=carolina.align, start=1, end=692, keepdots=f)

Using 8 processors.
It took 14 secs to screen 411075 sequences.

Output File Names: 
carolina.pcr.align
carolina.bad.accnos
carolina.scrap.pcr.align



mothur > summary.seqs(fasta=current)
Using carolina.pcr.align as input file for the fasta parameter.

Using 8 processors.

		Start	End	NBases	Ambigs	Polymer	NumSeqs
Minimum:	1	687	94	0	3	1
2.5%-tile:	1	691	128	0	4	9950
25%-tile:	1	691	129	0	4	99496
Median: 	1	691	129	0	4	198991
75%-tile:	1	691	133	0	4	298486
97.5%-tile:	1	691	142	0	6	388032
Maximum:	1	691	185	0	9	397981
Mean:	1	690	131	0	4
# of Seqs:	397981

It took 8 secs to summarize 397981 sequences.

Output File Names:
carolina.pcr.summary


mothur > quit()

Can you try with keepdots=T and then run filter.seqs with vertical=T?

Pat

Hi Pat

Sorry if I was not clear. The problem is not whether the dots are kept - the problem is that pcr.seqs behaves differently depending on the keepdots setting. With keepdots=true, the fragment that is left includes the “start” base. With keepdots=false, the fragment that is left starts AFTER the “start” base (at least when start=1).

In my case I was running pcr.seqs from 1 to 692, and with keepdots=false the fragment that was left was 2-692 (so, 691 bp). I then ran screen.seqs with end=692 (to remove incomplete sequences) and got an empty alignment. After changing to keepdots=true, positions 1-692 were kept, so screen.seqs worked OK.
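To make the arithmetic above concrete, here is a minimal Python sketch on a toy 10-column alignment. It is not mothur's code: the function names are hypothetical, and the "dropping start" variant only imitates the behaviour described in this thread (the start column being lost), not the actual cause of the bug. The end check imitates the idea of screen.seqs with end=N, where a sequence must reach column N to be kept.

```python
GAP_CHARS = ".-"  # characters treated as "no base" in this toy example


def trim_inclusive(aligned, start, end):
    """Keep alignment columns start..end, 1-based and inclusive."""
    return aligned[start - 1:end]


def trim_dropping_start(aligned, start, end):
    """Hypothetical variant that loses the start column (the reported symptom)."""
    return aligned[start:end]


def last_base_column(trimmed):
    """1-based column of the last real base, or 0 if there is none."""
    cols = [i for i, c in enumerate(trimmed, start=1) if c not in GAP_CHARS]
    return cols[-1] if cols else 0


def passes_end_check(trimmed, required_end):
    """Imitate an end=N screen: the sequence must reach column N."""
    return last_base_column(trimmed) >= required_end


if __name__ == "__main__":
    toy = "ACGT-ACGTA"  # 10 alignment columns
    good = trim_inclusive(toy, 1, 10)
    short = trim_dropping_start(toy, 1, 10)
    print(len(good), passes_end_check(good, 10))    # 10 True
    print(len(short), passes_end_check(short, 10))  # 9 False
```

With one column missing, every sequence ends one position early, so a screen that requires the full length rejects all of them. That matches the empty alignment described above.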

Thanks,

Leo

Hi Leo,
Thanks for reporting this bug. I have fixed it and the change will be part of our next release. In the meantime, you can work around the issue by running the following:

mothur > pcr.seqs(fasta=carolina.align, start=1, end=692, keepdots=t) - sets all positions to be removed to .'s

mothur > filter.seqs(fasta=current, vertical=f, trump=.) - removes all positions with .'s
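For readers who want to see why the two-step workaround keeps the first base, here is a rough Python sketch of the idea on toy data. It is only an illustration under stated assumptions, not mothur's implementation: `mask_outside` and `drop_trump_columns` are hypothetical helpers that mimic "set positions outside start..end to dots" and "remove every column where any sequence has the trump character".

```python
def mask_outside(aligned, start, end):
    """Replace every column outside start..end (1-based, inclusive) with '.'."""
    return "".join(
        base if start <= col <= end else "."
        for col, base in enumerate(aligned, start=1)
    )


def drop_trump_columns(seqs, trump="."):
    """Remove each column in which at least one sequence has the trump character."""
    if not seqs:
        return []
    width = len(seqs[0])
    keep = [i for i in range(width) if all(s[i] != trump for s in seqs)]
    return ["".join(s[i] for i in keep) for s in seqs]


if __name__ == "__main__":
    toy = ["ACGT-ACGTA", "A-GTTACG-A"]  # two aligned toy sequences, 10 columns
    masked = [mask_outside(s, 1, 8) for s in toy]
    print(masked)                       # ['ACGT-ACG..', 'A-GTTACG..']
    print(drop_trump_columns(masked))   # ['ACGT-ACG', 'A-GTTACG']
```

Column 1 survives because it is never set to a dot. One caveat of the column-wise rule: if any sequence has a dot inside the start..end region (for example, an incomplete sequence), that column is removed for every sequence.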

Kindly,
Sarah


Thanks Sarah!

Again, I had no problems once I noticed the cause. In my pipeline it doesn’t matter whether the dots are kept (the screen.seqs step leaves only complete fragments, so no dots remain either way); I just needed to keep the first base!

Best,

Leo
