# Sequence length range for V4 region analysis

Hi community!!!

• I’m analysing V4 region sequences. Since this region is around 253 bp long, what length range should I use in the first “screen.seqs” step to remove bad sequences? I’m using “minlength=230, maxlength=275”. Is that OK?

• Why does the first “screen.seqs” step of the mothur SOP use maxlength=275?

Thanks & Regards,
DC7

Hi,

There aren’t any good sequences in the database for the V4 region (with priming sites removed) that are longer than 275 nt.

Pat

Thank you for your reply, sir. But what minimum length should we use in an analysis?

With paired 250 nt reads, I’m not sure it’s possible to get much below 250 nt. Probably 240 on the low end.

Sir, first, thank you, and second, pardon the repetitive questions. What if I use a broader length range?
When I analysed the V4-V6 region (550 bp), I used minlength=525, maxlength=575 in the “screen.seqs” command. Would that be considered wrong? What would your comment be as a reviewer?

```
mothur > summary.seqs(fasta=current)
Using /media/dc7/New Volume/OBESITY/prjna321731_16s/prjna321731_16s_nw/merge.paired.trim.contigs.fasta as input file for the fasta parameter.

Using 8 processors.

             Start  End  NBases  Ambigs  Polymer  NumSeqs
Minimum:     1      301  301     0       3        1
2.5%-tile:   1      542  542     0       4        15495
25%-tile:    1      544  544     0       5        154948
Median:      1      547  547     0       5        309896
75%-tile:    1      550  550     2       5        464843
97.5%-tile:  1      553  553     7       7        604296
Maximum:     1      602  602     62      299      619790
Mean:        1      546  546     1       5
# of Seqs:   619790

It took 22 secs to summarize 619790 sequences.

Output File Names:
/media/dc7/New Volume/OBESITY/prjna321731_16s/prjna321731_16s_nw/merge.paired.trim.contigs.summary

mothur > screen.seqs(fasta=current, group=current, maxambig=0, maxhomop=8, minlength=525, maxlength=575)
Using /media/dc7/New Volume/OBESITY/prjna321731_16s/prjna321731_16s_nw/merge.paired.trim.contigs.fasta as input file for the fasta parameter.
Using /media/dc7/New Volume/OBESITY/prjna321731_16s/prjna321731_16s_nw/merge.paired.contigs.groups as input file for the group parameter.

Using 8 processors.

It took 7 secs to screen 619790 sequences, removed 286319.
```
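For readers following the thread, the four criteria in the `screen.seqs` call above can be read as a simple per-sequence filter. The Python below is only an illustrative sketch, not mothur's implementation: the function names are hypothetical, the length bounds are assumed to be inclusive, and any character other than A, C, G, T or U is assumed to count as ambiguous.

```python
def longest_homopolymer(seq):
    """Length of the longest run of one repeated character in seq."""
    longest = run = 0
    prev = None
    for base in seq:
        run = run + 1 if base == prev else 1
        prev = base
        longest = max(longest, run)
    return longest


def passes_screen(seq, minlength=525, maxlength=575, maxambig=0, maxhomop=8):
    """Return True if seq meets all four thresholds.

    Defaults mirror the values used in the screen.seqs call in this thread.
    Assumptions: bounds are inclusive; non-ACGTU characters are ambiguous.
    """
    seq = seq.upper()
    ambigs = sum(1 for base in seq if base not in "ACGTU")
    return (
        minlength <= len(seq) <= maxlength
        and ambigs <= maxambig
        and longest_homopolymer(seq) <= maxhomop
    )
```

Because all criteria must hold at once, a sequence inside the length window is still removed if it has a single ambiguous base. In the summary above the 75%-tile already shows 2 ambiguous bases, so `maxambig=0` may account for a large share of the removed sequences, independently of the length range.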

Thanks and Regards,
DC7

You would need to generate a reference alignment for V4-V6 (without the primers on the sequences) and then run it through `summary.seqs`. The output would show you what ranges you would expect (keep in mind that there might be some weird outliers).
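To make the idea concrete, here is a minimal Python sketch of the kind of length summary that `summary.seqs` reports, applied to a FASTA file of reference sequences for the target region. It is an illustration only: the function names are hypothetical, gap characters (`.` and `-`) are assumed not to count toward length, and the nearest-rank percentile used here is an assumption that may not match how mothur computes its percentiles.

```python
import math


def read_fasta_lengths(lines):
    """Return the ungapped length of each record in FASTA-formatted lines."""
    lengths, current = [], None
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            if current is not None:
                lengths.append(current)
            current = 0
        elif line and current is not None:
            # Assumption: alignment gaps ('.' and '-') are not bases.
            current += sum(1 for ch in line if ch not in ".-")
    if current is not None:
        lengths.append(current)
    return lengths


def percentile(sorted_values, pct):
    """Nearest-rank percentile of a sorted, non-empty list."""
    rank = max(1, math.ceil(pct / 100 * len(sorted_values)))
    return sorted_values[rank - 1]


def suggest_length_range(lengths, low_pct=2.5, high_pct=97.5):
    """Return (low, high) length percentiles as a starting point for screening."""
    ordered = sorted(lengths)
    return percentile(ordered, low_pct), percentile(ordered, high_pct)
```

The returned pair is only a starting point for `minlength` and `maxlength`. With few reference sequences the extreme percentiles can land on the outliers themselves, so the full distribution should still be inspected.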

Pat
