18s v9 for eukaryotes, max length parameter?

micah_rain · June 16, 2019, 8:13pm

Hi,

I am currently processing data for my PI. It was decided to use 2x300 runs to create the amplicons…

With this in mind, I am wondering what the appropriate max length is… the SOP for 16s v4 shows that you input 275bp for max length (while the amplicons should probably be around 250bp). How is this parameter determined? What is the appropriate number to scale up to/how does one make this decision?

I chose a max length of 350 because the 97.5th percentile was 343 bp. However, I feel like I picked this value somewhat arbitrarily, without any strong justification for ‘why’ I did this…

Thank you,

Micah

Kendra · June 17, 2019, 9:46pm

how long is the v9 supposed to be for your organisms?

micah_rain · June 17, 2019, 10:14pm

If we used 2x300 runs, doesn’t this mean that the amplicon length is ~300bp…?

Kendra · June 18, 2019, 4:22pm

no that’s your sequencing length. If you were to run your amplicon on a gel, how long do you expect it to be?

micah_rain · June 18, 2019, 4:51pm

ohhh…I see. I will have to check with my PI/sequencing facility then…I am not sure. Let’s say they were ~250bp; how would one go about picking the max length?

The SOP shows a max length of 275bp…is an additional 25bp a standard? Or is there more reasoning into picking a max length based on the ‘summary.seqs’ output?

FloHenk · June 19, 2019, 7:05am

Hey micah,

I would always go with your expected length + a little buffer. Your summary.seqs data should mainly give you an idea of what your output looks like and let you check its quality. If 50% of your sample is double the size that you expect it to be, then there is most likely something wrong.

A snippet from the SOP that talks about it:
"This tells us that we have 152360 sequences that for the most part vary between 248 and 253 bases. Interestingly, the longest read in the dataset is 502 bp. Be suspicious of this. Recall that the reads are supposed to be 251 bp each. This read clearly didn’t assemble well (or at all)."

Kendra · June 20, 2019, 4:25pm

certainly add a bit of a buffer. variable regions have variable lengths. For example there are some thermophilic clostridia that have an extra 100bp in v2.
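
The "expected length plus a buffer" advice, together with the check on the length distribution, can be sketched in Python. This is only an illustration, not part of mothur: the function names, the toy read lengths, and the 25 bp buffer (taken from the 250 vs. 275 example earlier in the thread) are all assumptions.

```python
import math

def length_summary(lengths):
    """Summarize read lengths with nearest-rank percentiles,
    loosely in the spirit of a summary.seqs table."""
    ordered = sorted(lengths)
    n = len(ordered)

    def pct(p):
        # Nearest-rank percentile: smallest value with at least p% of reads at or below it
        idx = max(0, min(n - 1, math.ceil(p * n / 100) - 1))
        return ordered[idx]

    return {
        "min": ordered[0],
        "2.5%": pct(2.5),
        "median": pct(50),
        "97.5%": pct(97.5),
        "max": ordered[-1],
    }

def choose_max_length(expected_longest, buffer=25):
    """Hypothetical helper: expected longest amplicon plus a buffer (both in bp)."""
    return expected_longest + buffer

def fraction_over(lengths, max_length):
    """Fraction of reads that a given max length would discard."""
    return sum(1 for x in lengths if x > max_length) / len(lengths)

# Toy data (made up): most reads near 250 bp, one badly assembled 502 bp read
toy_lengths = [250] * 90 + [252] * 5 + [253] * 4 + [502]

summary = length_summary(toy_lengths)
max_length = choose_max_length(expected_longest=250, buffer=25)
lost = fraction_over(toy_lengths, max_length)

print(summary)
print(max_length, lost)
```

The point of the design is that the cutoff comes from the biology (expected amplicon length), while the summary is only used as a sanity check: if a large fraction of reads would be discarded, the expectation or the assembly is probably wrong.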

pschloss · June 21, 2019, 12:26pm

Everything people have been saying is spot on. One other comment…

I seem to recall that V9 isn’t all that long. If your amplicon is shorter than 300 nt, then you’re going to get even more errors. We ran into this when we were initially trying to get 2x300 to work with the v4 region: we had to dial back the v3 chemistry to do 2x250 (and it was still horrible).
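
To make the relationship between sequencing length and amplicon length concrete, here is a rough sketch of the read geometry only. It assumes both reads have the same length and that the amplicon length counts everything between the two read start points; the function name is hypothetical, and nothing here models error rates.

```python
def paired_end_geometry(read_length, amplicon_length):
    """Return how many amplicon bases are covered by both reads (overlap)
    and how many bases each read runs past the end of the amplicon."""
    # Bases sequenced beyond the far end of the amplicon
    read_through = max(0, read_length - amplicon_length)
    # If the amplicon is shorter than one read, both reads cover all of it;
    # otherwise the two reads share 2 * read_length - amplicon_length bases
    overlap = max(0, min(amplicon_length, 2 * read_length - amplicon_length))
    return {"overlap": overlap, "read_through": read_through}

# 2x300 on a ~250 bp amplicon: full overlap, 50 bases of each read past the end
print(paired_end_geometry(300, 250))
# 2x250 on the same amplicon: full overlap, no read-through
print(paired_end_geometry(250, 250))
# 2x300 on a 350 bp amplicon: 250 bp overlap, no read-through
print(paired_end_geometry(300, 350))
```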

micah_rain · June 21, 2019, 5:34pm

Ok, yeah…I am having more questions arise the more I get into this…

Our DNA sequencing facility said the amplicon length was 300-350, depending on the organisms…which makes me wonder about the appropriate max length to use…

I am also wondering, since the variable regions do indeed vary between organisms, how is it that we align primers to a reference sequence, and then trim the database using that reference sequence? Will this exclude some sequences that should actually be part of the downstream analysis? (I guess I am not fully comprehending how this process occurs).

Hopefully these questions are valid and make some sort of sense.

Micah

Kendra · June 21, 2019, 5:57pm

you insert gaps in the sequence so each column of the alignment has what we hope are evolutionarily similar bases. run head on the aligned silva file to see what this looks like, or open that alignment in an alignment viewer.
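
A toy illustration of why trimming by alignment column does not throw away organisms with longer or shorter variable regions. The two sequences, their names, and the column coordinates are invented for this sketch; a real reference alignment is far wider, and this is not how mothur itself is implemented.

```python
# Two made-up aligned sequences: same number of columns, different ungapped lengths.
# '-' marks a gap inserted so that comparable bases line up in the same column.
aligned = {
    "short_region": "ACGT--TTGA----CCGGA",
    "long_region":  "ACGTAATTGAGGCTCCGGA",
}

def trim_columns(aligned_seq, start, end):
    """Keep alignment columns start..end-1 (0-based), gaps included."""
    return aligned_seq[start:end]

def ungap(seq):
    """Remove gap characters to recover the raw sequence."""
    return seq.replace("-", "").replace(".", "")

# Hypothetical coordinates of the variable region in alignment space
start, end = 4, 14

trimmed = {name: trim_columns(seq, start, end) for name, seq in aligned.items()}
raw_lengths = {name: len(ungap(seq)) for name, seq in trimmed.items()}

print(trimmed)
print(raw_lengths)
```

Both sequences survive the trim with the same column coordinates even though their ungapped lengths differ, which is why the length cutoff still needs a buffer for the longer variants.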
