Analyzing seqs from different regions

Ray · May 13, 2022, 5:36pm

Hi all,

I’ve got seqs from the V3 and V8 regions from my collaborator (for each sample, I have files from both V3 and V8). Should I analyze them separately, or could I simply combine the data from the two regions and analyze them as a whole?

Thanks,
Ray

Ray · May 13, 2022, 6:59pm

Some of my thoughts here: I think I should align the V3 and V8 regions separately to guarantee good alignment quality. My main question is at what point I could combine the separate V3 and V8 analyses. Could I combine the fasta and count files of V3 and V8 as soon as I finish the alignment and filter steps? Could the merge.files command do the job?

Thanks

leocadio · May 16, 2022, 5:33pm

I would not combine… You do not know which V3 reads correspond to which V8 reads. Some lineages would be in both, some would be absent in one, biases would be different… Analyze them separately, IMO.

Ray · May 16, 2022, 6:32pm

Thanks for your reply! I am analyzing the V3 region. However, there are too many unique reads after the whole mothur process. Originally there were 30 million reads, 10 million of them unique; after mothur there are still 2 million unique reads, which is obviously too many. Do you have any idea how I can further reduce the number of unique reads? I’ve restricted the SILVA reference to the V3 region for the alignment step.

pschloss · May 17, 2022, 5:34pm

The V3 region is 195 nt long (see “Customize your reference alignment for your favorite region”). I’m not sure how long the V8 region is, but I think it’s probably also shorter than 250 nt.

If you are using 2x250 nt, then you likely still have barcodes and primers on the sequences, and it’s possible that you are sequencing beyond the length of the fragment. The error rate goes way up when your sequence reads are longer than the region. The MiSeq can be run with 2x250 chemistry while telling the instrument to generate only 195 nt in each direction.

If you are using a different read length, then you don’t have complete overlap of the two reads and won’t get adequate denoising of the data (see “Why do I have such a large distance matrix”).
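To make the read-length argument concrete, here is a minimal Python sketch of the arithmetic. It is only an illustration, not part of mothur: the function name `paired_read_overlap` and its output keys are made up for this example, and it assumes the fragment can be described by a single length (195 nt for V3, as quoted above), ignoring barcodes, primers, and quality trimming.

```python
def paired_read_overlap(fragment_len: int, read_len: int) -> dict:
    """Summarize how two paired-end reads of read_len cover one fragment.

    Hypothetical helper for illustration only. Assumes both reads have the
    same length and start at opposite ends of the fragment.
    """
    if fragment_len <= 0 or read_len <= 0:
        raise ValueError("lengths must be positive")

    # Each read can cover at most the whole fragment.
    covered = min(read_len, fragment_len)
    # Bases of the fragment that are covered by both reads.
    overlap = max(0, 2 * covered - fragment_len)
    # Bases each read generates past the far end of the fragment.
    read_through = max(0, read_len - fragment_len)

    return {
        "overlap_nt": overlap,
        "read_through_nt": read_through,
        "fully_overlapped": overlap == fragment_len,
    }


if __name__ == "__main__":
    # 195 nt is the V3 length quoted in the thread; the read lengths are examples.
    for read_len in (150, 195, 250):
        print(read_len, paired_read_overlap(195, read_len))
```

With these assumptions, 250 nt reads on a 195 nt fragment overlap fully but run 55 nt past the end of the fragment, 195 nt reads overlap fully with no read-through, and 150 nt reads leave part of the fragment covered by only one read.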

Ultimately, I think your problem is a data quality problem. Do you know how the sequencing data were generated?

Pat

leocadio · May 19, 2022, 1:15pm

Did you run pre.cluster? If you post your commands, we might be able to tell you whether the data are noisy or not…

Ray · May 25, 2022, 6:11pm

Thanks for your help! I think I may have figured out my problem: I’m working on the 16S V1-V3 region. The large number of uniques is probably caused by the high error rate of my data.

