Using pre-made contigs in a Mothur pipeline

sdbrehm · June 6, 2022, 3:21pm

Hello,

I consider myself a novice NGS and Mothur user. So while I understand some things, there are other things I may not understand deeply. Please bear with me.

I received my data set from LC Sciences and was expecting 2 x 250 reads of the V3-V4 region. What I received was two sets of data. One is the typical R1 and R2 raw files, but with no barcodes or adaptor key. The other is a set of “cleaned” contigs made by joining R1 and R2.

I have asked for the barcode/adaptor key so I can use make.contigs(). I have not received it yet, but they told me to use the cleaned contigs in my analysis.

Looking into the cleaned contigs, they are ~400 nt long, while the R1 and R2 reads are 250 nt long. I can manually align R1 and R2, and they form the 400 nt contig, with overhangs and an overlap of 20-30 nt. However, the R1 and R2 reads contain extra information that the cleaned version does not have:

r1 - CNTACGGGGGGCTGCAG
r2 - GGATTAGATACCCCAGTAGTCGA
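
The manual alignment described above can be sketched in a few lines of Python. This is only a naive illustration, not what mothur's make.contigs does: it reverse-complements R2 and looks for the longest exact overlap between the end of R1 and the start of the reverse-complemented R2, ignoring quality scores and mismatches. The function names, the `min_overlap` value, and the toy sequence are all made up for the example.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA string (A, C, G, T, N only)."""
    return seq.translate(str.maketrans("ACGTN", "TGCAN"))[::-1]


def merge_pair(r1, r2, min_overlap=10):
    """Join R1 and reverse-complemented R2 at their longest exact overlap.

    Returns (contig, overlap_length), or (None, 0) if no overlap of at
    least min_overlap bases is found. Hypothetical helper for illustration.
    """
    r2_rc = reverse_complement(r2)
    longest = min(len(r1), len(r2_rc))
    # Try the longest possible overlap first, then shrink.
    for k in range(longest, min_overlap - 1, -1):
        if r1[-k:] == r2_rc[:k]:
            return r1 + r2_rc[k:], k
    return None, 0


# Toy data (invented): a 40 nt fragment read from both ends with 25 nt reads.
fragment = "ACGTTGCAAGGCTTACCGATAGGCTAACGTTCAGGATCCA"
r1 = fragment[:25]
r2 = reverse_complement(fragment[15:])

contig, overlap = merge_pair(r1, r2)
print(contig, overlap)
```

Real paired reads contain sequencing errors, so an exact-match search like this would fail on many pairs; it is only meant to show how two 250 nt reads can collapse into a shorter contig.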

So right now I do not think I can use the make.contigs command without a key to remove these extra bits of information from my data set of 3.8 million raw R1 and R2 sequences. I also do not know of any mothur command that accepts the cleaned contigs as input, so that I can align them against SILVA for analysis.
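
As a rough illustration of what "a key to remove these extra bits" involves, the sketch below checks whether a read starts with a given primer, honoring IUPAC ambiguity codes, and trims it if so. The primer used is the widely published 341F V3-V4 forward primer; that the sequencing provider used it is an assumption, not something stated in this thread. The function names and the four-base tail appended to the read are hypothetical. In mothur, primers and barcodes are normally supplied through an oligos file rather than handled by hand like this.

```python
# IUPAC nucleotide codes mapped to the bases they stand for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def starts_with_primer(read, primer, max_mismatches=0):
    """True if the read begins with the primer, allowing IUPAC codes in
    the primer and treating an uncalled base (N) in the read as compatible."""
    if len(read) < len(primer):
        return False
    mismatches = 0
    for base, code in zip(read, primer):
        if base == "N":
            continue
        if base not in IUPAC[code]:
            mismatches += 1
    return mismatches <= max_mismatches


def strip_primer(read, primer, max_mismatches=0):
    """Return the read without its leading primer, or None if the primer is absent."""
    if starts_with_primer(read, primer, max_mismatches):
        return read[len(primer):]
    return None


# Assumed primer (341F); not confirmed by the sequencing provider.
FORWARD_PRIMER = "CCTACGGGNGGCWGCAG"

# The R1 prefix quoted above, followed by an invented 4-base tail.
read = "CNTACGGGGGGCTGCAG" + "TTGA"
print(strip_primer(read, FORWARD_PRIMER))
```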

If I understand my problem correctly, I need the barcode key to proceed, correct? Because there is not much I can do with the cleaned contigs?

Thank you.

Scott

pschloss · June 7, 2022, 2:32pm

Hi,

It would be best to get the original, raw fastq files. You will need the raw data for depositing to the SRA. You will need the barcodes to figure out which sequences go with each sample. There’s more to assembling the raw reads into contigs than just aligning them. The method outlined in Kozich et al. regularly shows much better error correction than what you could get by alignment alone or by using things like PANDAseq.
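
To illustrate why the raw quality scores matter, here is a simplified Python sketch of quality-aware consensus calling in the overlap between two reads. It is not mothur's implementation: it assumes the two reads are already aligned position by position with no gaps, and the rule used (keep the better base only when its quality score exceeds the other by at least `delta_q`, otherwise call N) along with the default of 6 are assumptions chosen for the example.

```python
def consensus_base(b1, q1, b2, q2, delta_q=6):
    """Pick a consensus base for one overlapping position.

    Agreeing bases are kept. Disagreeing bases are resolved in favor of the
    higher-quality call only if the quality gap is at least delta_q;
    otherwise the position is reported as ambiguous (N).
    """
    if b1 == b2:
        return b1
    if q1 - q2 >= delta_q:
        return b1
    if q2 - q1 >= delta_q:
        return b2
    return "N"


def overlap_consensus(seq1, qual1, seq2, qual2, delta_q=6):
    """Consensus over an already-aligned, gap-free overlap region."""
    return "".join(
        consensus_base(b1, q1, b2, q2, delta_q)
        for b1, q1, b2, q2 in zip(seq1, qual1, seq2, qual2)
    )


# Invented example: the reads disagree at the third position.
print(overlap_consensus("ACGT", [30, 30, 30, 30], "ACTT", [30, 30, 10, 30]))
print(overlap_consensus("ACGT", [30, 30, 30, 30], "ACTT", [30, 30, 28, 30]))
```

A pre-made contig without quality scores has already had these decisions made for it, which is why the original fastq files are preferable.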

Also, regarding the V3-V4 region, you might want to check out this blog post…

Pat

sdbrehm · June 7, 2022, 3:10pm

Pat,

Thank you.

I put in another request to get the barcodes for our raw data.

Yeah, I was a bit surprised at what they did, and I remember that blog post.

Sigh. Okay damage control!

Scott
