make.contigs - did not assemble

Dear colleagues,

I’m new to mothur and still learning from the tutorials. I am analyzing 50 soil community samples sequenced on a MiSeq. The fastq files have already been demultiplexed, and I’m working with two files per sample, forward and reverse (e.g., C2O4_S33_L001_R1_001.fastq and C2O4_S33_L001_R2_001.fastq).

So far I have taken the following steps:

  1. I already set the input and output directories

  2. I ran make.file(inputdir=…/raw, type=gz) to create stability.files

  3. I ran make.contigs(file=stability.files, processors=8). This step completed without problems.

  4. I ran summary.seqs(fasta=stability.trim.contigs.fasta). This is where the problem appears. I expected the sequences to be about 275 bp (V4 region), but they are longer than 400 bp (results below). I assumed the reads did not assemble. Any suggestions on how to solve this? I would be grateful for any help.

             Start  End      NBases   Ambigs   Polymer  NumSeqs
Minimum:     1      298      298      0        3        1
2.5%-tile:   1      441      441      0        4        167681
25%-tile:    1      444      444      0        5        1676804
Median:      1      460      460      0        5        3353608
75%-tile:    1      467      467      1        6        5030411
97.5%-tile:  1      471      471      13       8        6539534
Maximum:     1      602      602      59       301      6707214
Mean:        1      456.624  456.624  1.50144  5.4271

# of Seqs: 6707214


Best Regards,

Hi there,

I’d double check with your sequencing center that they really sequenced the V4 region. If you follow the Kozich method, your contigs should be ~250 nt. Even if they sequenced off the Illumina adapters and resequenced the barcodes and primers, the contigs would be shorter than 275 nt (at this point in the pipeline those should have been removed). If the reads didn’t assemble, you would have 500 nt contigs, i.e. two 250 nt reads placed end to end.
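
The length reasoning above can be sketched as simple arithmetic. This is only an illustration in Python, not something mothur runs: the function name `expected_contig_length` is hypothetical, the model assumes two untrimmed reads of equal length starting at opposite ends of the amplicon, and the example lengths (250 nt reads, a ~250 nt V4 amplicon, a ~460 nt amplicon matching the median in the summary above) are assumptions chosen to match the numbers in this thread.

```python
def expected_contig_length(amplicon_len, read_len):
    """Return (contig_len, overlap) under a simplified paired-end model.

    Assumption: both reads have the same length, start at opposite ends
    of the amplicon, and are not trimmed before assembly.
    """
    overlap = 2 * read_len - amplicon_len
    if overlap <= 0:
        # The reads never reach each other, so nothing can be merged;
        # at best the two reads end up placed end to end.
        return 2 * read_len, 0
    # The reads overlap, so the contig spans the amplicon. The overlap
    # cannot exceed the amplicon itself (reads longer than the amplicon
    # simply run off its end).
    return amplicon_len, min(overlap, amplicon_len)


# V4-sized amplicon with 2 x 250 nt reads: the reads overlap fully.
print(expected_contig_length(250, 250))

# Amplicon of ~460 nt (the median above) with 2 x 250 nt reads:
# the reads still overlap, so the contig is ~460 nt, not 500 nt.
print(expected_contig_length(460, 250))

# Hypothetical amplicon longer than both reads combined: no overlap,
# which gives the 500 nt case described above.
print(expected_contig_length(600, 250))
```

Under this model, a median of 460 nt is what an assembled amplicon of about that size would look like, whereas unassembled 250 nt read pairs would cluster at 500 nt.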

Pat

Dear Mr. Schloss,

Thank you for your guidance. I checked with the company and they sequenced the V3-V4 region, which explains the results. I’ll follow the steps for other regions described on the mothur blog.

Thank you again for your guidance. It was very clarifying and helped me a lot.

Best Regards,