Question about how "make.contigs" works

holyknightt · September 23, 2021, 11:49pm

Hello. I have a question about the make.contigs command.

The reads I got from our iSeq instrument are 151 bp long.

However, if you make forward and reverse contigs with the make.contigs syntax, a total sequence of nearly 300bp is created.

I expected it to be around 151bp.

I wonder why.

Please review.

Thank you.

PippaGrant · September 27, 2021, 8:12am

Good question. I need to think before I answer.

thuja · September 27, 2021, 6:36pm

It’s likely that your amplicons do not share any meaningful overlap, so make.contigs joined the forward and reverse reads using whatever it could find in common at the ends of the sequences. The “overlap” may be as short as a single bp. This can happen if the maximum read length of the sequencer (2x 150 bp) is shorter than the targeted region. Basically, your contigs might be junk. The forward (or reverse) reads could still be useful on their own though.
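To make the overlap arithmetic concrete, here is a minimal Python sketch. It assumes both reads are full length, untrimmed, and start at opposite ends of the amplicon. The function name `expected_overlap` and the example amplicon lengths are hypothetical illustrations, not values reported by mothur.

```python
def expected_overlap(read_len: int, amplicon_len: int) -> int:
    """Number of bases covered by both reads of a pair (0 if they cannot meet).

    Assumes full-length, untrimmed reads starting at opposite ends of the amplicon.
    """
    overlap = 2 * read_len - amplicon_len
    # The overlap can never be negative or longer than the amplicon itself.
    return max(0, min(overlap, amplicon_len))


# 2 x 151 bp reads against a few hypothetical amplicon lengths
for amplicon_len in (140, 250, 300, 320):
    print(amplicon_len, expected_overlap(151, amplicon_len))
```

When this number is zero or only a few bases, any join between the two reads rests on a chance match at the read ends, which would explain contigs close to the summed length of both reads.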

Can you share more information about your project? Like what is the overall research objective (metagenomics? gene profiling?), and what is the target region?

holyknightt · September 27, 2021, 11:18pm

Hello, thuja.

Thank you for your kind reply.

I am currently analyzing two types of environmental dust samples, along with tsetse fly samples from Africa.

I want to examine the distribution of bacteria in both sample types.

At the same time, I would also like to identify eukaryotic biota by amplifying the 18S V9 region.

As you said, we have confirmed that the reverse reads are in fact joined to the forward reads as their reverse complement.

So does that mean it’s pointless to build contigs from the amplicons I have?

I would like to build contigs by de novo assembly instead. Is there a reason that method cannot be used here?

If we cannot use this method, it will be difficult to classify down to species level with the current SILVA database, so we plan to run the analysis against NCBI’s ‘nr’ database.

holyknightt · September 27, 2021, 11:19pm

Thank you for your response.

leocadio · September 28, 2021, 7:17pm

Could you post the command line you are using? And the logfile, if possible.

holyknightt · September 28, 2021, 10:22pm

Hello, leocadio.

Thank you so much for showing interest in my post.

It is a little long, but I have posted the whole log below.

My plan is to run de novo assembly on the final file ‘stability.trim.contigs.good.unique.good.filter.unique.precluster.pick.pick.fasta’ produced by mothur.

I want to get exact species names by searching the resulting contigs against the NCBI ‘nr’ database.

Is it possible?

Please review.

Thank you.

Windows version

mothur v.1.45.3

Last updated: 5/8/21

by

Patrick D. Schloss

[NOTE]: Setting random seed to 19760620.

Interactive Mode

mothur >

align.seqs(fasta=C.tuberculostearicum.16S.V4.fasta, reference=silva.nr_v138_1.align)

Using 4 processors.

Reading in the silva.nr_v138_1.align template sequences... DONE.

It took 492 to read 146601 sequences.

Aligning sequences from C.tuberculostearicum.16S.V4.fasta ...

Reducing processors to 1.

It took 1 secs to align 1 sequences.

It took 12 seconds to align 1 sequences.

Output File Names:

C.tuberculostearicum.16S.V4.align

C.tuberculostearicum.16S.V4.align.report

mothur >

summary.seqs(fasta=C.tuberculostearicum.16S.V4.align)

Using 4 processors.

Start End NBases Ambigs Polymer NumSeqs

Minimum: 11895 25318 293 0 4 1

2.5%-tile: 11895 25318 293 0 4 1

25%-tile: 11895 25318 293 0 4 1

Median: 11895 25318 293 0 4 1

75%-tile: 11895 25318 293 0 4 1

97.5%-tile: 11895 25318 293 0 4 1

Maximum: 11895 25318 293 0 4 1

Mean: 11895 25318 293 0 4

# of Seqs: 1

It took 0 secs to summarize 1 sequences.

Output File Names:

C.tuberculostearicum.16S.V4.summary

mothur >

pcr.seqs(fasta=silva.nr_v138_1.align, start=11895, end=25318, keepdots=F, processors=8)

Using 8 processors.

[NOTE]: no sequences were bad, removing silva.nr_v138_1.bad.accnos

It took 460 secs to screen 146601 sequences.

Output File Names:

silva.nr_v138_1.pcr.align

mothur >

rename.file(input=silva.nr_v138_1.pcr.align, new=silva.v4.fasta)

Current files saved by mothur:

fasta=silva.nr_v138_1.pcr.align

processors=8

summary=C.tuberculostearicum.16S.V4.summary

mothur >

make.file(inputdir=., type=gz, prefix=stability)

Output File Names:

stability.files

mothur >

make.contigs(file=stability.files, processors=8)

Using 8 processors.

Group count:

M♂51 101721

M♂52 93826

M♂53 134029

M♂55 164051

M♂62 136451

M♂65 114612

M♀06 86802

M♀07 90197

M♀09 128188

M♀14 165063

M♀26 129797

Total of all groups is 1344737

It took 534 secs to process 1344737 sequences.

Output File Names:

stability.trim.contigs.fasta

stability.scrap.contigs.fasta

stability.contigs.report

stability.contigs.groups

mothur >

summary.seqs(fasta=stability.trim.contigs.fasta)

Using 8 processors.

Start End NBases Ambigs Polymer NumSeqs

Minimum: 1 146 146 0 3 1

2.5%-tile: 1 223 223 0 11 33619

25%-tile: 1 279 279 0 13 336185

Median: 1 281 281 0 14 672369

75%-tile: 1 297 297 3 57 1008553

97.5%-tile: 1 298 298 14 59 1311119

Maximum: 1 302 302 56 118 1344737

Mean: 1 283 283 2 26

# of Seqs: 1344737

It took 48 secs to summarize 1344737 sequences.

Output File Names:

stability.trim.contigs.summary

mothur >

screen.seqs(fasta=stability.trim.contigs.fasta, group=stability.contigs.groups, maxambig=0, maxlength=300)

Using 8 processors.

It took 35 secs to screen 1344737 sequences, removed 407523.

/******************************************/

Running command: remove.seqs(accnos=stability.trim.contigs.bad.accnos.temp, group=stability.contigs.groups)

Removed 407523 sequences from your group file.

Output File Names:

stability.contigs.pick.groups

/******************************************/

Output File Names:

stability.trim.contigs.good.fasta

stability.trim.contigs.bad.accnos

stability.contigs.good.groups

It took 71 secs to screen 1344737 sequences.

mothur >

summary.seqs(fasta=stability.trim.contigs.good.fasta)

Using 8 processors.

Start End NBases Ambigs Polymer NumSeqs

Minimum: 1 148 148 0 11 1

2.5%-tile: 1 275 275 0 11 23431

25%-tile: 1 280 280 0 13 234304

Median: 1 295 295 0 15 468608

75%-tile: 1 298 298 0 57 702911

97.5%-tile: 1 298 298 0 60 913784

Maximum: 1 300 300 0 110 937214

Mean: 1 287 287 0 32

# of Seqs: 937214

It took 28 secs to summarize 937214 sequences.

Output File Names:

stability.trim.contigs.good.summary

mothur >

unique.seqs(fasta=stability.trim.contigs.good.fasta)

937214 914376

Output File Names:

stability.trim.contigs.good.names

stability.trim.contigs.good.unique.fasta

mothur >

count.seqs(name=stability.trim.contigs.good.names, group=stability.contigs.good.groups)

It took 30 secs to create a table for 937214 sequences.

Total number of sequences: 937214

Output File Names:

stability.trim.contigs.good.count_table

mothur >

summary.seqs(count=stability.trim.contigs.good.count_table)

Using stability.trim.contigs.good.unique.fasta as input file for the fasta parameter.

Using 8 processors.

Start End NBases Ambigs Polymer NumSeqs

Minimum: 1 148 148 0 11 1

2.5%-tile: 1 275 275 0 11 23431

25%-tile: 1 280 280 0 13 234304

Median: 1 295 295 0 15 468608

75%-tile: 1 298 298 0 57 702911

97.5%-tile: 1 298 298 0 60 913784

Maximum: 1 300 300 0 110 937214

Mean: 1 287 287 0 32

# of unique seqs: 914376

total # of seqs: 937214

It took 38 secs to summarize 937214 sequences.

Output File Names:

stability.trim.contigs.good.unique.summary

mothur >

align.seqs(fasta= stability.trim.contigs.good.unique.fasta, reference=silva.v4.fasta, flip=t)

Using 4 processors.

Reading in the silva.v4.fasta template sequences... DONE.

It took 75 to read 146601 sequences.

Aligning sequences from stability.trim.contigs.good.unique.fasta ...

It took 3763 secs to align 914376 sequences.

[WARNING]: 422435 of your sequences generated alignments that eliminated too many bases, a list is provided in stability.trim.contigs.good.unique.flip.accnos.

[NOTE]: 163103 of your sequences were reversed to produce a better alignment.

It took 3770 seconds to align 914376 sequences.

Output File Names:

stability.trim.contigs.good.unique.align

stability.trim.contigs.good.unique.align.report

stability.trim.contigs.good.unique.flip.accnos

mothur >

summary.seqs(fasta=silva.v4.fasta)

Using 4 processors.

Start End NBases Ambigs Polymer NumSeqs

Minimum: 1 9875 83 0 3 1

2.5%-tile: 2 13423 289 0 3 3666

25%-tile: 2 13423 291 0 4 36651

Median: 2 13423 291 0 5 73301

75%-tile: 2 13423 291 0 5 109951

97.5%-tile: 2 13423 458 1 6 142936

Maximum: 4226 13423 1519 5 16 146601

Mean: 2 13422 308 0 4

# of Seqs: 146601

It took 72 secs to summarize 146601 sequences.

Output File Names:

silva.v4.summary

mothur >

screen.seqs(fasta=stability.trim.contigs.good.unique.align, count=stability.trim.contigs.good.count_table, start= 2, end= 13423, maxhomop=16)

Using 4 processors.

It took 249 secs to screen 914376 sequences, removed 886183.

/******************************************/

Running command: remove.seqs(accnos=stability.trim.contigs.good.unique.bad.accnos.temp, count=stability.trim.contigs.good.count_table)

Removed 908602 sequences from your count file.

Output File Names:

stability.trim.contigs.good.pick.count_table

/******************************************/

Output File Names:

stability.trim.contigs.good.unique.good.align

stability.trim.contigs.good.unique.bad.accnos

stability.trim.contigs.good.good.count_table

It took 514 secs to screen 914376 sequences.

mothur >

summary.seqs(fasta=current, count=current)

Using stability.trim.contigs.good.good.count_table as input file for the count parameter.

Using stability.trim.contigs.good.unique.good.align as input file for the fasta parameter.

Using 4 processors.

Start End NBases Ambigs Polymer NumSeqs

Minimum: 1 13423 261 0 11 1

2.5%-tile: 1 13423 295 0 11 716

25%-tile: 2 13423 297 0 13 7154

Median: 2 13423 297 0 13 14307

75%-tile: 2 13423 297 0 14 21460

97.5%-tile: 2 13423 298 0 15 27897

Maximum: 2 13423 300 0 16 28612

Mean: 1 13423 296 0 13

# of unique seqs: 28193

total # of seqs: 28612

It took 14 secs to summarize 28612 sequences.

Output File Names:

stability.trim.contigs.good.unique.good.summary

mothur >

filter.seqs(fasta=stability.trim.contigs.good.unique.good.align,vertical=T,trump=.)

Using 4 processors.

Creating Filter...

It took 8 secs to create filter for 28193 sequences.

Running Filter...

It took 7 secs to filter 28193 sequences.

Length of filtered alignment: 545

Number of columns removed: 12878

Length of the original alignment: 13423

Number of sequences used to construct filter: 28193

Output File Names:

stability.filter

stability.trim.contigs.good.unique.good.filter.fasta

mothur >

unique.seqs(fasta=stability.trim.contigs.good.unique.good.filter.fasta, count= stability.trim.contigs.good.good.count_table)

28193 28192

Output File Names:

stability.trim.contigs.good.unique.good.filter.count_table

stability.trim.contigs.good.unique.good.filter.unique.fasta

mothur >

pre.cluster(fasta=stability.trim.contigs.good.unique.good.filter.unique.fasta, count=stability.trim.contigs.good.unique.good.filter.count_table, diffs=2)

Using 4 processors.

/******************************************/

Running command: split.groups(groups=M♂51-M♂52-M♂53-M♂55-M♂62-M♂65-M♀06-M♀07-M♀09-M♀14-M♀26, fasta=stability.trim.contigs.good.unique.good.filter.unique.fasta, count=stability.trim.contigs.good.unique.good.filter.count_table)

Using 4 processors.

/******************************************/

Running command: get.seqs(dups=f, accnos=stability.trim.contigs.good.unique.good.filter.M♂53.count_table.accnos, fasta=stability.trim.contigs.good.unique.good.filter.unique.fastaM♂53)

/******************************************/

Running command: get.seqs(dups=f, accnos=stability.trim.contigs.good.unique.good.filter.M♂51.count_table.accnos, fasta=stability.trim.contigs.good.unique.good.filter.unique.fastaM♂51)

Running command: get.seqs(dups=f, accnos=stability.trim.contigs.good.unique.good.filter.M♂65.count_table.accnos, fasta=stability.trim.contigs.good.unique.good.filter.unique.fastaM♂65)

Running command: get.seqs(dups=f, accnos=stability.trim.contigs.good.unique.good.filter.M♀09.count_table.accnos, fasta=stability.trim.contigs.good.unique.good.filter.unique.fastaM♀09)

Selected 332 sequences from your fasta file.

Output File Names:

stability.trim.contigs.good.unique.good.filter.unique.pick.fastaM♂53

/******************************************/

Done.


Running command: get.seqs(dups=f, accnos=stability.trim.contigs.good.unique.good.filter.M♂55.count_table.accnos, fasta=stability.trim.contigs.good.unique.good.filter.unique.fastaM♂55)

Selected 2827 sequences from your fasta file.

Output File Names:

stability.trim.contigs.good.unique.good.filter.unique.pick.fastaM♂51

Selected 3118 sequences from your fasta file.

Output File Names:

stability.trim.contigs.good.unique.good.filter.unique.pick.fastaM♀09

/******************************************/

Done.

Selected 3615 sequences from your fasta file.

Output File Names:

stability.trim.contigs.good.unique.good.filter.unique.pick.fastaM♂65

/******************************************/

Done.


Running command: get.seqs(dups=f, accnos=stability.trim.contigs.good.unique.good.filter.M♂52.count_table.accnos, fasta=stability.trim.contigs.good.unique.good.filter.unique.fastaM♂52)

/******************************************/


Selected 2567 sequences from your fasta file.

Running command: get.seqs(dups=f, accnos=stability.trim.contigs.good.unique.good.filter.M♀14.count_table.accnos, fasta=stability.trim.contigs.good.unique.good.filter.unique.fastaM♀14)

/******************************************/

Done.


Running command: get.seqs(dups=f, accnos=stability.trim.contigs.good.unique.good.filter.M♂52.count_table.accnos, fasta=stability.trim.contigs.good.unique.good.filter.unique.fastaM♂52)

/******************************************/


Selected 2567 sequences from your fasta file.

Running command: get.seqs(dups=f, accnos=stability.trim.contigs.good.unique.good.filter.M♀06.count_table.accnos, fasta=stability.trim.contigs.good.unique.good.filter.unique.fastaM♀06)

Output File Names:

stability.trim.contigs.good.unique.good.filter.unique.pick.fastaM♂55

/******************************************/

Done.


Running command: get.seqs(dups=f, accnos=stability.trim.contigs.good.unique.good.filter.M♂62.count_table.accnos, fasta=stability.trim.contigs.good.unique.good.filter.unique.fastaM♂62)

Selected 2706 sequences from your fasta file.

Output File Names:

stability.trim.contigs.good.unique.good.filter.unique.pick.fastaM♂52

/******************************************/

Done.

Selected 2496 sequences from your fasta file.

Output File Names:

stability.trim.contigs.good.unique.good.filter.unique.pick.fastaM♀06

/******************************************/

Done.

Selected 2845 sequences from your fasta file.

Output File Names:

stability.trim.contigs.good.unique.good.filter.unique.pick.fastaM♀14

/******************************************/

Done.

Selected 2814 sequences from your fasta file.

Output File Names:

stability.trim.contigs.good.unique.good.filter.unique.pick.fastaM♂62

/******************************************/

Done.


Running command: get.seqs(dups=f, accnos=stability.trim.contigs.good.unique.good.filter.M♀07.count_table.accnos, fasta=stability.trim.contigs.good.unique.good.filter.unique.fastaM♀07)

/******************************************/

Running command: get.seqs(dups=f, accnos=stability.trim.contigs.good.unique.good.filter.M♀26.count_table.accnos, fasta=stability.trim.contigs.good.unique.good.filter.unique.fastaM♀26)

Selected 2430 sequences from your fasta file.

Output File Names:

stability.trim.contigs.good.unique.good.filter.unique.pick.fastaM♀07

/******************************************/

Done.

Selected 2751 sequences from your fasta file.

Output File Names:

stability.trim.contigs.good.unique.good.filter.unique.pick.fastaM♀26

/******************************************/

Done.

Output File Names:

stability.trim.contigs.good.unique.good.filter.M♂51.count_table

stability.trim.contigs.good.unique.good.filter.unique.M♂51.fasta

stability.trim.contigs.good.unique.good.filter.M♂52.count_table

stability.trim.contigs.good.unique.good.filter.unique.M♂52.fasta

stability.trim.contigs.good.unique.good.filter.M♂53.count_table

stability.trim.contigs.good.unique.good.filter.unique.M♂53.fasta

stability.trim.contigs.good.unique.good.filter.M♂55.count_table

stability.trim.contigs.good.unique.good.filter.unique.M♂55.fasta

stability.trim.contigs.good.unique.good.filter.M♂62.count_table

stability.trim.contigs.good.unique.good.filter.unique.M♂62.fasta

stability.trim.contigs.good.unique.good.filter.M♂65.count_table

stability.trim.contigs.good.unique.good.filter.unique.M♂65.fasta

stability.trim.contigs.good.unique.good.filter.M♀06.count_table

stability.trim.contigs.good.unique.good.filter.unique.M♀06.fasta

stability.trim.contigs.good.unique.good.filter.M♀07.count_table

stability.trim.contigs.good.unique.good.filter.unique.M♀07.fasta

stability.trim.contigs.good.unique.good.filter.M♀09.count_table

stability.trim.contigs.good.unique.good.filter.unique.M♀09.fasta

stability.trim.contigs.good.unique.good.filter.M♀14.count_table

stability.trim.contigs.good.unique.good.filter.unique.M♀14.fasta

stability.trim.contigs.good.unique.good.filter.M♀26.count_table

stability.trim.contigs.good.unique.good.filter.unique.M♀26.fasta

/******************************************/

Processing group M♂53:

Processing group M♂65:

M♂53 332 332 0

Processing group M♀09:

Total number of sequences before pre.cluster was 332.

pre.cluster removed 0 sequences.

It took 0 secs to cluster 332 sequences.

Processing group M♂55:

Processing group M♂51:

M♂55 2567 2548 19

Total number of sequences before pre.cluster was 2567.

pre.cluster removed 19 sequences.

It took 1 secs to cluster 2567 sequences.

Processing group M♂62:

M♂51 2827 2827 0

Total number of sequences before pre.cluster was 2827.

pre.cluster removed 0 sequences.

It took 1 secs to cluster 2827 sequences.

Processing group M♂52:

M♀09 3118 3099 19

Total number of sequences before pre.cluster was 3118.

pre.cluster removed 19 sequences.

It took 1 secs to cluster 3118 sequences.

Processing group M♀14:

M♂65 3615 3584 31

Total number of sequences before pre.cluster was 3615.

pre.cluster removed 31 sequences.

It took 1 secs to cluster 3615 sequences.

Processing group M♀06:

M♂62 2814 2803 11

M♂52 2706 2705 1

Total number of sequences before pre.cluster was 2814.

Total number of sequences before pre.cluster was 2706.

pre.cluster removed 11 sequences.

pre.cluster removed 1 sequences.

It took 0 secs to cluster 2814 sequences.

It took 0 secs to cluster 2706 sequences.

M♀14 2845 2817 28

M♀06 2496 2487 9

Total number of sequences before pre.cluster was 2496.

pre.cluster removed 9 sequences.

Total number of sequences before pre.cluster was 2845.

pre.cluster removed 28 sequences.

It took 0 secs to cluster 2496 sequences.

Processing group M♀07:

It took 0 secs to cluster 2845 sequences.

Processing group M♀26:

M♀07 2430 2416 14

Total number of sequences before pre.cluster was 2430.

pre.cluster removed 14 sequences.

It took 1 secs to cluster 2430 sequences.

M♀26 2751 2731 20

Total number of sequences before pre.cluster was 2751.

pre.cluster removed 20 sequences.

It took 1 secs to cluster 2751 sequences.

Deconvoluting count table results...

It took 0 secs to merge 28349 sequences group data.

/******************************************/

Running command: get.seqs(fasta=stability.trim.contigs.good.unique.good.filter.unique.fasta, accnos=stability.trim.contigs.good.unique.good.filter.unique.precluster.count_table.temp)

Selected 28073 sequences from your fasta file.

Output File Names:

stability.trim.contigs.good.unique.good.filter.unique.pick.fasta

/******************************************/

It took 13 secs to run pre.cluster.

Using 4 processors.

Output File Names:

stability.trim.contigs.good.unique.good.filter.unique.precluster.fasta

stability.trim.contigs.good.unique.good.filter.unique.precluster.count_table

stability.trim.contigs.good.unique.good.filter.unique.precluster.M♂51.map

stability.trim.contigs.good.unique.good.filter.unique.precluster.M♂52.map

stability.trim.contigs.good.unique.good.filter.unique.precluster.M♂53.map

stability.trim.contigs.good.unique.good.filter.unique.precluster.M♂55.map

stability.trim.contigs.good.unique.good.filter.unique.precluster.M♂62.map

stability.trim.contigs.good.unique.good.filter.unique.precluster.M♂65.map

stability.trim.contigs.good.unique.good.filter.unique.precluster.M♀06.map

stability.trim.contigs.good.unique.good.filter.unique.precluster.M♀07.map

stability.trim.contigs.good.unique.good.filter.unique.precluster.M♀09.map

stability.trim.contigs.good.unique.good.filter.unique.precluster.M♀14.map

stability.trim.contigs.good.unique.good.filter.unique.precluster.M♀26.map

mothur >

chimera.uchime(fasta=stability.trim.contigs.good.unique.good.filter.unique.precluster.fasta, count=stability.trim.contigs.good.unique.good.filter.unique.precluster.count_table, dereplicate=t)

Using 4 processors.

[DEBUG]: uchime location using D:\mothur_Tsetse_silva_v138_1_V4_210925\uchime.exe

Output File Names:

stability.trim.contigs.good.unique.good.filter.unique.precluster.M♂51.count_table

stability.trim.contigs.good.unique.good.filter.unique.precluster.M♂51.fasta

stability.trim.contigs.good.unique.good.filter.unique.precluster.M♂52.count_table

stability.trim.contigs.good.unique.good.filter.unique.precluster.M♂52.fasta

stability.trim.contigs.good.unique.good.filter.unique.precluster.M♂53.count_table

stability.trim.contigs.good.unique.good.filter.unique.precluster.M♂53.fasta

stability.trim.contigs.good.unique.good.filter.unique.precluster.M♂55.count_table

stability.trim.contigs.good.unique.good.filter.unique.precluster.M♂55.fasta

stability.trim.contigs.good.unique.good.filter.unique.precluster.M♂62.count_table

stability.trim.contigs.good.unique.good.filter.unique.precluster.M♂62.fasta

stability.trim.contigs.good.unique.good.filter.unique.precluster.M♂65.count_table

stability.trim.contigs.good.unique.good.filter.unique.precluster.M♂65.fasta

stability.trim.contigs.good.unique.good.filter.unique.precluster.M♀06.count_table

stability.trim.contigs.good.unique.good.filter.unique.precluster.M♀06.fasta

stability.trim.contigs.good.unique.good.filter.unique.precluster.M♀07.count_table

stability.trim.contigs.good.unique.good.filter.unique.precluster.M♀07.fasta

stability.trim.contigs.good.unique.good.filter.unique.precluster.M♀09.count_table

stability.trim.contigs.good.unique.good.filter.unique.precluster.M♀09.fasta

stability.trim.contigs.good.unique.good.filter.unique.precluster.M♀14.count_table

stability.trim.contigs.good.unique.good.filter.unique.precluster.M♀14.fasta

stability.trim.contigs.good.unique.good.filter.unique.precluster.M♀26.count_table

stability.trim.contigs.good.unique.good.filter.unique.precluster.M♀26.fasta

/******************************************/

It took 2 secs to check 332 sequences from group M♂53.

It took 110 secs to check 2548 sequences from group M♂55.

It took 117 secs to check 2827 sequences from group M♂51.

It took 151 secs to check 3099 sequences from group M♀09.

It took 205 secs to check 3584 sequences from group M♂65.

It took 120 secs to check 2705 sequences from group M♂52.

It took 126 secs to check 2803 sequences from group M♂62.

It took 113 secs to check 2817 sequences from group M♀14.

It took 80 secs to check 2487 sequences from group M♀06.

It took 66 secs to check 2731 sequences from group M♀26.

It took 50 secs to check 2416 sequences from group M♀07.

It took 335 secs to check 28349 sequences.

Output File Names:

stability.trim.contigs.good.unique.good.filter.unique.precluster.denovo.uchime.pick.count_table

stability.trim.contigs.good.unique.good.filter.unique.precluster.denovo.uchime.chimeras

stability.trim.contigs.good.unique.good.filter.unique.precluster.denovo.uchime.accnos

mothur >

remove.seqs(fasta=stability.trim.contigs.good.unique.good.filter.unique.precluster.fasta, accnos=stability.trim.contigs.good.unique.good.filter.unique.precluster.denovo.uchime.accnos)

Removed 15 sequences from your fasta file.

Output File Names:

stability.trim.contigs.good.unique.good.filter.unique.precluster.pick.fasta

mothur >

summary.seqs(fasta=current)

Using stability.trim.contigs.good.unique.good.filter.unique.precluster.pick.fasta as input file for the fasta parameter.

Using 4 processors.

Start End NBases Ambigs Polymer NumSeqs

Minimum: 1 545 261 0 11 1

2.5%-tile: 1 545 295 0 11 702

25%-tile: 1 545 297 0 13 7015

Median: 1 545 297 0 13 14030

75%-tile: 1 545 297 0 14 21044

97.5%-tile: 1 545 298 0 15 27357

Maximum: 1 545 300 0 16 28058

Mean: 1 545 296 0 13

# of Seqs: 28058

It took 1 secs to summarize 28058 sequences.

Output File Names:

stability.trim.contigs.good.unique.good.filter.unique.precluster.pick.summary

mothur >

classify.seqs(fasta=stability.trim.contigs.good.unique.good.filter.unique.precluster.pick.fasta, count=stability.trim.contigs.good.unique.good.filter.unique.precluster.denovo.uchime.pick.count_table, reference=silva.seed_v138_1.align, taxonomy=silva.seed_v138_1.tax)

Using 4 processors.

Generating search database... DONE.

It took 21 seconds generate search database.

Reading in the silva.seed_v138_1.tax taxonomy... DONE.

Calculating template taxonomy tree... DONE.

Calculating template probabilities... DONE.

It took 39 seconds get probabilities.

Classifying sequences from stability.trim.contigs.good.unique.good.filter.unique.precluster.pick.fasta ...

[WARNING]: FS10001784_2_BRB11601-2529_1_1110_10900_1340 could not be classified. You can use the remove.lineage command with taxon=unknown; to remove such sequences.

[WARNING]: FS10001784_2_BRB11601-2529_1_1114_10570_3530 could not be classified. You can use the remove.lineage command with taxon=unknown; to remove such sequences.

[WARNING]: FS10001784_2_BRB11601-2529_1_1105_8300_2860 could not be classified. You can use the remove.lineage command with taxon=unknown; to remove such sequences.

[WARNING]: FS10001784_2_BRB11601-2529_1_1112_3350_1570 could not be classified. You can use the remove.lineage command with taxon=unknown; to remove such sequences.

[WARNING]: FS10001784_2_BRB11601-2529_1_1112_3210_2000 could not be classified. You can use the remove.lineage command with taxon=unknown; to remove such sequences.

[WARNING]: FS10001784_2_BRB11601-2529_1_1113_12690_1820 could not be classified. You can use the remove.lineage command with taxon=unknown; to remove such sequences.

[WARNING]: FS10001784_2_BRB11601-2529_1_1104_14980_2220 could not be classified. You can use the remove.lineage command with taxon=unknown; to remove such sequences.

[WARNING]: FS10001784_2_BRB11601-2529_1_1102_10060_2120 could not be classified. You can use the remove.lineage command with taxon=unknown; to remove such sequences.

[WARNING]: FS10001784_2_BRB11601-2529_1_1106_13570_1290 could not be classified. You can use the remove.lineage command with taxon=unknown; to remove such sequences.

[WARNING]: FS10001784_2_BRB11601-2529_1_1110_5770_3210 could not be classified. You can use the remove.lineage command with taxon=unknown; to remove such sequences.

**** Exceeded maximum allowed command warnings, silencing warnings ****

It took 160 secs to classify 28058 sequences.

It took 2 secs to create the summary file for 28058 sequences.

Output File Names:

stability.trim.contigs.good.unique.good.filter.unique.precluster.pick.seed_v138_1.wang.taxonomy

stability.trim.contigs.good.unique.good.filter.unique.precluster.pick.seed_v138_1.wang.tax.summary

mothur >

remove.lineage(fasta=stability.trim.contigs.good.unique.good.filter.unique.precluster.pick.fasta, count=stability.trim.contigs.good.unique.good.filter.unique.precluster.denovo.uchime.pick.count_table, taxonomy= stability.trim.contigs.good.unique.good.filter.unique.precluster.pick.seed_v138_1.wang.taxonomy, taxon=Chloroplast-Mitochondria-unknown-Archaea-Eukaryota-unclassified)

/******************************************/

Running command: remove.seqs(accnos=stability.trim.contigs.good.unique.good.filter.unique.precluster.pick.seed_v138_1.wang.accnos, count=stability.trim.contigs.good.unique.good.filter.unique.precluster.denovo.uchime.pick.count_table, fasta=stability.trim.contigs.good.unique.good.filter.unique.precluster.pick.fasta)

Removed 20 sequences from your fasta file.

Removed 20 sequences from your count file.

Output File Names:

stability.trim.contigs.good.unique.good.filter.unique.precluster.pick.pick.fasta

stability.trim.contigs.good.unique.good.filter.unique.precluster.denovo.uchime.pick.pick.count_table

/******************************************/

Output File Names:

stability.trim.contigs.good.unique.good.filter.unique.precluster.pick.seed_v138_1.wang.pick.taxonomy

stability.trim.contigs.good.unique.good.filter.unique.precluster.pick.seed_v138_1.wang.accnos

stability.trim.contigs.good.unique.good.filter.unique.precluster.denovo.uchime.pick.pick.count_table

stability.trim.contigs.good.unique.good.filter.unique.precluster.pick.pick.fasta

mothur >

summary.tax(taxonomy=current, count=current)

Using stability.trim.contigs.good.unique.good.filter.unique.precluster.denovo.uchime.pick.pick.count_table as input file for the count parameter.

Using stability.trim.contigs.good.unique.good.filter.unique.precluster.pick.seed_v138_1.wang.pick.taxonomy as input file for the taxonomy parameter.

It took 1 secs to create the summary file for 28574 sequences.

Output File Names:

stability.trim.contigs.good.unique.good.filter.unique.precluster.pick.seed_v138_1.wang.pick.tax.summary

leocadio · September 29, 2021, 12:26am

I think there is some confusion here. According to your file names, what you have is 16S V4 (prokaryotes), and I think that amplicon is about 300 bp.

The 18S V9 region is certainly about 131 bp in metazoans (is that what you were expecting at about 150 bp?). However, it does not give species resolution: genus if you are lucky, and in general family-level resolution.

For your original question (“However, if you make forward and reverse contigs with the make.contigs syntax, a total sequence of nearly 300bp is created. I expected it to be around 151bp.”), we would need the logfile of that step. But I think you might be mixing up 18S V9 and 16S V4.

pschloss · September 30, 2021, 1:07pm

To back up leocadio: the V4 region with primers and barcodes will be about 290 bp. Regardless, you’re going to get minimal overlap between the 151 nt reads (about 10 nt?). The V4 region without barcodes and primers is right around 250 nt long. You really need fully overlapping sequence reads like you would get with 2x250 nt reads of the V4 region. If you don’t, you’ll get very high sequencing error rates for the assembled reads.
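As a rough illustration of these numbers, the following Python sketch compares how much of the amplicon is read twice with 2x151 and 2x250 reads. The lengths of 290 and 250 are the approximate figures from the post. The helper name `double_covered_fraction` and the assumption of full-length, untrimmed reads are hypothetical simplifications.

```python
def double_covered_fraction(read_len: int, amplicon_len: int) -> float:
    """Fraction of amplicon positions sequenced by both the forward and reverse read.

    Assumes full-length, untrimmed reads starting at opposite ends of the amplicon.
    """
    overlap = max(0, min(2 * read_len - amplicon_len, amplicon_len))
    return overlap / amplicon_len


cases = [
    ("2x151, V4 with primers and barcodes (~290 bp)", 151, 290),
    ("2x250, V4 with primers and barcodes (~290 bp)", 250, 290),
    ("2x250, V4 without primers and barcodes (~250 bp)", 250, 250),
]
for label, read_len, amplicon_len in cases:
    fraction = double_covered_fraction(read_len, amplicon_len)
    print(f"{label}: {fraction:.0%} of positions covered by both reads")
```

Under these assumptions the 2x151 case overlaps by 2 x 151 - 290 = 12 nt, in line with the "about 10 nt" estimate. Positions covered by only one read cannot have their base calls checked against the other read, which is why full overlap matters for the error rate.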

Also, it is impossible to reliably obtain species level taxonomies from a 16S or 18S rRNA gene sequence. Anyone telling you otherwise is stretching things. You would need genome sequences to get that level of resolution. Just because you see a species name on a sequence in the nr database, doesn’t mean that you can get that level of resolution for an unknown, short sequence.

Pat

holyknightt · October 4, 2021, 8:53am

Sorry for the confusion, leocadio.

The mothur log I posted is for 16S V4.

The 18S V9 log has not been uploaded.

holyknightt · October 4, 2021, 8:53am

Thank you for your kind explanation, pschloss. Thanks again for answering my basic question. ^^

system · October 14, 2021, 8:53am

This topic was automatically closed 10 days after the last reply. New replies are no longer allowed.
