make.contigs issues with read numbers

I posted this in Demultiplexing MiSeq Paired Reads in Oct '16. I am reposting it here, per Pat’s request.

I included NONE in the third column, since the samples were forward barcoded, and it ran!

I have been processing multiple sequencing runs and have run into an issue. From my understanding, when mothur states “it took XXX secs to process XYZ sequences” that includes all the forward and reverse sequences (from the R1 and R2 files). Once make.contigs has completed and we view summary.seqs(), we should have only half that number of sequences. If I am looking at this wrong, please correct me, but I can’t figure out why some of my sequencing runs produce more than half the total number of sequences after making the contigs. Examples below:

Run1 16S
It took 18107 secs to process 16,688,339 sequences.
Total of all groups is 6,738,318
(40%)

Run2 16S
It took 17512 secs to process 16,557,004 sequences.
Total of all groups is 12,642,768
(76%)

Run3 16S
It took 42626 secs to process 15,482,172 sequences.
Total of all groups is 5,720,164
(37%)

Run1 18S
It took 7417 secs to process 13,045,219 sequences.
Total of all groups is 10,458,033
(80%)

Run2 18S
It took 10243 secs to process 16,636,473 sequences.
Total of all groups is 7,408,169
(45%)

Run3 18S
It took 9785 secs to process 14,528,014 sequences.
Total of all groups is 12,350,637
(85%)

From my understanding, when mothur states “it took XXX secs to process XYZ sequences” that includes all the forward and reverse sequences (from the R1 and R2 files).

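Read that line as “process XYZ [pairs of] sequences”: the number reported counts read pairs, not individual R1 and R2 reads. Each pair can yield one contig, so you wouldn’t expect the total to be capped at 50% of that number. Does this help?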
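
To make the arithmetic concrete, here is a minimal Python sketch (not mothur code) that recomputes the percentages from the numbers posted above, treating the "process XYZ sequences" figure as a count of read pairs. The function name `fraction_assigned` and the `runs` dictionary are made up for illustration; only the numbers come from the post.

```python
# Numbers copied from the post: (pairs processed, "Total of all groups").
# Assumption: the first number counts read PAIRS, as explained in the reply.
runs = {
    "Run1 16S": (16_688_339, 6_738_318),
    "Run2 16S": (16_557_004, 12_642_768),
    "Run3 16S": (15_482_172, 5_720_164),
    "Run1 18S": (13_045_219, 10_458_033),
    "Run2 18S": (16_636_473, 7_408_169),
    "Run3 18S": (14_528_014, 12_350_637),
}


def fraction_assigned(pairs_processed: int, total_in_groups: int) -> float:
    """Share of processed read pairs that ended up in a group."""
    return total_in_groups / pairs_processed


for name, (pairs, grouped) in runs.items():
    share = fraction_assigned(pairs, grouped)
    print(f"{name}: {share:.1%} of read pairs assigned to a group")
```

Under this reading the upper bound is 100% of the reported number, not 50%, so every run above falls inside the possible range.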