make.contigs with an oligos file creates new group names

I’ve finally upgraded to 1.40, but make.contigs now produces group names that differ from the ones it produced in 1.39.5.
I ran both versions back to back the other day to make sure.

Here it is in 1.39.5, with output that I expect.
mothur > make.contigs(ffastq=Run4_18S_R1.fastq, rfastq=Run4_18S_R2.fastq, rindex=Run4_18S_I1.fastq, oligos=Run4_18S_oligos.txt, processors=2)

Using 2 processors.
Making contigs…
Done.
It took 5909 secs to process 16561081 sequences.

Group count:
DNA_JL1001 88891
DNA_JL1002 30488
DNA_JL1003 64703
DNA_JL1004 46793
DNA_JL1005 63429
DNA_JL1006 49208
DNA_JL1007 51655
DNA_JL1008 44797
DNA_JL1009 63942
DNA_JL1010 42425

And here it is in 1.40, where every group name has .DNA_JL1001 appended to the end.
mothur > make.contigs(ffastq=Run4_18S_R1.fastq, rfastq=Run4_18S_R2.fastq, rindex=Run4_18S_I1.fastq, oligos=Run4_18S_oligos.txt, processors=2)

Using 2 processors.
Making contigs…
Done.

Group count:
DNA_JL1001.DNA_JL1001 88891
DNA_JL1002.DNA_JL1001 30488
DNA_JL1003.DNA_JL1001 64703
DNA_JL1004.DNA_JL1001 46793
DNA_JL1005.DNA_JL1001 63429
DNA_JL1006.DNA_JL1001 49208
DNA_JL1007.DNA_JL1001 51655
DNA_JL1008.DNA_JL1001 44797
DNA_JL1009.DNA_JL1001 63942
DNA_JL1010.DNA_JL1001 42425

This plays havoc with my downstream analysis, because the group names no longer match what they are in all the other databases.

Is the “.DNA_JL1001” appended to the group names in your oligos file?

Sorry for the delay, due to hurricane evacuation and vacation.

The oligos file looks like this:

barcode NONE GGTGAAGATACA DNA_JL1001
barcode NONE GACGAGTCAGTC DNA_JL1002
barcode NONE AGCTATCCACGA DNA_JL1003
barcode NONE ATGATGACCCGT DNA_JL1004
barcode NONE ACGTAAATCGCC DNA_JL1005

etc.

There are no column headers.
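A quick way to confirm that the suffix is not coming from the oligos file itself is to list the group names it defines. The sketch below is only an illustration, not part of mothur: it assumes that each `barcode` line carries the group name as its last whitespace-separated field, as in the excerpt above, and the function name `read_oligos_groups` is hypothetical.

```python
def read_oligos_groups(lines):
    """Collect group names from 'barcode' lines of an oligos file.

    Assumption: the group name is the last whitespace-separated field.
    """
    groups = []
    for line in lines:
        fields = line.split()
        # Skip blank lines, comment lines, and anything that is not a barcode line.
        if not fields or fields[0].startswith("#") or fields[0].lower() != "barcode":
            continue
        groups.append(fields[-1])
    return groups


# Lines copied from the oligos excerpt in this thread.
sample = """\
barcode NONE GGTGAAGATACA DNA_JL1001
barcode NONE GACGAGTCAGTC DNA_JL1002
barcode NONE AGCTATCCACGA DNA_JL1003
"""

groups = read_oligos_groups(sample.splitlines())
# Group names that already contain a dot would point at the oligos file.
suspicious = [g for g in groups if "." in g]

print(groups)
print(suspicious)
```

For the lines shown, none of the group names contains a dot, which suggests the suffix is being added during make.contigs rather than read from the file.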

Are you seeing this issue with our pre-release 1.41.0, https://github.com/mothur/mothur/releases/tag/v1.41.0.pre-release?

I’ve tried it with 1.41 and got the same result.

(As an unrelated comment, I really like the “Current files saved by mothur:” output at the bottom of my script. Very helpful.)

I also received this email from a colleague, working on a different set of sequences, with his own commands file. He’s having the same experience.

"Hi Joe,
I have a couple of questions.
I am finally able to work a little on my GoM sequences.
It has been so long since I was able to do that that I thought it best to start from the beginning.
I like to make the contigs using the oligo file so I don’t have to type out a long command to get or remove groups.
When I run that, the list of sequences / group comes out like this:

GoM_8045.GoM_8045 29241
GoM_8046.GoM_8045 29849
GoM_8047.GoM_8045 11863
GoM_8048.GoM_8045 22640
GoM_8049.GoM_8045 22306
GoM_8050.GoM_8045 20589
GoM_8051.GoM_8045 27726
GoM_8052.GoM_8045 23874
GoM_8053.GoM_8045 26772

GoM_8045 is the first sample listed in the oligos file, and the .GoM_8045 gets appended onto the name of the other samples. This happens in v1.39 and v1.40. The oligos file looks ok.

I went ahead and used get.groups."
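As a stopgap for the pattern described in the email, the spurious suffix could be stripped from the group assignments after the fact. The following is a minimal sketch under stated assumptions, not a mothur feature: it assumes a tab-separated, two-column group file (sequence name, then group name), and the sequence names `seq1` to `seq3` and both function names are hypothetical.

```python
def strip_suffix(group, first_group):
    """Remove one trailing '.<first_group>' from a group name, if present."""
    tail = "." + first_group
    return group[: -len(tail)] if group.endswith(tail) else group


def fix_group_lines(lines, first_group):
    """Return 'name<TAB>group' lines with the appended suffix removed."""
    fixed = []
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue  # ignore blank lines
        # Assumption: exactly two tab-separated columns per line.
        name, group = line.split("\t")
        fixed.append(name + "\t" + strip_suffix(group, first_group))
    return fixed


# Hypothetical sequence names; group names follow the pattern in the email.
lines = [
    "seq1\tGoM_8045.GoM_8045",
    "seq2\tGoM_8046.GoM_8045",
    "seq3\tGoM_8047",  # already clean, left unchanged
]

fixed = fix_group_lines(lines, "GoM_8045")
for row in fixed:
    print(row)
```

Only a trailing `.GoM_8045` is removed, so a group that is legitimately named `GoM_8045` stays intact, and names without the suffix pass through unchanged.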

Thanks for bringing this to our attention. I resolved the issue and the change will be part of the 1.41.0 release coming within the week.

I downloaded the 1.41.0 version from GitHub on 11/5 (Linux version) and ran my script again, but still got the same result.

Batch Mode

mothur > make.contigs(ffastq=Run4_18S_R1.fastq, rfastq=Run4_18S_R2.fastq, rindex=Run4_18S_I1.fastq, oligos=Run4_18S_oligos.txt, processors=2)

Using 2 processors.
Making contigs…
Done.

Group count:
DNA_JL1001.DNA_JL1001 88891
DNA_JL1002.DNA_JL1001 30488
DNA_JL1003.DNA_JL1001 64703
DNA_JL1004.DNA_JL1001 46793
DNA_JL1005.DNA_JL1001 63429
DNA_JL1006.DNA_JL1001 49208
DNA_JL1007.DNA_JL1001 51655
DNA_JL1008.DNA_JL1001 44797
DNA_JL1009.DNA_JL1001 63942
DNA_JL1010.DNA_JL1001 42425

And so on.

We have not officially released 1.41.0; it’s coming soon. The bug is present in the 1.41.0 pre-release available on GitHub.

Okay, thanks. I’ll look for the official release. :grinning: