mothur

make.contigs with oligo creating new names


#1

I’ve finally upgraded to 1.40, but I’ve found that make.contigs now produces group names that are different from what they were in 1.39.5.
I ran them back to back the other day to make sure.

Here it is in 1.39.5, with output that I expect.
mothur > make.contigs(ffastq=Run4_18S_R1.fastq, rfastq=Run4_18S_R2.fastq, rindex=Run4_18S_I1.fastq, oligos=Run4_18S_oligos.txt, processors=2)

Using 2 processors.
Making contigs…
Done.
It took 5909 secs to process 16561081 sequences.

Group count:
DNA_JL1001 88891
DNA_JL1002 30488
DNA_JL1003 64703
DNA_JL1004 46793
DNA_JL1005 63429
DNA_JL1006 49208
DNA_JL1007 51655
DNA_JL1008 44797
DNA_JL1009 63942
DNA_JL1010 42425

And here it is in 1.40, where all the names have .DNA_JL1001 added to the end.
mothur > make.contigs(ffastq=Run4_18S_R1.fastq, rfastq=Run4_18S_R2.fastq, rindex=Run4_18S_I1.fastq, oligos=Run4_18S_oligos.txt, processors=2)

Using 2 processors.
Making contigs…
Done.

Group count:
DNA_JL1001.DNA_JL1001 88891
DNA_JL1002.DNA_JL1001 30488
DNA_JL1003.DNA_JL1001 64703
DNA_JL1004.DNA_JL1001 46793
DNA_JL1005.DNA_JL1001 63429
DNA_JL1006.DNA_JL1001 49208
DNA_JL1007.DNA_JL1001 51655
DNA_JL1008.DNA_JL1001 44797
DNA_JL1009.DNA_JL1001 63942
DNA_JL1010.DNA_JL1001 42425

It’s playing havoc with my downstream analysis when the names start changing from what they are in all the other databases.
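As a stopgap, the extra suffix could be stripped from the group file after make.contigs finishes. Here is a minimal sketch in Python, assuming the group file has the usual two tab-separated columns (sequence name, group name). The function name and the sample rows are made up for illustration and are not part of mothur.

```python
def strip_group_suffix(lines, suffix):
    """Remove a trailing suffix (e.g. ".DNA_JL1001") from the group column.

    Assumes each line is "<sequence name>\t<group name>".
    """
    fixed = []
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue  # skip blank lines
        seq_name, group = line.split("\t")
        # Only trim when the suffix is non-empty and actually present.
        if suffix and group.endswith(suffix):
            group = group[: -len(suffix)]
        fixed.append(f"{seq_name}\t{group}")
    return fixed


# Hypothetical rows, shaped like the group names reported above.
rows = [
    "seq1\tDNA_JL1001.DNA_JL1001",
    "seq2\tDNA_JL1002.DNA_JL1001",
]
cleaned = strip_group_suffix(rows, ".DNA_JL1001")
for row in cleaned:
    print(row)
```

In practice the rows would be read from the group file written by make.contigs and the cleaned rows written to a new file, so the original output stays untouched.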


#2

Is the “.DNA_JL1001” appended to the group names in your oligos file?


#3

Sorry for the delay, due to hurricane evacuation and vacation.

The oligos file looks like this:

barcode NONE GGTGAAGATACA DNA_JL1001
barcode NONE GACGAGTCAGTC DNA_JL1002
barcode NONE AGCTATCCACGA DNA_JL1003
barcode NONE ATGATGACCCGT DNA_JL1004
barcode NONE ACGTAAATCGCC DNA_JL1005

etc.

There are no column headers.
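To rule out the oligos file as the source of the suffix, the group names it defines can be listed and checked for a stray dot. This is a rough Python sketch, assuming each barcode line has four whitespace-separated fields (keyword, forward barcode, reverse barcode, group name) as in the lines above. `parse_oligos_groups` is a hypothetical helper, not a mothur function.

```python
def parse_oligos_groups(text):
    """Return the group names from the 'barcode' lines of an oligos file.

    Assumes four whitespace-separated fields per barcode line:
    keyword, forward barcode, reverse barcode, group name.
    """
    groups = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) == 4 and fields[0].lower() == "barcode":
            groups.append(fields[3])
    return groups


# Two lines copied from the oligos file shown above.
oligos = """barcode NONE GGTGAAGATACA DNA_JL1001
barcode NONE GACGAGTCAGTC DNA_JL1002
"""
groups = parse_oligos_groups(oligos)

# A group name containing "." would suggest the suffix came from the file itself.
suspicious = [g for g in groups if "." in g]
print(groups)
print(suspicious)
```

An empty `suspicious` list would indicate that the suffix is being added by make.contigs rather than coming from the oligos file.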


#4

Are you seeing this issue with our pre-release 1.41.0, https://github.com/mothur/mothur/releases/tag/v1.41.0.pre-release?


#5

I’ve tried it with 1.41 and got the same result.

(As an unrelated comment, I really like the “Current files saved by mothur:” output at the bottom of my script. Very helpful.)


#6

I also received this email from a colleague, working on a different set of sequences, with his own commands file. He’s having the same experience.

"Hi Joe,
I have a couple of questions.
I am finally able to work a little on my GoM sequences.
It has been so long since I was able to do that, that I thought it best to start from the beginning.
I like to make the contigs using the oligo file so I don’t have to type out a long command to get or remove groups.
When I run that, the list of sequences / group comes out like this:

GoM_8045.GoM_8045 29241
GoM_8046.GoM_8045 29849
GoM_8047.GoM_8045 11863
GoM_8048.GoM_8045 22640
GoM_8049.GoM_8045 22306
GoM_8050.GoM_8045 20589
GoM_8051.GoM_8045 27726
GoM_8052.GoM_8045 23874
GoM_8053.GoM_8045 26772

GoM_8045 is the first sample listed in the oligos file, and the .GoM_8045 gets appended onto the name of the other samples. This happens in v139 and v140. The oligo file looks ok.

I went ahead and used get.groups."


#7

Thanks for bringing this to our attention. I resolved the issue and the change will be part of the 1.41.0 release coming within the week.


#8

I downloaded the 1.41.0 version from github on 11/5 (Linux version) and ran my script again, but still got the same result.

Batch Mode

mothur > make.contigs(ffastq=Run4_18S_R1.fastq, rfastq=Run4_18S_R2.fastq, rindex=Run4_18S_I1.fastq, oligos=Run4_18S_oligos.txt, processors=2)

Using 2 processors.
Making contigs…
Done.

Group count:
DNA_JL1001.DNA_JL1001 88891
DNA_JL1002.DNA_JL1001 30488
DNA_JL1003.DNA_JL1001 64703
DNA_JL1004.DNA_JL1001 46793
DNA_JL1005.DNA_JL1001 63429
DNA_JL1006.DNA_JL1001 49208
DNA_JL1007.DNA_JL1001 51655
DNA_JL1008.DNA_JL1001 44797
DNA_JL1009.DNA_JL1001 63942
DNA_JL1010.DNA_JL1001 42425

And so on.


#9

We have not officially released 1.41.0 yet; it’s coming soon. The bug is still present in the 1.41.0 pre-release available on GitHub.


#10

Okay, thanks. I’ll look for the official release. 😀