mothur

Name mismatch in multi-processor make.contigs

Hi,

I’m running the latest version of mothur and trying to process some simulated FASTQ data with sequentially named reads (header lines @A1, @A2, @A3, etc.). It appears that make.contigs does not work correctly on certain FASTQ files with processors=2. This seems to happen when the sequence names differ by only one character.

The following pair of files gives an error:
not_working_R1.fastq

@A1
CCAGC
+
ABBCB
@A2
CCAGC
+
CCCCC
@A3
CCAGC
+
CCCCB
@A4
CCAGC
+
ABBBC
@A5
CCAGC
+
CCCCC

not_working_R2.fastq

@A1
ACTTT
+
>1>11
@A2
ACTTT
+
AABBB
@A3
ACTTT
+
AA1A>
@A4
ACTTT
+
BBBBB
@A5
ACTTT
+
BBBBA

$ mothur '#make.contigs(ffastq=not_working_R1.fastq, rfastq=not_working_R2.fastq, processors=2)'
Using 2 processors.
Making contigs...
3
[WARNING]: name mismatch in forward and reverse fastq file. Ignoring, A4.
1
Done.
It took 0 secs to process 4 sequences.

However, changing the sequence names in the two files to “A1”, “B2”, “C3”, “D4”, “E5” allows the files to be processed properly. I can provide working and non-working example files if needed.

Thanks for the help!

Thank you for reporting this bug. We added a feature to the make.contigs command in the last release to help skip missing reads in files to avoid name mismatches. Part of the name matching checks for “off by one character” for reads like: @M00178:4:000000000-A1AE6:1:1101:16364:1386 1:N:0:0 and @M00178:4:000000000-A1AE6:1:1101:16364:1386 2:N:0:0. This change is causing name mismatches with sequence names such as yours. We will correct this in our next release.
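To illustrate why the two cases collide, here is a minimal Python sketch of an “off by one character” comparison. It is not mothur’s actual code; the function name `differs_by_at_most_one` is hypothetical and the logic is only an assumption about how such a tolerance check might behave.

```python
def differs_by_at_most_one(a: str, b: str) -> bool:
    """Return True if a and b have equal length and differ in at most one position.

    Hypothetical illustration only; this is not taken from the mothur source.
    """
    if len(a) != len(b):
        return False
    mismatches = sum(1 for x, y in zip(a, b) if x != y)
    return mismatches <= 1


# Forward/reverse headers from the same Illumina read pair: one character apart.
fwd = "M00178:4:000000000-A1AE6:1:1101:16364:1386 1:N:0:0"
rev = "M00178:4:000000000-A1AE6:1:1101:16364:1386 2:N:0:0"
print(differs_by_at_most_one(fwd, rev))

# Two different reads with sequential names: also one character apart.
print(differs_by_at_most_one("A3", "A4"))

# The renamed reads from the original post differ in two positions.
print(differs_by_at_most_one("C3", "D4"))
```

Under this kind of check, “A3” and “A4” look like a forward/reverse pair of the same read, while “C3” and “D4” do not, which is consistent with the renaming workaround described above.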

I had exactly the same problem with version 1.44.3.

I checked the read files, and the “1:N:0:0” and “2:N:0:0” suffixes seem to be the problem, as you suggested.

Could you please look into this?

Thanks!

Could you send your fastq files to mothur.bugs@gmail.com so I can track down the issue for you?

Similar issue. I get the message:
[WARNING]: name mismatch in forward and reverse fastq file. Ignoring, M07073_33_000000000-JKRRG_1_2106_17065_10533__lepidium__RUA-B-04_116__ITS3_KYO2

But the entry in the forward file is …
@M07073:33:000000000-JKRRG:1:2106:17065:10533__lepidium__RUA-B-04_116__ITS3_KYO2

and in the reverse file is …
@M07073:33:000000000-JKRRG:1:2106:17065:10533__lepidium__RUA-B-04_116__ITS3_KYO2

They are identical. So why the warning and the lost match?
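One way to confirm that the names really are identical, record by record, is to compare the header lines directly and print them with `repr` so that trailing spaces or carriage returns become visible. The following is a rough Python sketch that assumes plain four-line FASTQ records; the helper names `fastq_names` and `name_mismatches` are made up for this example and are not part of mothur.

```python
from itertools import zip_longest


def fastq_names(lines):
    """Yield the read name (header without '@') of each 4-line FASTQ record."""
    for i, line in enumerate(lines):
        if i % 4 == 0:
            # Strip only the newline so stray spaces or '\r' remain visible.
            yield line.rstrip("\n")[1:]


def name_mismatches(fwd_lines, rev_lines):
    """Return (record_number, fwd_name, rev_name) for each pair whose names differ.

    A missing record in the shorter file shows up as None.
    """
    pairs = zip_longest(fastq_names(fwd_lines), fastq_names(rev_lines))
    return [(n, f, r) for n, (f, r) in enumerate(pairs, start=1) if f != r]


# Small in-memory demo: the second reverse header has a trailing space.
fwd = ["@A1\n", "CCAGC\n", "+\n", "ABBCB\n", "@A2\n", "CCAGC\n", "+\n", "CCCCC\n"]
rev = ["@A1\n", "ACTTT\n", "+\n", ">1>11\n", "@A2 \n", "ACTTT\n", "+\n", "AABBB\n"]
for n, f, r in name_mismatches(fwd, rev):
    print(f"record {n}: forward={f!r} reverse={r!r}")
```

Open file objects can be passed in place of the lists, since they also iterate line by line. If this reports no mismatches, the names match at the text level and the warning is more likely related to how the reads are being paired than to the headers themselves.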

Hi Jerry - thanks for the post. Could you create a new thread and let us know things like what version of mothur you’re running, where you got the data, etc.?

Pat