Make.contigs creates a wrong count_table

Hello!
I am using mothur 1.47.0 to analyse some MiSeq data.
When I run make.contigs, I get an incomplete count_table which is impairing all my downstream analysis.

I am working in a “raw_data” folder which contains all the fastq files. All the other files are stored in raw_data/mothur_files.

This is my command:
mothur > make.contigs(file=mothur_files/sulphur.files, inputdir=., outputdir=mothur_files)

This is what sulphur.files looks like (I copy it as comma-delimited):
EPH1, GC-HT-9721-EPH1_S33_L001_R1_001.fastq, GC-HT-9721-EPH1_S33_L001_R2_001.fastq
EPH2, GC-HT-9721-EPH2_S41_L001_R1_001.fastq, GC-HT-9721-EPH2_S41_L001_R2_001.fastq
EPH3, GC-HT-9721-EPH3_S49_L001_R1_001.fastq, GC-HT-9721-EPH3_S49_L001_R2_001.fastq

and this is the count_table I get

In the count_table I only get the information about the first sample, while all the other columns are empty. Also, the totals seem to be “1” for every sequence.
What can I do to fix this problem?
Thank you very much
Erica

Hello Erica

The new count file in mothur is a condensed count table. To get the full table, please use this option:

count.seqs(count=YOURTABLENAME, compress=f)

And you will get the table you are looking for : )

Just out of curiosity, why is the new count table impairing all your analyses? Mothur should be able to handle that count table without problems. What errors is mothur giving you?

Hi Leo,
thank you for your reply. You made me look into the count file format more carefully, and I think I understand it better now. However, I still think there is something odd here. So in my case, for each representative sequence I only have 1 copy, occurring in one single sample?

My downstream problem, which made me suspicious about the first count_table, is that in the taxonomy summary I don’t see any information about the samples. But maybe the problem is elsewhere.
All the count tables I get after unique.seqs are in another format, which doesn’t seem to include sample information:
| Representative_Sequence | total |
| --- | --- |
| M00527_202_000000000-DDG9W_1_1101_16395_1363 | 1 |
| M00527_202_000000000-DDG9W_1_1102_13865_4838 | 609 |
| M00527_202_000000000-DDG9W_1_1102_6551_10436 | 3753 |
| M00527_202_000000000-DDG9W_1_1101_25436_24254 | 17 |

Thank you very much for your help.
Erica

The taxonomy file doesn’t have any info about the samples. It is just the taxonomy of each representative sequence.

If you do not have sample information, you most likely left out the count (or group) option at some step, and the group/count info was lost from that point on. If you paste your script, we should be able to help you.
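
To illustrate what carrying the count file through each step can look like, here is a minimal sketch. It is not your actual script: the screening parameter and the reference/taxonomy file names (trainset.fasta, trainset.tax) are placeholders, and it assumes that `current` resolves to the fasta and count files written by the previous command.

```
# Sketch only: parameter values and reference file names are assumptions
make.contigs(file=mothur_files/sulphur.files, inputdir=., outputdir=mothur_files)

# Passing count= here filters the count table together with the fasta file
screen.seqs(fasta=current, count=current, maxambig=0)

# With count=, unique.seqs updates the existing count table;
# without it, the per-sample columns can be lost and only totals remain
unique.seqs(fasta=current, count=current)

# With count=, the resulting tax.summary can report per-sample numbers
classify.seqs(fasta=current, count=current, reference=trainset.fasta, taxonomy=trainset.tax)
```

If any command that accepts count= is run with only fasta=, the count table it writes (or the one used downstream) may no longer contain the sample columns, so that is the first thing to check in the script.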

Hi Leocadio,
I double checked my script and found out where the count option was missing.
The taxonomy summary file I was talking about is the one ending in wang.tax.summary and it does have sample information, now that I figured out what the problem was with your help.
Thank you again for your patience,
Erica
