Hi Sarah and Pat,
Thanks for replying to me on the forum (make.contigs problem posted by Xikun).
I am now using the demultiplexed data from Argonne and processing them separately (50 samples~lol)…
The original file is too large (~5 GB) and I don't know how to break it into smaller files. Would you like me to send you the original file, or could you suggest how to make a smaller sample out of it? Sorry, I don't have much experience with mothur or computing yet~
I also have two questions about processing my demultiplexed files with mothur:
- When I run cluster.split, I set the cutoff to 0.03, but sometimes it changes to 0.02 for the output:
mothur >
cluster.split(fasta=F1_01R1.trim.contigs.good.unique.good.filter.unique.precluster.pick.pick.fasta,name=F1_01R1.trim.contigs.good.unique.good.filter.unique.precluster.names,taxonomy=F1_01R1.trim.contigs.good.unique.good.filter.unique.precluster.pick.pds.wang.pick.taxonomy,splitmethod=classify,taxlevel=4,cutoff=0.03)
Using 1 processors.
Using splitmethod fasta.
Splitting the file...
/******************************************/
Running command: dist.seqs(fasta=F1_01R1.trim.contigs.good.unique.good.filter.unique.precluster.pick.pick.fasta.0.temp, processors=1, cutoff=0.035)
Using 1 processors.
/******************************************/
Output File Names:
F1_01R1.trim.contigs.good.unique.good.filter.unique.precluster.pick.pick.fasta.0.dist
It took 880 seconds to calculate the distances for 14424 sequences.
/******************************************/
Running command: dist.seqs(fasta=F1_01R1.trim.contigs.good.unique.good.filter.unique.precluster.pick.pick.fasta.1.temp, processors=1, cutoff=0.035)
#Mothur repeats similar output until the end.
Clustering F1_01R1.trim.contigs.good.unique.good.filter.unique.precluster.pick.pick.fasta.0.dist
Cutoff was 0.035 changed cutoff to 0.02
Cutoff was 0.035 changed cutoff to 0.02
It took 413 seconds to cluster
Merging the clustered files…
It took 1 seconds to merge.
Output File Names:
F1_01R1.trim.contigs.good.unique.good.filter.unique.precluster.pick.pick.an.sabund
F1_01R1.trim.contigs.good.unique.good.filter.unique.precluster.pick.pick.an.rabund
F1_01R1.trim.contigs.good.unique.good.filter.unique.precluster.pick.pick.an.list
Why does it change the cutoff?
- When I run classify.otu, it gives me “XXX is not in the taxonomy file…” warnings. In the end it still generates the tax summary and taxonomy files, and when I open them they look pretty normal:
mothur >
classify.otu(list=F1_01R1.trim.contigs.good.unique.good.filter.unique.precluster.pick.pick.an.list,taxonomy=F1_01R1.trim.contigs.good.unique.good.filter.unique.precluster.pick.pds.wang.pick.taxonomy)
reftaxonomy is not required, but if given will keep the rankIDs in the summary file static.
[WARNING]: This command can take a namefile and you did not provide one. The current namefile is F1_01R1.trim.contigs.good.unique.good.filter.unique.precluster.names which seems to match F1_01R1.trim.contigs.good.unique.good.filter.unique.precluster.pick.pds.wang.pick.taxonomy.
unique 20498
HWI-M20149_215_000000000-AGTL9_1_2114_26811_16714 is not in your taxonomy file. I will not include it in the consensus.
HWI-M20149_215_000000000-AGTL9_1_2114_24066_11585 is not in your taxonomy file. I will not include it in the consensus.
#mothur continues this output until the end.
Output File Names:
F1_01.trim.contigs.good.unique.good.filter.unique.precluster.pick.pick.an.unique.cons.taxonomy
F1_01.trim.contigs.good.unique.good.filter.unique.precluster.pick.pick.an.unique.cons.tax.summary
F1_01.trim.contigs.good.unique.good.filter.unique.precluster.pick.pick.an.0.02.cons.taxonomy
F1_01.trim.contigs.good.unique.good.filter.unique.precluster.pick.pick.an.0.02.cons.tax.summary
Why doesn't the taxonomy file match up with the list file? It seems that this does not affect my result, though.
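In case it helps with diagnosing this, here is a rough sketch of how the names that appear in the list file but not in the taxonomy file could be listed. It assumes the list file is tab-separated (label, number of OTUs, then one comma-separated group of sequence names per OTU) and that the taxonomy file has the sequence name in its first tab-separated column. The function names are made up for illustration and are not part of mothur.

```python
def names_in_list(list_lines, label=None):
    """Collect sequence names from one row of a mothur-style list file.

    Assumed row layout: label <tab> numOtus <tab> otu1 <tab> otu2 ...
    where each OTU is a comma-separated group of sequence names.
    If label is None, the first data row is used.
    """
    for line in list_lines:
        fields = line.rstrip("\n").split("\t")
        # Skip blank lines and a header row (some versions write one).
        if len(fields) < 3 or fields[0] == "label":
            continue
        if label is None or fields[0] == label:
            names = set()
            for otu in fields[2:]:
                names.update(n for n in otu.split(",") if n)
            return names
    return set()


def names_in_taxonomy(tax_lines):
    """Collect sequence names from the first column of a taxonomy file."""
    return {line.split("\t", 1)[0].strip() for line in tax_lines if line.strip()}


def missing_from_taxonomy(list_lines, tax_lines, label=None):
    """Return sorted names present in the list row but absent from the taxonomy."""
    return sorted(names_in_list(list_lines, label) - names_in_taxonomy(tax_lines))


# Tiny made-up example (not my real data):
list_lines = ["0.02\t2\tseqA,seqB\tseqC\n"]
tax_lines = ["seqA\tBacteria;Firmicutes;\n", "seqC\tBacteria;Proteobacteria;\n"]
print(missing_from_taxonomy(list_lines, tax_lines, label="0.02"))  # ['seqB']
```

With real files, the two arguments could be open file handles, e.g. `missing_from_taxonomy(open(list_path), open(tax_path), label="0.02")`, where `list_path` and `tax_path` are placeholders for the .an.list and .taxonomy files above.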
I will also post the question on the forum so other people can see the answers.
Thanks very much!
Xikun