Using Mothur on Single Read GAII 50bp

ashtx · October 7, 2014, 5:59pm

Hello,
I am not sure if this question is valid.
I have 4 FASTQ files (4 samples) of single-end 50 bp reads. I am trying to follow the MiSeq SOP (http://www.mothur.org/wiki/MiSeq_SOP), which I think may not be appropriate, but I made use of the reverse.seqs() command as mentioned in this thread: "Illumina single-read with index in 2nd sequencing run".
I also do not have any barcode files.
I was able to get the make.contigs command to run, and this is the result:

[ERROR]: Could not open 18910.num.temp
It took 20559 secs to process 219133249 sequences.
Group count:
P_0_098   38812677
P_1847_3  47203468
No_P_7    36488086
No_P_8    53531374
Total of all groups is 176035605

I saw that there was an error partway through, but I continued.
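
As a quick sanity check on the make.contigs numbers above, here is a minimal Python sketch. The values are copied from the pasted output; the variable names are only for illustration, and the sketch does not try to explain the cause of the gap.

```python
# Counts copied from the make.contigs output quoted above.
processed = 219_133_249
group_counts = {
    "P_0_098": 38_812_677,
    "P_1847_3": 47_203_468,
    "No_P_7": 36_488_086,
    "No_P_8": 53_531_374,
}
reported_total = 176_035_605

# The per-group counts should add up to the reported group total.
group_sum = sum(group_counts.values())

# Sequences that were processed but do not appear in any group count.
missing = processed - group_sum
retained_fraction = group_sum / processed

print(f"sum of groups matches reported total: {group_sum == reported_total}")
print(f"processed but not in any group: {missing:,}")
print(f"fraction assigned to a group: {retained_fraction:.1%}")
```

The group counts are internally consistent, but about a fifth of the processed sequences are not counted in any group, so the error is probably worth tracking down before continuing.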

Command: screen.seqs(fasta=stability.trim.contigs.fasta, group=stability.contigs.groups, summary=stability.trim.contigs.summary, maxambig=0, maxlength=50)

Output File Names:
stability.trim.contigs.good.summary
stability.trim.contigs.good.fasta
stability.trim.contigs.bad.accnos
stability.contigs.good.groups
It took 2053 secs to screen 176035605 sequences.

Command: unique.seqs(fasta=stability.trim.contigs.good.fasta)

126250000 93433220
126251000 93434117
126252000 93435002
126253000 93435907
126254000 93436812
126255000 93437717
126256000 93438597
126257000 93439481
126258000 93440372
126259000 93441265
126260000 93442154
126261000 93443059
Killed

Unfortunately, I was not able to save the logfile, so I saved these outputs by copying and pasting them into a text file.
I stopped processing the data at this point, realizing that it may not be right. I also found that short reads can cause a problem, as mentioned here: "Produce too large amount of data when running dist.seqs".

My question: is mothur the right tool for my case, or is this not feasible given my data?
I am sorry if this question is very naive; if so, please delete it.

pschloss · October 8, 2014, 3:00pm

Ugh, this sounds dreadful. Can you get MiSeq data instead? I’m assuming these are 16S rRNA gene sequence data - right? There’s no way you’ll be able to get 3% OTUs out of these. Aside from getting better data, your next best bet would be to classify the sequences using classify.seqs and then assign them to phylotypes using phylotype. You might also want to see this:

http://blog.mothur.org/2014/09/11/Why-such-a-large-distance-matrix%3F/
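
For a rough sense of scale, here is a minimal Python sketch using the last progress line from the unique.seqs output quoted earlier in the thread. It assumes the two columns are the number of sequences read so far and the number of unique sequences found so far; the variable names are only for illustration.

```python
# Last progress line printed by unique.seqs before the process was killed.
# Assumption: column 1 = sequences read so far, column 2 = unique sequences so far.
processed = 126_261_000
unique = 93_443_059

# Share of reads that were still unique after dereplication.
unique_fraction = unique / processed

# A full pairwise comparison of n sequences involves n * (n - 1) / 2 pairs.
pairwise = unique * (unique - 1) // 2

print(f"unique fraction: {unique_fraction:.1%}")
print(f"pairwise comparisons for the uniques seen so far: {pairwise:.2e}")
```

Roughly three quarters of the reads were still unique when the job was killed, so dereplication barely shrinks the data set, and the number of pairwise comparisons is on the order of 10^15 before the remaining reads are even counted.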

Sorry,
Pat

ashtx · October 8, 2014, 4:09pm

Hi Pat,
Thanks for the reply and the link.
Yes, these are 16S rRNA sequence data. I am expecting to get MiSeq sequence data for the same samples, so hopefully I will get better results, but I think we will only be getting single reads.

Thanks again
