Cleaning second-generation sequencing data

xApple · October 22, 2012, 2:14pm

I have been looking around the web, testing the current tools for pre-processing sequencing reads, and it seems that no solution really cuts it, including mothur. I was wondering whether there are any plans to improve mothur in this respect, and I would also like to know what other tools you guys are using. If you feel like sharing your opinion on the matter after reading the short review I wrote on the subject, that would be great.

pschloss · October 22, 2012, 3:37pm

Yeah we’re on it. Your review is very much out of date…

Other nuisances include that it doesn’t support the FASTQ format and only takes combination of FASTA and QUAL files. As usual, nothing is said about the expected standard to be used in the QUAL file

Ummm… there’s the fastq.info command.

There appears to be no paper associated, only an old poster

Have you run mothur before? The first thing that comes up is the citation to a 2009 manuscript. You can also get references by running the command with the keyword citation in the parentheses, e.g. align.seqs(citation).

xApple · October 22, 2012, 4:07pm

Hello,

I wasn’t expecting to get such a quick response from the lead developer. That’s great. Sorry about not spotting the paper. I looked on the main page and quickly on the wiki, but didn’t think of looking at the standard output of the trim command for that information. I have updated my post and added the link to the paper.

Is there any other aspect that you think is wrong or out of date? I wrote this post only a few days ago, trying to vent my frustration and to capture the reactions of a new user coming to the field of DNA sequence analysis, starting with the pre-treatment of the data.

I can imagine that mothur provides commands for going back and forth between FASTA, QUAL and FASTQ. For that fairly common task, I have developed my own tested programs. But the fact is that, for trimming reads, it doesn’t take FASTQ or SFF as input, so it doesn’t use the apparent standard file formats and adds a step to the processing. That’s all. It’s a detail.
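For illustration only, the extra conversion step mentioned here might look roughly like the sketch below. It is not mothur's fastq.info implementation and not the programs referred to above. The function names `parse_fastq` and `fastq_to_fasta_qual` are hypothetical, and the sketch assumes strict four-line FASTQ records with Phred+33 quality encoding, which has to be checked for real data.

```python
from typing import Iterable, Iterator, Tuple


def parse_fastq(lines: Iterable[str]) -> Iterator[Tuple[str, str, str]]:
    """Yield (name, sequence, quality string) from four-line FASTQ records."""
    it = iter(lines)
    for header in it:
        header = header.rstrip("\n")
        if not header:
            continue  # tolerate blank lines between records
        # A missing line becomes "" and is caught by the checks below.
        seq = next(it, "").rstrip("\n")
        plus = next(it, "").rstrip("\n")
        qual = next(it, "").rstrip("\n")
        if not header.startswith("@") or not plus.startswith("+"):
            raise ValueError(f"malformed FASTQ record near {header!r}")
        if len(seq) != len(qual):
            raise ValueError(f"sequence/quality length mismatch in {header!r}")
        yield header[1:], seq, qual


def fastq_to_fasta_qual(lines: Iterable[str], offset: int = 33) -> Tuple[str, str]:
    """Return (FASTA text, QUAL text) for the given FASTQ lines.

    offset=33 assumes Phred+33 encoding; older Illumina files may use 64.
    """
    fasta_parts = []
    qual_parts = []
    for name, seq, qual in parse_fastq(lines):
        fasta_parts.append(f">{name}\n{seq}\n")
        # QUAL files store one space-separated integer score per base.
        scores = " ".join(str(ord(ch) - offset) for ch in qual)
        qual_parts.append(f">{name}\n{scores}\n")
    return "".join(fasta_parts), "".join(qual_parts)


if __name__ == "__main__":
    example = "@read1\nACGT\n+\nII5!\n".splitlines()
    fasta_text, qual_text = fastq_to_fasta_qual(example)
    print(fasta_text, end="")
    print(qual_text, end="")
```

The quality offset is the point where such a conversion most easily goes wrong, which is why it is exposed as a parameter rather than hard-coded.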

If I had to point to the thing that bugged me the most, it was the use of an interactive mode as the principal interface for the software. That was probably a bad design idea, in my opinion.

Sincerely,

pschloss · October 22, 2012, 4:21pm

The thing I would suggest is to read anything that comes up in PubMed for the following search: “Schloss PD[au]”. Then you might have a better idea of what’s going on.

Well if you have a few hundred thousand dollars to throw at us, we’ll make a much nicer interface. It’s our philosophy that scientists should concentrate on science, not design.
