Chimera-checked fasta as input file for dist.seqs?

azel · July 29, 2014, 8:11pm

Hello,

I apologize if this is a basic question or has already been covered, but I searched and could not find anything that helped. I am using mothur to analyze paired-end MiSeq data (about 1.5 million reads). I have run the following pipeline steps on my data, in this order: make.contigs, screen.seqs, unique.seqs, align.seqs, screen.seqs, filter.seqs (using trump=.), unique.seqs, chimera.uchime, remove.seqs. (I did not do a pre-cluster because I am doing some comparisons.)

At this point, I want to make a distance matrix using dist.seqs. My specific questions are:

1. Is the chimera-checked fasta file (the output of remove.seqs) still “aligned”? That is, does it still contain gaps, or is everything removed by the intermediate steps (screen.seqs, filter.seqs, etc.)? I know that the trump=. option removes columns containing terminal . characters, but I’m unclear whether internal gaps still exist.
2. In either case (whether or not the chimera-checked fasta file has gaps), can I use this file as the input fasta for dist.seqs?

I am running the program right now, and while it’s taking a long time (it’s been over an hour now), I haven’t gotten any errors yet. However, based on the SOP, the program uses gaps/inserts and mismatches to determine distance, so I’m curious whether my chimera-checked fasta file is in the correct format for dist.seqs to work properly.

Thank you for the clarification,
Anna

dwaite · July 29, 2014, 8:41pm

Hi,

The chimera-checked fasta file is still aligned and will contain internal gaps where needed.
I’m not sure whether dist.seqs will accept an unaligned fasta file. I haven’t tried this in a long time, but I seem to remember mothur throwing an error if the sequences in your fasta file are different lengths. There is the pairwise.seqs() command for working with unaligned fasta files, but I’ve never tried it and I imagine it would be much slower than dist.seqs().
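
If you want to confirm for yourself that a fasta file is still aligned, one rough check is that every sequence has the same length and that at least some sequences contain gap characters (- or .). Here is a minimal Python sketch of that check. It is not part of mothur, and the function names (read_fasta, alignment_report) and the file name in the comment are made up for illustration.

```python
def read_fasta(lines):
    """Yield (name, sequence) pairs from an iterable of FASTA lines."""
    name, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if name is not None:
                yield name, "".join(chunks)
            name, chunks = line[1:].strip(), []
        else:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def alignment_report(lines):
    """Summarize sequence lengths and gap usage for a FASTA file.

    The file "looks aligned" here if it has at least one sequence and
    all sequences share a single length. This is a heuristic only.
    """
    lengths = set()
    with_gaps = 0
    total = 0
    for _name, seq in read_fasta(lines):
        total += 1
        lengths.add(len(seq))
        if "-" in seq or "." in seq:
            with_gaps += 1
    return {
        "sequences": total,
        "lengths": sorted(lengths),
        "with_gaps": with_gaps,
        "looks_aligned": total > 0 and len(lengths) == 1,
    }


# Example with in-memory data; for a real file (hypothetical name) use:
#   with open("chimera_checked.fasta") as handle:
#       print(alignment_report(handle))
example = [">seq1", "AC-GT", ">seq2", "ACCGT"]
print(alignment_report(example))
```

If the report shows more than one length, the file is not a single alignment, which matches the situation where I remember mothur complaining.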

azel · July 29, 2014, 9:02pm

Thank you dwaite! I figured that since I didn’t get an error message, that things would be okay, but I wasn’t sure if my file was aligned and if dist.seqs would be able to work on a non-aligned file (in case my file wasn’t aligned). Thank you so much for the clarification!

Anna
