phylip distance matrix vs column distance matrix

CGabriel · August 15, 2012, 11:03pm

Hi all,

I am running my first pyrosequencing analysis following the Schloss SOP tutorial. The number of OTUs I obtain differs considerably depending on whether I calculate them from a column-formatted or a phylip-formatted distance matrix (337 vs. 607 in my case). Is this related to unique sequences?
Also, what text editor do you use to open the list file generated from a column-formatted distance matrix when the file is very large?

Thanks to anyone who can help me with these questions.

pschloss · August 16, 2012, 12:27pm

There shouldn’t be a difference between the two methods if you’re comparing the same OTU cutoff. Are you comparing the same cutoff? Can you email the input alignment that you are giving to dist.seqs to mothur.bugs@gmail.com so we can take a look?

Pat

pschloss · August 17, 2012, 2:48pm

Thanks for sending the files. The problem is that you are running…

cluster(phylip=final.phylip.dist)

and

cluster(column=final.dist, name=final.names)

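By default, cluster uses the average neighbor algorithm, which takes into account the abundance information provided by the names file. Your first command does not supply the names file, so the two runs are not equivalent. For a correct comparison you should run…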

cluster(phylip=final.phylip.dist, name=final.names)

HTH,
Pat
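
To illustrate the point above about abundance information, here is a minimal Python sketch of how an average-neighbor distance between two clusters can change once each unique sequence is weighted by the number of reads it represents. This is not mothur's implementation; the sequence names, distances, read counts, and the `average_distance` helper are all hypothetical and chosen only to make the effect visible.

```python
# Illustrative sketch only; this is NOT mothur's code.
# Hypothetical data: pairwise distances between unique sequences, and the
# number of identical reads each unique sequence stands for (the kind of
# information a names file carries).

def average_distance(cluster_a, cluster_b, dist, counts=None):
    """Mean distance between two clusters, optionally weighted by read counts."""
    total = 0.0
    weight_sum = 0.0
    for a in cluster_a:
        for b in cluster_b:
            # Without counts every unique sequence counts once;
            # with counts each pair is weighted by the reads behind it.
            w = 1 if counts is None else counts[a] * counts[b]
            total += w * dist[frozenset((a, b))]
            weight_sum += w
    return total / weight_sum

dist = {
    frozenset(("seqA", "seqC")): 0.02,
    frozenset(("seqB", "seqC")): 0.06,
}
counts = {"seqA": 1, "seqB": 9, "seqC": 1}  # hypothetical abundances

unweighted = average_distance(["seqA", "seqB"], ["seqC"], dist)
weighted = average_distance(["seqA", "seqB"], ["seqC"], dist, counts)
print(f"unweighted: {unweighted:.3f}, weighted: {weighted:.3f}")
# prints "unweighted: 0.040, weighted: 0.056"
```

In this toy case the two averages fall on opposite sides of a hypothetical 0.05 threshold, so the clusters would merge in one run and stay separate in the other. That is one way the same distances can yield different OTU counts when the names file is supplied to only one of the two commands.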

Kirk · August 21, 2012, 8:23am

CGabriel, I use Jujuedit, which opens files up to 2 GB in a matter of seconds. It has some nice features, and you can use regular expressions to edit text!
