distances for libshuff?

gaidos · June 22, 2010, 3:04am

Hi,
I am analyzing a large pyrosequencing dataset and would like to use libshuff to compare samples.

libshuff must be preceded by a read.dist command to read in the distances, which I understand.
However, all my attempts to use the smaller, column-formatted distance files (which can benefit from a cutoff) fail. For example, after
read.dist(column=mydata.unique.filter.dist,name=mydata.names,group=mydata.groups)
libshuff reports:
“You must read in a matrix and groupfile using the read.dist command, before you use the libshuff command.”
So, following the libshuff instructions, I go back to dist.seqs and create a phylip-formatted distance matrix.

This seems to work, but there is no provision for a cutoff when using the phylip output (presumably because every element in the matrix must be included), so the phylip distance matrix is very large (~1 GB) and the computation becomes extremely slow. Is there a workaround, or is what I described the only way to do this?

Thanks!

gaidos · June 22, 2010, 3:27am

I think I’ve found the solution to this: generate the large phylip distance matrix, but use the “cutoff” option in read.dist, specifying both the “names” and the “groups” files.
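
For anyone following along, here is a rough sketch of that sequence. The phylip file name and the cutoff value of 0.10 are assumptions made for illustration only; substitute whatever dist.seqs actually wrote for your data and a cutoff that suits your analysis.

```
# Hypothetical file name and cutoff value; adjust both to your own data.
read.dist(phylip=mydata.unique.filter.phylip.dist, name=mydata.names, group=mydata.groups, cutoff=0.10)
libshuff()
```

The matrix on disk is still the full size, since the cutoff is only applied when read.dist loads it.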
