Hi,
I would like to run the command tree.shared to see the similarity between 8 samples. All samples were sequenced using Ion Torrent. I have 237,000 sequences. I’m trying to use this workflow:
- unique.seqs
- align.seqs
- filter.seqs
- dist.seqs (cutoff 0.1, output=lt)
- cluster.seqs (cutoff 0.1, furthest method)
- make.shared
- tree.shared
However, step 4 (dist.seqs) produces an output file of 58 GB, and the clustering step does not run properly (my computer cannot read the entire matrix). Could you help me or suggest an alternative?
Below is some information that may be useful.
Metadata
Number of samples: 7
Number of sequences: 237,000
Average length: 150 bp
Sequencing platform: Ion Torrent, 318 chip
Other information: Barcode
Computer features
RAM: 16 GB
Processors: 1 Intel Xeon 2.5 GHz
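For scale, here is a rough back-of-the-envelope sketch of why a lower-triangle distance matrix gets so large. It only counts the pairwise distances for n sequences, n(n-1)/2, and multiplies by an assumed size per stored distance. The function names and the 7 bytes per entry are my own assumptions for illustration, not values taken from mothur, and the count uses all 237,000 reads, so it is an upper bound (unique.seqs should reduce the number of sequences actually compared).

```python
# Rough estimate of the size of a full lower-triangle distance matrix.
# ASSUMPTIONS (not from mothur): each distance takes about 7 bytes when
# written as text, and every pair of sequences is stored.

def pairwise_count(n_seqs: int) -> int:
    """Number of entries in the lower triangle, excluding the diagonal."""
    return n_seqs * (n_seqs - 1) // 2


def estimate_gb(n_seqs: int, bytes_per_entry: float = 7.0) -> float:
    """Approximate matrix size in gigabytes (10**9 bytes)."""
    return pairwise_count(n_seqs) * bytes_per_entry / 1e9


if __name__ == "__main__":
    n = 237_000    # total reads, before unique.seqs
    ram_gb = 16    # RAM of the machine described above
    pairs = pairwise_count(n)  # 28,084,381,500 pairs
    size_gb = estimate_gb(n)
    print(f"pairs: {pairs:,}")
    print(f"estimated size: {size_gb:.1f} GB (RAM available: {ram_gb} GB)")
```

Since the number of pairs grows with the square of the number of sequences, halving the number of unique sequences cuts the matrix to roughly a quarter of its size.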
Best regards,