Hi,
I am trying to compare the proteomes of 2 species. I have 2 FASTA lists of proteins, and I want to create a Venn diagram of shared and unshared proteins. Of course, mutations may have occurred, so even the same protein may not have exactly the same sequence in both species, and “protein OTUs” need to be determined first. From what I have seen online, the best program for this would be Mothur (if Mothur could handle protein sequences…)
Does anyone have any advice, experience or a program that can do what I need?
Thanks a lot!
The key is to come up with an alignment and a distance matrix for your protein sequences. Once you have that, you can enter the mothur workflow at read.dist and cluster away…
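As a rough illustration of the distance matrix step, here is a minimal Python sketch that computes an uncorrected p-distance (the fraction of differing residues, skipping gapped columns) between already-aligned protein sequences and writes a PHYLIP-style square matrix. The function names are made up for this example, and the p-distance is only a simple stand-in; a distance from a dedicated protein alignment or phylogeny tool may suit real data better. It also assumes that read.dist will accept a square PHYLIP-style matrix through its phylip option, which is worth checking against your mothur version.

```python
def p_distance(a, b, gap_chars="-."):
    """Fraction of differing residues over columns where neither sequence has a gap."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    diffs = 0
    for x, y in zip(a.upper(), b.upper()):
        if x in gap_chars or y in gap_chars:
            continue  # ignore columns with a gap in either sequence
        compared += 1
        if x != y:
            diffs += 1
    # If no column could be compared, treat the pair as maximally distant.
    return diffs / compared if compared else 1.0


def phylip_square(names, seqs):
    """Build a PHYLIP-style square distance matrix as text (hypothetical helper)."""
    n = len(names)
    lines = [str(n)]  # first line: number of sequences
    for i in range(n):
        row = [p_distance(seqs[i], seqs[j]) for j in range(n)]
        lines.append(names[i] + "\t" + "\t".join(f"{d:.4f}" for d in row))
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Toy aligned sequences, purely for illustration.
    names = ["protA", "protB", "protC"]
    seqs = ["MKTAYIAK", "MKTVYIAK", "MK-VYLAK"]
    print(phylip_square(names, seqs), end="")
```

The resulting text can be saved to a file and tried as the distance input for clustering.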
I have tried feeding Mothur a protein sequence alignment and got clusters. But when I tried to use bin.seqs to get the sequences for each OTU, the command output looks like this:
G0NHG8H03G6L67 1
CNNGGGNNNNNTNCTGANNNN
It automatically converts non-ATCG to N.
Will bin.seqs be able to bin protein sequences in a future release?
Thanks,
Chien-Chi
*** I just browsed other threads and saw that dist.seqs should not be used for amino acid sequences.
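One possible workaround for the bin.seqs behavior described above is to do the binning outside mothur, reading the OTU assignments from the list file and pulling the untouched protein sequences from the original FASTA file. The sketch below is only that, a sketch: the function names are invented for this example, and it assumes each line of the list file is tab-separated, with a label, the number of OTUs, and then one comma-separated group of sequence names per OTU. Some mothur versions also write a header line, which would need to be skipped.

```python
def read_fasta(text):
    """Parse FASTA text into {name: sequence}, using the first word of each header as the name."""
    seqs = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].split()[0]
            seqs[name] = ""
        elif name is not None:
            seqs[name] += line  # keep residues exactly as written
    return seqs


def parse_list_line(line):
    """Split one list-file line into (label, [[names in OTU 1], [names in OTU 2], ...])."""
    fields = line.rstrip("\n").split("\t")
    label = fields[0]
    # fields[1] is assumed to be the OTU count; the OTU groups follow it.
    otus = [f.split(",") for f in fields[2:] if f]
    return label, otus


def bin_sequences(fasta_text, list_line):
    """Return FASTA text where each header carries the sequence name and its OTU number."""
    seqs = read_fasta(fasta_text)
    _, otus = parse_list_line(list_line)
    out = []
    for otu_number, members in enumerate(otus, start=1):
        for name in members:
            out.append(f">{name}\t{otu_number}\n{seqs[name]}")
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    # Toy data, purely for illustration.
    fasta = ">p1\nMKTAYIAK\n>p2\nMKTVYIAK\n>p3\nGGSLLWQE\n"
    list_line = "0.03\t2\tp1,p2\tp3"
    print(bin_sequences(fasta, list_line), end="")
```

Because the sequences are copied straight from the FASTA file, no residue is rewritten as N.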