name file in chimera.perseus

Hello,
I would like to use the perseus algorithm to check a dataset for chimeras, but am getting a recurring error:

is in your fasta file and not in your namefile, please correct.

More specifically, the first sequence on each line of the name file is read correctly, but any additional sequences linked to that representative sequence (i.e., anything after a comma) are ignored. I assume this is because my name file was not generated within mothur. I created the file manually, as I am using other pipelines more suitable to my data for most of my analysis. However, they don’t have chimera-checking options, so I was hoping to use mothur to run Perseus. As far as I can tell, the name file is correctly formatted according to the mothur wiki: the representative sequence in the first column, and all the sequences it represents, separated by commas, in the second column. An example of both input files is below. Any insight on the problem would be greatly appreciated!

-Marie

Sample Name File:

21251222902 21251222902,2121110114
2121110232 2121110232,232508291

Sample fasta:

21251222902
GAAATGCGATAAGTAATGTGAATTGCAGAATTCAGTGAATCATCGAATCTTTGAACGCACCTTGCGCCCTTTGGTATTCCGAAGGGCATGCCTGTTTGAGTGTCATTAAATTCTCAACCTTGCTCGCCTTTACCGGCTTGAGTGAGGCTTGGACGTGAGGGCTTTGCTGGCTTCCTTAAGTGGATGGTCTGCTCCCTTTAAATGCATTAGTGGGATCTCTTGTGGACCGTCACTTGGTGTGATAATTATCTACGCCTCGTCGTACTTTGAAGACAAACTTATGGGAACCTGCTTATAACCGTCTCGACGAAGGGACTAACTTTCTGACTATTTGACCTACAAATCAGGTACGGACCTACCCGCTA
2121110114
GAAATGCGATAAGTAATGTGAATTGCAGAATTCAGTGAATCATCGAATCTTTGAACGCACCTTGCGCCCTTTGGTATTCCGAAGGGCATGCCTGTTTGAGTGTCATTAAATTCTCAACCTTGCTCGCCTTTACCGGCTTGAGTGAGGCTTGGACGTGAGGGCTTTGCTGGCTTCCTTAAGTGGATGGTCTGCTCCCTTTAAATGCATTAGTGGGATCTCTTGTGGACCGTCACTTGGTGTGATAATTATCTACGCCTCGTCGTACTTTGAAGACAAACTTATGGGAACCTGCTTATAACCGTCTCGACGAAGGGACTAACTTTCTGACTATTTGACCTACAAATCAGGTACGGACCTACCCGCTA
2121110232
GAAATGCGATAAGTAATGTGAATTGCAGAATTCAGTGAATCATCGAATCTTTGAACGCACCTTGCGCCCTTTGGTATTCCGAAGGGCATGCCTGTTTGAGTGTCATTAAATTCTCAACCTTGCTCGCCTTTACCGGCTTGAGTGAGGCTTGGACGTGAGGGCTTTGCTGGCTTCCTTAAGTGGATGGTCTGCTCCCTTTAAATGCATTAGTGGGATCTCTTGTGGACCGTCACTTGGTGTGATAATTATCTACGCCTCGTCGTACTTTGAAGACAAACTTATGGGAACCTGCTTATAACCGTCTCGACGAAGGGACTAACTTTCTGACTATTTGACCTACAAATCAGGTACGGACCTACCCGCTA
232508291
GAAATGCGATAAGTAATGTGAATTGCAGAATTCAGTGAATCATCGAATCTTTGAACGCACCTTGCGCCCTTTGGTATTCCGAAGGGCATGCCTGTTTGAGTGTCATTAAATTCTCAACCTTGCTCGCCTTTACCGGCTTGAGTGAGGCTTGGACGTGAGGGCTTTGCTGGCTTCCTTAAGTGGATGGTCTGCTCCCTTTAAATGCATTAGTGGGATCTCTTGTGGACCGTCACTTGGTGTGATAATTATCTACGCCTCGTCGTACTTTGAAGACAAACTTATGGGAACCTGCTTATAACCGTCTCGACGAAGGGACTAACTTTCTGACTATTTGACCTACAAATCAGGTACGGACCTACCCGCTA

Hi Marie,

The fasta file should only have the sequence names and sequences that are found in the first column of the names file. Try using this instead…


>21251222902
GAAATGCGATAAGTAATGTGAATTGCAGAATTCAGTGAATCATCGAATCTTTGAACGCACCTTGCGCCCTTTGGTATTCCGAAGGGCATGCCTGTTTGAGTGTCATTAAATTCTCAACCTTGCTCGCCTTTACCGGCTTGAGTGAGGCTTGGACGTGAGGGCTTTGCTGGCTTCCTTAAGTGGATGGTCTGCTCCCTTTAAATGCATTAGTGGGATCTCTTGTGGACCGTCACTTGGTGTGATAATTATCTACGCCTCGTCGTACTTTGAAGACAAACTTATGGGAACCTGCTTATAACCGTCTCGACGAAGGGACTAACTTTCTGACTATTTGACCTACAAATCAGGTACGGACCTACCCGCTA
>2121110232
GAAATGCGATAAGTAATGTGAATTGCAGAATTCAGTGAATCATCGAATCTTTGAACGCACCTTGCGCCCTTTGGTATTCCGAAGGGCATGCCTGTTTGAGTGTCATTAAATTCTCAACCTTGCTCGCCTTTACCGGCTTGAGTGAGGCTTGGACGTGAGGGCTTTGCTGGCTTCCTTAAGTGGATGGTCTGCTCCCTTTAAATGCATTAGTGGGATCTCTTGTGGACCGTCACTTGGTGTGATAATTATCTACGCCTCGTCGTACTTTGAAGACAAACTTATGGGAACCTGCTTATAACCGTCTCGACGAAGGGACTAACTTTCTGACTATTTGACCTACAAATCAGGTACGGACCTACCCGCTA
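
For a larger dataset, trimming the fasta file by hand is impractical. Below is a minimal Python sketch of the same idea, not part of mothur: it keeps only the fasta records whose name appears in the first column of the name file. The function names `read_representatives` and `filter_fasta` are hypothetical, and the sketch assumes whitespace-separated name-file columns and fasta headers that start with `>`.

```python
def read_representatives(names_text):
    """Collect the first-column (representative) names from name-file text."""
    reps = set()
    for line in names_text.splitlines():
        fields = line.split()  # assumption: columns are separated by whitespace
        if fields:
            reps.add(fields[0])
    return reps


def filter_fasta(fasta_text, keep):
    """Return fasta text containing only the records whose name is in `keep`."""
    kept_lines = []
    writing = False
    for line in fasta_text.splitlines():
        if line.startswith(">"):
            parts = line[1:].split()
            name = parts[0] if parts else ""
            writing = name in keep  # decide once per record, at its header
        if writing:
            kept_lines.append(line)
    return "\n".join(kept_lines) + ("\n" if kept_lines else "")


# Shortened, made-up sequences; the names come from the sample name file above.
names_text = "21251222902\t21251222902,2121110114\n2121110232\t2121110232,232508291\n"
fasta_text = ">21251222902\nACGT\n>2121110114\nACGT\n>2121110232\nTTGA\n>232508291\nTTGA\n"
trimmed = filter_fasta(fasta_text, read_representatives(names_text))
```

Here `trimmed` holds only the two representative records, which is the layout described in the reply above.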

Hi,
Thanks for the help! Using only the representative sequences has cleared up part of the problem, but I’m still getting the same error, only now it seems to occur for a completely random selection of representative sequences. I have checked the files repeatedly: the sequences triggering the error are present in both the fasta file and the name file, with no apparent formatting errors, so I’m at a loss as to what the problem is.

An example of what will trigger the error:

Name file:
23GA12_1545 23GA12_1545,13GD12_20533
21GA06_11136 21GA06_11136,13BD06_19303,21GA06_7309,22BD06_23210,13GB06_11364,13GA06_20008

Fasta file:

23GA12_1545
TAGATGCGAGACGTAACGTGAATTGCAGGACTTTGTGAACGTTAATTCTTCGAACGTACATTACGGCTTCGGGTCAACCGAAGCCACGCCTGGTTGAGGGTCAGTTGAACTAAACACTCGTAGCAGTGCGATTGCTACGGAATGTCTGAATAGCACTGTAAAAGGCGCTAGCAGATCAAGTTGAGACCAGTTGTGCTTGAATATAGCTGACCAACGCGAATACCGCGTCAGTTGGAGCATCGGGCTTTCCNGACGACCACTGCACCGACGACGTGTCGTCGCGACGAACGGAACGACGTACGACCGAACGTACGGACGG
21GA06_11136
GAAATGCGATAAGTAGTGTGAATTGCAGAATTCAGTGAATCATCGAATCTTTGAACGCATATTGCGCTTCCTGGTATTCCGGGGAGCATGCTCGTTTCAGAATCATTAAAAAACTCACACAGCTGTTTCTTGCCTGTCATGGGTAAGTGGAGCACGTGTGGATTTGAGCGTTTTGTCCAGGCCTCTTTTTGAGGCCAGGACAATCGCTTTGAATGATTGTGCGTACAGCAGTACATTACGCTTTGGAACTTGTGTTTTTGAGCTACGGTTTTGAGGCGTGCGTCCGCTCGACTCTCAGTGCAACAGATCATTTTAGAAACTACTTTTGGGTTTGAGCTACCGCGAGCTGCATTACGCTTTCCTAATTTTACACGGTGCCTGCTCGCTAATTTTGCGTAGTCGGCTCCGACTAACTTGGTCTAGAATCGAGACAACGAAACCCGCTACGAACTTAAGACATACTACAATAAGACGGAGGAGA

I’m probably missing something obvious again, but any help would be appreciated!

Thanks,
Marie

If you want to send the fasta and name files to mothur.bugs@gmail.com, I can take a look.
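
In the meantime, one way to narrow this down is to compare the names in the two files exactly as they are stored. The Python sketch below is only a diagnostic aid, not a known cause of this error and not part of mothur. It assumes a tab-delimited name file and fasta headers starting with `>`, and the function names are hypothetical. It deliberately avoids stripping whitespace and prints names with `repr`, so a trailing space, a carriage return, or a space used in place of a tab becomes visible.

```python
def first_column(names_text):
    """First tab-delimited field of each non-empty line, left unstripped."""
    return [line.split("\t")[0] for line in names_text.split("\n") if line]


def header_names(fasta_text):
    """Everything after '>' on each header line, left unstripped."""
    return [line[1:] for line in fasta_text.split("\n") if line.startswith(">")]


def report_mismatches(names_text, fasta_text):
    """List names found in only one file, shown with repr to expose hidden characters."""
    reps = set(first_column(names_text))
    headers = set(header_names(fasta_text))
    return {
        "fasta_only": sorted(repr(n) for n in headers - reps),
        "names_only": sorted(repr(n) for n in reps - headers),
    }


# Made-up example: the second fasta header carries a trailing space.
example = report_mismatches("A1\tA1,B2\nC3\tC3\n", ">A1\nACGT\n>C3 \nTTGA\n")
for side, names in example.items():
    print(side, names)
```

If both lists come back empty, the names match character for character and the cause lies elsewhere.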

Will do! Thanks again :)