Hey, I’m an undergraduate student who just got some Co1 reads from eDNA sampling. Can I use the mothur MiSeq SOP (V4 region) to analyze my Co1 reads? Or is there another set of code/tutorials I can reference for a better analysis?
P.S.
I’ve gotten an error that “mothur found unpaired files” in my fasta/gz folder, and some of the SOP directions don’t apply because Co1 is a different region with a different length in base pairs. Our master’s student got no such errors when working with our 16S reads (which are also from the V4 region).
Hi - that’s awesome you are using mothur as an undergrad! I don’t know what Co1 is, but you should be able to use mothur if the sequence data were generated as amplicons. For paired-end sequencing, you should have a pair of files (forward and reverse) for each sample. Can you either find the missing files or remove the samples/files that don’t have a pair?
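If it helps, here is a rough Python sketch for spotting which files are missing their mate. It is only a sketch: it assumes Illumina-style names where the forward and reverse files differ only by an `_R1` / `_R2` tag (e.g. `sample_S1_L001_R1_001.fastq.gz`), and `find_unpaired` is a made-up helper name, not anything from mothur. Files without such a tag are skipped, so adjust the pattern if your files are named differently.

```python
import re
from collections import defaultdict

# Assumption: read direction is marked by _R1 or _R2 followed by "_" or ".",
# as in sample_S1_L001_R1_001.fastq.gz. Change this if your names differ.
READ_TAG = re.compile(r"_R([12])(?=[_.])")


def find_unpaired(filenames):
    """Return a sorted list of file names whose forward/reverse mate is missing."""
    mates = defaultdict(dict)
    for name in filenames:
        match = READ_TAG.search(name)
        if match is None:
            continue  # no read tag found; not treated as part of a pair
        # Same key for both mates: the name with the read number masked out.
        key = name[:match.start()] + "_R?" + name[match.end():]
        mates[key][match.group(1)] = name
    unpaired = []
    for reads in mates.values():
        if len(reads) == 1:  # only R1 or only R2 is present
            unpaired.extend(reads.values())
    return sorted(unpaired)
```

To run it on a folder, something like `find_unpaired(os.listdir("path/to/your/fastq/folder"))` (after `import os`) should list the files that need a partner or need to be removed.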
Now my problem is with my reference database file. I’m using the MIDORI2_LONGEST_NUC_GB264_CO1_MOTHUR.fasta file to look at the ~650 bp region of Co1.
I got the error: Mothur is not setup to process protein sequences and template is not aligned.
So I tried to align it in MAFFT but then it erased all of my data and replaced it with gaps. I know this because afterwards I had Mothur eliminate all the gaps with:
filter.seqs(inputdir=/Users/drocean/Desktop/Sequences1/Co1 Sequences/Co1 Contigs, fasta=MIDORI2_LONGEST_CO1_MOTHUR_aligned.fasta, vertical=T, trump=.)
And it removed every single column. My protocol for MAFFT was as follows:
input file name
output file name
output format = fasta format input order
Strategy = auto
Additional arguments = --ep 0 --op 1.53 (ep sets the gap extension penalty to zero and op sets the gap opening penalty to 1.53)
I’m not sure what I’m doing wrong or if I’m overcomplicating things.
I just tried to skip the MAFFT step and use screen.seqs on the original MIDORI2 file to trim it to the correct length in base pairs, but I got an error that says: template is not aligned, aborting.
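One way to see what is actually in a reference file before handing it to mothur is a quick check like the Python sketch below. It rests on the assumption that "aligned" here means every sequence has the same length, with `-` or `.` marking gaps; the function names are just illustrative. It reports whether the lengths match and how many sequences contain no bases at all, which would show whether the MAFFT output really is nothing but gaps.

```python
def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    name, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            name, chunks = line[1:], []
        else:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def summarize_alignment(lines):
    """Summarize sequence lengths and gap-only sequences in a FASTA file."""
    lengths = set()
    n_seqs = 0
    all_gap_seqs = 0
    for _, seq in read_fasta(lines):
        n_seqs += 1
        lengths.add(len(seq))
        # Count characters that are not gap symbols ("-" or ".").
        bases = sum(1 for char in seq if char not in "-.")
        if bases == 0:
            all_gap_seqs += 1
    return {
        "n_seqs": n_seqs,
        # Assumption: equal lengths is what "aligned" means for a template.
        "aligned": n_seqs > 0 and len(lengths) == 1,
        "lengths": sorted(lengths),
        "all_gap_seqs": all_gap_seqs,
    }
```

Usage would be along the lines of `with open("MIDORI2_LONGEST_CO1_MOTHUR_aligned.fasta") as handle: print(summarize_alignment(handle))`. If `aligned` comes back `False` for the original MIDORI2 file, that matches the "template is not aligned" error; if `all_gap_seqs` is zero for the MAFFT output, the bases are still there.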