Co1 Region in Mothur

Hey, I’m an undergraduate student who just got some Co1 reads from eDNA sampling. Can I use the mothur MiSeq SOP (V4 region) to analyze my Co1 reads? Or is there another set of code/tutorials I can reference for a better analysis?

P.S.
I’ve gotten an error that “mothur found unpaired files” in my fasta/gz folder, and some of the SOP directions don’t apply because Co1 is a different region with a different length in base pairs. Our master’s student got no such errors when working with our 16S reads (which are also from the V4 region).

Hi - that’s awesome you are using mothur as an undergrad! I don’t know what Co1 is, but you should be able to use mothur if the sequence data were generated as amplicons. For paired-end sequencing, you should have a pair of files (forward and reverse) for each sample. Can you either find the missing files or remove the samples/files that don’t have a pair?
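
One way to spot the files without a mate is a short script like the sketch below. It is only an illustration, and it assumes Illumina-style names in which the forward and reverse files differ only by `_R1` versus `_R2` (for example `sample_S1_L001_R1_001.fastq.gz`). The function name `find_unpaired` is made up for this example; adjust the pattern if your files are named differently.

```python
import re
from collections import defaultdict

# Assumption: mates differ only by "_R1" vs "_R2", followed by "_" or "."
READ_TAG = re.compile(r"_R([12])(?=[_.])")

def find_unpaired(filenames):
    """Return the sorted names of read files whose forward/reverse mate is absent."""
    mates = defaultdict(dict)
    for name in filenames:
        match = READ_TAG.search(name)
        if match is None:
            continue  # not a read file under the assumed naming scheme
        # Replace the read tag so both mates map to the same key
        key = READ_TAG.sub("_R#", name, count=1)
        mates[key][match.group(1)] = name
    return sorted(
        name
        for pair in mates.values() if len(pair) == 1
        for name in pair.values()
    )
```

Calling `find_unpaired(os.listdir(folder))` on the folder that holds the reads (after `import os`) lists the files to track down or set aside.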

Pat

Hey so I got all my reads to be paired!

Now my problem is with my reference database file. I’m using the MIDORI2_LONGEST_NUC_GB264_CO1_MOTHUR.fasta file to look at the ~650 bp region of Co1.
I got the error: Mothur is not setup to process protein sequences and template is not aligned.
So I tried to align it in MAFFT but then it erased all of my data and replaced it with gaps. I know this because afterwards I had Mothur eliminate all the gaps with:
filter.seqs(inputdir=/Users/drocean/Desktop/Sequences1/Co1 Sequences/Co1 Contigs, fasta=MIDORI2_LONGEST_CO1_MOTHUR_aligned.fasta, vertical=T, trump=.)

And it removed every single column. My protocol for MAFFT was as follows:
input file name
output file name
output format = fasta format input order
Strategy = auto
Additional arguments = --ep 0 --op 1.53 (--ep sets the gap extension penalty to zero and --op sets the gap opening penalty to 1.53)
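
For reference, the two filter.seqs options used above can be modeled in a few lines of Python. This is a simplified sketch of how the options are documented to behave (vertical=T drops columns where every sequence has a gap, and trump=. drops a column if any sequence has a `.` there). It is not mothur's own code, and the function name `surviving_columns` is hypothetical.

```python
def surviving_columns(aligned_seqs, vertical=True, trump=None):
    """Return the 0-based indices of alignment columns kept by the two rules."""
    lengths = {len(seq) for seq in aligned_seqs}
    if len(lengths) != 1:
        raise ValueError("sequences differ in length, so this is not an alignment")
    kept = []
    for i in range(lengths.pop()):
        column = [seq[i] for seq in aligned_seqs]
        if vertical and all(ch in "-." for ch in column):
            continue  # every sequence has a gap character in this column
        if trump is not None and trump in column:
            continue  # at least one sequence has the trump character here
        kept.append(i)
    return kept
```

Running it on a handful of sequences from the aligned file, once with `trump=None` and once with `trump="."`, shows which of the two rules is responsible when few or no columns survive.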

I’m not sure what I’m doing wrong or if I’m overcomplicating things.

I just tried to skip the MAFFT step and use screen.seqs to trim to the correct basepair size on the original MIDORI2 file but I got the error that says: template is not aligned, aborting.

So once again any advice is helpful.

What does the output of running summary.seqs look like on the output from MAFFT? What syntax are you using for screen.seqs after running align.seqs?
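
If it helps to peek at the MAFFT output outside mothur as well, the sketch below reports, for each sequence, the first and last alignment columns that hold a base and the number of bases. It is a rough stand-in written for this thread, not a reimplementation of summary.seqs, and the names `parse_fasta` and `summarize` are made up for the example.

```python
GAPS = "-."  # characters treated as gaps in an aligned FASTA file

def parse_fasta(lines):
    """Yield (header, sequence) pairs from FASTA-formatted lines."""
    name, chunks = None, []
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            name, chunks = line[1:], []
        elif line:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)

def summarize(seq):
    """Return (start, end, nbases); start and end are 1-based columns, 0 if no bases."""
    positions = [i for i, ch in enumerate(seq, start=1) if ch not in GAPS]
    if not positions:
        return 0, 0, 0
    return positions[0], positions[-1], len(positions)
```

Opening the aligned file and printing `summarize(seq)` for each pair from `parse_fasta(handle)` shows at a glance whether the sequences still contain bases or only gaps.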

Pat
