Customize coordinates for v4 region

Hello, I am customizing a reference alignment for the 16S V4 region. After I aligned my .fasta file to the silva.seed_v132.align file using the “align.seqs” command, no .align file was generated, so I can’t run the summary.seqs command. Can anyone help me with this issue?

Can you post the commands you are running and the names of the files that are being generated along with any warning/error messages?

Thanks,
Pat

Hi Dr. Schloss,

This is my align command:
“align.seqs(fasta=Ecoli.v4.fasta, reference=silva.seed_v132.align)”
No .align file was generated after this command. Maybe something is wrong with my v4.fasta file; I can post it later.

Alternatively, I just used the V4 coordinates 11895 and 25318 to trim silva.seed_v132.align, and it worked fine.
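
For anyone wondering what trimming an alignment by coordinates amounts to, here is a rough Python sketch. It is a plain column slice, not mothur's own implementation, and it assumes the coordinates are 1-based and inclusive. The function name `trim_alignment_columns` and the toy records are made up for illustration.

```python
def trim_alignment_columns(records, start, end):
    """Keep alignment columns start..end (assumed 1-based, inclusive) of each record."""
    trimmed = []
    for name, aligned in records:
        if len(aligned) < end:
            raise ValueError(f"{name} has only {len(aligned)} columns")
        # Python slices are 0-based and end-exclusive, hence start - 1.
        trimmed.append((name, aligned[start - 1:end]))
    return trimmed


# Toy 10-column alignment; '.' and '-' stand for gap characters.
toy = [("seqA", "..ACG-TT.."), ("seqB", ".-ACGGTT-.")]
result = trim_alignment_columns(toy, start=3, end=8)
for name, aligned in result:
    print(name, aligned)
```

With a real reference, `start` and `end` would be the alignment-space coordinates of the region, such as the 11895 and 25318 mentioned above.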
Two quick questions: Should I use the Silva full-length sequence references or the recreated seed database as the alignment reference? Can I use the HMP_MOCK.v35.fasta file to check my own sequencing errors?

Thanks for your time.

Hi,

  1. Were there any error messages when you ran align.seqs?

  2. I think the recreated seed should work as well as the non-redundant alignments (they both contain full-length sequences).

  3. I doubt you sequenced the HMP Mock community. You would need to get your own fasta file with the 16S rRNA gene sequences in your mock community.

Pat

Hi Dr. Schloss,

  1. When I ran align.seqs, no messages popped up.

  2. Yes, the recreated seed file works fine for me now.

  3. I found my own mock file. However, when I used it as the reference to run the seq.error command, the following warning messages popped up. Maybe I should screen the mock file to get rid of ambigs?

[WARNING]: We found more than 25% of the bases in sequence AA>1BCAFFAC11FEGFG?FGHCG?AEHHHHF1FCEGEGHGGEFEEEEGECGEC>ECDGGHHFFFGBFHHEG?GCEFGH?EGE<>?GFFGCGG/>@CGCFFHH1C1FGB11GFCGECCG0<0=0DGGFHCCGEHGHGHHGEHHHHHGHFEGFBFGG?.;EFB0.9AB-99////;9FFA–B?FFFF9BB/-@>-999=@@?FFFFFFBBFFFF//;///—?BF99AA–;-BFB–@-@=9A@9>@- to be ambiguous. Mothur is not setup to process protein sequences.
[WARNING]: 417 reference sequences have ambiguous bases, these bases will be ignored.

I suspect you haven’t correctly formatted your reference file.
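
One way to check the formatting is a quick script like the sketch below. It is only a rough sanity check, not part of mothur: the function name `check_fasta`, the strict base set, and the 25% threshold (borrowed from the warning text) are all assumptions made for illustration. The "sequence" in the warning looks a lot like a quality-score string, so one possibility, and it is only a guess, is that the reference is a FASTQ file rather than a FASTA file.

```python
import io

# Assumption: only plain bases, N, and gap characters count as valid here.
VALID = set("ACGTUNacgtun.-")


def check_fasta(handle, max_bad_fraction=0.25):
    """Return (line_number, problem) tuples for a FASTA-like text stream."""
    problems = []
    for number, line in enumerate(handle, start=1):
        line = line.strip()
        if not line or line.startswith(">"):
            continue  # blank lines and FASTA headers are fine
        if number == 1 and line.startswith("@"):
            problems.append((number, "starts with '@': looks like FASTQ"))
            continue
        bad = sum(ch not in VALID for ch in line)
        if bad / len(line) > max_bad_fraction:
            problems.append((number, f"{bad} of {len(line)} characters are not bases"))
    return problems


# Hypothetical file contents, for illustration only.
fastq_text = "@read1\nACGTACGT\n+\nAA>1BCAF\n"
good_text = ">ref1\nACGT-ACGT\n"

for label, text in (("fastq-like", fastq_text), ("fasta", good_text)):
    print(label, check_fasta(io.StringIO(text)))
```

To run it on a real file, pass an open file handle instead of the `io.StringIO` objects. Any flagged lines are worth inspecting by eye before rerunning the error analysis.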
