Customize coordinates for v4 region

zhinew · May 16, 2023, 2:52am

Hello, I am customizing a reference alignment for the 16S V4 region. After I aligned my .fasta file to silva.seed_v132.align using the align.seqs command, no .align file was generated, so I can't run the summary.seqs command. Can anyone help me with this issue?

pschloss · May 16, 2023, 6:00pm

Can you post the commands you are running and the names of the files that are being generated along with any warning/error messages?

Thanks,
Pat

zhinew · May 16, 2023, 7:15pm

Hi Dr. Schloss,

This is my align.seqs command:
align.seqs(fasta=Ecoli.v4.fasta, reference=silva.seed_v132.align)
No .align file is generated after this command. Maybe something is wrong with my v4.fasta file; I can post it later.

Alternatively, I just used the V4 coordinates 11895 and 25318 to trim silva.seed_v132.align, and that worked fine.
Two quick questions: Should I use the Silva full-length sequence references or the recreated seed database as the alignment reference? Can I use the HMP_MOCK.v35.fasta file to check my own sequencing errors?

Thanks for your time.
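
For readers following along, trimming an aligned reference to a window of columns can be sketched in Python. This is only an illustration of the idea, not what mothur does internally. It assumes the coordinates are 1-based, inclusive alignment columns, and the helper names (`read_fasta`, `trim_alignment`) and the toy data are made up for the example.

```python
from typing import Dict, Iterable


def read_fasta(lines: Iterable[str]) -> Dict[str, str]:
    """Parse FASTA text into {name: sequence}; wrapped sequence lines are joined."""
    records: Dict[str, str] = {}
    name = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            header = line[1:].split()
            name = header[0] if header else ""
            records[name] = ""
        elif name is not None:
            records[name] += line
    return records


def trim_alignment(records: Dict[str, str], start: int, end: int) -> Dict[str, str]:
    """Keep alignment columns start..end (assumed 1-based, inclusive)."""
    if start < 1 or end < start:
        raise ValueError("expected 1 <= start <= end")
    return {name: seq[start - 1:end] for name, seq in records.items()}


# Toy 10-column alignment standing in for silva.seed_v132.align.
toy = [">seqA", "..AC-GT-A.", ">seqB", "..ACGGT-AC"]
trimmed = trim_alignment(read_fasta(toy), start=3, end=8)
for name, seq in trimmed.items():
    print(f">{name}\n{seq}")
```

With a real file, the same functions could be called on an open file handle with start=11895 and end=25318. Within mothur itself, this kind of trimming is normally done with pcr.seqs and its start and end parameters.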

pschloss · May 18, 2023, 5:30pm

Hi,

Were there any error messages when you ran align.seqs?
I think the recreated seed should work as well as the non-redundant alignments (they both contain full-length sequences).
I doubt you sequenced the HMP Mock community. You would need your own fasta file with the 16S rRNA gene sequences from your mock community.

Pat

zhinew · May 18, 2023, 6:21pm

Hi Dr. Schloss,

When I ran align.seqs, no messages appeared.
Yes, the recreated seed file works fine for me now.
I found my own mock file. However, when I used it as the reference for the seq.error command, the following warning messages appeared. Maybe I should screen the mock file to get rid of ambiguous bases?

[WARNING]: We found more than 25% of the bases in sequence AA>1BCAFFAC11FEGFG?FGHCG?AEHHHHF1FCEGEGHGGEFEEEEGECGEC>ECDGGHHFFFGBFHHEG?GCEFGH?EGE<>?GFFGCGG/>@CGCFFHH1C1FGB11GFCGECCG0<0=0DGGFHCCGEHGHGHHGEHHHHHGHFEGFBFGG?.;EFB0.9AB-99////;9FFA--B?FFFF9BB/-@>-999=@@?FFFFFFBBFFFF//;///---?BF99AA--;-BFB--@-@=9A@9>@- to be ambiguous. Mothur is not setup to process protein sequences.
[WARNING]: 417 reference sequences have ambiguous bases, these bases will be ignored.

pschloss · May 23, 2023, 5:44pm

I suspect you haven’t correctly formatted your reference file.
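
One way to check the formatting before rerunning the command is a small script that reports records whose sequence is mostly non-nucleotide characters. This is a rough sketch, not mothur's own check. The 25% threshold is borrowed from the warning text, the allowed alphabet (A, C, G, T, U plus the gap characters - and .) is an assumption, and `check_fasta` is a hypothetical helper. The text quoted in the warning looks like it could be a FASTQ quality line, so the sketch also flags data that appears before the first '>' header.

```python
from typing import Iterable, List

NUCLEOTIDES = set("ACGTUacgtu")
GAPS = set("-.")


def check_fasta(lines: Iterable[str], threshold: float = 0.25) -> List[str]:
    """Return human-readable problems found in FASTA-formatted text."""
    problems: List[str] = []
    records = []  # list of [name, sequence]
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            records.append([line[1:], ""])
        elif not records:
            # Data before any '>' header; FASTQ headers start with '@'.
            hint = " (looks like FASTQ)" if line.startswith("@") else ""
            problems.append(f"text before first '>' header{hint}: {line[:20]}")
        else:
            records[-1][1] += line
    for name, seq in records:
        # IUPAC ambiguity codes such as N also count as non-ACGTU here.
        residues = [c for c in seq if c not in GAPS]
        bad = sum(1 for c in residues if c not in NUCLEOTIDES)
        if residues and bad / len(residues) > threshold:
            problems.append(f"{name}: {bad}/{len(residues)} non-nucleotide characters")
    return problems


good = [">mock1", "ACGTTGCA", "ACGT"]
bad = [">mock2", "AA>1BCAFFAC11FEGFG?FGHCG"]
print(check_fasta(good))
print(check_fasta(bad))
```

An empty list means nothing obviously wrong was found. It does not guarantee that mothur will accept the file.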
