Hi Pat,
Posting it here as a new thread.
And yes, I did remove the primers when making the Silva.v3_4 file.
I have a similar question about too many bases being eliminated. I am working with gut samples sequenced across the V3-V4 region. When I align these sequences to the Silva.v3_4 file, I get a similar warning message.
This is the output from my log file. I am running mothur version 1.48.0 (Linux) on the HPC from a batch file. Since flip=T is already the default in the new version, I have not specified it in the command.
mothur > align.seqs(fasta=current, reference=Silva.v3_4.fasta)
Using /scratch/leuven/341/vsc34148/V3_V4/V3_V4.trim.contigs.good.unique.fasta as input file for the fasta parameter.
Using 36 processors.
Reading in the /scratch/leuven/341/vsc34148/V3_V4/Silva.v3_4.fasta template sequences… DONE.
It took 61 to read 213119 sequences.
Aligning sequences from /scratch/leuven/341/vsc34148/V3_V4/V3_V4.trim.contigs.good.unique.fasta …
It took 38084 secs to align 15860288 sequences.
[WARNING]: 4426786 of your sequences generated alignments that eliminated too many bases, a list is provided in /scratch/leuven/341/vsc34148/V3_V4/V3_V4.trim.contigs.good.unique.flip.accno$
[NOTE]: 4388639 of your sequences were reversed to produce a better alignment.
It took 38084 seconds to align 15860288 sequences.
Output File Names:
/scratch/leuven/341/vsc34148/V3_V4/V3_V4.trim.contigs.good.unique.align
/scratch/leuven/341/vsc34148/V3_V4/V3_V4.trim.contigs.good.unique.align_report
/scratch/leuven/341/vsc34148/V3_V4/V3_V4.trim.contigs.good.unique.flip.accnos
mothur > summary.seqs(fasta=current, count=current)
Using /scratch/leuven/341/vsc34148/V3_V4/V3_V4.trim.contigs.good.count_table as input file for the count parameter.
Using /scratch/leuven/341/vsc34148/V3_V4/V3_V4.trim.contigs.good.unique.align as input file for the fasta parameter.
Using 36 processors.
             Start   End     NBases  Ambigs  Polymer  NumSeqs
Minimum:     0       0       0       0       1        1
2.5%-tile:   1       18983   440     0       4        637607
25%-tile:    1       18983   458     0       5        6376061
Median:      1       18985   465     0       5        12752121
75%-tile:    55      18985   469     0       6        19128181
97.5%-tile:  55      18985   469     0       6        24866635
Maximum:     18985   18985   480     0       8        25504240
Mean:        62      18968   459     0       5
# of unique seqs: 15860288
total # of seqs: 25504240
It took 646 secs to summarize 25504240 sequences.
Output File Names:
/scratch/leuven/341/vsc34148/V3_V4/V3_V4.trim.contigs.good.unique.summary
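To put the warning in proportion, here is a minimal Python sketch of the arithmetic on the counts reported in the align.seqs log above. The variable names and the `share` helper are my own labels for illustration, not mothur output or mothur functions; only the three counts come from the log.

```python
# Counts copied from the align.seqs log above.
unique_seqs = 15_860_288    # unique sequences that were aligned
flagged = 4_426_786         # sequences in the "eliminated too many bases" warning
reversed_seqs = 4_388_639   # sequences reversed to produce a better alignment


def share(part: int, whole: int) -> float:
    """Return `part` as a percentage of `whole` (hypothetical helper)."""
    return 100.0 * part / whole


flagged_pct = share(flagged, unique_seqs)
reversed_pct = share(reversed_seqs, unique_seqs)

print(f"flagged:  {flagged_pct:.1f}% of unique sequences")
print(f"reversed: {reversed_pct:.1f}% of unique sequences")
```

Both shares come out at a little under 28% of the unique sequences, so the two counts are of very similar size.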
Any suggestions would be very helpful!
Many thanks,
Aditi