So I posted about this previously, but I did not respond in time and the thread closed.
I am running the current version of mothur on 16S V4 sequence data. When I executed filter.seqs, I got the following:
Length of filtered alignment: 0
Number of columns removed: 13425
Length of the original alignment: 13425
Number of sequences used to construct filter: 3058818
Output File Names:
/work/LAS/eswanner-lab/Micah/epa19_16s/epa19.filter
/work/LAS/eswanner-lab/Micah/epa19_16s/epa19.trim.contigs.good.unique.filter.fasta
In the previous discussion, Pat asked to see the summary.seqs command, so I have provided that here:
summary.seqs(fasta=epa19.trim.contigs.good.unique.align)
[WARNING]: This command can take a namefile and you did not provide one. The current namefile is /work/LAS/eswanner-lab/Micah/epa19_16s/epa19.trim.contigs.good.names which seems to match /work/LAS/eswanner-lab/Micah/epa19_16s/epa19.trim.contigs.good.unique.align.
              Start    End      NBases   Ambigs   Polymer  NumSeqs
Minimum:      0        0        0        0        1        1
2.5%-tile:    1968     11550    252      0        3        76471
25%-tile:     1968     11550    253      0        4        764705
Median:       1968     11550    253      0        4        1529410
75%-tile:     1968     11550    253      0        5        2294114
97.5%-tile:   1968     11550    254      0        6        2982348
Maximum:      13425    13425    275      0        114      3058818
Mean:         2013     11539    251      0        4
# of Seqs:    3058818
It took 159 secs to summarize 3058818 sequences.
Output File Names:
/work/LAS/eswanner-lab/Micah/epa19_16s/epa19.trim.contigs.good.unique.summary
filter.seqs(fasta=epa19.trim.contigs.good.unique.align, vertical=T, trump=.)
Output File Names:
epa19.trim.contigs.good.unique.summary
Length of filtered alignment: 0
Number of columns removed: 13425
Length of the original alignment: 13425
Number of sequences used to construct filter: 3058818
Output File Names:
epa19.filter
epa19.trim.contigs.good.unique.filter.fasta
Edit: I read through another post about this and, based on the suggestions there, added minlength=50 to the screen.seqs command. I reran it and the subsequent commands, but I still get the same output from summary.seqs and filter.seqs as shown above.
screen.seqs(fasta=epa19.trim.contigs.fasta, group=epa19.contigs.groups, maxambig=0, minlength=50, maxlength=275)
It took 52 secs to screen 13881595 sequences, removed 1654095.
/******************************************/
Running command: remove.seqs(accnos=epa19.trim.contigs.bad.accnos.temp, group=epa19.contigs.groups)
Removed 1654095 sequences from your group file.
Output File Names:
epa19.contigs.pick.groups
/******************************************/
Output File Names:
epa19.trim.contigs.good.fasta
epa19.trim.contigs.bad.accnos
epa19.contigs.good.groups
It took 87 secs to screen 13881595 sequences.
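The problem is that some of your sequences did not align to the correct region. With trump=. in filter.seqs, any column in which even one sequence has a "." is removed, so a few misaligned sequences (note the Minimum row in your summary, with Start 0, End 0, and NBases 0) are enough to lose all of the columns. This is why the SOP suggests running screen.seqs with start and end positions on the aligned sequences, and then running filter.seqs with vertical=T and trump=. (a period).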
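To make the effect concrete, here is a toy Python sketch of the idea. It is not mothur's implementation: the function names (`span`, `screen`, `trump_columns`) and the 10-column alignment are made up for illustration, and the screening rule is my simplified reading of the start/end criteria (keep a sequence only if it starts at or before `start` and ends at or after `end`).

```python
# Toy illustration only; hypothetical helper names, not mothur code.

def span(seq):
    """Return 1-based (first, last) base positions, or (0, 0) if there are no bases."""
    bases = [i for i, ch in enumerate(seq, start=1) if ch not in ".-"]
    return (bases[0], bases[-1]) if bases else (0, 0)

def screen(alignment, start, end):
    """Keep sequences that start at or before `start` and end at or after `end`."""
    kept = []
    for seq in alignment:
        s, e = span(seq)
        if s <= start and e >= end:
            kept.append(seq)
    return kept

def trump_columns(alignment, trump="."):
    """Return 0-based indices of columns in which no sequence has the trump character."""
    width = len(alignment[0])
    return [c for c in range(width) if all(seq[c] != trump for seq in alignment)]

alignment = [
    "..ACG-TA..",  # aligned to the expected region (positions 3 to 8)
    "..AC-GTA..",  # aligned to the expected region (positions 3 to 8)
    "..........",  # failed to align: start 0, end 0, no bases
]

# One unaligned sequence puts a "." in every column, so nothing survives.
print(len(trump_columns(alignment)))  # 0

# After screening on start/end, the shared region survives the trump filter.
screened = screen(alignment, start=3, end=8)
print(len(trump_columns(screened)))  # 6
```

In this toy case the third sequence plays the role of the Start 0 / End 0 sequence in the summary above: as long as it stays in the file, the filtered alignment has length 0.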
I tried running screen.seqs and then filter.seqs, but I still got the following:
Length of filtered alignment: 0
Number of columns removed: 13425
Length of the original alignment: 13425
Number of sequences used to construct filter: 3058818
Is there a way I could send my log file? I think I have had trouble keeping things straight while posting my information. As far as I can tell, I have done things in the correct order, but maybe I am missing something or got things switched around.