I’m getting some odd behavior when using trim.seqs with the qfile arguement when it’s not actually needed. Again I think an example will be more clear:
First I trim the sequences using average Q score as a threshold. Mothur didn’t throw up any mistakes so I’m assuming that qual file looked okay.
mothur > trim.seqs(fasta=Data.fasta, qfile=Data.qual, qaverage=25)
mothur > system(mv Data.trim.fasta Data.Q25.trim.fasta)
mothur > summary.seqs(fasta=Data.Q25.trim.fasta)
Start End NBases Ambigs Polymer
Minimum: 1 30 30 0 1
2.5%-tile: 1 34 34 0 2
25%-tile: 1 94 94 0 3
Median: 1 97 97 0 3
75%-tile: 1 97 97 0 3
97.5%-tile: 1 97 97 1 4
Maximum: 1 447 447 14 31of Seqs: 151527
I then try to trim the sequences using some additional criteria which don’t actually require the qual file, but I left the qfile argument in anyway.
mothur > trim.seqs(fasta=Data.Q25.trim.fasta, qfile=Data.qual, maxambig=0, minlength=88, maxlength=107)
mothur > system(mv Data.Q25.trim.trim.fasta Data.Q25.QC.trim.fasta)
mothur > summary.seqs(fasta=Data.Q25.QC.trim.fasta)
Start End NBases Ambigs Polymer
Minimum: 1 88 88 0 2
2.5%-tile: 1 91 91 0 2
25%-tile: 1 96 96 0 3
Median: 1 97 97 0 3
75%-tile: 1 97 97 0 3
97.5%-tile: 1 97 97 0 3
Maximum: 1 107 107 0 7of Seqs: 117266
Now I try the same thing but take the qfile argument out.
mothur > trim.seqs(fasta=Data.Q25.trim.fasta, maxambig=0, minlength=88, maxlength=107)
mothur > system(mv Data.Q25.trim.trim.fasta Data.Q25.QC.trim.fasta)
mothur > summary.seqs(fasta=Data.Q25.QC.trim.fasta)
Start End NBases Ambigs Polymer
Minimum: 1 88 88 0 2
2.5%-tile: 1 91 91 0 2
25%-tile: 1 96 96 0 3
Median: 1 97 97 0 3
75%-tile: 1 97 97 0 3
97.5%-tile: 1 97 97 0 3
Maximum: 1 107 107 0 7of Seqs: 120644
I randomly picked a few sequences in the fasta and qual files to check that their length actually match, and as far as I can see, they did. Is anyone else able to replicate this behavior?