Hi,
I am using mothur v.1.45.1.
I trimmed an E. coli sequence to the region amplified by the primers 341F and 805R. Aligning the trimmed sequence to silva.bacteria.fasta gave a start position of 6388 and an end position of 25316.
I used those values to run pcr.seqs. Then, after running filter.seqs, the output reports that the length of the filtered alignment is 0.
I looked at other threads on the same topic, but I cannot work out what is causing this.
Thank you so much for any hints.
Can you post the full commands you are running and their order?
Thanks,
Pat
Thank you so much for your response. I appreciate it.
Below is an extract of the log file with all the commands I ran, up to the point where the filtered alignment length came out as zero.
mothur > make.file(inputdir=bottledwater, type=fastq, prefix=stability)
Setting input directory to: bottledwater\
mothur > make.contigs(file=stability.files, processors=8)
mothur > summary.seqs(fasta=stability.trim.contigs.fasta)
Using 8 processors.
Start End NBases Ambigs Polymer NumSeqs
Minimum: 1 298 298 0 3 1
2.5%-tile: 1 439 439 0 4 20288
25%-tile: 1 441 441 0 5 202872
Median: 1 464 464 0 5 405744
75%-tile: 1 465 465 1 6 608615
97.5%-tile: 1 466 466 10 6 791199
Maximum: 1 602 602 282 289 811486
Mean: 1 455 455 1 5
# of Seqs: 811486
mothur > screen.seqs(fasta=stability.trim.contigs.fasta, group=stability.contigs.groups, maxambig=0, maxlength=466)
mothur > get.current()
mothur > summary.seqs()
Using bottledwater\stability.trim.contigs.good.fasta as input file for the fasta parameter.
Using 8 processors.
Start End NBases Ambigs Polymer NumSeqs
Minimum: 1 301 301 0 3 1
2.5%-tile: 1 439 439 0 4 12559
25%-tile: 1 441 441 0 5 125585
Median: 1 464 464 0 5 251170
75%-tile: 1 465 465 0 6 376754
97.5%-tile: 1 465 465 0 6 489780
Maximum: 1 466 466 0 27 502338
Mean: 1 454 454 0 5
# of Seqs: 502338
mothur > unique.seqs(fasta=stability.trim.contigs.good.fasta)
502338 401551
Output File Names:
bottledwater\stability.trim.contigs.good.names
bottledwater\stability.trim.contigs.good.unique.fasta
mothur > count.seqs(name=stability.trim.contigs.good.names, group=stability.contigs.good.groups)
It took 8 secs to create a table for 502338 sequences.
Total number of sequences: 502338
Output File Names:
bottledwater\stability.trim.contigs.good.count_table
mothur > summary.seqs(count=stability.trim.contigs.good.count_table)
Using bottledwater\stability.trim.contigs.good.unique.fasta as input file for the fasta parameter.
Using 8 processors.
Start End NBases Ambigs Polymer NumSeqs
Minimum: 1 301 301 0 3 1
2.5%-tile: 1 439 439 0 4 12559
25%-tile: 1 441 441 0 5 125585
Median: 1 464 464 0 5 251170
75%-tile: 1 465 465 0 6 376754
97.5%-tile: 1 465 465 0 6 489780
Maximum: 1 466 466 0 27 502338
Mean: 1 454 454 0 5
# of unique seqs: 401551
total # of seqs: 502338
It took 7 secs to summarize 502338 sequences.
Output File Names:
bottledwater\stability.trim.contigs.good.unique.summary
mothur > pcr.seqs(fasta=silva.bacteria.fasta, start=6388, end=25319, keepdots=F, processors=8)
Using 8 processors.
[NOTE]: no sequences were bad, removing bottledwater\silva.bacteria.bad.accnos
It took 12 secs to screen 14956 sequences.
Output File Names:
bottledwater\silva.bacteria.pcr.fasta
mothur > rename.file(input=silva.bacteria.pcr.fasta, new=silva.v4.fasta)
mothur > summary.seqs(fasta=silva.v4.fasta)
Unable to open bottledwater\silva.v4.fasta. Trying input directory bottledwater\silva.v4.fasta.
Unable to open bottledwater\silva.v4.fasta. Trying default C:\Users\gonzalg\Desktop\NEW_mothur\Mothur.win\mothur\silva.v4.fasta.
Using 8 processors.
Start End NBases Ambigs Polymer NumSeqs
Minimum: 1 18930 421 0 3 1
2.5%-tile: 1 18931 441 0 4 374
25%-tile: 1 18931 444 0 4 3740
Median: 1 18931 463 0 5 7479
75%-tile: 1 18931 466 0 5 11218
97.5%-tile: 1 18931 467 1 6 14583
Maximum: 3 18931 508 5 9 14956
Mean: 1 18930 456 0 4
# of Seqs: 14956
It took 4 secs to summarize 14956 sequences.
Output File Names:
C:\Users\gonzalg\Desktop\NEW_mothur\Mothur.win\mothur\silva.v4.summary
mothur > align.seqs(fasta=stability.trim.contigs.good.unique.fasta, reference=silva.v4.fasta)
Unable to open bottledwater\silva.v4.fasta. Trying input directory bottledwater\silva.v4.fasta.
Unable to open bottledwater\silva.v4.fasta. Trying default C:\Users\gonzalg\Desktop\NEW_mothur\Mothur.win\mothur\silva.v4.fasta.
Using 8 processors.
Reading in the C:\Users\gonzalg\Desktop\NEW_mothur\Mothur.win\mothur\silva.v4.fasta template sequences... DONE.
It took 6 to read 14956 sequences.
Aligning sequences from bottledwater\stability.trim.contigs.good.unique.fasta ...
It took 614 secs to align 401551 sequences.
[WARNING]: 401537 of your sequences generated alignments that eliminated too many bases, a list is provided in bottledwater\stability.trim.contigs.good.unique.flip.accnos.
[NOTE]: 401475 of your sequences were reversed to produce a better alignment.
It took 616 seconds to align 401551 sequences.
Output File Names:
bottledwater\stability.trim.contigs.good.unique.align
bottledwater\stability.trim.contigs.good.unique.align.report
bottledwater\stability.trim.contigs.good.unique.flip.accnos
mothur > summary.seqs(fasta=stability.trim.contigs.good.unique.align, count=stability.trim.contigs.good.count_table)
Using 8 processors.
Start End NBases Ambigs Polymer NumSeqs
Minimum: 0 0 0 0 1 1
2.5%-tile: 1 18928 438 0 4 12559
25%-tile: 1 18928 440 0 5 125585
Median: 1 18928 463 0 5 251170
75%-tile: 1 18928 464 0 6 376754
97.5%-tile: 1 18928 464 0 6 489780
Maximum: 18931 18931 466 0 27 502338
Mean: 42 18926 453 0 5
# of unique seqs: 401551
total # of seqs: 502338
It took 148 secs to summarize 502338 sequences.
Output File Names:
bottledwater\stability.trim.contigs.good.unique.summary
mothur > screen.seqs(fasta=stability.trim.contigs.good.unique.align, count=stability.trim.contigs.good.count_table, summary=stability.trim.contigs.good.unique.summary, maxhomop=8)
Using 8 processors.
It took 80 secs to screen 401551 sequences, removed 181.
/******************************************/
Running command: remove.seqs(accnos=bottledwater\stability.trim.contigs.good.unique.bad.accnos.temp, count=bottledwater\stability.trim.contigs.good.count_table)
Removed 181 sequences from your count file.
Output File Names:
bottledwater\stability.trim.contigs.good.pick.count_table
/******************************************/
Output File Names:
bottledwater\stability.trim.contigs.good.unique.good.summary
bottledwater\stability.trim.contigs.good.unique.good.align
bottledwater\stability.trim.contigs.good.unique.bad.accnos
bottledwater\stability.trim.contigs.good.good.count_table
It took 169 secs to screen 401551 sequences.
mothur > summary.seqs(fasta=current, count=current)
Using bottledwater\stability.trim.contigs.good.good.count_table as input file for the count parameter.
Using bottledwater\stability.trim.contigs.good.unique.good.align as input file for the fasta parameter.
Using 8 processors.
Start End NBases Ambigs Polymer NumSeqs
Minimum: 1 25 10 0 1 1
2.5%-tile: 1 18928 438 0 4 12554
25%-tile: 1 18928 440 0 5 125540
Median: 1 18928 463 0 5 251079
75%-tile: 1 18928 464 0 6 376618
97.5%-tile: 1 18928 464 0 6 489604
Maximum: 18899 18931 466 0 8 502157
Mean: 40 18927 453 0 5
# of unique seqs: 401370
total # of seqs: 502157
It took 140 secs to summarize 502157 sequences.
Output File Names:
bottledwater\stability.trim.contigs.good.unique.good.summary
mothur > filter.seqs(fasta=stability.trim.contigs.good.unique.good.align, vertical=T, trump=.)
Using 8 processors.
Creating Filter...
It took 77 secs to create filter for 401370 sequences.
Running Filter...
It took 71 secs to filter 401370 sequences.
Length of filtered alignment: 0
Number of columns removed: 18931
Length of the original alignment: 18931
Number of sequences used to construct filter: 401370
Output File Names:
bottledwater\stability.filter
bottledwater\stability.trim.contigs.good.unique.good.filter.fasta
Hi there,
In your second screen.seqs step…
screen.seqs(fasta=stability.trim.contigs.good.unique.align, count=stability.trim.contigs.good.count_table, summary=stability.trim.contigs.good.unique.summary, maxhomop=8)
You need to include start and end coordinate positions. You might try something like this…
screen.seqs(fasta=stability.trim.contigs.good.unique.align, count=stability.trim.contigs.good.count_table, summary=stability.trim.contigs.good.unique.summary, maxhomop=8, start=1, end=18928)
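To make the effect of those two parameters concrete, here is a minimal Python sketch of the selection rule: a sequence is kept only if it starts at or before the start position and ends at or after the end position. This is an illustration, not mothur's implementation; the function name and the three sample rows are hypothetical, with coordinates chosen to resemble the extremes in the summary.seqs output above.

```python
# Illustrative sketch only: mimics the start/end rule of screen.seqs.
# The function name and sample rows are hypothetical.

def passes_screen(start, end, required_start=1, required_end=18928):
    """Keep a sequence only if it covers the whole required region."""
    return start <= required_start and end >= required_end

# (name, start, end) rows, as they might be read from a summary file.
rows = [
    ("seqA", 1, 18928),      # covers the full region: kept
    ("seqB", 18899, 18931),  # starts far too late: removed
    ("seqC", 1, 25),         # ends far too early: removed
]

kept = [name for name, start, end in rows if passes_screen(start, end)]
print(kept)
```

Only the sequence spanning the full region survives; the stragglers that aligned to just one edge of the alignment are dropped.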
Also, it looks like you might be using 2x300 chemistry and are sequencing a longer region. You might want to check out this blog post from a few years back: "Why do I have such a large distance matrix"
Pat
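For anyone reading along and wondering why the filtered alignment collapsed to length 0: with trump=., filter.seqs drops every column in which at least one sequence has a "." (no data at that position), and vertical=T drops columns that contain only gap characters. If a few poorly aligned sequences cover only one edge of the alignment, every column ends up containing a "." somewhere. The Python sketch below is a simplified illustration of that column logic under those assumptions, not mothur's code; the function name and toy sequences are made up.

```python
# Simplified illustration of column filtering with trump='.' and vertical=T.
# Not mothur's implementation; names and toy data are hypothetical.

def filter_alignment(seqs, trump=".", vertical=True):
    """Return the aligned sequences with the filtered columns removed."""
    keep = []
    for col in range(len(seqs[0])):
        column = [s[col] for s in seqs]
        if trump is not None and trump in column:
            continue  # trump: one '.' in the column removes it for everyone
        if vertical and all(c in "-." for c in column):
            continue  # vertical: column holds nothing but gap characters
        keep.append(col)
    return ["".join(s[c] for c in keep) for s in seqs]

well_aligned = ["ACGT-ACG", "ACGTTACG"]
# Two stragglers that each cover only one end of the alignment.
with_stragglers = well_aligned + ["ACGT....", "....TACG"]

print(len(filter_alignment(well_aligned)[0]))
print(len(filter_alignment(with_stragglers)[0]))
```

The two stragglers between them put a "." in every column, so nothing is left. Removing them first with the start and end options of screen.seqs avoids this.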
Hi Pat,
Thank you so much.
The information in the link is relevant.
Including the start and end positions in the second screen.seqs step allowed me to continue with the filter.seqs, unique.seqs and pre.cluster commands.
However, at the chimera.vsearch step, the program closed suddenly.
So I am stuck.
This seems to point to the difficulty of detecting chimeras when using 2x300 chemistry, as you mentioned in the link.
Hi -
The chimera checking step shouldn’t crash out. Can you get the latest version of mothur and try again? It should be at "Release Version 1.45.2 · mothur/mothur · GitHub". Let us know if it still crashes.
pat
Hi
Version 1.45.2 is working great on my dataset!
The chimera check ran without problems.
I am now clustering the sequences into OTUs. I chose the traditional approach, and it took about 24 h to calculate the distances for 233000 sequences. The clustering step has been running for 3 days… (on Windows with 16 GB RAM). It is taking long, but so far it looks fine.
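As a rough back-of-the-envelope check on why this step is slow, the number of pairwise comparisons among n unique sequences is n(n-1)/2. The short Python sketch below only does that arithmetic; it says nothing about how many distances are actually written to disk, which depends on the cutoff used.

```python
# Back-of-the-envelope arithmetic only: number of sequence pairs to compare.

def pair_count(n):
    """Number of distinct unordered pairs among n sequences."""
    return n * (n - 1) // 2

pairs = pair_count(233000)
print(f"{pairs:,} pairwise comparisons")
```

For 233000 sequences that is roughly 27 billion comparisons.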
Maybe at some point, I will compare it with the heuristic approach.
Thank you for your help!
Graciela