Length of filtered alignment: 0

Hi,

I am using mothur v.1.45.1.
I trimmed the E. coli sequence down to the region amplified by the primers 341F and 805R. Aligning the trimmed sequence to silva.bacteria.fasta gave a start position of 6388 and an end position of 25316.
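For anyone reproducing this step: as far as I understand it, the start and end coordinates are just the first and last alignment columns that hold a base in the aligned fragment, which is what summary.seqs reports as Start and End. A minimal Python sketch of that idea follows; `alignment_span` is a hypothetical helper name and the toy string is made up, not taken from the SILVA alignment.

```python
GAP_CHARS = ".-"  # '.' pads the ends of an aligned sequence, '-' marks internal gaps


def alignment_span(aligned):
    """Return (start, end) as 1-based columns of the first and last base.

    Returns (0, 0) when the aligned string holds no bases at all.
    """
    cols = [i for i, ch in enumerate(aligned, start=1) if ch not in GAP_CHARS]
    if not cols:
        return (0, 0)
    return (cols[0], cols[-1])


# Toy aligned fragment (assumption: invented for illustration only).
toy = "..--AC-GT--.."
print(alignment_span(toy))  # (5, 9)
```

On the real reference the same reading of the aligned E. coli fragment is what yields values such as 6388 for the start.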

I used those values to run pcr.seqs. Then, after running filter.seqs, the output states that the length of the filtered alignment is 0.
I looked at other threads on the same topic, but I cannot work out what is causing this.

Thank you so much for any hint.

Can you post the full commands you are running and their order?

Thanks,
Pat

Thank you so much for your response. I appreciate it.
Please find below an extract of the log file with all the commands I used, up to the point where the length of the filtered alignment came out as zero.

mothur > 
make.file(inputdir=bottledwater, type=fastq, prefix=stability)
Setting input directory to: bottledwater\



mothur > 
make.contigs(file=stability.files, processors=8)



mothur > 
summary.seqs(fasta=stability.trim.contigs.fasta)

Using 8 processors.

		Start	End	NBases	Ambigs	Polymer	NumSeqs
Minimum:	1	298	298	0	3	1
2.5%-tile:	1	439	439	0	4	20288
25%-tile:	1	441	441	0	5	202872
Median: 	1	464	464	0	5	405744
75%-tile:	1	465	465	1	6	608615
97.5%-tile:	1	466	466	10	6	791199
Maximum:	1	602	602	282	289	811486
Mean:	1	455	455	1	5
# of Seqs:	811486



mothur > 
screen.seqs(fasta=stability.trim.contigs.fasta, group=stability.contigs.groups, maxambig=0, maxlength=466)



mothur > 
get.current()




mothur > 
summary.seqs()
Using bottledwater\stability.trim.contigs.good.fasta as input file for the fasta parameter.

Using 8 processors.

		Start	End	NBases	Ambigs	Polymer	NumSeqs
Minimum:	1	301	301	0	3	1
2.5%-tile:	1	439	439	0	4	12559
25%-tile:	1	441	441	0	5	125585
Median: 	1	464	464	0	5	251170
75%-tile:	1	465	465	0	6	376754
97.5%-tile:	1	465	465	0	6	489780
Maximum:	1	466	466	0	27	502338
Mean:	1	454	454	0	5
# of Seqs:	502338




mothur > 
unique.seqs(fasta=stability.trim.contigs.good.fasta)
502338	401551

Output File Names: 
bottledwater\stability.trim.contigs.good.names
bottledwater\stability.trim.contigs.good.unique.fasta


mothur > 
count.seqs(name=stability.trim.contigs.good.names, group=stability.contigs.good.groups)

It took 8 secs to create a table for 502338 sequences.

Total number of sequences: 502338

Output File Names: 
bottledwater\stability.trim.contigs.good.count_table


mothur > 
summary.seqs(count=stability.trim.contigs.good.count_table)
Using bottledwater\stability.trim.contigs.good.unique.fasta as input file for the fasta parameter.

Using 8 processors.

		Start	End	NBases	Ambigs	Polymer	NumSeqs
Minimum:	1	301	301	0	3	1
2.5%-tile:	1	439	439	0	4	12559
25%-tile:	1	441	441	0	5	125585
Median: 	1	464	464	0	5	251170
75%-tile:	1	465	465	0	6	376754
97.5%-tile:	1	465	465	0	6	489780
Maximum:	1	466	466	0	27	502338
Mean:	1	454	454	0	5
# of unique seqs:	401551
total # of seqs:	502338

It took 7 secs to summarize 502338 sequences.

Output File Names:
bottledwater\stability.trim.contigs.good.unique.summary


mothur > 
pcr.seqs(fasta=silva.bacteria.fasta, start=6388, end=25319, keepdots=F, processors=8)

Using 8 processors.
[NOTE]: no sequences were bad, removing bottledwater\silva.bacteria.bad.accnos

It took 12 secs to screen 14956 sequences.

Output File Names: 
bottledwater\silva.bacteria.pcr.fasta



mothur > 
rename.file(input=silva.bacteria.pcr.fasta, new=silva.v4.fasta)



mothur > 
summary.seqs(fasta=silva.v4.fasta)
Unable to open bottledwater\silva.v4.fasta. Trying input directory bottledwater\silva.v4.fasta.
Unable to open bottledwater\silva.v4.fasta. Trying default C:\Users\gonzalg\Desktop\NEW_mothur\Mothur.win\mothur\silva.v4.fasta.

Using 8 processors.

		Start	End	NBases	Ambigs	Polymer	NumSeqs
Minimum:	1	18930	421	0	3	1
2.5%-tile:	1	18931	441	0	4	374
25%-tile:	1	18931	444	0	4	3740
Median: 	1	18931	463	0	5	7479
75%-tile:	1	18931	466	0	5	11218
97.5%-tile:	1	18931	467	1	6	14583
Maximum:	3	18931	508	5	9	14956
Mean:	1	18930	456	0	4
# of Seqs:	14956

It took 4 secs to summarize 14956 sequences.

Output File Names:
C:\Users\gonzalg\Desktop\NEW_mothur\Mothur.win\mothur\silva.v4.summary


mothur > 
align.seqs(fasta=stability.trim.contigs.good.unique.fasta, reference=silva.v4.fasta)
Unable to open bottledwater\silva.v4.fasta. Trying input directory bottledwater\silva.v4.fasta.
Unable to open bottledwater\silva.v4.fasta. Trying default C:\Users\gonzalg\Desktop\NEW_mothur\Mothur.win\mothur\silva.v4.fasta.

Using 8 processors.

Reading in the C:\Users\gonzalg\Desktop\NEW_mothur\Mothur.win\mothur\silva.v4.fasta template sequences...	DONE.
It took 6 to read  14956 sequences.

Aligning sequences from bottledwater\stability.trim.contigs.good.unique.fasta ...
It took 614 secs to align 401551 sequences.

[WARNING]: 401537 of your sequences generated alignments that eliminated too many bases, a list is provided in bottledwater\stability.trim.contigs.good.unique.flip.accnos.
[NOTE]: 401475 of your sequences were reversed to produce a better alignment.

It took 616 seconds to align 401551 sequences.

Output File Names: 
bottledwater\stability.trim.contigs.good.unique.align
bottledwater\stability.trim.contigs.good.unique.align.report
bottledwater\stability.trim.contigs.good.unique.flip.accnos


mothur > 
summary.seqs(fasta=stability.trim.contigs.good.unique.align, count=stability.trim.contigs.good.count_table)

Using 8 processors.

		Start	End	NBases	Ambigs	Polymer	NumSeqs
Minimum:	0	0	0	0	1	1
2.5%-tile:	1	18928	438	0	4	12559
25%-tile:	1	18928	440	0	5	125585
Median: 	1	18928	463	0	5	251170
75%-tile:	1	18928	464	0	6	376754
97.5%-tile:	1	18928	464	0	6	489780
Maximum:	18931	18931	466	0	27	502338
Mean:	42	18926	453	0	5
# of unique seqs:	401551
total # of seqs:	502338

It took 148 secs to summarize 502338 sequences.

Output File Names:
bottledwater\stability.trim.contigs.good.unique.summary


mothur > 
screen.seqs(fasta=stability.trim.contigs.good.unique.align, count=stability.trim.contigs.good.count_table, summary=stability.trim.contigs.good.unique.summary,  maxhomop=8)

Using 8 processors.

It took 80 secs to screen 401551 sequences, removed 181.

/******************************************/
Running command: remove.seqs(accnos=bottledwater\stability.trim.contigs.good.unique.bad.accnos.temp, count=bottledwater\stability.trim.contigs.good.count_table)
Removed 181 sequences from your count file.

Output File Names:
bottledwater\stability.trim.contigs.good.pick.count_table

/******************************************/

Output File Names:
bottledwater\stability.trim.contigs.good.unique.good.summary
bottledwater\stability.trim.contigs.good.unique.good.align
bottledwater\stability.trim.contigs.good.unique.bad.accnos
bottledwater\stability.trim.contigs.good.good.count_table


It took 169 secs to screen 401551 sequences.

mothur > 
summary.seqs(fasta=current, count=current)
Using bottledwater\stability.trim.contigs.good.good.count_table as input file for the count parameter.
Using bottledwater\stability.trim.contigs.good.unique.good.align as input file for the fasta parameter.

Using 8 processors.

		Start	End	NBases	Ambigs	Polymer	NumSeqs
Minimum:	1	25	10	0	1	1
2.5%-tile:	1	18928	438	0	4	12554
25%-tile:	1	18928	440	0	5	125540
Median: 	1	18928	463	0	5	251079
75%-tile:	1	18928	464	0	6	376618
97.5%-tile:	1	18928	464	0	6	489604
Maximum:	18899	18931	466	0	8	502157
Mean:	40	18927	453	0	5
# of unique seqs:	401370
total # of seqs:	502157

It took 140 secs to summarize 502157 sequences.

Output File Names:
bottledwater\stability.trim.contigs.good.unique.good.summary


mothur > 
filter.seqs(fasta=stability.trim.contigs.good.unique.good.align, vertical=T, trump=.)

Using 8 processors.
Creating Filter...
It took 77 secs to create filter for 401370 sequences.


Running Filter...
It took 71 secs to filter 401370 sequences.



Length of filtered alignment: 0
Number of columns removed: 18931
Length of the original alignment: 18931
Number of sequences used to construct filter: 401370

Output File Names: 
bottledwater\stability.filter
bottledwater\stability.trim.contigs.good.unique.good.filter.fasta

Hi there,

In your second screen.seqs step…

screen.seqs(fasta=stability.trim.contigs.good.unique.align, count=stability.trim.contigs.good.count_table, summary=stability.trim.contigs.good.unique.summary,  maxhomop=8)

You need to include start and end coordinate positions. You might try something like this…

screen.seqs(fasta=stability.trim.contigs.good.unique.align, count=stability.trim.contigs.good.count_table, summary=stability.trim.contigs.good.unique.summary,  maxhomop=8, start=1, end=18928)
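To see why the missing coordinates matter, here is a rough Python sketch of the logic as I understand it, not mothur's actual code. With trump=. any column that contains a '.' in even one sequence is dropped, and with vertical=T columns that are all gaps are dropped too. A few stray sequences that align only to the far left or far right (like the ones ending at 25 or starting at 18899 in the summary above) are enough to put a '.' in every column. The function names and the toy alignment are invented for illustration.

```python
GAP_CHARS = ".-"


def span(aligned):
    """1-based first and last base column, or (0, 0) if there are no bases."""
    cols = [i for i, ch in enumerate(aligned, start=1) if ch not in GAP_CHARS]
    return (cols[0], cols[-1]) if cols else (0, 0)


def screen(seqs, start, end):
    """Keep sequences that start at or before `start` and end at or after `end`."""
    kept = {}
    for name, aligned in seqs.items():
        first, last = span(aligned)
        if first != 0 and first <= start and last >= end:
            kept[name] = aligned
    return kept


def filter_columns(seqs, trump=".", vertical=True):
    """Return the 0-based indices of the columns that survive the filter."""
    rows = list(seqs.values())
    if not rows:
        return []
    keep = []
    for c in range(len(rows[0])):
        column = [row[c] for row in rows]
        if trump in column:
            continue  # trump: one '.' anywhere removes the whole column
        if vertical and all(ch in GAP_CHARS for ch in column):
            continue  # vertical: drop columns that hold only gaps
        keep.append(c)
    return keep


# Toy alignment, 10 columns wide (assumption: made-up data).
toy = {
    "good1": "ACGT-ACGTA",
    "good2": "ACGT-ACCTA",
    "stray_left": "ACG.......",   # aligns only at the left edge
    "stray_right": ".......GTA",  # aligns only at the right edge
}

print(len(filter_columns(toy)))                 # 0 columns survive
print(len(filter_columns(screen(toy, 1, 10))))  # 9 columns survive
```

Once the two stray sequences are screened out, only the single all-gap column is removed, which mirrors what adding start and end to screen.seqs is meant to achieve.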

Also, it looks like you might be using 2x300 chemistry and are sequencing a longer region. You might want to check out this blog post from a few years back: "Why do I have such a large distance matrix".

Pat

Hi Pat,

Thank you so much.
The information in the link is relevant.

Including the start and end positions in the second screen.seqs step allowed me to continue with the filter.seqs, unique.seqs, and pre.cluster commands.
However, at the chimera.vsearch step, the program closed suddenly, so I am stuck.
This seems to point to the difficulty of detecting chimeras with 2x300 chemistry, as you mentioned in the link.

Hi -

The chimera checking step shouldn’t crash out. Can you get the latest version of mothur and try again? It should be on the "Release Version 1.45.2" page of the mothur/mothur repository on GitHub. Let us know if it still crashes.

pat

Hi
Version 1.45.2 is working great on my dataset!
The chimera check ran without problems.
I am now clustering the sequences into OTUs. I chose the traditional approach, and it took about 24 h to calculate the distances for 233000 sequences. The clustering step has been running for 3 days (on Windows with 16 GB RAM). It is taking long, but so far it looks fine.
Maybe at some point, I will compare it with the heuristic approach.
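For a sense of scale, a quick back-of-the-envelope count of the pairwise comparisons behind a distance calculation on 233000 sequences. This is only the number of sequence pairs, not the size of the output, since distances above the cutoff are not kept. `pairwise_count` is a hypothetical helper name.

```python
def pairwise_count(n):
    """Number of unordered pairs among n sequences: n * (n - 1) / 2."""
    return n * (n - 1) // 2


print(pairwise_count(233000))  # 27144383500, i.e. roughly 27 billion pairs
```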

Thank you for your help !
Graciela