Filtered alignment, 0

micah_rain · April 3, 2020, 11:29pm

Hello,

So I posted about this previously, but I did not respond in time and the thread closed.

I am running the current version of mothur on 16S V4 sequence data. When I executed filter.seqs, I got the following:

Length of filtered alignment: 0
Number of columns removed: 13425
Length of the original alignment: 13425
Number of sequences used to construct filter: 3058818

Output File Names: 
/work/LAS/eswanner-lab/Micah/epa19_16s/epa19.filter
/work/LAS/eswanner-lab/Micah/epa19_16s/epa19.trim.contigs.good.unique.filter.fasta

In the previous discussion, Pat asked to see the summary.seqs command, so I have provided that here:

summary.seqs(fasta=epa19.trim.contigs.good.unique.align) 
[WARNING]: This command can take a namefile and you did not provide one. The current namefile is /work/LAS/eswanner-lab/Micah/epa19_16s/epa19.trim.contigs.good.names which seems to match /work/LAS/eswanner-lab/Micah/epa19_16s/epa19.trim.contigs.good.unique.align.

                Start   End     NBases  Ambigs  Polymer NumSeqs
Minimum:        0       0       0       0       1       1
2.5%-tile:      1968    11550   252     0       3       76471
25%-tile:       1968    11550   253     0       4       764705
Median:         1968    11550   253     0       4       1529410
75%-tile:       1968    11550   253     0       5       2294114
97.5%-tile:     1968    11550   254     0       6       2982348
Maximum:        13425   13425   275     0       114     3058818
Mean:           2013    11539   251     0       4
# of Seqs:      3058818

It took 159 secs to summarize 3058818 sequences.

Output File Names:
/work/LAS/eswanner-lab/Micah/epa19_16s/epa19.trim.contigs.good.unique.summary

micah_rain · April 3, 2020, 11:38pm

I just reread this and realized I ran summary.seqs without the count table parameter. I will rerun it and post the output.

Okay, here are the summary.seqs and filter.seqs outputs that I got:

summary.seqs(fasta=epa19.trim.contigs.good.unique.align, count=epa19.trim.contigs.good.count_table)

                Start   End     NBases  Ambigs  Polymer NumSeqs
Minimum:        0       0       0       0       1       1
2.5%-tile:      1968    11550   253     0       4       305688
25%-tile:       1968    11550   253     0       4       3056876
Median:         1968    11550   253     0       4       6113751
75%-tile:       1968    11550   253     0       5       9170626
97.5%-tile:     1968    11550   254     0       6       11921813
Maximum:        13425   13425   275     0       114     12227500
Mean:           1991    11545   252     0       4
# of unique seqs:       3058818
total # of seqs:        12227500

It took 213 secs to summarize 12227500 sequences.

filter.seqs(fasta=epa19.trim.contigs.good.unique.align, vertical=T, trump=.)

Output File Names:
epa19.trim.contigs.good.unique.summary

Length of filtered alignment: 0
Number of columns removed: 13425
Length of the original alignment: 13425
Number of sequences used to construct filter: 3058818

Output File Names: 
epa19.filter
epa19.trim.contigs.good.unique.filter.fasta

Edit: I read through another post about this and, based on the suggestions there, added minlength=50 to the screen.seqs command. I reran it and the subsequent commands, but I still get the same summary.seqs and filter.seqs output shown above.

screen.seqs(fasta=epa19.trim.contigs.fasta, group=epa19.contigs.groups, maxambig=0, minlength=50, maxlength=275)


It took 52 secs to screen 13881595 sequences, removed 1654095.

/******************************************/
Running command: remove.seqs(accnos=epa19.trim.contigs.bad.accnos.temp, group=epa19.contigs.groups)
Removed 1654095 sequences from your group file.

Output File Names: 
epa19.contigs.pick.groups

/******************************************/

Output File Names:
epa19.trim.contigs.good.fasta
epa19.trim.contigs.bad.accnos
epa19.contigs.good.groups


It took 87 secs to screen 13881595 sequences.

pschloss · April 6, 2020, 6:47pm

The problem is that some of your sequences did not align to the correct region, so when you use trump=. in filter.seqs you lose all of the columns. This is why the SOP suggests running both screen.seqs with start and end positions and filter.seqs with vertical=T and trump=. together.
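
For anyone following along, here is a toy Python sketch of why a couple of misaligned sequences can empty the whole alignment. It is not mothur's code. It assumes that a trump character removes any column in which at least one sequence has that character, that vertical removes columns made up only of gap characters, and that positions outside a sequence's aligned region are written as '.'. The function names and the 10-column sequences are made up for illustration.

```python
def build_filter(alignment, trump=None, vertical=False):
    """Return one boolean per column: True means the column is kept."""
    keep = []
    for column in zip(*alignment):
        drop = False
        # Assumed trump behaviour: one occurrence in any sequence drops the column.
        if trump is not None and trump in column:
            drop = True
        # Assumed vertical behaviour: drop columns that hold only gap characters.
        if vertical and all(ch in "-." for ch in column):
            drop = True
        keep.append(not drop)
    return keep


def apply_filter(alignment, keep):
    """Remove the dropped columns from every sequence."""
    return ["".join(ch for ch, k in zip(seq, keep) if k) for seq in alignment]


# Three sequences that cover the same region (columns 3 to 8).
well_aligned = ["..ACG-TA..", "..AC-GTA..", "..ACGGTA.."]
# Two hypothetical sequences that landed at the far ends of the alignment.
misaligned = ["ACG.......", ".......ACG"]

keep_good = build_filter(well_aligned, trump=".", vertical=True)
keep_all = build_filter(well_aligned + misaligned, trump=".", vertical=True)

print(sum(keep_good))                         # columns kept without the misaligned reads
print(sum(keep_all))                          # columns kept with them included
print(apply_filter(well_aligned, keep_good))  # the filtered toy alignment
```

With only the well-aligned sequences, six columns survive. Adding the two misaligned ones puts a '.' in every column, so nothing is left, which is the "Length of filtered alignment: 0" situation.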

You should do the following…

mothur > screen.seqs(fasta=epa19.trim.contigs.good.unique.align, count=epa19.trim.contigs.good.count_table, start=1968, end=11550, maxhomop=8)
mothur > filter.seqs(fasta=current, vertical=T, trump=.)
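
As a rough picture of what the start, end and maxhomop criteria do, here is a small Python sketch. It is only an illustration under a few assumptions: a sequence passes if it starts at or before start, ends at or after end, and its longest homopolymer (counted on the bases with gaps removed) is no longer than maxhomop. The helper names and toy sequences are hypothetical, and the thresholds are scaled down to fit a 10-column alignment.

```python
from itertools import groupby

GAPS = "-."


def span(aligned):
    """1-based (start, end) of the first and last base, or (0, 0) if there are none."""
    positions = [i for i, ch in enumerate(aligned, start=1) if ch not in GAPS]
    return (positions[0], positions[-1]) if positions else (0, 0)


def longest_homopolymer(aligned):
    """Length of the longest run of one base, ignoring gap characters."""
    bases = [ch for ch in aligned if ch not in GAPS]
    return max((len(list(run)) for _, run in groupby(bases)), default=0)


def passes_screen(aligned, start, end, maxhomop):
    """Assumed screening rule: cover [start, end] and stay under maxhomop."""
    s, e = span(aligned)
    return s <= start and e >= end and longest_homopolymer(aligned) <= maxhomop


seqs = [
    "..ACG-TA..",  # covers columns 3 to 8
    "ACG.......",  # ends too early
    ".......ACG",  # starts too late
    "..........",  # no bases at all, reported as start 0 and end 0
    "..AAAAAA..",  # right region, but a homopolymer of length 6
]

kept = [s for s in seqs if passes_screen(s, start=3, end=8, maxhomop=4)]
print(kept)
```

Only the first sequence is kept. Every sequence that survives covers the full target region, so running the trump filter on the screened file no longer wipes out every column.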

Pat

micah_rain · April 6, 2020, 10:30pm

Hi Pat,

Okay, I had (embarrassingly) overlooked the second screen.seqs command.

Thank you.

micah_rain · April 7, 2020, 1:36am

Hi Pat,

So I tried running this with the screen.seqs and filter.seqs, but I still got the following:

Length of filtered alignment: 0
Number of columns removed: 13425
Length of the original alignment: 13425
Number of sequences used to construct filter: 3058818

Output File Names:
epa19.filter
epa19.trim.contigs.good.unique.filter.fasta

Thanks, Micah.

pschloss · April 9, 2020, 3:07pm

Can you post the output from running…

summary.seqs(fasta=epa19.trim.contigs.good.unique.align, count=epa19.trim.contigs.good.count_table)
summary.seqs(fasta=epa19.trim.contigs.good.unique.good.align, count=epa19.trim.contigs.good.good.count_table)

Make sure you ran screen.seqs as I had it in the earlier post.

micah_rain · April 13, 2020, 3:39am

Hi Pat,

Here are the outputs you requested after re-running screen.seqs and the subsequent commands. Thanks, Micah.

summary.seqs(fasta=epa19.trim.contigs.good.unique.align, count=epa19.trim.contigs.good.count_table)


		Start	End	NBases	Ambigs	Polymer	NumSeqs
Minimum:	0	0	0	0	1	1
2.5%-tile:	1968	11550	253	0	4	304760
25%-tile:	1968	11550	253	0	4	3047594
Median: 	1968	11550	253	0	4	6095187
75%-tile:	1968	11550	253	0	5	9142780
97.5%-tile:	1968	11550	254	0	6	11885614
Maximum:	13425	13425	260	0	103	12190373
Mean:	1972	11548	252	0	4
# of unique seqs:	3038729
total # of seqs:	12190373

It took 855 secs to summarize 12190373 sequences.

Output File Names:
epa19.trim.contigs.good.unique.summary

summary.seqs(fasta=epa19.trim.contigs.good.unique.good.align, count=epa19.trim.contigs.good.good.count_table)

		Start	End	NBases	Ambigs	Polymer	NumSeqs
Minimum:	1	11550	245	0	3	1
2.5%-tile:	1968	11550	253	0	4	303473
25%-tile:	1968	11550	253	0	4	3034726
Median: 	1968	11550	253	0	4	6069452
75%-tile:	1968	11550	253	0	5	9104178
97.5%-tile:	1968	11550	254	0	6	11835431
Maximum:	1968	13393	260	0	8	12138903
Mean:	1967	11550	253	0	4
# of unique seqs:	3014261
total # of seqs:	12138903

It took 788 secs to summarize 12138903 sequences.

Output File Names:
epa19.trim.contigs.good.unique.good.summary

pschloss · April 16, 2020, 4:27pm

The problem was that you used filter.seqs on the file before you ran screen.seqs. You want to run this instead…

filter.seqs(fasta=epa19.trim.contigs.good.unique.good.align, vertical=T, trump=.)
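
The two summary.seqs tables posted in this thread already hint at this. As a back-of-the-envelope check in Python, assuming that trump=. can only keep columns lying inside every sequence's start-to-end range, the largest Start and smallest End in each table bound how many columns can survive. The function name is made up, and the result is an upper bound only, since vertical=T can remove more columns afterwards.

```python
def trump_survivors(max_start, min_end):
    """Upper bound on the columns trump=. can keep, given the summary extremes."""
    if min_end < max_start:
        return 0  # no column is covered by every sequence
    return min_end - max_start + 1


# Extremes read from the Minimum/Maximum rows of the two summaries above.
# Unscreened .good.unique.align: largest Start 13425, smallest End 0.
unscreened = trump_survivors(max_start=13425, min_end=0)
# Screened .good.unique.good.align: largest Start 1968, smallest End 11550.
screened = trump_survivors(max_start=1968, min_end=11550)

print(unscreened)
print(screened)
```

For the unscreened file the bound is 0, which matches the filter.seqs output reported earlier. For the screened file it is 9583 columns before vertical=T is applied.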

micah_rain · April 17, 2020, 7:08am

Hi Pat!

Is there a way that I could send my log file? I think I have had trouble keeping things straight while posting my information. From what I can tell, I have done things in the correct order (but maybe I am missing something or got things switched around).

Regards,

Micah

pschloss · April 20, 2020, 5:21pm

Sure - or you could post it here

system · April 30, 2020, 5:21pm

This topic was automatically closed 10 days after the last reply. New replies are no longer allowed.
