Problems with screen.seqs and filter.seqs commands

Hello everyone.

I’m very new to working with mothur and 454 sequences, and I have been running into some problems.

I have my pyrosequencing results in .fna format, so I started the tutorial at the “using trim.seqs” step. I am working with fungal ITS sequences. Everything went well until the alignment step. There, I aligned my sequences against the UNITE database, and when I looked at the results with the summary.seqs command I realized that most of my sequences do not overlap in the same region, so I tried running screen.seqs with the option minlength=400.

After that, I tried to run filter.seqs and got the message "Sequences are not all the same length, please correct."

I do not know how to fix this, because my sequences are quite long but badly aligned, so I cannot run the later steps to get a single .fasta file that lets me identify and remove chimeras and contaminants and start my analysis.

Thanks a lot.

Miguel Ángel

Are you sure that you’re using the aligned sequences as input to screen.seqs?

To align my sequences I am using the UNITE database, with the following command:

mothur > align.seqs(fasta=hongos_pia0.trim.unique.fasta, reference=unite.fungal.fasta, processors=1)

After doing that I get these three files:

hongos_pia0.trim.unique.align
hongos_pia0.trim.unique.align.report
hongos_pia0.trim.unique.flip.accnos


With the first, I get this:

mothur > summary.seqs(fasta=hongos_pia0.trim.unique.align, name=hongos_pia0.trim.names)


             Start    End      NBases   Ambigs  Polymer  NumSeqs
Minimum:     1        3        3        0       1        1
2.5%-tile:   1        10       9        0       2        35
25%-tile:    1        444      316      0       4        348
Median:      1        462      447      0       5        695
75%-tile:    810      889      470      0       6        1042
97.5%-tile:  1361     1694     498      0       6        1355
Maximum:     1666     2197     534      0       8        1389
Mean:        312.05   695.058  375.318  0       4.99136
# of unique seqs: 1276
total # of seqs: 1389

Then, I do this:

mothur > screen.seqs(fasta=hongos_pia0.trim.unique.align, name=hongos_pia0.trim.names, minlength=400)

And get these:

Output File Names:
hongos_pia0.trim.unique.good.align
hongos_pia0.trim.unique.bad.accnos
hongos_pia0.trim.good.names

And finally, when I do:
mothur > filter.seqs(fasta=hongos_pia0.trim.unique.good.align, vertical=T, trump=., processors=1)

I get the error message:

Creating Filter…
Sequences are not all the same length, please correct.


Thanks for your time

I was wondering whether it would be possible to have screen.seqs optimize both start and end with criteria=95, instead of using minlength as the criterion for keeping sequences.

That way I would get a start and an end position shared by 95% of the sequences, regardless of length (for now).

Thanks again

I’m not familiar with the unite database - is it actually aligned?

mmmmm…

I’m not sure about that, I’ll check it.

Can you recommend another database for fungal ITS sequences?

Thanks again.

I am getting the "Sequences are not all the same length" error when trying to filter the alignment after screening sequences. I aligned my sequences with the silva.bacteria.fasta reference set, and I haven’t had any problems running this analysis before. Do you have any suggestions on how to correct this error?

I don’t think UNITE is aligned.
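
One quick way to check this yourself: in an aligned FASTA file every record, gaps included, should have the same length, which is what filter.seqs is complaining about. Below is a minimal Python sketch of that check, assuming a plain FASTA file where gaps are written as "-" or "."; the function names fasta_lengths and is_aligned are made up for illustration and are not part of mothur.

```python
from collections import Counter


def fasta_lengths(lines):
    """Yield (name, length) for each record in an iterable of FASTA lines."""
    name, length = None, 0
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if name is not None:
                yield name, length
            parts = line[1:].split()
            name, length = (parts[0] if parts else ""), 0
        elif name is not None:
            # Sequence lines (gap characters included) count toward the length.
            length += len(line)
    if name is not None:
        yield name, length


def is_aligned(lines):
    """Return (True/False, Counter of record lengths).

    The file is treated as aligned only if all records share one length.
    """
    counts = Counter(length for _, length in fasta_lengths(lines))
    return len(counts) == 1, counts


# Small in-memory demo; hypothetical records, not real UNITE data.
demo = [">seq1", "AC-GT", ">seq2", "A--GT", ">seq3", "ACG"]
ok, counts = is_aligned(demo)
print("aligned" if ok else "not aligned", dict(counts))
```

To run it on a real file, pass an open file handle instead of the demo list, for example `is_aligned(open("unite.fungal.fasta"))`. If the reference itself reports more than one length, it is not an alignment, and the output of align.seqs against it would not be expected to pass filter.seqs either.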