Screen.seqs removed all the sequences

Hira · October 30, 2022, 8:20am

After making contigs, here is the summary of my data.

And the summary of contigs report:

mothur > screen.seqs(fasta=stability.trim.contigs.fasta, count=stability.contigs.count_table, maxambig=0, maxlength=275, maxhomop=8)

It took 28 secs to screen 1896836 sequences, removed 1896836.

/******************************************/
Running command: remove.seqs(accnos=/users/hiraabid/desktop/mothur/Paddy_Fish_NGS_RawData/stability.trim.contigs.bad.accnos.temp, count=/users/hiraabid/desktop/mothur/Paddy_Fish_NGS_RawData/stability.contigs.count_table)
Removed 1896836 sequences from /users/hiraabid/desktop/mothur/Paddy_Fish_NGS_RawData/stability.contigs.count_table.
[WARNING]: /users/hiraabid/desktop/mothur/Paddy_Fish_NGS_RawData/stability.contigs.count_table contains only sequences from the .accnos file.

I need help understanding the screen.seqs command. I want to know how to set the start, end, maxlength, minlength, maxambig, and maxhomop parameters for this data.
There are similar questions on the forum already, and I have read them, but I still do not understand the concept. I hope I can get some help with this.

Thanks
Hira

pschloss · November 1, 2022, 3:10pm

Hi - It looks like you used the 2x300 chemistry and a region where your reads do not fully overlap. If I were you, I’d use…

screen.seqs(fasta=stability.trim.contigs.fasta, count=stability.contigs.count_table, maxambig=0, maxhomop=8, maxlength=500)

The maxlength=500 really depends on the region and what the length range looks like for good sequences, typically obtained from a set of database sequences. The start and end positions only make sense once your sequences have been aligned.

You can get reference sequences for your region here:

You need to read this given that you don’t have fully overlapping sequences:
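For anyone following along, the per-sequence checks behind maxambig, maxhomop, and maxlength can be sketched in a few lines of Python. This is only an illustration of the logic, not mothur's implementation: the function names are made up, the 460 nt contig is a made-up example rather than a read from this dataset, and counting only N as ambiguous is an assumption.

```python
import re


def longest_homopolymer(seq: str) -> int:
    """Length of the longest run of a single repeated base."""
    longest = 0
    for match in re.finditer(r"(.)\1*", seq):
        longest = max(longest, len(match.group(0)))
    return longest


def passes_screen(seq, maxambig=None, maxhomop=None, minlength=None, maxlength=None):
    """Return True if seq survives every criterion that was supplied.

    Hypothetical helper: it mimics the idea of screen.seqs on unaligned
    contigs. Only N is treated as ambiguous (an assumption).
    """
    seq = seq.upper()
    if maxambig is not None and seq.count("N") > maxambig:
        return False
    if maxhomop is not None and longest_homopolymer(seq) > maxhomop:
        return False
    if minlength is not None and len(seq) < minlength:
        return False
    if maxlength is not None and len(seq) > maxlength:
        return False
    return True


# A made-up 460 nt contig, roughly what partially overlapping 2x300 reads give.
contig = "ACGT" * 115

print(passes_screen(contig, maxambig=0, maxhomop=8, maxlength=275))  # False
print(passes_screen(contig, maxambig=0, maxhomop=8, maxlength=500))  # True
```

If every contig is longer than 275 nt, maxlength=275 rejects all of them, which is consistent with the "removed 1896836" line in the original output.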

Hira · November 2, 2022, 4:22am

Thank you for replying. I customized the reference alignment by following “Customize your reference alignment for your favorite region”, since my sequences are from the 16S rRNA V3-V4 region. I understand all of this clearly now.
I am curious about one thing.
After making contigs, I used maxlength=485 in screen.seqs. How does the length chosen at this step affect the final results, i.e. the OTU table or the taxonomy file? Does it also affect the number of unclassified sequences returned by the classify.seqs command? Or could you tell me which results of the analysis can be affected by it?

Regards
Hira

pschloss · November 2, 2022, 1:25pm

Picking maxlength prevents you from keeping (in your case) contigs that are 600 nt long, which arise when there is only trivial overlap between the reads. The value you pick should reflect how long the region is in high quality sequences selected from a database.

Pat
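The arithmetic behind the 600 nt figure, and one way to turn a set of reference lengths into a cutoff, can be sketched as follows. This is a minimal Python illustration under stated assumptions: both helper functions are hypothetical, the reference lengths are invented placeholders rather than values from any database, and the 2.5% / 97.5% window is just one reasonable convention.

```python
def contig_length(forward_len: int, reverse_len: int, overlap: int) -> int:
    """Length of a contig built from two reads that share `overlap` bases."""
    return forward_len + reverse_len - overlap


def length_window(reference_lengths, lower=0.025, upper=0.975):
    """Approximate lower/upper percentile lengths of the reference set.

    Hypothetical helper using a simple rounded-rank percentile.
    """
    ordered = sorted(reference_lengths)
    last = len(ordered) - 1
    return ordered[round(lower * last)], ordered[round(upper * last)]


# 2x300 reads with no real overlap give a contig about as long as both reads.
print(contig_length(300, 300, 0))    # 600
# The same reads overlapping by 140 bases give a much shorter contig.
print(contig_length(300, 300, 140))  # 460

# Placeholder lengths (invented for illustration only) standing in for the
# region lengths of good reference sequences.
reference_lengths = [440, 441, 445, 460, 460, 461, 464, 465, 465, 466]
shortest_ok, longest_ok = length_window(reference_lengths)

# A contig near 600 nt falls far outside the window, so a maxlength close to
# the upper bound would discard it while keeping plausible contigs.
print(contig_length(300, 300, 0) > longest_ok)  # True
```

With real data, the lengths would come from reference sequences trimmed to the amplified region, and maxlength would be set a little above the upper end of that range.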

system · November 12, 2022, 1:26pm

This topic was automatically closed 10 days after the last reply. New replies are no longer allowed.
