Increase in sample-specific OTU abundance after remove.seqs?

danieln · April 20, 2014, 8:14pm

Hi,

I selected a number of unique OTUs with the get.otulabels command using the list option, and then obtained their corresponding sequences with list.seqs.

Then I removed these sequences from the 97% OTU list file and the group file. When I reconstructed my shared file from these two input files to generate a 97% OTU table, I observed a consistent decrease in total OTU abundance across most OTU labels, which is presumably correct. BUT within individual OTU labels, the sample-specific abundance sometimes increased and sometimes decreased, which puzzles me.

Might this be a bug?

Thank you.

Daniel

pschloss · April 24, 2014, 10:25am

Sorry, I’m not following what you’re doing (or why…). Could you post the commands you’re running along with an example of what you’re getting?

Thanks,
Pat

danieln · May 13, 2014, 6:06am

Hi again Pat,

My apologies for the late response.

We’re interested in investigating the prevalence of OTU sequences that can be confidently classified to certain bacterial strains.

Here are the commands I used with a new dataset:

To generate the non-selected (original) shared file…
make.shared(list=final.an.list,group=final.groups,label=0.03)

To generate the re-selected shared file…
get.seqs(accnos=classified_sequences.accnos,list=final.an.list,name=final.names,group=final.groups,label=0.03)
make.shared(list=final.an.0.03.pick.list,group=final.pick.groups)

Because I was only interested in a certain group of OTU labels that are found in both of my shared files (i.e., original vs. selected), I made an accnos file for filtering:
get.otulabels(accnos=common.otulabels,shared=final.an.original.shared)#renamed file name
get.otulabels(accnos=common.otulabels,shared=final.an.selected.shared)#renamed file name

Output:
Section of my original shared table (final.an.original.0.03.pick.shared)…
          Otu0001  Otu0002  Otu0003  Otu0004  Otu0005  Otu0006  Otu0007  Otu0008  Otu0009  Otu0010  Otu0011  Otu0012
Sample A      148        0       14        0        0        0        0        0        0        0        0        0
Sample B      631      343      310        0       33      885       23        0        0        5        0        0
Sample C     1231      339        0        0      375        0      113        0        0        0        0        0
Sample D      980      371       99        0       16       56        5        0        0        2        0        0
Sample E      215      406        0        0        0       42       21        1        0        0        0       27
Sample F       73        0        0    14083        0        0        0        0        0        0      653        0

Section of my selected shared table (final.an.selected.0.03.pick.shared)…
          Otu0001  Otu0002  Otu0003  Otu0004  Otu0005  Otu0006  Otu0007  Otu0008  Otu0009  Otu0010  Otu0011  Otu0012
Sample A       13        0        0        0        0        0        0        0        0        0        0        0
Sample B       12      322        1       33      885        0        0        5        0        0       41        0
Sample C       45      339        0      375        0        0        0        0        0        0        0        0
Sample D       11      352        1       16       56        0        0        2        0        0        4        0
Sample E       53      406        0        0       42        1        0        0        0       27      708        0
Sample F        0        0        0        0        0        0        0        0      653        0       23        0

In Otu0004, mothur appears to over-select sequences, but I thought sequences that were absent from Otu0004 in the original table could never be present in the selected shared table. When I look closely at Otu0004 in the selected table, its abundance values seem to be those of Otu0005 in the original table (i.e., the column immediately to the right). I believe the numbers of selected sequences in my table are correct. But could mothur have overwritten the zero abundances, shifting the downstream values leftward, column by column, when I reconstructed my shared file?
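
The suspected column shift can be checked directly against the two excerpts. The following is a minimal Python sketch, not a mothur feature: it copies the counts from the tables in this post and, for every non-empty column of the selected table, looks for an original column with identical per-sample counts. The helper names `columns` and `match_identical_columns` are made up for this illustration.

```python
# Counts copied from the two table excerpts in this post (Otu0001..Otu0012).
labels = [f"Otu{i:04d}" for i in range(1, 13)]

original = {
    "Sample A": [148, 0, 14, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    "Sample B": [631, 343, 310, 0, 33, 885, 23, 0, 0, 5, 0, 0],
    "Sample C": [1231, 339, 0, 0, 375, 0, 113, 0, 0, 0, 0, 0],
    "Sample D": [980, 371, 99, 0, 16, 56, 5, 0, 0, 2, 0, 0],
    "Sample E": [215, 406, 0, 0, 0, 42, 21, 1, 0, 0, 0, 27],
    "Sample F": [73, 0, 0, 14083, 0, 0, 0, 0, 0, 0, 653, 0],
}
selected = {
    "Sample A": [13, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    "Sample B": [12, 322, 1, 33, 885, 0, 0, 5, 0, 0, 41, 0],
    "Sample C": [45, 339, 0, 375, 0, 0, 0, 0, 0, 0, 0, 0],
    "Sample D": [11, 352, 1, 16, 56, 0, 0, 2, 0, 0, 4, 0],
    "Sample E": [53, 406, 0, 0, 42, 1, 0, 0, 0, 27, 708, 0],
    "Sample F": [0, 0, 0, 0, 0, 0, 0, 0, 653, 0, 23, 0],
}


def columns(table):
    """Return {otu_label: tuple of counts}, with samples in sorted order."""
    samples = sorted(table)
    return {lab: tuple(table[s][i] for s in samples) for i, lab in enumerate(labels)}


def match_identical_columns(orig, sel):
    """For each non-empty selected column, list original columns with the same counts."""
    orig_cols = columns(orig)
    matches = {}
    for sel_label, counts in columns(sel).items():
        if not any(counts):
            continue  # an all-zero column would match any empty original column
        hits = [lab for lab, c in orig_cols.items() if c == counts]
        if hits:
            matches[sel_label] = hits
    return matches


matches = match_identical_columns(original, selected)
for sel_label, hits in matches.items():
    print(sel_label, "in selected table ==", ", ".join(hits), "in original table")
```

On these excerpts the sketch pairs Otu0004 and Otu0005 in the selected table with Otu0005 and Otu0006 in the original, and Otu0006, Otu0008, Otu0009 and Otu0010 with Otu0008, Otu0010, Otu0011 and Otu0012. In other words, the offset grows from one column to two after original Otu0007, which would fit OTUs that lost all of their sequences being dropped and the remaining labels being renumbered, though the excerpts alone cannot confirm that.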

Or could I have done things incorrectly?

Thank you.

Daniel

pschloss · May 14, 2014, 8:14pm

I suspect that we are generating the OTU labels in make.shared and so the labeling isn’t consistent between your datasets.
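
If the labels are indeed assigned when make.shared runs, comparing the two tables by label is unreliable, and OTUs would have to be matched by the sequences they contain instead. Below is a minimal Python sketch of that idea, not part of mothur. It assumes the OTU membership from each list file has already been read into a dictionary of label to sequence names; the sequence names and the helper `match_otus_by_members` are hypothetical.

```python
def match_otus_by_members(original, selected):
    """Map each selected OTU label to the original label(s) holding its sequences.

    Both arguments are dicts of {otu_label: set of sequence names}.
    """
    # Index every sequence name by the original OTU it belongs to.
    seq_to_original = {
        seq: label for label, members in original.items() for seq in members
    }
    mapping = {}
    for sel_label, members in selected.items():
        sources = {seq_to_original[s] for s in members if s in seq_to_original}
        # Assumption: selecting sequences does not re-cluster them, so each
        # selected OTU should trace back to exactly one original OTU.
        mapping[sel_label] = sorted(sources)
    return mapping


# Hypothetical membership, for illustration only.
original_otus = {
    "Otu0003": {"seqA", "seqB"},
    "Otu0004": {"seqC"},           # every member removed by the selection
    "Otu0005": {"seqD", "seqE"},
}
selected_otus = {
    "Otu0003": {"seqA"},
    "Otu0004": {"seqD", "seqE"},   # same sequences as original Otu0005
}

mapping = match_otus_by_members(original_otus, selected_otus)
for sel_label, sources in mapping.items():
    print(sel_label, "<-", sources)
```

In this made-up example the selected Otu0004 traces back to the original Otu0005, so abundances would be compared between those two columns rather than between the two columns that happen to share a label.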
