Phylotypes based analysis

Jatinder · September 4, 2014, 4:06pm

Hi,

I would like to do a phylotype-based analysis of my MiSeq data for the v34 region. As practice, I processed the example data in the MiSeq SOP, binned the sequences into phylotypes, and generated the stability.an.shared file. I noticed that the system did not generate the stability…0.03.cons.taxonomy file. I went ahead with the next command: count.groups(shared=stability.an.shared). It produced this output: Mock contains 4061. Total seqs: 4061. (The stability.an.shared file contains only one group, Mock. Is this correct?) In the SOP, under the count.groups command, it says the smallest sample had 2441 sequences in it. So, the question is: where did we get this number “2441” from? As this number is used in the next command and many commands in the analysis. Is this analysis on the correct path?

Thanks,

Jatinder

pschloss · September 10, 2014, 5:21pm

In the phylotype section of the SOP there are 4 steps…

mothur > remove.groups(count=stability.trim.contigs.good.unique.good.filter.unique.precluster.uchime.pick.pick.count_table, fasta=stability.trim.contigs.good.unique.good.filter.unique.precluster.pick.pick.fasta, taxonomy=stability.trim.contigs.good.unique.good.filter.unique.precluster.pick.pds.wang.pick.taxonomy, groups=Mock)


mothur > phylotype(taxonomy=stability.trim.contigs.good.unique.good.filter.unique.precluster.pick.pds.wang.pick.pick.taxonomy)

mothur > make.shared(list=stability.trim.contigs.good.unique.good.filter.unique.precluster.pick.pds.wang.pick.pick.tx.list, count=stability.trim.contigs.good.unique.good.filter.unique.precluster.uchime.pick.pick.pick.count_table, label=1)

mothur > classify.otu(list=stability.trim.contigs.good.unique.good.filter.unique.precluster.pick.pds.wang.pick.pick.tx.list, count=stability.trim.contigs.good.unique.good.filter.unique.precluster.uchime.pick.pick.pick.count_table, taxonomy=stability.trim.contigs.good.unique.good.filter.unique.precluster.pick.pds.wang.pick.pick.taxonomy, label=1)

The last one should generate the cons.taxonomy file. From the looks of your output, I suspect you missed running the first command

Jatinder · September 10, 2014, 7:20pm

Hello Pat, Thanks. I went through all the commands you suggested and they did generate a cons.taxonomy file, but I was a little confused because the file names are a little different: …tx.1.cons.taxonomy instead of …unique_list.0.03.cons.taxonomy. Please comment on the second part of the question regarding the output of the count.groups command.
Thank you.

pschloss · September 11, 2014, 12:16pm

So, the question is: where did we get this number “2441” from? As this number is used in the next command and many commands in the analysis. Is this analysis on the correct path?

It was the size of the smallest library.
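
To make this concrete: count.groups reports the total number of sequences in each group of the shared file, and the smallest of those totals is the number carried into the later commands. Below is a minimal Python sketch of that arithmetic, not mothur's own code. It assumes the usual tab-delimited shared layout (label, Group, numOtus, then one count column per OTU); the function names and the toy data are made up for illustration.

```python
def group_totals(shared_text, label=None):
    """Return {group: total sequences} from the text of a mothur shared file."""
    totals = {}
    for line in shared_text.splitlines():
        fields = line.split("\t")
        # Skip blank lines and the header row (label, Group, numOtus, ...).
        if len(fields) < 4 or fields[0] == "label":
            continue
        # Optionally restrict to one label (e.g. "1" for phylotypes).
        if label is not None and fields[0] != label:
            continue
        counts = [int(x) for x in fields[3:] if x.strip()]
        totals[fields[1]] = totals.get(fields[1], 0) + sum(counts)
    return totals


def smallest_library(totals):
    """Return (group, size) for the group with the fewest sequences."""
    group = min(totals, key=totals.get)
    return group, totals[group]


# Toy data for illustration only; these are not values from the SOP files.
toy = (
    "label\tGroup\tnumOtus\tOtu1\tOtu2\tOtu3\n"
    "1\tA\t3\t5\t2\t0\n"
    "1\tB\t3\t1\t1\t1\n"
)
print(smallest_library(group_totals(toy, label="1")))
```

If a shared file holds only one group, as with the Mock-only file in the question, the "smallest library" is simply that group's total, which is a hint that the wrong shared file is being counted.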

Jatinder · September 11, 2014, 12:54pm

Thanks, Pat. So, this number comes from the output of the command count.groups(shared=stability.an.shared)? When I ran this command, what I got was: Mock contains 4,061 sequences, and the file stability.an.shared has only Mock sequences. Please clarify.

Thanks

Jatinder

pschloss · September 11, 2014, 7:01pm

See above…

From the looks of your output, I suspect you missed running the first command

Jatinder · September 12, 2014, 1:14pm

Hi Pat, Thanks much. I have copied and pasted below from the log file what was done.

mothur > remove.groups(count=stability.trim.contigs.good.unique.good.filter.unique.precluster.uchime.pick.pick.count_table, fasta=stability.trim.contigs.good.unique.good.filter.unique.precluster.pick.pick.fasta, taxonomy=stability.trim.contigs.good.unique.good.filter.unique.precluster.pick.pds.wang.pick.taxonomy, groups=Mock)

[NOTE]: The count file should contain only unique names, so mothur assumes your fasta, list and taxonomy files also contain only uniques.

Removed 46 sequences from your fasta file.
Removed 4061 sequences from your count file.
Removed 46 sequences from your taxonomy file.

Output File names:
stability.trim.contigs.good.unique.good.filter.unique.precluster.pick.pick.pick.fasta
stability.trim.contigs.good.unique.good.filter.unique.precluster.uchime.pick.pick.pick.count_table
stability.trim.contigs.good.unique.good.filter.unique.precluster.pick.pds.wang.pick.pick.taxonomy

mothur > phylotype(taxonomy=stability.trim.contigs.good.unique.good.filter.unique.precluster.pick.pds.wang.pick.pick.taxonomy)
1 2 3 4 5 6

Output File Names:
stability.trim.contigs.good.unique.good.filter.unique.precluster.pick.pds.wang.pick.pick.tx.sabund
stability.trim.contigs.good.unique.good.filter.unique.precluster.pick.pds.wang.pick.pick.tx.rabund
stability.trim.contigs.good.unique.good.filter.unique.precluster.pick.pds.wang.pick.pick.tx.list

mothur > make.shared(list=stability.trim.contigs.good.unique.good.filter.unique.precluster.pick.pds.wang.pick.pick.tx.list, count=stability.trim.contigs.good.unique.good.filter.unique.precluster.uchime.pick.pick.pick.count_table, label=1)
1

Output File Names:
stability.trim.contigs.good.unique.good.filter.unique.precluster.pick.pds.wang.pick.pick.tx.shared
stability.trim.contigs.good.unique.good.filter.unique.precluster.pick.pds.wang.pick.pick.tx.F3D0.rabund
stability.trim.contigs.good.unique.good.filter.unique.precluster.pick.pds.wang.pick.pick.tx.F3D1.rabund
stability.trim.contigs.good.unique.good.filter.unique.precluster.pick.pds.wang.pick.pick.tx.F3D141.rabund
stability.trim.contigs.good.unique.good.filter.unique.precluster.pick.pds.wang.pick.pick.tx.F3D142.rabund
stability.trim.contigs.good.unique.good.filter.unique.precluster.pick.pds.wang.pick.pick.tx.F3D143.rabund
stability.trim.contigs.good.unique.good.filter.unique.precluster.pick.pds.wang.pick.pick.tx.F3D144.rabund
stability.trim.contigs.good.unique.good.filter.unique.precluster.pick.pds.wang.pick.pick.tx.F3D145.rabund
stability.trim.contigs.good.unique.good.filter.unique.precluster.pick.pds.wang.pick.pick.tx.F3D146.rabund
stability.trim.contigs.good.unique.good.filter.unique.precluster.pick.pds.wang.pick.pick.tx.F3D147.rabund
stability.trim.contigs.good.unique.good.filter.unique.precluster.pick.pds.wang.pick.pick.tx.F3D148.rabund
stability.trim.contigs.good.unique.good.filter.unique.precluster.pick.pds.wang.pick.pick.tx.F3D149.rabund
stability.trim.contigs.good.unique.good.filter.unique.precluster.pick.pds.wang.pick.pick.tx.F3D150.rabund
stability.trim.contigs.good.unique.good.filter.unique.precluster.pick.pds.wang.pick.pick.tx.F3D2.rabund
stability.trim.contigs.good.unique.good.filter.unique.precluster.pick.pds.wang.pick.pick.tx.F3D3.rabund
stability.trim.contigs.good.unique.good.filter.unique.precluster.pick.pds.wang.pick.pick.tx.F3D5.rabund
stability.trim.contigs.good.unique.good.filter.unique.precluster.pick.pds.wang.pick.pick.tx.F3D6.rabund
stability.trim.contigs.good.unique.good.filter.unique.precluster.pick.pds.wang.pick.pick.tx.F3D7.rabund
stability.trim.contigs.good.unique.good.filter.unique.precluster.pick.pds.wang.pick.pick.tx.F3D8.rabund
stability.trim.contigs.good.unique.good.filter.unique.precluster.pick.pds.wang.pick.pick.tx.F3D9.rabund

mothur > classify.otu(list=stability.trim.contigs.good.unique.good.filter.unique.precluster.pick.pds.wang.pick.pick.tx.list, count=stability.trim.contigs.good.unique.good.filter.unique.precluster.uchime.pick.pick.pick.count_table, taxonomy=stability.trim.contigs.good.unique.good.filter.unique.precluster.pick.pds.wang.pick.pick.taxonomy, label=1)
reftaxonomy is not required, but if given will keep the rankIDs in the summary file static.
1 64

Output File Names:
stability.trim.contigs.good.unique.good.filter.unique.precluster.pick.pds.wang.pick.pick.tx.1.cons.taxonomy
stability.trim.contigs.good.unique.good.filter.unique.precluster.pick.pds.wang.pick.pick.tx.1.cons.tax.summary

mothur > system(mv stability.trim.contigs.good.unique.good.filter.unique.precluster.pick.pick.pick.an.unique_list.shared stability.an.shared)

mothur > count.groups(shared=stability.an.shared)
Mock contains 4061.

Total seqs: 4061.

Output File Names:
count.summary

So, I believe I have gone through all the necessary commands.

Please comment.

Thanks,

Jatinder

pschloss · September 15, 2014, 9:12pm

Based on the commands you entered, I believe that this…

mothur > system(mv stability.trim.contigs.good.unique.good.filter.unique.precluster.pick.pick.pick.an.unique_list.shared stability.an.shared)

should be this…

mothur > system(mv stability.trim.contigs.good.unique.good.filter.unique.precluster.pick.pds.wang.pick.pick.tx.shared stability.an.shared)

Note that you didn’t redo the cluster/cluster.split and make.shared commands on the OTU data, just the phylotype data.

Pat
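
A quick way to catch this kind of mix-up is to check which groups a shared file actually contains: after remove.groups(groups=Mock), the phylotype shared file should list the F3D samples and no Mock. Here is a minimal Python sketch of that check, assuming the usual tab-delimited shared layout (label, Group, numOtus, then counts). The helper names and the toy data are hypothetical and not part of mothur.

```python
def groups_in_shared(shared_text):
    """Return the group names found in a shared file's text, in file order."""
    groups = []
    for line in shared_text.splitlines():
        fields = line.split("\t")
        # Skip blank lines and the header row (label, Group, numOtus, ...).
        if len(fields) < 3 or fields[0] == "label":
            continue
        if fields[1] not in groups:
            groups.append(fields[1])
    return groups


def still_contains(shared_text, removed_group="Mock"):
    """True if a group that should have been removed is still present."""
    return removed_group in groups_in_shared(shared_text)


# Toy data for illustration only; the counts are invented.
stale = "label\tGroup\tnumOtus\tOtu1\tOtu2\n0.03\tMock\t2\t10\t4\n"
fresh = (
    "label\tGroup\tnumOtus\tOtu1\tOtu2\n"
    "1\tF3D0\t2\t7\t3\n"
    "1\tF3D1\t2\t2\t9\n"
)
print(groups_in_shared(stale), still_contains(stale))
print(groups_in_shared(fresh), still_contains(fresh))
```

A file that reports only Mock, as in the log above, was produced before remove.groups was run and is not the phylotype output.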

Jatinder · September 16, 2014, 1:55pm

Hi Pat,

Thanks much. I will give this a try.

Jatinder

Jatinder · September 16, 2014, 7:09pm

Hi Pat,

That worked and I went smoothly through till the alpha diversity part of the SOP.

Now, I have two questions:

1. In the beta diversity part of the analysis, the SOP uses the file stability.an.0.03.subsample.shared in heatmap.bin, venn, get.communitytype, metastats and many other commands. I did not generate this file. However, the sub.sample(shared=stability.an.shared, size=2241) command generated the file stability.an.1.subsample.shared. Should I be using this file for the beta diversity analysis?
2. The phylotype analysis did not generate the file stability.trim.contigs.good.unique.precluster.pick.pick.pick.an.unique_list.0.03.cons.taxonomy. Which file should I rename to stability.an.cons.taxonomy for use in the beta diversity analysis?

Thank you,

Jatinder

pschloss · September 17, 2014, 3:27pm

You want the same files, but they will have a tx in them.
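
As a rough illustration of that rule, the sketch below picks the phylotype outputs out of a list of file names by looking for the ".tx." tag. The file names are the ones that appear in the log earlier in this thread; the helper function is hypothetical and only shows the naming pattern, it is not a mothur feature.

```python
def phylotype_outputs(filenames, suffix):
    """Return names that carry the phylotype '.tx.' tag and end with suffix."""
    return [n for n in filenames if ".tx." in n and n.endswith(suffix)]


# Shared prefix of the file names shown in the log above.
prefix = "stability.trim.contigs.good.unique.good.filter.unique.precluster.pick"
names = [
    prefix + ".pds.wang.pick.pick.taxonomy",            # per-sequence taxonomy
    prefix + ".pds.wang.pick.pick.tx.shared",           # phylotype shared file
    prefix + ".pds.wang.pick.pick.tx.1.cons.taxonomy",  # phylotype consensus taxonomy
    prefix + ".pick.pick.an.unique_list.shared",        # OTU-based shared file
]

print(phylotype_outputs(names, ".shared"))
print(phylotype_outputs(names, ".cons.taxonomy"))
```

Note that the phylotype files carry label 1 where the OTU-based files carry 0.03, which is why the names differ in that position as well.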

Jatinder · September 17, 2014, 7:11pm

Hi Pat,

Thank you. I have 5 different files ending in “.taxonomy”. However, only one of them has “tx” in its name: stability.trim.contigs.good.unique.good.filter.unique.precluster.pick.pds.wang.pick.pick.tx.1.cons.taxonomy. Is this the right file to use for the beta diversity analysis?

Also, is stability.an.1.subsample.shared the right file to use in place of stability.an.0.03.subsample.shared? Please clarify.

Thank you,

Jatinder

pschloss · September 22, 2014, 12:31pm

Are you following and understanding the SOP? To generate the phylotype-based files you need to follow the instructions found here:

http://www.mothur.org/wiki/MiSeq_SOP#Phylotypes

This will generate a shared file and a cons.taxonomy file for you that are based on the phylotype data. A shared file is what is used in the beta diversity analysis. If you look at the beta-diversity analysis steps, you will see this.

Pat

Jatinder · September 22, 2014, 1:39pm

Thank you.

Yes, I am following the SOP for phylotype-based analysis. Before analyzing my own data, I wanted to go through the SOP using the data you provide and make sure that I was getting the expected output. I also wanted to make sure that I was using the right files for the analysis, hence the very basic questions about the file names.
