Sanger 16S rRNA gene analysis

uihyun · April 20, 2021, 5:58am

Hi. i’m done with the analysis on my sequence data from Sanger sequencing.
I’m new in sort of these analysis so i want to confirm if i did it each step correctly.
I referred to the example “Esophageal community analysis”.

I aligned the fasta files(almost full length) with silva.seed_v132.align as you’ve recommended. My purpose on the analysis was that after clustering into OTUs, getting the representative OTUs for further taxonomy assignment.
The followings is the commands that i used.

OTU clustering for cultured samples

step

setting the directory
set.dir(input=/Users/uihy/Downloads/silva.seed_v132)
```
Mothur's directories:
inputDir=/Users/uihy/Downloads/silva.seed_v132/
```

alignment with silva alignment database
align.seqs(fasta=stock_re.fasta, reference=silva.seed_v132.align, processors=2)

Using 2 processors.

Reading in the /Users/uihy/Downloads/silva.seed_v132/silva.seed_v132.align template sequences...	DONE.
It took 12 to read  11180 sequences.

Aligning sequences from /Users/uihy/Downloads/silva.seed_v132/stock_541.fasta ...
It took 6 secs to align 541 sequences.


It took 6 seconds to align 541 sequences.

Output File Names: 
/Users/uihy/Downloads/silva.seed_v132/stock_541.align
/Users/uihy/Downloads/silva.seed_v132/stock_541.align.report

summary the alignment file
summary.seqs(stock_541.align)

Using 2 processors.

	Start	End	NBases	Ambigs	Polymer	NumSeqs
Minimum:	1046	41517	1221	0	5	1
2.5%-tile:	1148	41572	1249	0	5	14
25%-tile:	1726	42565	1287	0	5	136
Median: 	1769	42581	1309	0	6	271
75%-tile:	2032	42622	1335	0	6	406
97.5%-tile:	2086	42984	1372	0	6	528
Maximum:	2540	43017	1402	0	9	541
Mean:	1726	42563	1309	0	5
# of Seqs:	541

It took 0 secs to summarize 541 sequences.

Output File Names:
/Users/uihy/Downloads/silva.seed_v132/stock_541.summary

get the same region by modifying the start point and end point
screen.seqs(fasta=stock_541.align, group=stock541.groups,start=2086, end=41517, maxambig=5)

Using 2 processors.

It took 0 secs to screen 541 sequences, removed 13.

/******************************************/
Running command: remove.seqs(accnos=/Users/uihy/Downloads/silva.seed_v132/stock_541.bad.accnos.temp, group=/Users/uihyeontae/Downloads/silva.seed_v132/stock541.groups)
Removed 13 sequences from your group file.

Output File Names:
/Users/uihy/Downloads/silva.seed_v132/stock541.pick.groups

/******************************************/

Output File Names:
/Users/uihy/Downloads/silva.seed_v132/stock_541.good.align
/Users/uihy/Downloads/silva.seed_v132/stock_541.bad.accnos
/Users/uihy/Downloads/silva.seed_v132/stock541.good.groups


It took 0 secs to screen 541 sequences.

delete the gaps
filter.seqs(fasta=stock_541.good.align, trump=., vertical=T)

Using 2 processors.
Creating Filter...
It took 0 secs to create filter for 528 sequences.


Running Filter...
It took 0 secs to filter 528 sequences.

Length of filtered alignment: 1387
Number of columns removed: 48613
Length of the original alignment: 50000
Number of sequences used to construct filter: 528

Output File Names: 
/Users/uihy/Downloads/silva.seed_v132/stock_541.filter
/Users/uihy/Downloads/silva.seed_v132/stock_541.good.filter.fasta

calculate the distance
dist.seqs(fasta=stock_541.good.filter.fasta, output=lt)

Using 2 processors.

Sequence	Time	Num_Dists_Below_Cutoff

It took 1 secs to find distances for 528 sequences. 139128 distances below cutoff 1.


Output File Names: 
/Users/uihy/Downloads/silva.seed_v132/stock_541.good.filter.phylip.dist

cluster the OTU (cutoff =98.7%)
cluster(phylip=stock_541.good.filter.phylip.dist, cutoff=0.013)

Using 2 processors.

Clustering /Users/uihy/Downloads/silva.seed_v132/stock_541.good.filter.phylip.dist

iter	time	label	num_otus	cutoff	tp	tn	fp	fn	sensitivity	specificity	ppv	npv	fdr	accuracy	mcc	f1score

0.013
0	0	0.013	528	0.013	0	128888	0	10240	0	1	0	0.926399	1	0.926399	0	0	
1	0	0.013	88	0.013	10229	128874	14	11	0.998926	0.999891	0.998633	0.999915	0.998633	0.99982	0.998683	0.998779	
2	0	0.013	87	0.013	10236	128871	17	4	0.999609	0.999868	0.998342	0.999969	0.998342	0.999849	0.998894	0.998975	
3	0	0.013	87	0.013	10236	128871	17	4	0.999609	0.999868	0.998342	0.999969	0.998342	0.999849	0.998894	0.998975	


It took 0 seconds to cluster

Output File Names: 
/Users/uihy/Downloads/silva.seed_v132/stock_541.good.filter.phylip.opti_mcc.list
/Users/uihy/Downloads/silva.seed_v132/stock_541.good.filter.phylip.opti_mcc.steps
/Users/uihy/Downloads/silva.seed_v132/stock_541.good.filter.phylip.opti_mcc.sensspec

get the representative fasta files and .names files
get.oturep(phylip=stock_541.good.filter.phylip.dist, list=stock_541.good.filter.phylip.opti_mcc.list, fasta=stock_541.fasta)

You did not provide a label, using 0.013.
0.013	87

Output File Names: 
/Users/uihy/Downloads/silva.seed_v132/stock_541.good.filter.phylip.opti_mcc.0.013.rep.names
/Users/uihy/Downloads/silva.seed_v132/stock_541.good.filter.phylip.opti_mcc.0.013.rep.fasta

get the representative OTU list with OTU number
get.otulist(list=stock_541.good.filter.phylip.opti_mcc.list, label=0.013)
```
Output File Names: 
/Users/uihy/Downloads/silva.seed_v132/stock_541.good.filter.phylip.opti_mcc.0.013.otu
```

Summary

This text will be hidden

pschloss · April 22, 2021, 1:06pm

Looks good to me. My only comment would be that maxambig should really never be anything other than zero.

Good work!
Pat

uihyun · April 22, 2021, 5:17pm

Thank you for your reply as I’ve really needed one who can give feedback on my analysis process.

Mothur makes me to have great interest in getting started to analyze multiple sequences!

Thank you again for your feedback!

system · May 2, 2021, 5:18pm

This topic was automatically closed 10 days after the last reply. New replies are no longer allowed.

Topic		Replies	Views
Analysis Example of 16s RNA gene for Beginners Commands in mothur	0	6348	December 4, 2011
Lost in the analyzes Commands in mothur	3	313	December 31, 2022
Cluster sequence into OTUs Theory behind mothur	23	15383	January 9, 2015
Clustering at 98% identity threshold level Commands in mothur	2	832	August 9, 2017
Each and All of Sequences adressed to indiviuals OTU, how can I solve this? Theory behind mothur	8	2856	June 30, 2016

Sanger 16S rRNA gene analysis

OTU clustering for cultured samples

step

Related topics