Individual count table and align files for each sample

ehy026 · February 27, 2023, 6:06pm

Hi all,

I am running a test pipeline on 18S V4 MiSeq data for five samples. Any help is appreciated, as I am new to mothur.

After running chimera.vsearch, I end up with larger align and count table files, as expected, but mothur also creates individual align and count table files for each of my samples (e.g., stability.trim.contigs.good.unique.good.SAMPLENAME1.align and stability.trim.contigs.good.unique.good.SAMPLENAME1.count_table). When I look at results from other people, I do not see these individual files in the destination folder.

My question is: are these individual align and count table files a normal mothur output, or do they indicate a problem in my pipeline?

Below is my logfile:
Windows version

Using Boost
mothur v.1.48.0
Last updated: 5/20/22
by
Patrick D. Schloss

Department of Microbiology & Immunology

University of Michigan
http://www.mothur.org

When using, please cite:
Schloss, P.D., et al., Introducing mothur: Open-source, platform-independent, community-supported software for describing and comparing microbial communities. Appl Environ Microbiol, 2009. 75(23):7537-41.

Distributed under the GNU General Public License

Type 'help()' for information on the commands that are available

For questions and analysis support, please visit our forum at https://forum.mothur.org

Type 'quit()' to exit program

[NOTE]: Setting random seed to 19760620.

Interactive Mode



mothur > 
make.file(inputdir=., type=fastq, prefix=stability
[ERROR]: You are missing )
[ERROR]: Invalid.

mothur > 
make.file(inputdir=., type=fastq, prefix=stability)
Setting input directories to: 
	C:\Users\erika\Documents\MiSeqNEMtest7\


Output File Names: 
C:\Users\erika\Documents\MiSeqNEMtest7\stability.files


mothur > 
make.contigs(file=stability.files) 

Using 4 processors.

>>>>>	Processing file pair NL13_R1_MI.M06648_0253.001.FLD_ill_129_i7_IDT_i5_5.fastq - NL13_R2_MI.M06648_0253.001.FLD_ill_129_i7_IDT_i5_5.fastq (files 1 of 5)	<<<<<
Making contigs...
Done.

It took 33 secs to assemble 80140 reads.


>>>>>	Processing file pair NL20_R1_MI.M06648_0253.001.FLD_ill_136_i7_IDT_i5_5.fastq - NL20_R2_MI.M06648_0253.001.FLD_ill_136_i7_IDT_i5_5.fastq (files 2 of 5)	<<<<<
Making contigs...
Done.

It took 30 secs to assemble 78803 reads.


>>>>>	Processing file pair NL29_R1_MI.M06648_0253.001.FLD_ill_145_i7_IDT_i5_7.fastq - NL29_R2_MI.M06648_0253.001.FLD_ill_145_i7_IDT_i5_7.fastq (files 3 of 5)	<<<<<
Making contigs...
Done.

It took 33 secs to assemble 85306 reads.


>>>>>	Processing file pair NL37_R1_MI.M06648_0253.001.FLD_ill_153_i7_IDT_i5_8.fastq - NL37_R2_MI.M06648_0253.001.FLD_ill_153_i7_IDT_i5_8.fastq (files 4 of 5)	<<<<<
Making contigs...
Done.

It took 32 secs to assemble 83243 reads.


>>>>>	Processing file pair NL45_R1_MI.M06648_0253.001.FLD_ill_161_i7_IDT_i5_9.fastq - NL45_R2_MI.M06648_0253.001.FLD_ill_161_i7_IDT_i5_9.fastq (files 5 of 5)	<<<<<
Making contigs...
Done.

It took 41 secs to assemble 100514 reads.


Group count: 
NL13	80140
NL20	78803
NL29	85306
NL37	83243
NL45	100514

Total of all groups is 428006

It took 175 secs to process 428006 sequences.

Output File Names: 
stability.trim.contigs.fasta
stability.scrap.contigs.fasta
stability.contigs_report
stability.contigs.count_table


mothur > 
screen.seqs(fasta=stability.trim.contigs.fasta, count=stability.contigs.count_table, maxambig=0,minlength=150,  maxlength=450, maxhomop=8) 

Using 4 processors.

It took 5 secs to screen 428006 sequences, removed 62659.

/******************************************/
Running command: remove.seqs(accnos=stability.trim.contigs.bad.accnos.temp, count=stability.contigs.count_table)
Removed 62659 sequences from stability.contigs.count_table.

Output File Names:
stability.contigs.pick.count_table

/******************************************/

Output File Names:
stability.trim.contigs.good.fasta
stability.trim.contigs.bad.accnos
stability.contigs.good.count_table


It took 17 secs to screen 428006 sequences.

mothur > 
unique.seqs(fasta= stability.trim.contigs.good.fasta, count= stability.contigs.good.count_table)
365347	97773

Output File Names: 
stability.trim.contigs.good.unique.fasta
stability.trim.contigs.good.count_table


mothur > 
align.seqs(fasta= stability.trim.contigs.good.unique.fasta, reference=silva.seed_v138_1.align)

Using 4 processors.

Reading in the silva.seed_v138_1.align template sequences...	DONE.
It took 7 to read  7641 sequences.

Aligning sequences from stability.trim.contigs.good.unique.fasta ...
It took 340 secs to align 97773 sequences.

[WARNING]: 12 of your sequences generated alignments that eliminated too many bases, a list is provided in stability.trim.contigs.good.unique.flip.accnos.
[NOTE]: 2 of your sequences were reversed to produce a better alignment.

It took 341 seconds to align 97773 sequences.

Output File Names: 
stability.trim.contigs.good.unique.align
stability.trim.contigs.good.unique.align_report
stability.trim.contigs.good.unique.flip.accnos


mothur > 
screen.seqs(fasta= stability.trim.contigs.good.unique.align, count= stability.trim.contigs.good.count_table, start=33287, end=41796) 

Using 4 processors.

It took 49 secs to screen 97773 sequences, removed 4750.

/******************************************/
Running command: remove.seqs(accnos=stability.trim.contigs.good.unique.bad.accnos.temp, count=stability.trim.contigs.good.count_table)
Removed 8097 sequences from stability.trim.contigs.good.count_table.

Output File Names:
stability.trim.contigs.good.pick.count_table

/******************************************/

Output File Names:
stability.trim.contigs.good.unique.good.align
stability.trim.contigs.good.unique.bad.accnos
stability.trim.contigs.good.good.count_table


It took 96 secs to screen 97773 sequences.

mothur > 
filter.seqs(fasta=current, vertical=T, trump=.)
Using stability.trim.contigs.good.unique.good.align as input file for the fasta parameter.

Using 4 processors.
Creating Filter...
It took 41 secs to create filter for 93023 sequences.


Running Filter...
It took 37 secs to filter 93023 sequences.



Length of filtered alignment: 802
Number of columns removed: 49198
Length of the original alignment: 50000
Number of sequences used to construct filter: 93023

Output File Names: 
stability.filter
stability.trim.contigs.good.unique.good.filter.fasta


mothur > 
unique.seqs(fasta=current, count=current)
Using stability.trim.contigs.good.good.count_table as input file for the count parameter.
Using stability.trim.contigs.good.unique.good.filter.fasta as input file for the fasta parameter.
93023	92937

Output File Names: 
stability.trim.contigs.good.unique.good.filter.unique.fasta
stability.trim.contigs.good.unique.good.filter.count_table


mothur > 
pre.cluster(fasta=current, count=current, diffs=2)
Using stability.trim.contigs.good.unique.good.filter.count_table as input file for the count parameter.
Using stability.trim.contigs.good.unique.good.filter.unique.fasta as input file for the fasta parameter.

Using 4 processors.

/******************************************/
Splitting by sample: 

Using 4 processors.

Selecting sequences for groups NL29


Selecting sequences for groups NL13


Selecting sequences for groups NL20


Selecting sequences for groups NL37-NL45

Selected 17628 sequences from NL20.
Selected 22638 sequences from NL13.
Selected 19577 sequences from NL29.
Selected 20876 sequences from NL37.
Selected 17791 sequences from NL45.

It took 7 seconds to split the dataset by sample.
/******************************************/

Processing group NL20:

Processing group NL29:

Processing group NL13:

Processing group NL37:
NL20	17628	5881	11747
Total number of sequences before pre.cluster was 17628.
pre.cluster removed 11747 sequences.

It took 8 secs to cluster 17628 sequences.
NL29	19577	6474	13103
Total number of sequences before pre.cluster was 19577.
pre.cluster removed 13103 sequences.

It took 8 secs to cluster 19577 sequences.
NL37	20876	6976	13900
Total number of sequences before pre.cluster was 20876.
pre.cluster removed 13900 sequences.

It took 10 secs to cluster 20876 sequences.

Processing group NL45:
NL13	22638	8200	14438
Total number of sequences before pre.cluster was 22638.
pre.cluster removed 14438 sequences.

It took 11 secs to cluster 22638 sequences.
NL45	17791	5589	12202
Total number of sequences before pre.cluster was 17791.
pre.cluster removed 12202 sequences.

It took 4 secs to cluster 17791 sequences.

Deconvoluting count table results...
It took 0 secs to merge 33120 sequences group data.
/******************************************/
Running get.seqs: 
Selected 32899 sequences from stability.trim.contigs.good.unique.good.filter.unique.fasta.
/******************************************/
It took 25 secs to run pre.cluster.

Using 4 processors.

Output File Names: 
stability.trim.contigs.good.unique.good.filter.unique.precluster.fasta
stability.trim.contigs.good.unique.good.filter.unique.precluster.count_table
stability.trim.contigs.good.unique.good.filter.unique.precluster.NL13.map
stability.trim.contigs.good.unique.good.filter.unique.precluster.NL20.map
stability.trim.contigs.good.unique.good.filter.unique.precluster.NL29.map
stability.trim.contigs.good.unique.good.filter.unique.precluster.NL37.map
stability.trim.contigs.good.unique.good.filter.unique.precluster.NL45.map


mothur > 
chimera.vsearch(fasta=current, count=current, dereplicate=t) 
Using stability.trim.contigs.good.unique.good.filter.unique.precluster.count_table as input file for the count parameter.
Using stability.trim.contigs.good.unique.good.filter.unique.precluster.fasta as input file for the fasta parameter.

Using 4 processors.
Checking sequences from stability.trim.contigs.good.unique.good.filter.unique.precluster.fasta ...

/******************************************/
Splitting by sample: 

Using 4 processors.

Selecting sequences for groups NL29


Selecting sequences for groups NL13


Selecting sequences for groups NL20


Selecting sequences for groups NL37-NL45

Selected 6474 sequences from NL29.
Selected 5881 sequences from NL20.
Selected 8200 sequences from NL13.
Selected 6976 sequences from NL37.
Selected 5589 sequences from NL45.

It took 3 seconds to split the dataset by sample.
/******************************************/

It took 26 secs to check 5881 sequences from group NL20.

It took 28 secs to check 6474 sequences from group NL29.

It took 30 secs to check 6976 sequences from group NL37.

It took 34 secs to check 8200 sequences from group NL13.

It took 15 secs to check 5589 sequences from group NL45.
It took 45 secs to check 33120 sequences.


Removing chimeras from your input files:
/******************************************/
Running command: remove.seqs(fasta=stability.trim.contigs.good.unique.good.filter.unique.precluster.fasta, accnos=stability.trim.contigs.good.unique.good.filter.unique.precluster.denovo.vsearch.accnos)
Removed 2952 sequences from stability.trim.contigs.good.unique.good.filter.unique.precluster.fasta.

Output File Names:
stability.trim.contigs.good.unique.good.filter.unique.precluster.pick.fasta

/******************************************/

Output File Names:
stability.trim.contigs.good.unique.good.filter.unique.precluster.denovo.vsearch.count_table
stability.trim.contigs.good.unique.good.filter.unique.precluster.denovo.vsearch.chimeras
stability.trim.contigs.good.unique.good.filter.unique.precluster.denovo.vsearch.accnos
stability.trim.contigs.good.unique.good.filter.unique.precluster.denovo.vsearch.fasta


mothur > 
classify.seqs(fasta=current, count=current, reference=silva.nr_v138_1.align, taxonomy=silva.nr_v138_1.tax)
Using stability.trim.contigs.good.unique.good.filter.unique.precluster.denovo.vsearch.count_table as input file for the count parameter.
Using stability.trim.contigs.good.unique.good.filter.unique.precluster.denovo.vsearch.fasta as input file for the fasta parameter.

Using 4 processors.
Generating search database...    DONE.
It took 272 seconds generate search database.

Reading in the silva.nr_v138_1.tax taxonomy...	DONE.
Calculating template taxonomy tree...     DONE.
Calculating template probabilities...     DONE.
It took 519 seconds get probabilities.
Classifying sequences from stability.trim.contigs.good.unique.good.filter.unique.precluster.denovo.vsearch.fasta ...

It took 836 secs to classify 29947 sequences.


It took 6 secs to create the summary file for 29947 sequences.


Output File Names: 
stability.trim.contigs.good.unique.good.filter.unique.precluster.denovo.vsearch.nr_v138_1.wang.taxonomy
stability.trim.contigs.good.unique.good.filter.unique.precluster.denovo.vsearch.nr_v138_1.wang.tax.summary


mothur > 
remove.lineage(fasta=current, count=current, taxonomy=current, taxon=Vertebrata-Fungi)  
Using stability.trim.contigs.good.unique.good.filter.unique.precluster.denovo.vsearch.count_table as input file for the count parameter.
Using stability.trim.contigs.good.unique.good.filter.unique.precluster.denovo.vsearch.fasta as input file for the fasta parameter.
Using stability.trim.contigs.good.unique.good.filter.unique.precluster.denovo.vsearch.nr_v138_1.wang.taxonomy as input file for the taxonomy parameter.

/******************************************/
Running command: remove.seqs(accnos=stability.trim.contigs.good.unique.good.filter.unique.precluster.denovo.vsearch.nr_v138_1.wang.accnos, count=stability.trim.contigs.good.unique.good.filter.unique.precluster.denovo.vsearch.count_table, fasta=stability.trim.contigs.good.unique.good.filter.unique.precluster.denovo.vsearch.fasta)
Removed 728 sequences from stability.trim.contigs.good.unique.good.filter.unique.precluster.denovo.vsearch.fasta.
Removed 18357 sequences from stability.trim.contigs.good.unique.good.filter.unique.precluster.denovo.vsearch.count_table.

Output File Names:
stability.trim.contigs.good.unique.good.filter.unique.precluster.denovo.vsearch.pick.fasta
stability.trim.contigs.good.unique.good.filter.unique.precluster.denovo.vsearch.pick.count_table

/******************************************/

Output File Names:
stability.trim.contigs.good.unique.good.filter.unique.precluster.denovo.vsearch.nr_v138_1.wang.pick.taxonomy
stability.trim.contigs.good.unique.good.filter.unique.precluster.denovo.vsearch.nr_v138_1.wang.accnos
stability.trim.contigs.good.unique.good.filter.unique.precluster.denovo.vsearch.pick.count_table
stability.trim.contigs.good.unique.good.filter.unique.precluster.denovo.vsearch.pick.fasta


mothur > 
summary.tax(taxonomy=current, count=current) 
Using stability.trim.contigs.good.unique.good.filter.unique.precluster.denovo.vsearch.pick.count_table as input file for the count parameter.
Using stability.trim.contigs.good.unique.good.filter.unique.precluster.denovo.vsearch.nr_v138_1.wang.pick.taxonomy as input file for the taxonomy parameter.

It took 1 secs to create the summary file for 329009 sequences.


Output File Names: 
stability.trim.contigs.good.unique.good.filter.unique.precluster.denovo.vsearch.nr_v138_1.wang.pick.tax.summary


mothur > 
rename.file(fasta=current, count=current, taxonomy=current, prefix=final)
Using stability.trim.contigs.good.unique.good.filter.unique.precluster.denovo.vsearch.pick.count_table as input file for the count parameter.
Using stability.trim.contigs.good.unique.good.filter.unique.precluster.denovo.vsearch.pick.fasta as input file for the fasta parameter.
Using stability.trim.contigs.good.unique.good.filter.unique.precluster.denovo.vsearch.nr_v138_1.wang.pick.taxonomy as input file for the taxonomy parameter.

Current files saved by mothur:
accnos=stability.trim.contigs.good.unique.good.filter.unique.precluster.denovo.vsearch.nr_v138_1.wang.accnos
fasta=final.fasta
taxonomy=final.taxonomy
contigsreport=stability.contigs_report
count=final.count_table
processors=4
file=C:\Users\erika\Documents\MiSeqNEMtest7\stability.files

mothur > 
: cluster.split(fasta=current, count=current, taxonomy=current, taxlevel=4, cutoff=0.03)
[ERROR]: Invalid command.
[ERROR]: did not complete : cluster.split.

mothur > 
: cluster.split(fasta=final.fasta, count=final.count_table, taxonomy=final.taxonomy, taxlevel=4, cutoff=0.03)
[ERROR]: Invalid command.
[ERROR]: did not complete : cluster.split.

mothur > 
cluster.split(fasta=final.fasta, count=final.count_table, taxonomy=final.taxonomy, taxlevel=4, cutoff=0.03)

Using 4 processors.
Splitting the file...
/******************************************/
Selecting sequences for group Tylenchida (1 of 16)
Number of unique sequences: 7903

Selected 65564 sequences from final.count_table.

Calculating distances for group Tylenchida (1 of 16):

Sequence	Time	Num_Dists_Below_Cutoff

It took 63 secs to find distances for 7903 sequences. 1209651 distances below cutoff 0.03.


Output File Names:
final.0.dist

/******************************************/
/******************************************/
Selecting sequences for group Chromadorea_unclassified (2 of 16)
Number of unique sequences: 11814

Selected 166367 sequences from final.count_table.

Calculating distances for group Chromadorea_unclassified (2 of 16):

Sequence	Time	Num_Dists_Below_Cutoff

It took 141 secs to find distances for 11814 sequences. 8484186 distances below cutoff 0.03.


Output File Names:
final.1.dist

/******************************************/
/******************************************/
Selecting sequences for group Triplonchida (3 of 16)
Number of unique sequences: 111

Selected 1252 sequences from final.count_table.

Calculating distances for group Triplonchida (3 of 16):

Sequence	Time	Num_Dists_Below_Cutoff

It took 0 secs to find distances for 111 sequences. 1475 distances below cutoff 0.03.


Output File Names:
final.2.dist

/******************************************/
/******************************************/
Selecting sequences for group Eukaryota_unclassified (4 of 16)
Number of unique sequences: 3749

Selected 42361 sequences from final.count_table.

Calculating distances for group Eukaryota_unclassified (4 of 16):

Sequence	Time	Num_Dists_Below_Cutoff

It took 12 secs to find distances for 3749 sequences. 525073 distances below cutoff 0.03.


Output File Names:
final.3.dist

/******************************************/
/******************************************/
Selecting sequences for group Rhabditida (5 of 16)
Number of unique sequences: 2888

Selected 35853 sequences from final.count_table.

Calculating distances for group Rhabditida (5 of 16):

Sequence	Time	Num_Dists_Below_Cutoff

It took 9 secs to find distances for 2888 sequences. 1170484 distances below cutoff 0.03.


Output File Names:
final.4.dist

/******************************************/
/******************************************/
Selecting sequences for group Diplogasterida (6 of 16)
Number of unique sequences: 875

Selected 8633 sequences from final.count_table.

Calculating distances for group Diplogasterida (6 of 16):

Sequence	Time	Num_Dists_Below_Cutoff

It took 1 secs to find distances for 875 sequences. 105271 distances below cutoff 0.03.


Output File Names:
final.5.dist

/******************************************/
/******************************************/
Selecting sequences for group Dorylaimia_or (7 of 16)
Number of unique sequences: 414

Selected 3500 sequences from final.count_table.

Calculating distances for group Dorylaimia_or (7 of 16):

Sequence	Time	Num_Dists_Below_Cutoff

It took 0 secs to find distances for 414 sequences. 23454 distances below cutoff 0.03.


Output File Names:
final.6.dist

/******************************************/
/******************************************/
Selecting sequences for group Nematozoa_unclassified (8 of 16)
Number of unique sequences: 829

Selected 1176 sequences from final.count_table.

Calculating distances for group Nematozoa_unclassified (8 of 16):

Sequence	Time	Num_Dists_Below_Cutoff

It took 1 secs to find distances for 829 sequences. 250341 distances below cutoff 0.03.


Output File Names:
final.7.dist

/******************************************/
/******************************************/
Selecting sequences for group Haplotaxida (9 of 16)
Number of unique sequences: 25

Selected 344 sequences from final.count_table.

Calculating distances for group Haplotaxida (9 of 16):

Sequence	Time	Num_Dists_Below_Cutoff

It took 0 secs to find distances for 25 sequences. 184 distances below cutoff 0.03.


Output File Names:
final.8.dist

/******************************************/
/******************************************/
Selecting sequences for group Araeolaimida (10 of 16)
Number of unique sequences: 484

Selected 3670 sequences from final.count_table.

Calculating distances for group Araeolaimida (10 of 16):

Sequence	Time	Num_Dists_Below_Cutoff

It took 1 secs to find distances for 484 sequences. 25078 distances below cutoff 0.03.


Output File Names:
final.9.dist

/******************************************/
/******************************************/
Selecting sequences for group Enoplea_unclassified (11 of 16)
Number of unique sequences: 80

Selected 188 sequences from final.count_table.

Calculating distances for group Enoplea_unclassified (11 of 16):

Sequence	Time	Num_Dists_Below_Cutoff

It took 0 secs to find distances for 80 sequences. 1970 distances below cutoff 0.03.


Output File Names:
final.10.dist

/******************************************/
/******************************************/
Selecting sequences for group Adinetida (12 of 16)
Number of unique sequences: 23

Selected 59 sequences from final.count_table.

Calculating distances for group Adinetida (12 of 16):

Sequence	Time	Num_Dists_Below_Cutoff

It took 0 secs to find distances for 23 sequences. 78 distances below cutoff 0.03.


Output File Names:
final.11.dist

/******************************************/
/******************************************/
Selecting sequences for group Parachaela (13 of 16)
Number of unique sequences: 5

Selected 19 sequences from final.count_table.

Calculating distances for group Parachaela (13 of 16):

Sequence	Time	Num_Dists_Below_Cutoff

It took 0 secs to find distances for 5 sequences. 3 distances below cutoff 0.03.


Output File Names:
final.12.dist

/******************************************/
/******************************************/
Selecting sequences for group Clitellata_unclassified (14 of 16)
Number of unique sequences: 11

Selected 13 sequences from final.count_table.

Calculating distances for group Clitellata_unclassified (14 of 16):

Sequence	Time	Num_Dists_Below_Cutoff

It took 0 secs to find distances for 11 sequences. 45 distances below cutoff 0.03.


Output File Names:
final.13.dist

/******************************************/
/******************************************/
Selecting sequences for group Animalia_or (15 of 16)
Number of unique sequences: 2

Selected 2 sequences from final.count_table.

Calculating distances for group Animalia_or (15 of 16):

Sequence	Time	Num_Dists_Below_Cutoff

It took 0 secs to find distances for 2 sequences. 0 distances below cutoff 0.03.


Output File Names:
final.14.dist

/******************************************/
/******************************************/
Selecting sequences for group Liliopsida (16 of 16)
Number of unique sequences: 2

Selected 2 sequences from final.count_table.

Calculating distances for group Liliopsida (16 of 16):

Sequence	Time	Num_Dists_Below_Cutoff

It took 0 secs to find distances for 2 sequences. 1 distances below cutoff 0.03.


Output File Names:
final.15.dist

/******************************************/
/******************************************/
Finding singletons (ignore 'Removing group' messages):

Running command: remove.seqs()
Removed 329001 sequences from final.count_table.
/******************************************/
It took 232 seconds to split the distance file.
final.0.disttemp
final.5.disttemp
final.2.disttemp
final.12.disttemp

Clustering final.12.disttemp

Clustering final.13.disttemp

tp	tn	fp	fn	sensitivity	specificity	ppv	npv	fdr	accuracy	mcc	f1score

tp	tn	fp	fn	sensitivity	specificity	ppv	npv	fdr	accuracy	mcc	f1score
0	7	0	3	0	10	0	45	0	0	1	1	
Clustering final.15.disttemp
0	0	0.181818	0.7	1	1	0.181818	0	0	0.7	


tp	tn	fp	fn	sensitivity	specificity	ppv	npv	fdr	accuracy	mcc	f1score
0	0	0	0	1	0	0	

0	0	0	1	0	0	0	


Clustering final.8.disttemp

tp	tn	fp	fn	sensitivity	specificity	ppv	npv	fdr	accuracy	mcc	f1score
184	108	8	0	1	0.931034	0.958333	1	0.958333	0.973333	0.944585	0.978723	


Clustering final.2.disttemp

tp	tn	fp	fn	sensitivity	specificity	ppv	npv	fdr	accuracy	mcc	f1score
1445	4602	28	30	0.979661	0.993952	0.980991	0.993523	0.980991	0.9905	0.974064	0.980326	


Clustering final.10.disttemp

tp	tn	fp	fn	sensitivity	specificity	ppv	npv	fdr	accuracy	mcc	f1score
1959	1185	5	11	0.994416	0.995798	0.997454	0.990803	0.997454	0.994937	0.989235	0.995933	


Clustering final.9.disttemp

tp	tn	fp	fn	sensitivity	specificity	ppv	npv	fdr	accuracy	mcc	f1score
21892	90163	1645	3186	0.872956	0.982082	0.93011	0.96587	0.93011	0.958669	0.87527	0.900627	


Clustering final.5.disttemp

tp	tn	fp	fn	sensitivity	specificity	ppv	npv	fdr	accuracy	mcc	f1score
102969	275200	1904	2302	0.978133	0.993129	0.981845	0.991705	0.981845	0.989	0.972405	0.979985	


Clustering final.7.disttemp

tp	tn	fp	fn	sensitivity	specificity	ppv	npv	fdr	accuracy	mcc	f1score
250016	89308	3557	325	0.998702	0.961697	0.985972	0.996374	0.985972	0.988689	0.971311	0.992296	


Clustering final.3.disttemp

tp	tn	fp	fn	sensitivity	specificity	ppv	npv	fdr	accuracy	mcc	f1score
519952	6.48142e+06	19133	5121	0.990247	0.997057	0.964508	0.999211	0.964508	0.996548	0.97544	0.977208	


Clustering final.6.disttemp

tp	tn	fp	fn	sensitivity	specificity	ppv	npv	fdr	accuracy	mcc	f1score
23147	61245	792	307	0.986911	0.987233	0.966916	0.995012	0.966916	0.987145	0.968017	0.976811	


Clustering final.11.disttemp

tp	tn	fp	fn	sensitivity	specificity	ppv	npv	fdr	accuracy	mcc	f1score
73	175	0	5	0.935897	1	1	0.972222	1	0.980237	0.953887	0.966887	


Clustering final.4.disttemp

Clustering final.0.disttemp

tp	tn	fp	fn	sensitivity	specificity	ppv	npv	fdr	accuracy	mcc	f1score
1.16217e+06	2.96566e+06	32682	8313	0.992898	0.9891	0.972648	0.997205	0.972648	0.990166	0.975906	0.982668	


tp	tn	fp	fn	sensitivity	specificity	ppv	npv	fdr	accuracy	mcc	f1score
1.18525e+06	2.99823e+07	32841	24401	0.979828	0.998906	0.973039	0.999187	0.973039	0.998167	0.975474	0.976422	


Clustering final.1.disttemp

tp	tn	fp	fn	sensitivity	specificity	ppv	npv	fdr	accuracy	mcc	f1score
7.83585e+06	6.10008e+07	294450	648334	0.923583	0.995196	0.963784	0.989483	0.963784	0.986489	0.935864	0.943255	

It took 159 seconds to cluster
Merging the clustered files...
It took 0 seconds to merge.
/******************************************/
Running command: sens.spec(cutoff=0.03, list=final.opti_mcc.list, column=final.dist, count=final.count_table)

NOTE: sens.spec assumes that only unique sequences were used to generate the distance matrix.

label	cutoff	numotus	tp	tn	fp	fn	sensitivity	specificity	ppv	npv	fdr	accuracy	mcc	f1score
0.03
0.03	0.03	2553	1.11049e+07	4.14676e+08	387045	692384	0.94131	0.999068	0.96632	0.998333	0.96632	0.997471	0.952438	0.953651

It took 162 to run sens.spec.

Output File Names: 
final.opti_mcc.sensspec

/******************************************/
Done.


label	cutoff	numotus	tp	tn	fp	fn	sensitivity	specificity	ppv	npv	fdr	accuracy	mcc	f1score
0.03	0.03	2553	1.11049e+07	4.14676e+08	387045	692384	0.9413	0.9991	0.9663	0.9983	0.9663	0.9975	0.9524	0.9537

Output File Names: 
final.dist
final.opti_mcc.list
final.opti_mcc.sensspec


mothur > 
make.shared(list=final.opti_mcc.list, count=final.count_table, label=0.03)
0.03

Output File Names:
final.opti_mcc.shared


mothur > 
classify.otu(list=final.opti_mcc.list, count=final.count_table, taxonomy=final.taxonomy, label=0.03)
0.03

Output File Names: 
final.opti_mcc.0.03.cons.taxonomy
final.opti_mcc.0.03.cons.tax.summary


mothur > 
count.groups(shared=final.opti_mcc.shared) 
NL13 contains 64554.
NL20 contains 44657.
NL29 contains 69452.
NL37 contains 68144.
NL45 contains 82202.

Size of smallest group: 44657.

Total seqs: 329009.

Output File Names: 
final.opti_mcc.count.summary


mothur > 
sub.sample(shared=final.opti_mcc.shared) 
0.03 
Sampling 44657 from each group.

Output File Names: 
final.opti_mcc.0.03.subsample.shared


mothur > 
rarefaction.single(shared=final.opti_mcc.shared, calc=sobs, freq=100) 

Using 4 processors.

Processing group NL13

0.03

Processing group NL20

0.03

Processing group NL29

0.03

Processing group NL37

0.03

Processing group NL45

0.03

It took 28 secs to run rarefaction.single.

Output File Names: 
final.opti_mcc.groups.rarefaction


mothur >
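
One way to sanity-check a log like this is to tally the read counts from step to step. The sketch below is only an illustration in Python, not part of mothur. The numbers are copied from the log above and the variable names are made up for the example.

```python
# Read counts copied from the log above; all variable names are illustrative.

# make.contigs: reads assembled per group
contigs = {"NL13": 80140, "NL20": 78803, "NL29": 85306, "NL37": 83243, "NL45": 100514}
total_contigs = sum(contigs.values())
assert total_contigs == 428006  # "Total of all groups is 428006"

# The first screen.seqs removed 62659 reads; unique.seqs then reports 365347.
after_screen = total_contigs - 62659
assert after_screen == 365347

# pre.cluster: (sequences before, sequences removed) per group
precluster = {
    "NL13": (22638, 14438),
    "NL20": (17628, 11747),
    "NL29": (19577, 13103),
    "NL37": (20876, 13900),
    "NL45": (17791, 12202),
}
kept = {group: before - removed for group, (before, removed) in precluster.items()}
assert sum(kept.values()) == 33120  # total that chimera.vsearch reports checking

# count.groups on the final shared file
final = {"NL13": 64554, "NL20": 44657, "NL29": 69452, "NL37": 68144, "NL45": 82202}
total_final = sum(final.values())
assert total_final == 329009  # "Total seqs: 329009."

# Fraction of assembled reads that reach the shared file, per group and overall
for group in contigs:
    print(f"{group}: kept {final[group] / contigs[group]:.1%} of assembled reads")
print(f"overall: kept {total_final / total_contigs:.1%}")
```

If any of the assertions failed, it would point to the step where the counts stop adding up. Here they all agree with what the log reports.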

pschloss · February 28, 2023, 6:50pm

Hi - I think you should be in good shape

Pat
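
For anyone who wants to see which files in the output folder are per-sample and which are the combined ones: the log shows a "Splitting by sample" step in both pre.cluster and chimera.vsearch, and pre.cluster lists one .map file per group, so the group name appears as its own dot-separated piece of the filename. The Python sketch below is just an illustration of sorting filenames on that basis. The function name `split_by_sample` is made up, and the NL13 align and count_table names are hypothetical, built from the SAMPLENAME1 pattern in the question.

```python
from collections import defaultdict


def split_by_sample(filenames, groups):
    """Separate combined files from per-sample files.

    A file counts as per-sample when one of the group names appears as a
    whole dot-separated token in its name (an assumption based on the
    filenames shown in this thread).
    """
    combined = []
    per_sample = defaultdict(list)
    for name in filenames:
        tokens = name.split(".")
        hits = [group for group in groups if group in tokens]
        if hits:
            per_sample[hits[0]].append(name)
        else:
            combined.append(name)
    return combined, dict(per_sample)


# Group names as reported by make.contigs in the log
groups = ["NL13", "NL20", "NL29", "NL37", "NL45"]

files = [
    "stability.trim.contigs.good.unique.good.align",
    # Hypothetical names following the SAMPLENAME1 pattern from the question
    "stability.trim.contigs.good.unique.good.NL13.align",
    "stability.trim.contigs.good.unique.good.NL13.count_table",
    # Listed in the pre.cluster output in the log
    "stability.trim.contigs.good.unique.good.filter.unique.precluster.NL13.map",
    "final.count_table",
]

combined, per_sample = split_by_sample(files, groups)
print("combined:", combined)
for group, names in per_sample.items():
    print(group, "->", names)
```

To run it on a real folder, the `files` list could be replaced with `os.listdir(...)` for the output directory. The sketch only sorts names and does not delete anything.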
