Hi
I ran classify.seqs with the SILVA seed database and then remove.lineage.
I have two questions:
- Did I choose the correct names for the groups to be removed? The SOP mentions that the names may differ from those used with the RDP database.
- The output files include an accnos file, and mothur reported "Removed 113 sequences from your fasta file. Removed 180 sequences from your count file.", but I cannot find the accnos file among the files already saved. Any idea why?
The logfile is below. Thanks!!
Susi
mothur > classify.seqs(fasta=site7.trim.contigs.good.unique.good.filter.unique.precluster.pick.fasta, count=site7.trim.contigs.good.unique.good.filter.unique.precluster.denovo.vsearch.pick.count_table, reference=silva.seed_v132.pcr.align, taxonomy=silva.seed_v132.tax, cutoff=80, probs=F)
Using 4 processors.
Generating search database... DONE.
It took 10 seconds generate search database.
Reading in the silva.seed_v132.tax taxonomy... DONE.
Calculating template taxonomy tree... DONE.
Calculating template probabilities... DONE.
It took 26 seconds get probabilities.
Classifying sequences from site7.trim.contigs.good.unique.good.filter.unique.precluster.pick.fasta ...
[WARNING]: M01426_142_000000000-ADATB_1_1118_18228_21944 could not be classified. You can use the remove.lineage command with taxon=unknown; to remove such sequences.
[WARNING]: M02133_32_000000000-AL5EB_1_1109_2115_18523 could not be classified. You can use the remove.lineage command with taxon=unknown; to remove such sequences.
[WARNING]: M01426_142_000000000-ADATB_1_1104_10876_18856 could not be classified. You can use the remove.lineage command with taxon=unknown; to remove such sequences.
[WARNING]: M02133_32_000000000-AL5EB_1_1102_19321_15093 could not be classified. You can use the remove.lineage command with taxon=unknown; to remove such sequences.
[WARNING]: M01426_142_000000000-ADATB_1_1101_13427_10763 could not be classified. You can use the remove.lineage command with taxon=unknown; to remove such sequences.
[WARNING]: M02133_32_000000000-AL5EB_1_2110_25447_19404 could not be classified. You can use the remove.lineage command with taxon=unknown; to remove such sequences.
[WARNING]: M02133_32_000000000-AL5EB_1_1112_8463_23365 could not be classified. You can use the remove.lineage command with taxon=unknown; to remove such sequences.
[WARNING]: M01426_142_000000000-ADATB_1_1110_20705_10861 could not be classified. You can use the remove.lineage command with taxon=unknown; to remove such sequences.
[WARNING]: M02133_32_000000000-AL5EB_1_1101_22053_14186 could not be classified. You can use the remove.lineage command with taxon=unknown; to remove such sequences.
[WARNING]: M02133_32_000000000-AL5EB_1_1106_23790_11496 could not be classified. You can use the remove.lineage command with taxon=unknown; to remove such sequences.
It took 5134 secs to classify 251806 sequences.
It took 12 secs to create the summary file for 251806 sequences.
Output File Names:
site7.trim.contigs.good.unique.good.filter.unique.precluster.pick.seed_v132.wang.taxonomy
site7.trim.contigs.good.unique.good.filter.unique.precluster.pick.seed_v132.wang.tax.summary
mothur > remove.lineage(fasta=site7.trim.contigs.good.unique.good.filter.unique.precluster.pick.fasta, count=site7.trim.contigs.good.unique.good.filter.unique.precluster.denovo.vsearch.pick.count_table, taxonomy=site7.trim.contigs.good.unique.good.filter.unique.precluster.pick.seed_v132.wang.taxonomy, taxon=Chloroplast-Mitochondria-Unclassified-Archaea-Eukaryota)
[NOTE]: The count file should contain only unique names, so mothur assumes your fasta, list and taxonomy files also contain only uniques.
/******************************************/
Running command: remove.seqs(accnos=site7.trim.contigs.good.unique.good.filter.unique.precluster.pick.seed_v132.wang.accnos, count=site7.trim.contigs.good.unique.good.filter.unique.precluster.denovo.vsearch.pick.count_table, fasta=site7.trim.contigs.good.unique.good.filter.unique.precluster.pick.fasta)
[NOTE]: The count file should contain only unique names, so mothur assumes your fasta, list and taxonomy files also contain only uniques.
Removed 113 sequences from your fasta file.
Removed 180 sequences from your count file.
Output File Names:
site7.trim.contigs.good.unique.good.filter.unique.precluster.pick.pick.fasta
site7.trim.contigs.good.unique.good.filter.unique.precluster.denovo.vsearch.pick.pick.count_table
/******************************************/
Output File Names:
site7.trim.contigs.good.unique.good.filter.unique.precluster.pick.seed_v132.wang.pick.taxonomy
site7.trim.contigs.good.unique.good.filter.unique.precluster.pick.seed_v132.wang.accnos
site7.trim.contigs.good.unique.good.filter.unique.precluster.denovo.vsearch.pick.pick.count_table
site7.trim.contigs.good.unique.good.filter.unique.precluster.pick.pick.fasta
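
Regarding the first question, one way to check the taxon names is to count how many lineages in the .wang.taxonomy file actually contain each name passed to remove.lineage. The warnings in the log suggest `unknown` for sequences that could not be classified, whereas the command used `Unclassified`, so it may be worth comparing the two. The following is only a minimal Python sketch: the function name `count_taxon_hits` and the example lineages are made up for illustration, and it assumes that a name matches when it occurs as a case-sensitive substring of the lineage, which may not be exactly how mothur matches.

```python
import re

# Taxon names taken from the remove.lineage call in the log above.
TAXA = ["Chloroplast", "Mitochondria", "Unclassified", "Archaea", "Eukaryota"]


def count_taxon_hits(lines, taxa):
    """Count lineages containing each taxon name.

    Assumption: a match is a case-sensitive substring match; this is an
    illustration and is not confirmed to mirror mothur's own matching.
    """
    hits = {taxon: 0 for taxon in taxa}
    for line in lines:
        parts = line.strip().split(None, 1)  # sequence name, then lineage
        if len(parts) != 2:
            continue  # skip blank or malformed lines
        # Drop confidence scores such as "(100)" in case probs=T was used.
        lineage = re.sub(r"\(\d+\)", "", parts[1])
        for taxon in taxa:
            if taxon in lineage:
                hits[taxon] += 1
    return hits


# Made-up lineages for illustration only; they are not from the real file.
EXAMPLE = [
    "seq1\tBacteria;Cyanobacteria;Oxyphotobacteria;Chloroplast;",
    "seq2\tunknown;unknown;unknown;",
    "seq3\tBacteria;Proteobacteria;Proteobacteria_unclassified;",
    "seq4\tArchaea;Euryarchaeota;",
]

if __name__ == "__main__":
    for taxon, n in count_taxon_hits(EXAMPLE, TAXA + ["unknown"]).items():
        print(f"{taxon}\t{n}")
```

To run it on real data, the lines of the .wang.taxonomy file can be passed instead of `EXAMPLE`, for instance `count_taxon_hits(open(path), TAXA + ["unknown"])` with `path` pointing at that file. A name that reports zero hits is probably spelled or capitalized differently in this reference taxonomy.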