different taxonomy files for classify.seqs

Hi there,

I am following the MiSeq SOP for analysis of my sequences.

For classify.seqs… (sorry, I have several questions at once :roll: )

I am wondering what is in trainset9_032012.pds.fasta (and trainset9_032012.pds.tax)? Is the “trainset” just a newer version of the SILVA database? What does the extension ‘pds’ stand for?

Where do the silva.bacteria.fasta files (nogap.bacteria.fasta and silva.bacteria.silva.tax) that were used in the past fit into all this? (Why not use those instead, as they have ~4000 more sequences?)

Also, why do we use ‘reference’ and ‘taxonomy’ instead of ‘template’ and ‘taxonomy’? (In other words, what is the difference between ‘reference’ and ‘template’?)

mothur > classify.seqs(fasta=stability.trim.contigs.good.unique.good.filter.unique.precluster.pick.fasta, count=stability.trim.contigs.good.unique.good.filter.unique.precluster.denovo.uchime.pick.count_table, reference=trainset9_032012.pds.fasta, taxonomy=trainset9_032012.pds.tax, cutoff=80)

mothur > classify.seqs(fasta=abrecovery.fasta, template=nogap.bacteria.fasta, taxonomy=silva.bacteria.silva.tax)

Thanks!!

I am wondering what is in trainset9_032012.pds.fasta (and trainset9_032012.pds.tax)? Is the “trainset” just a newer version of the SILVA database? What does the extension ‘pds’ stand for?

This is described here: https://mothur.org/wiki/RDP_reference_files

Where do the silva.bacteria.fasta files (nogap.bacteria.fasta and silva.bacteria.silva.tax) that were used in the past fit into all this? (Why not use those instead, as they have ~4000 more sequences?)

This is described here: https://mothur.org/wiki/Silva_reference_files

Also, why do we use ‘reference’ and ‘taxonomy’ instead of ‘template’ and ‘taxonomy’? (In other words, what is the difference between ‘reference’ and ‘template’?)

No difference; the two parameter names are interchangeable.
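
For example, assuming the same files as in the question, the older command should behave the same when written with ‘reference’ in place of ‘template’:

mothur > classify.seqs(fasta=abrecovery.fasta, reference=nogap.bacteria.fasta, taxonomy=silva.bacteria.silva.tax)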


Hope this helps, Pat