Clarification on input data and biom file generation

Hello,

I have followed the MiSeq SOP and everything went well. However, I am confused about two things:

1) Input data: So far I have used the mothur MiSeq test data.

Input data can be either Illumina or 454.

(1.1) Illumina

  • single end + fasta files => Can we create a stability file for that? If yes, how is quality handled, and what steps should follow?

  • single end + de-multiplexed fastq files => For this I guess the trim.seqs command with an oligos file is required. Will it then create de-multiplexed fastq files? And what comes next?

  • paired end + de-multiplexed fastq files => For this I guess the trim.seqs command with an oligos file is required.
    Will it then create de-multiplexed fastq files? And can I then use these files to create a stability file and follow the MiSeq SOP steps?
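
For the paired-end case, the stability file used in the MiSeq SOP is a tab-separated text file with one line per sample: sample name, forward fastq, reverse fastq. The following is a minimal sketch of building those lines, assuming the files follow the Illumina `_R1_` / `_R2_` naming and that the sample name is the text before the first underscore. The function name `build_stability_lines` and the file names are illustrative only. mothur also ships a `make.file` command intended for this job, which is likely the better choice in practice.

```python
def build_stability_lines(fastq_names):
    """Return one 'sample<TAB>forward<TAB>reverse' line per R1/R2 pair."""
    names = set(fastq_names)
    lines = []
    for forward in sorted(names):
        # Only forward reads start a pair (assumed Illumina naming).
        if "_R1_" not in forward:
            continue
        reverse = forward.replace("_R1_", "_R2_", 1)
        if reverse not in names:
            continue  # unpaired file: leave it out of the stability file
        # Assumption: the sample name is everything before the first "_".
        sample = forward.split("_", 1)[0]
        lines.append(f"{sample}\t{forward}\t{reverse}")
    return lines


if __name__ == "__main__":
    example = [
        "F3D0_S188_L001_R1_001.fastq",
        "F3D0_S188_L001_R2_001.fastq",
    ]
    print("\n".join(build_stability_lines(example)))
```

Files without a matching reverse read are skipped rather than written as single-end entries, since the sketch only targets the paired-end layout.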

(1.2) 454

It is more or less clear from the SOP, or, to be honest, I have not tested it yet :wink:

#####################################

2) Biom file: To create a biom file in mothur, one can provide a metadata file, a shared file, a constaxonomy file, and a reftaxonomy file (for PICRUSt).

What is not clear to me is:

  • How does mothur handle a sub-sampled shared file together with a taxonomy file that was not sub-sampled?

  • When I use the phyloseq package for visualization with the individual files, the constaxonomy file and the sub-sampled shared file don’t contain the same IDs. How can that be handled?

  • Finally, how can I add a phylogenetic tree file to the biom file? When I try to merge it with phyloseq, the content doesn’t match.

Looking forward to some clarity.
Thank you

  1. In all honesty, I don’t know why anyone would do single end Illumina sequencing. The data quality will be horrible. You probably want to follow the trim.seqs steps from the 454 SOP, but you’ll need to set your own trimming parameters.

  2. The subsampling shouldn’t impact the taxonomy file. You should be able to add a tree to the biom-formatted file. If you have a specific example that isn’t working we can see what we can do. Keep in mind that it’s generally hard to create a meaningful tree from NGS data, especially when the data are as poor as you are going to get with single end data.
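
On the ID mismatch seen in phyloseq: one likely cause, which is an assumption and not confirmed in this thread, is that OTUs left with no reads after sub-sampling are missing from the sub-sampled shared file while still being listed in the constaxonomy file. The following is a minimal sketch that keeps only the taxonomy rows whose OTU appears in the shared file. It assumes a shared file whose header is `label`, `Group`, `numOtus` followed by one column per OTU, and a constaxonomy file with the columns `OTU`, `Size`, `Taxonomy`. The function names and sample strings are made up for illustration.

```python
def shared_otu_ids(shared_text):
    """Return the OTU column names from the header of a shared file."""
    header = [f for f in shared_text.splitlines()[0].split("\t") if f]
    # Assumption: the first three columns are label, Group and numOtus.
    return header[3:]


def filter_taxonomy(taxonomy_text, keep_ids):
    """Keep the header plus the rows whose OTU id is in keep_ids."""
    keep = set(keep_ids)
    rows = taxonomy_text.splitlines()
    kept = [rows[0]]
    kept += [row for row in rows[1:] if row.split("\t")[0] in keep]
    return "\n".join(kept) + "\n"


if __name__ == "__main__":
    shared = (
        "label\tGroup\tnumOtus\tOtu001\tOtu003\n"
        "0.03\tF3D0\t2\t10\t4\n"
    )
    taxonomy = (
        "OTU\tSize\tTaxonomy\n"
        "Otu001\t120\tBacteria(100);\n"
        "Otu002\t3\tBacteria(100);\n"
        "Otu003\t40\tBacteria(100);\n"
    )
    print(filter_taxonomy(taxonomy, shared_otu_ids(shared)), end="")
```

The filtered taxonomy then lists the same OTU IDs as the sub-sampled shared file, which is what a merge by OTU name needs. The same set of IDs could be used to prune tree tips, provided the tree is labelled with OTU names.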
