I have 454 data from 120+ patients spread across 6 separate 454 runs, all using the same 32 barcodes, so merging the sff files is not an option. They represent 6 nominal groups. I wish to combine the data in two ways: 1) simply combining the runs and making one shared file covering all of the individual patients, and 2) merging the patients into their nominal groups and making a shared file for each nominal group. My standard workflow is trim.seqs for quality, removing singletons, aligning against a subsample (checked with BLAST), and removing the unalignable sequences. I then proceed to clustering, make.shared, rarefaction, classification, etc.
My question is: at what point is it most appropriate to do the merging? And if one merges the group, name, and fasta files using the merge.files command, is the information automatically retained in a way that makes it possible to subsequently run dist.seqs and make.shared on the complete set of 6 454 runs?
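
To make the two combinations concrete, here is a rough Python sketch of what I imagine the merged group file would need to look like in each case. It is only an illustration, not part of mothur: it assumes the usual two-column, tab-separated group file (sequence name, group label), and the run labels, barcode labels, and nominal group names (`run1`, `bc01`, `groupA`, ...) are made up. Since the same 32 barcode labels recur in every run, the sketch prefixes each label with its run so that patients stay distinct.

```python
def merge_group_files(runs, patient_to_nominal=None):
    """Combine per-run group-file lines into one list of (sequence, group) pairs.

    runs: dict mapping a hypothetical run label to the lines of that run's
          group file ("sequence_name<TAB>barcode_label").
    patient_to_nominal: optional dict mapping "run_barcode" patient labels to
          a nominal group name. If omitted, each patient keeps its own label.
    """
    merged = []
    for run_label, lines in runs.items():
        for line in lines:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            seq_name, barcode_label = line.split("\t")
            # Same barcodes are reused in every run, so make the label unique.
            patient = f"{run_label}_{barcode_label}"
            if patient_to_nominal is None:
                group = patient
            else:
                group = patient_to_nominal[patient]
            merged.append((seq_name, group))
    return merged


# Made-up example data: two runs that reuse barcode bc01.
runs = {
    "run1": ["seqA\tbc01", "seqB\tbc02"],
    "run2": ["seqC\tbc01"],
}

# 1) one group per individual patient
per_patient = merge_group_files(runs)

# 2) patients collapsed into nominal groups (hypothetical assignment)
nominal = {"run1_bc01": "groupA", "run1_bc02": "groupB", "run2_bc01": "groupA"}
per_nominal = merge_group_files(runs, nominal)
```

In case 1 the sequences from `run1` and `run2` that share barcode `bc01` end up in different groups; in case 2 they end up in the same nominal group. What I am unsure about is whether merge.files preserves enough information for this distinction to survive into make.shared.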