Hi, I have a quick question about what make.file is doing. I know Pat said it takes everything to the left of the first underscore as the “sample name”, but that doesn’t seem to be the case in my situation. I have multiple fastq files that start with the same name, and I ultimately want them to be part of the same group. Here is an example:
Why is the first sample it encounters named “P1”, while the next one is named “P1_D0” (with 2-5 appended to the ones that follow)? How can I fix this, or make sure all of these sequence files are treated as the same group?
To clarify, these are all replicates of the 0.22 um and 3 um fractions, and I would like to combine them all. If combining replicates is not recommended, how do I ensure they are at least treated as the same group?
This does seem a bit weird. Is it possible to rename your files to replace those _ characters with something else? Alternatively, you can definitely make your own files file in which you give the files that need to be pooled the same name. You can do this from scratch or you could edit this file.
Thanks Pat, I did end up using the rename command (in Unix) to remove the unnecessary underscores, and that seems to have done the trick. How do you specify which files get pooled? Would the first column be the group, followed by the sample name, then the paired fastq files?
Close - it’s the name in the first column that’s used. So you would leave the fastq files in columns 2 and 3 and put the name of the pooled group in column 1. Alternatively, you could use merge.groups to pool things after going through make.contigs.
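In case it helps, here is a minimal Python sketch of building such a three-column file, assuming the tab-separated layout described above (name, forward fastq, reverse fastq). The sample names, file names, and the two helper functions are made up for illustration and are not part of mothur.

```python
# Sketch: build the rows of a three-column files file
# (group name, forward fastq, reverse fastq).
# All sample and file names below are hypothetical.

def build_rows(pairs, pooled_name):
    """Return (group, forward, reverse) rows.

    pairs       -- list of (sample, forward_fastq, reverse_fastq)
    pooled_name -- dict mapping a sample to the group it should be pooled
                   into; samples missing from the dict keep their own name
    """
    return [(pooled_name.get(sample, sample), fwd, rev)
            for sample, fwd, rev in pairs]


def format_rows(rows):
    """Join each row with tabs, one row per line."""
    return "".join("\t".join(row) + "\n" for row in rows)


pairs = [
    ("P1D0a", "P1D0a_R1.fastq", "P1D0a_R2.fastq"),
    ("P1D0b", "P1D0b_R1.fastq", "P1D0b_R2.fastq"),
    ("P2D0", "P2D0_R1.fastq", "P2D0_R2.fastq"),
]

# The two replicates share one name in column 1, so they end up pooled.
pooled_name = {"P1D0a": "P1D0", "P1D0b": "P1D0"}

rows = build_rows(pairs, pooled_name)
text = format_rows(rows)
print(text, end="")
```

Saving the printed text to a file and passing it to make.contigs through its file option should then treat both replicate pairs as the group P1D0, while P2D0 keeps its own name.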
Okay, that makes sense. Do all of the samples need to be included in a design file, or just the ones I want to group together? I have other samples that I want to stand alone.