We have implemented the MiSeq SOP wet lab for community profiling. Our first run consisted of 384 samples, but the coverage for some samples is lower than expected. We plan to re-run the 384 samples to boost the coverage. How can I merge these two runs within mothur? The sample names will be identical, so essentially what I need to do is merge each sample file with its duplicate sample file. If it helps for the command, I can change the name of each file so they are not identical, or put each file from run 1 in one folder and each file from run 2 in another folder and create a merged folder.
Any help is greatly appreciated. Thanks in advance and thanks also for your ongoing excellent work in this area.
For example, if you had 10 PE reads but the samples were in duplicate, you would make a stability.file as suggested in your example, but it would look like this?
Since this would be 10 rows in the stability.file, am I correct in saying mothur will read this as 10 contigs, but would in fact merge them into 5 samples?
Sorry if this is not a clear question, I can elaborate if needed. This is the case for my current analysis, but it is only on 175 of 1150 contigs and has taken over 12 hours so far, so I do not want to keep it running if there is a problem. It also appears the Mac may have crashed. Might this be due to an error in merging files? It should be noted that my actual stability.files contains duplicates to be merged but also samples that do not need to be merged, so might this affect anything? I presumed not.
Thanks in advance for your always helpful responses!
In the above example, mothur should create a merged fasta file containing the assembled reads from all 10 forward and reverse pairs of files. It should also create a group file with the 5 groups in it. The sequences in the 1 and 2 files would be assigned to group1, the sequences from 3 and 4 would be assigned to group2, and so on.
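As a rough illustration of that layout, the Python sketch below builds the rows of a three-column file (group name, forward fastq, reverse fastq) in which every two file pairs share one group name. The file names and the helpers `build_rows` and `format_rows` are hypothetical and only there for illustration; the point is simply that rows sharing a group name in the first column end up in the same group.

```python
from collections import Counter

def build_rows(n_groups, pairs_per_group=2):
    """Return (group, forward, reverse) rows. File names are hypothetical."""
    rows = []
    file_no = 1
    for g in range(1, n_groups + 1):
        for _ in range(pairs_per_group):
            rows.append((f"group{g}",
                         f"file{file_no}_R1.fastq",
                         f"file{file_no}_R2.fastq"))
            file_no += 1
    return rows

def format_rows(rows):
    """Join each row with tabs, one file pair per line."""
    return "\n".join("\t".join(row) for row in rows) + "\n"

rows = build_rows(5)                      # 10 file pairs, 2 per group
groups = Counter(row[0] for row in rows)  # file pairs per group name
text = format_rows(rows)

print(text, end="")
print(len(rows), "file pairs ->", len(groups), "groups")
```

Saving `text` as the stability file and passing it to make.contigs through its file option, as in the MiSeq SOP, should then give the 5 groups described above.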
This is the case for my current analysis, but it is only on 175 of 1150 contigs and has taken over 12 hours so far, so I do not want to keep it running if there is a problem. It also appears the Mac may have crashed. Might this be due to an error in merging files?
How many file pairs are in the file? How many reads in each file pair?
It should be noted that my actual stability.files contains duplicates to be merged but also samples that do not need to be merged, so might this affect anything? I presumed not.
Thanks. So it is the group file I am interested in.
I needed to restart the Mac as it had crashed. I have 1150 file pairs in the stability.files and I would estimate around 25,000 reads per pair on average. It is essentially 4 runs on a MiSeq using the Schloss SOP wet lab, with 194 samples per run. The reason I need to merge some runs and not others is that we initially ran 384 samples in a single run, which resulted in low coverage. We have since re-run the samples and I need to merge the corresponding samples for analysis. The analysis also includes 2 further runs, for which we loaded only 194 samples to get better coverage.
Hope this is clearer as to why my stability.files has some duplicates and some non-duplicates.
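For anyone checking a mixed file like this before starting a long run, here is a minimal Python sketch that counts how many file pairs each group name has and works out the approximate total read count from the figures above. The example rows and the helper `pairs_per_group` are hypothetical; a real check would read the lines from the actual stability.files instead.

```python
from collections import Counter

def pairs_per_group(lines):
    """Count file pairs per group name (first whitespace-separated column)."""
    counts = Counter()
    for line in lines:
        fields = line.split()
        if fields:  # skip blank lines
            counts[fields[0]] += 1
    return counts

# Hypothetical rows: sampleA appears in two runs, sampleB and sampleC in one.
example_lines = [
    "sampleA run1_A_R1.fastq run1_A_R2.fastq",
    "sampleA run2_A_R1.fastq run2_A_R2.fastq",
    "sampleB run3_B_R1.fastq run3_B_R2.fastq",
    "sampleC run4_C_R1.fastq run4_C_R2.fastq",
]

counts = pairs_per_group(example_lines)
merged = sorted(g for g, n in counts.items() if n > 1)
single = sorted(g for g, n in counts.items() if n == 1)
print("groups built from several file pairs:", merged)
print("groups built from one file pair:", single)

# Rough size of the job, using the figures quoted in the post.
file_pairs = 1150
reads_per_pair = 25_000
total_reads = file_pairs * reads_per_pair
print("approximate read pairs to assemble:", total_reads)
```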