Hello, thank you very much for replying!
I’ve followed the SOP in the wiki, and I’ve tried merging the files at different points in the procedure, but the problem is always the same.
Last time, I tried as follows:
-sffinfo(sff=sample1.sff, flow=T)
-sffinfo(sff=sample2.sff, flow=T)
-sffinfo(sff=sample3.sff, flow=T)
-sffinfo(sff=sample4.sff, flow=T)
-sffinfo(sff=sample5.sff, flow=T)
trim.flows(flow=sample1.flow, minflows=360, maxflows=720, fasta=T, processors=6)
trim.flows(flow=sample2.flow, minflows=360, maxflows=720, fasta=T, processors=6)
trim.flows(flow=sample3.flow, minflows=360, maxflows=720, fasta=T, processors=6)
trim.flows(flow=sample4.flow, minflows=360, maxflows=720, fasta=T, processors=6)
trim.flows(flow=sample5.flow, minflows=360, maxflows=720, fasta=T, processors=6)
shhh.flows(flow=sample1.flow)
shhh.flows(flow=sample2.flow)
shhh.flows(flow=sample3.flow)
shhh.flows(flow=sample4.flow)
shhh.flows(flow=sample5.flow)
Then I continued with the remaining SOP commands (trim.seqs, unique.seqs, align.seqs, screen.seqs, filter.seqs, unique.seqs, chimera.uchime, remove.seqs, classify.seqs, dist.seqs, cluster) and calculated alpha-diversity indices and rarefaction curves.
For beta-diversity I read that I need a .shared file. To make a shared file I need a group file that lists the sequences and the name of the group each one belongs to.
I did:
- make.group(fasta=sample1…fasta-sample2…fasta-sample3…fasta-sample4…fasta-sample5…fasta, groups=sample1-sample2-sample3-sample4-sample5)
- merge.files(name=sample1…names-name=sample2…names-name=sample3…names-name=sample4…names-name=sample5…names)
- merge.files(list=sample1…list-list=sample2…list-list=sample3…list-list=sample4…list-list=sample5…list)
When I did this I was really careful to use the .fasta, .list and .names files from the same stage, to avoid any problems. I checked every step with summary.seqs, BUT I also noticed that the group files contained only the unique sequences.
I ran make.shared with the merged.list file, the merged.names file and the group file, but mothur reported that the number of sequences in the names file was greater than in the group file: “please, correct”.
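To show what I mean by the mismatch, here is a minimal Python sketch of the comparison (this is not a mothur command, and the sequence names and file contents are made up for illustration). It assumes the usual two-column, tab-separated layouts: in a names file, the second column lists every read represented by a unique sequence; in a group file, each line gives one sequence name and its group.

```python
def count_names(names_text):
    """Total reads represented in a names file (all names in column 2)."""
    total = 0
    for line in names_text.splitlines():
        if not line.strip():
            continue
        _representative, members = line.split("\t")
        total += len(members.split(","))
    return total


def count_groups(group_text):
    """Number of sequences listed in a group file (one per non-empty line)."""
    return sum(1 for line in group_text.splitlines() if line.strip())


# Hypothetical example: 2 unique sequences standing for 5 reads in total.
names_text = "seqA\tseqA,seqB,seqC\nseqD\tseqD,seqE\n"
# Group file built from the unique fasta only, so it lists just 2 names.
group_text = "seqA\tsample1\nseqD\tsample1\n"

print(count_names(names_text), count_groups(group_text))
```

In this toy case the names file accounts for more reads than the group file, which is the same kind of discrepancy mothur complained about.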
Anyway, since I had already tried many different ways, out of curiosity I made a group file at the very beginning (after shhhinfo). Running summary.seqs on the 5 .fasta files, I saw that, this time too, the number of sequences in the group file was just the sum of the unique sequences in the 5 .fasta files.
So now I’m asking: how can I analyse beta-diversity when my samples were run separately?