remove mock community

Hello everyone,

This is my first post, and I want to say that I’m very grateful to be able to use mothur to process my dataset. Thanks, Pat and team, for all of your hard work!

I’m having problems with the remove.groups step.
I’m following the 454 SOP tutorial and ran this:

mothur > get.groups(fasta=/home/zm1/fastaall/April13/012414MCcom519F.shhh.trim.unique.good.filter.unique.precluster.pick.pick.fasta, group=/home/zm1/fastaall/April13/012414MCcom519F.shhh.good.pick.pick.groups, name=/home/zm1/fastaall/April13/012414MCcom519F.shhh.trim.unique.good.filter.unique.precluster.pick.pick.names, groups=MOCK.012414MCcom519F)
In this step the tutorial uses groups=MOCK.GQY1XT001, so I thought I had to substitute the name of my own files. Is this correct, or should I redo this step?
MOTHUR REPLIED:
MOCK.012414MCcom519F is not a valid group, and will be disregarded.
You provided no valid groups. I will run the command using all the groups in your file.
Selected 50936 sequences from your name file.
Selected 8466 sequences from your fasta file.
Selected 50936 sequences from your group file.

Output File names:
/home/zm1/fastaall/April13/012414MCcom519F.shhh.trim.unique.good.filter.unique.precluster.pick.pick.pick.names
/home/zm1/fastaall/April13/012414MCcom519F.shhh.trim.unique.good.filter.unique.precluster.pick.pick.pick.fasta
/home/zm1/fastaall/April13/012414MCcom519F.shhh.good.pick.pick.pick.groups

As you can see, I got a sort of error message for the group MOCK.012414MCcom519F, but because the command still ran I went ahead. Now I’m at remove.groups and need to use MOCK.012414MCcom519F, but I can’t because it was never created.

I need some help to move forward. Any suggestions? The command I’m entering is:

mothur > remove.groups(fasta=/home/zm1/fastaall/April13/012414MCcom519F.shhh.trim.unique.good.filter.unique.precluster.pick.pick.fasta, group=/home/zm1/fastaall/April13/012414MCcom519F.shhh.good.pick.pick.groups, name=/home/zm1/fastaall/April13/012414MCcom519F.shhh.trim.unique.good.filter.unique.precluster.pick.pick.names, taxonomy=/home/zm1/fastaall/April13/012414MCcom519F.shhh.trim.unique.good.filter.unique.precluster.pick.pds.wang.taxonomy, groups=MOCK.012414MCcom519F)

[WARNING]: MOCK.012414MCcom519F is not a valid group in your groupfile, ignoring.
[ERROR]: no valid groups, aborting.

mothur > remove.groups(fasta=/home/zm1/fastaall/April13/012414MCcom519F.shhh.trim.unique.good.filter.unique.precluster.pick.pick.fasta, group=/home/zm1/fastaall/April13/012414MCcom519F.shhh.good.pick.pick.groups, name=/home/zm1/fastaall/April13/012414MCcom519F.shhh.trim.unique.good.filter.unique.precluster.pick.pick.names, taxonomy=/home/zm1/fastaall/April13/012414MCcom519F.shhh.trim.unique.good.filter.unique.precluster.pick.pds.wang.taxonomy, groups=MOCK.GQY1XT001)

[WARNING]: MOCK.GQY1XT001 is not a valid group in your groupfile, ignoring.
[ERROR]: no valid groups, aborting.

THANKS!

Can you run the following (http://www.mothur.org/wiki/Count.groups)?

count.groups(group=012414MCcom519F.shhh.good.pick.pick.groups)

Do you see MOCK.012414MCcom519F?
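
As a rough illustration of what that check amounts to, here is a minimal Python sketch, not part of mothur itself. It assumes the group file is plain text with one sequence name and one group name per line, separated by whitespace; the sequence names and the helper names `groups_in_file` and `check_requested` are made up for the example.

```python
def groups_in_file(lines):
    """Collect the distinct group names from mothur-style group file lines.

    Each line is assumed to hold a sequence name and a group name
    separated by whitespace; blank or malformed lines are skipped.
    """
    groups = set()
    for line in lines:
        fields = line.split()
        if len(fields) >= 2:
            groups.add(fields[1])
    return groups


def check_requested(requested, available):
    """Split the requested group names into (valid, invalid) lists."""
    valid = [name for name in requested if name in available]
    invalid = [name for name in requested if name not in available]
    return valid, invalid


# Hypothetical lines standing in for a real .groups file.
sample_lines = [
    "seq0001\tcom519Fbar3",
    "seq0002\tcom519bar10",
    "seq0003\tcom519bar10",
]

available = groups_in_file(sample_lines)
valid, invalid = check_requested(
    ["MOCK.012414MCcom519F", "com519bar10"], available
)
print("valid:", valid)
print("not in group file:", invalid)
```

A name that lands in the second list is one that get.groups or remove.groups would report as not being a valid group.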

Hello Pat,

Thanks for your reply. I ran count.groups and this is what I got:
mothur > count.groups(group=/home/zm1/fastaall012414MCcom519F.shhh.good.pick.pick.pick.groups)

com519Fbar3 contains 4719.
com519bar10 contains 3817.
com519bar11 contains 8557.
com519bar12 contains 751.
com519bar15 contains 6552.
com519bar16 contains 5380.
com519bar19 contains 5037.
com519bar20 contains 7002.
com519bar4 contains 2802.
com519bar9 contains 6319.

Total seqs: 50936.

Yet I’m still not able to get MOCK.012414MCcom519F.

This is the result when I use the ___pick.pick.groups file:

mothur > count.groups(groups=/home/zm1/fastaall/April13/012414MCcom519F.shhh.good.pick.pick.groups)

Using /home/zm1/fastaall/April13/012414MCcom519F.shhh.good.groups as input file for the group parameter.
/home/zm1/fastaall/April13/012414MCcom519F.shhh.good.pick.pick.groups is not a valid group, and will be disregarded.
You provided no valid groups. I will run the command using all the groups in your file.
com519Fbar3 contains 7001.
com519bar10 contains 4740.
com519bar11 contains 10858.
com519bar12 contains 804.
com519bar15 contains 8661.
com519bar16 contains 7048.
com519bar19 contains 6144.
com519bar20 contains 8427.
com519bar4 contains 3855.
com519bar9 contains 7613.

Total seqs: 65151.

Output File Names:
/home/zm1/fastaall/April13/012414MCcom519F.shhh.good.count.summary

Did you have a Mock community sample on your sequencing run that was included in the oligos file? You might try running count.groups on each group file from trim.seqs to where you are and see where you’re losing the sequences.
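
To make the stage-by-stage comparison concrete, here is a small Python sketch that works outside mothur. It assumes each group file has a sequence name and a group name per line, separated by whitespace. The helper names `count_groups` and `losses` are invented for the example, and the two dictionaries simply copy the count.groups totals posted above for the .good.groups and .good.pick.pick.pick.groups files.

```python
from collections import Counter


def count_groups(lines):
    """Tally sequences per group from mothur-style group file lines."""
    counts = Counter()
    for line in lines:
        fields = line.split()
        if len(fields) >= 2:
            counts[fields[1]] += 1
    return counts


def losses(before, after):
    """Return how many sequences each group lost between two stages."""
    return {group: before[group] - after.get(group, 0)
            for group in sorted(before)}


# Per-group totals copied from the two count.groups outputs in this thread.
good = {
    "com519Fbar3": 7001, "com519bar10": 4740, "com519bar11": 10858,
    "com519bar12": 804, "com519bar15": 8661, "com519bar16": 7048,
    "com519bar19": 6144, "com519bar20": 8427, "com519bar4": 3855,
    "com519bar9": 7613,
}
picked = {
    "com519Fbar3": 4719, "com519bar10": 3817, "com519bar11": 8557,
    "com519bar12": 751, "com519bar15": 6552, "com519bar16": 5380,
    "com519bar19": 5037, "com519bar20": 7002, "com519bar4": 2802,
    "com519bar9": 6319,
}

lost = losses(good, picked)
for group, n in lost.items():
    print(group, "lost", n)
print("total lost:", sum(lost.values()))
```

With real files, `count_groups(open(path))` would produce the same kind of tally for each stage, so the first stage where a group disappears or shrinks sharply stands out.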

Hi Pat,
No, I don’t have one. I never created this group. In my oligos file I only have barcodes for my 10 samples. Is this OK?

Some other questions: I went ahead and built the OTUs with cluster and make.shared. I found that one of my samples has only 751 sequences, which is a problem when I try to run sub.sample, especially because the total number of sequences per sample ranges from 6000 down to 750. With this range, do you know a safe way to normalize my data? I ran it with 2050 sequences, which was the second-lowest count, and consequently lost one important replicate.
Is it also possible not to normalize the OTUs and go ahead with the analysis? Would this be incorrect?

Thanks,

astrid

No, I don’t have one. I never created this group. In my oligos file I only have barcodes for my 10 samples. Is this OK?

To analyze the mock community data, you have to sequence a mock community in parallel with your other sequences.

Some other questions: I went ahead and built the OTUs with cluster and make.shared. I found that one of my samples has only 751 sequences, which is a problem when I try to run sub.sample, especially because the total number of sequences per sample ranges from 6000 down to 750. With this range, do you know a safe way to normalize my data? I ran it with 2050 sequences, which was the second-lowest count, and consequently lost one important replicate.

Is it also possible not to normalize the OTUs and go ahead with the analysis? Would this be incorrect?

You’ll have to decide how important that sample is. If it’s very important then you’ll have to subsample or rarefy all of your samples to 750 reads (see how to do this in the SOP). If it’s not, then increase the number of reads until you start removing samples that you really care about.
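
The trade-off can be laid out with a short Python sketch. This is only an illustration under stated assumptions, not how mothur implements sub.sample: it assumes a sample is dropped when it has fewer reads than the chosen depth, and it subsamples by drawing reads without replacement. The per-sample counts are the count.groups totals posted earlier in the thread, so they may not match the final shared file, and the function names and the two-OTU example are hypothetical.

```python
import random
from collections import Counter


def depth_tradeoff(counts):
    """For each candidate depth, return (depth, samples kept, reads kept)."""
    rows = []
    for depth in sorted(set(counts.values())):
        kept = [s for s, n in counts.items() if n >= depth]
        rows.append((depth, len(kept), depth * len(kept)))
    return rows


def dropped_at(counts, depth):
    """List the samples that have fewer reads than the chosen depth."""
    return sorted(s for s, n in counts.items() if n < depth)


def subsample(otu_counts, depth, seed=0):
    """Draw `depth` reads without replacement from one sample's OTU counts."""
    pool = [otu for otu, n in otu_counts.items() for _ in range(n)]
    if len(pool) < depth:
        raise ValueError("sample has fewer reads than the requested depth")
    return Counter(random.Random(seed).sample(pool, depth))


# Per-sample read totals copied from the count.groups output in this thread.
counts = {
    "com519Fbar3": 4719, "com519bar10": 3817, "com519bar11": 8557,
    "com519bar12": 751, "com519bar15": 6552, "com519bar16": 5380,
    "com519bar19": 5037, "com519bar20": 7002, "com519bar4": 2802,
    "com519bar9": 6319,
}

for depth, n_samples, n_reads in depth_tradeoff(counts):
    print(depth, n_samples, n_reads)
print("dropped at 2050:", dropped_at(counts, 2050))

# Hypothetical OTU counts for a single sample, reduced to 4 reads.
print(subsample({"Otu1": 5, "Otu2": 3}, 4))
```

Each row shows what a given depth costs: going deeper keeps more reads per sample but removes every sample below that depth.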