Just when I thought it was all over... how to group data after classify.seqs?

natalie · March 15, 2021, 2:00pm

Hello! Thank you all for your previous help.

So I've gotten quite far in my pipeline, but now I'm stuck on how to assign my samples to treatment groups.

Here are the scripts I've run so far:

BATCH I

mothur "#set.dir(input=/scratch/micro/nb326/temp_bpsenvs/01_rawdata, output=/home/n/nb326/miniconda3/envs/tbatch/03_preprocess); make.file(inputdir=/scratch/micro/nb326/temp_bpsenvs/01_rawdata, type=gz, prefix=tbps); make.contigs(file=current, oligos=/scratch/micro/nb326/temp_bpsenvs/01_rawdata/abps.oligos, pdiffs=1); summary.seqs(fasta=current)"

BATCH II

mothur "#set.dir(output=/home/n/nb326/miniconda3/envs/tbatch/03_preprocess); screen.seqs(fasta=/home/n/nb326/miniconda3/envs/tbatch/03_preprocess/tbps.trim.contigs.fasta, group=/home/n/nb326/miniconda3/envs/tbatch/03_preprocess/tbps.contigs.groups, summary=/home/n/nb326/miniconda3/envs/tbatch/03_preprocess/tbps.trim.contigs.summary, maxambig=0, minlength=252, maxlength=254, maxhomop=8); summary.seqs(fasta=current); unique.seqs(fasta=current); count.seqs(name=current, group=current); summary.seqs(fasta=current, count=current); align.seqs(fasta=current, reference=/home/n/nb326/miniconda3/envs/bpsenv/silva/silva.nr_v138/silva.nr_v138.align); summary.seqs(fasta=current, count=current)"

BATCH III

mothur "#set.dir(input=/home/n/nb326/miniconda3/envs/tbatch/03_preprocess, output=/home/n/nb326/miniconda3/envs/tbatch/03_preprocess/trains); screen.seqs(fasta=/home/n/nb326/miniconda3/envs/tbatch/03_preprocess/tbps.trim.contigs.good.unique.align, count=/home/n/nb326/miniconda3/envs/tbatch/03_preprocess/tbps.trim.contigs.good.count_table, start=13862, end=23444, maxhomop=8); summary.seqs(fasta=current, count=current); filter.seqs(fasta=current, vertical=T, trump=.); unique.seqs(fasta=current, count=current); pre.cluster(fasta=current, count=current, diffs=2); chimera.uchime(fasta=current, count=current, dereplicate=t); remove.seqs(fasta=current, accnos=current); summary.seqs(fasta=current, count=current)"

Then I renamed the final fasta and count files generated by this batch job to tbps.train.fasta and tbps.train.count_table.

BATCH IV

mothur "#set.dir(output=/home/n/nb326/miniconda3/envs/tbatch/03_preprocess/trains); classify.seqs(fasta=/home/n/nb326/miniconda3/envs/tbatch/03_preprocess/trains/tbps.train.fasta, count=/home/n/nb326/miniconda3/envs/tbatch/03_preprocess/trains/tbps.train.count_table, reference=/home/n/nb326/miniconda3/envs/tbatch/trainset18_062020.rdp/trainset18_062020.rdp.fasta, taxonomy=/home/n/nb326/miniconda3/envs/tbatch/trainset18_062020.rdp/trainset18_062020.rdp.tax, cutoff=80); remove.lineage(fasta=current, count=current, taxonomy=current, taxon=Chloroplast-Mitochondria-Eukaryota); remove.groups(count=current, fasta=current, taxonomy=current, groups=SAM1.raw-SAM3.raw)"

Right now I have 96 groups, one for each of the 96 samples I sent for sequencing. If I want to assign these samples to treatment groups for UniFrac analysis, what do I do, please?

Thanks in advance!

pschloss · March 18, 2021, 2:31pm

Hi,

You would need to create a design file that has the name of the sample in the first column and the treatment group the sample belongs to in the second column. If you look at the SOP, we do this with the early/late groups.

Pat
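
For anyone following along, here is a minimal sketch of writing a design file in the two-column, tab-separated layout Pat describes (sample name, then treatment group). The sample names, treatment labels, and output filename below are hypothetical placeholders and not taken from this dataset. The sketch also assumes no header line is needed.

```python
import csv


def build_design_rows(samples, treatment_of):
    """Pair each sample with its treatment; fail loudly if any sample is unassigned."""
    missing = [s for s in samples if s not in treatment_of]
    if missing:
        raise ValueError("no treatment assigned for: " + ", ".join(missing))
    return [(s, treatment_of[s]) for s in samples]


def write_design(rows, path):
    """Write rows as tab-separated lines: sample name, then treatment group."""
    with open(path, "w", newline="") as handle:
        writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
        writer.writerows(rows)


# Hypothetical sample names and treatments, for illustration only.
samples = ["SAM4", "SAM5", "SAM6", "SAM7"]
treatment_of = {
    "SAM4": "control",
    "SAM5": "control",
    "SAM6": "treated",
    "SAM7": "treated",
}

rows = build_design_rows(samples, treatment_of)
write_design(rows, "tbps.treatment.design")  # hypothetical filename
```

The names in the first column have to match the group names in your count file exactly (count.groups will list them). The resulting file is what you would pass with design= to commands that accept it, as the SOP does with amova for the early/late comparison.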
