screen.seqs command problem

Dear Pschloss,
When I run the screen.seqs command, all of the sequences in every group are removed. Can you help me sort out the problem? I am pasting the log file I got.


mothur > make.contigs(file=stability.files/fileList.paired.file, processors=10)

Using 10 processors.

>>>>> Processing file pair merge/A091-58-GAACACCG-Schulman-bac-run20150630B_S58_L001_R1_001.fastq - merge/A091-58-GAACACCG-Schulman-bac-run20150630B_S58_L001_R2_001.fastq (files 1 of 6) <<<<<
Making contigs…
Done.

It took 64 secs to assemble 147284 reads.


>>>>> Processing file pair merge/bounty_S74_L001_R1_001.fastq - merge/bounty_S74_L001_R2_001.fastq (files 2 of 6) <<<<< Making contigs... Done.

It took 179 secs to assemble 334403 reads.


>>>>> Processing file pair merge/honeoye_S77_L001_R1_001.fastq - merge/honeoye_S77_L001_R2_001.fastq (files 3 of 6) <<<<< Making contigs... Done.

It took 198 secs to assemble 337928 reads.


>>>>> Processing file pair merge/mock_S83_L001_R1_001.fastq - merge/mock_S83_L001_R2_001.fastq (files 4 of 6) <<<<< Making contigs... Done.

It took 12 secs to assemble 8223 reads.


>>>>> Processing file pair merge/polka_S71_L001_R1_001.fastq - merge/polka_S71_L001_R2_001.fastq (files 5 of 6) <<<<< Making contigs... Done.

It took 179 secs to assemble 352345 reads.


>>>>> Processing file pair merge/wendy_S80_L001_R1_001.fastq - merge/wendy_S80_L001_R2_001.fastq (files 6 of 6) <<<<< Making contigs... Done.

It took 174 secs to assemble 312152 reads.

It took 806 secs to process 1492335 sequences.

Group count:
B1S1 334403
FM1W1 147284
H1S1 337928
Mock 8223
P1S1 352345
W1S1 312152

Total of all groups is 1492335

Output File Names:
stability.files/fileList.paired.trim.contigs.fasta
stability.files/fileList.paired.trim.contigs.qual
stability.files/fileList.paired.contigs.report
stability.files/fileList.paired.scrap.contigs.fasta
stability.files/fileList.paired.scrap.contigs.qual
stability.files/fileList.paired.contigs.groups

[WARNING]: your sequence names contained ‘:’. I changed them to ‘_’ to avoid problems in your downstream analysis.

mothur > summary.seqs(fasta=stability.files/fileList.paired.trim.contigs.fasta)

Using 10 processors.

Start End NBases Ambigs Polymer NumSeqs
Minimum: 1 314 314 0 3 1
2.5%-tile: 1 439 439 0 6 37309
25%-tile: 1 444 444 0 8 373084
Median: 1 447 447 2 8 746168
75%-tile: 1 449 449 7 8 1119252
97.5%-tile: 1 489 489 25 9 1455027
Maximum: 1 612 612 104 293 1492335
Mean: 1 448.972 448.972 4.75385 7.77012

# of Seqs: 1492335

Output File Names:
stability.files/fileList.paired.trim.contigs.summary

It took 40 secs to summarize 1492335 sequences.

mothur > summary.seqs(fasta=stability.files/fileList.paired.trim.contigs.fasta)

Using 10 processors.

Start End NBases Ambigs Polymer NumSeqs
Minimum: 1 314 314 0 3 1
2.5%-tile: 1 439 439 0 6 37309
25%-tile: 1 444 444 0 8 373084
Median: 1 447 447 2 8 746168
75%-tile: 1 449 449 7 8 1119252
97.5%-tile: 1 489 489 25 9 1455027
Maximum: 1 612 612 104 293 1492335
Mean: 1 448.972 448.972 4.75385 7.77012

# of Seqs: 1492335

Output File Names:
stability.files/fileList.paired.trim.contigs.summary

It took 9 secs to summarize 1492335 sequences.

mothur > screen.seqs(fasta=stability.files/fileList.paired.trim.contigs.fasta, group=stability.files/fileList.paired.contigs.groups, maxambig=0, maxlength=500, processors=10)

Using 10 processors.

Output File Names:
stability.files/fileList.paired.trim.contigs.good.fasta
stability.files/fileList.paired.trim.contigs.bad.accnos
stability.files/fileList.paired.contigs.good.groups


It took 35 secs to screen 1492335 sequences.

mothur > unique.seqs(fasta=stability.files/fileList.paired.trim.contigs.good.fasta)
505339 115833

Output File Names:
stability.files/fileList.paired.trim.contigs.good.names
stability.files/fileList.paired.trim.contigs.good.unique.fasta


mothur > count.seqs(name=stability.files/fileList.paired.trim.contigs.good.names, group=stability.files/fileList.paired.contigs.good.groups)

Using 10 processors.
It took 2 secs to create a table for 505339 sequences.

Total number of sequences: 505339

Output File Names:
stability.files/fileList.paired.trim.contigs.good.count_table


mothur > summary.seqs(count=stability.files/fileList.paired.trim.contigs.good.count_table, processors=10)
Using stability.files/fileList.paired.trim.contigs.good.unique.fasta as input file for the fasta parameter.

Using 10 processors.

Start End NBases Ambigs Polymer NumSeqs
Minimum: 1 324 324 0 3 1
2.5%-tile: 1 416 416 0 6 12634
25%-tile: 1 444 444 0 8 126335
Median: 1 447 447 0 8 252670
75%-tile: 1 449 449 0 8 379005
97.5%-tile: 1 452 452 0 8 492706
Maximum: 1 500 500 0 33 505339
Mean: 1 444.629 444.629 0 7.78769

# of unique seqs: 115833

total # of seqs: 505339

Output File Names:
stability.files/fileList.paired.trim.contigs.good.unique.summary

It took 2 secs to summarize 505339 sequences.

mothur > pcr.seqs(fasta=output.files/silva.bacteria.fasta, start=6428, end=23958, keepdots=F, outputdir=stability.files, processors=10)
Setting output directory to: stability.files/

Using 10 processors.

Output File Names:
stability.files/silva.bacteria.pcr.fasta


It took 37 secs to screen 14956 sequences.

mothur > system(mv stability.files/silva.bacteria.pcr.fasta output.files/silva.v4.fasta)


mothur > summary.seqs(fasta=output.files/silva.v4.fasta)

Using 10 processors.

Start End NBases Ambigs Polymer NumSeqs
Minimum: 1 16155 386 0 3 1
2.5%-tile: 2 17530 406 0 4 374
25%-tile: 2 17530 409 0 4 3740
Median: 2 17530 428 0 5 7479
75%-tile: 2 17530 431 0 5 11218
97.5%-tile: 2 17530 432 1 6 14583
Maximum: 8 17530 473 5 9 14956
Mean: 2.0006 17529.8 421.362 0.0962824 4.86079

# of Seqs: 14956

Output File Names:
stability.files/silva.v4.summary

It took 4 secs to summarize 14956 sequences.

mothur > align.seqs(fasta=stability.files/fileList.paired.trim.contigs.good.unique.fasta, reference=output.files/silva.v4.fasta)

Using 10 processors.

Reading in the output.files/silva.v4.fasta template sequences… DONE.
It took 6 to read 14956 sequences.
Aligning sequences from stability.files/fileList.paired.trim.contigs.good.unique.fasta …
Some of you sequences generated alignments that eliminated too many bases, a list is provided in stability.files/fileList.paired.trim.contigs.good.unique.flip.accnos. If you set the flip parameter to true mothur will try aligning the reverse compliment as well.
It took 197 secs to align 115833 sequences.

Output File Names:
stability.files/fileList.paired.trim.contigs.good.unique.align
stability.files/fileList.paired.trim.contigs.good.unique.align.report
stability.files/fileList.paired.trim.contigs.good.unique.flip.accnos


mothur > summary.seqs(fasta=stability.files/fileList.paired.trim.contigs.good.unique.align, count=stability.files/fileList.paired.trim.contigs.good.count_table)

Using 10 processors.

Start End NBases Ambigs Polymer NumSeqs
Minimum: -1 -1 0 0 1 1
2.5%-tile: 2 17530 11 0 3 12634
25%-tile: 2 17530 408 0 8 126335
Median: 2 17530 408 0 8 252670
75%-tile: 2 17530 408 0 8 379005
97.5%-tile: 2 17530 409 0 8 492706
Maximum: 17530 17530 448 0 33 505339
Mean: 344.65 17343.8 395.503 0 7.56873

# of unique seqs: 115833

total # of seqs: 505339

Output File Names:
stability.files/fileList.paired.trim.contigs.good.unique.summary

It took 59 secs to summarize 505339 sequences.

mothur > screen.seqs(fasta=stability.files/fileList.paired.trim.contigs.good.unique.align, count=stability.files/fileList.paired.trim.contigs.good.count_table, summary=stability.files/fileList.paired.trim.contigs.good.unique.summary, start=2, end=17531, maxhomop=8)

Using 10 processors.

Removing group: B1S1 because all sequences have been removed.

Removing group: FM1W1 because all sequences have been removed.

Removing group: H1S1 because all sequences have been removed.

Removing group: Mock because all sequences have been removed.

Removing group: P1S1 because all sequences have been removed.

Removing group: W1S1 because all sequences have been removed.

Output File Names:
stability.files/fileList.paired.trim.contigs.good.unique.good.summary
stability.files/fileList.paired.trim.contigs.good.unique.good.align
stability.files/fileList.paired.trim.contigs.good.unique.bad.accnos
stability.files/fileList.paired.trim.contigs.good.good.count_table


It took 12 secs to screen 115833 sequences.

mothur > summary.seqs(fasta=current, count=current)
Using stability.files/fileList.paired.trim.contigs.good.good.count_table as input file for the count parameter.
Using stability.files/fileList.paired.trim.contigs.good.unique.good.align as input file for the fasta parameter.
[ERROR]: stability.files/fileList.paired.trim.contigs.good.unique.good.align is blank, aborting.
Using stability.files/fileList.paired.trim.contigs.good.unique.good.align as input file for the fasta parameter.

Using 10 processors.
[ERROR]: stability.files/fileList.paired.trim.contigs.good.unique.good.align is blank. Please correct.
Error in reading your fastafile, at position -1. Blank name.

You have…

mothur > screen.seqs(fasta=stability.files/fileList.paired.trim.contigs.good.unique.align, count=stability.files/fileList.paired.trim.contigs.good.count_table, summary=stability.files/fileList.paired.trim.contigs.good.unique.summary, start=2, end=17531, maxhomop=8)

You want…

mothur > screen.seqs(fasta=stability.files/fileList.paired.trim.contigs.good.unique.align, count=stability.files/fileList.paired.trim.contigs.good.count_table, summary=stability.files/fileList.paired.trim.contigs.good.unique.summary, start=2, end=17530, maxhomop=8)

The end parameter is the position at which you want your sequences to end at or after.
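As a rough illustration of why end=17531 empties every group, here is a minimal Python sketch of the start/end test, assuming only the documented behaviour that screen.seqs keeps a sequence when it starts at or before `start` and ends at or after `end`. The function name `passes_screen` is hypothetical and is not part of mothur.

```python
def passes_screen(seq_start, seq_end, start, end):
    """Return True if an aligned sequence satisfies the start/end criteria.

    Assumption: a sequence is kept only when it starts at or before `start`
    and ends at or after `end` (alignment coordinates).
    """
    return seq_start <= start and seq_end >= end


# The first summary.seqs of the aligned reads reports a median Start of 2 and
# a maximum End of 17530, so no sequence can reach position 17531.
print(passes_screen(2, 17530, start=2, end=17531))  # False
print(passes_screen(2, 17530, start=2, end=17530))  # True
```

With end=17531 every read fails the second condition, which is consistent with the "Removing group ... because all sequences have been removed" messages.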

Pat

Thanks Pschloss,
I changed it and proceeded further, but dist.seqs now reports an empty fasta file.


mothur > summary.seqs(fasta=stability.files/fileList.paired.trim.contigs.good.unique.align, count=stability.files/fileList.paired.trim.contigs.good.count_table)

Using 10 processors.

Start End NBases Ambigs Polymer NumSeqs
Minimum: -1 -1 0 0 1 1
2.5%-tile: 2 17531 11 0 3 12634
25%-tile: 2 17531 409 0 8 126335
Median: 2 17531 409 0 8 252670
75%-tile: 2 17531 409 0 8 379005
97.5%-tile: 2 17531 410 0 8 492706
Maximum: 17531 17531 449 0 33 505339
Mean: 319.593 17317.3 396.474 0 7.56959

# of unique seqs: 115833

total # of seqs: 505339

Output File Names:
stability.files/fileList.paired.trim.contigs.good.unique.summary

It took 59 secs to summarize 505339 sequences.

mothur > screen.seqs(fasta=stability.files/fileList.paired.trim.contigs.good.unique.align, count=stability.files/fileList.paired.trim.contigs.good.count_table, summary=stability.files/fileList.paired.trim.contigs.good.unique.summary, start=2, end=17531, maxhomop=8)

Using 10 processors.

Output File Names:
stability.files/fileList.paired.trim.contigs.good.unique.good.summary
stability.files/fileList.paired.trim.contigs.good.unique.good.align
stability.files/fileList.paired.trim.contigs.good.unique.bad.accnos
stability.files/fileList.paired.trim.contigs.good.good.count_table


It took 90 secs to screen 115833 sequences.

mothur > summary.seqs(fasta=current, count=current)
Using stability.files/fileList.paired.trim.contigs.good.good.count_table as input file for the count parameter.
Using stability.files/fileList.paired.trim.contigs.good.unique.good.align as input file for the fasta parameter.

Using 10 processors.

Start End NBases Ambigs Polymer NumSeqs
Minimum: 1 17531 386 0 4 1
2.5%-tile: 2 17531 409 0 5 12068
25%-tile: 2 17531 409 0 8 120677
Median: 2 17531 409 0 8 241354
75%-tile: 2 17531 409 0 8 362031
97.5%-tile: 2 17531 410 0 8 470640
Maximum: 2 17531 449 0 8 482707
Mean: 2 17531 409.015 0 7.73464

# of unique seqs: 97228

total # of seqs: 482707

Output File Names:
stability.files/fileList.paired.trim.contigs.good.unique.good.summary

It took 71 secs to summarize 482707 sequences.

mothur > filter.seqs(fasta=stability.files/fileList.paired.trim.contigs.good.unique.good.align, vertical=T, trump=.)

Using 10 processors.
Creating Filter…


Running Filter...

Length of filtered alignment: 795 Number of columns removed: 16736 Length of the original alignment: 17531 Number of sequences used to construct filter: 97228

Output File Names:
stability.files/fileList.filter
stability.files/fileList.paired.trim.contigs.good.unique.good.filter.fasta


mothur > unique.seqs(fasta=stability.files/fileList.paired.trim.contigs.good.unique.good.filter.fasta, count=stability.files/fileList.paired.trim.contigs.good.count_table)
97228 21486

Output File Names:
stability.files/fileList.paired.trim.contigs.good.unique.good.filter.count_table
stability.files/fileList.paired.trim.contigs.good.unique.good.filter.unique.fasta


mothur > pre.cluster(fasta=stability.files/fileList.paired.trim.contigs.good.unique.good.filter.fasta, count=stability.files/fileList.paired.trim.contigs.good.count_table, diffs=2)

Using 10 processors.

Processing group W1S1:
13727 807 12920
Total number of sequences before pre.cluster was 13727.
pre.cluster removed 12920 sequences.

It took 0 secs to cluster 13727 sequences.
26203 1283 24920
Total number of sequences before pre.cluster was 26203.
pre.cluster removed 24920 sequences.

It took 1 secs to cluster 26203 sequences.
27640 1401 26239
Total number of sequences before pre.cluster was 27640.
pre.cluster removed 26239 sequences.

It took 1 secs to cluster 27640 sequences.
28198 1477 26721
Total number of sequences before pre.cluster was 28198.
pre.cluster removed 26721 sequences.

It took 2 secs to cluster 28198 sequences.
30597 1617 28980
Total number of sequences before pre.cluster was 30597.
pre.cluster removed 28980 sequences.

It took 2 secs to cluster 30597 sequences.
It took 7 secs to run pre.cluster.

Output File Names:
stability.files/fileList.paired.trim.contigs.good.unique.good.filter.precluster.fasta
stability.files/fileList.paired.trim.contigs.good.unique.good.filter.precluster.count_table
stability.files/fileList.paired.trim.contigs.good.unique.good.filter.precluster.B1S1.map
stability.files/fileList.paired.trim.contigs.good.unique.good.filter.precluster.FM1W1.map
stability.files/fileList.paired.trim.contigs.good.unique.good.filter.precluster.H1S1.map
stability.files/fileList.paired.trim.contigs.good.unique.good.filter.precluster.Mock.map
stability.files/fileList.paired.trim.contigs.good.unique.good.filter.precluster.P1S1.map
stability.files/fileList.paired.trim.contigs.good.unique.good.filter.precluster.W1S1.map


mothur > chimera.uchime(fasta=stability.files/fileList.paired.trim.contigs.good.unique.good.filter.precluster.fasta, count=stability.files/fileList.paired.trim.contigs.good.unique.good.filter.precluster.count_table, dereplicate=t)

Using 10 processors.

uchime by Robert C. Edgar
http://drive5.com/uchime
This code is donated to the public domain.

Checking sequences from stability.files/fileList.paired.trim.contigs.good.unique.good.filter.precluster.fasta …

It took 0 secs to check 35 sequences from group Mock.

It took 28 secs to check 1283 sequences from group W1S1.

It took 28 secs to check 807 sequences from group FM1W1.

It took 39 secs to check 1401 sequences from group H1S1.

It took 45 secs to check 1617 sequences from group P1S1.
20930 here

It took 48 secs to check 1477 sequences from group B1S1.

Output File Names:
stability.files/fileList.paired.trim.contigs.good.unique.good.filter.precluster.denovo.uchime.pick.count_table
stability.files/fileList.paired.trim.contigs.good.unique.good.filter.precluster.denovo.uchime.chimeras
stability.files/fileList.paired.trim.contigs.good.unique.good.filter.precluster.denovo.uchime.accnos


mothur > remove.seqs(fasta=stability.files/fileList.paired.trim.contigs.good.unique.good.filter.precluster.fasta, accnos=stability.files/fileList.paired.trim.contigs.good.unique.good.filter.precluster.denovo.uchime.accnos)
[WARNING]: This command can take a namefile and you did not provide one. The current namefile is stability.files/fileList.paired.trim.contigs.good.names which seems to match stability.files/fileList.paired.trim.contigs.good.unique.good.filter.precluster.fasta.
Removed 157 sequences from your fasta file.

Output File Names:
stability.files/fileList.paired.trim.contigs.good.unique.good.filter.precluster.pick.fasta


mothur > classify.seqs(fasta=stability.files/fileList.paired.trim.contigs.good.unique.good.filter.precluster.pick.fasta, count=stability.files/fileList.paired.trim.contigs.good.unique.good.filter.precluster.denovo.uchime.pick.count_table, reference=output.files/trainset9_032012.pds.fasta, taxonomy=output.files/trainset9_032012.pds.tax, cutoff=80)

Using 10 processors.
Generating search database… DONE.
It took 5 seconds generate search database.

Reading in the output.files/trainset9_032012.pds.tax taxonomy… DONE.
Calculating template taxonomy tree… DONE.
Calculating template probabilities… DONE.
It took 16 seconds get probabilities.
Classifying sequences from stability.files/fileList.paired.trim.contigs.good.unique.good.filter.precluster.pick.fasta …

It took 42 secs to classify 6395 sequences.


It took 0 secs to create the summary file for 6395 sequences.

Output File Names:
stability.files/fileList.paired.trim.contigs.good.unique.good.filter.precluster.pick.pds.wang.taxonomy
stability.files/fileList.paired.trim.contigs.good.unique.good.filter.precluster.pick.pds.wang.tax.summary

mothur > remove.lineage(fasta=stability.files/fileList.paired.trim.contigs.good.unique.good.filter.precluster.pick.fasta, count=stability.files/fileList.paired.trim.contigs.good.unique.good.filter.precluster.denovo.uchime.pick.count_table, taxonomy=stability.files/fileList.paired.trim.contigs.good.unique.good.filter.precluster.pick.pds.wang.taxonomy, taxon=Chloroplast-Mitochondria-unknown-Archaea-Eukaryota)

[NOTE]: The count file should contain only unique names, so mothur assumes your fasta, list and taxonomy files also contain only uniques.


Output File Names:
stability.files/fileList.paired.trim.contigs.good.unique.good.filter.precluster.pick.pds.wang.pick.taxonomy
stability.files/fileList.paired.trim.contigs.good.unique.good.filter.precluster.pick.pick.fasta
stability.files/fileList.paired.trim.contigs.good.unique.good.filter.precluster.denovo.uchime.pick.pick.count_table

mothur > get.groups(count=stability.files/fileList.paired.trim.contigs.good.unique.good.filter.precluster.denovo.uchime.pick.pick.count_table, fasta=stability.files/fileList.paired.trim.contigs.good.unique.good.filter.precluster.pick.pick.fasta, groups=Mock)

[NOTE]: The count file should contain only unique names, so mothur assumes your fasta, list and taxonomy files also contain only uniques.

Selected 29 sequences from your fasta file.
Selected 984 sequences from your count file.

Output File names:
stability.files/fileList.paired.trim.contigs.good.unique.good.filter.precluster.pick.pick.pick.fasta
stability.files/fileList.paired.trim.contigs.good.unique.good.filter.precluster.denovo.uchime.pick.pick.pick.count_table


mothur > seq.error(fasta=stability.files/fileList.paired.trim.contigs.good.unique.good.filter.precluster.pick.pick.pick.fasta, reference=output.files/HMP_MOCK.v35.fasta, aligned=F)

Using 10 processors.
[WARNING]: This command can take a namefile and you did not provide one. The current namefile is stability.files/fileList.paired.trim.contigs.good.names which seems to match stability.files/fileList.paired.trim.contigs.good.unique.good.filter.precluster.pick.pick.pick.fasta.
It took 0 to read 32 sequences.
Appending files from process 23850
Appending files from process 23851
Appending files from process 23852
Appending files from process 23853
Appending files from process 23854
Appending files from process 23855
Appending files from process 23856
Appending files from process 23857
Appending files from process 23858
Overall error rate: 0.167376
Errors Sequences
0 0
1 0
2 0
3 0
4 0
5 0
6 0
7 0
8 0
9 0
10 0
11 0
12 0
13 0
14 0
15 0
16 0
17 0
18 0
19 0
20 0
21 0
22 0
23 0
24 0
25 0
26 0
27 0
28 0
29 0
30 0
31 0
32 0
33 0
34 0
35 0
36 0
37 0
38 0
39 0
40 0
41 0
42 0
43 0
44 0
45 0
46 0
47 0
48 0
49 0
50 0
51 0
52 0
53 0
54 0
55 0
56 0
57 0
58 0
59 0
60 0
61 0
62 0
63 1
64 1
65 0
66 0
67 0
68 0
69 0
70 0
71 0
72 0
73 0
74 0
75 2
76 0
77 1
It took 1 secs to check 29 sequences.

Output File Names:
stability.files/fileList.paired.trim.contigs.good.unique.good.filter.precluster.pick.pick.pick.error.summary
stability.files/fileList.paired.trim.contigs.good.unique.good.filter.precluster.pick.pick.pick.error.seq
stability.files/fileList.paired.trim.contigs.good.unique.good.filter.precluster.pick.pick.pick.error.chimera
stability.files/fileList.paired.trim.contigs.good.unique.good.filter.precluster.pick.pick.pick.error.seq.forward
stability.files/fileList.paired.trim.contigs.good.unique.good.filter.precluster.pick.pick.pick.error.seq.reverse
stability.files/fileList.paired.trim.contigs.good.unique.good.filter.precluster.pick.pick.pick.error.count
stability.files/fileList.paired.trim.contigs.good.unique.good.filter.precluster.pick.pick.pick.error.matrix
stability.files/fileList.paired.trim.contigs.good.unique.good.filter.precluster.pick.pick.pick.error.ref


mothur > dist.seqs(fasta=stability.files/fileList.paired.trim.contigs.good.unique.good.filter.precluster.pick.pick.pick.fasta, cutoff=0.20)

Using 10 processors.

Output File Names:
stability.files/fileList.paired.trim.contigs.good.unique.good.filter.precluster.pick.pick.pick.dist

It took 0 seconds to calculate the distances for 29 sequences.

mothur > cluster(column=stability.files/fileList.paired.trim.contigs.good.unique.good.filter.precluster.pick.pick.pick.dist, count=stability.files/fileList.paired.trim.contigs.good.unique.good.filter.precluster.denovo.uchime.pick.pick.count_table)
********************###########
Reading matrix: |||||||||||||||||||||||||||||||||||||||||||||||||||


changed cutoff to 0.1639

Output File Names:
stability.files/fileList.paired.trim.contigs.good.unique.good.filter.precluster.pick.pick.pick.an.unique_list.list

It took 1 seconds to cluster

mothur > make.shared(list=stability.files/fileList.paired.trim.contigs.good.unique.good.filter.precluster.pick.pick.pick.an.unique_list.list, count=stability.files/fileList.paired.trim.contigs.good.unique.good.filter.precluster.denovo.uchime.pick.pick.count_table, label=0.03)
0.03

Output File Names:
stability.files/fileList.paired.trim.contigs.good.unique.good.filter.precluster.pick.pick.pick.an.unique_list.shared


mothur > rarefaction.single(shared=stability.files/fileList.paired.trim.contigs.good.unique.good.filter.precluster.pick.pick.pick.an.unique_list.shared)

Using 10 processors.

Processing group B1S1

0.03

Processing group FM1W1

0.03

Processing group H1S1

0.03

Processing group Mock

0.03

Processing group P1S1

0.03

Processing group W1S1

0.03

Output File Names:
stability.files/fileList.paired.trim.contigs.good.unique.good.filter.precluster.pick.pick.pick.an.unique_list.groups.rarefaction


mothur > remove.groups(count=stability.files/fileList.paired.trim.contigs.good.unique.good.filter.precluster.denovo.uchime.pick.pick.count_table, fasta=stability.files/fileList.paired.trim.contigs.good.unique.good.filter.precluster.pick.pick.pick.fasta, taxonomy=stability.files/fileList.paired.trim.contigs.good.unique.good.filter.precluster.pick.pds.wang.taxonomy, groups=Mock)

[NOTE]: The count file should contain only unique names, so mothur assumes your fasta, list and taxonomy files also contain only uniques.

Your file contains only sequences from the groups you wish to remove.
Removed 29 sequences from your fasta file.
Removed 984 sequences from your count file.
Removed 29 sequences from your taxonomy file.

Output File names:
stability.files/fileList.paired.trim.contigs.good.unique.good.filter.precluster.pick.pick.pick.pick.fasta
stability.files/fileList.paired.trim.contigs.good.unique.good.filter.precluster.denovo.uchime.pick.pick.pick.count_table
stability.files/fileList.paired.trim.contigs.good.unique.good.filter.precluster.pick.pds.wang.pick.taxonomy


mothur > dist.seqs(fasta=stability.files/fileList.paired.trim.contigs.good.unique.good.filter.precluster.pick.pick.pick.pick.fasta, cutoff=0.20)
[ERROR]: stability.files/fileList.paired.trim.contigs.good.unique.good.filter.precluster.pick.pick.pick.pick.fasta is blank, aborting.
Using stability.files/fileList.paired.trim.contigs.good.unique.good.filter.precluster.pick.pick.pick.pick.fasta as input file for the fasta parameter.
[ERROR]: stability.files/fileList.paired.trim.contigs.good.unique.good.filter.precluster.pick.pick.pick.pick.fasta is blank. Please correct.

Using 10 processors.
[ERROR]: unable to spawn the number of processes you requested, reducing number to 31906
[ERROR]: unable to spawn the necessary processes. Error code: -1

  1. You don’t really appear to have a mock community. I would remove the steps associated with getting the sequencing error since the analysis is meaningless.

  2. Can you try running dist.seqs with processors=4?

Thanks Pschloss,
I do have a mock community, but it is a control rather than the kind of mock community described in your MiSeq_SOP. In my case the "mock" sample contains the microorganisms present in the water used to dissolve the samples, so that organisms introduced by the water can be subtracted from our samples.
The data I received contains these controls, which I named Mock. I will run dist.seqs with processors=4 and let you know.

Thanks

Dear Pschloss,
With processors=4 I get the same result as before. Can you suggest where the problem is? Is there a problem with the Mock sample I am using?
Thanks

First, a mock community is a community where you know the actual sequences of the DNA in the sample. What you describe is a negative control to identify contaminants. It’s not clear what you’re hoping to achieve by running it through seq.error.

Second, at least what you pasted in for running screen.seqs was not what I told you to run. You pasted in the same command. This does not give me a lot of confidence that your files are consistent. I think the problem is that you are using the wrong fasta and taxonomy files in remove.groups. You have:

remove.groups(count=stability.files/fileList.paired.trim.contigs.good.unique.good.filter.precluster.denovo.uchime.pick.pick.count_table, fasta=stability.files/fileList.paired.trim.contigs.good.unique.good.filter.precluster.pick.pick.pick.fasta, taxonomy=stability.files/fileList.paired.trim.contigs.good.unique.good.filter.precluster.pick.pds.wang.taxonomy, groups=Mock)

What you want is

remove.groups(count=stability.files/fileList.paired.trim.contigs.good.unique.good.filter.precluster.denovo.uchime.pick.pick.count_table, fasta=stability.files/fileList.paired.trim.contigs.good.unique.good.filter.precluster.pick.pick.fasta, taxonomy=stability.files/fileList.paired.trim.contigs.good.unique.good.filter.precluster.pick.pds.wang.pick.taxonomy, groups=Mock)

Third, your data quality is likely to be quite poor since your reads do not fully overlap. You already have a sense of this when you had to remove half of your sequences because they had a sequencing error. For more information please see:

http://blog.mothur.org/2014/09/11/Why-such-a-large-distance-matrix%3F/
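To put a number on the loss, here is a small Python sketch using only the counts from the log at the top of the thread: 1492335 contigs from make.contigs and 505339 sequences left after screen.seqs(maxambig=0, maxlength=500). The helper name `fraction_retained` is hypothetical. Note that this first screen applies both criteria, so the figure is not purely the effect of ambiguous bases.

```python
def fraction_retained(before, after):
    """Fraction of sequences that survive a filtering step."""
    return after / before


before = 1492335  # total contigs reported by make.contigs
after = 505339    # sequences reported after the first screen.seqs

retained = fraction_retained(before, after)
print(f"retained {retained:.1%}, removed {1 - retained:.1%}")
# retained 33.9%, removed 66.1%
```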

Pat

Dear Pat,
I am aware of the mock community you used in the MiSeq_SOP analysis. That is why I described exactly what I am using as "mock": I suspected I might be wrong on that point.
I do not have a mock community of the kind you describe. Is it possible to carry out the analysis without one? The primers used for sequencing were 342F and 806R, which is why the amplicon is large. I was given the data and told to analyze it with mothur, but no mock community was included.

Thanks

Just skip the mock community analysis steps. You can’t calculate an error rate unless you know the actual sequences of what is in the mock community.
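The dependence on known sequences can be seen in a minimal Python sketch of an error rate. This is not how seq.error is implemented; it assumes the observed read and the reference are already aligned and of equal length, and both sequences below are made up for illustration.

```python
def error_rate(observed, reference):
    """Mismatches per base between an observed read and its known reference.

    Assumption: both strings are pre-aligned and have the same length.
    """
    if len(observed) != len(reference):
        raise ValueError("sequences must be aligned to the same length")
    mismatches = sum(o != r for o, r in zip(observed, reference))
    return mismatches / len(reference)


# Hypothetical 10-base read with one substitution relative to its reference.
print(error_rate("ACGTACGTAC", "ACGTACGTTC"))  # 0.1
```

Without the `reference` argument there is nothing to count mismatches against, which is why a water control cannot stand in for a mock community here.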

Pat

Thanks Pschloss for the help.

Hi Pschloss,
I have done the analysis with my available sequences, but I am not convinced by the result because I am getting very few species. Can I send you my log file? The commands I used at some of the steps are:

mothur > summary.seqs(count=current)
Using stability.files/fileList.paired.trim.contigs.good.count_table as input file for the count parameter.
Using stability.files/fileList.paired.trim.contigs.good.unique.fasta as input file for the fasta parameter.

Using 8 processors.

Start End NBases Ambigs Polymer NumSeqs
Minimum: 1 323 323 0 3 1
2.5%-tile: 1 441 441 0 6 4775
25%-tile: 1 444 444 0 8 47744
Median: 1 447 447 0 8 95487
75%-tile: 1 449 449 0 8 143230
97.5%-tile: 1 452 452 0 8 186199
Maximum: 1 612 612 0 286 190973
Mean: 1 448.718 448.718 0 7.86556

# of unique seqs: 50738

total # of seqs: 190973


mothur > pcr.seqs(fasta=output.files/silva.nr_v123.align, oligos=output.files/strawberry_primers.mothur, pdiffs=3, keepdots=F, taxonomy=output.files/silva.nr_v123.tax, outputdir=output.files, processors=8)

mothur > align.seqs(fasta=stability.files/fileList.paired.trim.contigs.good.unique.fasta, reference=output.files/silva.nr_v123.pcr.align, flip=true, processors=8)

After the chimera.uchime and remove.seqs commands, summary.seqs gives:

Using 8 processors.
Start End NBases Ambigs Polymer NumSeqs
Minimum: 1 773 406 0 4 1
2.5%-tile: 1 773 429 0 7 2595
25%-tile: 1 773 429 0 8 25949
Median: 1 773 429 0 8 51898
75%-tile: 1 773 429 0 8 77847
97.5%-tile: 1 773 429 0 8 101201
Maximum: 2 773 468 0 8 103795
Mean: 1.00001 773 429.088 0 7.9643

# of unique seqs: 3518

total # of seqs: 103795

mothur > classify.seqs(fasta=current, count=current, reference=output.files/trainset9_032012.pds.fasta, taxonomy=output.files/trainset9_032012.pds.tax, cutoff=80)

I used the same commands as in the MiSeq_SOP up to count.groups and got very few species in each group. classify.otu also gives very few species. These samples are from strawberry grown under different conditions.

mothur > count.groups(shared=result.files/stability.an.shared)
FM1W1 contains 28.
P3C1 contains 34.
P3C2 contains 23.
P3C3 contains 39.

Total seqs: 124.

Is it really possible to get so few species in our analysis?

It sounds like you have very poorly classified sequences that either cannot be classified to the bacterial kingdom (i.e. unknown) or to Archaea, Eukarya, Mitochondria, or Chloroplasts that are being culled in the remove.lineage step. If you look at the tax.summary file generated in classify.seqs, can you see the total number of reads in each of the groups you are removing in remove.lineage?

Most of my sequences fall under Mitochondria and Chloroplast in the taxonomic summary, which I have pasted below. We used the universal primers 342F and 806R to sequence the strawberry samples. The primer sequences were:

forward CTACGGGGGGCAGCAG
reverse GGACTACCGGGGTATCT

Could the problem be caused by the primers I am using in pcr.seqs?

taxlevel rankID taxon daughterlevels total FM1W1 P3C1 P3C2 P3C3
0 0 Root 2 103795 28228 29302 20545 25720
1 0.2 Bacteria 4 103786 28226 29299 20544 25717
2 0.2.2 “Actinobacteria” 1 2 0 0 0 2
3 0.2.2.1 Actinobacteria 1 2 0 0 0 2
4 0.2.2.1.2 Actinomycetales 1 2 0 0 0 2
5 0.2.2.1.2.31 Nocardioidaceae 1 2 0 0 0 2
6 0.2.2.1.2.31.4 Marmoricola 0 2 0 0 0 2
2 0.2.21 “Proteobacteria” 3 103535 28196 29225 20483 25631
3 0.2.21.2 Alphaproteobacteria 3 103521 28193 29218 20482 25628
4 0.2.21.2.6 Rhizobiales 1 1 1 0 0 0
5 0.2.21.2.6.10 Phyllobacteriaceae 1 1 1 0 0 0
6 0.2.21.2.6.10.5 Mesorhizobium 0 1 1 0 0 0
4 0.2.21.2.9 Rickettsiales 2 103511 28188 29216 20481 25626
5 0.2.21.2.9.2 Mitochondria 1 103507 28187 29214 20480 25626
6 0.2.21.2.9.2.1 Mitochondria_genus_incertae_sedis 0 103507 28187 29214 20480 25626
5 0.2.21.2.9.4 unclassified 1 4 1 2 1 0
6 0.2.21.2.9.4.1 unclassified 0 4 1 2 1 0
4 0.2.21.2.12 unclassified 1 9 4 2 1 2
5 0.2.21.2.12.1 unclassified 1 9 4 2 1 2
6 0.2.21.2.12.1.1 unclassified 0 9 4 2 1 2
3 0.2.21.6 Gammaproteobacteria 2 3 2 1 0 0
4 0.2.21.6.1 “Enterobacteriales” 1 1 0 1 0 0
5 0.2.21.6.1.1 Enterobacteriaceae 1 1 0 1 0 0
6 0.2.21.6.1.1.44 unclassified 0 1 0 1 0 0
4 0.2.21.6.14 Pseudomonadales 1 2 2 0 0 0
5 0.2.21.6.14.2 Pseudomonadaceae 1 2 2 0 0 0
6 0.2.21.6.14.2.5 Pseudomonas 0 2 2 0 0 0
3 0.2.21.7 unclassified 1 11 1 6 1 3
4 0.2.21.7.1 unclassified 1 11 1 6 1 3
5 0.2.21.7.1.1 unclassified 1 11 1 6 1 3
6 0.2.21.7.1.1.1 unclassified 0 11 1 6 1 3
2 0.2.29 Cyanobacteria_Chloroplast 1 155 11 51 41 52
3 0.2.29.1 Chloroplast 1 155 11 51 41 52
4 0.2.29.1.1 Chloroplast_order_incertae_sedis 1 155 11 51 41 52
5 0.2.29.1.1.1 Chloroplast 2 155 11 51 41 52
6 0.2.29.1.1.1.7 Streptophyta 0 148 7 49 41 51
6 0.2.29.1.1.1.8 unclassified 0 7 4 2 0 1
2 0.2.36 unclassified 1 94 19 23 20 32
3 0.2.36.1 unclassified 1 94 19 23 20 32
4 0.2.36.1.1 unclassified 1 94 19 23 20 32
5 0.2.36.1.1.1 unclassified 1 94 19 23 20 32
6 0.2.36.1.1.1.1 unclassified 0 94 19 23 20 32
1 0.5 unknown 1 9 2 3 1 3
2 0.5.1 unclassified 1 9 2 3 1 3
3 0.5.1.1 unclassified 1 9 2 3 1 3
4 0.5.1.1.1 unclassified 1 9 2 3 1 3
5 0.5.1.1.1.1 unclassified 1 9 2 3 1 3
6 0.5.1.1.1.1.1 unclassified 0 9 2 3 1 3
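As a quick check of the table above, here is a rough Python sketch that subtracts the reads in the culled lineages from the Root totals. It is not a reimplementation of remove.lineage: matching taxa by exact name and skipping rows nested under an already matched rankID are simplifying assumptions, and the function name `reads_left` is hypothetical. The rows are copied from the table, trimmed to the relevant lineages.

```python
# Rows copied from the tax.summary table above (trimmed to the relevant lineages).
TAX_SUMMARY = """\
taxlevel rankID taxon daughterlevels total FM1W1 P3C1 P3C2 P3C3
0 0 Root 2 103795 28228 29302 20545 25720
5 0.2.21.2.9.2 Mitochondria 1 103507 28187 29214 20480 25626
6 0.2.21.2.9.2.1 Mitochondria_genus_incertae_sedis 0 103507 28187 29214 20480 25626
2 0.2.29 Cyanobacteria_Chloroplast 1 155 11 51 41 52
3 0.2.29.1 Chloroplast 1 155 11 51 41 52
5 0.2.29.1.1.1 Chloroplast 2 155 11 51 41 52
1 0.5 unknown 1 9 2 3 1 3
"""

# Taxa listed in the taxon= option of the remove.lineage command in this thread.
TARGETS = {"Chloroplast", "Mitochondria", "unknown", "Archaea", "Eukaryota"}


def reads_left(text, targets):
    """Per-group reads remaining after subtracting the targeted lineages."""
    lines = text.strip().splitlines()
    groups = lines[0].split()[5:]
    root = None
    removed = [0] * len(groups)
    matched = []  # rankIDs already counted, used to skip nested rows
    for line in lines[1:]:
        fields = line.split()
        rank_id, taxon = fields[1], fields[2]
        counts = [int(x) for x in fields[5:]]
        if taxon == "Root":
            root = counts
            continue
        if taxon not in targets:
            continue
        if any(rank_id.startswith(parent + ".") for parent in matched):
            continue  # child of a lineage that was already subtracted
        matched.append(rank_id)
        removed = [r + c for r, c in zip(removed, counts)]
    return {g: t - r for g, t, r in zip(groups, root, removed)}


left = reads_left(TAX_SUMMARY, TARGETS)
print(left)                # {'FM1W1': 28, 'P3C1': 34, 'P3C2': 23, 'P3C3': 39}
print(sum(left.values()))  # 124
```

These values agree with the count.groups output earlier in the thread (28, 34, 23, 39, total 124), so almost all of the 103795 reads are accounted for by the Mitochondria, Chloroplast and unknown lineages.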

I suspect your primers are permissive enough and you have a low enough DNA concentration that you’re amplifying and sequencing a lot of Strawberry DNA.

Pat