Name file and group file sequence discrepancy

Hi,

I was on the Detroit course a few weeks ago. Since then I’ve been trying to use mothur on some PacBio reads, and I keep getting the following error message: [ERROR]: Your name file contains 182091 valid sequences, and your groupfile contains 191161, please correct.

I’ve tried so many ways of correcting this, but it still happens. I had to use unique.seqs to make a name file. I’ve pasted the log below, any suggestions would be greatly appreciated!

Thanks so much,

Bethan


mothur v.1.30.2
Last updated: 4/19/2013

by
Patrick D. Schloss

Department of Microbiology & Immunology
University of Michigan
pschloss@umich.edu
http://www.mothur.org

When using, please cite:
Schloss, P.D., et al., Introducing mothur: Open-source, platform-independent, community-supported software for describing and comparing microbial communities. Appl Environ Microbiol, 2009. 75(23):7537-41.

Distributed under the GNU General Public License

Type ‘help()’ for information on the commands that are available

Type ‘quit()’ to exit program



mothur > summary.seqs(fasta=Antarctic1.fasta)
Using 1 processors.

             Start     End       NBases    Ambigs  Polymer  NumSeqs
Minimum:     1         20        20        0       2        1
2.5%-tile:   1         413       413       0       3        5039
25%-tile:    1         497       497       0       4        50389
Median:      1         500       500       0       5        100777
75%-tile:    1         502       502       0       5        151165
97.5%-tile:  1         992       992       0       7        196514
Maximum:     1         2102      2102      0       24       201552
Mean:        1         514.994   514.994   0       4.65406

# of Seqs:   201552

Output File Names:
Antarctic1.summary


mothur > trim.seqs(fasta=current, maxhomop=8, minlength=413, maxlength=650, flip=T)

Using Antarctic1.fasta as input file for the fasta parameter.

Using 1 processors.
1000
2000
3000
4000
5000
6000
7000
8000
9000
10000
11000
12000
13000
14000
15000
16000
17000
18000
19000
20000
21000
22000
23000
24000
25000
26000
27000
28000
29000
30000
31000
32000
33000
34000
35000
36000
37000
38000
39000
40000
41000
42000
43000
44000
45000
46000
47000
48000
49000
50000
51000
52000
53000
54000
55000
56000
57000
58000
59000
60000
61000
62000
63000
64000
65000
66000
67000
68000
69000
70000
71000
72000
73000
74000
75000
76000
77000
78000
79000
80000
81000
82000
83000
84000
85000
86000
87000
88000
89000
90000
91000
92000
93000
94000
95000
96000
97000
98000
99000
100000
101000
102000
103000
104000
105000
106000
107000
108000
109000
110000
111000
112000
113000
114000
115000
116000
117000
118000
119000
120000
121000
122000
123000
124000
125000
126000
127000
128000
129000
130000
131000
132000
133000
134000
135000
136000
137000
138000
139000
140000
141000
142000
143000
144000
145000
146000
147000
148000
149000
150000
151000
152000
153000
154000
155000
156000
157000
158000
159000
160000
161000
162000
163000
164000
165000
166000
167000
168000
169000
170000
171000
172000
173000
174000
175000
176000
177000
178000
179000
180000
181000
182000
183000
184000
185000
186000
187000
188000
189000
190000
191000
192000
193000
194000
195000
196000
197000
198000
199000
200000
201000
201552


Output File Names:
Antarctic1.trim.fasta
Antarctic1.scrap.fasta

mothur > summary.seqs()

Using Antarctic1.trim.fasta as input file for the fasta parameter.

Using 1 processors.

             Start     End       NBases    Ambigs  Polymer  NumSeqs
Minimum:     1         413       413       0       3        1
2.5%-tile:   1         477       477       0       3        4813
25%-tile:    1         497       497       0       4        48121
Median:      1         500       500       0       5        96242
75%-tile:    1         502       502       0       5        144362
97.5%-tile:  1         524       524       0       7        187670
Maximum:     1         650       650       0       8        192482
Mean:        1         499.152   499.152   0       4.60035

# of Seqs:   192482

Output File Names:
Antarctic1.trim.summary


mothur > unique.seqs(fasta=current)

Using Antarctic1.trim.fasta as input file for the fasta parameter.
1000 809
2000 1578
3000 2379
4000 3142
5000 3884
6000 4622
7000 5323
8000 6024
9000 6731
10000 7468
11000 8231
12000 8977
13000 9745
14000 10494
15000 11225
16000 11959
17000 12681
18000 13416
19000 14135
20000 14878
21000 15635
22000 16333
23000 17061
24000 17784
25000 18532
26000 19230
27000 19906
28000 20591
29000 21274
30000 21929
31000 22594
32000 23327
33000 24037
34000 24761
35000 25481
36000 26170
37000 26883
38000 27571
39000 28273
40000 28975
41000 29653
42000 30306
43000 30927
44000 31549
45000 32171
46000 32819
47000 33437
48000 34076
49000 34719
50000 35367
51000 35999
52000 36700
53000 37416
54000 38117
55000 38825
56000 39507
57000 40195
58000 40864
59000 41606
60000 42314
61000 43014
62000 43724
63000 44455
64000 45166
65000 45842
66000 46549
67000 47207
68000 47874
69000 48547
70000 49247
71000 49926
72000 50619
73000 51284
74000 51942
75000 52598
76000 53232
77000 53831
78000 54481
79000 55158
80000 55854
81000 56519
82000 57217
83000 57895
84000 58561
85000 59196
86000 59841
87000 60481
88000 61118
89000 61811
90000 62520
91000 63242
92000 63909
93000 64612
94000 65330
95000 66030
96000 66718
97000 67387
98000 68091
99000 68701
100000 69327
101000 70001
102000 70669
103000 71346
104000 71997
105000 72630
106000 73258
107000 73870
108000 74463
109000 75111
110000 75729
111000 76352
112000 76953
113000 77574
114000 78185
115000 78821
116000 79491
117000 80143
118000 80774
119000 81371
120000 81988
121000 82599
122000 83258
123000 83900
124000 84555
125000 85214
126000 85856
127000 86485
128000 87129
129000 87772
130000 88393
131000 88976
132000 89633
133000 90259
134000 90858
135000 91447
136000 92060
137000 92669
138000 93259
139000 93827
140000 94414
141000 95028
142000 95599
143000 96197
144000 96765
145000 97361
146000 97987
147000 98622
148000 99300
149000 100000
150000 100690
151000 101344
152000 101965
153000 102562
154000 103228
155000 103896
156000 104550
157000 105195
158000 105823
159000 106449
160000 107056
161000 107663
162000 108286
163000 108959
164000 109615
165000 110253
166000 110891
167000 111526
168000 112162
169000 112770
170000 113399
171000 114053
172000 114693
173000 115335
174000 115985
175000 116652
176000 117298
177000 117936
178000 118594
179000 119248
180000 119917
181000 120585
182000 121259
183000 121922
184000 122601
185000 123284
186000 123956
187000 124620
188000 125289
189000 125902
190000 126543
191000 127226
192000 127903
192482 128212

Output File Names:
Antarctic1.trim.names
Antarctic1.trim.unique.fasta


mothur > align.seqs(fasta=current, group=mergegroups, reference=silva.eukarya.fasta, flip=T)

Using Antarctic1.trim.unique.fasta as input file for the fasta parameter.
group is not a valid parameter.
The valid parameters are: reference, fasta, search, ksize, match, align, mismatch, gapopen, gapextend, processors, flip, save, threshold, inputdir, and outputdir.

Using 1 processors.
[ERROR]: did not complete align.seqs.

mothur > align.seqs(fasta=current, reference=silva.eukarya.fasta, flip=T)
Using Antarctic1.trim.unique.fasta as input file for the fasta parameter.

Using 1 processors.

Reading in the silva.eukarya.fasta template sequences… DONE.
It took 3 to read 1238 sequences.
Aligning sequences from Antarctic1.trim.unique.fasta …
100
200
300
400
500
600
700
800
900
1000
1100
1200
1300
1400
1500
1600
1700
1800
1900
2000
2100
2200
2300
2400
2500
2600
2700
2800
2900
3000
3100
3200
3300
3400
3500
3600
3700
3800
3900
4000
4100
4200
4300
4400
4500
4600
4700
4800
4900
5000
5100
5200
5300
5400
5500
5600
5700
5800
5900
6000
6100
6200
6300
6400
6500
6600
6700
6800
6900
7000
7100
7200
7300
7400
7500
7600
7700
7800
7900
8000
8100
8200
8300
8400
8500
8600
8700
8800
8900
9000
9100
9200
9300
9400
9500
9600
9700
9800
9900
10000
10100
10200
10300
10400
10500
10600
10700
10800
10900
11000
11100
11200
11300
11400
11500
11600
11700
11800
11900

128100
128200
128212
Some of you sequences generated alignments that eliminated too many bases, a list is provided in Antarctic1.trim.unique.flip.accnos. If the reverse compliment proved to be better it was reported.
It took 2778 secs to align 128212 sequences.


Output File Names:
Antarctic1.trim.unique.align
Antarctic1.trim.unique.align.report
Antarctic1.trim.unique.flip.accnos

mothur > get.current()
Current files saved by mothur: fasta=Antarctic1.trim.unique.align name=Antarctic1.trim.names

mothur > summary.seqs()

Using Antarctic1.trim.unique.align as input file for the fasta parameter.

Using 1 processors.
[WARNING]: This command can take a namefile and you did not provide one. The current namefile is Antarctic1.trim.names which seems to match Antarctic1.trim.unique.align.

             Start     End       NBases    Ambigs  Polymer  NumSeqs
Minimum:     1044      1048      3         0       1        1
2.5%-tile:   34102     43116     460       0       3        3206
25%-tile:    34102     43116     477       0       4        32054
Median:      34102     43116     480       0       5        64107
75%-tile:    34102     43116     483       0       5        96160
97.5%-tile:  34105     43116     505       0       7        125007
Maximum:     43113     43116     622       0       8        128212
Mean:        34096     43102.3   480.31    0       4.6038

# of Seqs:   128212

Output File Names:
Antarctic1.trim.unique.summary


mothur > get.current()
Current files saved by mothur: fasta=Antarctic1.trim.unique.align name=Antarctic1.trim.names

mothur > screen.seqs(fasta=current, name=current, group=mergegroups, end=43116, start=optimize, criteria=95, processors=2)

Using Antarctic1.trim.unique.align as input file for the fasta parameter.
Using Antarctic1.trim.names as input file for the name parameter.
[ERROR]: cannot convert optimize to an integer.

Using 2 processors.
[ERROR]: did not complete screen.seqs.

mothur > screen.seqs(fasta=current, name=current, group=mergegroups, end=43116, optimize=start, criteria=95, processors=2)

Using Antarctic1.trim.unique.align as input file for the fasta parameter.
Using Antarctic1.trim.names as input file for the name parameter.

Using 2 processors.
Optimizing start to 34104.
Processing sequence: 100
Processing sequence: 100
Processing sequence: 200
Processing sequence: 300
Processing sequence: 200
Processing sequence: 300
Processing sequence: 400
Processing sequence: 500
Processing sequence: 400
Processing sequence: 600
Processing sequence: 500
Processing sequence: 700
Processing sequence: 600
Processing sequence: 800
Processing sequence: 700
Processing sequence: 800
Processing sequence: 900
Processing sequence: 900
Processing sequence: 1000

Processing sequence: 63200
Processing sequence: 63300
Processing sequence: 63400
Processing sequence: 63500
Processing sequence: 63600
Processing sequence: 63700
Processing sequence: 63800
Processing sequence: 63900
Processing sequence: 64000
Processing sequence: 64100
Processing sequence: 64107

Output File Names:
Antarctic1.trim.unique.good.align
Antarctic1.trim.unique.bad.accnos
Antarctic1.trim.good.names
mergegroupsgood


It took 1260 secs to screen 128212 sequences.

mothur > summary.seqs()

Using Antarctic1.trim.unique.good.align as input file for the fasta parameter.

Using 2 processors.
[WARNING]: This command can take a namefile and you did not provide one. The current namefile is Antarctic1.trim.good.names which seems to match Antarctic1.trim.unique.good.align.

             Start     End       NBases    Ambigs  Polymer  NumSeqs
Minimum:     28432     43116     415       0       3        1
2.5%-tile:   34102     43116     468       0       3        3015
25%-tile:    34102     43116     478       0       4        30143
Median:      34102     43116     481       0       5        60285
75%-tile:    34102     43116     483       0       5        90427
97.5%-tile:  34104     43116     505       0       7        117554
Maximum:     34104     43116     622       0       8        120568
Mean:        34095.9   43116     481.749   0       4.56966

# of Seqs:   120568

Output File Names:
Antarctic1.trim.unique.good.summary


mothur > filter.seqs(fasta=current, vertical=T, trump=., processors=2)

Using Antarctic1.trim.unique.good.align as input file for the fasta parameter.

Using 2 processors.
Creating Filter…
100
100
200
200
300
300
400
400
500
500
600
600
700
700
800
800
900
900
1000
1000
1100
1100
1200
1200
1300
1300
1400
1400
1500
1500
1600
1600
1700
1700
1800
1800
1900
1900
2000
2000
2100
2100
2200
2200
2300
2300
2400
2400
2500
2500
2600
2600
2700
2700
2800
2800
2900
2900
3000
3000
3100
3100
3200
3200
3300
3300
3400
3400
3500
3500
3600
3600
3700
3700
3800
3800
3900
3900
4000
4000
4100
4100
4200
4200
4300
4300
4400
4400
4500
4500
4600
4600
4700
4700
4800
4800
4900
4900
5000
5000
5100
5100
5200
5200
5300
5300
5400
5400
5500
5500
5600
5600
5700
5700
5800
5800
5900
5900
6000
6000
6100
6100
6200
6200
6300
6300
6400
6400
6500
6500
6600
6600
6700
6700
6800
6800
6900
6900
7000
7000
7100
7100
7200
7200
7300
7300
7400
7400
7500
7500
7600
7600
7700
7700
7800
7800
7900
7900
8000
8000
8100
8100
8200
8200
8300
8300
8400
8400
8500
8500
8600
8600
8700
8700
8800
8800
8900
8900
9000
9000
9100
9100
9200
9200
9300
9300
9400
9400
9500
9500
9600
9600
9700
9700
9800
9800
9900
9900
10000
10000
10100
10100
10200
10200
10300
10300
10400
10500
10400
10600
10500
10600
10700
10700
10800
10800
10900
10900
11000
11000
11100
11100
11200
11200
11300
11300
11400
11400
11500
11500
11600
11600
11700
11700
11800
11800
11900
11900
12000
12000
12100
12100
12200
12200
12300
12300
12400
12400
12500
12500
12600
12600
12700
12700
12800
12800
12900
12900
13000
13000
13100
13100
13200
13200
13300
13300
13400
13400
13500
13500
13600
13600
13700
13700
13800
13800
13900
13900
14000
14000
14100
14100
14200
14200
14300
14300
14400
14400
14500
14500
14600
14600
14700
14700
14800
14800
14900
14900
15000
15000
15100
15100
15200
15200
15300
15300
15400
15400
15500
15500
15600
15600
15700
15800
15700
15800
15900
16000
15900
16100
16000
16200
16100
16300
16200
16400
16300
16500
16400
16600
16500
16700
16600
16800
16700
16900
16800
17000
16900
17100
17000
17200
17100
17300
17200
17400
17300
17500
17400
17600
17500
17700
17600
17800
17900
17700
17800
18000
17900
18100
18000
18200
18300
18100
18400
18200
18500
18300
18600
18400
18700
18500
18800
18600
18900
18700
19000
18800
19100
18900
19200
19000
19300
19100
19400
19200
19500
19300
19600
19400
19700
19500
19800
19600
19900
19700
20000
19800
20100
19900
20200
20000
20300
20100
20400
20200
20500
20300
20600
20400
20700
20500
20800
20600
20900
20700
21000
20800
21100
20900
21200
21000
21300
21100
21400
21200
21500
21300
21600
21400
21700
21500
21800
21600
21900
22000
21700
22100
21800
22200
21900
22000
22300
22400
22100
22200
22500
22600
22300
22700
22400
22500
22800
22600
22900
23000
22700
22800
23100
23200
22900
23000
23300
23400
23100
23500
23200
23300
23600
23700
23400
23800
23500
23900
23600
24000
23700
24100
23800
24200
23900
24300
24000
24400
24100
24500
24200
24600
24300
24700
24400
24800
24500
24900
24600
25000
24700
25100
24800
25200
24900
25300
25000
25400
25100
25500
25200
25600
25300
25700
25400
25800
25500
25900
25600
26000
25700
26100
25800
26200
25900
26300
26000
26400
26100
26500
26200
26600
26300
26700
26400
26500
26800
26900
26600
27000
26700
26800
27100
26900
27200
27000
27300
27100
27400
27200
27500
27300
27600
27400
27700
27500
27800
27600
27900
27700
27800
28000
27900
28100
28000
28200
28100
28300
28200
28400
28300
28500
28400
28600
28500
28700
28600
28800
28700
28900
28800
29000
28900
29100
29000
29200
29100
29300
29200
29400
29300
29400
29500
29500
29600
29600
29700
29800
29700
29900
29800
29900
30000
30100
30000
30100
30200
30200
30300
30400
30300
30500
30400
30600
30500
30700
30600
30800
30700
30900
30800
31000
30900
31100
31000
31200
31100
31200
31300
31300
31400
31400
31500
31500
31600
31600
31700
31700
31800
31800
31900
31900
32000
32000
32100
32100
32200
32200
32300
32300
32400
32400
32500
32500
32600
32600
32700
32700
32800
32800
32900
32900
33000
33000
33100
33100
33200
33200
33300
33300
33400
33400
33500
33500
33600
33600
33700
33700
33800
33800
33900
33900
34000
34000
34100
34100
34200
34200
34300
34300
34400
34400
34500
34500
34600
34600
34700
34700
34800
34800
34900
34900
35000
35000
35100
35100
35200
35200
35300
35300
35400
35400
35500
35600
35500
35700
35600
35800
35700
35900
35800
36000
35900
36100
36000
36200
36100
36300
36200
36400
36300
36500
36400
36600
36500
36700
36600
36800
36700
36900
36800
37000
37100
36900
37200
37000
37100
37300
37400
37200
37300
37500
37400
37600
37500
37700
37600
37800
37700
37900
37800
38000
38100
37900
38200
38000
38100
38300
38400
38200
38500
38300
38600
38400
38700
38500
38800
38600
38900
38700
39000
38800
39100
38900
39200
39000
39300
39100
39400
39200
39500
39300
39600
39400
39700
39500
39800
39600
39900
39700
40000
39800
40100
39900
40200
40000
40300
40100
40400
40200
40500
40600
40300
40700
40400
40800
40500
40900
40600
40700
41000
40800
41100
40900
41200
41000
41300
41400
41100
41200
41500
41300
41600
41400
41700
41800
41500
41900
41600
42000
41700
42100
41800
42200
41900
42300
42000
42400
42100
42500
42200
42600
42300
42700
42400
42800
42500
42900
42600
43000
42700
42800
43100
43200
42900
43300
43000
43400
43100
43500
43200
43600
43300
43700
43400
43800
43500
43900
43600
44000
43700
44100
43800
44200
43900
44300
44000
44400
44100
44500
44200
44300
44600
44700
44400
44800
44500
44900
44600
45000
44700
45100
44800
45200
44900
45300
45000
45400
45100
45500
45200
45600
45300
45700
45400
45500
45800
45600
45900
45700
46000
45800
46100
45900
46200
46000
46300
46100
46400
46200
46500
46300
46600
46400
46700
46500
46600
46800
46700
46900
46800
47000
46900
47100
47000
47200
47100
47300
47200
47400
47300
47500
47400
47600
47500
47700
47600
47800
47700
47900
47800
48000
47900
48100
48000
48200
48100
48300
48200
48400
48300
48500
48400
48600
48500
48700
48600
48800
48700
48900
48800
49000
48900
49100
49000
49200
49100
49300
49200
49400
49300
49500
49400
49600
49500
49700
49600
49800
49700
49900
49800
50000
49900
50100
50000
50200
50100
50300
50200
50400
50300
50500
50400
50600
50500
50700
50600
50800
50700
50900
50800
51000
50900
51100
51000
51200
51300
51100
51200
51400
51300
51500
51600
51400
51700
51500
51600
51800
51700
51900
51800
52000
51900
52100
52000
52200
52100
52300
52200
52400
52300
52500
52400
52600
52500
52700
52600
52800
52700
52900
52800
53000
52900
53100
53200
53000
53300
53100
53400
53200
53500
53300
53600
53400
53700
53500
53800
53600
53900
53700
54000
53800
54100
53900
54200
54000
54300
54400
54100
54500
54200
54300
54600
54700
54400
54800
54500
54900
54600
55000
54700
55100
54800
55200
54900
55300
55400
55000
55500
55100
55600
55200
55700
55300
55800
55400
55900
55500
56000
55600
56100
55700
56200
55800
56300
55900
56400
56000
56500
56100
56600
56200
56700
56300
56800
56400
56900
56500
57000
56600
57100
56700
57200
56800
57300
56900
57400
57000
57100
57500
57200
57600
57300
57700
57800
57400
57500
57900
57600
58000
58100
57700
58200
57800
58300
57900
58400
58000
58100
58500
58200
58600
58700
58300
58800
58400
58900
58500
59000
58600
58700
59100
58800
59200
58900
59300
59000
59400
59100
59500
59200
59600
59300
59700
59400
59800
59500
59900
59600
60000
59700
60100
59800
60200
59900
60285
60000
60100
60200
60283


Running Filter... 100 100 200 200 300 300 400 400 500 500 600 600 700 700 800 800 900 900 1000 1000 1100 1100 1200 1300 1200 1400 1300 1500 1400 1500 1600 1600 1700 1700 1800 1800 1900 1900 2000 2000 2100 2100 2200 2200 2300 2400 2300 2400 2500 2500 2600 2600 2700 2700 2800 2800 2900 2900 3000 3000 3100 3200 3100 3300 3200 3400 3300 3500 3400 3600 3700 3500 3800 3600 3900 3700 4000 3800 4100 3900 4200 4300 4000 4400 4100 4200 4500 4300 4600 4700 4400 4800 4500 4900 5000 4600 5100 4700 5200 4800 5300 5400 4900 5500 5000 5600 5100 5700 5200 5800 5300 5900 5400 6000 5500 6100 5600 6200 6300 5700 6400 5800 6500 5900 6600 6000 6100 6700 6800 6200 6900 6300 7000 7100 6400 7200 6500 7300 6600 7400 6700 7500 6800 7600 6900 7700 7000 7800 7100 7900 7200 8000 7300 7400 8100 8200 7500 7600 8300 8400 7700 8500 7800 8600 7900 8700 8000 8800 8100 8900 8200 9000 9100 9200 8300 8400 9300 8500 9400 8600 9500 8700 9600 8800 9700 9800 8900 9900 9000 10000 10100 9100 10200 9200 10300 9300 10400 9400 10500 9500 10600 9600 10700 10800 9700 10900 9800 9900 11000 10000 11100 11200 10100 11300 10200 10300 11400 11500 10400 11600 11700 10500 11800 10600 11900 10700 12000 10800 12100 10900 12200 11000 11100 12300 12400 12500 11200 12600 11300 12700 11400 12800 11500 11600 12900 11700 13000 13100 11800 11900 13200 12000 13300 12100 12200 13400 13500 12300 12400 13600 13700 12500 13800 12600 12700 13900 12800 14000 12900 14100 14200 13000 14300 13100 14400 13200 14500 13300 13400 14600 14700 13500 14800 13600 14900 13700 15000 13800 15100 13900 15200 14000 14100 15300 14200 15400 14300 15500 14400 15600 14500 14600 15700 15800 14700 14800 15900 14900 16000 15000 16100 15100 16200 15200 16300 15300 16400 15400 15500 16500 15600 16600 16700 15700 16800 15800 16900 15900 17000 17100 16000 17200 16100 17300 16200 17400 16300 17500 16400 17600 16500 17700 16600 17800 16700 17900 16800 18000 18100 16900 18200 17000 18300 17100 18400 17200 18500 17300 18600 17400 18700 17500 18800 18900 17600 19000 
17700 19100 17800 19200 17900 18000 19300 18100 19400 18200 19500 18300 19600 18400 19700 19800 18500 19900 18600 20000 18700 20100 20200 18800 20300 18900 20400 19000 19100 20500 20600 19200 20700 19300 19400 20800 19500 20900 19600 19700 21000 19800 21100 19900 21200 20000 21300 20100 21400 20200 21500 20300 21600 21700 20400 21800 20500 20600 21900 20700 22000 22100 20800 22200 20900 22300 21000 22400 21100 22500 21200 22600 21300 22700 21400 22800 21500 22900 21600 23000 21700 23100 21800 23200 23300 21900 23400 22000 23500 22100 22200 23600 22300 23700 22400 23800 22500 23900 24000 22600 22700 22800 24100 24200 22900 24300 23000 24400 23100 24500 23200 23300 24600 23400 24700 23500 24800 23600 23700 24900 25000 23800 25100 25200 23900 24000 24100 25300 24200 25400 24300 25500 24400 25600 24500 25700 24600 25800 24700 25900 26000 24800 24900 26100 25000 26200 25100 25200 26300 26400 25300 26500 25400 26600 25500 26700 25600 26800 25700 25800 26900 25900 27000 27100 26000 26100 27200 26200 27300 26300 27400 26400 27500 26500 27600 26600 27700 26700 27800 26800 27900 28000 26900 28100 27000 28200 27100 28300 27200 28400 27300 28500 27400 28600 28700 27500 28800 28900 27600 29000 27700 29100 27800 29200 27900 29300 28000 29400 28100 29500 28200 29600 28300 29700 29800 28400 29900 28500 30000 28600 28700 30100 28800 30200 28900 30300 29000 30400 29100 30500 29200 30600 30700 29300 30800 29400 30900 29500 31000 29600 31100 29700 31200 29800 31300 29900 30000 31400 31500 30100 30200 30300 31600 30400 31700 31800 30500 31900 32000 30600 30700 32100 30800 32200 30900 32300 31000 32400 32500 31100 32600 32700 31200 32800 31300 32900 31400 31500 33000 33100 31600 33200 31700 33300 31800 33400 31900 33500 33600 32000 33700 32100 32200 32300 33800 32400 33900 32500 32600 32700 34000 32800 34100 34200 32900 34300 33000 33100 34400 33200 33300 34500 33400 34600 33500 33600 34700 33700 34800 33800 34900 33900 35000 34000 35100 34100 35200 34200 34300 35300 35400 34400 35500 
35600 34500 35700 34600 35800 34700 35900 34800 36000 34900 36100 36200 35000 35100 36300 35200 35300 36400 35400 35500 36500 35600 36600 35700 36700 35800 36800 35900 36900 36000 37000 36100 37100 36200 37200 36300 37300 37400 36400 36500 37500 37600 36600 36700 37700 36800 37800 36900 37900 38000 37000 38100 37100 37200 38200 37300 38300 37400 38400 37500 38500 37600 38600 37700 38700 37800 38800 37900 38900 39000 38000 39100 38100 39200 38200 38300 39300 38400 39400 38500 39500 38600 39600 38700 39700 38800 39800 38900 39900 39000 40000 39100 40100 40200 39200 39300 40300 39400 40400 40500 39500 40600 39600 39700 40700 39800 40800 39900 40000 40900 40100 41000 40200 41100 40300 41200 40400 40500 41300 40600 41400 40700 41500 40800 41600 40900 41700 41000 41800 41100 41900 41200 42000 41300 42100 41400 42200 41500 42300 41600 42400 42500 41700 42600 41800 42700 41900 42800 42000 42100 42900 42200 43000 43100 42300 43200 42400 42500 43300 42600 42700 43400 42800 43500 43600 42900 43700 43000 43800 43100 43200 43900 44000 43300 44100 43400 44200 43500 44300 44400 43600 44500 43700 44600 43800 44700 43900 44800 44000 44900 44100 45000 44200 45100 44300 45200 44400 45300 44500 45400 44600 45500 44700 45600 44800 45700 45800 44900 45000 45900 45100 46000 46100 45200 46200 45300 46300 45400 46400 45500 46500 45600 46600 45700 46700 45800 46800 45900 46900 47000 46000 47100 46100 47200 46200 47300 46300 47400 46400 47500 46500 47600 46600 47700 46700 47800 47900 46800 48000 48100 46900 47000 48200 47100 48300 47200 48400 47300 47400 48500 47500 48600 47600 48700 47700 48800 47800 48900 47900 49000 49100 48000 49200 48100 48200 49300 49400 48300 48400 49500 49600 48500 49700 48600 49800 49900 48700 50000 48800 50100 50200 48900 49000 50300 49100 50400 49200 49300 50500 50600 49400 49500 50700 49600 50800 50900 49700 49800 51000 49900 51100 50000 51200 50100 51300 51400 50200 51500 50300 51600 51700 50400 51800 50500 51900 50600 50700 52000 50800 52100 50900 52200 52300 
51000 52400 51100 52500 51200 52600 51300 52700 51400 52800 52900 51500 53000 51600 53100 53200 51700 53300 53400 51800 53500 51900 52000 53600 52100 53700 52200 53800 52300 53900 54000 52400 54100 54200 52500 54300 52600 52700 54400 52800 54500 52900 54600 53000 54700 53100 53200 54800 54900 53300 55000 53400 55100 53500 53600 55200 53700 55300 53800 55400 53900 55500 54000 55600 55700 54100 55800 54200 54300 54400 55900 54500 54600 56000 56100 54700 56200 54800 56300 56400 54900 56500 55000 56600 55100 56700 56800 55200 55300 56900 55400 57000 57100 55500 55600 57200 55700 55800 57300 55900 57400 57500 56000 56100 57600 57700 56200 57800 56300 57900 56400 56500 58000 56600 58100 56700 58200 56800 58300 56900 57000 58400 57100 58500 57200 58600 57300 58700 57400 58800 57500 57600 58900 57700 59000 57800 59100 57900 59200 58000 58100 59300 58200 59400 58300 59500 58400 59600 58500 59700 58600 59800 58700 59900 58800 60000 58900 60100 60200 59000 60285 59100 59200 59300 59400 59500 59600 59700 59800 59900 60000 60100 60200 60283

Length of filtered alignment: 1326
Number of columns removed: 48674
Length of the original alignment: 50000
Number of sequences used to construct filter: 120568

Output File Names:
Antarctic1.filter
Antarctic1.trim.unique.good.filter.fasta


mothur > get.current()
Current files saved by mothur: fasta=Antarctic1.trim.unique.good.filter.fasta group=mergegroupsgood name=Antarctic1.trim.good.names processors=2

mothur > unique.seqs(fasta=current, name=current)

Using Antarctic1.trim.unique.good.filter.fasta as input file for the fasta parameter.
Using Antarctic1.trim.good.names as input file for the name parameter.
1000 984
2000 1976
3000 2962
4000 3941
5000 4917
6000 5896
7000 6880
8000 7864
9000 8856
10000 9838
11000 10815
12000 11797
13000 12776
14000 13753
15000 14742
16000 15728
17000 16708
18000 17680
19000 18657
20000 19635
21000 20612
22000 21589
23000 22566
24000 23545
25000 24517
26000 25496
27000 26473
28000 27455
29000 28419
30000 29396
31000 30368
32000 31344
33000 32317
34000 33297
35000 34277
36000 35257
37000 36233
38000 37205
39000 38182
40000 39163
41000 40146
42000 41123
43000 42100
44000 43083
45000 44053
46000 45032
47000 46017
48000 46985
49000 47960
50000 48931
51000 49902
52000 50881
53000 51861
54000 52836
55000 53814
56000 54794
57000 55772
58000 56752
59000 57735
60000 58716
61000 59691
62000 60676
63000 61655
64000 62634
65000 63596
66000 64570
67000 65549
68000 66515
69000 67491
70000 68463
71000 69437
72000 70413
73000 71381
74000 72365
75000 73341
76000 74320
77000 75301
78000 76280
79000 77256
80000 78227
81000 79207
82000 80181
83000 81156
84000 82133
85000 83103
86000 84076
87000 85049
88000 86023
89000 86998
90000 87960
91000 88937
92000 89906
93000 90885
94000 91861
95000 92838
96000 93806
97000 94789
98000 95756
99000 96728
100000 97710
101000 98675
102000 99658
103000 100631
104000 101603
105000 102580
106000 103553
107000 104522
108000 105503
109000 106482
110000 107461
111000 108441
112000 109414
113000 110393
114000 111363
115000 112338
116000 113313
117000 114284
118000 115262
119000 116242
120000 117231
120568 117783

Output File Names:
Antarctic1.trim.unique.good.filter.names
Antarctic1.trim.unique.good.filter.unique.fasta


mothur > summary.seqs()

Using Antarctic1.trim.unique.good.filter.unique.fasta as input file for the fasta parameter.

Using 2 processors.
[WARNING]: This command can take a namefile and you did not provide one. The current namefile is Antarctic1.trim.unique.good.filter.names which seems to match Antarctic1.trim.unique.good.filter.unique.fasta.

             Start     End       NBases    Ambigs  Polymer  NumSeqs
Minimum:     1         1326      414       0       3        1
2.5%-tile:   1         1326      467       0       3        2945
25%-tile:    1         1326      477       0       4        29446
Median:      1         1326      480       0       5        58892
75%-tile:    1         1326      482       0       5        88338
97.5%-tile:  1         1326      504       0       7        114839
Maximum:     2         1326      560       0       8        117783
Mean:        1.00002   1326      480.652   0       4.56929

# of Seqs:   117783

Output File Names:
Antarctic1.trim.unique.good.filter.unique.summary


mothur > get.current()
Current files saved by mothur: fasta=Antarctic1.trim.unique.good.filter.unique.fasta group=mergegroupsgood name=Antarctic1.trim.unique.good.filter.names processors=2

mothur > pre.cluster(fasta=Antarctic1.trim.unique.good.filter.unique.fasta, name=Antarctic1.trim.unique.good.filter.nmaes, group=mergegroupsgood, diffs=2)

Unable to open Antarctic1.trim.unique.good.filter.nmaes

Using 2 processors.
[WARNING]: This command can take a namefile and you did not provide one. The current namefile is Antarctic1.trim.unique.good.filter.names which seems to match Antarctic1.trim.unique.good.filter.unique.fasta.
[ERROR]: did not complete pre.cluster.

mothur > pre.cluster(fasta=Antarctic1.trim.unique.good.filter.unique.fasta, name=Antarctic1.trim.unique.good.filter.names, group=mergegroupsgood, diffs=2)


Using 2 processors.
missing name m130215_173950_42157_c100458282550000001523065103201350_s1_p0/10022
missing name m130215_173950_42157_c100458282550000001523065103201350_s1_p0/10101

etc etc for many sequences…

[ERROR]: Your name file contains 182091 valid sequences, and your groupfile contains 191161, please correct.

/******************************************/
Running command: unique.seqs(fasta=Antarctic1.trim.unique.good.filter.unique.precluster.fasta, name=Antarctic1.trim.unique.good.filter.unique.precluster.names)
[ERROR]: Antarctic1.trim.unique.good.filter.unique.precluster.fasta is blank, aborting.
Using Antarctic1.trim.unique.good.filter.unique.fasta as input file for the fasta parameter.
[ERROR]: Antarctic1.trim.unique.good.filter.unique.precluster.names is blank, aborting.
/******************************************/
Segmentation fault

How did you create the groups file?

Hi,
Thanks so much for getting back to me. I had eight independent files that I merged and then made a group file. Here’s how I did it:

  1. I had 8 files, each containing two barcodes. For each file I ran trim.seqs, e.g. trim.seqs(fasta=RU1.fasta, oligos=specific 2 barcodes for each file), to create a group file for the two barcodes in that file.
  2. Then I ran split.groups to separate the two barcodes in each file, e.g. split.groups(fasta=RU1.trim.fasta, group=RU1.groups). This split RU1 into RU1.trim.St24NoFe.fasta and RU1.trim.St24Fe.fasta.
  3. After doing this for all 8 files, I used make.group to make one group file covering all the sequences (16 groups).
  4. Finally, I used merge.files to merge the 16 files that split.groups had produced.
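
For readers unfamiliar with the file formats, the result of steps 3 and 4 can be sketched in Python. This is only an illustration and not mothur's own code: the function names and the example file names are made up. It assumes that a group file is a two-column, tab-separated list assigning each sequence name to a group, and that the merged fasta file is the concatenation of the per-group files.

```python
from pathlib import Path


def fasta_names(path):
    """Yield the sequence name (first word after '>') of every FASTA record."""
    with open(path) as handle:
        for line in handle:
            if line.startswith(">"):
                yield line[1:].split()[0]


def build_group_and_merge(samples, group_path, merged_path):
    """Write a group file and a merged FASTA file.

    `samples` maps a group label to the FASTA file holding its reads
    (hypothetical input, e.g. {"St24Fe": "RU1.trim.St24Fe.fasta"}).
    Returns the number of sequences written to the group file.
    """
    count = 0
    with open(group_path, "w") as groups, open(merged_path, "w") as merged:
        for label, fasta in samples.items():
            # One "name<TAB>group" line per sequence.
            for name in fasta_names(fasta):
                groups.write(f"{name}\t{label}\n")
                count += 1
            # Append the file itself, making sure it ends with a newline
            # so that records from different files never run together.
            text = Path(fasta).read_text()
            if text and not text.endswith("\n"):
                text += "\n"
            merged.write(text)
    return count
```

The point of the sketch is that the group file is built from the sequences present at that moment. Any later command that removes sequences from the fasta file without also being given the group file leaves the two out of step.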

After that I followed the 454 SOP, skipping the shhh.flows step and going straight to trim.seqs with maxhomop=8, minlength=413 and maxlength=650. I am not using quality scores.

Thanks so much for any help you may be able to give me!
Bethan

When you ran trim.seqs(fasta=current, maxhomop=8, minlength=413, maxlength=650, flip=T) mothur removed some sequences from the fasta file (201552 sequences went in and 192482 came out), but they remained in your group file. To correct the problem, run the following:

list.seqs(name=Antarctic1.trim.unique.good.filter.names)
get.seqs(accnos=current, group=mergegroupsgood)

Then proceed with the pre.cluster command, giving it the group file that get.seqs writes in place of mergegroupsgood.
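
If you want to see what that fix does, or check the two files outside mothur, here is a minimal Python sketch of the same idea. It is not how mothur implements list.seqs or get.seqs, and the function names are made up for illustration. It assumes the usual layouts: each line of a name file holds a representative name followed by a comma-separated list of every name it stands for, and each line of a group file holds a sequence name followed by its group.

```python
def names_in_name_file(path):
    """Return every sequence name listed in the second column of a name file."""
    names = set()
    with open(path) as handle:
        for line in handle:
            parts = line.split()
            if len(parts) >= 2:
                names.update(parts[1].split(","))
    return names


def filter_group_file(name_path, group_path, out_path):
    """Copy to out_path only the group-file lines whose sequence name
    also appears in the name file.

    Returns (kept, dropped) so the size of the mismatch is visible.
    """
    keep = names_in_name_file(name_path)
    kept = dropped = 0
    with open(group_path) as src, open(out_path, "w") as dst:
        for line in src:
            fields = line.split()
            if not fields:
                continue  # skip blank lines
            if fields[0] in keep:
                dst.write(line if line.endswith("\n") else line + "\n")
                kept += 1
            else:
                dropped += 1
    return kept, dropped
```

For example, filter_group_file("Antarctic1.trim.unique.good.filter.names", "mergegroupsgood", "mergegroupsgood.filtered") would write a reduced group file; the output file name is arbitrary. The dropped count shows how many group-file entries no longer have a matching sequence.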

Thank you SO much, I’ll give that a go. Apologies for my late reply: I finished one postdoc, started another, and moved from NJ to OR, which put this on hiatus. I’m back on the 18S analysis now, and I appreciate your help.

Bethan

It worked! THANK YOU!