align.seqs process killed

Hi All,

I am working to align 16S V4 rRNA gene ASVs. I ran pcr.seqs successfully, as shown below, and I also ran summary.seqs on my ASVs.fa file to confirm that it looks reasonable. This is not my first time processing data this way, but for some reason align.seqs starts to read in the reference sequences and then quits. I have tried adjusting the number of processors (from 2 to 16), but that did not solve the issue. What else could I do to diagnose the problem? Could it be a memory/RAM issue?

input:

pcr.seqs(fasta=silva.nr_v138_2.align, start=11894, end=25319, keepdots=F, processors=16)

output:

[NOTE]: no sequences were bad, removing silva.nr_v138_2.bad.accnos

It took 11 secs to screen 164296 sequences.

Output File Names:
silva.nr_v138_2.pcr.align

input:

summary.seqs(fasta=silva.nr_v138_2.pcr.align )

output:

            Start   End     NBases  Ambigs  Polymer NumSeqs
Minimum:        1       9876    85      0       3       1
2.5%-tile:      1       13425   291     0       3       4108
25%-tile:       1       13425   293     0       4       41075
Median:         1       13425   293     0       5       82149
75%-tile:       1       13425   293     0       5       123223
97.5%-tile:     1       13425   460     1       6       160189
Maximum:        4227    13425   1521    5       20      164296
Mean:   1       13424   308     0       4

# of Seqs:      164296

It took 3 secs to summarize 164296 sequences.

input:

summary.seqs(fasta = ./Analysis/ASVs.fa)

output:

            Start   End     NBases  Ambigs  Polymer NumSeqs
Minimum:        1       220     220     0       3       1
2.5%-tile:      1       220     220     0       3       157
25%-tile:       1       252     252     0       4       1568
Median:         1       253     253     0       5       3135
75%-tile:       1       253     253     0       6       4702
97.5%-tile:     1       254     254     0       8       6113
Maximum:        1       257     257     0       220     6269
Mean:   1       246     246     0       5

# of Seqs:      6269

It took 0 secs to summarize 6269 sequences.

input:

align.seqs(fasta = ./Analysis/ASVs.fa, reference= silva.nr_v138_2.pcr.align, processors =2)

output:

Using 2 processors.

Reading in the silva.nr_v138_2.pcr.align template sequences…  Killed

I am not able to reproduce the issue on our test machines. I suspect you are running out of RAM: the reference file is rather large, about 2.3 GB, even after trimming with pcr.seqs.

How much memory do you have?

Have you tried running the command with processors=1?

Also, what version of mothur are you running?
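
For anyone who wants to check the RAM hypothesis before rerunning, here is a rough sketch in Python. It assumes a Linux machine (it reads /proc/meminfo) and compares the size of the reference file on disk with the memory currently available. The helper names and the 1.5x headroom factor are made up for illustration and are not a mothur-documented requirement; the real footprint of align.seqs may well be larger than the file size. A bare "Killed" message usually comes from the operating system or a job scheduler rather than from mothur itself, so a comparison like this is only a first hint.

```python
import os


def parse_mem_available_kb(meminfo_text):
    """Return the MemAvailable value (in kB) from /proc/meminfo-style text.

    Returns None if the field is missing (e.g. on very old kernels).
    """
    for line in meminfo_text.splitlines():
        if line.startswith("MemAvailable:"):
            # Line looks like: "MemAvailable:   12345678 kB"
            return int(line.split()[1])
    return None


def reference_fits_in_ram(reference_bytes, available_kb, headroom=1.5):
    """Rough check: is available RAM at least `headroom` times the file size?

    `headroom` is an assumed safety factor for illustration only.
    """
    return reference_bytes * headroom <= available_kb * 1024


if __name__ == "__main__":
    reference = "silva.nr_v138_2.pcr.align"  # file name taken from the post above
    if os.path.exists(reference) and os.path.exists("/proc/meminfo"):
        with open("/proc/meminfo") as handle:
            available_kb = parse_mem_available_kb(handle.read())
        size_bytes = os.path.getsize(reference)
        print(f"Reference size: {size_bytes / 1e9:.2f} GB")
        if available_kb is None:
            print("MemAvailable not reported by this kernel.")
        else:
            print(f"Available RAM:  {available_kb * 1024 / 1e9:.2f} GB")
            if reference_fits_in_ram(size_bytes, available_kb):
                print("Available RAM exceeds the assumed headroom.")
            else:
                print("Likely too little RAM; consider requesting more.")
```

On a cluster, the number that matters is the memory granted to the job, which can be lower than what /proc/meminfo reports for the whole node.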

Hi @westcott

I confirmed that it was a memory issue! I had to increase my usage request, and that allowed it to work. Thanks!
