Hi Pat,
The sequences are of the V1 region, amplified with the 27F and 519R primers. I have a fasta file provided by my vendor in which the contigs have already been assembled, so I didn’t run make.contigs.
Should I run filter.seqs?
According to my provider, the DNA library was prepared as follows:
The 16S rRNA gene V1 variable region PCR primers 27F/519R with barcode on the forward primer were used in a 30 cycle PCR using the HotStarTaq Plus Master Mix Kit (Qiagen, USA) under the following conditions: 94°C for 3 minutes, followed by 28 cycles of 94°C for 30 seconds, 53°C for 40 seconds and 72°C for 1 minute, after which a final elongation step at 72°C for 5 minutes was performed. After amplification, PCR products are checked in 2% agarose gel to determine the success of amplification and the relative intensity of bands. Multiple samples are pooled together (e.g., 100 samples) in equal proportions based on their molecular weight and DNA concentrations. Pooled samples are purified using calibrated Ampure XP beads. Then the pooled and purified PCR product is used to prepare DNA library by following Illumina TruSeq DNA library preparation protocol. Sequencing was performed at MR DNA (www.mrdnalab.com, Shallowater, TX, USA) on a MiSeq following the manufacturer’s guidelines. Sequence data were processed using a proprietary analysis pipeline (MR DNA, Shallowater, TX, USA). In summary, sequences were depleted of barcodes then sequences <150bp removed, sequences with ambiguous base calls removed. Sequences were denoised, OTUs generated and chimeras removed. Operational taxonomic units (OTUs) were defined by clustering at 3% divergence (97% similarity). Final OTUs were taxonomically classified using BLASTn against a curated GreenGenes database (DeSantis et al 2006).
The log of the partial job is below (I removed the numbers to stay within the character limit here):
mothur > pcr.seqs(fasta=/nfs/16/osu8334/Nramp_full_mothur/silva.bacteria.fasta, start=1044, end=13127, keepdots=F)
Output File Names:
/nfs/16/osu8334/Nramp_full_mothur/silva.bacteria.pcr.fasta
It took 26 secs to screen 14956 sequences.
mothur > trim.seqs(fasta=/nfs/16/osu8334/Nramp_full_mothur/031914BA27Fillumina_full.fasta, oligos=/nfs/16/osu8334/Nramp_full_mothur/031914BA27Fillumina.oligos)
Group count:
N107 211158
N115 169089
N118 276253
N119 209827
N12 263496
N3 242268
N37 164808
N54 184111
N56 183022
N97 229924
P13 219475
P18 195305
P23 209718
P36 227669
P46 228622
P50 206962
P61 208723
P75 229255
P79 198388
P9 216692
Total of all groups is 4274765
Output File Names:
/nfs/16/osu8334/Nramp_full_mothur/031914BA27Fillumina_full.trim.fasta
/nfs/16/osu8334/Nramp_full_mothur/031914BA27Fillumina_full.scrap.fasta
/nfs/16/osu8334/Nramp_full_mothur/031914BA27Fillumina_full.groups
[WARNING]: your sequence names contained ‘:’. I changed them to ‘_’ to avoid problems in your downstream analysis.
mothur > screen.seqs(fasta=current, maxambig=0, maxlength=520, minlength=400)
Using /nfs/16/osu8334/Nramp_full_mothur/031914BA27Fillumina_full.trim.fasta as input file for the fasta parameter.
Output File Names:
/nfs/16/osu8334/Nramp_full_mothur/031914BA27Fillumina_full.trim.good.fasta
/nfs/16/osu8334/Nramp_full_mothur/031914BA27Fillumina_full.trim.bad.accnos
It took 9655 secs to screen 4274765 sequences.
mothur > unique.seqs()
Using /nfs/16/osu8334/Nramp_full_mothur/031914BA27Fillumina_full.trim.good.fasta as input file for the fasta parameter.
Output File Names:
/nfs/16/osu8334/Nramp_full_mothur/031914BA27Fillumina_full.trim.good.names
/nfs/16/osu8334/Nramp_full_mothur/031914BA27Fillumina_full.trim.good.unique.fasta
mothur > count.seqs(name=current, group=current)
Using /nfs/16/osu8334/Nramp_full_mothur/031914BA27Fillumina_full.groups as input file for the group parameter.
Using /nfs/16/osu8334/Nramp_full_mothur/031914BA27Fillumina_full.trim.good.names as input file for the name parameter.
Using 1 processors.
[ERROR]: processes reported processing 4257321 sequences, but group file indicates you have 4274765 sequences. Could you have a file mismatch?
It took 58 secs to create a table for 4257321 sequences.
Total number of sequences: 4257321
Output File Names:
/nfs/16/osu8334/Nramp_full_mothur/031914BA27Fillumina_full.trim.good.count_table
mothur > align.seqs(fasta=current, reference=/nfs/16/osu8334/Nramp_full_mothur/silva.bacteria.pcr.fasta)
Using /nfs/16/osu8334/Nramp_full_mothur/031914BA27Fillumina_full.trim.good.unique.fasta as input file for the fasta parameter.
Using 1 processors.
Reading in the /nfs/16/osu8334/Nramp_full_mothur/silva.bacteria.pcr.fasta template sequences… DONE.
It took 35 to read 14956 sequences.
Aligning sequences from /nfs/16/osu8334/Nramp_full_mothur/031914BA27Fillumina_full.trim.good.unique.fasta …
Total number of sequences before pre.cluster was 134769.
pre.cluster removed 42743 sequences.
It took 14110 secs to cluster 134769 sequences.
Processing group N56:
Total number of sequences before pre.cluster was 132889.
pre.cluster removed 42127 sequences.
It took 13728 secs to cluster 132889 sequences.
Processing group N97:
Total number of sequences before pre.cluster was 157012.
pre.cluster removed 53260 sequences.
It took 19281 secs to cluster 157012 sequences.
Processing group P13:
Total number of sequences before pre.cluster was 152434.
pre.cluster removed 51350 sequences.
It took 16888 secs to cluster 152434 sequences.
Processing group P18:
Total number of sequences before pre.cluster was 132299.
pre.cluster removed 42853 sequences.
It took 13041 secs to cluster 132299 sequences.
Processing group P23:
Total number of sequences before pre.cluster was 146093.
pre.cluster removed 49103 sequences.
It took 15304 secs to cluster 146093 sequences.
Processing group P36:
Total number of sequences before pre.cluster was 155034.
pre.cluster removed 52782 sequences.
It took 18472 secs to cluster 155034 sequences.
Processing group P46:
Total number of sequences before pre.cluster was 160452.
pre.cluster removed 52945 sequences.
It took 19876 secs to cluster 160452 sequences.
Processing group P50:
Total number of sequences before pre.cluster was 140916.
pre.cluster removed 47194 sequences.
It took 14511 secs to cluster 140916 sequences.
Processing group P61:
Total number of sequences before pre.cluster was 137797.
pre.cluster removed 45009 sequences.
It took 14363 secs to cluster 137797 sequences.
Processing group P75:
Total number of sequences before pre.cluster was 158641.
pre.cluster removed 53460 sequences.
It took 19205 secs to cluster 158641 sequences.
Processing group P79:
Total number of sequences before pre.cluster was 136129.
pre.cluster removed 46539 sequences.
It took 13794 secs to cluster 136129 sequences.
Processing group P9:
Total number of sequences before pre.cluster was 155773.
pre.cluster removed 49893 sequences.
It took 18359 secs to cluster 155773 sequences.
It took 340013 secs to run pre.cluster.
Output File Names:
/nfs/16/osu8334/Nramp_full_mothur/031914BA27Fillumina_full.trim.good.unique.precluster.align
/nfs/16/osu8334/Nramp_full_mothur/031914BA27Fillumina_full.trim.good.unique.precluster.count_table
/nfs/16/osu8334/Nramp_full_mothur/031914BA27Fillumina_full.trim.good.unique.precluster.N107.map
/nfs/16/osu8334/Nramp_full_mothur/031914BA27Fillumina_full.trim.good.unique.precluster.N115.map
/nfs/16/osu8334/Nramp_full_mothur/031914BA27Fillumina_full.trim.good.unique.precluster.N118.map
/nfs/16/osu8334/Nramp_full_mothur/031914BA27Fillumina_full.trim.good.unique.precluster.N119.map
/nfs/16/osu8334/Nramp_full_mothur/031914BA27Fillumina_full.trim.good.unique.precluster.N12.map
/nfs/16/osu8334/Nramp_full_mothur/031914BA27Fillumina_full.trim.good.unique.precluster.N3.map
/nfs/16/osu8334/Nramp_full_mothur/031914BA27Fillumina_full.trim.good.unique.precluster.N37.map
/nfs/16/osu8334/Nramp_full_mothur/031914BA27Fillumina_full.trim.good.unique.precluster.N54.map
/nfs/16/osu8334/Nramp_full_mothur/031914BA27Fillumina_full.trim.good.unique.precluster.N56.map
/nfs/16/osu8334/Nramp_full_mothur/031914BA27Fillumina_full.trim.good.unique.precluster.N97.map
/nfs/16/osu8334/Nramp_full_mothur/031914BA27Fillumina_full.trim.good.unique.precluster.P13.map
/nfs/16/osu8334/Nramp_full_mothur/031914BA27Fillumina_full.trim.good.unique.precluster.P18.map
/nfs/16/osu8334/Nramp_full_mothur/031914BA27Fillumina_full.trim.good.unique.precluster.P23.map
/nfs/16/osu8334/Nramp_full_mothur/031914BA27Fillumina_full.trim.good.unique.precluster.P36.map
/nfs/16/osu8334/Nramp_full_mothur/031914BA27Fillumina_full.trim.good.unique.precluster.P46.map
/nfs/16/osu8334/Nramp_full_mothur/031914BA27Fillumina_full.trim.good.unique.precluster.P50.map
/nfs/16/osu8334/Nramp_full_mothur/031914BA27Fillumina_full.trim.good.unique.precluster.P61.map
/nfs/16/osu8334/Nramp_full_mothur/031914BA27Fillumina_full.trim.good.unique.precluster.P75.map
/nfs/16/osu8334/Nramp_full_mothur/031914BA27Fillumina_full.trim.good.unique.precluster.P79.map
/nfs/16/osu8334/Nramp_full_mothur/031914BA27Fillumina_full.trim.good.unique.precluster.P9.map
mothur > unique.seqs(fasta=current, count=current)
Using /nfs/16/osu8334/Nramp_full_mothur/031914BA27Fillumina_full.trim.good.unique.precluster.count_table as input file for the count parameter.
Using /nfs/16/osu8334/Nramp_full_mothur/031914BA27Fillumina_full.trim.good.unique.precluster.align as input file for the fasta parameter.
Output File Names:
/nfs/16/osu8334/Nramp_full_mothur/031914BA27Fillumina_full.trim.good.unique.precluster.unique.count_table
/nfs/16/osu8334/Nramp_full_mothur/031914BA27Fillumina_full.trim.good.unique.precluster.unique.align
mothur > pre.cluster(fasta=current, count=current, diffs=2)
Using /nfs/16/osu8334/Nramp_full_mothur/031914BA27Fillumina_full.trim.good.unique.precluster.unique.count_table as input file for the count parameter.
Using /nfs/16/osu8334/Nramp_full_mothur/031914BA27Fillumina_full.trim.good.unique.precluster.unique.align as input file for the fasta parameter.
Using 1 processors.
Processing group N107:
Total number of sequences before pre.cluster was 103240.
pre.cluster removed 0 sequences.
It took 14572 secs to cluster 103240 sequences.
Processing group N115:
Total number of sequences before pre.cluster was 85551.
pre.cluster removed 0 sequences.
It took 9641 secs to cluster 85551 sequences.
Processing group N118:
Total number of sequences before pre.cluster was 117571.
pre.cluster removed 0 sequences.
It took 20886 secs to cluster 117571 sequences.
Processing group N119:
Total number of sequences before pre.cluster was 103824.
pre.cluster removed 0 sequences.
It took 14865 secs to cluster 103824 sequences.
Processing group N12:
Total number of sequences before pre.cluster was 112216.
pre.cluster removed 0 sequences.
It took 20972 secs to cluster 112216 sequences.
Processing group N3:
Total number of sequences before pre.cluster was 107956.
pre.cluster removed 0 sequences.
It took 18113 secs to cluster 107956 sequences.
Processing group N37:
Total number of sequences before pre.cluster was 87336.
pre.cluster removed 0 sequences.
It took 11805 secs to cluster 87336 sequences.
Processing group N54:
Total number of sequences before pre.cluster was 92026.
pre.cluster removed 0 sequences.
It took 13490 secs to cluster 92026 sequences.
Processing group N56:
Total number of sequences before pre.cluster was 90762.
pre.cluster removed 0 sequences.
It took 13269 secs to cluster 90762 sequences.
Processing group N97:
Total number of sequences before pre.cluster was 103752.
pre.cluster removed 0 sequences.
It took 18458 secs to cluster 103752 sequences.
Processing group P13:
Total number of sequences before pre.cluster was 101084.
pre.cluster removed 0 sequences.
It took 15600 secs to cluster 101084 sequences.
Processing group P18:
Total number of sequences before pre.cluster was 89446.
pre.cluster removed 0 sequences.
It took 12540 secs to cluster 89446 sequences.
Processing group P23:
Total number of sequences before pre.cluster was 96990.
pre.cluster removed 0 sequences.
It took 14479 secs to cluster 96990 sequences.
Processing group P36:
=>> PBS: job killed: walltime 604815 exceeded limit 604800
Resources requested:
mem=48gb
nodes=1:ppn=12
Resources used:
cput=162:44:48
walltime=168:00:16
mem=46.249 GB
vmem=46.480 GB
Resource units charged (estimate):
201.605 RUs
Estimated RU charges under proposed new accounting policy:
201.605 RUs
See http://osc.edu/memcharging for more information.
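
In case it helps to see where the walltime went, here is a minimal Python sketch that tallies the per-group pre.cluster timings from a log like the one above. The `tally_precluster` helper and the `LOG` excerpt are only for illustration (they are not part of mothur); the two groups shown are copied from the second pre.cluster run, and the full log would need to be pasted in to total everything.

```python
import re

# Excerpt copied from the second pre.cluster run in the log above.
# Paste the complete log here to tally every group.
LOG = """\
Processing group N107:
Total number of sequences before pre.cluster was 103240.
pre.cluster removed 0 sequences.
It took 14572 secs to cluster 103240 sequences.
Processing group N115:
Total number of sequences before pre.cluster was 85551.
pre.cluster removed 0 sequences.
It took 9641 secs to cluster 85551 sequences.
"""


def tally_precluster(log_text):
    """Return {group: (seconds, sequences)} for each finished pre.cluster group."""
    results = {}
    group = None
    for line in log_text.splitlines():
        header = re.match(r"Processing group (\S+):", line)
        if header:
            group = header.group(1)
            continue
        timing = re.match(r"It took (\d+) secs to cluster (\d+) sequences\.", line)
        if timing and group is not None:
            results[group] = (int(timing.group(1)), int(timing.group(2)))
            group = None  # wait for the next "Processing group" header
    return results


per_group = tally_precluster(LOG)
total_secs = sum(secs for secs, _ in per_group.values())

for name, (secs, seqs) in per_group.items():
    print(f"{name}: {secs} secs for {seqs} sequences")
print(f"total: {total_secs} secs ({total_secs / 3600:.1f} h)")
```

Since the log reports `Using 1 processors.` while the job requested ppn=12, a tally like this makes it easy to compare the time spent per group against the 604800-second walltime limit.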