mothur

Run QIIME2 test data with Mothur


#1

Hello,

I am trying to run the QIIME2 moving-pictures tutorial data (qiime2-moving-pictures-tutorial) through the mothur pipeline. The data are single-end FASTQ reads of the V4 region, sequenced on the HiSeq platform.

I first ran

fastq.info(fastq=sequences.fastq)

followed by

trim.seqs(fasta=sequences.fasta,qual=sequences.qual,oligos=sample.oligos)

After this command I get empty trim, group, and qual files.

This is the link to the metadata file from qiime2-moving-pictures-tutorial:
https://data.qiime2.org/2018.11/tutorials/moving-pictures/sample_metadata.tsv

My oligos file looks like this:

forward GTGCCAGCMGCCGCGGTAA
#reverse GTGCCAGCMGCCGCGGTAA
barcode AGCTGACTAGTC L1S8
barcode ACACACTATGGC L1S57
barcode ACTACGTGTGGT L1S76
barcode AGTGCGATGCGT L1S105
barcode ACGATGCGACCA L2S155
barcode AGCTATCCACGA L2S175
barcode ATGCAGCTCAGT L2S204
barcode CACGTGACATGT L2S222
barcode ACAGTTGCGCGA L3S242
barcode CACGACAGGCTA L3S294
barcode AGTGTCACGGTG L3S313
barcode CAAGTGAGAGAG L3S341
barcode CATCGTATCAAC L3S360
barcode CAGTGTCAGGAC L5S104
barcode ATCTTAGACTGC L5S155
barcode CAGACATTGCGT L5S174
barcode CGATGCACCAGA L5S203
barcode CTAGAGACTCTT L5S222
barcode ATGGCAGCTCTA L1S140
barcode CTGAGATACGCG L1S208
barcode CCGACTGAGATG L1S257
barcode CCTCTCGTGATC L1S281
barcode CATATCGCAGTT L2S240
barcode CGTGCATTATCA L2S309
barcode CTAACGCAGTCA L2S357
barcode CTCAATGACTCA L2S382
barcode ATCGATCTGTGG L3S378
barcode CTCGTGGAGTAG L4S63
barcode GCGTTACACACA L4S112
barcode GAACTGTATCTC L4S137
barcode CTGGACTCATAG L5S240
barcode GAGGCTCATCAT L6S20
barcode GATACGTCCTGA L6S68
barcode GATTAGCACTCT L6S93

Can you help me figure out what is wrong with this oligos file?

Thank you


#2

Ugh, single read HiSeq data. About the only thing worse is IonTorrent data :slight_smile:

You need to look at the sequences and see if they start with the barcode sequences that you have. trim.seqs will first look for the barcode sequences and then the primer. It needs to be in that order with nothing in between.
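A quick way to do that check outside of mothur is sketched below in Python. It is only an illustration, not how trim.seqs is implemented: the function names are made up for this example, matching is exact (mothur's mismatch allowances are not modeled), and it assumes the oligos file has the `forward` and `barcode` lines shown in the first post.

```python
# IUPAC nucleotide codes mapped to the bases they stand for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def parse_oligos(lines):
    """Return (forward_primers, {barcode: sample}) from oligos-file lines."""
    primers, barcodes = [], {}
    for line in lines:
        fields = line.split()
        # Skip blank lines and commented-out entries such as "#reverse ...".
        if not fields or fields[0].startswith("#"):
            continue
        kind = fields[0].lower()
        if kind == "forward" and len(fields) >= 2:
            primers.append(fields[1].upper())
        elif kind == "barcode" and len(fields) >= 3:
            barcodes[fields[1].upper()] = fields[2]
    return primers, barcodes


def matches_at(read, pattern, start):
    """True if the (possibly degenerate) pattern matches the read at start."""
    segment = read[start:start + len(pattern)]
    if len(segment) < len(pattern):
        return False
    return all(base in IUPAC.get(code, "") for base, code in zip(segment, pattern))


def classify_read(read, primers, barcodes):
    """Return the sample name, or 'no_barcode' / 'no_primer' if the read fails."""
    read = read.upper()
    for barcode, sample in barcodes.items():
        if read.startswith(barcode):
            # The primer has to follow the barcode with nothing in between.
            if any(matches_at(read, p, len(barcode)) for p in primers):
                return sample
            return "no_primer"
    return "no_barcode"
```

If nearly every read comes back as `no_barcode`, the barcodes are probably not at the start of the reads at all, and editing the oligos file will not help.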

Also, once you run it through trim.seqs, you can look at the contents of the scrap.fasta file and see what the letters are after the | character in the lines that start with >. A b means it was a barcode issue; an f means it couldn’t find the forward primer.
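To get an overview instead of reading headers one by one, the codes can be tallied with a few lines of Python. This is a minimal sketch that assumes each header looks like `>name|code`, possibly followed by extra text; the function name is hypothetical and nothing here is part of mothur itself.

```python
from collections import Counter
import re


def scrap_codes(lines):
    """Count the failure codes that follow '|' in scrap.fasta header lines."""
    counts = Counter()
    for line in lines:
        if not line.startswith(">"):
            continue  # sequence line, not a header
        tokens = line[1:].split()
        if not tokens or "|" not in tokens[0]:
            continue
        # Keep only the letters right after the last '|' in the name field.
        tail = tokens[0].rsplit("|", 1)[1]
        match = re.match(r"[A-Za-z]+", tail)
        if match:
            counts[match.group(0)] += 1
    return counts
```

Passing an open file handle works, for example `scrap_codes(open("sequences.scrap.fasta"))`, where the file name is a guess at what trim.seqs wrote for your input.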


#3

Hi Pat,

Thank you for the reply.
Before making the oligos file I tried to grep for the barcodes and primers in the reads, but I could not find them.
I also do not understand how these samples could have been multiplexed.

The first record in my file looks like this:

`HWI-EAS440_0386_1_23_17547_1423#0/1|b(b) fbdiffs=1000(noMatch), rbdiffs=1000(noMatch)`
`TACGNAGGATCCGAGCGTTATCCGGATTTATTGGGTTTAAAGGGAGCGTAGATGGATGTTTAAGTCAGTTGTGAAAGTTTGCGGCTCAACCGTAAAATTGCAGTTGATACTGGATATCTTGAGTGCAGTTGAGGCAGGGGGGGATTGGTGTG`

I also tried flip=T to reverse complement the reads, but all of the data still goes to the scrap file.

And sorry to bother you during the holidays :disappointed_relieved:

Thank you in advance


#5

I don’t think the index (barcode) sequences are in your reads; I think they were generated as a separate sequencing read that is used to demultiplex the samples. The sequence you’ve posted also doesn’t seem to contain the primer. Perhaps that was removed too before they posted the sequence data? Is it possible to get the raw fastq files?
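If the index reads do turn out to be available, the idea can be sketched in Python as follows. Everything here is an assumption for illustration: that the index reads come as a separate FASTQ with records in the same order and with the same read IDs as the sequence reads, that the barcodes in the oligos file match the index reads exactly and in the same orientation, and the function names themselves.

```python
from itertools import islice


def fastq_records(lines):
    """Yield (read_id, sequence) pairs from FASTQ lines (4 lines per record)."""
    it = iter(lines)
    while True:
        chunk = list(islice(it, 4))
        if len(chunk) < 4:
            return
        header, seq = chunk[0].strip(), chunk[1].strip()
        # Drop the leading '@' and any description after the first space.
        yield header[1:].split()[0], seq


def assign_samples(index_lines, read_lines, barcodes):
    """Map each read ID to a sample name using the paired index read."""
    assignments = {}
    pairs = zip(fastq_records(index_lines), fastq_records(read_lines))
    for (idx_id, idx_seq), (read_id, _seq) in pairs:
        if idx_id != read_id:
            raise ValueError(f"index/read order mismatch: {idx_id} vs {read_id}")
        assignments[read_id] = barcodes.get(idx_seq.upper(), "unassigned")
    return assignments
```

Here `barcodes` is a dictionary from barcode sequence to sample name, built from the `barcode` lines of the oligos file. If most reads come back `unassigned`, the barcodes may need to be reverse complemented first.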

Pat