mothur

summary.seqs: more than 25% of the bases reported as ambiguous


#1

Can you help me?

I have to run an analysis with the MiSeq SOP commands.

I only have the forward file (R1_001.fasta) for the analysis, and I’m having trouble running summary.seqs. Here is what I did.

1- Convert the fastq files to fasta files
fastq.info(fastq=file), which gave me 2 files (.fasta and .qual)

2- Create 2 groups
Group 1:
make.group(fasta=01_R1_001.fasta-02_R1_001.fasta, groups=T1_1-T1_2), which gave me 1 group file that I renamed group1
Group 2:
make.group(fasta=01_R1_001.fasta-02_R1_001.fasta, groups=T2_1-T2_2), which gave me 1 group file that I renamed group2

3- Merge the groups into a single file
merge.files(input=group1-group2, output=file), which gave me one file (.file)

4- This is where I have the problem. Now I want to summarize:
summary.seqs(fasta=file), which reports the following message:

Using 4 processors.
[WARNING]: We found more than 25% of the bases in sequence 02526_121_000000000-BBCTW_1_1101_18573_2070 to be ambiguous. Mothur is not setup to process protein sequences.

             Start  End      NBases   Ambigs   Polymer  NumSeqs
Minimum:     1      6463147  6327110  5918999  24       1
2.5%-tile:   1      6463147  6327110  5918999  24       1
25%-tile:    1      6463147  6327110  5918999  24       1
Median:      1      6463147  6327110  5918999  24       1
75%-tile:    1      6463147  6327110  5918999  24       1
97.5%-tile:  1      6463147  6327110  5918999  24       1
Maximum:     1      6463147  6327110  5918999  24       1
Mean:        1      6463147  6327110  5918999  24
# of Seqs:   1

It took 1 secs to summarize 1 sequences.

/Users/leonardoteixeira/Desktop/analises_caio/filesummary

What am I doing wrong?

Thank you.


#2

Your fasta file wasn’t read in correctly. I know I’ve seen this before, but I can’t remember the exact problem. I think it was an error in unzipping the file. What happens if you feed fastq.info your zipped file?


#3

I’ll try again.
Thank you


#4

I did it again as you suggested, but the problem continues:

We found more than 25% of the bases in sequence EEEF33FFFB<@DBFFHF?//<<?FDGHHHFHEHHHHFGFBF.FGGFFFHHBGE-CHGGDDCCC_C._;GG./_A?-BAB0;C09;;AEFFGCFBE@9B-D.A/B/.;/…BFE.@-B;-ABEEFAB…A9BBF9FE.@B-AEFFDFFDBA?BFFBB;9-BFBF//9;9BFF///9/;/99F_B-;9A/;.9B/B.999//;//999FB;/_F/;;/B/;;//;99;/9//_99 to be ambiguous. Mothur is not setup to process protein sequences.

What should I do?
Can you help me?


#5

It’s still reading in the qual line as the DNA line. Can you unzip your fastq and then run head R1_001.fasta?
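
To see why the warning fires when a quality line is read as a sequence, here is a minimal Python sketch. It assumes, for illustration only, that any character other than A, C, G, T, or U counts as ambiguous; mothur's actual rule may differ. The function name `ambiguous_fraction` and the example DNA read are made up; the quality string is the start of the one quoted in the warning above.

```python
def ambiguous_fraction(sequence):
    """Return the fraction of characters that are not A, C, G, T or U.

    Assumption for this sketch: everything else counts as ambiguous.
    """
    if not sequence:
        return 0.0
    unambiguous = set("ACGTUacgtu")
    ambiguous = sum(1 for ch in sequence if ch not in unambiguous)
    return ambiguous / len(sequence)


# Made-up DNA read, for comparison only.
dna_line = "TACGGAGGATCCGAGCGTTATCCGGATTTATTGGGTTTAAAGGG"
# Start of the quality string that showed up as a sequence name.
qual_line = "EEEF33FFFB<@DBFFHF?//<<?FDGHHHFHEHHHHFGFBF"

for label, line in (("dna", dna_line), ("qual", qual_line)):
    fraction = ambiguous_fraction(line)
    # 0.25 mirrors the 25% threshold mentioned in the warning.
    print(label, round(fraction, 2), "over threshold:", fraction > 0.25)
```

A quality string is made up almost entirely of characters that are not bases, so it lands far above the threshold, which fits the idea that the fastq was not parsed as four-line records.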


#6

Sorry, what is head?
Does it rename the file?
Thank you.


#7

head is a command-line tool that prints the first few lines of a text file (10 by default). If you are in mothur you can do…

mothur > head(01_R1_001.fasta)

If you are on Windows, it might be easiest to open the file in a text editor and copy and paste the first few lines from the file.
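
If head is not available, a few lines of Python can do the same job on any platform. This is only a sketch: `first_lines` and `guess_format` are hypothetical helper names, and the format check is a rough heuristic (a fasta record starts with `>`, a fastq record with `@`). The demo writes a small made-up fastq record to a temporary file so that it runs on its own; point `first_lines` at your own file instead.

```python
import os
import tempfile
from itertools import islice


def first_lines(path, n=10):
    """Return the first n lines of a text file, similar to `head -n`."""
    with open(path, "r", encoding="utf-8", errors="replace") as handle:
        return [line.rstrip("\n") for line in islice(handle, n)]


def guess_format(lines):
    """Guess the file type from the first non-empty line (heuristic)."""
    for line in lines:
        if not line.strip():
            continue
        if line.startswith(">"):
            return "fasta"
        if line.startswith("@"):
            return "fastq"
        return "unknown"
    return "empty"


# Made-up fastq record, written to a temporary file for the demo.
demo_record = "@read1\nACGTACGT\n+\nFFFFFFFF\n"
with tempfile.NamedTemporaryFile("w", suffix=".fastq", delete=False) as tmp:
    tmp.write(demo_record)
    demo_path = tmp.name

demo_lines = first_lines(demo_path, 5)
for line in demo_lines:
    print(line)
print("looks like:", guess_format(demo_lines))
os.remove(demo_path)
```

If a file named .fasta comes back as fastq, or its first line is one very long string, that would point to a problem in the conversion step rather than in summary.seqs.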

Can you also post the actual syntax you are using rather than the descriptions you have above?


#8

Ok.
The command didn’t work, or I’m not sure how to use it.
Here is what I did:

[screenshot]

But nothing changes in the file.

So I renamed the file in the text editor:

[screenshot]

I have several groups to analyze. I believe that when I create the groups, mothur reads the file as a single sequence, even after I add the header name.

[screenshot]

Now I want to summarize.
summary.seqs (fasta = groups)

[screenshot]

What should I do?
Thank you.


#9

Sorry, it would help if I got the command right!

mothur > system(head 01_R1_001.fasta)

You’re giving it a groups file instead of a fasta file. You should be doing…

summary.seqs(fasta=S59.fasta)

If you haven’t already, I would strongly encourage you to go through the MiSeq SOP and run each of the commands and see what they’re doing. I know you don’t have paired-end data, but it will at least get you thinking more about the inputs and outputs of each step.
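
A quick way to catch this kind of mix-up is to look at the shape of the file before passing it to a command. The sketch below is an illustration under stated assumptions, not part of mothur: it treats a file as a group file when every non-empty line has exactly two whitespace-separated fields (sequence name, group name) and as a fasta file when the first non-empty line starts with `>`. The helper name `classify_lines` and the sample lines are hypothetical.

```python
def classify_lines(lines):
    """Classify a few lines as 'fasta', 'group', or 'unknown' (heuristic)."""
    content = [line for line in lines if line.strip()]
    if not content:
        return "unknown"
    # Assumption: a fasta file starts with a '>' header line.
    if content[0].startswith(">"):
        return "fasta"
    # Assumption: a group file has two columns, sequence name and group.
    if all(len(line.split()) == 2 for line in content):
        return "group"
    return "unknown"


# Made-up sample lines for the demo.
fasta_sample = [">seq1", "ACGTTGCA", ">seq2", "TTGACCAA"]
group_sample = ["seq1\tT11", "seq2\tT12"]

print(classify_lines(fasta_sample))
print(classify_lines(group_sample))
```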


#10

Ok.
Thanks.
I’m working now.


#11

Good afternoon, I have another problem (a mismatch).
As you suggested, I ran the MiSeq SOP commands and they worked very well, but after aligning against the SILVA reference I run into the problem, specifically at summary.seqs.

What should I do to fix it? And at which step, before or after the alignment with the reference?

mothur > summary.seqs(fasta=caiogooduniquealign, count=caiogoodnamescount_table)


#12

Can you post the actual command as you wrote it?


#13

Reviewing the commands I ran before the alignment with the reference, I didn’t notice anything wrong. Here are all the commands:

1
CREATE GROUPS (8 groups)
mothur > make.group(fasta=S59.fasta-S60.fasta-S61.fasta-S62.fasta-S63.fasta-S64.fasta-S65.fasta-S66fasta, groups=T11-T12-T13-T14-T15-T16-T17-T18)
Output File Names: merge.groups
Rename [1merge.groups]

mothur > make.group(fasta=S67.fasta-S68.fasta-S69.fasta-S70.fasta-S71.fasta-S72.fasta-S73.fasta-S74fasta, groups=T21-T22-T23-T24-T25-T26-T27-T28)
Output File Names: merge.groups
Rename [2merge.groups]

2
REGROUP GROUPS
mothur > merge.files(input=1merge.groups-2merge.groups-3merge.groups-4merge.grups-5merge.groups-6merge.groups-7merge.groups-8merge.groups, output=caiomerge.groups)
Output File Names: caiomerge.groups

3
CREATE ARCHIVE (.fasta)
mothur > merge.files(input=S59.fasta-S60.fasta-S61.fasta-S62.fasta-S63.fasta-S4.fasta-S65.fasta-S66.fasta-S67.fasta-S68.fasta-S69.fasta-S70.fasta-S71.fasta-S72.fasta-S73.fasta-S74.fasta-S75.fasta-S76.fasta-S77.fasta-S78.fasta-S79.fasta-S80.fasta-S81.fasta-S82.fasta-S83.fasta-S84.fasta-S85.fasta-S86.fasta-S87.fasta-S88.fasta-S89.fasta-S90.fasta-S91.fasta-S92.fasta-S93.fasta-S94.fasta-S95.fasta-S96.fasta-S97.fasta-S98.fasta-S99.fasta-S100.fasta-S101.fasta-S102.fasta-S103.fasta-S104.fasta-S105.fasta-S106.fasta-S107.fasta-S108.fasta-S109.fasta-S110.fasta-S111.fasta-S112.fasta-S113.fasta-S114.fasta-S115.fasta-S116.fasta-S117.fasta-S118.fasta-S119.fasta-S120.fasta-S121.fasta-S122.fasta, output=caio)
Output File Names:caio

4
SUMMARY (caio.fasta)
[screenshot]

5
SCREEN SEQS
mothur > screen.seqs(fasta=caio, group=caiogroups, summary=caiosummary, maxambg=0, maxlength=305)

6
UNIQUE SEQS
mothur > unique.seqs(fasta=caiogood)

7
COUNT SEQS
mothur > count.seqs(name=caiogoodnames, group=caiomerge.good.groups)

8
SUMMARY SEQS
mothur > summary.seqs(fasta=caiogoodunique)

9
PCR.SEQS
mothur > pcr.seqs(fasta=silva.bacteria.fasta, start=11894, end=25319, keepdots=F, processors=8)

10
RENAME
mothur > rename.file(input=silva.bacteria.pcr.fasta, new=silva.v4.fasta)

11
SUMMARY SEQS
mothur > summary.seqs(fasta=silva.v4.fasta)

12
ALIGN
mothur > align.seqs(fasta=caiogoodunique, reference=silva.v4.fasta)

13
SUMMARY SEQS
mothur > summary.seqs(fasta=caiogooduniquealign, count=caiogoodnamescount_table)

These were the commands I ran.
Thanks a lot.


#14

I’m not sure that it will matter but in your first step can you set output=caio.fasta and update everything else? Please make sure that you aren’t touching the files between steps. Seeing the losses of .'s in file names makes me worried that you’re doing something to the files in between steps.


#15

Could you send your caiogoodunique and caiogoodnamescount_table files, as well as your logfile to mothur.bugs@gmail.com, so I can try to track down the issue for you?
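
In the meantime, one thing you could check yourself is whether the sequence names in the fasta file and in the count table agree. The sketch below assumes that the mismatch concerns sequence names, and that the count table is tab-separated with a header row whose first column is Representative_Sequence; lines starting with # are skipped. The helper names and the sample data are made up for illustration.

```python
def fasta_names(lines):
    """Collect sequence names from fasta header lines ('>name ...')."""
    names = set()
    for line in lines:
        if line.startswith(">"):
            fields = line[1:].split()
            if fields:
                names.add(fields[0])
    return names


def count_table_names(lines):
    """Collect names from the first column of a tab-separated count table."""
    names = set()
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue
        first = line.split("\t")[0].strip()
        # Assumption: the header row starts with this column label.
        if first == "Representative_Sequence":
            continue
        names.add(first)
    return names


def compare_names(fasta_lines, table_lines):
    """Return names found only in the fasta and only in the count table."""
    in_fasta = fasta_names(fasta_lines)
    in_table = count_table_names(table_lines)
    return sorted(in_fasta - in_table), sorted(in_table - in_fasta)


# Made-up sample data for the demo.
sample_fasta = [">seqA", "ACGT", ">seqB", "TTGA", ">seqC", "GGCA"]
sample_table = [
    "Representative_Sequence\ttotal\tT11\tT12",
    "seqA\t3\t2\t1",
    "seqB\t1\t0\t1",
    "seqD\t2\t2\t0",
]

only_fasta, only_table = compare_names(sample_fasta, sample_table)
print("only in fasta:", only_fasta)
print("only in count table:", only_table)
```

To try it on real files, read them with `open(path).read().splitlines()` and pass the two lists to `compare_names`; any name that appears on only one side is a candidate for the mismatch.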


#16

Sorry, the mistake was mine: I created the wrong groups.
Thank you very much to the forum for the attention, and until my next question.
Have a good week, and thank you, everyone.

