screen.seqs problem

Hello All,

Any advice would be gratefully received (I am running the Windows version). Below is my output after:

summary.seqs(fasta=merged.shhh.trim.unique.align, name=merged.shhh.trim.unique.names)

To my amateur eye this did not look quite right.

             Start    End      NBases   Ambigs  Polymer  NumSeqs
Minimum:     0        0        0        0       1        1
2.5%-tile:   -1       -1       0        0       1        652
25%-tile:    1044     5443     1        0       1        6514
Median:      1044     5690     9        0       2        13027
75%-tile:    43116    43116    258      0       5        19540
97.5%-tile:  43116    43116    286      0       5        25401
Maximum:     43116    43116    302      0       7        26052
Mean:        17512.9  19618    122.419  0       2.9582

# of unique seqs: 5405

total # of seqs: 26052

Especially when it came to the screen.seqs command:

screen.seqs(fasta=merged.shhh.trim.unique.align, name=merged.shhh.trim.unique.names, group=merged.shhh.groups, start=yyy, optimize=end, criteria=95, processors=4)

I have no idea what number to put into the start parameter (shown as yyy above). As I mentioned, the summary looks very wrong and may indicate an earlier problem.

If I carry on my merry way with the process after screen.seqs, Mothur crashes at the command below:

pre.cluster(fasta=merged.shhh.trim.unique.good.filter.unique.fasta, name=merged.shhh.trim.unique.good.filter.names, group=merged.shhh.good.groups, diffs=2)

I also receive this warning (abbreviated):

[ERROR]: HZ4HL4X04I7INC is in your name file and not in your groupfile, please correct.
[ERROR]: HZ4HL4X04JUMWS is in your name file and not in your groupfile, please correct.
[ERROR]: HZ4HL4X04I890X is in your name file and not in your groupfile, please correct.
[ERROR]: HZ4HL4X04J1X2H is in your name file and not in your groupfile, please correct.

/******************************************/
Running command: unique.seqs(fasta=merged.shhh.trim.unique.good.filter.unique.precluster.fasta, name=merged.shhh.trim.unique.good.filter.unique.precluster.names)
[ERROR]: merged.shhh.trim.unique.good.filter.unique.precluster.fasta is blank, aborting.
Using merged.shhh.trim.unique.good.filter.unique.fasta as input file for the fasta parameter.
[ERROR]: merged.shhh.trim.unique.good.filter.unique.precluster.names is blank, aborting.
/******************************************/

There is line after line after line of [ERROR]: XXXXXXXXXXX is in your name file and not in your groupfile, please correct.

I suspect because of empty files.

Any advice would be fantastic!

Thank you in advance, and forgive me if this has come across as gibberish!

C

What are you sequencing and which region are you targeting?

Thanks for the reply,

16S sequence V1-V3 from lung tissue.

Something to note: a large amount of non-specific human sequence was returned. I am not sure if this would have an impact.

C

So the non-specific human DNA sequence would screw up your summary.seqs output after align.seqs since those reads won’t map to the proper region. I would use start=1044 and minlength=250 or end=5443.
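
To get a feel for how many reads those cutoffs would keep before running screen.seqs, the per-sequence summary file can be tallied outside mothur. The following is only a rough Python sketch: it assumes the .summary file is tab-delimited with a header row of seqname, start, end, nbases, ambigs, polymer, numSeqs, and the function name count_passing and the example rows are made up for illustration. It treats a read as passing when it starts at or before the start position and ends at or after the end position.

```python
import csv
import io


def count_passing(summary_text, start=1044, end=5443):
    """Count rows that start at or before `start` and end at or after `end`.

    Assumes a tab-delimited summary with 'start' and 'end' header columns.
    """
    reader = csv.DictReader(io.StringIO(summary_text), delimiter="\t")
    kept = 0
    total = 0
    for row in reader:
        total += 1
        if int(row["start"]) <= start and int(row["end"]) >= end:
            kept += 1
    return kept, total


# Made-up rows for illustration only, not real data from this thread.
example = (
    "seqname\tstart\tend\tnbases\tambigs\tpolymer\tnumSeqs\n"
    "readA\t1044\t5690\t258\t0\t5\t1\n"
    "readB\t43116\t43116\t1\t0\t1\t1\n"
    "readC\t1044\t5443\t250\t0\t4\t3\n"
)

kept, total = count_passing(example)
print(f"{kept} of {total} unique sequences would pass")
```

For a real run, the text of the summary file would be read from disk and passed in place of the example string. The counts here are of unique sequences; weighting by the numSeqs column would give totals across all reads.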

As for the error about things that are missing, I suspect something weird might have happened in your merge step that caused some of the things to not merge correctly.
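
One way to see exactly which reads are out of step is to compare the IDs in the two files directly. This is a minimal Python sketch, assuming the usual layouts: a name file with the representative ID in column one and a comma-separated list of all IDs in column two, and a group file with the sequence ID in column one and the group in column two. The helper names and the tiny example data are hypothetical.

```python
def ids_in_name_file(lines):
    """Collect every sequence ID listed in the second column of a name file."""
    ids = set()
    for line in lines:
        parts = line.split()
        if len(parts) >= 2:
            ids.update(parts[1].split(","))
    return ids


def ids_in_group_file(lines):
    """Collect the sequence IDs from the first column of a group file."""
    ids = set()
    for line in lines:
        parts = line.split()
        if parts:
            ids.add(parts[0])
    return ids


# Made-up example data, not taken from the files in this thread.
name_lines = ["seq1\tseq1,seq2", "seq3\tseq3"]
group_lines = ["seq1\tsampleA", "seq3\tsampleB", "seq4\tsampleB"]

name_ids = ids_in_name_file(name_lines)
group_ids = ids_in_group_file(group_lines)

missing_from_groups = sorted(name_ids - group_ids)
missing_from_names = sorted(group_ids - name_ids)

print("In name file but not group file:", missing_from_groups)
print("In group file but not name file:", missing_from_names)
```

On real data the two lists of lines would come from opening the name and group files passed to the failing command. If the missing IDs all trace back to one input, that points at the merge step for that input.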

Pat

Thank you Pat,

Yes, I see! An update: I have remerged and rerun and still get this message:

[ERROR]: Your name file contains 11879 valid sequences, and your groupfile contains 13898, please correct.
[ERROR]: process 0 only processed 1 of 4 groups assigned to it, quitting.

/******************************************/
Running command: unique.seqs(fasta=merged.shhh.trim.unique.good.filter.unique.precluster.fasta, name=merged.shhh.trim.unique.good.filter.unique.precluster.names)
[ERROR]: merged.shhh.trim.unique.good.filter.unique.precluster.fasta is blank, aborting.
Using merged.shhh.trim.unique.good.filter.unique.fasta as input file for the fasta parameter.
[ERROR]: merged.shhh.trim.unique.good.filter.unique

Everything up to this point seems perfect.

Mothur usually works perfectly well for me; it's just this once, and of course when I need it fast!

C

How many sequences are in merged.shhh.groups?
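
If it helps, the count can be pulled with a few lines of Python. This is just a sketch, assuming the group file has one read per line with the sequence ID and the group name separated by whitespace; the function name count_by_group and the example lines are invented for illustration.

```python
from collections import Counter


def count_by_group(lines):
    """Tally reads per group from 'sequence_id <whitespace> group' lines."""
    counts = Counter()
    for line in lines:
        parts = line.split()
        if len(parts) >= 2:
            counts[parts[1]] += 1
    return counts


# Made-up lines for illustration only.
example_lines = [
    "seq1\tsampleA",
    "seq2\tsampleA",
    "seq3\tsampleB",
    "",
]

counts = count_by_group(example_lines)
total = sum(counts.values())
print("Total sequences:", total)
for group, n in sorted(counts.items()):
    print(group, n)
```

Running the same function over the lines of merged.shhh.groups gives a total that can be set against the "total # of seqs" reported by summary.seqs.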