Are you using pdiffs and bdiffs?
Yes, I am using both. I have been following your SOP.
Do you have any idea how to help me?
Are you running the command with bdiffs=1? You might try running set.dir(debug=t) before the trim.seqs command. Mothur will output the aligned fragments and the barcode fragments as well as the number of diffs. This may help you determine the issue.
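For example, something like this (the file names here are just placeholders; substitute your own fasta and oligos names and diff settings):

set.dir(debug=t)
trim.seqs(fasta=yourdata.fasta, oligos=yourdata.oligos, bdiffs=1, processors=8)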
I have been running the trim.seqs command with bdiffs=1. When I run set.dir(debug=t), it shows me that the numdiffs range from 4 to 15 on average! Is this normal? When I adjust to bdiffs=8, a lot more of my sequences are saved from the scrap.flows (18/24 of my samples). Is there an acceptable range for adjusting bdiffs? If I set bdiffs higher to keep the majority of sequences from being scrapped, am I compromising the data analysis further down the pipeline?
Thanks for your help.
No, that’s a problem. Have you tried using order=B in trim.flows and shhh.flows? Do you know what type of 454 sequencing was done on these?
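For example, something like this (again, the file names are placeholders, and trim.flows should produce the .flow.files file that shhh.flows takes, as in the SOP):

trim.flows(flow=yourdata.flow, oligos=yourdata.oligos, bdiffs=1, order=B, processors=8)
shhh.flows(file=yourdata.flow.files, order=B, processors=8)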
pat
I have tried order=B; it doesn’t improve the number of sequences saved from the scrap.flows. It looks like quite a few are also scrapped strictly because of length, but the majority are “lbp”. These were supposedly done with FLX… I am at a loss here. This is the first time we have had such terrible sequencing…
Can you post your sff file and oligos file somewhere (e.g. dropbox/google drive) for me to take a look?
I have sent you the files via Dropbox to the mothurbugs email. I included the URL for this forum thread. Please let me know if that email address is correct. Thank you.
Sorry for the delay in getting back to you - we were on vacation at the end of the month and are slowly getting back to all our missed emails and forum posts…
It appears that you do have FLX and so you should be using flow order A. Like you, when I run the following…
trim.flows(flow=sc.flow, oligos=sc.oligos.txt, pdiffs=1, bdiffs=2, processors=8)
I get a lot of sequences scrapped because of length:
3 bf
856 f
108913 l
2157 lbf
7964 lf
The length gets flagged because you have noisy data in the first 450 flows that end up inserting ambiguous base calls (caused by reagents) or because you’re accidentally sequencing short reads (such as primer dimers).
However, of the 378614 total flows, 258721 are good, so I’m not sure I would be overly concerned.
Thank you!
OK, I have other, similar data from the same sequencing group and the same kind of analysis. When I run trim.flows, it puts all except 25 bytes of the 6.2 GB of sequencing data into the scrap.flows. If this is simply a case of poor sequencing, is there anything I can do to recover some information from this data?
Are you sure that it’s not FLX+?
I have tried both order=A and order=B. Order A got me 2 samples out of 48; with order B, everything was scrapped.
Can you try running trim.seqs with the fasta and qual file and see what happens?
See:
http://www.mothur.org/wiki/454_SOP#Using_quality_scores
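Roughly, that would look something like the following (the file names are placeholders, and the diff and quality-window settings are just SOP-style starting points, so adjust them for your data):

sffinfo(sff=yourdata.sff)
trim.seqs(fasta=yourdata.fasta, qfile=yourdata.qual, oligos=yourdata.oligos, bdiffs=1, pdiffs=2, maxambig=0, maxhomop=8, qwindowaverage=50, qwindowsize=50, processors=8)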
I suspect the data are bad and that perhaps it’s all primer dimer or there were reagent problems.
Pat