trim option in sffinfo

Hello! Does somebody know what the “clipQualLeft” and “clipQualRight” values are? By default, when running sffinfo, the sequences are trimmed based on these values.
Moreover, when I pass trim=F, the .fasta and .qual output files are bigger (which makes sense, since they were not trimmed), whereas the .sff.txt and .flow files stay the same size. Can someone explain to me why the latter are not affected by trimming?
Thanks a lot in advance!

I have the same question. I am very new to 454 data and am using mothur to process it. I would like to understand more clearly on what basis sffinfo trims the data by default.
I ran one file twice, once with the defaults and once with trim=F, and noticed that a good part of the data is trimmed off. What bothers me most is that when I checked the end of the untrimmed sequence, many bases there also have good quality values (like 40), yet they are still trimmed off by default.
For example, here is the first read (I know the first read is bad; I am copying it only as a small example):

fasta result of sffinfo with trim=F

>IDQAZ7Q01CI0YW xy=919_2838
tcagACGCTCGACAGAGTTGATCCTGGCTCAGCCATCTCATACCAGCAGCCGCGGTAAtctgagactgccaaggcacacaggagtagtg

fasta result of sffinfo default

>IDQAZ7Q01CI0YW xy=919_2838
ACGCTCGACAGAGTTGATCCTGGCTCAGCCATCTCATACCAGCAGCCGCGGTAA

Now if I check the quality file:
with trim=F

IDQAZ7Q01CI0YW xy=919_2838 length=89
40 40 40 40 40 40 40 40 40 40 40 40 40 40 40 40 40 40 40 40 40 40 36 33 33 33 40 40 40 40 40 40 37 37 37 40 40 40 40 40 40 40 40 40 40 40 40 40 40 40 40 40 35 29 27 29 29 33 37 40 40 40 40 40 40 40 40 40 40 40 40 40 40 39 40 40 38 39 31 24 16 16 16 14 16 18 19 18 18
default
IDQAZ7Q01CI0YW xy=919_2838 length=54
40 40 40 40 40 40 40 40 40 40 40 40 40 40 40 40 40 40 36 33 33 33 40 40 40 40 40 40 37 37 37 40 40 40 40 40 40 40 40 40 40 40 40 40 40 40 40 40 35 29 27 29 29 33


So the default version clipped off the first 4 and the last 31 positions of the read, even though many of those bases have a quality of 40. Please help me understand this. I also have another question: if I run sffinfo with trim=F, part of the sequence is in lower case. What does that mean? Thanks a lot, Mitra

The sffinfo command trims the sequences to the clipQualLeft and clipQualRight values provided in the sff file. If you set trim=f, mothur sets the bases that would have been trimmed to lowercase. It does seem odd that the clipQualRight value is not further right.
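
As a rough illustration of that behaviour, here is a minimal Python sketch (not mothur's actual code). It assumes the SFF convention, as I understand it, that clip positions are 1-based and inclusive and that 0 means "not set". The clip values 5 and 58 are not read from the sff file; they are inferred from where the lowercase bases start and stop in the trim=F output quoted above. The function name apply_clip is made up for this example, and the adapter clip fields that an sff file can also carry are ignored.

```python
def apply_clip(bases, clip_left, clip_right, trim=True):
    """Trim or soft-mask a read using quality clip points.

    Assumption: clip_left and clip_right are 1-based, inclusive positions
    of the first and last base to keep, and 0 means "not set".
    """
    bases = bases.upper()
    start = max(clip_left, 1) - 1                       # 0-based start of kept region
    end = clip_right if clip_right > 0 else len(bases)  # 0 means keep to the end
    end = min(end, len(bases))
    kept = bases[start:end]
    if trim:
        # Default behaviour: only the bases between the clip points remain.
        return kept
    # trim=F behaviour: keep everything, lowercase what would have been trimmed.
    return bases[:start].lower() + kept + bases[end:].lower()


# The read quoted in this thread (trim=F output).
READ = ("tcagACGCTCGACAGAGTTGATCCTGGCTCAGCCATCTCATACCAGCAGCCGCGGTAA"
        "tctgagactgccaaggcacacaggagtagtg")

if __name__ == "__main__":
    # Clip points 5 and 58 are inferred from the lowercase boundaries above.
    print(apply_clip(READ, 5, 58))              # trimmed read
    print(apply_clip(READ, 5, 58, trim=False))  # soft-masked read
```

With those inferred clip points, the trimmed result matches the default fasta output quoted above, and the soft-masked result matches the trim=F output.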

Thanks westcott. This is why I posted the example. I understand that with trim=F sffinfo soft-masks the bases that would otherwise have been trimmed. My question is on what basis it decides what to trim since, as I wrote in the example, many of the trimmed bases have a quality of 40.
Any suggestions on this would be really great.
Thanks,
Mitra
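
To put numbers on the example, here is a minimal Python sketch that splits the trim=F quality scores at the clip points and summarizes each part. The scores are copied from the .qual output quoted earlier in the thread. The clip points 5 and 58 are an assumption inferred from the lowercase boundaries in the trim=F fasta output (1-based, inclusive, 0 meaning "not set"), and the helper names split_by_clip and summarize are made up for this example.

```python
# Quality scores copied from the trim=F .qual output quoted in this thread.
QUALS = [int(q) for q in """
40 40 40 40 40 40 40 40 40 40 40 40 40 40 40 40 40 40 40 40 40 40 36 33 33 33
40 40 40 40 40 40 37 37 37 40 40 40 40 40 40 40 40 40 40 40 40 40 40 40 40 40
35 29 27 29 29 33 37 40 40 40 40 40 40 40 40 40 40 40 40 40 40 39 40 40 38 39
31 24 16 16 16 14 16 18 19 18 18
""".split()]


def split_by_clip(quals, clip_left, clip_right):
    """Return (clipped_left, kept, clipped_right) for 1-based inclusive clip points."""
    start = max(clip_left, 1) - 1
    end = min(clip_right, len(quals)) if clip_right > 0 else len(quals)
    return quals[:start], quals[start:end], quals[end:]


def summarize(name, scores):
    """Print count, minimum, mean, and number of Q40 scores for one region."""
    if not scores:
        print(f"{name}: empty")
        return
    q40 = sum(1 for q in scores if q >= 40)
    mean = sum(scores) / len(scores)
    print(f"{name}: n={len(scores)} min={min(scores)} mean={mean:.1f} Q40={q40}")


if __name__ == "__main__":
    # Clip points are inferred from the example, not read from the sff file.
    left, kept, right = split_by_clip(QUALS, 5, 58)
    summarize("clipped left", left)
    summarize("kept", kept)
    summarize("clipped right", right)
```

Running this on your own reads shows how many high-quality positions fall outside the clip points, which is one way to judge how conservative the provider's settings were.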

This was likely set by the sequence provider. There are many parameters that they can set to alter the trimming points and some parameter sets are more conservative than others. mothur only uses what she gets from the user. My sense is that most places use the shotgun settings (w/ GS FLX) instead of the amplicon settings which are overly conservative. You might use trim=F and then carry on with our QC steps (trim.flows), which are pretty stringent themselves.

Pat

Dear Dr. Schloss, thanks for your kind reply. I am doing that accordingly…