R1-R2 lengths


I’m trying to check the mean length of R1 (on one hand) and R2 sequences (on the other). I suspect the quality of R2 is not so good, so I’d like to compare the lengths in both cases. Is there any script/command to perform such a task?

Thanks in advance,


You could just run fastq.info followed by summary.seqs on each file and compare the outputs.

Alternatively, FastQC is a good program for getting an overview of your sequences before you start to process them.

Thanks Dwaite. But I wonder if there is any way to pool all R1 files together, check them at once, and thus see the mean length with summary.seqs. I’ve got quite a few samples, so it could be rather tedious to check them one by one.


If you don’t need to keep track of which sample each read came from, you could just cat all the R1 files together (and likewise for R2):

cat *R1* > R1.fastq
cat *R2* > R2.fastq

And then proceed from there.
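
If all you want is the mean length, here is a rough awk sketch that reads the pooled files directly. It assumes the FASTQ files are uncompressed and use exactly four lines per record (no wrapped sequence lines). The function name mean_read_length is just something I made up for illustration; it is not a mothur command.

```shell
# Mean read length of one or more uncompressed FASTQ files.
# Assumption: every record is exactly 4 lines, so the sequence is line 2 of each group.
# mean_read_length is a hypothetical helper name.
mean_read_length() {
    awk 'NR % 4 == 2 { total += length($0); n++ }   # sequence lines only
         END {
             if (n > 0) printf "%d reads, mean length %.2f\n", n, total / n
             else print "no reads found"
         }' "$@"
}
```

After defining it, you would call it as mean_read_length R1.fastq and then mean_read_length R2.fastq and compare the two numbers. This only gives the mean; summary.seqs or FastQC will still tell you more about the length distribution and quality.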