I’m downloading demultiplexed data from BaseSpace. In my sample sheet I use a sample name (human-readable name) and a sample ID (barcode of the sample), and each sample has its own path to the data that looks like this:
sample1-99999999/Data/Intensities/BaseCalls/R1.fastq.gz
sample1-99999999/Data/Intensities/BaseCalls/R2.fastq.gz
sample2-99999998/Data/Intensities/BaseCalls/R1.fastq.gz
sample2-99999998/Data/Intensities/BaseCalls/R2.fastq.gz
I would like to be really lazy and use mothur to do all the processing before make.contigs, using the files exactly as they are downloaded from BaseSpace. The sample name is the first folder name, before the dash.
So far I’ve moved all my samples to one folder and renamed them by prefixing each file with the human-readable name:
system(for i in */Data/Intensities/BaseCalls/*.gz; do mv $i "fastq""/"${i%%-*}"."`basename $i`; done)
Then I ran make.file; the resulting lines look like:
fastq/sample1.R1_001.fastq.gz fastq/sample1.R2_001.fastq.gz
Now I just need to use something like gawk to create the first column, but I’m not figuring it out. Help?
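For reference, here is a rough sketch of the kind of thing I have in mind, not a tested solution. It assumes the make.file output is called fastq.files and that the sample name is everything before the first dot in the forward-read file name. Both file names (fastq.files, stability.files) are placeholders I made up, and sample names that contain a dot would be cut short.

```shell
# Hypothetical input: the two-column file written by make.file (file name assumed).
printf '%s %s\n' \
    fastq/sample1.R1_001.fastq.gz fastq/sample1.R2_001.fastq.gz \
    fastq/sample2.R1_001.fastq.gz fastq/sample2.R2_001.fastq.gz > fastq.files

# Build the sample name from the forward-read path and print it as a new first column.
awk '{
    name = $1
    sub(/.*\//, "", name)   # drop the leading directory (fastq/)
    sub(/\..*/, "", name)   # drop everything from the first dot onward
    print name "\t" $1 "\t" $2
}' fastq.files > stability.files

cat stability.files
```

The result should be a three-column, tab-separated file (sample name, forward reads, reverse reads), which is the layout I understand make.contigs expects for its file option.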