classify.seqs: adding extra data columns to the tax.summary file

I couldn’t find an answer to my problem, so I am asking here.

As a scientific mediator at the University of Lausanne in Switzerland, I am currently developing an activity based on molecular biology, more specifically on metabarcoding of the diatom rbcL gene, to assess river quality through bioindication.

The goal is to integrate this activity into the biology curriculum of post-compulsory schools and to make the students active participants in the preservation of their local environment. To that end, students will take part in three workshops: (1) field sampling and an introduction to the problem of river pollution, (2) molecular biology and (3) bioinformatics processing of sequencing data.

The idea is to derive a water quality index (DI-CH) from the resulting sequences. The index is based on a calculation that uses two factors for each species: a correction factor G, which avoids the read-count bias caused by over-representation of certain species according to their cell size, and a factor D, which expresses the ecological value of the species. This brings me to my request.

Is it possible to add columns to the “tax.summary” output file of the “classify.seqs” command that would contain these D and G values for each species? Or is there a syntax I could adjust in the “.tax” file used in the command so that “tax.summary” returns these additional columns? Or is there any other solution?

Arnaud Guggisberg

Hi - you would need to do this on your own outside of mothur since we don’t have anything like this in mothur.
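
For what it's worth, here is a rough sketch of what "outside of mothur" could look like in Python. It is not a mothur feature, only one possible approach: it assumes the tax.summary file is tab-delimited with a header row that includes a "taxon" column, and that you keep your own tab-delimited lookup table with "taxon", "D" and "G" columns. The function name add_factor_columns and the lookup-table layout are made up for illustration, so adjust them to your files.

```python
import csv
import io


def add_factor_columns(summary_text, factors_text):
    """Append D and G columns to a tab-delimited tax.summary table.

    Assumption: both tables have a header row and share a 'taxon' column.
    The factors table is a user-made lookup with columns taxon, D, G.
    """
    # Build a lookup: taxon name -> (D, G), kept as strings.
    factors = {
        row["taxon"]: (row["D"], row["G"])
        for row in csv.DictReader(io.StringIO(factors_text), delimiter="\t")
    }

    reader = csv.DictReader(io.StringIO(summary_text), delimiter="\t")
    out = io.StringIO()
    writer = csv.DictWriter(
        out,
        fieldnames=list(reader.fieldnames) + ["D", "G"],
        delimiter="\t",
        lineterminator="\n",
    )
    writer.writeheader()
    for row in reader:
        # Taxa missing from the lookup (e.g. higher ranks) get "NA".
        row["D"], row["G"] = factors.get(row["taxon"], ("NA", "NA"))
        writer.writerow(row)
    return out.getvalue()
```

To use it on real files, read each file into a string (for example with open(path).read()), pass both strings to the function, and write the returned text to a new file. Rows whose taxon is not in your lookup table are kept and marked "NA", so the original summary is never shortened.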

