Hi everyone,
I have analyzed metagenomic NGS data and ended up with a very large number of bacterial species, most of them at very low relative abundance (0.01% or lower). I am writing an article and want to make some pie charts, but I can't show every species because the figures would be too chaotic. What is the lowest percentage I should include in the article? Is there a commonly used threshold for this? Thanks in advance for your answers.
Do you mean shotgun sequence data (i.e. metagenomics) or 16S rRNA gene sequence data (i.e. amplicon data)? I would probably only report the abundances of interesting organisms, such as those that differ significantly between your groups. Also, I would never encourage anyone to generate a pie chart or stacked bar chart. If you are using R, see the Note in the help page that opens when you run ?pie.
I use stacked bar charts to summarize higher taxa, i.e. phyla and genera. For species, I limit the figures to those that differ significantly between the groups or that are present in all samples (core microbiome); separately, I include a comprehensive list of all identified species as a supplementary table. If a stacked bar chart for species is really necessary, then do it, but include in the figure key only those species that are above an arbitrarily chosen abundance cutoff!
Hope this helps
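In case it is useful, here is a minimal Python sketch of the filtering described above: species that never reach a chosen cutoff are pooled into a single "Other" category for the figure, and species detected in every sample are listed as a candidate core set. The abundance values, the 1% cutoff, and the function names collapse_low_abundance and core_taxa are all made up for illustration; there is no standard threshold, so pick one that suits your data and state it in the methods.

```python
# Hypothetical relative abundances (%) per sample; the numbers are invented.
abundances = {
    "sample_A": {
        "Bacteroides fragilis": 60.0,
        "Escherichia coli": 39.5,
        "Rare species 1": 0.3,
        "Rare species 2": 0.2,
    },
    "sample_B": {
        "Bacteroides fragilis": 55.0,
        "Escherichia coli": 44.0,
        "Rare species 1": 0.9,
        "Rare species 3": 0.1,
    },
}


def collapse_low_abundance(table, cutoff=1.0, other_label="Other"):
    """Keep taxa that reach `cutoff` percent in at least one sample.

    Everything else is summed into `other_label`, so each sample still
    adds up to its original total. The cutoff is an arbitrary choice.
    """
    keep = {
        taxon
        for sample in table.values()
        for taxon, pct in sample.items()
        if pct >= cutoff
    }
    collapsed = {}
    for name, sample in table.items():
        row = {t: p for t, p in sample.items() if t in keep}
        row[other_label] = sum(p for t, p in sample.items() if t not in keep)
        collapsed[name] = row
    return collapsed


def core_taxa(table):
    """Return taxa detected (abundance > 0) in every sample."""
    detected = [{t for t, p in s.items() if p > 0} for s in table.values()]
    return set.intersection(*detected) if detected else set()


collapsed = collapse_low_abundance(abundances, cutoff=1.0)
core = core_taxa(abundances)

for name, row in collapsed.items():
    print(name, row)
print("core:", sorted(core))
```

The full, unfiltered table would still go into the supplementary material; only the figure uses the collapsed version.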