If I understand correctly, LEfSe analysis is performed to determine the “significant” biomarkers of two or more environments under comparison. Would it be correct to subset the data and use only the “abundant” OTUs (excluding the rare OTUs) for this analysis, since I am more interested in the abundant taxa? Or is it a MUST to use the complete data for this analysis?

Including the rare OTUs really shouldn’t affect the time it takes to run. The algorithm should account for relative abundance when it does the LDA step to estimate effect sizes, so the abundant taxa with the largest effects should still come out on top.
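
If you still want to look only at the abundant taxa, one option is to flag them by mean relative abundance and use that list to subset the results. Here is a minimal sketch in plain Python. It is not part of LEfSe: the function name `abundant_otus`, the 1% cutoff, and the toy counts are all assumptions made for illustration. The per-sample totals are computed over all OTUs, so the proportions still reflect the full community.

```python
def abundant_otus(counts, min_mean_rel_abundance=0.01):
    """Return OTU ids whose mean relative abundance meets the cutoff.

    counts: dict mapping OTU id -> list of per-sample read counts.
    The cutoff (default 1%) is an arbitrary, illustrative choice.
    """
    n_samples = len(next(iter(counts.values())))
    # Per-sample totals over ALL OTUs, rare ones included.
    totals = [sum(row[i] for row in counts.values()) for i in range(n_samples)]
    kept = []
    for otu, row in counts.items():
        # Relative abundance of this OTU in each non-empty sample.
        rel = [c / t for c, t in zip(row, totals) if t > 0]
        if rel and sum(rel) / len(rel) >= min_mean_rel_abundance:
            kept.append(otu)
    return kept


# Toy data (made up): three OTUs across three samples.
counts = {
    "OTU_1": [500, 450, 480],
    "OTU_2": [490, 540, 515],
    "OTU_3": [10, 10, 5],
}
print(abundant_otus(counts, 0.01))
```

With these toy counts, `OTU_3` averages below 1% and is dropped, while the other two are kept.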