Objective: This project explored random forest (RF) analysis of high-dimensional data in the presence of confounding effects. Methods: We used computer simulations and real-data validation to evaluate the performance of two methods that can potentially account for confounding effects in RF analysis: RF analysis with the maximum number of candidate variables at each split (RFMCV) and RF with glm-based correction. The distribution of the ranks of the causal variable was used to evaluate these approaches. Results: Simulation experiments suggested that RF with glm-based correction was more effective than RFMCV at correcting for confounding effects. In the real-data validation, rs3754686 and rs2322660 were ranked first and second, respectively. Analysis of the GWAS data confirmed that RF with glm-based correction can effectively remove the spurious association between the LCT gene and height. Conclusion: Confounding effects should be properly adjusted for in RF analysis. RF with glm-based correction is applicable to confounder adjustment and variable selection in high-dimensional data.
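The idea behind a glm-based correction can be illustrated with a minimal sketch. The abstract does not specify the exact procedure, so the following rests on assumptions: the outcome is regressed on a single confounder with an ordinary linear model (the simplest glm), and the residuals replace the outcome before any RF is fitted. The simulated data, the function names (`ols_residuals`, `pearson`, `simulate`), and all parameter values are hypothetical, and the RF step itself is omitted.

```python
import random

def ols_residuals(y, x):
    """Residuals of y after a least-squares fit on one covariate x (with intercept)."""
    n = len(y)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return [b - (intercept + slope * a) for a, b in zip(x, y)]

def pearson(u, v):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(u)
    mean_u = sum(u) / n
    mean_v = sum(v) / n
    cov = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    var_u = sum((a - mean_u) ** 2 for a in u)
    var_v = sum((b - mean_v) ** 2 for b in v)
    return cov / (var_u * var_v) ** 0.5

def simulate(n=2000, seed=0):
    """Hypothetical data: a marker with no causal effect on the outcome.

    Both the marker and the outcome depend on a binary confounder
    (for example, a population subgroup), which induces a spurious association.
    """
    rng = random.Random(seed)
    confounder = [rng.choice([0, 1]) for _ in range(n)]
    # Genotype coded 0/1/2; the allele frequency differs between subgroups.
    marker = [sum(rng.random() < 0.2 + 0.6 * c for _ in range(2)) for c in confounder]
    # The outcome depends on the confounder only, not on the marker.
    outcome = [2.0 * c + rng.gauss(0.0, 1.0) for c in confounder]
    return confounder, marker, outcome

confounder, marker, outcome = simulate()

# Association before correction: driven entirely by the confounder.
raw_r = pearson(marker, outcome)

# Assumed correction step: replace the outcome by its residuals
# after regressing it on the confounder.
adjusted_outcome = ols_residuals(outcome, confounder)
adjusted_r = pearson(marker, adjusted_outcome)

print(f"correlation before correction: {raw_r:.3f}")
print(f"correlation after correction:  {adjusted_r:.3f}")
```

In this sketch the residuals would then serve as the response for an RF fitted with any standard implementation, so that variable importance ranks are no longer inflated by the confounder. Other glm families, several confounders, or residualizing the predictors as well are possible variations that the abstract neither confirms nor rules out.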