Abstract: Objective: To investigate the effect of data perturbation in microarrays on the selection of differentially expressed genes by false discovery rate (FDR) control. Methods: A total of 1,991 colon cancer DNA microarray data were subjected to random perturbation at different error limits by computer simulation. Each perturbation level comprised 1,000 random simulations. Differentially expressed genes were selected from the data with and without perturbation, respectively, by the adaptive linear step-up (ALSU) procedure, an FDR-controlling method. The repetition rates between the two sets of results were compared, and the effect of data perturbation on the sort order of each gene was analyzed. Results: Both the single average and the overall average repetition rates of differentially expressed genes decreased with increasing data perturbation. The more significantly differentially expressed a gene was, the less it was affected by perturbation. When the error limit was less than or equal to 50%, the overall average repetition rate of differentially expressed genes decreased linearly with increasing data perturbation: for each 1% increase in the perturbation error limit, the overall average repetition rate decreased by approximately 1.85%. The higher the perturbation error limit, the greater the fluctuation in gene sort order. Conclusion: Data perturbation is one reason why differentially expressed genes exhibit low repeatability; the effect of data perturbation on the selection of differentially expressed genes can be investigated quantitatively by computer simulation.
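
The simulation described in the Methods can be illustrated with a minimal sketch. It is not the authors' implementation, and it rests on several assumptions that the abstract does not confirm: the perturbation is modeled as multiplicative uniform noise within the error limit; per-gene p-values come from a normal approximation to Welch's t statistic; the two-stage adaptive linear step-up procedure of Benjamini, Krieger and Yekutieli stands in for ALSU; and the repetition rate is taken as the fraction of genes selected without perturbation that are selected again after perturbation. All function and parameter names are hypothetical.

```python
import math
import random
from statistics import NormalDist, mean, variance


def perturb(rows, limit, rng):
    """Assumed noise model: scale each value by (1 + u), u ~ Uniform(-limit, limit)."""
    return [[x * (1 + rng.uniform(-limit, limit)) for x in row] for row in rows]


def welch_p_values(group_a, group_b):
    """Two-sided p-value per gene (normal approximation to Welch's t statistic).

    group_a[g] and group_b[g] hold the expression values of gene g in each class.
    """
    norm = NormalDist()
    pvals = []
    for a, b in zip(group_a, group_b):
        se = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
        z = (mean(a) - mean(b)) / se
        pvals.append(2 * (1 - norm.cdf(abs(z))))
    return pvals


def bh_reject_count(pvals, q):
    """Number of rejections made by the Benjamini-Hochberg linear step-up at level q."""
    m = len(pvals)
    k = 0
    for i, p in enumerate(sorted(pvals), start=1):
        if p <= i * q / m:
            k = i  # keep the largest rank that passes its threshold
    return k


def adaptive_step_up(pvals, q=0.05):
    """Two-stage adaptive linear step-up; returns the indices of selected genes."""
    m = len(pvals)
    q1 = q / (1 + q)
    r1 = bh_reject_count(pvals, q1)       # stage 1: BH at the reduced level q1
    if r1 == 0:
        return set()
    if r1 == m:
        return set(range(m))
    m0_hat = m - r1                       # estimated number of true nulls
    r2 = bh_reject_count(pvals, q1 * m / m0_hat)  # stage 2: BH at the adjusted level
    order = sorted(range(m), key=lambda i: pvals[i])
    return set(order[:r2])


def average_repetition_rate(group_a, group_b, limit, n_sim=1000, q=0.05, seed=0):
    """Mean fraction of baseline-selected genes that are reselected after perturbation."""
    rng = random.Random(seed)
    baseline = adaptive_step_up(welch_p_values(group_a, group_b), q)
    if not baseline:
        return float("nan")
    rates = []
    for _ in range(n_sim):
        pa = perturb(group_a, limit, rng)
        pb = perturb(group_b, limit, rng)
        selected = adaptive_step_up(welch_p_values(pa, pb), q)
        rates.append(len(baseline & selected) / len(baseline))
    return mean(rates)
```

Calling `average_repetition_rate` over a grid of error limits (for example 0.05, 0.10, up to 0.50) and fitting a straight line to the results would give a slope comparable in kind to the decrease per 1% of error limit reported in the Results, although the values obtained depend on the data and on the assumptions listed above.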