Article abstract
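Chen Wanting, Liu Chengyou, Wu Jing, Ding Yong. Effect of data perturbation in microarray on selecting differentially expressed genes by false discovery rate [J]. Journal of Nanjing Medical University, 2014, (7): 991-995, 1002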
Effect of data perturbation in microarray on selecting differentially expressed genes by false discovery rate
Submitted: 2013-10-17
DOI:10.7655/NYDXBNS20140728
Keywords: microarray; differentially expressed genes; false discovery rate; data perturbation
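Funding: Natural Science Foundation of Jiangsu Higher Education Institutions (13KJB310007); Key Project of the Science and Technology Development Fund of Nanjing Medical University (2013NJMU006)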
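Authors and affiliations
Chen Wanting (陈婉婷), Department of Biomedical Engineering, Nanjing Medical University, Nanjing, Jiangsu 210029
Liu Chengyou (刘成友), Department of Biomedical Engineering, Nanjing Medical University, Nanjing, Jiangsu 210029
Wu Jing (吴静), Mathematics and Computer Teaching and Research Section, Nanjing Medical University, Nanjing, Jiangsu 210029
Ding Yong (丁勇), Mathematics and Computer Teaching and Research Section, Nanjing Medical University, Nanjing, Jiangsu 210029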
Abstract:
      Objective: To investigate how perturbation of microarray data affects the selection of differentially expressed genes by the false discovery rate (FDR) method. Methods: Using computer simulation, expression data for 1,991 genes from a colon cancer microarray were subjected to random perturbations with different relative error limits, and each perturbation level was simulated 1,000 times. Differentially expressed genes were selected from the unperturbed and the perturbed data separately with adaptive linear step-up (ALSU), an FDR procedure, and the repetition rate between the two selections was compared. The effect of data perturbation on the change in each gene's rank position was also analyzed. Results: Both the single-gene average repetition rate and the overall average repetition rate of differentially expressed genes decreased as data perturbation increased. The more significantly a gene was differentially expressed, the less it was affected by perturbation error. For error limits of 50% or less, the overall average repetition rate decreased linearly with data perturbation: each 1% increase in the perturbation error limit lowered the overall average repetition rate by approximately 1.85%. The larger the perturbation error limit, the greater the fluctuation in gene rank position. Conclusion: Data perturbation is one cause of the poor reproducibility of differentially expressed genes, and computer simulation can be used to quantify its effect on the selection of differentially expressed genes.
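The simulation described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and several details are assumptions that the abstract does not specify: the perturbation is taken to be multiplicative with a relative error drawn uniformly within the error limit; p-values come from a Welch-type statistic with a normal approximation; the ordinary Benjamini-Hochberg linear step-up stands in for the adaptive ALSU variant; and the repetition rate is taken to be the share of genes selected from the unperturbed data that are selected again after perturbation. All function names are hypothetical.

```python
import math
import random
from statistics import mean, variance


def perturb(values, limit, rng):
    """Assumed model: multiply each value by (1 + e), e uniform in [-limit, limit]."""
    return [v * (1.0 + rng.uniform(-limit, limit)) for v in values]


def two_sample_p(a, b):
    """Two-sided p-value of a Welch-type statistic (normal approximation, an assumption)."""
    se = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    if se == 0.0:
        return 1.0 if mean(a) == mean(b) else 0.0
    z = (mean(a) - mean(b)) / se
    return math.erfc(abs(z) / math.sqrt(2.0))


def bh_select(pvalues, q=0.05):
    """Indices rejected by the (non-adaptive) Benjamini-Hochberg linear step-up at level q."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    k = 0  # largest rank whose p-value is under its threshold
    for rank, i in enumerate(order, start=1):
        if pvalues[i] <= rank * q / m:
            k = rank
    return set(order[:k])


def repetition_rate(reference, selected):
    """Share of reference genes that are selected again (assumed definition)."""
    if not reference:
        return 0.0
    return len(reference & selected) / len(reference)


def mean_repetition_rate(tumor, normal, limit, n_sim=100, q=0.05, seed=0):
    """tumor and normal each hold one list of expression values per gene."""
    rng = random.Random(seed)
    ref = bh_select([two_sample_p(t, n) for t, n in zip(tumor, normal)], q)
    rates = []
    for _ in range(n_sim):
        p = [two_sample_p(perturb(t, limit, rng), perturb(n, limit, rng))
             for t, n in zip(tumor, normal)]
        rates.append(repetition_rate(ref, bh_select(p, q)))
    return mean(rates)
```

Calling `mean_repetition_rate` over a range of `limit` values (for example 0.01 to 0.50) would give one average repetition rate per error limit, which is the kind of curve the abstract summarizes with its linear trend.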