A Genome-wide Epistasis Analysis Method Based on Multiple Criteria Fusion

LI Ze-jun, CHEN Min, ZENG Li-jun


Traditional genome-wide association studies suffer from serious defects such as low reproducibility and results that are difficult to interpret, while epistasis analysis based on machine learning suffers from high computational complexity and insufficient prediction accuracy. This paper presents a new approach to genome-wide epistasis analysis. The method follows the two-stage framework for epistasis analysis, consisting of a filtering stage and an epistatic combinatorial optimization stage. The filtering stage uses a multi-criteria fusion strategy that evaluates genetic loci from multiple perspectives, so that susceptibility loci with weak effects are retained; it then applies a multi-criteria rank fusion strategy to eliminate genetic variants that are only weakly associated with disease status. The epistatic combinatorial optimization stage uses a greedy algorithm to search the space of locus combinations heuristically, which reduces time complexity. Finally, a support vector machine serves as the epistasis evaluation model. In experiments under different linkage disequilibrium parameters, the method was compared with the classical algorithms SNPRuler and ACO. The results show that it effectively retains weak-effect loci and considerably improves disease prediction accuracy.
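
The two stages described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not name the criteria, the fusion rule, or the stopping condition, so the mean-rank fusion, the two example criteria (a chi-square statistic and mutual information), the locus identifiers, and the `toy_score` function are all assumptions made for illustration. In the paper the evaluation model is a support vector machine; here any callable that scores a locus subset can be plugged in.

```python
from typing import Callable, Dict, List, Sequence


def fuse_ranks(criteria: Sequence[Dict[str, float]]) -> Dict[str, float]:
    """Average each locus's rank over all criteria (rank 1 = highest score).

    Mean-rank fusion is an assumed rule, not taken from the paper.
    """
    loci = list(criteria[0])
    fused = {locus: 0.0 for locus in loci}
    for scores in criteria:
        ordered = sorted(loci, key=lambda l: scores[l], reverse=True)
        for rank, locus in enumerate(ordered, start=1):
            fused[locus] += rank / len(criteria)
    return fused


def filter_loci(criteria: Sequence[Dict[str, float]], keep: int) -> List[str]:
    """Filtering stage: keep the `keep` loci with the best fused rank."""
    fused = fuse_ranks(criteria)
    return sorted(fused, key=lambda l: fused[l])[:keep]


def greedy_search(
    candidates: Sequence[str],
    evaluate: Callable[[List[str]], float],
    max_size: int,
) -> List[str]:
    """Optimization stage: grow a locus combination one locus at a time.

    Stops when no remaining locus improves the evaluation score.
    """
    selected: List[str] = []
    best = float("-inf")
    while len(selected) < max_size:
        remaining = [c for c in candidates if c not in selected]
        if not remaining:
            break
        locus = max(remaining, key=lambda c: evaluate(selected + [c]))
        score = evaluate(selected + [locus])
        if score <= best:
            break
        selected.append(locus)
        best = score
    return selected


# Hypothetical per-locus scores under two criteria (made-up numbers).
chi2 = {"rs1": 9.0, "rs2": 5.0, "rs3": 1.0, "rs4": 0.5}
mutual_info = {"rs1": 0.20, "rs2": 0.01, "rs3": 0.30, "rs4": 0.005}


def toy_score(subset: List[str]) -> float:
    """Stand-in for a classifier's accuracy on a locus subset (assumed)."""
    s = set(subset)
    score = 0.5 + 0.1 * ("rs1" in s)
    if {"rs1", "rs3"} <= s:
        score += 0.3  # rs1 and rs3 only help strongly in combination
    return score - 0.02 * len(s)


kept = filter_loci([chi2, mutual_info], keep=3)
combination = greedy_search(kept, toy_score, max_size=3)
print(kept, combination)
```

In this toy data, `rs3` scores poorly on the first criterion but well on the second, so the fused rank keeps it through the filter, and the greedy search then finds it in combination with `rs1`. This mirrors the abstract's point that evaluating loci from several perspectives helps retain weak-effect loci.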



Keywords: GWAS (Genome-Wide Association Study), epistasis, complex diseases, intelligent computing





