Optimizing Football Match Outcome Prediction: A Comparative Study of Boosting Algorithms and Deep Learning on Tabular Sports Data
Abstract
Predicting event outcomes, particularly in sports, has attracted increasing research attention due to the growing availability of historical and performance-related data. Football match outcome prediction has traditionally relied on expert judgment, statistical analysis of past results, and qualitative assessments of team strengths and weaknesses; however, such approaches may be limited by subjectivity, incomplete feature representation, and restricted predictive consistency. This study develops and compares predictive models for football match outcomes using ensemble learning and deep learning algorithms applied to tabular sports data. A publicly available football match dataset obtained from Kaggle was used, and five algorithms were implemented: Deep Neural Network (DNN), TabTransformer, Neural Oblivious Decision Ensembles (NODE), XGBoost, and LightGBM. Model performance was evaluated using standard classification metrics, including precision, recall, F1-score, and accuracy. The results show that the deep learning models achieved moderate predictive performance, with accuracies ranging from 78% for NODE to 87% for the best-performing deep learning model. In contrast, XGBoost demonstrated strong performance across all metrics, achieving 0.96 precision, 0.96 recall, 0.95 F1-score, and 96% accuracy. LightGBM achieved the highest overall performance, with 0.98 precision, 0.98 recall, 0.98 F1-score, and 99% accuracy. These findings indicate that LightGBM is the most effective model for this tabular classification task, followed closely by XGBoost. Although the deep learning models, particularly TabTransformer, show potential, they did not outperform the boosting algorithms in this evaluation. The study recommends the use of ensemble-based algorithms for football match outcome prediction, especially when working with structured tabular datasets. 
Future research may extend this work by applying advanced hyperparameter optimization techniques, such as grid search, random search, or Bayesian optimization, to further improve the performance of LightGBM and XGBoost.
Keywords: football match prediction; match outcome classification; ensemble learning; deep learning; XGBoost; LightGBM; tabular data.
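The four evaluation metrics named in the abstract can be illustrated with a minimal sketch. The abstract does not state how many outcome classes were used or how per-class scores were averaged, so the three labels (H for home win, D for draw, A for away win), the macro averaging, and the function name macro_metrics are all assumptions made for illustration, not the authors' evaluation code. The toy labels are invented and are not taken from the study's dataset.

```python
from collections import Counter


def macro_metrics(y_true, y_pred):
    """Return accuracy and macro-averaged precision, recall, and F1.

    Macro averaging (an assumption here) gives every class equal weight,
    regardless of how many matches fall into it.
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and of equal length")

    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1   # correct prediction for this class
        else:
            fp[pred] += 1    # predicted class was wrong
            fn[truth] += 1   # true class was missed

    precisions, recalls, f1s = [], [], []
    for label in labels:
        p_den = tp[label] + fp[label]
        r_den = tp[label] + fn[label]
        precision = tp[label] / p_den if p_den else 0.0
        recall = tp[label] / r_den if r_den else 0.0
        pr_sum = precision + recall
        f1 = 2 * precision * recall / pr_sum if pr_sum else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)

    n = len(labels)
    return {
        "accuracy": sum(tp.values()) / len(y_true),
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }


# Hypothetical toy labels: H = home win, D = draw, A = away win.
y_true = ["H", "H", "H", "D", "D", "A", "A", "A"]
y_pred = ["H", "H", "D", "D", "A", "A", "A", "H"]
report = macro_metrics(y_true, y_pred)
print(report)
```

On this toy input, five of the eight predictions are correct, so accuracy is 0.625, while macro precision, recall, and F1 each come to 11/18 (about 0.61). A weighted average would instead weight each class by its number of true matches, which matters when draws are much rarer than wins.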