Machine Learning-Based Prediction Model for Loan Status Approval

Suliman Mohamed Fati


Loan approval is a persistent challenge for financial organizations, where inaccurate estimation or missing information disrupts the operational financial process. Banks therefore aim to minimize credit risk by assessing each loan through an intensive evaluation process that avoids unforeseen issues. Predicting the loan status from the given and collected information is consequently very important. Data mining, and machine learning in particular, is a promising direction for making accurate and timely decisions to approve or reject loans. The main goal of this work is to investigate the loan prediction process by applying different machine learning algorithms. The proposed methodology starts with data pre-processing to clean the data, remove outliers, and compute the correlations between the features in order to identify the most noteworthy feature. Then, three machine learning algorithms are trained and tested: Logistic Regression, Decision Tree, and Random Forest. The novelty of this research lies in comparing these three algorithms to find the most accurate predictor. The experimental results showed the superiority of Logistic Regression over the other two algorithms in terms of accuracy, precision, recall, F1, and area under the curve (AUC). The algorithms were also subjected to receiver operating characteristic (ROC) analysis, which demonstrated the ability of Logistic Regression to predict the loan status under different thresholds.


Keywords: loan approval, machine learning algorithm, logistic regression, data mining, prediction model.
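
The evaluation criteria named in the abstract (accuracy, precision, recall, F1, and AUC) can be illustrated with a minimal sketch. This is not the implementation used in this work: the function names are hypothetical, the labels and scores are invented toy values, and label 1 is assumed to mean "approved". It only shows how the threshold-dependent metrics and the threshold-free AUC are typically computed from predicted scores.

```python
from typing import Dict, List, Tuple


def confusion_counts(y_true: List[int], scores: List[float],
                     threshold: float) -> Tuple[int, int, int, int]:
    """Return (tp, fp, tn, fn); a score >= threshold is predicted as 1 (approved)."""
    tp = fp = tn = fn = 0
    for label, score in zip(y_true, scores):
        predicted = 1 if score >= threshold else 0
        if predicted == 1 and label == 1:
            tp += 1
        elif predicted == 1 and label == 0:
            fp += 1
        elif predicted == 0 and label == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def classification_metrics(y_true: List[int], scores: List[float],
                           threshold: float = 0.5) -> Dict[str, float]:
    """Accuracy, precision, recall, and F1 at a single decision threshold."""
    tp, fp, tn, fn = confusion_counts(y_true, scores, threshold)
    total = tp + fp + tn + fn
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {
        "accuracy": (tp + tn) / total,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


def roc_auc(y_true: List[int], scores: List[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs in which the positive
    example receives the higher score; ties count as half a win."""
    positives = [s for y, s in zip(y_true, scores) if y == 1]
    negatives = [s for y, s in zip(y_true, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


if __name__ == "__main__":
    # Invented toy data: 1 = approved, 0 = rejected; scores are model outputs.
    y_true = [1, 1, 1, 0, 1, 0, 0, 1]
    scores = [0.9, 0.8, 0.7, 0.6, 0.55, 0.4, 0.3, 0.2]
    for threshold in (0.3, 0.5, 0.7):
        print(threshold, classification_metrics(y_true, scores, threshold))
    print("AUC:", roc_auc(y_true, scores))
```

Accuracy, precision, recall, and F1 change with the chosen threshold, whereas AUC summarizes ranking quality across all thresholds, which is why ROC analysis is a natural complement when comparing classifiers.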

