Data Set Analysis Using Rapid Miner to Predict Cost Insurance Forecast with Data Mining Methods

Johanes Fernandes Andry, Henny Hartono, Honni, Aziza Chakir, Rafael

Abstract

Insurance protection cannot be separated from everyday human life because every human activity carries risk. Most people have entered into insurance agreements with state-owned or national privately owned insurance companies. The information system is one of the resources a company can use to increase its competitive advantage: it can be used to obtain, process, and disseminate information in support of day-to-day operations and strategic decision-making. The rapid accumulation of data has created a condition that is rich in data but poor in information. Data mining, the discovery of new information by searching for specific patterns or rules in large amounts of data, is expected to overcome this condition. It is hoped that customer data can be turned into accurate information about insurance cost predictions. In this analysis, the authors use RapidMiner Studio version 9.1 to analyze the insurance data. The scientific novelty of this research lies in investigating an insurance cost data set with data mining techniques consisting of classification, association, and clustering. The goals of the classification, association, and clustering case studies are to find all association rules with high confidence and to organize objects into groups whose members are similar to one another. The following methods are used: a decision tree for data modeling, FP-Growth for determining which itemsets occur most frequently in the data set, and K-Means for grouping the data by attribute to facilitate the analysis.
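As a rough illustration of the clustering step named in the abstract, the following is a minimal K-Means sketch in plain Python. It is not the authors' RapidMiner Studio process. The records, the function name k_means, and the choice of the first k points as starting centroids are all assumptions made for this example.

```python
from math import dist


def k_means(points, k, iterations=20):
    """Cluster points (tuples of floats) into k groups with Lloyd's algorithm."""
    # Assumption: the first k points serve as initial centroids so runs are repeatable.
    centroids = [tuple(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        labels = [min(range(k), key=lambda c: dist(p, centroids[c])) for p in points]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                new_centroids.append(tuple(sum(axis) / len(members) for axis in zip(*members)))
            else:
                new_centroids.append(centroids[c])  # an empty cluster keeps its centroid
        if new_centroids == centroids:
            break  # converged: no centroid moved
        centroids = new_centroids
    return labels, centroids


# Hypothetical (age, annual charge in thousands) records, invented for illustration only.
records = [(19, 2.1), (23, 2.8), (27, 3.4), (52, 11.9), (58, 13.2), (61, 14.5)]
labels, centroids = k_means(records, k=2)
print(labels)
print(centroids)
```

On this toy input the three younger, lower-charge records fall into one cluster and the three older, higher-charge records into the other. On real data the attributes would normally be scaled first, since unscaled Euclidean distance lets the attribute with the largest range dominate.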

 

Keywords: insurance, information system, data mining, RapidMiner.

 

https://doi.org/10.55463/issn.1674-2974.49.6.17





