Lithological Boundaries Identification in Dense Vegetation Area Based on Satellite Data Using Rare Training Data

Hary Nugroho, Ketut Wikantika, Satria Bijaksana, Asep Saepuloh


Lithological maps, which show the boundaries between rock types, are among the most important geological maps. Machine learning algorithms such as Random Forest (RF) are useful classification methods for producing lithology predictions. When lithology is mapped for the first time in an area that is difficult to access, the main problem is collecting training data. The amount of training data that can be collected is often very limited, especially where vegetation is dense. This study assesses the performance of the RF algorithm with hyperparameter tuning in identifying lithological boundaries in densely vegetated areas from remote sensing data when training data are scarce. We conducted experiments that simulated remote predictive mapping (RPM), applying an RF algorithm to satellite data to obtain a predictive lithological map of Komopa, in Paniai District, Papua Province, Indonesia. The study area has dense vegetation and thick soil layers. We used remote sensing data consisting of Sentinel-2A, ALOS PALSAR, and a DEM, together with 1000 drill log points. The results of nine representative models showed moderate test accuracy for lithological classification (0.53-0.75) but low recall (0.24-0.59), precision (0.24-0.51), and F1 score (0.24-0.39). Meanwhile, the training accuracy of every model was very high (0.92-1.0). Model 9, which used only 50 balanced training points, gave the best classification result: although its test accuracy and F1 score were relatively low, the lithological boundaries it produced were the closest to the existing lithological map. This study shows that RF classification with balanced training data can provide good results in predictive lithological mapping even when the number of training points is small.


Keywords: lithological map, remote predictive mapping, machine learning, random forest, remote sensing.
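
The abstract reports moderate test accuracy alongside much lower recall, precision, and F1 score. One way such a gap can arise is when the scores are averaged per class (macro averaging) and some classes are rarely predicted correctly. The abstract does not say how the metrics were averaged, so the sketch below is only an illustration under that assumption. The labels `rock_A`, `rock_B`, and `rock_C`, the sample counts, and the helper `macro_scores` are hypothetical and are not taken from the study.

```python
def macro_scores(y_true, y_pred):
    """Return accuracy and macro-averaged precision, recall, and F1."""
    pairs = list(zip(y_true, y_pred))
    classes = sorted(set(y_true) | set(y_pred))
    accuracy = sum(t == p for t, p in pairs) / len(pairs)

    precisions, recalls, f1s = [], [], []
    for c in classes:
        tp = sum(t == c and p == c for t, p in pairs)  # true positives
        fp = sum(t != c and p == c for t, p in pairs)  # false positives
        fn = sum(t == c and p != c for t, p in pairs)  # false negatives
        prec = tp / (tp + fp) if tp + fp else 0.0
        rec = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
        precisions.append(prec)
        recalls.append(rec)
        f1s.append(f1)

    n = len(classes)
    return {
        "accuracy": accuracy,
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }


# Hypothetical, imbalanced test set: 8 points of one class, 1 each of two others.
y_true = ["rock_A"] * 8 + ["rock_B", "rock_C"]
# A classifier that always predicts the majority class.
y_pred = ["rock_A"] * 10

scores = macro_scores(y_true, y_pred)
for name, value in scores.items():
    print(f"{name}: {value:.2f}")
```

In this toy case the accuracy is 0.80 because the majority class dominates, while the macro-averaged recall is only about 0.33 because two of the three classes are never recovered. Accuracy alone can therefore look acceptable when minority rock types are poorly classified.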







