Identifying Landslide Hotspots Using Unsupervised Clustering: A Case Study
DOI: https://doi.org/10.62411/faith.3048-3719-37
Keywords: Algorithms, Clustering, Landslide, Mean, Mean Shift, Metrics, Topographic data, Unsupervised Machine Learning
Abstract
Landslides pose significant threats to life, property, and infrastructure. This study applies unsupervised learning techniques to identify and characterize landslide-prone areas. We clustered topographic data with K-Means, Hierarchical Clustering, Spectral Clustering, Mean Shift Clustering, and DBSCAN to uncover hidden patterns in landslide occurrence. Algorithm performance was assessed with the Silhouette Score, Davies-Bouldin Index, and Calinski-Harabasz Index. Hierarchical Clustering achieved the highest Silhouette Score (0.635), indicating well-separated clusters. Mean Shift Clustering, however, achieved the best Davies-Bouldin Index (0.603; lower is better) and the highest Calinski-Harabasz Index (4121.75), giving it the best overall clustering performance. DBSCAN also performed well, with a Silhouette Score of 0.610 and 12 points flagged as noise. These findings contribute to a deeper understanding of the spatial distribution of landslides and can inform the development of early warning systems and mitigation strategies.
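The comparison described in the abstract can be illustrated with a minimal scikit-learn sketch. This is not the authors' code: the synthetic `make_blobs` data stands in for the topographic features, and the helper names (`evaluate`, `compare_clusterers`), the cluster count of 3, and the DBSCAN settings (`eps=0.5`, `min_samples=5`) are assumptions chosen only for illustration. Points that DBSCAN labels as noise (-1) are excluded before scoring, which is one common convention and may differ from what the study did.

```python
import numpy as np
from sklearn.cluster import (DBSCAN, AgglomerativeClustering, KMeans,
                             MeanShift, SpectralClustering)
from sklearn.datasets import make_blobs
from sklearn.metrics import (calinski_harabasz_score, davies_bouldin_score,
                             silhouette_score)
from sklearn.preprocessing import StandardScaler


def evaluate(X, labels):
    """Score one clustering; returns None if fewer than 2 clusters remain."""
    labels = np.asarray(labels)
    mask = labels != -1  # drop DBSCAN noise points before scoring (assumption)
    n_clusters = len(set(labels[mask]))
    if n_clusters < 2:
        return None
    Xc, lc = X[mask], labels[mask]
    return {
        "n_clusters": n_clusters,
        "n_noise": int((~mask).sum()),
        "silhouette": silhouette_score(Xc, lc),                 # higher is better
        "davies_bouldin": davies_bouldin_score(Xc, lc),         # lower is better
        "calinski_harabasz": calinski_harabasz_score(Xc, lc),   # higher is better
    }


def compare_clusterers(X, n_clusters=3, random_state=0):
    """Fit the five algorithms named in the abstract and score each one."""
    X = StandardScaler().fit_transform(X)
    models = {
        "K-Means": KMeans(n_clusters=n_clusters, n_init=10,
                          random_state=random_state),
        "Hierarchical": AgglomerativeClustering(n_clusters=n_clusters),
        "Spectral": SpectralClustering(n_clusters=n_clusters,
                                       random_state=random_state),
        "Mean Shift": MeanShift(),  # bandwidth estimated from the data
        "DBSCAN": DBSCAN(eps=0.5, min_samples=5),  # illustrative settings
    }
    return {name: evaluate(X, model.fit_predict(X))
            for name, model in models.items()}


# Synthetic stand-in for topographic features (hypothetical data).
X, _ = make_blobs(n_samples=300, centers=3, n_features=4,
                  cluster_std=0.6, random_state=42)
results = compare_clusterers(X)
for name, scores in results.items():
    print(name, scores)
```

Because the three metrics reward different properties, they need not agree on a single best algorithm, which is consistent with the abstract reporting different winners for the Silhouette Score and for the other two indices.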
License
Copyright (c) 2024 Journal of Future Artificial Intelligence and Technologies
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.