Introduction

Welcome to the project on Landslide Susceptibility Mapping and Exposure Assessment, part of the Master of Science in Geoinformatics at Politecnico di Milano. This project focuses on assessing and visualizing the risk of landslides in a specific region using a combination of advanced geospatial technologies and data analysis techniques.


What is a Landslide?

A landslide is the movement of rock, earth, or debris down a slope due to gravity. Landslides can be triggered by various factors including heavy rainfall, earthquakes, volcanic activity, or human activities such as construction. They pose significant risks to life, property, and the environment, making it crucial to identify and monitor susceptible areas.


Role of GIS and Remote Sensing

Geographic Information Systems (GIS) and remote sensing technologies are essential tools in landslide susceptibility mapping. GIS allows for the integration, analysis, and visualization of various geospatial data layers, helping to identify patterns and relationships. Remote sensing involves the acquisition of information about the Earth's surface from satellite or aerial sensors, providing critical data for monitoring and analysis.

In this project, we have utilized GIS and remote sensing to integrate various datasets such as Digital Terrain Model (DTM), land use data (DUSAF), river and road buffers, landslide inventory maps, and Normalized Difference Vegetation Index (NDVI). These datasets help in understanding the terrain, land cover, proximity to water bodies and infrastructure, historical landslide events, and vegetation health, respectively.


Importance of Monitoring and Predicting Landslides

Monitoring and predicting landslides is crucial for disaster risk reduction. By identifying areas that are susceptible to landslides, authorities can implement preventive measures, plan land use more effectively, and respond promptly to potential landslide events. Evaluating the exposure of populations in these areas helps in prioritizing resources and efforts for evacuation and emergency response, thereby minimizing the impact on communities.

Through this project, we aim to identify areas susceptible to landslides and evaluate the exposure of the population within these areas. By integrating various datasets and utilizing advanced geospatial analysis techniques, we have generated a comprehensive landslide susceptibility map.

The final output includes a WebGIS application that provides an interactive platform for exploring the susceptibility map and related data layers. This tool not only aids in understanding the spatial distribution of landslide risk but also serves as a valuable resource for decision-makers in land use planning and disaster management.


Study Area

The study area is situated in the Lombardy region, within the province of Bergamo, Italy. Positioned to the north of Lake Iseo, this region encompasses an area of approximately 56 square kilometers and hosts a population of 12,109 residents. The area includes the towns of Cerete, Songavazzo, Fino del Monte, and portions of Rovetta and Sovere.

Geographic Coordinates and Extent

  • Latitude: 45.85° N to 45.95° N
  • Longitude: 9.85° E to 10.05° E

The study area is characterized by its diverse topography, which includes the foothills of the Alps, rolling hills, and proximity to Lake Iseo. This geographical diversity contributes to the unique environmental and climatic conditions observed within the region.

Area

Key Urban Areas

Cerete

Coordinates 45.8801° N, 10.0567° E. Known for its historical architecture and community-oriented environment.

Songavazzo

Coordinates 45.8854° N, 9.9953° E. Noted for its scenic landscapes and outdoor recreational opportunities.

Fino del Monte

Coordinates 45.8815° N, 9.9964° E. Recognized for its tranquil atmosphere and picturesque views.

Rovetta

Coordinates 45.8920° N, 10.0512° E. A town with a rich historical heritage and natural beauty.

Sovere

Coordinates 45.8103° N, 10.0569° E. Known for its historical landmarks and scenic surroundings.

Demographic and Socioeconomic Profile

The population density of the study area is approximately 216 inhabitants per square kilometer. The socioeconomic activities are predominantly centered around agriculture, tourism, and small-scale industries. The demographic distribution exhibits a balanced age structure with a mix of young and elderly populations.

Environmental and Climatic Features

The region experiences a temperate climate with distinct seasonal variations. The proximity to Lake Iseo influences the local microclimate, contributing to milder winters and cooler summers compared to the surrounding inland areas. The area's elevation ranges from approximately 180 meters to 1,200 meters above sea level, influencing both climate and biodiversity.

About

This project is part of the Master of Science in Geoinformatics at Politecnico di Milano. The MSc in Geoinformatics program at Polimi aims to equip students with advanced skills in geographic information systems, remote sensing, and spatial data analysis. The program emphasizes the application of geoinformatics in various fields such as environmental monitoring, urban planning, and disaster management.


Politecnico di Milano

Politecnico di Milano, founded in 1863, is one of the leading technical universities in Europe. The university is known for its high-quality education and research in engineering, architecture, and design. The Department of Civil and Environmental Engineering (DICA) at Polimi is dedicated to advancing knowledge and technologies in civil and environmental engineering through innovative research and teaching.


GIS Course

The Geographic Information Systems course for the academic year 2023-2024 covers various aspects of GIS and remote sensing technologies. The course includes practical projects such as landslide susceptibility mapping and population exposure assessment, providing hands-on experience in geospatial analysis and visualization.


Course Instructors


Project Team Members


Amir Hossein Donyadidegan

Amir Hossein
Donyadidegan

LinkedIn Profile

Firoozeh Rahimian

Firoozeh
Rahimian

LinkedIn Profile

Hadi Kheiri

Hadi
Kheiri

LinkedIn Profile

Workflow

This section outlines the comprehensive workflow for the Landslide Susceptibility Mapping and Exposure Assessment project. The workflow is divided into six main steps, each crucial for the successful completion of the project.


  1. Data Collection



  2. Data Preprocessing

    • Reproject Datasets to a Common CRS

    • Ensure all datasets are in the same CRS to avoid misalignments. All layers were reprojected to the CRS WGS84, which is commonly used for global datasets.

    • Clip Datasets to the Study Area

    • Focus the analysis on the specific study area by clipping all vector and raster datasets accordingly. This ensures that only relevant data within the study area is used for further analysis.

    • Rasterize Vector Datasets

    • Convert all vector data to raster format to ensure uniform data structure for analysis. This step is essential for integrating vector data such as river and road buffers into the susceptibility analysis.

    • Generate Additional Derived Layers

    • To enhance the terrain analysis, create additional layers from the Digital Terrain Model (DTM) data, such as slope, aspect, and curvature. These derived layers are essential for understanding terrain characteristics, which significantly impact landslide susceptibility analysis.
      Next, generate a "No Landslide Zone" from the slope layer. Additionally, create training and testing points from both the Landslide Inventory and No Landslide Zone layers.


  3. Susceptibility Map Generation

    • Combine Environmental Factors

    • To simplify the process of classification, combine all environmental factors (DTM, DUSAF, NDVI, RIVER, ROAD, ASPECT, SLOPE, PLAN, PROFILE, and FAULTS) into a single virtual raster layer.

    • GENERATE CLASSIFICATION MAP

    • Use the Dzetsaka Plugin in QGIS for running the Random Forest classifier.
      Input the generated virtual raster layer and Training Sampled Points (in shapefile format) into Dzetsaka plugin, selecting the appropriate attribute for generating classification map.

    • GENERATE CONFIDENCE MAP

    • The confidence map shows the classification confidence level.

    • GENERATE THE SUSCEPTIBILITY(PROBABILITY) MAP

    • The generated classification map shows whether the landslide occurs or not (no landslide zones= 1, landslide zones=2), but to generate susceptibility map we want the probability of landslide classes ranges from 0 to 100.
      Run the raster calculator to generate the landslide susceptibility map by defining the proper condition and saving the output as a raster file.


  4. Exposure Assessment

    • DATA PREPROCESSING

    • population data

    • Reproject and clip population raster data set to match the study area.

    • susceptibility map

    • Reclassify the susceptibility map into low, moderate, high, and very high classes using QGIS Raster Calculator and Resample the reclassified susceptibility map to match the population raster’s resolution.

    • EXPOSURE ASSESSMENT

    • COMPUTATION

    • Calculate population counts in each susceptibility class using QGIS Zonal Statistics tool and generate its table.

    • visualizations

    • Plot (e.g. with a pie chart) the percentage of population per each susceptibility class.

Results

Landslide Susceptibility Mapping

The results are presented in the following map:

The susceptibility map highlights high-risk areas in red, while lower-risk areas are indicated in green. This visual representation aids in identifying zones that require immediate attention and preventive measures.

Susceptibility Map

Susceptibility Map


Population Exposure Assessment

The population exposure assessment involved overlaying the landslide susceptibility map with population distribution data. This analysis was conducted to quantify the percentage of people at risk in each susceptibility class.

  • Zone 1 (Blue - Low Risk)
  • Zone 2 (Green - Medium Low Risk)
  • Zone 3 (Yellow - Medium High Risk)
  • Zone 4 (Red - High Risk)

The charts reveal that although the high-risk areas cover a significant portion of the region, the population residing in these areas is relatively low. This emphasizes the need for targeted mitigation strategies and emergency preparedness plans focused on those high-risk areas.


Critical Discussion of Numerical Outcomes

The numerical outcomes from our analysis reveal varying levels of landslide susceptibility across different zones. Notably:

  • Zone 1 (Blue - Low Risk)

    Covers 12,008,953.03 m² with a mean population exposure of 1.5316 people per unit area, indicating lower risk and moderate population density.

  • Zone 2 (Green - Medium Low Risk)

    Covers 10,427,195.27 m² and has the highest mean population exposure (4.7756 people per unit area), highlighting a higher population density in medium-risk areas.

  • Zone 3 (Yellow - Medium High Risk)

    Covers 1,867,094.454 m² with a mean population exposure of 1.0469 people per unit area, indicating moderate risk and moderate population density.

  • Zone 4 (Red - High Risk)

    Despite covering the largest area (31,666,170.06 m²), it has the lowest mean population exposure (0.1248 people per unit area). This suggests that while a significant portion of the land is high-risk, fewer people live there.

The analysis shows that while Zone 4 has a significant area marked as high risk, the population density in this zone is relatively low. Conversely, Zone 2, though medium risk, has a higher population density, which suggests that mitigation efforts should focus not only on high-risk areas but also on zones with higher population densities to reduce overall risk.


Conclusion

The results of this project underscore the efficacy of using geospatial techniques for landslide susceptibility mapping and population exposure assessment. These findings provide a foundational tool for decision-makers to implement effective land use planning and disaster mitigation strategies. By understanding the distribution of both risk and population, authorities can better allocate resources and design interventions that target the most vulnerable areas.

Validation

Validate the susceptibility map using accuracy assessment and sampling

Use a Python script for accuracy assessment, comparing predicted susceptibility with actual landslide occurrences to generate an error matrix for validation results and producing an error matrix to quantify the accuracy of the susceptibility map.

from qgis.core import QgsProcessing
from qgis.core import QgsProcessingAlgorithm
from qgis.core import QgsProcessingMultiStepFeedback
from qgis.core import QgsProcessingParameterFileDestination
from qgis.core import QgsProcessingParameterFile
from qgis.core import QgsProcessingParameterVectorLayer
from qgis.core import QgsProcessingParameterRasterLayer
from qgis.core import QgsProcessingParameterField
from qgis.core import QgsProcessingParameterString
from qgis.core import QgsVectorLayer
from qgis.core import QgsProcessingParameterFeatureSink
from qgis.core import QgsProject
from qgis.core import QgsFeatureSink
import processing
import numpy as np
import random
import string
import os
import pandas as pd
class Accuracy_assessment_and_sampling(QgsProcessingAlgorithm):        
def initAlgorithm(self, config=None):
	self.addParameter(QgsProcessingParameterVectorLayer('vectorwithclassificationandreference', 'Vector with reference data', types=[QgsProcessing.TypeVector], defaultValue=None))
	#self.addParameter(QgsProcessingParameterVectorLayer('vectorwithclassificationandreference', 'Vector with classification', types=[QgsProcessing.TypeVector], defaultValue='C:/Users/Gorica/Documents/Teaching_2020_2021/Lab/Sampling points/stratified_random_sampling1_classified.gpkg'))
	self.addParameter(QgsProcessingParameterField('reference', 'Reference data column', type=QgsProcessingParameterField.Numeric, parentLayerParameterName='vectorwithclassificationandreference', allowMultiple=False, defaultValue=None))
	self.addParameter(QgsProcessingParameterString('Newfieldname', 'Name of new field for raster values', multiLine=False, defaultValue='Thematic_class'))
	self.addParameter(QgsProcessingParameterRasterLayer('raster', 'Raster to be sampled', defaultValue=None))
	self.addParameter(QgsProcessingParameterFileDestination('Outputfolder', 'Error matrix output path', fileFilter='CSV Files (*.csv)', defaultValue=None))
	self.addParameter(QgsProcessingParameterFeatureSink('Sampled', 'Sampled', type=QgsProcessing.TypeVectorAnyGeometry, createByDefault=True))#, defaultValue=None))
	#self.addParameter(QgsProcessingParameterVectorDestination('Sampled', 'Sampled', type=QgsProcessing.TypeVectorPoint, defaultValue=None, optional=True))
	
def processAlgorithm(self, parameters, context, model_feedback):
	# Use a multi-step feedback, so that individual child algorithm progress reports are adjusted for the
	# overall progress through the model
	feedback = QgsProcessingMultiStepFeedback(1, model_feedback)
	results = {}
	outputs = {}

	#reference=parameters['reference']
	#classification=parameters['Newfieldname'] 
	#output_folder=parameters['Outputfolder']
	
	# Sample raster values
	alg_params = {
		'COLUMN_PREFIX': parameters['Newfieldname'],
		'INPUT': parameters['vectorwithclassificationandreference'],
		'RASTERCOPY': parameters['raster'],
		'OUTPUT': parameters['Sampled']
	}

	SampleRasterValues= processing.run('qgis:rastersampling', alg_params, context=context, feedback=feedback, is_child_algorithm=True)
	

	vlayer = QgsVectorLayer(SampleRasterValues['OUTPUT'])

	
	idx_1 = vlayer.fields().indexFromName(parameters['reference'])
	idx_2 = vlayer.fields().indexFromName(parameters['Newfieldname'])
	

	list_class = []
	list_ref = []
	
	features =  vlayer.getFeatures()
	
	for ft in features:
		if ft.attributes()[idx_2]!=None and ft.attributes()[idx_1]!=None:
			list_class.append(ft.attributes()[idx_2])
			list_ref.append(ft.attributes()[idx_1])

	feedback.pushInfo(str(idx_2))
	error_matrix1=pd.crosstab(pd.Series(list_class, name=parameters['Newfieldname']),pd.Series(list_ref, name=parameters['reference']), dropna=False)

	cls_cat=error_matrix1.index.values
	
	#extract all columns values (classes of existing dataset)
	ref_cat=error_matrix1.columns.values
	#make union of index and column values
	cats=(list(set(ref_cat) | set(cls_cat)))
	#reindex error matrix so that it has missing columns and fill the emtpy cells with 0.00000001
	error_matrix=error_matrix1.reindex(index=cats, columns=cats, fill_value=0.00000001)
	error_matrix.index.name=error_matrix.index.name+"/"+error_matrix.columns.name
	
	# OUTPUT 		
	diag_elem=np.diagonal(np.matrix(error_matrix))
	UA=(diag_elem/error_matrix.sum(axis=1))*(diag_elem>0.01)
	PA=diag_elem/error_matrix.sum(axis=0)*(diag_elem>0.01)
	OA=sum(diag_elem)/error_matrix.sum(axis=1).sum()
	error_matrix['UA']=UA.round(2)
	error_matrix['PA']=PA.round(2)
	error_matrix['OA']=np.nan
	error_matrix.loc[error_matrix.index[0],'OA']=OA

	error_matrix.to_csv(parameters['Outputfolder'])
	feedback.pushConsoleInfo('Error matrix saved in '+parameters['Outputfolder'])
	return results

def name(self):
	return 'Accuracy_assessment_and_sampling'

def displayName(self):
	return 'Accuracy assessment and sampling'

def group(self):
	return 'raster'

def groupId(self):
	return ''
def createInstance(self):
	return Accuracy_assessment_and_sampling()

											

Algorithm Description

This algorithm takes sample points with reference classes and sample raster in the same location of sample points. Then it estimates error matrix and accuracy indexes by cross tabulating field with reference classes and field with classification classes.

Input Parameters

  • Vector with Reference Data: Vector of sampling points which are already classified (i.e. reference point data).
  • Raster to be Sampled: Raster is the thematic map for which accuracy is being estimated. This raster will be sampled in the location of sampling points.
  • Reference Data Column: Name of the column in which sampled raster values will be stored.

Outputs

  • Error Matrix Output Path: Mandatory!!! Path and name of the output file with error matrix and accuracy indexes. Accuracy indexes estimated are Overall accuracy, User's accuracy, and Producer's accuracy.
  • Sampled: Mandatory!!! Path and name of the output vector file that contains all the data as "Vector with reference data" plus the new column with values sampled from raster.

Algorithm Author: Gorica Bratic
Algorithm Version: 1.0


Error Matrix and Validation

To validate the accuracy of our susceptibility map, we computed an Error Matrix using a subset of known landslide locations. The following table presents the computed matrix:

Predicted/Actual No Landslide Landslide UA PA OA
No Landslide 120 77 0.61 0.80 0.69
Landslide 30 118 0.80 0.61

WebGIS

Explore the interactive WebGIS application developed to allows users to interact with the data layers, toggle between different views, and gain insights into the spatial distribution of landslide risk and population exposure. This tool serves as a valuable resource for stakeholders, researchers, and policymakers involved in disaster risk management and urban planning.

Go to WebGIS