1. Machine learning based attribution of flood risk trends 


Heidi Kreibich

Our research interests are the following: flood risk assessment, flood vulnerability analysis and loss modelling, flood damage mitigation and risk management, and multi-risk assessments. Our approaches include machine learning, particularly decision tree approaches, Bayesian networks, and Bayesian statistics, using R and Stan.

What is the data science project’s research question? How can qualitative and quantitative flood risk data from the last 50 years be combined and analyzed together to attribute temporal changes in flood risk?

What data will be worked on? The Panta Rhei benchmark dataset of socio-hydrological data of paired events comprises 24 single or multiple paired-event case studies of floods across different socio-economic and hydro-climatic contexts globally. The (semi-)quantitative and qualitative data, organized into about 20 variables per event, describe the hazard, exposure and vulnerability of the paired events as well as the changes in risk management and policy that happened between the events. This is combined with daily and monthly streamflow data from the Global Runoff Data Centre (https://www.bafg.de/GRDC/EN/Home/homepage_node.html), covering the period from 1901 to the present.

What tasks will this project involve? Data analyses, particularly trend detection and attribution using socio-hydrological differential equations and Bayesian inference in R and Stan.
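As a simple illustration of what trend detection on long streamflow records involves (the project itself will use Bayesian inference in R and Stan, not the method shown here), the sketch below implements the classic non-parametric Mann–Kendall trend test in plain Python. The input series is hypothetical annual-maximum discharge values, not actual GRDC data.

```python
from math import sqrt, erf

def mann_kendall(series):
    """Mann-Kendall trend test: returns (S, Z, two-sided p-value).

    S > 0 suggests an upward trend, S < 0 a downward trend.
    Uses the simple variance formula that assumes no tied values.
    """
    n = len(series)
    s = 0
    # S is the sum of signs of all pairwise differences (later minus earlier)
    for i in range(n - 1):
        for j in range(i + 1, n):
            diff = series[j] - series[i]
            s += (diff > 0) - (diff < 0)
    var_s = n * (n - 1) * (2 * n + 5) / 18.0  # variance of S (no ties)
    # Continuity-corrected normal approximation for the test statistic
    if s > 0:
        z = (s - 1) / sqrt(var_s)
    elif s < 0:
        z = (s + 1) / sqrt(var_s)
    else:
        z = 0.0
    # Two-sided p-value from the standard normal CDF
    p = 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))
    return s, z, p

# Hypothetical annual-maximum discharge series with a visible upward trend
s, z, p = mann_kendall([10, 12, 11, 14, 15, 17, 16, 19, 21, 22])
print(s, round(z, 2), round(p, 4))
```

In practice one would also correct for serial correlation in daily or monthly streamflow before applying such a test; the Bayesian approach planned for the project can model that structure directly.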

What makes this project interesting to work on?  Tackling the challenge of considering and jointly analysing quantitative and qualitative data is exciting, novel and urgently needed, since it is becoming increasingly apparent that human behaviour and socio-economic processes have an important, in part even dominant, influence on the risk posed by natural hazards. This leads us to the “difficult” empirical data: mainly quantitative data to describe the physical processes, and mainly semi-quantitative and qualitative data to describe the socio-economic processes.
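One common first step for analysing qualitative and quantitative data jointly is to map ordinal categories onto a numeric scale so they can enter the same statistical model. The sketch below illustrates this idea; the variable names, categories, and values are purely hypothetical and are not the actual Panta Rhei coding scheme.

```python
# Hypothetical ordinal coding for qualitative ratings; the actual
# Panta Rhei variables and categories may differ.
ORDINAL_SCALE = {"low": 1, "medium": 2, "high": 3}

def encode_event(event):
    """Map qualitative string ratings to numbers; pass quantitative fields through."""
    encoded = {}
    for key, value in event.items():
        if isinstance(value, str):
            encoded[key] = ORDINAL_SCALE[value.lower()]
        else:
            encoded[key] = value
    return encoded

# A hypothetical pair of events at the same location, before and after
# a change in risk management between them
event_before = {"peak_discharge_m3s": 3400.0, "preparedness": "low"}
event_after = {"peak_discharge_m3s": 3600.0, "preparedness": "high"}

print(encode_event(event_before))
print(encode_event(event_after))
```

Treating categories as equally spaced integers is itself a modelling assumption; ordinal regression models (which Stan supports) avoid it by estimating the cut-points between categories.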

What is the expected outcome?  A contribution to a research paper.

What infrastructure, programs and tools will be used? Can they be used remotely?  Mainly R and Stan; they can be used remotely.

What skills are necessary for this project? Data analytics / statistics, Scientific computation, Data mining / Machine learning

Is the data open source? The data will be made open source as soon as the first paper about it is published. Currently it is available to my lab and some others.

Interested candidates should be at PhD level.  Heidi Kreibich is looking for one visiting scientist to work on the project together with the team.