
Macroecology and Biogeography meeting

May 3rd to 6th 2023 - Universität Bayreuth


Species distribution models based on ML and DL - can we reliably infer ecological effects?

Maximilian Pichler¹, Florian Hartig¹
¹ Theoretical Ecology, University of Regensburg

O 4.4 in Session 4: From range dynamics to extinction

05.05.2023, 10:30-10:45, SWO conference room

Species distribution models (SDMs) increasingly use machine learning (ML) and deep learning (DL) algorithms to describe the response of species to environmental predictors. The main reason to prefer these models over statistical models such as logistic regression is their superior predictive performance. A concern, however, is that this comes at the cost of explanatory power, in particular because i) ML and DL are more likely to exploit spurious correlations in the data and ii) explainable AI (xAI) tools that extract effects or response curves from fitted models are often sensitive to feature collinearity.

Here, we show that both concerns are justified, but can be alleviated by appropriate methodological choices. First, as with statistical models, a prerequisite for ML to learn correct causal effects is that feature selection is based on causal principles, such as conditioning on confounders following Pearl's backdoor adjustment. We also show that this can increase the generalizability of the models, for example when predicting under climate change. Second, appropriate xAI tools must be used: we propose Average Conditional Effects (ACE), which are robust to feature collinearity, to extract effects from ML models. Finally, the choice of ML algorithm is crucial. We show that if the other two conditions are met, neural networks and boosted regression trees are better suited than random forest to reliably separate collinear effects.
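The role of conditioning on confounders can be illustrated with a small simulation. The sketch below is not the authors' code or data: the variable names, effect sizes, and sample size are assumptions chosen for illustration, and a linear model stands in for the ML and DL models discussed in the abstract. A confounder c (say, a hypothetical elevation variable) drives both the predictor x and the response y. Regressing y on x alone absorbs the confounder's influence into the slope for x, while including c recovers a slope close to the simulated effect.

```python
import random


def mean(values):
    return sum(values) / len(values)


def cov(a, b):
    """Population covariance of two equal-length lists."""
    mean_a, mean_b = mean(a), mean(b)
    return sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b)) / len(a)


def simulate(n=5000, seed=1):
    """Simulate a confounded system (all coefficients are illustrative assumptions).

    c -> x with coefficient 0.8
    c -> y with coefficient 1.0
    x -> y with coefficient 0.5 (the effect we want to recover)
    """
    rng = random.Random(seed)
    c = [rng.gauss(0, 1) for _ in range(n)]
    x = [0.8 * ci + rng.gauss(0, 0.6) for ci in c]
    y = [0.5 * xi + 1.0 * ci + rng.gauss(0, 0.5) for xi, ci in zip(x, c)]
    return c, x, y


def naive_slope(x, y):
    """Slope of y ~ x, ignoring the confounder."""
    return cov(x, y) / cov(x, x)


def adjusted_slope(x, c, y):
    """Coefficient of x in y ~ x + c (closed-form two-predictor least squares)."""
    sxx, scc, sxc = cov(x, x), cov(c, c), cov(x, c)
    sxy, scy = cov(x, y), cov(c, y)
    return (sxy * scc - sxc * scy) / (sxx * scc - sxc ** 2)


if __name__ == "__main__":
    c, x, y = simulate()
    print("simulated effect of x: 0.5")
    print("estimate without confounder:", round(naive_slope(x, y), 2))
    print("estimate with confounder:   ", round(adjusted_slope(x, c, y), 2))
```

In this setup the unadjusted slope is inflated because x and c are correlated and c also affects y. The same logic applies to flexible ML models: if the confounder is left out of the feature set, no algorithm or xAI tool can separate its influence from that of x.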

We conclude that under the right conditions and with the right tools, predictive ML models can provide more reliable effect estimates (e.g. of the environmental niche). Moreover, as a byproduct, causally constrained ML models often exhibit lower generalization errors, which is relevant when building models for extrapolation, as in climate change predictions.


