Limits and Promises of Earth Observation Foundation Models in Predicting Multi-Trophic Soil Biodiversity

This paper is a preprint and has not been certified by peer review.

Authors

Cerna, S.; Si-Moussi, S.; Calderon-Sanou, I.; Miele, V.; Thuiller, W.

Abstract

Soil biodiversity is essential for terrestrial ecosystems, influencing nutrient cycling, carbon sequestration, agricultural productivity, and resilience to environmental change. Yet it faces significant threats from land-use change, pollution, agricultural intensification, and climate change. Effective predictive modeling tools are urgently needed to inform conservation and management strategies. Although species distribution models (SDMs) have been successful for aboveground biodiversity, their application to soil biodiversity is limited by the scarcity of large-scale datasets and by spatial mismatches between environmental data and soil habitats. Recent advances, including environmental DNA (eDNA) metabarcoding, now allow extensive multi-taxa assessments of soil biodiversity. Simultaneously, remote sensing technologies provide high-resolution spatial data, potentially overcoming the limitations of traditional coarse-gridded environmental data. This study evaluates Earth Observation Foundation (EOF) models, which are deep learning models pretrained on massive remote sensing datasets to summarize earth observation images into embeddings, for predicting multi-trophic soil biodiversity in the French Alps. We compare models using EOF-derived embeddings from orthophotos with models using coarse-gridded and high-quality in-situ variables. We modeled relative abundance for 51 trophic groups across seven taxa using Random Forest, Light Gradient Boosting Machine, and Artificial Neural Networks, evaluating four data configurations: coarse-gridded environmental data, high-quality in-situ data, EOF embeddings, and a hybrid embedding-tabular approach. High-quality in-situ climate and soil data consistently delivered the highest predictive accuracy, especially for microbial and fungal groups. EOF embeddings provided valuable spatial context but did not surpass the performance of in-situ data, with which they were partially redundant. Integrating remote sensing data can enhance biodiversity modeling in areas lacking detailed in-situ measurements, underscoring the complementary role of remote sensing in ecological assessments.
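
As a rough illustration of the four data configurations named in the abstract, the following minimal Python sketch assembles one feature matrix per configuration from per-site records. Everything in it is an assumption made for illustration: the record fields (`coarse`, `in_situ`, `embedding`), the numeric values, the feature counts, and in particular the choice to build the hybrid configuration by concatenating the embedding with the in-situ variables. The abstract does not specify how the authors combined embeddings and tabular data.

```python
from typing import Dict, List

# Hypothetical per-site records. Field names and values are invented
# for illustration and do not come from the study.
SITES: List[Dict[str, List[float]]] = [
    {"coarse": [4.2, 1100.0],
     "in_situ": [3.8, 6.1, 12.5],
     "embedding": [0.12, -0.40, 0.88, 0.05]},
    {"coarse": [2.9, 1350.0],
     "in_situ": [2.1, 5.4, 18.0],
     "embedding": [0.30, 0.10, -0.25, 0.61]},
]


def build_configurations(
    sites: List[Dict[str, List[float]]],
) -> Dict[str, List[List[float]]]:
    """Return one feature matrix (a list of rows) per data configuration."""
    return {
        "coarse_gridded": [list(s["coarse"]) for s in sites],
        "in_situ": [list(s["in_situ"]) for s in sites],
        "eof_embedding": [list(s["embedding"]) for s in sites],
        # Assumption: the hybrid input is a plain concatenation of the
        # embedding and the tabular in-situ variables.
        "hybrid": [list(s["embedding"]) + list(s["in_situ"]) for s in sites],
    }


configs = build_configurations(SITES)
for name, matrix in configs.items():
    print(name, len(matrix), "sites x", len(matrix[0]), "features")
```

Each matrix would then be passed, together with the relative abundance of a trophic group as the target, to the same set of regressors (for example Random Forest, LightGBM, and a neural network) so that differences in accuracy can be attributed to the input data rather than to the model.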
