Tier 4 maps of soil pH at 25 m resolution for the Netherlands
Anatol Helfenstein, Vera Leatitia Mulder, G.B.M. Heuvelink, et al.

Geoderma, Journal Year: 2021, Volume and Issue: 410, P. 115659 - 115659

Published: Dec. 25, 2021

Accurate and high resolution spatial soil information is essential for efficient and sustainable land use, management and conservation. Since the establishment of digital soil mapping (DSM) and the goals set by the GlobalSoilMap (GSM) working group, great advances have been made to attain spatial soil information worldwide. Highly populated areas such as the Netherlands demand multi-functional land use, for which key soil properties such as pH are needed to make decisions. We a) provide soil pH prediction maps at six standard depth layers between 0 m and 2 m at 25 m resolution, whereby a calibrated Quantile Regression Forest (QRF) model allows prediction at any desired depth, and b) determine map accuracy using various statistical validation strategies and an evaluation of the prediction uncertainty. This study is unique among GSM products in including design-based inference from a probability sample as an external accuracy assessment and in providing Tier 4 maps with spatially explicit accuracy thresholds for end-users based on GSM specifications. The QRF models were tuned and calibrated using 15 338 pH observations from 4230 locations and 195 covariates representing soil-forming factors. The following strategies were used to determine map quality: out-of-bag, location-grouped 10-fold cross-validation, independent validation (5677 observations, 1367 locations) and design-based inference from a stratified random sample, separated by depth layer. Mean error (ME), root mean squared error (RMSE), model efficiency coefficient (MEC) and prediction interval coverage probability (PICP) were calculated in all four strategies. In addition, the 90th prediction intervals were used to categorize each pixel into "none", A, AA or AAA quality as a measure of internal accuracy assessment. We obtained large differences in accuracy depending on the validation strategy and depth layer (ME = −0.08–0.20, RMSE = 0.41–0.83, MEC = 0.64–0.90, PICP of PI90 = 0.80–0.94). Design-based inference (LSK-SRS) was most indicative of map quality based on sampling theory (ME = 0.09–0.17, RMSE = 0.7–0.79, MEC = 0.73–0.82). The uncertainty was slightly overestimated. Less than 10 % of the pixels were designated A, AA or AAA, and therefore we recommend that future studies also test the achievability of GSM Tier 4 maps. We believe these 3D soil pH maps are useful for a variety of end users and that our workflow can be applied elsewhere and to other soil properties to further diminish the gap of missing soil information.
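The four accuracy metrics named in this abstract can be illustrated with a short sketch. It is not the authors' code: the function name `validation_metrics`, the sign convention for ME (prediction minus observation) and the use of the Nash-Sutcliffe form for MEC are assumptions made here for illustration.

```python
import math


def validation_metrics(observed, predicted, lower, upper):
    """Return ME, RMSE, MEC and PICP for paired observations and predictions.

    `lower` and `upper` are the bounds of a prediction interval
    (for example the 5th and 95th quantiles for a 90% interval).
    """
    n = len(observed)
    # Assumed sign convention: error = prediction minus observation.
    errors = [p - o for o, p in zip(observed, predicted)]
    me = sum(errors) / n
    ss_res = sum(e * e for e in errors)
    rmse = math.sqrt(ss_res / n)
    mean_obs = sum(observed) / n
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    # Assumed definition: MEC = 1 - SS_res / SS_tot (Nash-Sutcliffe form).
    mec = 1.0 - ss_res / ss_tot
    # PICP: share of observations that fall inside their prediction interval.
    inside = sum(1 for o, lo, hi in zip(observed, lower, upper) if lo <= o <= hi)
    picp = inside / n
    return {"ME": me, "RMSE": rmse, "MEC": mec, "PICP": picp}
```

For a well-calibrated 90% prediction interval the PICP should be close to 0.90; a larger value means the intervals are too wide, that is, the uncertainty is overestimated.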

Language: English

Machine learning-based global maps of ecological variables and the challenge of assessing them
Hanna Meyer, Edzer Pebesma

Nature Communications, Journal Year: 2022, Volume and Issue: 13(1)

Published: April 22, 2022

The recent wave of published global maps of ecological variables has caused as much excitement as it has received criticism. Here we look into the data and methods mostly used for creating these maps, and discuss whether the quality of predicted values can be assessed, globally and locally.

Language: English

Citations: 175

Global models and predictions of plant diversity based on advanced machine learning techniques
Lirong Cai, Holger Kreft, Amanda Taylor, et al.

New Phytologist, Journal Year: 2022, Volume and Issue: 237(4), P. 1432 - 1445

Published: Nov. 14, 2022

Despite the paramount role of plant diversity for ecosystem functioning, biogeochemical cycles, and human welfare, knowledge of its global distribution is still incomplete, hampering basic research and biodiversity conservation. Here, we used machine learning (random forests, extreme gradient boosting, and neural networks) and conventional statistical methods (generalized linear models and generalized additive models) to test environment‐related hypotheses of broad‐scale vascular plant diversity gradients and to model and predict species richness and phylogenetic richness worldwide. To this end, we used 830 regional plant inventories including c. 300 000 species and predictors of past and present environmental conditions. Machine learning showed a superior performance, explaining up to 80.9% and 83.3% of species richness and phylogenetic richness, illustrating the great potential of such techniques for disentangling complex and interacting associations between the environment and plant diversity. Current climate and environmental heterogeneity emerged as the primary drivers, while past environmental conditions left only small but detectable imprints on plant diversity. Finally, we combined predictions from multiple modeling techniques (ensemble predictions) to reveal global patterns and centers of plant diversity at multiple resolutions down to 7774 km². Our predictive maps provide accurate estimates of global plant diversity, available at grain sizes relevant for conservation and macroecology.

Language: English

Citations: 108

Spatially autocorrelated training and validation samples inflate performance assessment of convolutional neural networks
Teja Kattenborn, Felix Schiefer, Julian Frey, et al.

ISPRS Open Journal of Photogrammetry and Remote Sensing, Journal Year: 2022, Volume and Issue: 5, P. 100018 - 100018

Published: June 21, 2022

Deep learning and particularly Convolutional Neural Networks (CNN) in concert with remote sensing are becoming standard analytical tools in the geosciences. A series of studies has presented the seemingly outstanding performance of CNN for predictive modelling. However, the predictive performance of such models is commonly estimated using random cross-validation, which does not account for spatial autocorrelation between training and validation data. Independent of the analytical method, such spatial dependence will inevitably inflate the estimated model performance. This problem is ignored in most CNN-related studies and suggests a flaw in their validation procedure. Here, we demonstrate how neglecting spatial autocorrelation during cross-validation leads to an optimistic model performance assessment, using the example of a tree species segmentation problem in multiple, spatially distributed drone image acquisitions. We evaluated CNN-based predictions with test data sampled from 1) randomly sampled hold-outs and 2) spatially blocked hold-outs. Assuming that a block cross-validation provides a realistic model performance, a validation with randomly sampled hold-outs overestimated the model performance by up to 28%. Smaller training sample size increased this optimism. Spatial autocorrelation among observations was significantly higher within than between different remote sensing acquisitions. Thus, model performance should be tested with spatial cross-validation strategies and multiple independent remote sensing acquisitions. Otherwise, the estimated performance of any geospatial deep learning method is likely to be overestimated.
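The difference between randomly sampled and spatially blocked hold-outs can be sketched as two fold-assignment functions. This is an illustrative sketch only: the function names, the square grid blocks and the `block_size` parameter are assumptions, and the study itself may define blocks differently (for example by image acquisition).

```python
import random


def random_folds(n_samples, k, seed=0):
    """Assign each sample to one of k folds at random, ignoring location."""
    rng = random.Random(seed)
    order = list(range(n_samples))
    rng.shuffle(order)
    folds = [0] * n_samples
    for rank, index in enumerate(order):
        folds[index] = rank % k
    return folds


def blocked_folds(coords, block_size, k):
    """Assign samples to folds by square spatial block.

    All samples in the same block share a fold, so close neighbours are
    never split between training and validation data.
    """
    def block_of(point):
        x, y = point
        return (int(x // block_size), int(y // block_size))

    block_ids = sorted({block_of(p) for p in coords})
    # Hypothetical rule: deal the sorted blocks out to folds in rotation.
    block_to_fold = {b: i % k for i, b in enumerate(block_ids)}
    return [block_to_fold[block_of(p)] for p in coords]
```

With random folds, a validation sample usually has a training sample close by, which is the situation the abstract describes as inflating performance. With blocked folds, whole neighbourhoods are held out together.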

Language: English

Citations: 83

Global relationships in tree functional traits
Daniel S. Maynard, Lalasia Bialic‐Murphy, Constantin M. Zohner, et al.

Nature Communications, Journal Year: 2022, Volume and Issue: 13(1)

Published: June 8, 2022

Due to massive energetic investments in woody support structures, trees are subject to unique physiological, mechanical, and ecological pressures not experienced by herbaceous plants. Despite a wealth of studies exploring trait relationships across the entire plant kingdom, the dominant traits underpinning these unique aspects of tree form and function remain unclear. Here, by considering 18 functional traits, encompassing leaf, seed, bark, wood, crown, and root characteristics, we quantify the multidimensional relationships in tree trait expression. We find that nearly half of trait variation is captured by two axes: one reflecting leaf economics, the other reflecting tree size and competition for light. Yet these orthogonal axes reveal strong environmental convergence, exhibiting correlated responses to temperature, moisture, and elevation. By subsequently exploring multidimensional trait relationships, we show that the full dimensionality of trait space is captured by eight distinct clusters, each reflecting a unique aspect of tree form and function. Collectively, this work identifies a core set of traits needed to quantify global patterns in functional biodiversity, and it contributes to our fundamental understanding of the functioning of forests worldwide.

Language: English

Citations: 80

Dealing with clustered samples for assessing map accuracy by cross-validation
Sytze de Bruin, D.J. Brus, G.B.M. Heuvelink, et al.

Ecological Informatics, Journal Year: 2022, Volume and Issue: 69, P. 101665 - 101665

Published: May 5, 2022

Mapping of environmental variables often relies on map accuracy assessment through cross-validation with the data used for calibrating the underlying mapping model. When the data points are spatially clustered, conventional cross-validation leads to optimistically biased estimates of map accuracy. Several papers have promoted spatial cross-validation as a means to tackle this over-optimism. Many of these papers blame spatial autocorrelation as the cause of the bias and propagate the widespread misconception that spatial proximity of calibration points to validation points invalidates classical statistical validation of maps. We present and evaluate alternative cross-validation approaches for assessing map accuracy from clustered sample data. The first method uses inverse sampling-intensity weighting to correct for selection bias. Sampling intensity is estimated by a two-dimensional kernel approach. The two other approaches are model-based methods rooted in geostatistics, where the first assumes homogeneity of residual variance over the study area whilst the second accounts for heteroscedasticity as a function of the sampling intensity. The methods were tested and compared against conventional k-fold cross-validation and blocked spatial cross-validation to estimate map accuracy metrics of above-ground biomass and soil organic carbon stock maps covering western Europe. Results acquired over 100 realizations of five sampling designs ranging from non-clustered to strongly clustered confirmed that inverse sampling-intensity weighting and the heteroscedastic model-based method had smaller bias than conventional and blocked spatial cross-validation for all but the most strongly clustered design. For the strongly clustered design, where large portions of the maps were predicted by extrapolation, blocked spatial cross-validation was closest to the reference map accuracy metrics, but still biased. For such cases, extrapolation is best avoided by additional sampling or limitation of the prediction area. Weighted cross-validation is recommended for moderately clustered samples, while conventional random cross-validation suits fairly regularly spread samples.
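The idea of inverse sampling-intensity weighting can be sketched as follows. This is a minimal illustration, not the paper's implementation: the Gaussian kernel, the fixed `bandwidth`, the inclusion of each point in its own intensity estimate and the function names are all assumptions.

```python
import math


def kernel_intensity(coords, bandwidth):
    """Estimate sampling intensity at each sample point with a 2D Gaussian kernel."""
    norm = 1.0 / (2.0 * math.pi * bandwidth ** 2)
    intensity = []
    for xi, yi in coords:
        total = 0.0
        for xj, yj in coords:
            d2 = (xi - xj) ** 2 + (yi - yj) ** 2
            total += norm * math.exp(-d2 / (2.0 * bandwidth ** 2))
        intensity.append(total)
    return intensity


def weighted_rmse(residuals, intensity):
    """RMSE of cross-validation residuals, each weighted by 1 / sampling intensity."""
    weights = [1.0 / v for v in intensity]
    weighted_sum = sum(w * r * r for w, r in zip(weights, residuals))
    return math.sqrt(weighted_sum / sum(weights))
```

Points in densely sampled clusters get small weights and isolated points get large weights, so residuals from sparsely sampled areas count for more than they would in an unweighted average.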

Language: English

Citations: 72

Assessing and improving the transferability of current global spatial prediction models
Marvin Ludwig, Álvaro Moreno‐Martínez, Norbert Hölzel, et al.

Global Ecology and Biogeography, Journal Year: 2023, Volume and Issue: 32(3), P. 356 - 368

Published: Jan. 26, 2023

Aim: Global‐scale maps of the environment are an important source of information for researchers and decision makers. Often, these maps are created by training machine learning algorithms on field‐sampled reference data using remote sensing information as predictors. Since field samples are often sparse and clustered in geographic space, model prediction requires a transfer of the trained model to regions where no reference data are available. However, recent studies question the feasibility of predictions far beyond the location of training data. Innovation: We propose a novel workflow for spatial predictive mapping that leverages recent developments in this field and combines them in innovative ways with the aim of improved model transferability and performance assessment. We demonstrate, evaluate and discuss the workflow with data from recently published global environmental maps. Main conclusions: Reducing predictors to those relevant for spatial prediction leads to an increase in model transferability and map accuracy without a decrease of prediction quality in areas with high sampling density. Still, reliable gap‐free global predictions were not possible, highlighting that global maps and their evaluation are hampered by limited availability of reference data.

Language: English

Citations: 53

Corn Grain Yield Prediction Using UAV-Based High Spatiotemporal Resolution Imagery, Machine Learning, and Spatial Cross-Validation
Patrick Killeen, Iluju Kiringa, Tet Yeap, et al.

Remote Sensing, Journal Year: 2024, Volume and Issue: 16(4), P. 683 - 683

Published: Feb. 14, 2024

Food demand is expected to rise significantly by 2050 due to the increase in population; additionally, receding water levels, climate change, and a decrease in the amount of available arable land will threaten food production. To address these challenges and increase food security, input cost reductions and yield optimization can be accomplished using yield precision maps created by machine learning models; however, without considering the spatial structure of the data, the precision map's accuracy evaluation assessment risks being over-optimistic, which may encourage poor decision making that can lead to negative economic impacts (e.g., lowered crop yields). In fact, most machine learning research involving spatial data, including the unmanned aerial vehicle (UAV) imagery-based yield prediction literature, ignores spatial structure and likely obtains over-optimistic results. The present work is a UAV imagery-based corn yield prediction study that analyzed the effects of image spatial and spectral resolution, image acquisition date, and model evaluation scheme on model performance. We used various spatial generalization evaluation methods, including spatial cross-validation (CV), to (a) identify models that overfit to the spatial structure found inside datasets and (b) estimate true model generalization performance. We compared and ranked the prediction power of 55 vegetation indices (VIs) and five spectral bands over a growing season. We gathered yield data and UAV-based multispectral (MS) and red-green-blue (RGB) imagery from a Canadian smart farm and trained random forest (RF) and linear regression (LR) models using 10-fold CV and spatial CV approaches. We found that imagery from the middle of the growing season produced the best results. RF and LR generally performed best with high and low spatial resolution data, respectively. MS imagery led to generally better performance than RGB imagery. Some of the best-performing VIs were the simple ratio index (near-infrared and red-edge), the normalized difference red-edge index, and the normalized green index. We found that 10-fold CV coupled with spatial CV could be used to identify overfit models. When using high spatial resolution MS imagery, RF and LR obtained correlation coefficients (CC) of 0.81 and 0.56, respectively, when using 10-fold CV, and 0.39 and 0.41, respectively, when using a k-means-based spatial CV approach. Furthermore, when using only location features, RF and LR obtained an average CC of 1.00 and 0.49, respectively. This suggested that LR had better spatial generalizability than RF, and that RF was likely overfitting to the spatial structure of the data.

Language: English

Citations: 25

The problematic case of data leakage: A case for leave-profile-out cross-validation in 3-dimensional digital soil mapping
Kingsley John, Daniel D. Saurette, Brandon Heung, et al.

Geoderma, Journal Year: 2025, Volume and Issue: 455, P. 117223 - 117223

Published: March 1, 2025

Language: English

Citations: 2

Spatial statistics and soil mapping: A blossoming partnership under pressure
G.B.M. Heuvelink, R. Webster

Spatial Statistics, Journal Year: 2022, Volume and Issue: 50, P. 100639 - 100639

Published: Feb. 15, 2022

For the better part of the 20th century pedologists mapped soil by drawing boundaries between different classes of soil, which they identified from survey on foot or by vehicle, supplemented by air-photo interpretation, and backed by an understanding of the landscape and the processes by which soil is formed. Its limitations for representing gradual spatial variation and predicting soil conditions at unvisited sites became evident, and in the 1980s the introduction of geostatistics, and specifically ordinary kriging, revolutionized thinking and to a large extent practice. Ordinary kriging is based solely on sample data of the variable of interest; it takes no account of related covariates. The latter were incorporated from the 1990s onward as fixed effects and regression predictors, giving rise to kriging with external drift and regression kriging. Simultaneous estimation of regression coefficients and variogram parameters is best done by residual maximum likelihood estimation. In recent years machine learning has become feasible for predicting soil conditions from huge sets of environmental data obtained from sensors aboard satellites and other sources to produce digital soil maps. The techniques are based on classification and regression, but they take no account of spatial correlations. Further, they are effectively 'black boxes'; they lack transparency, and their output needs to be validated if they are to be trusted. They undoubtedly have merit; they are here to stay. They too, however, have their shortcomings when applied to spatial data, which spatial statisticians can help to overcome. Spatial statisticians and pedometricians still have much to do to incorporate uncertainty into spatial predictions, into averages and totals over regions, and to take account of errors of measurement and of the spatial positions of sample data. They must also communicate these uncertainties to end users of soil maps, by whatever means they are made.

Language: English

Citations: 65

Satellite Imagery to Map Topsoil Organic Carbon Content over Cultivated Areas: An Overview
Emmanuelle Vaudour, Asa Gholizadeh, Fabio Castaldi, et al.

Remote Sensing, Journal Year: 2022, Volume and Issue: 14(12), P. 2917 - 2917

Published: June 18, 2022

There is a need to update soil maps and monitor soil organic carbon (SOC) in the upper horizons or plough layer for enabling decision support and land management, while complying with several policies, especially those favoring soil carbon storage. This review paper is dedicated to the satellite-based spectral approaches for SOC assessment that have been achieved from several satellite sensors, study scales and geographical contexts in the past decade. Most approaches relying on pure spectral models have been carried out since 2019 and have dealt with temperate croplands in Europe, China and North America at the scale of small regions, of some hundreds of km²: dry combustion and wet oxidation were the analytical determination methods used for 50% and 35% of the satellite-derived SOC studies, for which measured topsoil SOC contents mainly referred to mineral soils, typically cambisols and luvisols and, to a lesser extent, regosols, leptosols, stagnosols and chernozems, with annual cropping systems with a SOC value of ~15 g·kg−1 and a range of 30 g·kg−1 in median. Most satellite-derived SOC spectral prediction models used limited preprocessing and were based on bare soil pixel retrieval after Normalized Difference Vegetation Index (NDVI) thresholding. About one third of these models used partial least squares regression (PLSR), another third used random forest (RF), and the remaining included machine learning methods such as support vector machine (SVM). We did not find any studies either on deep learning methods or on all-performance evaluations and uncertainty analysis of spatial model predictions. Nevertheless, the literature examined here identifies satellite-based spectral information, especially derived under bare soil conditions, as an interesting approach that deserves further investigations. Future research includes considering the simultaneous analysis of imagery acquired at several dates, i.e., temporal mosaicking, testing the influence of possible disturbing factors and mitigating their effects, and fusing mixed models incorporating non-spectral ancillary information.

Language: English

Citations: 55