Document Type
Article
Publication Date
12-31-2023
Publication Source
Geocarto International
Abstract
In this paper, we examined the degree to which inherent spatial structure in soil properties influences the outcomes of machine learning (ML) approaches to predicting soil spatial variability. We compared the performances of four ML algorithms (support vector machine, artificial neural network, random forest, and random forest for spatial data) against two non-ML algorithms (ordinary least squares regression and spatial filtering regression). None of the ML algorithms produced residuals that had lower mean values or were less autocorrelated over space compared with the non-ML approaches. We recommend the use of random forest when a soil variable of interest is weakly autocorrelated (Moran's I < 0.1) and spatial filtering regression when it is relatively strongly autocorrelated (Moran's I > 0.4). Overall, this work opens the door to a more consistent selection of model algorithms through the establishment of threshold criteria for spatial autocorrelation of input variables.
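The threshold rule in the abstract can be illustrated with a minimal sketch. It assumes a global Moran's I computed from a user-supplied spatial weights matrix. Only the cutoffs 0.1 and 0.4 come from the abstract. The function names (morans_i, suggest_algorithm, chain_weights) and the binary chain neighborhood are hypothetical, and the abstract does not state the weighting scheme used in the article. The abstract gives no recommendation for values between the two cutoffs, so the sketch returns none.

```python
import numpy as np


def morans_i(values, weights):
    """Global Moran's I for one variable and an n x n spatial weights matrix.

    I = (n / S0) * (z' W z) / (z' z), where z holds deviations from the mean
    and S0 is the sum of all weights.
    """
    x = np.asarray(values, dtype=float)
    w = np.asarray(weights, dtype=float)
    z = x - x.mean()
    s0 = w.sum()
    return (len(x) / s0) * (z @ w @ z) / (z @ z)


def suggest_algorithm(i_value, weak=0.1, strong=0.4):
    """Apply the cutoffs quoted in the abstract (hypothetical helper)."""
    if i_value < weak:
        return "random forest"
    if i_value > strong:
        return "spatial filtering regression"
    return "no recommendation stated in the abstract"


def chain_weights(n):
    """Binary weights for n sites on a line (assumed neighborhood, illustration only)."""
    w = np.zeros((n, n))
    idx = np.arange(n - 1)
    w[idx, idx + 1] = 1.0
    w[idx + 1, idx] = 1.0
    return w


if __name__ == "__main__":
    # Made-up values that increase steadily along the line of sites.
    soil_values = [1, 2, 3, 4, 5]
    i_value = morans_i(soil_values, chain_weights(len(soil_values)))
    print(round(i_value, 3), suggest_algorithm(i_value))
```

In practice the weights matrix would be built from the sampling locations (for example by distance or nearest neighbors), and that choice affects the value of Moran's I that is compared against the cutoffs.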
ISBN/ISSN
1010-6049
Document Version
Published Version
Publisher
Taylor & Francis
Volume
38
Peer Reviewed
yes
Issue
1
eCommons Citation
Kim, Daehyun; Song, Insang; Miralha, Lorrayne; Hirmas, Daniel R.; McEwan, Ryan W.; Mueller, Tom G.; and Samonil, Pavel, "Consequences of Spatial Structure in Soil–Geomorphic Data on the Results of Machine Learning Models" (2023). Biology Faculty Publications. 368.
https://ecommons.udayton.edu/bio_fac_pub/368
Included in
Biology Commons, Biotechnology Commons, Cell Biology Commons, Genetics Commons, Microbiology Commons, Molecular Genetics Commons
Comments
This open-access article is provided for download in compliance with the publisher’s policy on self-archiving. To view the version of record, use the DOI: https://doi.org/10.1080/10106049.2023.2245381