TY - JOUR
T1 - Quantifying effects of tracking data bias on species distribution models
AU - O'Toole, Malcolm
AU - Queiroz, Nuno
AU - Humphries, Nicolas E.
AU - Sims, David W.
AU - Sequeira, Ana M.M.
N1 - Publisher Copyright: © 2020 British Ecological Society
PY - 2021/1
Y1 - 2021/1
N2 - Telemetry datasets are becoming increasingly large and covering a wider range of species using different technologies (GPS, Argos, light-based geolocation). Together, such datasets hold tremendous potential to understand species' space use at broad spatial scales, through the development of species distribution or habitat suitability models (SDMs) to predict environmental dependencies of species across space and time. However, tracking datasets can be heavily biased, and an assessment of how such biases affect SDM predictions, and therefore our interpretation of animal distributions, is lacking. We generated simulated tracks based on predetermined environmental values for a random predator and a central place forager, and then sampled positions from those tracks based on a combination of five common biases in tracking datasets: (a) tagging location; (b) tracking device; (c) data gaps within tracks; (d) premature tag detachment (or failure); and (e) different processing methods. We then used 240 combinations of the resulting biased simulated datasets to develop binomial generalised linear (GLM) and additive (GAM) models to estimate habitat suitability in different environmental sets (cool deep, cool coastal, warm deep and warm coastal environments). Our results show that tagging location and length of tracks have the largest effects in decreasing model performance, but that these biases can be overcome by adding a small percentage of additional, relatively less biased tracks to the dataset. In comparison, the effects from all other biases were almost negligible, including for low resolution tracking datasets for which sufficient tracks are available. We also highlight the need for a cautionary approach when using processing methods that can introduce other biases (e.g. interpolated locations). Similar trends were obtained for the random predator and the central place forager, but with relatively lower model performance for the latter. We provide evidence that even non-GPS tracking datasets can be readily used to improve the knowledge of large-scale space use by species without the need for detailed processing and tracking reconstruction. This is especially relevant in the current context of rapid increase in data acquisition and the urgent need to address the large spatial scale ecological consequences of global change.
KW - Global Positioning System
KW - big data
KW - geolocation
KW - global scale
KW - habitat suitability models
KW - marine megafauna
KW - tracking
UR - http://www.scopus.com/inward/record.url?scp=85092719062&partnerID=8YFLogxK
U2 - 10.1111/2041-210X.13507
DO - 10.1111/2041-210X.13507
M3 - Article
SN - 2041-210X
VL - 12
SP - 170
EP - 181
JO - Methods in Ecology and Evolution
JF - Methods in Ecology and Evolution
IS - 1
ER -