Creating biodiversity projections
Besides inferring environmental responses and mapping the contemporary distribution of any species (see Train simple Model), species distribution models (SDMs) are commonly used to make predictions on how biodiversity is expected to change under future conditions. Usually this is being done by first training a model on present observations and then projecting the resulting \(\beta\) coefficients to another set of environmental predictors from another time period or spatial region.
ibis.iSDM R-package provides direct support for
creating such projections. This requires a previously fitted
DistributionModel] and a new set of covariates which match
in names the covariates on which the fitted model was trained. The key
functions here are
which are specific to such projections and result in the creation of a
BiodiversityScenario] object. For other functions the same
syntax as for previously trained models applies, e.g. new predictor can
be added via
add_predictors() or thresholds applied via
threshold(). In addition the
R-package allows the specification of a number of constrains on the
projections, such as for instance dispersal constrains based on the
expected or simulated chance of individuals dispersing to neighbouring
grid cells. Such constrains can be added via
add_constrain() (and other constrain functionalities in
Note: Projecting relative suitability estimates into conditions other than the observational records for which the SDM was trained comes with a number of assumptions, most importantly that relationships between occurrences and environmental conditions are in equilibrium and similar biases and conditions apply in the future. See Elith et al. (2010) and discussion in Zurell et al. (2016) for an introduction and comparative overview.
Load relevant packages and testing data
# Load the packages library(ibis.iSDM) #> Warning: replacing previous import 'posterior::%in%' by 'raster::%in%' when #> loading 'ibis.iSDM' #> Warning: replacing previous import 'posterior::match' by 'raster::match' when #> loading 'ibis.iSDM' library(stars) library(xgboost) library(raster) library(igraph) library(ggplot2) library(ncdf4) library(assertthat)
For the purpose of this example we are loading some testing data on species distributions as well as contemporary and future predictors. Note the names of predictors used for building a distribution model have to be consistent with those for creating projections!
# Background and biodiversity data background <- raster::raster(system.file('extdata/europegrid_50km.tif', package='ibis.iSDM')) virtual_points <- sf::st_read(system.file('extdata/input_data.gpkg', package='ibis.iSDM'), 'points', quiet = TRUE) # Note we are loading different predictors than in previous examples # These are in netcdf4 format, a format specific for storing spatial-temporal data including metadata. ll <- list.files(system.file("extdata/predictors_presfuture/", package = "ibis.iSDM", mustWork = TRUE), "*.nc",full.names = TRUE) # From those list of predictors are first loading the current ones as raster data # We are loading only data from the very first, contemporary time step for model fitting pred_current <- raster::stack() for(i in ll) pred_current <- addLayer(pred_current, raster::raster(i, layer = 1) ) projection(pred_current) <- projection(background) # Same projection as background # Get future predictors # These we will load in using the stars package and also ignoring the first time step pred_future <- stars::read_stars(ll) |> stars:::slice.stars('Time', 2:86) st_crs(pred_future) <- st_crs(4326) # Set projection # Rename future predictors to those of current names(pred_future) <- names(pred_current) # Plot the test data plot(pred_current$Secondary.forest.vegetation, col = colorRampPalette(c("grey20", "orange", "lightgreen", "green"))(10), main = "Share of secondary vegetation")
## Train model and create a future projection
We will make use of the data loaded above to (a) first create a species distribution model for contemporary conditions and (b) project the obtained coefficients into the future using future predictors. For guidance on how distribution models are trained, see other vignettes (1).
# Train model adding the data loaded above x <- distribution(background) |> add_biodiversity_poipo(virtual_points, field_occurrence = 'Observed', name = 'Virtual points') |> # Note that we scale the predictors here add_predictors(pred_current, transform = 'scale',derivates = 'none') |> engine_glmnet(alpha = 0) #> Loaded glmnet 4.1-7 #> [Setup] 2023-06-01 10:22:43.470171 | Creating distribution object... #> [Setup] 2023-06-01 10:22:43.479532 | Adding poipo dataset... #> [Setup] 2023-06-01 10:22:44.10428 | Adding predictors... #> [Setup] 2023-06-01 10:22:44.105618 | Transforming predictors... # Train the model modf <- train(x, runname = 'Simple PPM', verbose = FALSE) #> [Estimation] 2023-06-01 10:22:44.467157 | Collecting input parameters. #> [Estimation] 2023-06-01 10:22:44.545805 | Adding engine-specific parameters. #> [Estimation] 2023-06-01 10:22:44.547683 | Engine setup. #> [Estimation] 2023-06-01 10:22:44.848218 | Starting fitting: Virtual points #> [Estimation] 2023-06-01 10:22:46.154542 | Starting prediction... #> [Done] 2023-06-01 10:22:46.285072 | Completed after 1.81 secs # Add a threshold to this model by getting 05 percentile of values modf <- threshold(modf, method = 'percentile', value = 0.05) # -- # # Now lets create a scenarios object via scenarios sc <- scenario(modf) |> # Apply the same variable transformations as above. add_predictors(pred_future, transform = 'scale') |> # Calculate thresholds at each time step. The threshold estimate is taken from the model object. threshold() #> [Setup] 2023-06-01 10:22:46.415373 | Adding scenario predictors... #> [Setup] 2023-06-01 10:22:46.415801 | Transforming predictors... # This creates a scenario object sc #> Spatial-temporal scenario: #> Used model: GLMNET-Model #> --------- #> Predictors: bio01, bio12, Cultivated.land, ... (9 predictors) #> Time period: 2016-01-01 -- 2100-01-01 (83.9 years) #> --------- #> Threshold: 0.032 (percentile) #> --------- #> Scenarios fitted: None # The object contains its own functions. See the scenarios help file for more information on # what is possible with them names(sc) #>  "modelid" ".super" "get_resolution" #>  "print" "get_predictors" "show" #>  "get_limits" "get_predictor_names" "summary" #>  "plot_animation" "calc_scenarios_slope" "plot" #>  "set_constraints" "limits" "get_timeperiod" #>  "threshold" "constraints" "plot_migclim" #>  "rm_predictors" "apply_threshold" "get_constraints" #>  "set_predictors" "verify" "save" #>  "plot_threshold" "get_data" ".that" #>  "get_projection" "predictors" "summary_beforeafter" #>  "scenarios" "get_model" "plot_relative_change" #>  "get_threshold" "modelobject"
The scenario object can finally be trained via
sc.fit1 <- sc |> project() #> [Scenario] 2023-06-01 10:22:49.033122 | Starting suitability projections for 85 timesteps. # Note that an indication of fitted scenarios has been added to the object sc.fit1 #> Spatial-temporal scenario: #> Used model: GLMNET-Model #> --------- #> Predictors: bio01, bio12, Cultivated.land, ... (9 predictors) #> Time period: 2016-01-01 -- 2100-01-01 (83.9 years) #> --------- #> Threshold: 0.032 (percentile) #> --------- #> Scenarios fitted: Yes
Summarizing and plotting the fitted projections
As with distribution models there are a number of ways how the scenarios can be visualized and interacted with:
plot()makes a visualization of the projections over all time steps (!)
plot_relative_change()calculates the change in suitability area between the first and the last timestep and categorizes the result accordingly. Note that SDMs as such cannot directly infer colonization or extinction, but only gains or losses of suitable habitat!
calc_scenarios_slope()calculates the slope (rate of change) across timesteps. Useful for summarizing results
summary()creates a summary output of the contained scenarios. If a
threshold()is specified, this function will summarize the amount of area at each timestep.
get_data()gets the created scenarios a
starsobject (plus thresholds if specified).
# Plot all scenarios. With a large number of predictors this figure will be messy... plot(sc.fit1) # or sc.fit1$plot()
# As an alternative, visualize the linear slope per grid cell and across all time steps o <- sc.fit1$calc_scenarios_slope() #> Warning in classInt::classIntervals(na.omit(values), min(nbreaks - 1, n.unq), : #> N is large, and some styles will run very slowly; sampling imposed
# Another option is to calculate the relative change between start and finish o <- sc.fit1$plot_relative_change(plot = TRUE) print(o)
# We can also summarize the thresholded data o <- sc.fit1$summary() plot(area_km2~band, data = o, type = 'b', main = "Suitable habitat across Time", ylab = "Amount of area (km2)", xlab = "Time")
# How does habitat gain and loss change over time? plot(totchange_gain_km2~band, data = o, type = 'n', main = "Habitat gain and loss", ylim = c(-2e4, 2e4), ylab = "Amount of area (km2)", xlab = "Time") lines(o$totchange_gain_km2~o$band, col = "blue") lines((o$totchange_loss_km2)~o$band, col = "red")
Finally, scenarios projections can also be saved as specific outputs.
As before, this is enabled via
write_output() and works
just the same for [
BiodiversityScenario] objects, with the
only difference being that the output can be specified as netCDF-4
Adding constraints to projections
In the simple scenario above we use the naive assumption that any,
depending on the response functions of the fitted distribution model,
any suitable habitat within the background modelling region is
potentially reachable by the species. In reality there might however be
geographic (e.g. islands), environmental and biotic constraints on how
far a species can disperse. These can be specified through the constrain
add_constraint()] and a variety of constraints is
currently available, some of which depend on other packages.
add_constraint()Generic wrapper to which a specific ‘method’ can be supplied. See documentation for more information on available options and parameters.
add_constraint_dispersal()To add a dispersal constraint to the projections which is applied after each time step. Supports various options with
'sdd_fixed'for fixed dispersal kernels,
'sdd_nexpkernel'for a negative exponential kernel or
'sdd_kissmig'for applying the kissmig framework.
add_constraint_MigClim()Use the MigClim R-package to simulate dispersal events between time steps. A number of parameters are required here and adding this constrain will also overwrite some default plotting capacities (For example via
sc$plot_migclim()). See also the help file and Engler et al. (2012) for more information.
add_constraint_connectivity()Add a connectivity constrain to the projection. Currently only hard barriers are implemented, but in future additional sub-modules are planned to enable more options here.
add_constraint_adaptability()Simple constraints on the adaptability of species to novel climatic conditions. Currently only simple nichelimits are implemented, which ‘cap’ projections in novel environments to the observed ranges of contemporary predictors.
add_constraint_boundary()Specifying a hard boundary constraint on all projections, for example by limiting (future) projections to only a certain area such as a biome or a contemporary range.
Lastly there are also options to stabilize suitability
projections via the
project() function. Specifying a
stabilization here results in the projections being smoothed and
informed of incremental time steps. This can particularly help for
projections that use variables known to make sudden, abrupt jumps
between time steps (e.g. precipitation anomalies).
# Adding a simple negative exponential kernel to constrain the predictions sc.fit2 <- sc |> add_constraint(method = "sdd_nex", value = 1e5) |> # Directly fit the object project(stabilize = F) #> [Scenario] 2023-06-01 10:23:02.387105 | Starting suitability projections for 85 timesteps. # Also fit one projection a nichelimit has been added sc.fit3 <- sc |> add_constraint(method = "sdd_nex", value = 1e5) |> add_constraint_adaptability(method = "nichelimit") |> # Directly fit the object project(stabilize = F) #> [Scenario] 2023-06-01 10:23:18.822988 | Starting suitability projections for 85 timesteps. # Note how constrains are indicated in the scenario object. sc.fit3 #> Spatial-temporal scenario: #> Used model: GLMNET-Model #> --------- #> Predictors: bio01, bio12, Cultivated.land, ... (9 predictors) #> Time period: 2016-01-01 -- 2100-01-01 (83.9 years) #> --------- #> Constraints: dispersal (sdd_nexpkernel), adaptability (nichelimit) #> Threshold: 0.032 (percentile) #> --------- #> Scenarios fitted: Yes # The naive assumption is that there is unlimited dispersal across the whole background # Note how the projection with dispersal constrain results in a considerable smaller amount of suitable habitat. sc.fit1$plot(which = 40) # Baseline
sc.fit2$plot(which = 40) # With dispersal constrain
sc.fit3$plot(which = 40) # With dispersal limit and nichelimitation (within a standard deviation)
# Lets compare the difference in projections compared to the naive one defined earlier. o1 <- sc.fit1$summary() o2 <- sc.fit2$summary() o3 <- sc.fit3$summary() arlim <- c(min(o1$area_km2, o2$area_km2, o3$area_km2)-5000, max(o1$area_km2, o2$area_km2, o3$area_km2)) plot(area_km2~band, data = o1, type = 'n', ylim = arlim, main = "Suitable habitat projection", ylab = "Amount of area (km2)", xlab = "Time") lines(o1$area_km2~o1$band, col = "black", lty = 1) lines(o2$area_km2~o2$band, col = "black", lty = 2) lines(o3$area_km2~o3$band, col = "black", lty = 3) legend("bottomleft", legend = c("Unlimited dispersal", "Constrained dispersal", "Constrained dispersal and niche limit"), lty = c(1, 2, 3), cex = 1.2, bty = "n")
# Lastly it is also possible to directly summarize the state # before (usually first year) and end (last year). sc.fit2$summary_beforeafter() #> # A tibble: 13 × 5 #> runname category period value unit #> <chr> <chr> <chr> <dbl> <chr> #> 1 Simple PPM Current range 2016-01-01 415. ha #> 2 Simple PPM Future range 2100-01-01 401. ha #> 3 Simple PPM Unsuitable 84 years 206. ha #> 4 Simple PPM Loss 84 years 15.3 ha #> 5 Simple PPM Gain 84 years 1.12 ha #> 6 Simple PPM Stable 84 years 400. ha #> 7 Simple PPM Percent loss 84 years 3.69 % #> 8 Simple PPM Percent gain 84 years 0.270 % #> 9 Simple PPM Range change 84 years -14.2 ha #> 10 Simple PPM Percent change 84 years -6.41 % #> 11 Simple PPM Sorensen index 84 years 0.983 similarity #> 12 Simple PPM Centroid distance 84 years 20.2 km #> 13 Simple PPM Centroid change direction 84 years 19.4 deg
Another option for constraining prediction is also by imposing a
zonal limit (for instance climatically defined) on the projections (see
add_constraint_boundary() above). This has to
be done while fitting the SDM for the reference conditions (see the
example with limits (1) ) and
is considered when doing (future) projections.
Specific parsers for GLOBIOM related scenarios
IIASA’s Global Biosphere Management Model (GLOBIOM) is a partial equilibrium model and used to analyze the competition for land use between agriculture, forestry, and bioenergy, which are the main land-based production sectors. It builds on.
With the ibis.iSDM being part of IIASA’s suite of integrated models,
there is a direct link available to make use of downscaled GLOBIOM
outputs. Implemented are functions to either directly format the data
formatGLOBIOM()] or add them to any
BiodiversityScenario-class object directly via