Add a control to a BiodiversityModel object to control biases
Source: R/add_control_bias.R
add_control_bias.Rd
Sampling and other biases are pervasive drivers of the spatial location of biodiversity datasets. While the integration of other, presumably less biased data can be one way of controlling for sampling biases, another way is to control directly for the bias in the model. Currently supported methods are:
"partial"
- An approach described by Warton et al. (2013) that controls for bias by including a specified variable ("layer") in the model but "partialling" it out during the projection phase. Specifically, the variable is set to a specified value ("bias_value"), which by default is the minimum value observed across the background.
"offset"
- Dummy method that points to the add_offset_bias() functionality (see Note). Makes use of offsets to factor out a specified bias variable.
"proximity"
- Uses the proximity or distance between points as a weight in the model. This option effectively places greater weight on points farther away from other points. Note: In the best case this can control for spatial bias and aggregation; in the worst case it can place a lot of emphasis on points that are likely outliers or misidentifications (in terms of species).
See also the Details section for further explanation.
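The idea behind "partial" can be illustrated outside the package with a plain Poisson regression. The following is a minimal sketch using stats::glm on simulated data; the variable names (env, bias, counts) and all simulated values are hypothetical, and it does not reproduce the internals of add_control_bias(). The bias variable enters the fit as an ordinary predictor, and for projection it is fixed at its minimum observed value.

```r
# Simulate hypothetical data: counts depend on an environmental covariate
# and on a sampling-bias variable (both names are assumptions).
set.seed(42)
n <- 200
dat <- data.frame(
  env  = runif(n),  # hypothetical environmental covariate
  bias = runif(n)   # hypothetical sampling-effort layer
)
dat$counts <- rpois(n, exp(0.5 + 1.2 * dat$env + 1.5 * dat$bias))

# Fit with the bias variable included as a regular predictor.
fit <- glm(counts ~ env + bias, family = poisson(), data = dat)

# "Partial out" the bias for projection: fix it at its minimum value.
newdat <- dat
newdat$bias <- min(dat$bias)

pred_naive   <- predict(fit, newdata = dat,    type = "response")
pred_partial <- predict(fit, newdata = newdat, type = "response")
```

The two predictions differ only by the factor exp(beta_bias * (bias - min(bias))), so the spatial pattern attributable to the bias variable is removed from pred_partial.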
Usage
add_control_bias(
x,
layer,
method = "partial",
bias_value = NULL,
maxdist = NULL,
alpha = 1,
add = TRUE
)
# S4 method for class 'BiodiversityDistribution'
add_control_bias(
x,
layer,
method = "partial",
bias_value = NULL,
maxdist = NULL,
alpha = 1,
add = TRUE
)
Arguments
- x
distribution() (i.e. BiodiversityDistribution) object.
- layer
A sf or SpatRaster object with the range for the target feature. Specify a variable that is not already added to "x" to avoid issues with duplications.
- method
A character vector describing the method used for bias control. Available options are "partial" (Default), "offset" or "proximity".
- bias_value
A numeric with a value for "layer". Specifying a numeric value here sets layer to the target value during projection. By default the value is set to the minimum value found in the layer (Default: NULL).
- maxdist
A numeric giving the maximum distance if method "proximity" is used. If unset, it defaults to the distance to the centroid of a minimum convex polygon encircling all points.
- alpha
A numeric giving the initial weight to points if method "proximity" is used (Default: 1). For example, if set to values smaller than 1, neighbouring points will be weighted less.
- add
A logical specifying whether a new offset is to be added. Setting this parameter to FALSE replaces the current offsets with the new one (Default: TRUE).
Value
Adds a bias control option to a distribution object.
Details
In the case of "proximity", weights are assigned to each point, placing
higher weight on points further away and with less overlap. Weights are
assigned up to a maximum distance, which can be provided by the user
(parameter "maxdist"). This distance is ideally informed by some
knowledge of the species to be modelled (e.g., maximum dispersal distance).
If not provided, it is set to the distance to the centroid of a minimum
convex polygon encircling all observations. The parameter "alpha" is a
weighting factor which can be used to diminish the effect of neighbouring
points.
For a given observation \(i\), the weight \(w_i\) is defined as
$$w_i = 1 / (1 + \epsilon)$$ where $$\epsilon = \sum_{n=1}^{N}((1 -
d_n)/d_{sac})^\alpha$$ in which \(N\) is the total number of points closer
than the maximum distance (\(d_{sac}\)) of point \(i\), and \(d_n\) is the
distance between focal point \(i\) and point \(n\).
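The weighting can be sketched in base R as follows. This is an illustration only, not the package's internal code: the helper name proximity_weights() is hypothetical, distances are plain Euclidean distances on a coordinate matrix, the focal point is excluded from its own neighbourhood, and each neighbour's term is read as 1 - d_n/d_sac (distance rescaled by the maximum distance) so that every contribution lies between 0 and 1. That reading is an assumption about the formula and may differ from the implementation.

```r
# Hypothetical helper: proximity weights for a two-column coordinate matrix.
# Assumption: each neighbour within `maxdist` contributes (1 - d / maxdist)^alpha.
proximity_weights <- function(coords, maxdist, alpha = 1) {
  stopifnot(is.matrix(coords), ncol(coords) == 2, maxdist > 0)
  d <- as.matrix(dist(coords))  # pairwise Euclidean distances
  diag(d) <- NA                 # a point is not its own neighbour
  eps <- apply(d, 1, function(dn) {
    near <- dn[!is.na(dn) & dn < maxdist]  # neighbours closer than maxdist
    sum((1 - near / maxdist)^alpha)
  })
  unname(1 / (1 + eps))
}

# Two clustered points and one isolated point.
coords <- rbind(c(0, 0), c(0, 1), c(10, 10))
w <- proximity_weights(coords, maxdist = 2, alpha = 1)
```

Under these assumptions the two clustered points each receive a weight of 2/3, while the isolated point keeps a weight of 1, which matches the stated intent of down-weighting aggregated observations.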
Note
Covariate transformations applied to other predictors need to be applied to the bias variable too.
Another option to account for biases, particularly in Poisson point process models,
is to remove them through an offset. Functionality to do so is available
through the add_offset_bias() method. Setting the method to "offset"
will automatically point to this option.
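As a generic illustration of what an offset does in a Poisson regression, the sketch below uses stats::glm on simulated data. It does not show the behaviour of add_offset_bias(); the variable names (env, effort, counts), the simulated values, and the choice of a log-scale offset are all assumptions made for the example.

```r
# Simulate hypothetical counts that scale with sampling effort.
set.seed(1)
n <- 200
dat <- data.frame(
  env    = runif(n),          # hypothetical environmental covariate
  effort = runif(n, 0.1, 1)   # hypothetical sampling effort (strictly positive)
)
dat$counts <- rpois(n, dat$effort * exp(0.5 + 1.2 * dat$env))

# The offset enters with a fixed coefficient of 1 on the log scale,
# so no parameter is estimated for the effort variable.
fit_off <- glm(counts ~ env + offset(log(effort)),
               family = poisson(), data = dat)

# Predict at unit effort: the offset log(1) = 0 drops out of the prediction.
newdat <- transform(dat, effort = 1)
pred_unit <- predict(fit_off, newdata = newdat, type = "response")
```

In contrast to including the bias variable as a predictor, the offset fixes its effect in advance rather than estimating it from the data.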
References
Warton, D.I., Renner, I.W. and Ramp, D., 2013. Model-based control of observer bias for the analysis of presence-only data in ecology. PLoS ONE, 8(11), p.e79168.
Merow, C., Allen, J.M., Aiello-Lammens, M., Silander, J.A., 2016. Improving niche and range estimates with Maxent and point process models by integrating spatially explicit information. Glob. Ecol. Biogeogr. 25, 1022–1036. https://doi.org/10.1111/geb.12453
Botella, C., Joly, A., Bonnet, P., Munoz, F. and Monestiez, P., 2021. Jointly estimating spatial sampling effort and habitat suitability for multiple species from opportunistic presence-only data. Methods in Ecology and Evolution, 12(5), 933-945.
Examples
if (FALSE) { # \dontrun{
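# Add a bias variable; with bias_value = NULL it is set to its minimum
# value during projection (method = "partial" is the default).
x <- distribution(background) |>
  add_predictors(covariates) |>
  add_control_bias(biasvariable, bias_value = NULL)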
} # }