
This function conducts a model evaluation based either on the fitted point data or on any supplied independent point data. Currently, only point datasets are supported. Validating integrated models requires more work.


# S4 method for ANY,character,sf,character,character

# S4 method for SpatRaster,character,sf,character



A fitted BiodiversityDistribution object with set predictors. Alternatively, a SpatRaster can be provided directly, in which case the point layer must also be supplied.


Should the validation be conducted on the continuous prediction or on a (previously calculated) thresholded layer in binary format? Note that different metrics are computed depending on the method. See Details.


In case multiple layers exist, which one to use? (Default: 'mean').


An sf object with geometry type POINT or MULTIPOINT.


A character vector with the name of the column containing the independent observations. (Default: 'observed').


Other parameters that are passed on. Currently unused.


Returns a tidy tibble with the validation results.


The 'validate' function calculates different validation metrics depending on the output type.

The output metrics for each type are defined as follows:

Continuous:

  • 'n' = Number of observations.

  • 'rmse' = Root Mean Square Error, $$ \sqrt {\frac{1}{N} \sum_{i=1}^{N} (\hat{y_{i}} - y_{i})^2} $$

  • 'mae' = Mean Absolute Error, $$ \frac{ \sum_{i=1}^{N} | y_{i} - x_{i} | }{N} $$

  • 'logloss' = Log loss, TBD

  • 'normgini' = Normalized Gini index, TBD

  • 'cont.boyce' = Continuous Boyce index, TBD
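The two continuous metrics with a stated formula can be sketched in base R as follows. This is only an illustration of the formulas, not the package's internal implementation; the vectors `observed` and `predicted` are made-up toy values.

```r
# Hypothetical toy data (assumption, not from any real model)
observed  <- c(0.2, 0.5, 0.9, 0.4)
predicted <- c(0.1, 0.7, 0.8, 0.4)

# 'n': number of observations
n <- length(observed)

# 'rmse': square root of the mean squared difference
rmse <- sqrt(mean((predicted - observed)^2))

# 'mae': mean of the absolute differences
mae <- mean(abs(predicted - observed))

print(c(n = n, rmse = rmse, mae = mae))
```

Because RMSE squares the differences before averaging, it penalises large individual errors more strongly than MAE does.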


Discrete:

  • 'n' = Number of observations.

  • 'auc' = Area under the curve, TBD

  • 'overall.accuracy' = Overall Accuracy, TBD

  • 'true.presence.ratio' = True presence ratio or Jaccard index, TBD

  • 'precision' = Precision, TBD

  • 'sensitivity' = Sensitivity, TBD

  • 'specificity' = Specificity, TBD

  • 'tss' = True Skill Statistic, TBD

  • 'f1' = F1 Score (harmonic mean of precision and sensitivity), $$ \frac{2TP}{2TP + FP + FN} $$

  • 'logloss' = Log loss, TBD

  • 'expected.accuracy' = Expected Accuracy, $$ \frac{TP + FP}{N} \times \frac{TP + FN}{N} + \frac{TN + FN}{N} \times \frac{TN + FP}{N} $$

  • 'kappa' = Kappa value, $$ \frac{2 (TP \times TN - FN \times FP)}{(TP + FP) \times (FP + TN) + (TP + FN) \times (FN + TN)} $$

  • 'brier.score' = Brier score, $$ \frac{ \sum_{i=1}^{N} (y_{i} - x_{i})^{2} }{N} $$, where $y_i$ is the predicted presence or absence and $x_i$ the observed value.

In the formulas above, TP denotes the number of true positives, TN true negatives, FP false positives and FN false negatives.
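The confusion-matrix formulas above can be checked with a short base R sketch. The counts are invented for illustration. The definitions of sensitivity, specificity and TSS are the common textbook ones and are an assumption here, since this page lists them as TBD; the package may compute them differently.

```r
# Hypothetical confusion matrix counts (assumption, not from a real model)
TP <- 40; FP <- 10; FN <- 5; TN <- 45
N  <- TP + FP + FN + TN

# Formulas stated on this page
f1 <- 2 * TP / (2 * TP + FP + FN)
expected_accuracy <- ((TP + FP) / N) * ((TP + FN) / N) +
  ((TN + FN) / N) * ((TN + FP) / N)
kappa_value <- 2 * (TP * TN - FN * FP) /
  ((TP + FP) * (FP + TN) + (TP + FN) * (FN + TN))

# Textbook definitions (assumed, listed as TBD on this page)
overall_accuracy <- (TP + TN) / N
sensitivity <- TP / (TP + FN)
specificity <- TN / (TN + FP)
tss <- sensitivity + specificity - 1

print(round(c(f1 = f1, expected.accuracy = expected_accuracy,
              kappa = kappa_value, tss = tss), 3))
```

With these counts, the kappa formula gives the same value as the more familiar form (overall accuracy - expected accuracy) / (1 - expected accuracy), which is a quick consistency check between the two formulas.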


If you use the Boyce Index, please cite the original Hirzel et al. (2006) paper.


  • Liu, C., White, M., & Newell, G. (2013). Selecting thresholds for the prediction of species occurrence with presence-only data. Journal of Biogeography, 40, 778-789.

  • Hirzel, A. H., Le Lay, G., Helfer, V., Randin, C., & Guisan, A. (2006). Evaluating the ability of habitat suitability models to predict species presences. Ecological Modelling, 199(2), 142-152.


if (FALSE) {
 # Assuming that mod is a fitted distribution object,
 # first add a thresholded (binary) layer to it
 mod <- threshold(mod, method = "TSS")
 # Then validate against the thresholded layer
 validate(mod, method = "discrete")
}