The spatial allocation model for Country-level analysis (SPAMc) includes two types of models: (1) Minimization of cross-entropy, which is used in the global SPAM (You and Wood 2006; You, Wood, and Wood-Sichra 2009; You et al. 2014; Yu et al. 2020) and the maximization of fitness score, which was developed by Van Dijk et al. (2020).
The cross-entropy approach (min_entropy
) is defined as a
non-linear optimization problem in which the error between prior
information on the location of crop area shares (\(\pi_{ij}\)) and model allocated area shares
(\(s_{ijl}\)) is minimized subject to a
number of constraints.1 More precisely the cross-entropy objective
function (\(CE\)) is specified as
follows:
\[\begin{equation} \min\ CE(s_{ijl}, \pi_{ijl})= \sum_{i} \sum_{j} \sum_{l}s_{ijl} \ lns_{ijl}- \sum_{i} \sum_{j} \sum_{l}s_{ijl} \ ln\pi_{ijl} \end{equation}\]
In the maximization of fitness score approach
(max_score
) the spatial allocation model is defined as a
linear optimization problem in which the weighted ‘fitness score’ over
all grid cells, crops and farming systems is maximized. The ‘fitness’
score, which ranges between 0 and 1, is a composite indicator of the
economic and biophysical suitability for each crop-farming system
combination at the grid cell level. The solution to the model implies
that on average all crops will be allocated to locations, which are
considered as most appropriate from both an economic and biophysical
viewpoint. The objective function (\(S\)) is defined as follows:
\[\begin{equation} \max\ S(s_{ijl}, score_{ijl})= \sum_{i} \sum_{j} \sum_{l}s_{ijl} \times score_{ijl} \end{equation}\]
where, \(s_{ijl}\) is the share of total physical area of crop \(j\) and farming system \(l\) allocated to grid cell \(i\), and \(score_{ijl}\) is the fitness score for all \(j\) and \(l\) combinations.
Cross entropy minimization is better suited for maps at a resolution of 5 arcmin as it results in a more fragmented allocation per grid cell. This is desirable when the resolution is relatively coarse (e.g. 5 arcmin) and the number of crops and farming systems that occur in a grid cell can be relatively large. In contrast, maximizing the fitness score results in a much more concentrated crop area allocation, where only a few crops and farming systems are allocated to a grid cell. This makes sense for relatively small grid cells (e.g. 30 arcsec) where one only expects a small number of crop and farming systems but is not realistic for large grid cells. Hence, we recommend to use the cross-entropy model for 5 arcmin resolution and the fitness score model for 30 arcsec resolution.
To solve SPAMc and depending on the model, fitness scores or prior area information are needed to allocate the statistics on the grid. The fitness scores and priors are informed by a combination of economic (e.g. market access and population density) and biophysical factors (e.g growing conditions determined by soil, temperature and precipitation) that jointly determine the location of crops. As these factors will have different impact across farming systems, we define separate fitness scores and priors for each of them.
For subsistence farmers, who mainly grow crops for their own consumption, we assume that rural population density can be used as proxy for the location of subsistence farming systems. We used min-max normalization to create an index of rural population density between 0 and 1:
\[\begin{equation} score_{ijl} = \frac{rpd_{ijl}-min(rpd_{ijl})}{max(rpd_{ijl})-min(rpd_{ijl})} \quad l = subsistence \quad \forall i\ \forall j \end{equation}\]
where \(rpd_{ijl}\) is the rural population density defined for each grid cell \(i\), crop \(j\) and farming system \(l\), which is converted to the \(score_{ijl}\) between 0 and 1. In contrast to the other three systems, the allocation of subsistence farmers is modeled by means of a constraint instead of maximizing its score by means of the objective function (see next section). In this way, it is ensured that the total subsistence crop area is allocated proportional to the rural population density. Distributing subsistence farming systems by means of maximizing the fitness score, might lead to a solution in which a large share of subsistence crop area will be allocated to grid cells with the highest rural population density. Such a concentrated allocation is not realistic for subsistence farming.
Low-input rainfed systems are characterized by smallholder farms that mostly sell their products on local markets. We assume that farmers will specialize in growing crops with the highest biophysical suitability, while other factors such as market access, plays a minor role for this farming system. The fitness score is defined as follows:
\[\begin{equation} score_{ijl} = \frac{suit\_index_{ijl}-min(suit\_index_{ijl})}{max(suit\_index_{ijl})-min(suit\_index_{ijl})} \quad l = low\ input \quad \forall i\ \forall j \end{equation}\]
where \(score_{ijl}\) and \(suit\_index_{ijl}\) are the fitness score and the biophysical suitability score for grid cell \(i\), crop \(j\) and farming system \(l\), respectively. Similar to the fitness score for subsistence farming, we apply the min-max normalization approach to create an index in the range of 0-1.
We assume that high-input rainfed and irrigated farming systems mostly consist of medium to large commercial farms, which location is largely determined by profit maximization. To simulate this, the fitness core is defined as the square root of potential revenue and market access:
\[\begin{equation} score_{ijl} = \frac{\sqrt{access_{i} \times rev_{ijl}}-min(\sqrt{access_{i} \times rev_{ijl}})}{max(\sqrt{access_{i} \times rev_{ijl}})-min(\sqrt{access_{i} \times rev_{ijl}})} \quad l \in high\ input,\ irrigated \quad \forall i\ \forall j \end{equation}\]
where \(access_{ijl}\) is market access approximated by the inverse of travel time to large cities and \(rev_{ijl}\) is potential revenue, calculated as the product of potential yield (\(yield_{ijl}\)) for grid cell \(i\), crop \(j\) and system \(l\) and national level crop price (\(price_j\)):
\[\begin{equation} rev_{ijl} = pot\_yield_{ijl} \times price_{i} \end{equation}\]
Again, we use the min-max normalization to create an index in the range of 0-1.
We use the fitness scores as a basis to estimate the priors. To priors for subsistence farming are calculated as follows:
\[\begin{equation} prior_{ijl} =\frac{score_{ijl}}{\sum_{j} \sum_{l}{score_{ijl}}} \times crop\_area_{jl} \quad l = subsistence \quad \forall i\ \forall j \end{equation}\]
where \(prior_{ijl}\) is the prior crop area in ha for subsistence farming defined for each grid cell \(i\) and crop \(j\) and \(crop\_area_{jl}\) is the total physical area for per crop for the subsistence farming system taken from the statistics.
The prior areas for the other three systems are estimated by distributing the residual grid cell cropland (i.e. after subtracting the prior area for subsistence farming systems) using the scores as weights:
\[\begin{equation} prior_{ijl} = \frac{score_{ijl}}{\sum_{j} \sum_{l} score_{ijl}} \times residual\_cropland_{i} \quad l \in low\ input,\ high\ input,\ irrigated \quad \forall i\ \forall j \end{equation}\]
where \(residual\_cropland_{i}\) is the grid cell cropland after subtracting the prior for subsistence farming. In case the prior area for subsistence farming is larger than than the cropland in the grid cell all the priors for the other farming systems are set to zero.
Finally, the priors are converted in to crop area shares (\(\pi_{ij}\)), which are an input into the cross-entropy objective function:
\[\begin{equation} \pi_{ij} = \frac{prior_{ijl}}{\sum_{i}{prior_{ijl}}} \end{equation}\]
The allocation of crop area is determined by either minimizing the cross-entropy function or maximizing the fitness score subject to a set of constraints:
\[\begin{equation} 0 \leq s_{ijl} \leq 1 \end{equation}\]
\[\begin{equation} \sum_{i}s_{ij}=1 \qquad \forall j\ \forall l \end{equation}\]
\[\begin{equation} \sum_{j} \sum_{l}crop\_area_{jl} \times s_{ijl} \leq avail_{i} \qquad \forall i \end{equation}\]
where \(crop\_area_{jl}\) is total national-level physical area for crop \(j\) and farming system \(j\).
\[\begin{equation} \sum_{i \in k} \sum_{l}crop\_area_{jl} \times s_{ijl} = sub\_crop\_area_{jk} \qquad \forall j \in J\ \forall k \end{equation}\]\end{equation}
\[\begin{equation} \sum_{j \in L}crop\_area_{jl} \times s_{ijl} \leq irr\_area_{i} \qquad \forall i \end{equation}\]
max_score
model), that
allocates the subsistence farming system crops proportional to rural
population density. In cases, where the biophysical conditions are
considered not suitable for a certain crop, we assume a zero
allocation:\[\begin{equation} \bar{s}_{ijl} = \frac{rur\_pop_i}{\sum_{i}{rur\_pop_i}} \quad l = subsistence \quad \forall i\ \forall j \end{equation}\]
where \(\bar{s}_{ijl}\) is the share of physical area allocated to grid cell \(i\) for crop \(j\) and farming system \(l\) and \(Pop_i\) is the rural population density in grid cell \(i\).
More precisely, SPAM uses a relative entropy framework because it minimizes the difference between the cross-entropy of probability distribution \(P\) with a reference probability distribution \(Q\) and the cross entropy of \(P\) with itself. This is also sometimes referred to as the the Kullback–Leibler divergence (Kullback and Leibler 1951)↩︎