1. Introduction
Linear programming (LP) has been used to support forest management planning since the [
1,
2,
3,
4,
5,
6]. Many variations of these models have been proposed, including, for example, multi-objective models [
7,
8,
9,
10,
11,
12,
13,
14], spatially-explicit mixed integer programming (MIP) models [
15,
16,
17,
18,
19,
20], and models incorporating uncertainty [
21,
22,
23,
24,
25,
26,
27]. While earlier forest planning LP models often considered only one type of management intervention – final harvest and regeneration of even-aged management units – since at least Randall [
28], analysts have been incorporating a variety of other interventions – such as thinning [
28] and coppice harvests [
29]. These models originally focused on scheduling timber harvests in even-aged forests to maximize harvest volumes or net present values, but today these models’ management objectives include a diverse range of ecosystem services [
14,
30,
31]. Multiple demands on forests for a variety of products and services have led to the development of multifunctional forestry planning models that account for timber and non-timber values and services, such as berry and mushroom picking [
32], carbon sequestration [
33,
34,
35,
36], habitat sustainability, deadwood availability, carbon storage [
37,
38], biodiversity conservation indicators [
39], and hunting benefits [
40].
Johnson and Scheurman [
41] formalized two fundamental ways to formulate LP models of the forest management planning problem, which they named Model I and Model II. Other authors [
22,
42,
43,
44] have used a third formulation that has been referred to as Model III [
45,
46]. While hundreds of papers have been written about LP and MIP forest planning models, most models described in the literature have used Model I [
47]. We have found only a few articles that use Model II. Nanang and Hauer [
40] use Model II to build a model exploring tradeoffs between timber production and hunting benefits; Diaz-Balteiro et al. [
48] present a special harvest scheduling model for a community forest; and Grossberg [
49] integrates road access to harvest scheduling. While not strictly an LP model, Sessions et al. [
50] use simulated annealing to solve a harvest scheduling model with spatial control whose design is based on Model II. However, at least two forest management software packages used for industrial and agency applications at least give the option of using Model II. Woodstock Optimization Studio [
51] a forest planning application widely used in the forest products industry, and PRISM [
52], a U.S. Forest Service application, both allow Model II formulations. We have identified relatively few articles that use Model III [
22,
42,
43,
53,
54]. Salo and Tahvonen [
44] list Model II in their key words, but their model is better described as Model III.
There often are multiple ways to formulate any given real-world problem as a mathematical model. When different formulations all produce the same solution, it is useful to assess the advantages and disadvantages of the different formulations, and whether it matters at all which formulation is used. A few papers have compared the relative performance of Model I vs. Model II. Johnson and Scheurman [
41], who introduced Model II, focused on model size and solution time in the context of large LP models, which in 1977 were resource-intensive to formulate and solve given the computing capacity of the time. Their concern with model size led them to focus on the possibility of combining analysis areas that differ only in their initial age class after a clearcut harvest (since the initial age class is no longer relevant). Barros and Weintraub [
55] also focus on the ability to reduce the number of variables in Model II by combining similar management units when they are harvested in the same period. Today, however, model size is seldom a significant concern for non-spatial LP models. However, model size and solution time is (and probably always will be) a concern for spatially explicit models, which generally involve binary variables [
16]. However, in spatially explicit models all management units differ in terms of their location, so combining management units removes the ability to model a management unit’s locations. McDill et al. [
56] and Martin et al. [
57] compared Model I and Model II formulations in the context of MIP planning models, focusing primarily on solution time. These papers came to opposite conclusions, with McDill et al. [
56] finding that Model II performed better and Martin et al. [
57] giving the advantage to Model I. Eriksson [
58] compared Model I and Model II in the context of modeling uncertainty and concluded that Model II more easily accommodates a stochastic approach and produces smaller stochastic models due to its tree-graph design instead of the matrix design of Model I. Reed and Errico [
22] take advantage of the structure of Model III for addressing uncertainty related to fire. Based on these few studies, it is difficult to determine whether one formulation is superior for any given application, and one might conclude that perhaps no formulation will consistently outperform another.
Additional criteria could also be considered when comparing different model formulations, including how easy they are to understand and the insight they provide into the structure of the real-world problem being modeled. For example, Model I might have been used more than Model II because it is easier to understand, or because it fits more naturally into the matrix-based structure of a LP model, or even simply because more papers use it and that is the formulation most analysts are familiar with. Especially for large models, storage considerations and model formulation time may also be important. In any case, unless one formulation is uniformly superior to another, having multiple ways of formulating a mathematical model is generally a good thing since one formulation may work better for one situation while another formulation works better for another problem.
This paper revisits Models I, II, and III and considers them from a new perspective. We present the three model formulations using a consistent notation that illustrates the similarities and differences between them. Our presentation of the models shows the matrix perspective underlying Model I, and the network – or decision tree – perspective underlying Models II and III. This difference has also been noted by Garcia [
42] in the case of Model III, and Erickson [
58] in the case of Model II. Additionally, we suggest that while Model II has a network structure, it has an intervention-focused perspective, while Model III reflects a state trajectory perspective. This state-focused perspective can be particularly useful when the problem involves objectives or constraints, such as water yields, forest carbon storage, and other ecosystem services that depend on the state of the forest at any given time, rather than the inputs and outputs that result from interventions. Most applications of Models II and III have considered final harvest to be the only type of intervention. Garcia [
42] discusses the possibility of thinning, but thinning was not included in the actual model described in that paper. Our presentation of these model formulations allows any type of intervention.
We believe there has been some ambiguity in the specific definitions of the Model I, II, and III formulations, which we hope this paper will resolve. Several papers [
41,
42,
55] have emphasized the possibility of combining management units after clearcutting, which is possible with both Models II and III but not with Model I. In fact, Gunn [
46], Martin et al. [
57], and Nguyen et al. [
52]
require (as part of the definition of these models) the merging of management units that differ only in their age when they are regeneration harvested in the same planning period. But we propose an alternative definition of these models that does not identify this as the salient difference between the three model formulations. Our definition differentiates the models based solely on the sets of periods represented by the model variables. Thus, by our definition:
In Model I, a variable represents the state of the area assigned to it for an entire sequence of states, interventions, and non-interventions covering the entire planning horizon of the model. In Model II, a variable represents the area assigned to it for one of three sets of periods: 1) the set of periods between two interventions, 2) the set of periods from the initial state to the first intervention, or 3) the set of periods from the final intervention to the end of the planning horizon. In Model III, a variable represents the state of the area assigned to it for one pair of adjacent periods.
Note that we use the term “intervention” in a very general sense here. Alternatively, a variable in Model II could be defined as representing the area assigned to it for the set of periods 1) between two stand-replacing interventions (e.g., regeneration harvests), 2) from the initial state to the first stand-replacing intervention, or 3) from the final stand-replacing intervention to the end of the planning horizon. This is, in fact, the traditional definition used by, among others, Gunn [
46], Martin et al. . [
57], and Nguyen et al. [
52]. This version could, perhaps, be called Model IIB, while our definition could be called Model IIA. This distinction is moot when regeneration harvests are the only interventions considered, as is often the case. It is perhaps less confusing to use Model IIB when the emphasis is on merging similar areas after regeneration harvests (as do Gunn [
46], Martin et al. [
57], and Nguyen et al. [
52]). However, when one has a variety of interventions, including some that are not regeneration harvests (e.g., thinning), Model IIB becomes more complex, as in many cases there will be more than one sequence of interventions between any two regeneration harvests. Model IIA can accommodate merging similar areas after regeneration harvests, but one would have to differentiate between stand-replacing interventions and other interventions. We do not consider this possibility in this paper because it would complicate the exposition and possibly perpetuate confusion and ambiguity in the definitions of the three model formulations.
After we review the structure of the three formulations, we compare the size and performance of Model I, II and III formulations in the context of a case study of strategic planning in the management of a eucalyptus plantation problem in Brazil. Our intent is not to demonstrate that any of the model formulations are necessarily superior for all problem instances. Rather, the example is intended to illustrate the differences in model size, formulation time, and solution time and when and how different formulations might be preferred. We do not evaluate the performance of the different model formulations in the context of spatially explicit MIP formulations of the forest ecosystem management planning problem.
2. Methods
This section consists of three parts.
Section 2.1 to
Section 2.4 specify our formulation of Models I, II, and III.
Figure 1,
Figure 2 and
Figure 3 use decision trees for a single example management unit to illustrate the structure of the three models, emphasizing the differences and similarities in each model’s decision variable definitions, area conservation constraints, and accounting constraints. Particular attention is paid to modeling ecosystem services in harvest scheduling models, as the advantages of the Model III formulation are more apparent when multiple ecosystem services are modeled.
Section 2.5 and
Section 2.6 describe a real-world forest ecosystem management problem involving a 12,732-hectare industrial eucalyptus plantation in São Paulo, Brazil, comprised of 350 management units. This forest area is managed primarily to provide pulpwood for a nearby mill, but other ecosystem services, including carbon sequestration and water conservation are considered. This example problem will be used to conduct a comparative analysis of the three model formulations in the context of three scenarios, each incorporating an additional ecosystem service.
Section 2.7 details the experimental setup, including computational resources, solver specifications, and the methodology for running and evaluating the scenarios. It also identifies the performance metrics used to compare the models.
2.1. Example Decision Treee Graph Representation for a Single Management Unit
It will be useful to consider how the variables in each model represent the management alternatives for a single example management unit as we describe the three forest management model formulations in the coming sections. The alternatives for this example unit are presented as a decision tree in
Figure 1. The figure contains three types of nodes: the initial node (brown), non-intervention nodes (gray), and intervention nodes (red, blue and green). We do not differentiate between stand-replacing interventions and non-stand-replacing interventions. The specific nature of the interventions is not critical for the example (or for our model definitions), but when the possibility of an intervention exists, the decision tree branches to represent the decision to either implement the intervention (represented by an intervention node) or not (represented by a non-intervention node). Thus, for example, in period 2 there is a decision to either conduct an intervention (node 3 in
Figure 1) or not (node 18 in
Figure 1).
For each model, we first define the sets, variables and parameters needed to formulate the model and then we give the equations of the model. Alternative equations for a spatially-explicit model are presented following each non-spatially-explicit model formulation. Analogous equations for spatially-explicit models are given the same number, and a prime symbol indicates that an equation is for the spatially-explicit model. As we introduce new models, sets, variables and parameters that have been defined for previous models are not repeated.
Table 1, following the presentation of the models, lists the sets for each model for the example networks shown in
Figure 1,
Figure 2 and
Figure 3.
2.2. Model I Formulation
In Model I, each variable represents a complete management alternative for a management unit for the entire planning horizon.
Figure 2 illustrates this by representing the management alternatives from
Figure 1 as separate sequences of nodes for each complete management prescription. For example, the first prescription is represented by the sequence {1, 2, 3, 4, 5, 6, 7, 8}, and the second prescription is represented by the sequence {1, 2, 3, 4, 9, 10, 11, 12}. In a non-spatially-explicit model, the variable
would represent the area of unit
u where prescription 1 would be applied. In a spatially-explicit model, the variable
would take a value of one or zero, representing, respectively, whether prescription 1 would be applied to unit
u or not. The number of Model I variables for a given management unit equals the number of ending nodes in that unit’s decision tree graph.
2.2.1. Sets
set of economic quantities, ecosystem services, or landscape conditions (e.g., costs, revenues, timber harvests, timber inventory, carbon stocks or flows, old growth, recently clear-felled area, etc.). For brevity, we will refer to any of these items simply as ecosystem services.
set of management units.
set of time periods {min{-tfu}, …, 0, 1, 2, …, T}; -tfu represents the time of the most recent (past) intervention for unit u.
set of nodes for unit u.
set of nodes for unit u in a general period t; is a subset of .
the set of nodes for unit u in the specific period T (the last period); these are called “ending nodes.”
set of prescriptions for unit u (the term “prescription” is used here to mean a management plan for a single management unit from time 0 to period T, as represented by a connected sequence of nodes from t=0 to T).
Note: there exists a one-to-one mapping from each ku to each .
sequential (ordered), connected set of nodes for unit u that represent prescription ku .
node for unit u in prescription ku for period t (note: there is only one for period t in prescription , but a node may belong to more than one prescription).
2.2.2. Variables
equals the quantity of ecosystem service e provided in period t.
Non-spatially-explicit model:
equals the area of unit u on which prescription ku will be applied.
Spatially-explicit model:
equals 1 if prescription ku is applied to unit u, 0 otherwise.
Because there is a one-to-one mapping from each
ku to each
there is one variable
for each
. For the management unit in
Figure 1, there will be eight variables.
2.2.3. Parameters
the contribution of node to ecosystem service e in period t. Note that is stored in the prescription database with the node . In the non-spatially-explicit model, these values are expressed on a per-unit of area basis; in the spatially-explicit model, they are expressed on a per-management-unit basis.
objective function preferential weight for ecosystem service or landscape condition e in period t. For all models it is assumed that .
the area of unit u.
2.2.4. Formulation
Model I for a non-spatially-explicit forest management problem can now be written as follows:
Equation (1) defines the objective function as a weighted sum of the economic values, ecosystem services, and landscape conditions produced in each period. Equation set (2) is the set of area constraints; they state that the sum of the areas from unit u assigned to each prescription must equal the area of unit u. Equation set (3) is the set of accounting constraints for each ecosystem service for each period. Equation (4) states that the sum of the objective function weights equals one. Equation sets (5) and (6) are non-negativity constraints for the model variables.
For a spatially-explicit model, replace Equation set
(2) with Equation set
(2’) and Equation set
(5) with
(5’), as follows:
Equation (2’) states that one and only one prescription can be applied to unit u, and Equation (5’) states that the X-variables are binary, i.e., they must equal either zero or one.
Upper or lower bounds on the values or changes in the values of the ecosystem service accounting variables can easily be added to the base model represented by Equations
(1) to
(6). For example, one could add constraints enforcing minimum or maximum harvest levels, minimum or maximum carbon stored in the forest, minimum levels of carbon accumulation, maximum area of cleared forest within a given zone, minimum area of forest older than a given age, maximum change in the harvest level between periods (flow constraints), etc. The possibilities are too numerous to describe them all here. Adjacency constraints, accessibility constraints, etc. can also be added to the spatially-explicit model. Again, enumerating the possibilities is beyond the scope of this article. It should be noted that a well-constructed model should include constraints on either the ending state of the forest (target forest conditions) and/or include appropriate values for possible ending states in the objective function to ensure accurate valuation of the forest resource and the sustainability of the forest beyond the planning horizon [
59].
2.3. Model II Formulation
In Model II, each variable represents the state of the area or management unit assigned to it for a period of time (sequence of nodes) that begins either at the beginning of the planning horizon or immediately following a management intervention and that ends either with an intervention or at the end of the planning horizon.
Figure 3 illustrates this by representing the alternatives from the example in
Figure 1 as they would be represented in Model II. For example, the first variable would represent the sequence {1, 2, 3}, and the second and third variables would represent the sequences {3, 4, 5} and {3, 4, 9, 10}, respectively. In Model II, the number of variables for a given management unit will equal the number of interventions that can be applied to that unit plus the non-intervention ending nodes in that unit’s decision tree graph. Equivalently, the number of variables equals the number of intermediate intervention nodes (defined below), plus the number of ending nodes.
2.3.1. New Sets
the initial node for unit u.
set of intermediate intervention nodes for unit u; these are nodes that represent an intervention, excluding the initial node (if it is an intervention node) and any intervention nodes that occur in period T; this is why this set is referred to as “intermediate” intervention nodes.
set of intermediate non-intervention nodes for unit u; this set includes all non-intervention nodes except the initial node (if it is a non-intervention node) and any non-intervention nodes that occur in period T.
Note that the set can be partitioned into four sets: .
A “prescription” in this model is a sequence of nodes that starts with either the initial node, , or an intermediate intervention node, , and ends with either an intermediate intervention node, , or an ending node, . Again, let
set of prescriptions for unit u.
set of nodes that represent the last node in a prescription.
Note that the set in Model II is different from the set in Model I. Assuming we are not merging similar areas after regeneration harvests, there is a one-to-one mapping from each prescription to each . Otherwise, there would be a many-to-one relationship between prescriptions in and nodes in . In our examples, we did not merge similar areas after regeneration harvests.
Define the function as returning the prescription that ends at node .
For each node there is a set of prescriptions that start with node . Define the set:
as the set of prescriptions that that start with node .
Similarly, define the set:
as the set of prescriptions that start with intermediate intervention node (where j is the index to the set corresponding to the intermediate intervention node ).
Define the function that returns the index j for the set corresponding to the intermediate intervention node .
set of prescriptions for unit u that contain a node in period t.
2.3.2. Variables
The variables in Model II have definitions like those in Model I. What has changed is the prescriptions (sequence of nodes) that they represent.
Non-spatially-explicit model:
equals the area of unit u on which prescription ku will be applied.
Spatially-explicit model:
equals 1 if prescription ku is applied to unit u, 0 otherwise.
2.3.3. Parameters
The parameters are the same as for Model I.
2.3.4. Formulation
Model II can now be written as follows:
Equations (1), (4), (5), and (6) are the same as for Model I. Equation set (7) is analogous to Equation set (2) from Model I, but the summation on the left-hand side is only over the prescriptions that start from the initial node. Equation set (8) are network flow conservation constraints stating that the total area entering an intermediate intervention node must equal the total area leaving the node. Equation (9) is very similar to Equation (3); the only thing that has changed is that we only sum over the set of variables that contain a node in period t.
For a spatially-explicit model, replace Equation set
(7) with Equation set
(7’) and Equation set
(5) with
(5’), as follows:
As with Model I, upper or lower bounds on the values or changes in the values of the ecosystem service accounting variables can easily be added to the base model. Adjacency constraints, accessibility constraints, etc. can also be added to the spatially-explicit model. Again, a well-constructed model should include constraints and/or objective function coefficients that ensure accurate valuation of the forest resource and the sustainability of the forest beyond the planning horizon [
59].
2.4. Model III Formulation
In Model III, each variable represents a sequence of two nodes that starts with either the initial node,
, or an intermediate node,
, and ends with either an intermediate node or an ending node,
.
Figure 1 represents Model III, where each node represents a forest state. Note that each variable corresponds to an arc in the network representation in
Figure 1. In Model III, the number of variables for a given management unit would equal the number of nodes in that unit’s decision tree graph, excluding the initial node; for the example in
Figure 1, this would result in 31 prescriptions.
2.4.1. New Sets
set of all intermediate nodes (intervention plus non-intervention) for unit u (the “c” superscript indicates that is a combination of and ).
Now, the set can be partitioned into three sets: A “prescription” in this model is a sequence of two nodes that starts with either the initial node, , or an intermediate node, , and ends with either an intermediate node, , or an ending node, . Now, let
set of prescriptions for unit u.
Note that the set in this model is different from the sets in Model I and in Model II. In this model there is a one-to-one mapping from each prescription to each . As with Model II, it is possible to merge similar areas after regeneration harvests; if this was done, there would be a many-to-one relationship between prescriptions in and nodes in .
Define the function as returning the prescription that ends at node .
For each node there is a set of prescriptions that start with that node (). Define the set:
as the set of prescriptions that that start with node .
Similarly, define the set:
as the set of prescriptions that start with intermediate node (where j is the index to the set corresponding to the node ).
Define the function that returns the index j to for the set corresponding to the intermediate node .
set of prescriptions for unit u that end with a node in period t.
2.4.2. Variables
The variables in the model are again either.
Non-spatially-explicit model:
equals the area of unit u on which prescription ku will be applied.
Spatially-explicit model:
equals 1 if prescription ku is applied to unit u, 0 otherwise.
2.4.3. Parameters
The parameters are the same as for Model I and II.
2.4.4. Formulation
Model III can now be written as follows:
Note that this model is virtually the same as Model II. The only difference is Equation Set (10), which is analogous to Equation Set (8) except that we now have one equation for each intermediate node (not just for intermediate intervention nodes) each management unit. This, and the number of nodes represented by the variables, is the only difference between Models II and III.
As with Model II, for a spatially-explicit model, replace Equation set
(7) with Equation set
(7’) and Equation set
(5) with
(5’), as follows:
To help readers follow the above descriptions of Models I, II, and III,
Table 1 lists the key sets from the three models and identifies the nodes belonging to each set from the example prescriptions in
Figure 1.
2.5. The FEM Problem Description
We created a case study of a management problem for an area of fast-growing eucalyptus plantations owned by an integrated forest products company in São Paulo, Brazil to compare various attributes of Models I, II, and II. The plantations in the case study are managed using a short-rotation coppice and renewal regime with the primary objective of pulpwood production. The plantation, comprising 350 management units over 12,732 hectares, employs clear-cuts followed by renewal (CCR) or sprouting (coppice) from stumps (CCS) at ages 6, 7, and 8 years, extending to 11 years in the first three periods for older management units. This regime aligns with regional practices, allowing only one CCS before a mandatory CCR.
We calculated the inventory, harvest volume, and current annual increment (CAI) for three site indexes (23, 26, 29 m) using yield curves provided by the company. The mean average increments at age seven (MAI7) are 26.96, 31.40, and 36.73 m3/haꞏyear in the first rotation and 22.92, 26.69, and 31.22 m3/haꞏyear in the second rotation for the three site indices, respectively.
Genetic material, mostly Eucalyptus grandis, Eucalyptus urophylla, or their hybrids, were grouped by basic wood density (BWD) properties into four categories that influence carbon sequestration calculations. The high, medium, low, and very low-density groups range from 0.463 to 0.531, 0.446 to 0.503, 0.403 to 0.488, and 0.355 to 0.481 t/m3, respectively, depending on the stand age.
The company provided the initial state of each management unit, including location (region, sub-region, and site index), genetic material, area, last intervention time, and type. A 21-year planning horizon, divided into one-year planning periods, was used, reflecting three average rotations.
Carbon sequestration calculations were based on Sanqueta et al. [
60], utilizing the Biomass Expansion Factor (BEF), Root Shoot Ratio I, and Biomass Carbon Fraction Factor (BCF). We used data from the same species collected in a similar region from Ribeiro et al. [
61]. We obtained the factors BEF = 1.2220, R = 0.1689, and BCF = 0.4323, which are ratios with no units.
Additionally, the company reported natural forest coverage per subregion, maintaining at least 20% to meet Brazilian regulations. We use sub-regions as a proxy for micro-watersheds to calculate water-related ecosystem services (ESw). There is a threshold of 20% of natural forest coverage in a micro-watershed with eucalyptus plantations, below which ESw supply tends to become unsustainable and a desirable level of 45% for the highest ESw gains [
62]. These data, with concepts from Lima and Zakia [
63], are used in our third scenario that considers ESw, as described in more detail below.
Production costs were provided per region in BRL (
Table 2; BRL = Brazilian reais, also represented by R
$). A conservative interest rate of 6% was chosen to avoid biasing the model towards early harvesting [
64,
65]. The pulpwood price, R
$ 100.00, is typical for the region. We used a social value of carbon as a proxy for the carbon value, which is R
$ 120/t of CO
2 in Brazil [
66,
67].
We calculated terminal values (TV) following the methodology outlined by Kanieski da Silva
et. al. [
59], accounting for regional cost differences and production variations specific to each region and site index. This involved initially determining a management regime that maximizes the Land Expectation Value (LEV) for each region and site index (
Table 3) and then calculating a Forest Value [
68] by age, region, and site assuming this management regime would be followed in perpetuity.
All values, tables, and assumptions from this section, including the initial state by management unit, genetic material BWD, required forest cover per sub-region, production tables, costs, LEV, TV for each potential ending condition, formulas, and a Visual Basic macro for TV calculations, are provided in the supplementary material. These data are compiled in an Excel® spreadsheet titled “Assumptions.xlsm,” along with detailed carbon calculation methodologies.
Lastly, we utilized iGen by Romero
® [
69] to generate all possible management alternatives for this FEM problem. iGen produced a tree graph similar to
Figure 1 for each management unit, resulting in a total of 426,191 nodes for all management units. It calculated a suite of parameters for each node for use in formulating the linear programming models, including age, inventory, harvest volume, discounted revenue and costs, carbon sequestration, forest cover, and an age control indicator for the ESw scenario. Additional variables maintained during alternative generation, such as region, sub-region, management unit, area, site index, period, intervention type, rotation, and genetic material, along with the rules and equations of motion, are detailed in an Excel
® spreadsheet “Alternatives.xlsx,” available in the supplementary material.
For this study, the node data were read to build non-spatial Models I, II, and III according to the formulations presented in the previous sections. While
Section 2.2,
Section 2.3, and
Section 2.4 describe both non-spatial and spatially-explicit versions of their respective models, including spatially-explicit versions of the case study models was beyond the scope of this analysis. In addition to a base case that considers only the profitability of timber production, two additional scenarios were considered with each scenario adding additional management objectives (ecosystem services). These scenarios are detailed in the following section. Each scenario was formulated using Model I, II and III, resulting in a total of nine LP formulations.
2.6. Scenarios Description
This study hypothesizes that Models I, II and III perform differently when managing for an increasing number of Ecosystem Services (ES). To examine this, we have structured scenarios with escalating complexity. Scenario 1 is a standard timber production harvest scheduling model that presents a baseline case. Scenario 2 adds carbon sequestration as an additional ES management objective to Scenario 1. Finally, Scenario 3 adds an additional water-related Ecosystem Services (ESw), thus providing the most comprehensive and complex scenario for analysis.
2.6.1. Scenario 1 – Harvest Scheduling
Scenario 1 focuses on timber production, including timber revenue, the costs of silviculture activities, and the terminal value. The objective function was to maximize the Net Present Value (NPV), which includes the sum of discounted timber management net revenue and the terminal value. The model includes flow constraints allowing ±5% production variation starting from period four. The flexibility in the first three years, during which harvests in older ages are allowed due to existing older management units, provides an adjustment period before imposing flow constraints.
The parameters necessary for this scenario – VolCut (volume of timber cut), DscTmbCst (discounted timber costs), and DscTmbRev (discounted timber revenue) – were calculated using iGen, as detailed in the previous section. During model-building phase, we used these parameters to define three corresponding sets of accounting variables by period. The VolCut accounting variables were required to establish the flow constraints, while DscTmbCst and DscTmbRev were utilized to compose the NPV objective function.
2.6.2. Scenario 2 – Carbon Analysis
In Scenario 2, we expanded the model to include carbon sequestration and its associated value. A new parameter, CRemoved (carbon removed from the atmosphere), was introduced and calculated by iGen, as detailed previously. CRemoved quantifies the CO2 in tons per hectare removed from the atmosphere and fixed in the trees’ biomass (both above and below ground) during the current period. In this scenario, the revenue parameters, which previously included only timber revenues and terminal values, were adjusted to include the value of carbon sequestration, obtained by multiplying the amount of carbon removed by the social cost of carbon. Similarly, the cost parameter was adjusted to account for carbon emissions resulting from harvesting, calculated by multiplying the carbon content in the harvested volume by the social cost of carbon. Additionally, we introduced a new set of constraints to ensure non-declining carbon sequestration, stipulating that CRemoved must not decrease over time. The objective function remains the Net Present Value (NPV), which now integrates the carbon value into the revenue and cost components.
The harvest flow constraints from Scenario 1 were also included in this model. Thus, the parameters in this scenario include VolCut (volume of timber cut), DscCosts (discounted costs), DscRevenue (discounted revenue), and CRemoved (carbon removed), used again while building the accounting constraints to compute the corresponding accounting variables, which in turn inform the flow and non-declining carbon constraints, as well as the NPV maximization objective function.
2.6.3. Scenario 3 – Timber, Carbon, and Water
In Scenario 3, we introduced binary parameters, PltCover, to represent the water protection ecosystem service (ESw). PltCover equals 1 when a plantation’s age is two years or older, and 0 otherwise. PltCover is used in a set of accounting constraints to calculate the plantation coverage (the area of soil covered by forest plantations that are at least two years old) within each micro-watershed for each period. Bounds were placed on the resulting accounting variables to set a minimum level of plantation coverage per period in each micro-watershed.
In addition, this scenario incorporated an age control variable to calculate the weighted age by area, computed as the age multiplied by the area represented by each variable divided by the total area (12,732.41 ha) for every period. A set of accounting constraints was added to populate accounting variables to track the average age of the forest per period. A set of non-declining age (
NDAge) constraints was added to ensure that this weighted age does not decrease over time because older fast-growing plantation covers are closer to natural forest cover when it comes to improving water-related ecosystem services [
63].
Finaly, the scenario retains the constraints from Scenarios 1 and 2, including flow constraints and non-declining carbon. The objective function is the same as Scenario 2, maximizing the Net Present Value (NPV), including the social value of carbon.
2.7. Experimental Setup
iGen by Romero
® was used to generate and calculate management alternatives and associated parameters. Data were stored in a remote SQL Server 2022 database [
70]. The model-building process for Models I, II, and III was executed through custom software developed in Python [
71], utilizing SQLAlchemy [
72] and Pandas [
73] packages for database connectivity.
Pyomo [
74], a Python based algebraic modeling language (AML), was used to construct the optimization models. Pyomo interfaces with the Gurobi Optimizer version 10.0.3 win64 [
75] to solve the models. Results and comprehensive control logs detailing each step of the process are saved back to the same SQL Server database. These logs are instrumental for the analysis presented in the results.
All tests were conducted on a local computer equipped with an 11th Gen Intel® Core™ i7-1165G7 CPU @ 2.80GHz, 64GB RAM, featuring four physical cores, eight logical processors, and support for up to eight threads. To ensure the accuracy and integrity of the tests, we turned off all non-essential software to prevent interference with CPU and memory usage. The test suite, encompassing nine runs (three scenarios across three models), was executed in a controlled environment to eliminate the influence of external applications on computing resources.
3. Results
As one would expect, Model I formulations had the fewest decision variables per management unit (240), Model II formulations had an intermediate number (421), and Model III had the most (1,218).
We tracked the time in our analysis of each process phase, from data retrieval from the iGen database to saving the results.
Table 4 summarizes the durations of each modeling phase for each scenario and model. Objective function values are shown in the table to confirm that the three models are indeed solving the same forest management problem.
‘Time to Get Data’ is influenced by database organization, database dialect, and internet connectivity rather than model characteristics. This phase shows minimal variance across models and scenarios, attributed to the consistent number of records processed, predetermined by the number of alternatives generated by iGen in an earlier phase. Minor time differences are due to differing parameter sets prepared for the building phase.
Model III requires the least model building time, compared with Models I and II, with this advantage increasing in more complex scenarios. For instance, Model III requires only 38% of building time required for Model I in Scenario 1, and only 27% of the Model I building time in Scenario 3. This is also true, but to a lesser degree, when Model III is compared to Model II.
Model II consistently has the lowest solver time, while Model I requires the largest solver time. However, the difference between Models II and III narrows in more complex scenarios. For example, it takes almost twice as long to solve Model III compared with Model II Scenario 1, while it takes less than 10% more time to solve Model III compared with Model II in Scenario 3. This trend suggests further testing may be needed for more complex scenarios with additional ecosystem services and alternatives.
Overall, combining building and solver times, Model III emerges as the more efficient model in all scenarios, with its relative performance improving with scenario complexity. As depicted in
Figure 4, Model III matches Model II in Scenario 1 but is more efficient in Scenarios 2 and 3, taking only 90% and 67% of Model II’s time, respectively. It should be acknowledged, however, that solving time may be more critical than building time when one is solving more-or-less the same problem many times, such as when one wishes to find the Pareto frontier with a multi-objective problem.
We broke down the building times into finer categories to delve deeper into the analysis. For the building process, we detailed the time required for each linear programming element to be provided to the solver. We also analyzed the solver process using Gurobi’s presolve data.
Table 5 breaks down the building time for Scenario 3, showing the duration of each model-building step. The “look-up tables” are necessary for linking decision variables to nodes and establishing node connections (i.e., building the sets and functions described in
Section 2.1 to
Section 2.4). This step consumes more time in Model III due to the larger number of variables and connections. This is also reflected in the “Conservation of Area Declaration,” where Model I has fewer rows than Model II, and Model II has fewer than Model III (
Table 6). Model III also requires more decision variables (linear programming columns) than Models I and II (
Table 6), resulting in longer set declaration times. However, the time for declaring other sets (periods, ages, sub-regions, management units) is relatively minor. The time for parameter declaration is consistent across models, as parameters are indexed identically by node for all models.
The notable difference in total building time is clearly due to setting up the accounting variables and their corresponding accounting constraints (
Table 5). The need for extensive accounting variables, especially when incorporating ecosystem services over all periods and alternatives, is the reason for Model III’s efficiency. This efficiency is critical when dealing with complex variables like
PltCover, which require accounting across multiple indices (period and micro-watershed).
Once accounting constraints and corresponding accounting variables are established, the time required to declare ‘Constraints’, ‘Objectives,’ and ‘RHS’ is minimal, under 0.005 seconds for all models. Understanding the accounting constraint construction process requires analyzing
Figure 1, 2, and 3, which illustrate the number of nonzero coefficients involved, essentially one for each node.
Table 7 shows the number of non-zero coefficients for each accounting constraint set in this study for each model type. Except for the VolCut accounting constraints, Model II has less than half the number of non-zero coefficients compared with Model I, and Model III has less than half the number of non-zero coefficients compared with Model II. In fact, not counting the VolCut accounting constraints, Model III has less than one fifth of the non-zero coefficients that are required in Model I.
Further tests, including scenario variations, were conducted to develop the graph in
Figure 5.
Table 7 and
Figure 5 show that building time is strongly correlated to the number of nonzero coefficients. These results suggest that the density of the coefficient matrix is an important driver of model formulation time. Detailed measurements from these tests, including accounting constraints results and times, are available in the supplementary material in a spreadsheet named “Results.xlsx.”
To gain insight into the solver phase, we examined Gurobi’s presolve report. Gurobi’s presolver simplifies the optimization problem through problem size reduction, bound tightening, and identifying unique structures.
Table 8 summarizes the main presolve metrics for each scenario, illustrating Gurobi’s effectiveness in simplifying and significantly reducing the size of all three models.
In Scenario 1, Gurobi’s presolve process results in Models II and III having an equal number of rows, columns, and nonzeros, suggesting that the two models are structurally similar, if not equivalent. This suggests that the difference in the solution times for these models in this scenario can be attributed to the presolve time, not the solution time. However, in Scenarios 2 and 3, Model III maintains more columns and rows than Model II after the presolve process. Nevertheless, the differences are relatively small, suggesting that both models still converge towards very similarly structured problems. However, Model III still has noticeably fewer nonzeros than Model II in these scenarios – about one third and a half in Scenarios 2 and 3, respectively.
Conversely, while Model I is transformed into a model with fewer columns and dramatically fewer rows in all scenarios, compared to Models II and III, Model I consistently shows a higher count of nonzero coefficients. Finally, note that the density of the Model I and Model II matrices increases faster than the density of the Model III matrices as model complexity increases. After presolving, there are approximately six times more nonzeros in both Models I and II for Scenario 2 vs Scenario 1, while the number of nonzeros increases by a factor of less than four. Similarly, for Scenario 3 vs Scenario 2, the number of nonzeros increases by a factor of 2.7X for Model I, 2.2X for Model II, and 1.9X for Model III. This discrepancy in the density of the constraint matrix is likely one reason why Model I has the longest solution times. In addition, Gurobi’s presolve likely recognizes the network structure inherent in Models II and III, enabling more effective application of cutting planes to expedite the solution process [
75].
4. Discussion and Conclusions
This paper presents a new perspective on the Model I, II, and III formulations of the forest management planning problem; depending on one’s previous understanding of the three models, one might say that we have redefined Models II and III. The paper shifts the focus in the distinction among these formulations away from whether areas can be merged into combined management units after stand-replacing harvests. Rather, it focuses on the definitions of the variables in the three models, specifically the sequence of management unit states represented by the variables. In Model I, variables represent a complete sequence of states from the beginning of the planning horizon to the end. In Model II, variables represent a sequence of states from one management intervention to the next. Finally, in Model III, variables represent a single arc in the decision tree for a management unit, representing only two states: a beginning state and an ending state.
While it is possible in Models II and III – and not in Model I – to merge areas into combined management units so they can be represented by a single set of variables following a stand-replacing harvest, we suggest that this is not the critical distinction between the models. In fact, we have mostly ignored this possibility in this paper, and we do not take advantage of this option in our analysis. It is possible that allowing for the merging of similar areas into a single management unit following a management-unit-replacing harvest would result in even greater efficiencies for Models II and III, but this is not the focus of this paper. This option is also less attractive in the case of spatially explicit models, since in those models all management units differ in terms of their locations. In addition, because we do not consider the option to combine areas following a complete harvest, we do not differentiate between stand-replacing interventions and other interventions. If one wishes to merge management units following complete harvests, then this distinction is crucial, and the models will be somewhat more complex than the ones presented here.
Not surprisingly, given our definitions, Model III has the most variables and constraints, Model I has the fewest, and Model II is somewhere in between. In past studies, such as Johnson and Scheurman [
41], the focus of Model II was on the possibility of reducing the number of variables through the merging of management units. However, our analysis suggests that the number of variables and constraints is not strongly related to model formulation time and even solution time. In fact, our analysis suggests that the density of the parameter matrix is a key metric that past studies have ignored. Model I has the densest parameter matrix, and Model III has the least dense matrix. Furthermore, the density of the Model III parameter matrix increases less fast than the density of the Model I and Model II parameter matrices as more ecosystem services are included in the model. This difference remains even after Gurobi’s presolve process.
Our analysis indicates that Model III’s efficiency advantage over Model I increases with the number of ecosystem services (ES) considered, and the solution time advantage of Model II vs Model III decreases with the complexity of the problem. These trends are linked to the computational demands of incorporating each ES. A corresponding set of accounting constraints must be constructed for every ES factored into the model. The computational load intensifies since ES considerations apply to every alternative across all periods and the entire horizon. Models I and II, which represent the same state in
Figure 1 with duplicate variables, require greater formulation time as a result. In terms of solution time, Model II performs somewhat better than Model III in our case study, but the difference between the solution times of Models II and III is less than the difference between Models III and I.
It is essential to note the need for further research to explore Model III’s adaptability to various problem structures. This includes testing its performance with different numbers and configurations of alternatives per management unit, leading to varying tree-graph shapes and consequent impacts on efficiency. Model III’s robustness should also be assessed in scenarios involving binary decision variables, adjacency, transportation constraints, and stochastic models.
Additionally, it is crucial to evaluate building and solver times independently for researchers and practitioners implementing forest planning models in contexts such as Monte Carlo simulations, Pareto Frontier calculations, or any iterative process using a persistent solver. Given the varying demands of controlling numerous ES, the performance differential can be significant.
In our analysis, Model I consistently underperformed the other models in both building and solving times across all scenarios. Interestingly, all three model formulations have been around for nearly half a century, and it is somewhat curious that most practitioners have adopted the Model I approach. Model I’s widespread usage might be attributed to its matrix approach, which aligns well with the familiar spreadsheet format. This alignment extends to the design of linear programming problems, where variables and constraints correspond to columns and rows, respectively, facilitating intuitive program development for solver submission. In contrast, Models II and III, which employ network concepts, can be perceived as more complex and challenging to understand and teach. Network-based models, while frequently discussed in forest management literature, have yet to see widespread adoption among researchers and practitioners, possibly due to the abovementioned complexities. However, they are favored in commercial software owing to typically superior performance. Models II and III demand a network format for alternative generation, requiring advanced programming skills beyond the matrix approach. Nevertheless, it is possible that the decision-tree structures of Models II and III may ultimately be easier to understand for users who have not already been indoctrinated with the matrix-based perspective of most linear programming models. It seems to us that Model III is conceptually straightforward if one visualizes the forest management problem as a decision tree – a sequence of decisions as shown in
Figure 1 – rather than as a matrix.
Model III emerges as a promising alternative, offering enhanced performance and greater accessibility in terms of understanding and programming. Unlike Models I and II, Model III adopts a perspective centered on forest states, viewing forests as ES sources subject to human interventions under certain conditions. This approach could potentially resonate more naturally with researchers and practitioners, simplifying the process of programming management alternatives and building models. In addition, because Model III has a variable representing the state of each management unit at each point in time, it is well suited to introducing uncertainty into the model formulation. This likely explains why authors such as Eriksson [
58], Reed and Errico [
22] and Davis and Martell [
54] found Models II and III to provide a superior framework for incorporating uncertainty. At the very least, we hope this paper gives forest management planners new reasons to reconsider both Models II and III, rather than the ubiquitous, yet plausibly inferior, Model I.