Beyond misperception–two types of mental model errors in a dynamic decision task

This article contributes to research on mental models and how they underpin decision policies. It proposes a framework for the joint use of mental models of dynamic systems and the theory of mental models initiated by Johnson-Laird and defines two types of errors: (1) misrepresentation of the system’s structure, and (2) failure to deploy relevant mental models of possibilities. We use a dynamic decision task based on Moxnes’ “reindeer experiment” to formulate three intuitive policies, their underlying mental models, and the reasoning, and evaluate the policies under varying initial conditions. Each of the policies generates problematic behaviors such as dependence on initial conditions, underperformance because of flawed goal setting, and oscillation caused by neglecting the delay in a feedback loop. We identify errors of both types in the mental models and relate them to the behavioral problems. Limitations and questions for further research conclude the paper.


Introduction
This article contributes to mental model research in dynamically complex situations. It introduces the combined use of mental models of dynamic systems (Ford, 1998, 1999; Groesser and Schaffernicht, 2012; Lane, 1999) and a theory of reasoning, the theory of mental models (Johnson-Laird, 1983, 2010), to describe two types of mental model errors that lead to detectable flaws in policies. Roughly two decades ago, Erling Moxnes's "reindeer experiment" showed that individuals with professional experience underperformed in a comparatively simple dynamic decision task (Moxnes, 2004). Poor decisions seemed to come from failing to perceive relevant feedback relationships. In Moxnes's words, the data "suggests, however, that a vast majority had highly inappropriate mental models" (p.151, emphasis added). But his research did not aim at eliciting or analyzing these mental models; it was a contribution to the literature on misperception of feedback, which had emerged a decade earlier (Sterman, 1989a, b). What followed was a series of experimental studies focused on developing mathematical models that replicate the subjects' decisions (see Gary and Wood, 2016), without the aim of studying the underlying reasoning of the participants.
In management and organization studies, mental models are the way people understand the structure and the working of a system (Rouse and Morris, 1986). Compatible with this general definition, the system dynamics field has developed a conceptual definition of mental models of dynamic systems (MMDS when in singular and MMDSs for plural). This definition introduced specific features for the study of feedback-rich systems, in particular that an MMDS contains the perceived structure of a system (Ford, 1998, 1999; Lane, 1999). Groesser and Schaffernicht (2012) then proposed an operational definition with a data structure for representing MMDSs. However, only a few studies with a detailed examination of MMDSs have been published so far (for a discussion, see Schaffernicht, 2017, 2019). One reason for this scarcity may be that researchers consider self-reported language assertions as not sufficiently reliable for scientific research (Arango A., Castañeda A., and Olaya M., 2012). Another reason may be that an MMDS only represents structure (not the reasoning process), and researchers find the effort required to elicit, represent and analyze this type of mental model excessive compared to the insights to be gained from the comparison of MMDSs.
Cognitive psychologists elicit and analyze assertions articulated by individuals when studying reasoning. There are diverse theories of human reasoning, like probabilistic approaches (Baratgin et al., 2015; Chater and Oaksford, 1999) and formal inference rules (Braine and O'Brien, 1991; Braine and O'Brien, 1998; O'Brien, 2014; Rips, 1994). The theory of mental models (Johnson-Laird, 1983, 2001, 2010; Ragni and Johnson-Laird, 2020) proposes that reasoning is based on mental models of possibilities (we use MMP for singular and MMPs for plural), which are consistent with the reasoner's mental image (the so-called iconic mental model) of the situation in which a decision has to be made.
Cognitive scientists and management scholars focusing on the perceived structure use the same term, mental model, but give it different meanings, although the historical roots of the term are the same (Johnson-Laird, 2004). Nevertheless, the definitions of mental models are complementary: the mental models of possibilities which an individual mentally deploys are drawn from the underlying iconic mental model of the situation. We propose to consider an MMDS as a particular form of iconic model. This creates a link between both types of mental models and leads to two types of mental model error: (1) MMDS errors occur when the structure of the system is misrepresented, and (2) MMP errors occur when a relevant mental model of possibilities is not deployed and processed.
One can only think of possibilities in terms of recognized features. Therefore, some MMP errors may happen because of MMDS errors. Of course, unconsidered possibilities can lead to policies with "surprising" effects.
To test this combination and the possibility to identify mental model errors leading to flawed policies, we have used a dynamic decision task inspired by the "reindeer experiment" (Moxnes, 2004).
Because we assume naïve decision-makers without specific domain knowledge or training in systems thinking, the "herd management game" differs superficially from the original experiment but maintains its causal structure.
Decision-makers must maximize production drawn from animals without compromising the herd's sustainability. Intuitive reasoning steps based on the briefing information lead to three intuitive decision policies. The implied mental models (MMPs and MMDSs) are derived from the reasoning steps. Simulation shows that each of the policies generates problematic behaviors like dependence on initial conditions and oscillations; production performance is mostly poor. We identify several MMDS and MMP errors in the mental models underlying each of the policies; we also pinpoint the links between misperceiving structural features of the system and the flaws in each policy.
The ability to identify specific types of mental model errors that explain flaws in decision policies is useful for advancing the understanding of how people take deliberate decisions. Hence, the combined use of both types of mental model concepts and methods has the potential to facilitate mental model research.
The remainder of the article is structured as follows: the second section briefly introduces the theory of mental models and its link with mental models of dynamic systems. Then, the herd management game section elaborates the three policies, their underlying mental models, and the reasoning steps, followed by a discussion of their behavior and performance in the simulations. The subsequent discussion section introduces a diagnostic of MMDS errors and MMP errors and then outlines important research questions emerging from our results. The conclusions summarize our findings and their limitations, together with a call for further cumulative research.

Mental models of dynamic systems as iconic representations of decision situations
The conceptual definition of a mental model of dynamic systems by Ford (1998, 1999) established the analogous relationship between the actual situation and the mental model of it. Groesser and Schaffernicht (2012) then proposed an operational definition. They argued that if the causal structure of the external situation contains feedback loops with stocks and flows, an accurate mental model of this system will contain the same type of elements. This definition is conceptually compatible with Rouse and Morris's influential elaboration of mental models (1986) and allows researchers to build on mainstream methods in organizational and management studies (Langfield-Smith and Wirth, 1992; Markóczy and Goldberg, 1995; Schaffernicht, 2017; Groesser, 2011, 2014) while putting interdependence into focus (Schaffernicht, 2019).
MMDSs represent the causal structure of a system; they do not include the reasoning or the mentally simulated behaviors.

The theory of mental models: reasoning with mental models of possibilities
The theory of mental models (see also Khemlani and Johnson-Laird, 2019) offers one explanation for the different aspects involved in human reasoning. The structural elements of MMDSs-variables, causal links, and feedback loops-constitute a vocabulary for assertions concerning what can happen.
For instance, a positive causal link from a variable motivation to another variable effort will give rise to assertions like "when there is a higher motivation, then there will be more effort". Such assertions imply several possibilities, and the theory analyzes and explains how and when humans process them or fail to do so.
To do that, it proposes several theses and general principles. Three of the principles are relevant here: (1) Representation; (2) Dual process; and (3) Modulation (see e.g. Khemlani, Byrne, and Johnson-Laird, 2018). To introduce them, we focus on the way the theory accounts for the conditional. Sentential connectives refer to models (Johnson-Laird and Ragni, 2019). People deem those models as possibilities and link them to conjunctions (Khemlani, Hinterecker, and Johnson-Laird, 2017).
The conditional is a sentential connective relating an antecedent p to a consequent q in the manner expressed by the following assertion (identified by roman numbers to avoid confusion with the assertions discussed in section 3):
[I] If p then q
Here, p and q often represent assertions describing events like "it rains" and "I'll be wet"; however, they can also refer to behaviors like "I work more hours per day" and "I will feel more fatigue." When its models of possibilities are deployed, [I] corresponds to an assertion like [II] (Johnson-Laird, 2012).
[II] Possible (p & q) & Possible (¬p & q) & Possible (¬p & ¬q)
The symbol "¬" stands for negation. There is a missing possible combination in [II], which is shown in [III].
[III] Possible (p & ¬q)
The reason for this is clear: as in classical propositional logic, [III] represents the situation in which a sentence such as [I] is false.
The principle of representation supports all of this. It claims that sentential connectives are usually understood as sets with "conjunctions of possibilities" (Khemlani et al., 2018). Hence this principle is also the principle providing that [II] is the set of possibilities related to [I].
The second general principle is the principle of dual process. The theory of mental models is a dual process theory (Khemlani et al., 2018) arguing that two systems work in the human mind. System 1 is quick, intuitive, and effortless. System 2 is slow, reflective, and needs effort (Byrne and Johnson-Laird, 2020; Stanovich, 2012). The theory of mental models follows this approach, distinguishing between the intuitive mental models of possibilities (MMPs) visible to the first system and the so-called Fully Explicit Model (FEM; although we do not capitalize mental models of dynamic systems or of possibilities, we follow the convention in the literature of the theory of mental models and capitalize this term), which would require the second system to step in.
If something is asserted in the form of a conditional, only one mental model is deployed: system 1 identifies only one model. That model matches the first conjunct in [II]; this conjunct is:
[IV] Possible (p & q)
On the other hand, the Fully Explicit Models of the conditional are all of those in [II]. The theory suggests that Fully Explicit Models are harder to note because they include negations: p is negated in the second conjunct in [II], and both p and q are negated in the third conjunct in [II] (Johnson-Laird, 2012). This is an essential point of the theory of mental models: many reasoning mistakes happen because of insufficient mental effort, leading to taking only part of the possibilities into account.
Humans tend to use the quick system and only think about MMP [IV]. If they made a further effort and used the reflective second system, they could access [II], and the error rate should be lower (Byrne and Johnson-Laird, 2009).
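The distinction between the intuitive model and the Fully Explicit Models can be sketched in a few lines of Python. The sketch assumes, following the description above, that [II] contains every combination of p and q except the falsifying case p & ¬q, and that system 1 deploys only p & q. The names `FULLY_EXPLICIT`, `INTUITIVE`, `overlooked` and `show` are illustrative and do not come from the literature on the theory.

```python
from itertools import product

# A possibility is a pair (p, q) of truth values for antecedent and consequent.
ALL_COMBINATIONS = set(product([True, False], repeat=2))

# [III]: the one combination that makes "if p then q" false.
FALSIFYING = {(True, False)}

# [II]: the Fully Explicit Models, available to the effortful system 2.
FULLY_EXPLICIT = ALL_COMBINATIONS - FALSIFYING

# The single model that system 1 deploys without effort.
INTUITIVE = {(True, True)}


def overlooked(deployed, fully_explicit=FULLY_EXPLICIT):
    """Return the possibilities in the Fully Explicit Models that were not deployed."""
    return fully_explicit - deployed


def show(possibility):
    """Render a (p, q) pair in the notation used in the text."""
    p, q = possibility
    return f"Possible ({'p' if p else '¬p'} & {'q' if q else '¬q'})"
```

Calling `overlooked(INTUITIVE)` returns the two models that contain negations, which are the ones the theory expects intuitive reasoners to miss.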
According to the third general principle, the principle of modulation, the content of sentences can change their models or possibilities (Khemlani et al., 2017; Quelhas, Johnson-Laird, and Juhos, 2010). Conditionals [V] and [VIII] demonstrate this. Their structure is typical for the theory of mental models (Orenes and Johnson-Laird, 2012).
[V] If they come from Germany, then they come from Berlin.
Taking "they come from Germany" as p and "they come from Berlin" as q, the set of possibilities of [V] is [VI].
[VI] Possible (p & q) & Possible (p & ¬q) & Possible (¬p & ¬q)
The reason for this is clear. Possibility [VII] is the second possibility implied by [II]; it cannot be admitted for [V] because it is not possible that they do not come from Germany and yet come from Berlin.

[VII] Possible (¬p & q)
However, the case in which the conditional is false in classical logic, that is, [III], has to be added in this case. It is the second conjunct in [VI]. The reason is apparent: they can come from Germany and not from Berlin, but another German city.
Still another example is [VIII].
[VIII] If it is cloudy, then it may be warm.
This last conditional admits all the combinations, that is, [II] plus [III]. Therefore, its set of combinations is [IX],
[IX] Possible (p & q) & Possible (p & ¬q) & Possible (¬p & q) & Possible (¬p & ¬q)
where p indicates that it is cloudy, and q refers to the fact that it is warm.
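Modulation can be read as a change in which combinations are excluded. The following Python sketch, under that reading, derives the possibility sets for the default conditional, the Germany/Berlin example [V] and the cloudy/warm example [VIII]. The function name `possibilities` and the constant names are hypothetical.

```python
from itertools import product

# All four (p, q) combinations of antecedent and consequent.
ALL_COMBINATIONS = frozenset(product([True, False], repeat=2))


def possibilities(excluded):
    """Return the combinations that remain possible once `excluded` is ruled out."""
    return ALL_COMBINATIONS - frozenset(excluded)


# Default reading of "if p then q": only p & not-q is ruled out.
DEFAULT_CONDITIONAL = possibilities({(True, False)})

# [V] "If they come from Germany, then they come from Berlin":
# content rules out not-p & q, while p & not-q (another German city) is admitted.
GERMANY_BERLIN = possibilities({(False, True)})

# [VIII] "If it is cloudy, then it may be warm": nothing is ruled out.
CLOUDY_MAY_BE_WARM = possibilities(set())
```

Comparing `DEFAULT_CONDITIONAL` with `GERMANY_BERLIN` shows that the two readings differ in exactly the two mixed cases, which is the effect of content on the set of possibilities.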
The four possible scenarios can happen when [VIII] is true. Nonetheless, there are up to ten combinations of models that can be linked to the conditional (Johnson-Laird and Byrne, 2002).
Although the literature on the theory of mental models does not use causal diagrams, such diagrams and the definition of polarity can be used to represent assertions like in [II]. Consider an example in which p and q represent statements about the behavior of variables rather than statements about facts or events: p stands for "I work more hours" and q for "I feel more fatigue." Then assertion [I], if p then q, appears as a very simple causal diagram (black printed variable and link in Figure 1), involving only one positive link from p to q. As will be intuitive for most individuals, the positive polarity implies the mental model "Possible (p & q)." Note that the possibility (¬p & ¬q) is also implied by the positive polarity. However, most untrained individuals will find it harder to imagine, as predicted by the theory of mental models. The remaining two mental models are not intuitive. However, they make sense once one considers other factors of influence. The causal diagram states that there are also "other causes of fatigue," which explains why it is reasonable to state Possible (¬p & q); this part of the diagram is dark gray because it is not obvious. The last component of the diagram is even less salient: there are also energizing causes, like stimulating substances, that will relieve fatigue. While (p & ¬q) is altogether possible, coming to this thought is more effortful. It is easy to overlook.

Figure 1: The relationship between mental models and causal diagrams
If the additional variables are not included in the MMDS, possibility (p & ¬q) would not only be overlooked because of its lack of salience: what is not in the MMDS is not available for reasoning. If there is only one causal link from "work hours" to "fatigue," this actually states that this causal relationship is always there and there is no other causal influence stemming from other variables (Pearl, 2009;Pearl and Mackenzie, 2018). Recognizing less salient factors as relevant and accounting for them in the MMDS takes more mental effort. According to the theory of mental models, if there is no reason to make such additional efforts, they will be overlooked.

Conceptual framework
The conceptual framework proposed here consists of four layers. The first layer is the situation: an unstructured cloud of features that may or may not be relevant to achieve a given goal. At the second layer, the MMDS is someone's attempt at identifying the situation's causal structure inside a conceptual boundary. Any MMDS may have two types of boundary mismatch: (a) relevant features may have been left out, and (b) irrelevant or even illusory features may have been included. Boundary mismatches can be revised and corrected later on (Sterman, 2002), but while they exist, they will prevent relevant possibilities from being recognized or lead one to recognize irrelevant possibilities: every possibility for every relationship between two variables in the MMDS is part of the Fully Explicit Model. We refer to MMDS boundary mismatches as error type 1.
The third layer corresponds to the MMPs. A second type of error is found here: some of the possibilities included in the Fully Explicit Model may not be considered in mental models of possibilities. A possibility that is unaccounted for is equivalent to "this is not possible". But if it is actually possible in the situation (layer 1), then this is an MMP error. We refer to MMP errors as error type 2. Just like type 1 errors, errors of type 2 lead to flawed decision policies.
Policies are the fourth layer. When designing a decision policy, leaving possible circumstances unconsidered opens the door for decisions that provoke undesired outcomes. Possibilities may have remained unconsidered because of either type of errors. Hence, avoiding or identifying and correcting such errors is important for policy development.
The four layers (situation, MMDS, MMP and policy) are illustrated in Figure 2, which also expresses that each layer depends on the previous one. Since people use their reasoning abilities to figure out decision policies, the ability to identify MMDS errors and MMP errors is helpful. The following section shows this in an exemplary decision task.
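The two error types can be expressed as set differences between adjacent layers. The Python sketch below is one possible reading of the framework, not an instrument proposed in the article: the class `Diagnosis`, the function `diagnose` and all field names are hypothetical, and the example values reuse the fatigue illustration from Figure 1.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Diagnosis:
    omitted_features: frozenset          # type 1 (a): relevant but missing from the MMDS
    spurious_features: frozenset         # type 1 (b): in the MMDS but not relevant
    undeployed_possibilities: frozenset  # type 2: possible in the situation, not deployed


def diagnose(relevant_features, mmds_features, possible_in_situation, deployed_mmps):
    """Compare the layers pairwise and collect both types of mental model error."""
    relevant = frozenset(relevant_features)
    mmds = frozenset(mmds_features)
    return Diagnosis(
        omitted_features=relevant - mmds,
        spurious_features=mmds - relevant,
        undeployed_possibilities=frozenset(possible_in_situation) - frozenset(deployed_mmps),
    )


# Illustrative values: p = "I work more hours", q = "I feel more fatigue".
example = diagnose(
    relevant_features={"work hours", "fatigue", "other causes of fatigue", "energizing causes"},
    mmds_features={"work hours", "fatigue"},
    possible_in_situation={(True, True), (True, False), (False, True), (False, False)},
    deployed_mmps={(True, True)},
)
```

In this example the MMDS omits two features (type 1) and only the salient possibility is deployed, so three possibilities remain unconsidered (type 2).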

The decision situation and its mental model of a dynamic system
In contrast to Erling Moxnes's interest in individuals with specific domain knowledge, here the aim is to study the thinking of naïve individuals: people who (1) do not have specialized knowledge in the domain of the decision situation, and who (2) do not have specific training in analytical or other reasoning methods. The "reindeer experiment" is easy to convert into a "herd management game" (referred to simply as the "game" hereafter) replacing "reindeer" and "lichen" by "animals" and "food," and avoiding references to "slaughtering" because they can trigger emotional reactions in some individuals. The causal structure of the "game" is analogous to the "reindeer experiment".
Decision-makers in the game are briefed with the same information as in the "reindeer experiment".
Their goal is maximizing the production based on the animals in their herd in a sustainable way, without diminishing or even annihilating the food, over a span of 15 years. The only decision they take each year is setting the desired herd size. All animals have a constant rate of reproduction, independent from food availability. But if food becomes insufficient, some animals will starve. Food (measured in mm) has a yearly rate of regeneration that depends on the current thickness. If there is little food, there will be little regeneration. And if food approaches the maximum level of 60 mm, the rate of regeneration also drops. We adopt the computation used by Moxnes (2004). Decision-makers are not shown this equation but are told that the highest regeneration is in the middle between 0 mm and 60 mm of food. In that case, the annual food regeneration would be 5 mm. They get historical data from a fictitious predecessor. All animals are equal in their annual food consumption of 0.004 mm. The briefing information allows a systems modeler or systems thinker to construct a sufficiently complete MMDS to figure out a successful herd management policy, but we work with naïve players.
The situation is summarized in the following Figure 3, where stock variables are shown as boxes.
Variable names appearing in the diagrams are printed in italics when used in the text. Feedback loops are labeled by R for reinforcing and B for balancing, but no names are assigned because these loops will play no role in the mental models discussed later. The polarity of the loop from food to food regeneration depends on the level of food: for values smaller than 30 mm, the link is positive, implying a reinforcing loop. However, for values greater than 30 mm, the link is negative, and the loop is balancing. The varying polarity is symbolized by a "v".
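The briefing facts and the level-dependent polarity can be made concrete with a small Python sketch. It uses an assumed stand-in for the regeneration curve: a parabola that is zero at 0 mm and 60 mm and peaks at 5 mm per year at 30 mm. This is not the equation used in the game; the parabolic form and both function names are illustrative assumptions chosen only to match what decision-makers are told.

```python
MAX_FOOD_MM = 60.0           # maximum food level stated in the briefing
PEAK_REGENERATION_MM = 5.0   # highest annual regeneration stated in the briefing


def food_regeneration(food_mm):
    """Annual regeneration in mm.

    ASSUMPTION: parabolic stand-in for the regeneration curve, not the original equation.
    """
    share = food_mm / MAX_FOOD_MM
    return 4.0 * PEAK_REGENERATION_MM * share * (1.0 - share)


def regeneration_link_polarity(food_mm):
    """Sign of the causal link from food to food regeneration at a given food level."""
    # Derivative of the assumed parabola with respect to food.
    slope = (4.0 * PEAK_REGENERATION_MM / MAX_FOOD_MM) * (1.0 - 2.0 * food_mm / MAX_FOOD_MM)
    if slope > 0:
        return "+"
    if slope < 0:
        return "-"
    return "0"
```

Under this assumption the link is positive below 30 mm and negative above it, which corresponds to the varying polarity marked with a "v" in the diagram.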

Basic assumptions and common elements
In this subsection, we introduce three naïve policies for steering the number of animals in the herd to maximize production without sacrificing sustainability. These policies have some commonalities in terms of the underlying mental model of the situation. However, some elements of the mental model are interpreted in diverging ways, leading to different reasoning steps and possibilities, and eventually to different policies. We describe these structural features, assumptions, reasoning steps, and decision rules using a common set of variable names and typographic conventions. Variable names are in italics and the description of values or behaviors is underlined. The symbols =, < and > are used to compare values of variables or the results of calculations; ← stands for "is assigned the value of," and → is used for "if … then" in conditional statements. When statements refer to the year before or the year after, the variables are printed with sub-indices y, y-1 and y+1. Since the three policies do not differentiate between stock and flow and do not consider the feedback loops, the causal diagrams representing them do not show stock variables in boxes, and no loop labels are included.
The following assertions describe arguments that are common to the three policies, and the corresponding mental models of possibilities. We label assertions with A and sequential numbers, using dots to represent hierarchical relationships. The mental models of possibilities have a suffix -Ps (for possibility) followed by a sequential number starting with zero for the first one (the salient possibility), and then the possibilities in the FEM.

A1)
I have more animals, → I will have more production.
A1-Ps0 is the salient MMP. A1-Ps1 is impossible in the game when one considers only one year.
However, an increased number of animals in a given year may lead to overconsumption in later years, which eventually diminishes the accumulated production at the end of the 15 years. Individuals who overlook this possibility can compromise sustainability and the overall performance.
Concerning A1-Ps2, the productivity of real animals can be increased by, for instance, selecting highly productive individual animals and possibly additional nutrition. The game excludes this: individuals who overlook this possibility are not in danger of making flawed decisions.
In real life, A1-Ps3 can be problematic because one might achieve more production without increasing the number of animals by, for instance, boosting productivity with a food complement.
However, since this is impossible in the game, A1-Ps3 is always true, and there is no risk in leaving it out of consideration.
The next step asserts: Assertion A2.1 leads to: A2.2-Ps0 is an intuitive possibility, but the game's structure also allows the less obvious possibilities to be true. Consumption drains food, but there also is the inflow of natural food regeneration, which depends on the previous food level. Whenever food regeneration exceeds consumption, food will not decrease (A2.2-Ps1). For instance, if there was very little food and animals have been drastically reduced in previous years, food has increased during the preceding year. This may encourage an increase in the herd size. If food had been smaller than 30 mm before, the increased food stock caused food regeneration to increase. In that case, there will be an additional amount of food regeneration.
If the additional consumption is not greater than the additional food regeneration, the food net regeneration cannot become smaller than it was in the prior year. However, food is not always less than 30 mm.
A2.2-Ps2 addresses the opposite case: if the food level has been greater than 30 mm and food had been increasing over the past years, then next year's food regeneration will be smaller than before.
So, even if the herd size remains constant and consumption does not increase, this amount of consumption may now be greater than food regeneration, leading to a negative food net regeneration: a decrease in food. Individuals who do not heed A2.2-Ps1 and A2.2-Ps2 risk keeping their herd too small and being surprised when food starts increasing or decreasing.
It is possible to have a constant or decreased consumption and equal or more food (A2.2-Ps3): if there is either (a) an equilibrium between food and animals or (b) natural food regeneration increases beyond consumption, there will not be more consumption and more food. However, failure to consider this possibility cannot lead to problematic effects, except in the very special case that the positive net food regeneration pushes food from just below 30 mm to just over 30 mm. Then the same consumption would produce a negative net food regeneration greater than the previous positive net food regeneration. This is very unlikely, so overlooking this possibility would be an inconsequential type 2 error.
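The less obvious possibilities of A2.2 can be illustrated numerically. The Python sketch below rests on three assumptions that are not taken from the briefing: the parabolic stand-in for the regeneration curve, the reading of A2.2 as "more consumption → food decreases", and the ordering of the labels (Ps0 for p & q, Ps1 for p & ¬q, Ps2 for ¬p & q, Ps3 for ¬p & ¬q), which is inferred from the discussion above. All function names are hypothetical.

```python
MAX_FOOD_MM = 60.0
PEAK_REGENERATION_MM = 5.0
CONSUMPTION_PER_ANIMAL_MM = 0.004  # annual consumption per animal from the briefing


def food_regeneration(food_mm):
    """ASSUMPTION: parabolic stand-in for the regeneration curve, not the original equation."""
    share = food_mm / MAX_FOOD_MM
    return 4.0 * PEAK_REGENERATION_MM * share * (1.0 - share)


def food_net_change(food_mm, animals):
    """Regeneration minus the herd's consumption for one year, in mm."""
    return food_regeneration(food_mm) - animals * CONSUMPTION_PER_ANIMAL_MM


# ASSUMPTION: label order inferred from the text; p = more consumption, q = food decreases.
LABELS = {
    (True, True): "A2.2-Ps0",
    (True, False): "A2.2-Ps1",
    (False, True): "A2.2-Ps2",
    (False, False): "A2.2-Ps3",
}


def observed_possibility(food_mm, animals_before, animals_now):
    """Classify one year by which possibility of A2.2 it instantiates."""
    more_consumption = animals_now > animals_before
    food_decreases = food_net_change(food_mm, animals_now) < 0
    return LABELS[(more_consumption, food_decreases)]
```

For instance, doubling a small herd at a low food level can still leave regeneration above consumption (A2.2-Ps1), while an unchanged herd at a high food level can consume more than is regenerated (A2.2-Ps2).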

A2.3) food decreases each year → this violates sustainability.
This assertion represents a piece of general knowledge concerning sustainability. In the game's context, sustainability is the ability to keep up operations for the human who wants to extract production from animals, for the animals who want to stay alive, and for the plants that serve as food.
Without food, there would be no animals and no production. Therefore, this assertion is not a conditional; it is rather a different kind of rule, a prohibition: the food must not decrease each year. It is included in the chain of reasoning steps because it is the context in which the following steps take place.

A2.4)
Therefore: I have more animals, → I will need more food (to be sustainable).
One may then think that having a larger stock of food enables one to sustain more animals:
A3) I have more food, → I can sustain more animals.
A3-Ps0 is a very intuitive possibility, but it is only true if food < 30 mm: whenever food > 30 mm, it is false because food regeneration will decrease and therefore food will decrease. When food > 30 mm, A3-Ps1 is true, and so is A3-Ps2: if food decreases, it will approach a thickness of 30 mm, which yields the maximum food regeneration. A3-Ps3 can be safely neglected. Assertion A3 is intuitive but disregards the dynamic nature of the situation; this will later have a consequence on the choice of a policy. The immediate consequence is: Based on the discussion of the possibilities of A3, clearly A4-Ps0 is false, albeit intuitive. The food regeneration for food = 60 mm is less than for food = 30 mm. This implies that A4-Ps1 and A4-Ps2 are true (A4-Ps3 can be neglected). However, naïve decision-makers who pay attention to the salient possibilities will come to the following conclusion:
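Why the salient possibilities of A3 and A4 fail above 30 mm can be checked with a short calculation. The Python sketch below computes, for a given food level, the herd size whose consumption exactly matches regeneration. The parabolic regeneration curve is an assumed stand-in for the game's equation, and the function names are hypothetical; only the 60 mm maximum, the 5 mm peak and the 0.004 mm consumption per animal come from the briefing.

```python
MAX_FOOD_MM = 60.0
PEAK_REGENERATION_MM = 5.0
CONSUMPTION_PER_ANIMAL_MM = 0.004


def food_regeneration(food_mm):
    """ASSUMPTION: parabolic stand-in for the regeneration curve, not the original equation."""
    share = food_mm / MAX_FOOD_MM
    return 4.0 * PEAK_REGENERATION_MM * share * (1.0 - share)


def sustainable_animals(food_mm):
    """Herd size whose annual consumption equals regeneration at this food level."""
    return food_regeneration(food_mm) / CONSUMPTION_PER_ANIMAL_MM


def more_food_sustains_more_animals(food_low_mm, food_high_mm):
    """Check the salient reading of A3 for one pair of food levels."""
    return sustainable_animals(food_high_mm) > sustainable_animals(food_low_mm)
```

With the briefing values, the peak regeneration of 5 mm at 30 mm supports 5 / 0.004 = 1250 animals. Under the assumed curve, `more_food_sustains_more_animals(10.0, 20.0)` holds but `more_food_sustains_more_animals(40.0, 50.0)` does not, and the maximum food level of 60 mm sustains no animals at all.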

A5)
From A4) and A1) it follows that: I have the most food, → I will have the largest production.
The salient MMP A5-Ps0 is false, and the less obvious possibilities A5-Ps1 and A5-Ps2 are true.
From here on, the reasoning steps for each policy are distinct. The possibilities A3-Ps0, A4-Ps0 and Consider next the consequences for the ensuing reasoning steps. Only the decision-maker's policy can set the values for animal target and food target. Since the animals depend on food, the food target must be set first. The information provided in the briefing is sufficient in principle to figure out the correct value. It explicitly states that maximum food net regeneration is reached at half of the maximum food level. Decision-makers must infer by themselves that food decreases due to the animals' food consumption (assertions A2.1, A2.2, A2.3, and A2.4). They also must understand the need to compensate for this decrease of food without being told so. Intuitively, the reasoning steps A3-Ps0, A4-Ps0, and A5-Ps0 will come to mind. A very naïve decision-maker may not think this through to define a food target, but instead settle for the thought "I do not know" and reach: The conclusion implied by these reasoning steps would be to use the highest possible food level: bring food net regeneration into focus: the largest food consumption must not exceed the largest food regeneration to maintain sustainability. This implies that "half of the maximum food" leads to the highest sustainable consumption, and therefore: The three specific versions of assertion A6 assign a particular value to a variable and are not conditionals. The value assigned will have a consequence for decisions taken, not for the mental processing of the possibilities. This leads to three different policies for driving the animal target, and each of them will be introduced in turn.

Policy P1
This policy assumes that: This means that the decision-maker cannot directly identify a value for animal target but will instead observe and interpret the development of food "since last year." But it is important for the decision-maker to interpret the meaning of the herd size with respect to sustainability and the meaning of the food level in that context.
There are two ways to think about how to recognize a sustainable situation, and both are based on the detection of a stable food level:
A7.1) food_y = food_{y-1} → animals_{y-1} is the sustainable number of animals.
A7.2) food_y = food_{y-1} → animals_{y-1} is the sustainable number of animals given food_{y-1}, but for other food levels, the sustainable number of animals might be different.
In the case of A7.1, we have the following explicit possibilities: As before, A7.1-Ps0 immediately appears as a representation of assertion A7.1. If things were simple, the other possibilities would be impossible. But the intricate structure of the relationship between (a) the current number of animals, (b) their collective consumption, (c) the impact of consumption on food on one side, and (d) the influence of the previous food level on regeneration and (e) the impact of regeneration on the food level requires us to be mindful of the nonlinear relationship between food and its regeneration. There is, in fact, one sustainable herd size for each food level. While they may imply different amounts of production, they all keep the current food level constant. It follows that whenever food increases or decreases, the herd size has not been sustainable for the previous food level, and consequently A7.1-Ps3 will frequently happen in the game. The only exception would be that the current herd size is sustainable for a food level of 30 mm, but the previous food level was greater than 30 mm. However, in this case, A7.1-Ps2 must be considered.
To avoid getting trapped in a suboptimal but sustainable situation, it is preferable to follow A7.2.
Heeding sustainability of the herd size does not implicitly assume that there is only one sustainable herd size. Different sustainable herd sizes are possible. However, this thought leads to the second assertion and a distinct set of mental models: Now, A7.2-Ps1 and A7.2-Ps2 are impossible, but A7.2-Ps2 also means that the negation "¬" only applies to the specific food level. Of course, if food is greater than 30 mm, any number of animals which causes food to decrease would be unsustainable for the current food level. But food would only decrease until the natural regeneration is large enough to replace the consumed food: food would become constant exactly at the level for which the herd size is sustainable. Consequently, production would be greater. This is equivalent to stating that: A7.2-Ps3 is always true, and decision-makers cannot make a mistake by overlooking it. According to this deliberation, policy P1 is as follows:

Policy P1:
If food has changed (food_y <> food_{y-1}) → I will change animal target in the same direction.
Otherwise, I will slightly increase animal target.
The policy statement is a decision rule; decision-makers could in principle decide differently and do the contrary. However, we assume that this is highly unlikely and therefore do not discuss the various logical possibilities. Following this policy, a decrease in food will lead to a decrease in animals when the herd size is adjusted to the animal target. An increase of food will lead to an increase of animals when the herd size is adjusted to the animal target. The multiplier is used to modulate the strength of the reaction, since there is no reason to assume that animal target ought to be changed in the same proportion as the observed food change. The second part is intended to move a suboptimal animal target toward a better one: if a slight increase in the number of animals does not lead to a decrease of food between years y and y+1, then the decision-maker has found a larger sustainable number of animals. Otherwise, the first part of the policy would be triggered for the following year, and the number of animals would be corrected downwards, back to the sustainable number of animals identified one year earlier.
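The two branches of policy P1 can be sketched in a few lines of Python. This is only an illustrative reading of the rule, not the authors' STELLA implementation: the function name `policy_p1`, the proportional form of the reaction, and the default values of `multiplier` and `probe_step` are assumptions introduced here.

```python
def policy_p1(food_now, food_prev, animal_target, multiplier=0.5, probe_step=0.02):
    """Return next year's animal target under a sketch of policy P1.

    multiplier and probe_step are hypothetical parameters: the article
    mentions a multiplier and a "slight" increase without giving values.
    """
    if food_prev <= 0:
        raise ValueError("food_prev must be positive to compute a relative change")
    if food_now != food_prev:
        # Food has changed: move the animal target in the same direction,
        # scaled by the multiplier (assumed proportional reaction).
        relative_change = (food_now - food_prev) / food_prev
        return max(0.0, animal_target * (1.0 + multiplier * relative_change))
    # Food is constant: probe for a larger sustainable herd size.
    return animal_target * (1.0 + probe_step)
```

With these assumed values, a 10% drop in food lowers a target of 1,000 animals by 5%, and a year of constant food raises it by 2%.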
The MMDS beneath policy P1 is shown in its causal diagram representation in Figure 4. The assertions omit feedback loops or stock variables; therefore, there is no loop symbol, and no difference is made between the variable types. The balancing loop food-food net change-animal target-excess animals or animal deficit-production or animals purchased-animals-food consumption-food adjusts the number of animals in the herd to reduce the food net change progressively and thus find a herd size which-in the reasoning of the decision-maker-will optimize the accumulated production. However, it is not in the mental model, and therefore it is not labeled in the diagram. This policy articulates a hill-climbing logic that allows the decision-maker to set the animal target without referring to a food target.

Policy P2
Policies P2 and P3 are based on the idea that one can specify a value for food target, derive a value for animal target and then apply control logic to keep the gap between the target value and the actual number of animals small enough: balancing feedback for system thinkers, but not for naïve decision-makers. Consider first the setting of the food target in P2: A6.2) food target ← 60 mm.
The following assertion expresses the belief that the appropriate food level leads to maximum production: A8.1) If food = food target → production will be maximized.
An individual who sets the food target at 60 mm and uses only A8.1-Ps0 to come to a decision will get disappointing outcomes: if food = food target, then A8.1-Ps1 holds true, and production will not be maximized. Simultaneously, the food levels will be distinct from the food target maximizing production. The origin of this error lies in the incorrect belief concerning the food target. This rule prescribes what the decision-maker will do in response to each described condition. We assume people will not do the contrary and not discuss the logical possibilities. The following Figure 5 summarizes the MMDS structure behind this reasoning.

Figure 5: Causal diagram representation of the MMDS beneath policy P2
In Figure 5, the food target gets a value according to assertion A6.2 and the reasoning behind it, which depends on food. The dotted arrow shows the logical dependency. Note that this is not an actual feedback loop in this situation because the food target is constant during the 15 simulated years.

Policy P3
Unlike policy P2, policy P3 follows from:

A6.3 c) food target ← 30 mm.
The reasoning now focuses on food regeneration: A9.1) food regeneration is maximized → my accumulated production will be maximized.
A9.1 recognizes that rather than the food level, it is the food regeneration that must be the highest possible to achieve maximum production. As individuals have been informed, the maximum food regeneration will be 5 mm per year when the food level is equal to 30 mm. Therefore, the highest possible number of animals have enough food, which in turn leads to maximum production. This also means that the third possibility can never happen. Therefore, not considering these two mental models cannot have a detrimental consequence in the game. Note that A9.1-Ps0 is even true when one erroneously sets a food target of 60 mm-but in that case, food regeneration will not be maximized.
A9.1-Ps1 could only happen if there are other events or influences decreasing production, which cannot happen in the simulated situation. A9.1-Ps2 is impossible in the game and failing to think of it cannot have a consequence.
If one believes one has set the correct food target, it is logical to think: In the light of the previous discussion of food target, clearly A9.2-Ps0 is true if the food target = 30 mm. Otherwise, both A9.2-Ps1 and A9.2-Ps2 will be true. For instance, food target = 60 mm will drive the system towards a food level at which natural food regeneration is not the highest possible one, which also means that for at least one food level that is unequal to the food target, regeneration will be the highest possible one.
The previous reasoning steps lead to a different decision rule connecting a recognized situation to an action:

A10a)
If food < food target → I should decrease animal target.

A10b)
If food > food target → I should increase animal target.
We assume that decision-makers will not act counter to the rules they have elaborated through all the reasoning steps. Therefore, we do not discuss the logical possibilities of this assertion. Consider now how the intensity of adjustments is determined:

A11)
If food approaches the food target as quickly as possible → accumulated production will be maximized.
A11-Ps0-Possible (food approaches the food target as quickly as possible & accumulated production will be maximized)
A11-Ps1-Possible (food approaches the food target as quickly as possible & ¬ accumulated production will be maximized)
A11-Ps2-Possible (¬ food approaches the food target as quickly as possible & accumulated production will be maximized)
A11-Ps3-Possible (¬ food approaches the food target as quickly as possible & ¬ accumulated production will be maximized)
The idea that food can approach the food target at varying speeds implies that if there is too little food, "something" needs to be done to enable food to reach the desired level. It is necessary to reduce consumption, and this leads to the need to decrease the number of animals, accepting that this will also decrease production. A11-Ps0 is true and A11-Ps1 and A11-Ps2 are impossible in the game only when the food target = 30. This is not the case of policy P2, where A11-Ps0 is false and A11-Ps1 as well as A11-Ps2 are true.
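The four possibilities attached to a conditional assertion always follow the same pattern, so they can be generated mechanically. The following sketch enumerates them in the order used above (Ps0 to Ps3); the function name `possibilities` and the string format are illustrative choices, not part of the theory of mental models itself.

```python
from itertools import product


def possibilities(antecedent, consequent, label):
    """List the four conjunctions Ps0..Ps3 for 'antecedent → consequent'."""
    rows = []
    # product yields (True, True), (True, False), (False, True), (False, False),
    # which matches the order Ps0, Ps1, Ps2, Ps3 used in the text.
    for index, (p, q) in enumerate(product([True, False], repeat=2)):
        left = antecedent if p else f"¬ {antecedent}"
        right = consequent if q else f"¬ {consequent}"
        rows.append(f"{label}-Ps{index}-Possible ({left} & {right})")
    return rows


for line in possibilities(
    "food approaches the food target as quickly as possible",
    "accumulated production will be maximized",
    "A11",
):
    print(line)
```

Which of these possibilities can actually occur depends on the game and on the food target, as discussed above; the enumeration only lists the candidates a decision-maker could consider.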
The causal diagram in Figure 6 shows a relevant difference compared to policy P2: the food target depends on food regeneration, and its value is assigned according to assertion A6.3. Decision-makers might reconsider this rule in response to surprising outcomes. This would happen over the reiterated decisions and likely result from revisions to the previous reasoning steps.
The MMDS beneath all three policies have many common elements, as illustrated in Figure 7. They all aim at driving production such as to maximize accumulated production, and they all account for the possibility of having so many animals that there will be starvation due to a food deficit. However, they go different ways to drive the animal target. Policy P1 does not use the food target but uses the yearly food net change to determine the animal target, therefore depending on information already revealed by the system's behavior.
Policies P2 and P3 use inferences drawn from the briefing information to mentally "jump" to the final food level. They then set the food target to different values because they use a reasoning which pays attention to different variables: P2 relies on food, whereas P3 considers food regeneration. Arriving at P3 takes some reasoning that is not directly framed by the salient MMP, implying an increased mental effort.
It is important to see how the different degrees of the salience of the possibilities in assertions A3, A4, and A5 lead to different policies. We summarize this in Figure 8, which presents the respective sequences of assertions (referenced by their respective identifiers) from assertion A1 to the decision rules. Figure 8 shows that policy P1 is a line of reasoning that has the same origin as P2. But it takes a markedly different direction as compared to policies P2 and P3. The differences between P2 and P3 are less blatant. One difference is the value assigned to food target. The other difference is that P2 is based on the food level (assertions A8.1, A8.2a, and A8.2b), whereas P3 accounts for food regeneration (A9.1, A9.2, A10a, and A10b). Note that in P2 and P3, the same decision rules process different values of the food target; of course, two distinct food targets can lead to distinct decisions. Food levels between 31 and 59 mm will trigger the condition "food < food target" for policy P2. In contrast with this, policy P3 will classify the same food levels as "food > food target". One should expect different performances of these policies. The decision rules are easy to carry out by basic arithmetic operations. v

Figure 8: A decision tree to determine which policy will be applied
To reduce animal target as quickly as possible: animal target_{y+1} ← 0.
To increase animal target as quickly as possible: food surplus ← food - food target; animal target increase ← food surplus / food consumption per animal; animal target_{y+1} ← animal target_y + animal target increase.
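The operations above translate directly into code. The sketch below is a plain reading of decision rules A10a and A10b combined with the "as quickly as possible" adjustment. The value of 0.004 mm per animal per year is an assumption derived from the figures quoted in this article (5 mm of regeneration sustaining 1,250 animals), and the function name is hypothetical.

```python
# Assumption: 5 mm/year of regeneration sustains 1,250 animals,
# hence 5 / 1250 = 0.004 mm of food per animal per year.
FOOD_CONSUMPTION_PER_ANIMAL = 0.004


def next_animal_target(food, food_target, animal_target):
    """Return animal target for year y+1 under the rule shared by P2 and P3."""
    if food < food_target:
        # A10a: too little food, reduce the target as quickly as possible.
        return 0.0
    if food > food_target:
        # A10b: food surplus, convert it into additional animals.
        food_surplus = food - food_target
        animal_target_increase = food_surplus / FOOD_CONSUMPTION_PER_ANIMAL
        return animal_target + animal_target_increase
    # food equals food target: no adjustment is prescribed.
    return animal_target


# The same food level of 40 mm is classified differently by the two policies.
print(next_animal_target(40, 60, 1250))  # P2 (food target 60 mm): target drops to 0
print(next_animal_target(40, 30, 1250))  # P3 (food target 30 mm): target rises
```

The only difference between P2 and P3 in this sketch is the `food_target` argument, which mirrors the point that the same decision rule processes different target values.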

The behavior and performance of the policies

The context of the simulations
These policies have been inserted in a system dynamics model (see supplementary material for the model documentation; interested readers may also interact with the model through a simple user interface at: https://exchange.iseesystems.com/public/martin-schaffernicht/herd-management-model). The trajectories of the herd (animals) and food and the performance in terms of accumulated production have been simulated under various initial conditions (food_init, animals_init) because the behavior and performance of the policies can be sensitive to the initial conditions.
This assures that all initial endowments of food are simulated with the optimal number of animals when food has the optimum thickness (1,250 animals), and that all initial herd sizes are tested with the optimum food level (30 mm).
Concerning the assessment of performance, accumulated production is problematic. In the "reindeer experiment", players had to maximize production by maximizing the number of reindeer slaughtered.
Whenever the initial herd size exceeds the optimum, this would generate a windfall benefit because the downward correction of the herd size will increase production. Production does not capture sustainability, except in the special case when an excessive herd size annihilates food, and then all animals starve. We, therefore, measure performance based on the relationship between food regeneration and food consumption per animal over time:

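$$\text{performance} = \int \frac{\text{food regeneration}(t)}{\text{food consumption per animal}} \, dt$$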
Whatever quantity of food is added to the stock after the animals have consumed their part at the end of a year defines how many additional animals are sustainable in the beginning period. This performance indicator combines both aspects of the decision-maker's goal to maximize production while remaining sustainable. The following box chart shows each policy's range of performances computed as the number of animals that could graze without starving, given the food regeneration:

Figure 9: Box chart of the performance of all policies under 12 different initial conditions

The first policy in Figure 10 represents the policy discussed in the original "reindeer experiment" as benchmark policy: "if food < 30 mm → animal target ← 0, else animal target ← 1,250." Note that this policy was intended for cases where the initial stock of food < 30 mm. However, when food > 30 mm, the herd size of 1,250 animals will consume more food than the net regeneration compensates. The reasons for these differences in performance become clear when looking at the behavior of animals under these policies. To maximize production, any policy should steer the number of animals so that food quickly converges to 30 mm, which assures the maximum food net regeneration of 5 mm/year and allows the highest sustainable number of animals: 1,250, regardless of the initial conditions. Consequently, one can assess the policies' respective goodness based on how quickly the herd size approaches 1,250 from varying initial conditions and how stable this development is over time.
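The performance indicator described above, the accumulated number of animals that the yearly food regeneration could sustain, can be sketched as a simple sum over yearly values. The function name, the yearly time step, and the consumption of 0.004 mm per animal per year (5 mm divided by 1,250 animals) are assumptions made for this illustration.

```python
def accumulated_performance(regeneration_by_year, consumption_per_animal=0.004, dt=1.0):
    """Sum the yearly 'sustainable' herd sizes implied by food regeneration.

    regeneration_by_year: food regeneration in mm/year for each simulated year.
    consumption_per_animal: assumed value, 5 mm / 1,250 animals.
    """
    total = 0.0
    for regeneration in regeneration_by_year:
        # Animals that this year's regeneration could feed without depleting food.
        sustainable_animals = regeneration / consumption_per_animal
        total += sustainable_animals * dt
    return total


# Upper bound under these assumptions: maximum regeneration in all 15 years.
best_case = accumulated_performance([5.0] * 15)
print(round(best_case))  # 18750
```

A policy that keeps food at 30 mm throughout would reach this bound under the stated assumptions; any detour away from the optimum food level lowers the sum.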
Consider nine initial conditions combining animals_init (650, 1,250, and 1,850) and food_init (20, 30, and 40). The following Table 1 shows the nine combinations together with the resulting relationship between food and animals. The first column displays the three possible values for food, followed by the implied net regeneration in the second column. The top row shows the initial values for animals, followed by the total yearly consumption each value implies. The nine cells tell us what the relationship between food net regeneration and consumption means for the immediate future behavior of food: a sign in each cell indicates whether food will increase or decrease. We have equilibrium at the start of the simulation when there are 30 mm of food and 1,250 animals. In conclusion, the initial combinations ensure that policies are tested with all three possible food environments for the herd manager: food may increase, decrease, or keep the current value due to the initial number of animals.

Figure 13 shows that policy P3 also makes animals oscillate. The policies' reaction to the initial food levels is the same: starting with 40 mm of food, the herd size first increases to almost 4,500, then overshoots (food < food target) and is decreased, from where it increases back and then oscillates. An initial food level of 20 mm leads to the opposite movement but then turns into very similar oscillations. A start with 30 mm of food entails one year of stability for all initial herd sizes; however, the curve is flat at the optimum until the end only if the herd has 1,250 animals at the beginning.
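The initial direction of food for the nine combinations can be recomputed from the quantities named in the text. The sketch below assumes that regeneration equals the maximum of 5 mm/year multiplied by the term 1-((food-optimum_food)/optimum_food)^2 given in the model documentation, and that each animal consumes 0.004 mm per year (5 mm divided by 1,250 animals). Both the combination of these pieces and the function names are assumptions for illustration, so the output should be read as a plausibility check rather than as a copy of Table 1.

```python
def food_regeneration(food, optimum_food=30.0, regeneration_max=5.0):
    """Assumed regeneration: maximum scaled by 1 - ((food - optimum) / optimum)^2."""
    return regeneration_max * (1.0 - ((food - optimum_food) / optimum_food) ** 2)


def initial_food_trend(food_init, animals_init, consumption_per_animal=0.004):
    """Classify the first-year movement of food for one initial condition."""
    net_change = food_regeneration(food_init) - animals_init * consumption_per_animal
    if abs(net_change) < 1e-9:
        return "constant"
    return "increases" if net_change > 0 else "decreases"


for food_init in (20, 30, 40):
    for animals_init in (650, 1250, 1850):
        trend = initial_food_trend(food_init, animals_init)
        print(f"food={food_init} mm, animals={animals_init}: food {trend}")
```

Under these assumptions, food initially increases with 650 animals, decreases with 1,850 animals, and stays constant only for 30 mm of food with 1,250 animals.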
In contrast to the behavior under policy P2, this time the average (and goal) value is the correct one: 1,250.

Figure 12 The behavior of the herd size under policy P3
Policy P3 outperforms policy P2 because of the correct food target, which it received from previous reasoning steps. Both policies generate oscillations because they do not account for the delay between food regeneration and herd size adjustment. However, both consumption and food regeneration happen in the year before a food gap is detected and the animal target is adjusted to bring about the desired herd size adjustment (see Figure 7). The balancing feedback structure is a second-order negative loop between food and animals. By driving decisions as if it were a first-order negative loop, policies P2 and P3 cause the oscillations. This failure to perceive a feature of the situation is a type 1 error that makes decision-makers overlook the implied possibilities (type 2 error).
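A toy discrete-time simulation can illustrate how reacting to last year's food gap produces overshoot and oscillation. This is not the authors' STELLA model: the yearly update order, the instant adjustment of the herd to the target, the regeneration formula, and the consumption of 0.004 mm per animal per year are all assumptions, and the numbers it produces will differ from the figures in this article.

```python
def food_regeneration(food, optimum_food=30.0, regeneration_max=5.0):
    # Assumed form: maximum regeneration scaled by 1 - ((food - optimum) / optimum)^2.
    return regeneration_max * (1.0 - ((food - optimum_food) / optimum_food) ** 2)


def simulate_target_policy(food_init, animals_init, food_target=30.0,
                           consumption_per_animal=0.004, years=15):
    """Return the yearly herd sizes under the food-target rule of P2 and P3."""
    food = float(food_init)
    animal_target = float(animals_init)
    herd_sizes = []
    for _ in range(years):
        gap = food - food_target  # observed at the start of the year
        if gap < -1e-9:
            animal_target = 0.0  # too little food: cut the target
        elif gap > 1e-9:
            animal_target += gap / consumption_per_animal  # convert surplus to animals
        animals = animal_target  # assumption: the herd reaches the target within the year
        herd_sizes.append(animals)
        # Food only reflects this decision in the following year.
        food = max(0.0, food + food_regeneration(food) - animals * consumption_per_animal)
    return herd_sizes


for year, animals in enumerate(simulate_target_policy(40, 1250)):
    print(year, round(animals))
```

Starting from 40 mm of food, the herd in this sketch first jumps well above 1,250, is then cut to zero once food falls below the target, and keeps alternating between build-up and collapse. Starting from 30 mm and 1,250 animals, it stays at the equilibrium.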

Two types of mental model error in the three policies
Consider now the mental model errors in certain assertions leading to the three policies (a summary table is included in the supplementary material). Three possibilities following from assertions A1 and A2, shared by all policies, are not salient but possible in the game:
• A1-Ps1-Possible (more animals & ¬ more production) facilitates the error of overlooking the danger of overpopulation.
Two other MMP errors are found in A7:

herd size is sustainable for that specific food level)
Failure to consider this would lead Policy P1 to avoid jumps and therefore not search for a sustainable situation with higher production.

The use of mental models of dynamic systems and possibilities
Policies P1-P3 are artificial. They exemplify how naïve individuals can analyze the herd management situation and formulate a policy. This limitation notwithstanding, the ability to identify these errors and classify them is a step beyond detecting underperformance in end results and behaviors with undesired consequences which suggest "misperception of feedback"; it opens the way to directly linking articulated mental models and articulated policies to the observed behaviors and end results. This makes it possible to research specific mental errors and develop mitigating interventions.
The concepts and methods of the theory of mental models are well established and add to the toolset of concepts and methods so far developed in system dynamics: empirical studies with human participants come into sight.
Several research questions can be addressed using this combined mental model approach.
• Stress: there can be a tension between the experimental decision situation and the personal knowledge of participants. Experimental decision tasks simplify reality, and some simplifications may contradict the MMDSs of participants who have domain-specific knowledge. This can trigger negative affect and emotional resistance. This may reduce the willingness to make a cognitive effort or divert cognitive resources from reasoning to impulse control. Then the question arises if specific mental errors should be attributed to the individual or to the system consisting of the individual plus the experimental situation. This will certainly require additional discussion.
• Cognitive load, working memory and cognitive dissonance: the numerous simplifying assumptions needed for a relatively simple decision situation are usually introduced at the outset. Participants must keep them in their working memory during the experiment. Since the brain has only limited resources, it should be expected that a higher demand on working memory diminishes the attention given to reasoning (Brunyé and Taylor, 2008). If this can be confirmed, does making such assumptions salient just-in-time during the iterations decrease this phenomenon? When participants have prior MMDS in similar situations (see the previous paragraph), it will become more demanding for them to retain contradictory assumptions in their working memory. This may lead to MMDS errors that are induced by the decision task. If this is empirically confirmed, one could argue that some experimental situations trigger artificial mental model errors and improve the experimental settings to avoid such problems.
• Dynamic complexity: the herd management game is arguably the simplest situation involving a dynamic system. Other games like fish banks or versions of the "market growth and underinvestment" situations include more feedback loops and more delayed relationships. Thus, results observed in studies dealing with the previous questions can be examined in increasingly complex decision tasks.
• Transferability of insights: decision tasks may vary in their superficial features, but they may also vary in the complexity of the underlying causal structures (see the previous point). To the extent where participation in experimental games makes individuals learn something, the question arises if this new knowledge can be transferred from one decision task to another one. Would some kinds of MMDS become less frequent? Would some MMP errors decrease?
There remain questions regarding elicitation.
• Elicitation methods: the authors' experience suggests that elicitation methods like questionnaires, comprehension tests or card sorting are useful for eliciting the most accessible parts of recognized MMDS. However, some less accessible aspects and the MMP are only articulated when participants are confronted with an unexpected problem during the game or with an interviewer's probing questions. Therefore, recording a semistructured briefing and debriefing interview and thinking aloud during the experiment appear to be the adequate elicitation approach.
• Prompts: would decision-makers commit fewer MMDS errors when the eliciting researcher includes questions about non-salient possibilities in the debriefing interview?
Such research will provide insights into the cognitive reasons behind phenomena like the misperception of feedback, and therefore contribute to the system dynamics literature. At the same time, cognitive scientists gain access to a type of integrative decisions and reasoning that concentrates on dynamic behaviors rather than assertions concerning certain states or certain events and one-off decisions.

Conclusions
This article introduces a way to analyze the structure of and the reasoning with mental models of dynamic decision situations, leading to the identification of mental errors belonging to two different but interrelated types of errors. Two different types of mental models are used in combination: (1) mental models of dynamic systems (MMDS), well known in the system dynamics field but seldom applied, contain the mental representation of the decision situation, and (2) mental models of possibilities (MMP) from the theory of mental models. The dynamic decision situation used to test this is a variant of the well-known "reindeer experiment".
This article identifies specific mental errors of both types in the MMDS and MMP underlying three naïve policies. The two main behavior flaws were that (1) either any sustainable constellation of food and animals is taken as "the" solution (policy P1) or (2) overshooting corrections lead to unproductive oscillations. These flaws could be avoided by overcoming the identified mental errors.
Our results suggest that combining mental models of dynamic systems with the theory of mental models is fruitful. It provides the possibility to represent how decision-makers reason with their MMDSs, and to pinpoint the errors committed due to, for instance, the misperception of feedback.
These errors are what makes a flawed policy seem correct to a decision-maker. We hope this perspective may motivate researchers to incorporate the articulation and analysis of mental models in experimental studies.
To our knowledge, no empirical studies have been carried out to test this claim. It is now time to include real individuals as decision-makers. Some directions for empirical research have been delineated, and we hope that this article may encourage empirical studies in this area.

Summary of the policies and their mental model errors
The main text discusses the assertions beneath each policy in detail, including the diverse possibilities which may be relevant for the game but unprocessed. This section provides a summary of the assertions and then presents a synoptic table with the respective mental model errors.
Assertions made by all policies:

A1) I have more animals → I will have more production.

A2.3) food decreases each year → this violates sustainability.
A2.4) Therefore: I have more animals, → I will need more food (to be sustainable).

A3)
I have more food, → I can sustain more animals.

A4)
From A3) it follows that: I have the most food, → I can sustain the most animals.

A5)
From A4) and A1) it follows that: I have the most food, → I will have the largest production.
A7.2) food_y = food_{y-1} → animals_{y-1} is the sustainable number of animals given food_{y-1}, but for other food levels, the sustainable number of animals might be different.

Decision rule:
If food has changed (food_y <> food_{y-1}), then I will change animal target in the same direction.
Otherwise, I will slightly increase animal target.
Assertions made by policy P3:

A9.1) food regeneration is maximized → my accumulated production will be maximized.

A10b)
If food > food target → I should increase animal target.

A11)
If food approaches the food target as quickly as possible → accumulated production will be maximized.
Both policies P2 and P3 follow the same rule when it is possible to increase the herd size because food > food target.
Decision rule of policies P2 and P3: If I should change the number of animals → I should change them the quickest possible:
To reduce animal target as quickly as possible: animal target_{y+1} ← 0.
To increase animal target as quickly as possible: food surplus ← food - food target; animal target increase ← food surplus / food consumption per animal; animal target_{y+1} ← animal target_y + animal target increase.

A11.2)
If food approaches the food target as quickly as possible → accumulated production will be maximized.
A11.2-Ps1-Possible (food approaches the food target as quickly as possible & ¬ accumulated production will be maximized) 1 1
A11.2-Ps2-Possible (¬ food approaches the food target as quickly as possible & accumulated production will be maximized) 1 1

The simulation model
The model has been implemented using STELLA Architect 1.9.5 and can be freely used through a simulator interface at: https://exchange.iseesystems.com/public/martin-schaffernicht/herd-management-model
This model has been developed for a thought experiment: to test the hypothetical decision policies in a dynamic decision task. Therefore, there is no reference data to compare simulation data against: no exogenous variables, and no pseudo-random streams.
Important parameters for testing the policies:
• Policy switch: values from 1 to 4 to select which policy will be simulated. 1 through 3 correspond to the policies discussed in the article. The fourth possibility is to use a policy following Erling Moxnes' discussion of the "reindeer experiment".
• Animals INIT: INITial number of animals in the herd (between 0 and 1,900).
• Food INIT: INITial stock of food (between 10 and 60).
• Policy multiplier: values from 0 to 1 for adjusting how strongly the policy is applied.
• The discussion in the article used data from 12 distinct simulation experiments based on combining diverse INITial stock levels for animals and food, generated by STELLA's "sensitivity runs" for policies P1, P2 and P3 with the following combinations:
o Animals INIT: incremental in 3 steps from 650 to 1,850.
o Food INIT: incremental in 4 steps from 10 to 50.
Other parameters to adjust the simulation:
• Animals knockout switch: allows simulating the model without the herd and its management (0|1).
• Herd management switch: allows simulating the model with the herd but without its management (0|1).

Accumulated_performance(t) = Accumulated_performance(t - dt) + (annual_performance) * dt
INIT Accumulated_performance = 0
UNITS: Animal
DOCUMENT: Sum of the yearly "sustainable" herd sizes - the more, the better given the goals of the game.

UNITS: unitless
DOCUMENT: The term 1-((Food-optimum_food)/optimum_food)^2 expresses the distance of the current food level from the optimum level in relative terms and then squares it to avoid negative values, then subtracts the square from 1. Effect: when food = optimum, the value will be 1, but the further food is away from the optimum, the smaller the value becomes. This will be used as a multiplier in the regeneration flow equation.
USED BY: food_regeneration

food_regeneration_max = 5
UNITS: mm/Years
DOCUMENT: When food level is optimal, this will be the yearly food net regeneration.

DOCUMENT: Same as policies 2 and 3: the distinct decisions taken by these two policies are the consequence of the distinct values of the food target. This variable only serves readability: it is fed into animal target by policy, and I think it is important to avoid confusion by using names like "policies 2 and 3".

Endnotes
i A conditional is a "sentential connective" with two clauses: the if-clause or antecedent, and the then-clause or consequent. In classical logic, a conditional can only be false under one circumstance: when the if-clause is true and the then-clause is false (see, e.g., Jeffrey, 1981). Its usual structure is "if p, then q". A "sentential connective" is a connective linking two clauses or sentences; for example, the conditional (if…then…), conjunction (…and…), or disjunction (either…or…).
ii A conjunction is a "sentential connective" with two clauses named "conjuncts." In classical logic, a conjunction is true only when its two conjuncts are true at once. Its usual form is "p and q" (see, e.g., Jeffrey, 1981).
iii Sometimes only the first model is identified. However, in those circumstances, individuals can know that more models can be deployed. This can be represented, for example, in this way: "Possible (p & q)…", where the dotted line points out the possibility to display more models.
iv Decision-makers will also wonder if the goal of maximizing the production over 15 years requires them to maximize each year's production or if there is a better alternative. In the remainder of this article, they are assumed to believe that the production of one year can affect the largest possible production for the following year, and therefore a "sacrifice" (a sub-maximum production) in one year may be more than offset by the maximum production in the following year. Believing that each year's production must be maximized leads to slightly different policies, but these differences have only little impact on the variables' behaviors and the performance in the game (the reader will find a discussion of the corresponding policies in the supplementary material).
v The following operations describe implementation of the decision rule:
To reduce animal target as quickly as possible: animal target_{y+1} ← 0.
To increase animal target as quickly as possible: food surplus ← food - food target; animal target increase ← food surplus / food consumption per animal; animal target_{y+1} ← animal target_y + animal target increase.