Framed experiments and Games in Agriculture: A systematic review of the 21st century in economics and social science

Framed experiments and games are a useful medium to understand how context affects individual and group decision-making. They are particularly relevant for field research in agriculture, where alternative experimental designs can be costly and infeasible. After a systematic review of the literature, we found that the volume of published studies employing coordination and cooperation games increased during the 2000-2020 period. In recent years, greater attention has been given to natural resource management, conservation, and ecology, especially in strategic regions for agricultural sustainability. Other games, such as trust and risk games, have come to be regarded as standards of framed field experiments in agriculture. Regardless of sectoral focus, most games' results are subject to internal and external validity criticism. In particular, a significant portion of the games showed potential recruitment biases against women and no opportunities for continued impact assessment. However, games' validity should be judged on a case-by-case basis. Specific cultural aspects of games might reflect the real context, and generalizing games' conclusions to different settings is often constrained by cost and utility. Overall, games in agriculture could benefit from larger, more frequent, and more inclusive experiments and data, all possibilities offered by digital technology. Present-day physical distance restrictions may accelerate this shift. New technologies and engaging ways to approach farmers might represent a turning point for games in agriculture in the 21st century.

In the first half of the 20th century, experimental methods such as randomized control trials (RCTs) became a popular research method in the area of agricultural productivity, usually in the form of carefully controlled field trials [1]. Later on, increasingly larger data sources in the areas of labor and industrial organization captured the attention of researchers, who identified in those information sources the opportunity for quasi-experimental or "natural experiment" studies [1]. RCTs were perceived as a clear way to estimate causal effects of policies on a given population [6], while natural experiments were accepted as a next-best way to establish causality without being part of a framed game or experiment [1,2]. RCTs and natural experiments, however, are only part of the field experimenter's toolkit, and cannot always speak to external validity [2]. Moreover, there are many situations in which natural experiments and RCTs have gaps in their ability to inform policy. Researchers cannot generate natural experiments but must work with the exogenous events that have occurred, hopefully discovering an event that provides identification. RCTs, in many situations, can require scaling and experimental designs that are economically and ethically infeasible [5,7].
Games can contribute to understanding individuals and groups by mimicking the actual environments and incentive structures where policy interventions will take place. Games are a particularly relevant tool for overcoming obstacles in agricultural research (for example, the long time frames needed to study agricultural decision making and crop output) by simulating the setup and framing of issues surrounding climate change, food security, biodiversity conservation, and governance of natural resources. Despite their ubiquitous use across many disciplines (economics, agriculture, and sociology, among others), we are not aware of a systematic comparative analysis of the types of games that have been used and the extent to which they can help understand and predict behavior in the real world. We fill this gap by providing a systematic review of the literature and by proposing how games can be used to pursue future research.

Questions:
The overarching goal of this study is to analyze the existing literature on games and agriculture between 2000 and 2020 to understand the strengths and weaknesses of games in rural settings. To this end, we lead with three questions:
• Q1: What is the primary purpose and context of games in agriculture in this period?
• Q2: What is the scope of the games' results and conclusions based on the experimental design?
• Q3: Is there evidence of any technological transition or evolution in the way that games and experiments have been performed and implemented in rural settings?
In general, Q1 requires identifying the subjects and the context of each experiment, including the specific agricultural sector involved and the behavioral test or analysis behind the game. Q2 is designed to assess how robust and replicable a game's results and conclusions are. Finally, Q3 analyzes the experimental design and technological elements that affected games' scalability, accessibility, replicability, and costs.

Methods
In this study, we present a systematic literature review (SLR) based upon a pre-determined rubric to evaluate each study. The SLR followed the PRISMA guidelines for preferred reporting items [8]. We searched electronic databases in June 2020, including the Web of Science (WS) and Scopus, the largest databases of peer-reviewed literature with multidisciplinary coverage of academic articles [9,10]. To limit the scope of the review, we defined full and truncated search terms capturing experiments, games, and agriculture in the following general search string, applied to the title, abstract, or keywords: (game* OR experiment* OR test* OR behavior* OR gamif*) AND (agriculture OR farm* OR smallholder). We restricted the results to publications in English from 2000 to the present (WS: 78,440 and Scopus: 118,016). We refined the search based on the Scopus subject area filter, including social science, economics, business, and psychology, and the closest WS categories, which included economics, agricultural economics policy, business, behavioral science, and interdisciplinary social science. As a result, WS suggested 3,824 references while Scopus suggested 11,265. These numbers indicate that even if all WS references were duplicated in Scopus, our first refined search suggested more than ten thousand references (Fig. 1).

Figure 1. Systematic literature review and screening process
Given this large number of papers, we refined our search to be more specific on the first component of the search string, including only (game* OR gamif*). Consequently, we defined the new string as (game* OR gamif*) AND (agriculture OR farm* OR smallholder). As before, we completed the search based on the title, abstract, or keywords. As a result, WS suggested 184 references while Scopus suggested 513. We then removed 135 duplicates, consolidating a final pool of 562 references (Layer 1).
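The Layer 1 arithmetic reported above can be checked with a minimal sketch. The three counts come from the text; the `deduplicate` helper and its title-matching rule are hypothetical illustrations and are not the procedure the review used to identify duplicates.

```python
# Counts reported in the text for the refined search string.
ws_hits = 184
scopus_hits = 513
duplicates_removed = 135

# Layer 1 pool: all hits from both databases minus the duplicates.
layer1_pool = ws_hits + scopus_hits - duplicates_removed
print(layer1_pool)  # 562


def deduplicate(records):
    """Keep the first record per normalized title.

    Hypothetical matching rule (case- and whitespace-insensitive title);
    the review does not describe how duplicates were detected.
    """
    seen = set()
    unique = []
    for record in records:
        key = " ".join(record["title"].lower().split())
        if key not in seen:
            seen.add(key)
            unique.append(record)
    return unique
```

In practice, matching on a DOI where one is available would likely be more reliable than matching on titles alone.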
We refined our first-layer screening according to the following inclusion criteria: i) well-explained research methods, ii) an explicitly stated and described game/experiment/test, iii) a reported impact or effect, and iv) a clearly described outcome related to farmer behavior or decision making. For the second layer of screening, we excluded references following these criteria: i) economic models with game theory but without any fieldwork, tests or experiments conducted with farmers, ii) surveys or interviews without activities, games, tests or experiments (even when fieldwork or visits occurred), iii) publications with mixed methods, which included critical perspectives, protocols, personal observation, and reviews but no experimental or game design, nor interactions with farmers, iv) studies that used the word "game" in a different context (e.g., wild animals hunted), v) studies on animal (non-human) behavior, and vi) lab experiments and games with subjects other than farmers to simulate "rural contexts" or farmers (mentioned in the following section). As a result, layer two concluded with a pool of 104 references, from which 52 papers were randomly selected for this review.

Categories
Field experiments lie 'on the bridge' between the laboratory and the 'real world' of naturally occurring data [2]. Lab experiments employ a standard pool of participants (students), an abstract framing, and an imposed set of rules. Artefactual field experiments (AFE) are the same as a conventional lab experiment, but participants are non-students. Framed field experiments (FFE) move a step closer to the naturally occurring setting, including the field and context of participants. Finally, natural field experiments are similar to FFE, but participants are performing their everyday decisions in their natural environment without knowledge that they are being observed [2,3]. Since our main goal is to capture real-life contexts and decision making, our review considers mainly games that can be classified as FFE. In addition, as we anticipated that a significant portion of the games occur in low- and middle-income regions, with significant income and education differences between students and farmers, we exclude lab experiments where students assumed the role of farmers (e.g., [11]).
To characterize the games being studied, we defined four experimental or behavioral test domains. Social preferences include games measuring altruism, reciprocity, trust, fairness, and subjects' behavior and attitudes toward others' wellbeing. Coordination and cooperation games include canonical and non-canonical experiments of public good games, common pool resource dilemmas, and governance challenges. Market games and simulations include games that mimic the rules of various real-world market institutions (e.g., auctions, insurance) and that measure willingness to buy or sell different products and services. Our final group, behavioral and cognitive bias, includes experiments concerning risk preferences, attitudes, and memory, intertemporal behavior and discounting, and learning and adoption of new technologies.
Agricultural sectors, when defined, are classified by growing crops, raising animals, fishing, hunting, and forestry. Moreover, we identify the context in which experiments took place to determine the policy areas where recommendations and conclusions would apply, specifically: farm risk (climate risk, financial and market risk, and pests and diseases), conservation and natural resource management (land, water, biodiversity), inclusion and poverty (gender, ethnicities, food security, health), and networks and social capital formation (including formal and informal institutions, organizations, and regulations) (Table A1).
We defined specific categories to assess the scope, validity, and robustness of the experiments. Some of these categories involved direct metrics, while others required some level of judgment. Among the direct metrics that we consider are: i) the number of participants, ii) the number of rounds and/or repetition of the experiment, iii) the report of a balanced sample in terms of gender, age and ethnicity of participants, iv) the monetary and non-monetary incentives (real and simulated), and v) the presence of power calculations, which can shed light on whether a study has a sufficient number of participants to detect an effect, and the validity of any detected effect sizes [12].
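To illustrate what a power calculation of the kind mentioned in item v) involves, the sketch below applies the textbook normal-approximation formula for comparing the means of two equally sized groups, n = 2((z_{1-α/2} + z_{1-β}) / d)^2 per group, where d is the standardized effect size. This formula, the function name, and the default values are assumptions chosen for illustration; the reviewed studies may have used other designs (for example, clustered or repeated-measures designs) that call for different calculations.

```python
import math
from statistics import NormalDist


def participants_per_group(effect_size, alpha=0.05, power=0.80):
    """Approximate sample size per group for a two-sided, two-sample
    comparison of means (normal approximation; illustrative only).

    effect_size: standardized mean difference (Cohen's d), must be > 0.
    """
    if effect_size <= 0:
        raise ValueError("effect_size must be positive")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for the test
    z_power = NormalDist().inv_cdf(power)          # quantile for the desired power
    n = 2 * ((z_alpha + z_power) / effect_size) ** 2
    return math.ceil(n)


# A medium standardized effect (d = 0.5) at 5% significance and 80% power.
print(participants_per_group(0.5))  # 63
```

Because the required sample grows with the inverse square of the effect size, halving the detectable effect roughly quadruples the number of participants needed, which is one reason small field games may be underpowered.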
We judged the credibility of each study's findings based on whether the experimental design considered credible counterfactuals and/or accounted for the potential existence of confounding factors. We also evaluated the technologies used to collect data and whether there was a possibility for continued data collection and impact assessment. We considered metrics that reflect how personalized the data collection process is, including: i) the identification of the experimenter (i.e., researcher, local collaborator, digital tool), ii) the modality or interface of the game (i.e., paper-based, cell phones or smartphones, web-based software), and iii) the player's possibility of sharing the game with other members of their community. We reviewed each paper based on the above standardized categories, and we cross-checked codes for internal consistency. We cleaned, analyzed, and graphed the data in R.

Descriptive statistics and results
Over the last twenty years, there has been a significant increase in the volume of publications related to games in agriculture, mainly focused on Asia, Sub-Saharan Africa, and Latin America. The bulk of the literature has been on growing crops and, more recently, on forestry and livestock (Figure 2). We found relatively fewer studies with games conducted in North America, Europe, or Australia, but there has been an increase in studies in those regions since 2010. In terms of research focus, coordination and cooperation games are the most frequently utilized games worldwide, and are most prevalent in Sub-Saharan Africa and Europe. In Asia, market games and simulations are the most popular, while in Latin America, coordination and cooperation games and social preference games are more common (Figure 2). Games associated with social preferences were especially oriented to measuring trust, while behavioral and cognitive bias games focused on risk preferences and attitudes (Figures A1 and A2). The activities studied have become more diverse over time. For example, activities related to conservation and ecology, which were previously not commonly researched, are now being studied using games (Figure A3). There has also been an increase in studies evaluating group and collective decision making (Figure A4). Games regarding farm risk management have also become more frequent over the last 15 years (Figure A3).
Regarding the scope and validity of the games, the majority of studies either did not report or were not balanced on gender, age, and ethnicity, and very few reported power calculations for establishing sample size (Figure 3, red). Breaking the sample down by study focus, studies of individual preferences and behaviors had a larger sample size, and they were more gender-balanced than studies of cooperation and coordination and games related to market simulations. However, individual behavior studies were also much less likely to have repeated follow-ups over time or use real monetary incentives (Figure 3, red). Studies of cooperation and coordination were judged to have a lower level of causal validity, and studies of market games and simulations were more likely to have confounding factors (Figure 3, blue).

Figure 3. Average research scope by type of behavior studied. Blue represents measures that required reviewers' judgment. Variables representing aspects of research quality were transformed into a 0-1 scale for comparability (see Appendix, Table A2 for the coding scheme).

Of the 46 reviewed studies that recorded survey modality, only 7 (15%) were digital (web-based, computer, or tablet), and the remaining 39 (85%) were paper-based. Similarly, out of 52 studies, 78% reported no data gathering after the initial intervention or experiment, and 71% reported having no mechanism for participant feedback over time. Around 30% of the games incentivized participants with actual monetary rewards, while only 17% were associated with intrinsic motivations. The remaining 52% used simulated/hypothetical monetary rewards.
Regarding the results of the papers reviewed, Table 1 summarizes key findings by research focus and scope (as defined above). For instance, a body of evidence suggested that contribution to public goods is greater in smaller and more familiar settings, and that the efficacy of policies to promote environmental public goods depends on their design. Another consistent finding is that small, self-policing organizations exhibit more cooperative behavior, and that individual incentives influence cooperative behavior. In the area of market games and simulations, multiple papers found that markets for environmental goods are sensitive to design choices, and that playing games that educate people about financial instruments can eventually affect demand. Behavioral and cognitive bias games show that extrinsic motivators do not appear to crowd out intrinsic motivations. Moreover, risk games suggest that preferences revealed through games often reflect real-world risk factors. Finally, social preference games suggest that: familiarity with neighbors leads to more prosocial behavior; social trust influences technology adoption; trust influences farmers' willingness to participate in potentially risky social actions; and scarcity is not always an explanation for anti-social behavior.

[13] finds that members of larger farming communities were less willing to contribute to a public goods game.
The efficacy of policies to promote environmental public goods depends on their design: A simulation game with farmers in Europe [14] finds that action-based incentives (i.e., rewards for individual planting behavior) were more effective at promoting biodiversity conservation than results-based incentives (i.e., rewards for collective biodiversity outcomes).
However, game actions may not always predict actual public goods behavior: An observational study in Sierra Leone [15] finds that behavior in a public goods game had no meaningful correlation with actual pro-social behavior in a community development program.

Coordination and cooperation games (n = 7):
Small, self-policing organizations exhibit more cooperative behavior: In the Philippines, a study finds evidence that farmers' common-pool game contributions depended on their neighbors' actions [16]. Similarly, a study in the Republic of Congo [17] finds that self-monitoring reduced free-riding in a common pool game.

Individual incentives influence cooperative behavior:
A study in the US finds that farmers' propensity to cooperate in a game depended on their degree of risk aversion and their expectation of others' behavior [18]. A study in Germany finds that a social nudge reduced farmers' free-riding in a simulation game [19]. Likewise, a study in Latin America finds that individual incentives were more effective than collective incentives in promoting cooperation in an ecosystem services simulation [20].

Public good games (n = 1): An observational study of coffee farmers in Costa Rica [22] finds that farmers from different communities contributed less to a public goods game than farmers from the same community and that free-riding behavior was correlated with actual free-riding behavior.

Coordination and cooperation games (n = 7):
Five of these studies were descriptions of serious games without quantitative hypothesis tests (only qualitative accounts of participants' actions and feedback).
A study in South Africa and Namibia finds that contributions to a common-pool resource game were greater in homogeneous sociodemographic settings. An observational study across communities in the Levant finds that farmers from places with communal water management systems were less likely to free-ride in a simulation game, as compared to farmers from places with top-down water management systems [23].
In the Philippines, a study finds that women and men had almost equal decision-making power in an intrahousehold farm investment simulation.
Behavior in games can be a good proxy of farming organizations' financial health: An observational study in Ghana [21] finds that the financial performance of farmer cooperatives in Ghana was correlated with its members' behavior in a risky dictator game.

Market games and simulations (n = 9)
Above median score (1.3): n = 2. Below or in the median score (1.3): n = 7.
Markets for environmental goods are sensitive to design choices: An RCT in Liberia, using a simulation game, finds that monetary incentives to reduce fertilizer usage were more effective when they were framed as punishments rather than rewards, but less sustainable [24]. In contrast, an RCT in Tanzania [25] finds that payment for environmental services (PES) was more effective in improving forest conservation than mandated levels of contribution (backed by penalties).

Insurance demand (n = 3): A study in India finds that the average willingness to pay for weather insurance was around 9% of the maximum possible payout, and that demand was greater for group insurance than for individual insurance [26]. A simulation in Ethiopia finds that farmers exhibited a preference for insurance over other risk management options, including high-interest savings [27]. An RCT in Ethiopia finds that playing an educational game increased uptake of index insurance by 10% [28].

Payment for environmental services (n = 2):
A study in Indonesia [29] finds that longer-established farmers and those with larger plots were more likely to win PES auctions. Actual conservation compliance cost was about 115% greater than the bid outcome on average, and only about 55% of farmers completed their contracts. A qualitative study in Latin America finds that the implementation of PES schemes often rests on deep-seated power asymmetries and, therefore, risks reproducing existing inequalities [30].
Other (n = 2): Two remaining studies were descriptions of participatory design processes without any

Behavior and cognitive bias (n = 15)
Above median score (1.3): n = 7. Below or in the median score (1.3): n = 8.
Extrinsic vs. intrinsic motivators (n = 2): Extrinsic motivators do not appear to crowd out intrinsic motivations: A study in Germany [31] finds that both direct individual nudges and social comparisons reduced farmers' illicit fertilizer use in a simulation game, but that combining the two did not lead to any additional effect. Likewise, a study in Colombia [32] finds that PES did not change farmers' self-reported motivations for conservation, and that PES improved conservation behavior in a simulation game, regardless of its design (i.e., individual vs. collective payments).

Risk preferences/attitude (n = 5): Risk preferences revealed through games often reflect real-world risk factors: A study in Paraguay [33] finds that when the risk of theft is higher, the amount of gift-giving increases and that risk attitudes are highly predictive of play in behavioral games. Similarly, [34] finds that risk-averse farmers are less likely to invest, even with insurance available. A study in Vietnam [35] finds that low-wealth farmers reduce their fertilizer intensity when their risk aversion increases, and that the marginal effect of risk aversion is insignificant for high-wealth farmers.

Risk preference findings have implications for the "poverty trap" model of development:
A study in Ghana [36] finds that farmers are more concerned with maximizing agricultural productivity than minimizing variance.
Addressing issues of power asymmetry (n = 1): One study [37] finds that the use of games for collective decision-making can encourage a greater socioeconomic variety of farmers to voice their opinions.
Risk preferences/attitude (n = 4): [38] looks at risk aversion in farmers vs. freelancers and finds that farmers were more risk-averse than the freelancers. However, both groups exhibited constant partial risk aversion and decreasing absolute risk aversion. [39] finds that most farmers, when given a choice, preferred cash payments to index insurance contracts, even when the insurance contracts offered substantially higher expected returns. [40] finds that, to predict risk aversion, it is more important to consider a farmer's situation, available information, and emotional state than to assume a fixed attitude among all farmers.
Different measurements of risk preferences may yield inconsistent results: [41] finds that the elicitation technique chosen influences the degree of farmers' measured risk aversion.

Games and technology adoption (n = 3):
Games can be a useful tool for facilitating technology adoption: [42] finds that farmers who played a serious game about shrimp farming increased information exchange with peers, and consequently, increased the likelihood of technology adoption.
However, games' abstraction can limit their applicability: [43] finds that participatory scenario development was better suited for farmers' collective decision-making processes than role-playing games, which farmers found to be more abstract.
Group composition and individual identity influence productive technology adoption: [44] finds that women have a stronger preference for agroforestry, and male-only groups prefer more production (timber) and protection forest. [45] looks at group size and leadership and finds that smaller groups promote more coordination, but that leading by example did not improve coordination.

Social preferences (n = 7)
Above median score (1.3): n = 6. Below or in the median score (1.3): n = 1.
Altruism (n = 1): Familiarity with neighbors leads to more pro-social behavior: A quasi-experimental study in Cambodia (exploiting a resettlement lottery) finds that resettled farmers gave 42-75% less to their neighbors in a solidarity game [46].

Trust (n = 4): Social trust influences technology adoption:
An RCT in Ecuador finds that receiving agricultural advice from an extension agent led to greater trust (as measured by a trust game) and greater learning than when advice was given by a neighbor [47]. Likewise, two separate observational studies in Ethiopia find that behavior in a trust game was correlated with actual soil conservation behavior [48,49].

Trust influences farmers' willingness to participate in potentially risky social actions:
A study in Ecuador finds that delayed loan repayment led farmers to trust their partners less (as measured by a trust game), and consequently made them less willing to loan money in the future [50].

Other (n = 1): Scarcity is not always an explanation for anti-social behavior:
A study in Latin America finds that farmers' cheating behavior in a multi-round game did not depend on their current level of scarcity in the game [51].
Other (n = 1): A study in Cameroon finds that "knowledge elicitation tools" (semi-structured interviews with a game component) were an effective method for measuring farmers' attitudes toward conservation [52].

Scope score = 1/2(Sample size + 1/3(Gender balance + Ethnic balance + Age balance)) + 1/2(Repetition + Incentives) + 1/2(Validity + Evident confounding factors). Scores are reported for n = 49 papers because three papers were missing data on one or more variables, and it was not possible to calculate a scope score for them. See Table A2.
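The scope score formula in the table note can be written as a short function. This is a minimal sketch: the argument names are illustrative, each input is assumed to be already coded on the 0-1 scale described for Figure 3, and the "evident confounding factors" variable is assumed to be coded so that a higher value is more favorable. The actual coding scheme is given in Table A2 and is not reproduced here.

```python
def scope_score(sample_size, gender_balance, ethnic_balance, age_balance,
                repetition, incentives, validity, confounding):
    """Scope score as stated in the table note (inputs assumed in [0, 1])."""
    inputs = (sample_size, gender_balance, ethnic_balance, age_balance,
              repetition, incentives, validity, confounding)
    if any(not 0 <= value <= 1 for value in inputs):
        raise ValueError("all inputs are assumed to lie on a 0-1 scale")

    # 1/2 (Sample size + 1/3 (Gender balance + Ethnic balance + Age balance))
    participants = 0.5 * (sample_size
                          + (gender_balance + ethnic_balance + age_balance) / 3)
    # 1/2 (Repetition + Incentives)
    design = 0.5 * (repetition + incentives)
    # 1/2 (Validity + Evident confounding factors)
    credibility = 0.5 * (validity + confounding)
    return participants + design + credibility


# Hypothetical coding of a single paper, for illustration only.
example = scope_score(sample_size=0.5, gender_balance=1.0, ethnic_balance=0.0,
                      age_balance=0.5, repetition=0.0, incentives=1.0,
                      validity=0.5, confounding=0.5)
print(example)  # 1.5
```

Under these assumptions the score ranges from 0 to 3, so the reported median of 1.3 sits a little below the midpoint of the scale.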

Discussion
The increasing number of framed field experiments and games in rural Asia, Africa, and Latin America during the last twenty years have contributed to filling the information and modeling gaps that characterize socio-economic and environmental research in those regions. The growing popularity of coordination and cooperation games involving groups rather than individuals is consistent with a greater focus on natural resource management, conservation, and ecology -areas that depend on collective decisions. The games evaluated here primarily assessed contributions to public goods and the determinants of cooperation for land use, water management, biodiversity, ecosystem services, and organizational aspects. Europe, Australia, and the USA showed a more moderate, but increasing use, of similar games, particularly with the aid of innovative interfaces and technologies. Digital tools and electronic formats can expand the frequency and sample size as compared to games conducted in paper-based formats [14,19,31]. Overall, the rising interest in sustainability during the last twenty years [53] is reflected in the higher frequency of cooperation games that took place in strategic regions in terms of diversity, natural resources, and poverty and there are indications of the use of new technologies.
These new questions and focus areas did not replace traditional games and settings during the last twenty years, however. Market games and simulations-which have been common in South and Southeast Asia-reflect a longstanding interest in understanding individual and collective behaviors in the context of farm risk and inputs management [26,54,55]. Additional and complementary questions related to payments and markets for ecosystem services, forestry, and gender differences appeared after 2015 [29,44,56]. Trust and risk games, on the other hand, were among the most likely to characterize social preferences and behavior of farmers, usually in the context of cultivating crops. Canonical experimental games, including the ultimatum game, the dictator game, and lottery games [57], were frequently utilized with small variations depending on the research question. Some studies also validate trust and risk assumptions using canonical games even when the context of the experiments is different from the social behavior or farm risk management but is closer to other areas such as ecosystem services or natural resource management [32,49]. In brief, and although not perceived as "innovative," the frequency of canonical games suggest that they became a standard to support research and framed field experiments in agriculture.
Most games suffered from challenges in establishing whether their findings uncovered causal effects for the population and the specific context being study -that is, internal validity -as well as whether game behavior reflected real-world decision making in other contexts-that is, external validity. RCTs are often considered the gold standard for internal validity. [6]. A robust experimental design should also consider a power test to define the optimal sample size. Very few of the games analyzed fully addressed both aspects. Moreover, a significant portion of the experiments showed potential recruitment biases with little participation by rural women who are usually excluded from social gatherings and activities [3,58]. Spatial and temporal limitations to analyze long-run effects were also evident as games usually referred to one-time interventions with no repetition. Although some of these biases could be corrected by controlling for relevant characteristics in a statistical model (i.e., gender, age, ethnicity), the lack of heterogeneity and mechanisms to track participants over time reduced the internal validity of the experiments [6,59]. Because these limitations are not necessarily intrinsic to the methodology of the games, many of these issues could be addressed though game design and implementation in future literature. A relative advantage in many of the of games, however, was the use of real money to generate pay-offs that better mimic real contexts. Usually, but not all, participants received a payment in the studies that we reviewed. Some payments were randomly assigned to reduce the costs of the experiments [60]. Indeed, judging an experimental design requires understanding when the limitations compromise -or not-validity on a case-by-case basis. For instance, gender imbalance or specific cultural aspects might reflect, in practice, the real context where individual and collective decisions are made, and behaviors are shaped.
Similarly, the predictive value of the games' findings in a different context -the external validityshould be judged according to the diversity and complexity observed in agricultural systems around the world, and the policy and project goals associated with a specific game. Some papers show that behavior in games does not necessarily reflect behavior in the presence of actual programs, and insurance market games tend to overstate real demand preferences [27,41]. In contrast, there is also evidence that behavior in experimental settings is correlated with real-world behavior [17,[61][62][63][64]. The ambiguity in external validity raises questions regarding the need (and the costs) of generalizing conclusions and policy recommendations, and highlights the importance of considering the relative strengths and weaknesses of each study. Depending on the objective, it could be justifiable to sacrifice external validity and generalization in defense of context-based responses. In brief, there are clear opportunities to improve internal validity and identification strategies in most of the games. However, validity needs to be judged according to the context of each game. The gold standard should be the best method for the question at hand and not a universal approach [7].
Games in agriculture will benefit, in general, from larger, more frequent, and more inclusive experiments and data. Many games in agriculture are, however, not effective or efficient as a result of the costs of visiting remote villages and the limited number of facilitators and participants. Scalability will likely be improved in the near future with the increasing connectivity observed between humans and technologies with respect to agricultural knowledge and advice networks [65]. For example, researchers can use mobile applications (e.g., ODK) to input responses offline, which makes data cleaning and analysis more efficient [32]. More sophisticated tools have been developed, targeting extensionists and technicians that will eventually reach farmers [66]. Some initial pilots and proposals also point to the possibility of using gamification (defined as the use of game design elements in non-game contexts) in agriculture through websites, apps, and SMS [67-70]. These are important steps to expanding the toolkit of games in agriculture and to incorporating new technologies. However, the low level of digitalization in our sample shows that this integration is still underdeveloped compared with other sectors [71]. Certainly, the lack of investment and the fixed costs of new technologies can constrain the use of digital tools and games in agriculture in less developed regions. Nonetheless, 70 percent of the poorest 20 percent in low and middle-income countries have access to a mobile phone, and one in three people have internet access [72-74]. Although connectivity prevails in urban settings, it has spread to rural areas, where the ratio of farmers to extension workers exceeds 1000 to one [65,72-74]. And with increased physical distance restrictions from COVID-19, there is a need to reach and engage with farmers using innovative technologies. This opportunity might represent a transition for games in agriculture.
The scalability of framed field experiments and the integration of new technologies and games will require better connectivity conditions and researchers with the skills to make this happen. Similar to how the availability of massive data sets was critical to turning attention toward specific areas of labor and industrial organization in the past [1], games in the field will require researchers with knowledge of application development and remote data collection. Conducting games in the field demands managing complex sets of relationships between parties, the ability to present and communicate ideas to lay individuals, and the understanding of the value of framed field experiments for organizations. These characteristics define the value of researchers leading games and framed field experiments today [2,75].

Conclusions and future directions
The 21st century has witnessed a significant surge in studies that gather data via framed field experiments, and games in agriculture are no exception. There is an increased emphasis on agricultural sustainability in Africa, Asia, and Latin America as well as a higher frequency of cooperation and coordination games that reflect collective decisions for conservation and natural resource management. Other canonical games related to social preferences and individual behavior (e.g., trust and risk games) also remain prominent in the field. There remains ample opportunity to scale existing games and improve their validity, both by conducting games in multiple contexts and with multiple populations and by repeating games and monitoring populations over time. New technologies for data will play an essential role in improving identification strategies. Judging by the low but increasing percentage of games using digital tools in agriculture, it is possible that gamification will play a role in the way that games are designed and implemented. Emerging topics in conservation and climate risk will likely shape the agenda in the coming years. Applied economists and social scientists are called to be part of this challenging and exciting agenda by scaling existing studies, incentivizing participation of excluded populations with engaging and innovative games, and embracing the information and communication technologies of this century.
Author Contributions: Conceptualization, methodology, and software: JNHA, MM, AH, DO. All authors contributed to the review validation, formal analysis, investigation, writing, draft preparation, and editing. All authors have read and agreed to the published version of the manuscript.
Funding: This research received no external funding.

Balance indicators
Gender balance (yes/no/not reported); different ages (yes/no/not reported); different ethnicities (yes/no/not reported).

Q3: Is there evidence of any technological transition or evolution in the way that games and experiments have been performed and implemented in rural settings?
Setting: facilitator
Researchers leading workshops/games; extension service or local collaborators facilitating workshops/games; digital tool or information technology (IT) for end-users (farmers); digital tool or IT administered by local collaborators/extension services.

Setting: modality
Paper-based; cell phone/SMS; smartphone/apps; web-based software/tablets/computer.

Continued assessment
The game can eventually be shared by initial participants with other potential players in their communities (yes/no). There is an opportunity to continue data gathering after the game/experiment is introduced and performed for the first time (yes/no). The experiment/game facilitates feedback opportunities over time (yes/no).
In addition to these specific categories, the reviewers' analysis included: identification of research question, main hypothesis, major result, and additional comments.

Multiple rounds of the game = 1; multiple rounds of surveying = 1; no or not reported = 0
Incentives: actual monetary incentives = 1; simulated monetary incentives = 0; intrinsic incentives = NA
Validity assessment: low = 0; medium = 0.5; high = 1
Confounding factors: yes = 1; no = 0

Scope score = 1/2(Sample size + 1/3(Gender balance + Ethnic balance + Age balance)) + 1/2(Repetition + Incentives) + 1/2(Validity + Evident confounding factors).

The n is 49 because three papers were missing data on one or more variables, and it was not possible to calculate a scope score for them.
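
As an illustration only, the scope score formula above can be sketched in a few lines of Python. The function name `scope_score` and its argument names are hypothetical, and the sketch assumes that every component has already been coded on a 0 to 1 scale; the numeric coding of sample size and of the balance indicators is not given in this section and is an assumption here. Following the note that papers with missing data could not be scored, the function returns no score when any component is missing, which is also how this sketch treats an NA incentive value.

```python
from typing import Optional


def scope_score(
    sample_size: Optional[float],
    gender_balance: Optional[float],
    ethnic_balance: Optional[float],
    age_balance: Optional[float],
    repetition: Optional[float],
    incentives: Optional[float],
    validity: Optional[float],
    confounding: Optional[float],
) -> Optional[float]:
    """Hypothetical helper: combine coded components into a scope score.

    Assumption: each argument is already coded in [0, 1]
    (e.g. validity low = 0, medium = 0.5, high = 1).
    None stands for a missing or NA value.
    """
    components = [
        sample_size, gender_balance, ethnic_balance, age_balance,
        repetition, incentives, validity, confounding,
    ]
    # Assumption: a paper with any missing component gets no score.
    if any(c is None for c in components):
        return None

    # Average of the three balance indicators (the 1/3 term).
    balance = (gender_balance + ethnic_balance + age_balance) / 3

    # Three equally weighted blocks, each multiplied by 1/2.
    return (
        0.5 * (sample_size + balance)
        + 0.5 * (repetition + incentives)
        + 0.5 * (validity + confounding)
    )


# Example: a paper with an unbalanced ethnic sample and medium validity.
example = scope_score(1, 1, 0, 1, 1, 1, 0.5, 0)
```

Under these assumptions the score ranges from 0 to 3, since each of the three blocks contributes at most 1.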