How to discourage adversaries from affecting decision outcomes of a repeated patent application decision–making process

Outcomes of repeated decision–making processes may be affected by adversarial actors, without being noticed. Adversaries may try to gain knowledge about a particular decision–making process, identify its decision–makers, and guess which underlying decision support model is used. Then they can simulate the process, and craft different scenarios to affect its decision outcomes. Therefore, designers of decision support systems need to incorporate this in the decision modeling phase. The purpose of this study is to demonstrate this for the repeated decision–making in a patent application process. In this process, two sequential decision outcomes can be affected by adversarial actors: a company’s decision to which type of patent office to send a patent request to, and the decision of a specialized patent officer to grant an application, or not. It is motivated that the company’s decision–maker is bounded rational. A theory for information–theoretic bounded rational decision–making under uncertainty proposed by Ortega et al. is adopted to model this type of decision–maker. A framework is provided to simulate a number of scenarios that adversaries may deploy to affect decision outcomes of a repeated patent application decision–making process. The framework is also utilized for statistically testing the presence of the scenarios, and to demonstrate how to discourage adversaries from deploying them.


Introduction
Nowadays, companies and organizations increasingly use mathematical models to support their decision-making processes. However, in most cases, the actual decisions are still taken by humans. Adversaries may try to gain knowledge about a decision-making process and the decision criteria used, may try to guess the supporting mathematical model, may seek ways to degrade the performance of this model, and may try to influence decision-makers by presenting them with wrong insights from data. To make a supporting decision model less vulnerable to such adversarial influencing, one solution would be manual and ad hoc reconstruction of the decision support within the parameters of the decision model used, adapting the model to the adversary's evolving manipulations [1]. An area of research that focuses on the subfields of risk analysis and decision analysis is adversarial risk analysis (ARA). In ARA, one asserts that analysts should use Bayesian thinking to describe their beliefs about an opponent's goals, resources, optimism and type of strategic calculation, while placing subjective probability distributions on all unknown quantities, so that analysts can maximize their expected utilities [2]. Not all decision-making processes, however, are suitable for applying Bayesian thinking; the patent application decision-making process is one such case.
A patent application is a request pending at a patent office for the grant of a patent for an invention, described in a patent specification and a set of one or more claims stated in a formal document, including necessary official forms and related correspondence [3]. Companies and organizations normally use tools with patentability criteria, like a patenting decision tree and an additional machine learning system, to assist in determining which technological inventions should be patented [4]. Once the decision is made to apply for a patent, a company's intellectual property department evaluates the proposed technology (novelty) in relation to its patenting strategy. Depending on the outcome, a decision-maker X in a company's intellectual property department decides which geographic coverage a patent must have (country or region), and with which patent office a patent application has to be filed (decision A: country office or regional office). Lastly, the relevant documentation is provided to and examined by a responsible patent officer at the chosen patent office (i.e. decision-maker Y), who in a process of negotiating or arguing decides whether to grant a patent or not (decision B). So, a binary decision A made by decision-maker X is followed by X observing the outcome of a subsequent binary decision B taken by decision-maker Y. In X's deliberation process there is interaction with the environment in that he/she selects the choice alternative a according to some optimized probability distribution P_A(a), with a ∈ {regional, country}. This has a stochastic effect on the environment according to the probability distribution P(o|a), where o is X's observation of the outcome of decision B. Decision-maker X's resources to extensively evaluate all choice aspects of decision A are limited, and this limitation reduces X to a state of bounded rationality, a term coined by [5].
To model decision-maker X's decision-making, the theory for bounded rational decision-making under uncertainty developed by [6,7] is adopted. Put in a repeated patent application context, the resulting repeated bounded rational decision-making model (referred to here as model M 1 ) requires choosing the value of the so-called boundedness parameter and the value of a utility parameter.
In a patent application process, both of the binary decisions A and B are vulnerable to adversarial influencing. For example, decision-maker X may be an adversarial actor, or he/she may be influenced by an adversarial co-worker of the intellectual property department who is presenting wrong insights from data. On the patent office side of the process, decision-maker Y may be an adversary, or he/she may be manipulated by an adversarial co-worker. There is even the possibility that adversarial actors on both sides of the patent application process are closely cooperating. In the present study, a simulation framework is proposed to generate, for both binary decisions A and B, a sample of decision outcomes in complete absence of adversarial influencing, and an equally sized sample for the case in which some adversarial influencing scenario has been active in the same time window. By pairing corresponding samples, a two-sample (paired) proportions test can be conducted to test whether or not there is a significant difference between two proportions of the same decision outcomes. To be precise, an asymptotic McNemar test without continuity correction is used [8].
In the present study, six adversarial influencing scenarios have been defined and implemented: three different basic scenarios and three combinations of these scenarios. A measure has been introduced to express the attractiveness of an influencing scenario from the perspective of an adversarial actor. The measure is based on the observed average number of times the presence of a scenario can statistically be proven, and the observed average Cohen effect size [9] of the proven presences. A multi-objective optimization model (referred to as model M 2 ) is formulated to minimize the set of objective functions corresponding to the six considered scenarios, with regard to the boundedness parameter and utility parameter of model M 1 that are to be chosen. It has been made plausible that solving model M 2 for a time window yields the most favorable parameter value pair of model M 1 in this time window, and that implementing this parameter pair will make it less attractive for adversaries to deploy the six considered influencing scenarios in the time window.

Results
This section provides the results of the performed simulation study, and the conclusions that have been drawn. As stated in the introduction, the purpose of the study is to demonstrate how a repeated patent application decision-making process, on average, can be made less vulnerable to adversaries trying to affect its decision outcomes by deploying six considered influencing scenarios in some time window W. Three time windows are considered (1 year, 2 years, and 3 years). Mathematical details about these scenarios, the statistical test(s) conducted to test for their presence, the modeling of the repeated patent application decision-making process (i.e. model M 1 ), and the used simulation framework can be found in Section 3. Two parameters 0.30 < β d < 0.60 and 1.1 < U d (1, R) < 5.0 of model M 1 remain to be specified (see Subsection 3.2). A second model M 2 (see Subsection 3.8) is developed to determine the most favorable parameter pair (β d, * W , U d, * W (1, R)) for a time window W. In model M 2 , a set of objective functions is to be minimized with regard to the parameters β d W and U d W (1, R), where each objective function corresponds to a considered influencing scenario •. An objective function represents the attractiveness of the corresponding influencing scenario from the perspective of an adversarial actor, and requires two statistical quantities as input: the sample mean of positive test results µ ptr • (·,W) (with · = X or Y) and the associated sample mean Cohen distance µ ∆ Cohen,• (·,W) (see Subsection 3.7). Here X and Y correspond to the statistical test conducted for the decision outcomes of decision-maker X and Y, respectively (see Figure 9 below). To obtain these statistical quantities, 50 simulation runs with 50 sub-runs per simulation run were performed for each time window (see Subsection 3.6). 
Subsection 2.1 illustrates how the attractiveness of each of the three basic influencing scenarios • = 1, 2 and 3 in a time window depends on the behavior of the two statistical quantities, i.e. the sample mean ptr-score and the sample mean Cohen distance. All plots shown in this subsection were generated on a parameter grid in which β d W is ranging from 0.32 to 0.59 with steps of 0.02, and U d W (1, R) is ranging from 0.11 to 4.9 with steps of 0.1. The main contribution of the present study is to propose a mathematical model (M 1 ) for repeated patent application decision-making that inherently includes a second mathematical model (M 2 ) that takes into account that adversaries may guess the structure of model M 1 and its parametrization, and deploy six different crafted scenarios to affect its decision outcomes. Moreover, model M 2 provides a parametrization for model M 1 that makes it less attractive for adversaries to deploy the six scenarios. This is the subject of Subsection 2.2.
In total six adversarial influencing scenarios have been implemented in the proposed simulation framework, for a time window W of 1 year, 2 years, and 3 years. Three basic scenarios, denoted by • = 1, • = 2 and • = 3, and three combinations of these scenarios, denoted by • = 2 + 3, • = 1 + 2 and • = 1 + 3. Figure 1 and Figure 2 below show surface plots of the sample mean ptr-scores and sample mean Cohen distances on the z-axis for the basic influencing scenario • = 1 and the time windows W = 1 and W = 2, respectively. In the bumpy surface plots of the mean Cohen distance, the heights are almost similar for both time windows (between a medium effect 0.5 and a large effect 0.8). And the surface plots for the sample mean ptr-score show a relatively smooth landscape surface. The surfaces of both statistical quantities rise with increasing values of the grid parameters, where the heights in the upper right warm colored area of the sample mean ptr-score surface for time window W = 1 are considerably higher than those for time window W = 2. Hence, it does not seem to be attractive for an adversarial actor to deploy this scenario for a period longer than 1 year.  The computed mean power of the conducted statistical tests (i.e. McNemar tests) for the time windows are β = 0.96 ± 0.02 for W = 1 and β = 0.95 ± 0.02 for W = 2, so the power of the conducted McNemar tests is sufficient for security analysts of a patent applying company.

First inspection of the attractiveness of the three basic influencing scenarios
For the basic scenarios • = 2 and • = 3, only surface plots for the scenario option COW are shown (see Subsection 3.4.2 and Subsection 3.4.3). In this scenario option, an adversarial co-worker in a patent applying company's intellectual property department tries to affect decision outcomes of the company's decision-maker X (who is unaware of any adversarial influencing). Figure 3 and Figure 4 below show the surface plots for scenario option S •=2 COW and time window W = 1 and W = 2, respectively. For this scenario option, all surface plots show a smooth landscape. For both considered time windows, the landscape of the mean Cohen distance shows a warm colored area in the upper right corner, with a maximum that is even hot colored (i.e. values ≥ 0.8). For the time window W = 1, the warm colored area more or less coincides with the warm colored area of the sample mean ptr-scores, whereas the warm colored area of the sample mean ptr-scores for the time window W = 2 is much broader than is the case for W = 1. In addition, the values of the sample mean ptr-scores for the time window W = 2 are considerably higher than those for the time window W = 1. In the cooler areas of the surface plot for time window W = 1, however, there are areas with medium sample mean Cohen distances and relatively low sample mean ptr-scores. Hence, these areas might be attractive for an adversarial actor. The computed mean powers of the conducted McNemar tests are β = 0.65 ± 0.01 for the time window W = 1 and β = 0.72 ± 0.01 for the time window W = 2, so the conducted McNemar tests lack some power. Overall, this scenario option does not seem to be attractive for an adversarial actor to deploy for a period longer than 1 year. Figure 5 and Figure 6 below show the surface plots for the scenario option S •=3 COW and time windows W = 1 and W = 2, respectively.
Of the three basic influencing scenarios, scenario option S •=3 COW seems to be the most attractive for an adversarial actor to deploy, especially in the time window W = 1. This is mainly due to the low sample mean ptr-scores at this time window, even in the warm colored area of the surface. Unlike the other two basic scenarios, the sample mean Cohen distances in the warm area of this scenario are small (ranging from below 0.1 to below 0.4).
The computed mean power of the conducted McNemar tests are β = 0.62 ± 0.01 for the time window W = 1 and β = 0.69 ± 0.01 for the time window W = 2, so the conducted McNemar tests lack some power.
The above inspection makes clear that adversarial actors in their own simulation and analysis somehow need to make a trade-off between the sample mean ptr-score (i.e. the likelihood of occurrence of positive test results) and the sample mean Cohen distance (i.e. the expected effect of the scenario), in order to determine the attractiveness of an influencing scenario in a time window. Thereby taking into account that the values of both statistical quantities for each influencing scenario and considered time windows W strongly depend on the positions of the model M 1 's parameter pair (β d W , U d W (1, R)) in the surface landscape. From an adversarial risk analysis (ARA) perspective, of importance to company security analysts is to find the most favorable model parameter pair (β d, * W , U d, * W (1, R)) for each time window that makes it on average the least attractive for adversaries to deploy either of the six considered scenarios. For company security analysts, as well as for adversaries, it is also of concern to find out whether combining basic scenarios simply implies addition of their sample mean Cohen distances, or that combining may cause some form of cancelling out of sample mean Cohen distances, due to the mathematical structure of model M 1 . In other words, what is the most favorable parameter value pair of model M 1 for each time window for the set of six adversarial influencing scenarios? This is the subject of Subsection 2.2.

The most favorable M 1 model parameter pair for each time window
In model M 2 , a multi-objective optimization problem with a set of six attractiveness objective functions is formulated, in order to find the most favorable parameter value pair (β d, * W , U d, * W (1, R)). The NSGA-II evolutionary optimization method [11,12] is used to solve the optimization problem. This method yields a set of N pop favorable model parameter pairs for a time window W, where N pop is the population size used in the NSGA-II method. In the simulations N pop = 50, meaning that the method yields 50 favorable model parameter pairs. Associated with each such pair is an attractiveness value A •,W (see Subsection 3.7), and some favorable model parameter pairs in the above set may correspond with (very) high attractiveness values. By putting a threshold on the attractiveness value, such undesirable parameter pairs are dropped, which yields a reduced set. Figure 7 below shows an example of the frequency distributions of the attractiveness values corresponding to the reduced parameter pair set, for the scenario option COW. The figure reveals that the scenario • = 1, the scenario option S •=3 COW and the combined scenario COW can potentially do more harm to the decision outcomes of the repeated patent application decision-making process than scenario • = 2 can do, especially in the time window W = 1. The figure also reveals that the longer each scenario is deployed, the less harmful it is, and the less spread out the corresponding attractiveness values are (i.e. smaller bin sizes). This is due to getting more reliable statistics with growing numbers of patent requests in a time window. Figure 7 also reveals that the combination of scenario • = 1 with the scenario option S •=2 COW can potentially do considerably less harm than scenario • = 1 can do on its own. Something similar is the case when combining the scenario options S •=2 COW and S •=3 COW . The harm that the combination of scenario • = 1 with the scenario option S •=3 COW can do is unclear.

Figure 7. Frequency distributions of the attractiveness values for the three time windows and six considered scenarios, for the scenario option COW.
From the perspective of an adversarial actor, scenario • = 1 and scenario option S •=3 COW are potentially attractive on their own, even when considering favorable parameter pairs, whereas combining each of them with another scenario option reduces attractiveness. This is especially true for the time window W = 1. Based on all of the above findings, company security analysts have to apply some selection procedure to the reduced set of M 1 model parameter pairs, in order to arrive at the single most favorable model parameter pair (β d, * W , U d, * W (1, R)) for a time window (see Subsection 3.8). This selection procedure is company specific and may therefore be hard for adversaries to guess. Table 1 below shows an example of the selected most favorable M 1 model parameter pair for each time window. The coordinates of the selected most favorable parameter pairs are in agreement with the inspection results described in Subsection 2.1, based on inspecting surface plots.
Figure 8. Frequency distributions of the attractiveness values for the three time windows and six considered scenarios, for the scenario option X.
Figure 8 reveals that the frequency distributions of the attractiveness values for scenario option X resemble those of scenario option COW, except that they are less spread out and most frequency mass has shifted to the left. The frequency distribution for the scenario • = 1 is slightly more spread out than is the case for scenario option COW, within the normal statistical variation. As is the case for scenario option COW, the scenario • = 1 and scenario option S •=3 X are potentially more harmful than scenario option S •=2 X , and combinations of them with another scenario option reduce their attractiveness. Table 2 below shows an example of the selected most favorable M 1 model parameter pairs for each time window for scenario option X.

Materials and Methods
As stated in the introduction, the probabilistic scenario of the patent application decision-making process is not Bayesian, in that the company decision-maker X selects a choice alternative (decision A: region office or country office) before observing the outcome of decision B (patent request granted or not granted). Moreover, decision-maker X is in fact a bounded rational decision-maker. Subsection 3.1 briefly formalizes the theory of Ortega et al. that is used to model a bounded rational decision-maker for the above probabilistic scenario, and also provides an example.

3.1. Formalization of the theory of Ortega et al. for the probabilistic scenario of the patent application decision-making process
In real-world decision problems, a decision-maker does not always have enough resources to exhaustively evaluate all aspects of each choice alternative of a decision. Ortega et al. have shown that this limitation changes a decision problem in a fundamental way. Their theory first requires defining a finite outcome space $\mathcal{X}$ whose elements x combine a choice alternative $a \in \mathcal{A}$ and an observation $o \in \mathcal{O}$, where $\mathcal{A}$ is a finite space of choice alternatives and $\mathcal{O}$ is a finite space of possible observations. Furthermore, in the probabilistic scenario in which the decision-maker first selects a choice alternative a and then observes the stochastic state of the world o, the theory conceptualizes a decision-maker's deliberation and planning process as follows. The decision-maker first chooses what Ortega et al. call a prior decision policy, i.e. a probability distribution $P_{0,\mathcal{X}}(x)$, and then transforms this policy into what they call a posterior decision policy, i.e. a probability distribution $P_{\mathcal{X}}(x)$. During this transformation process, the decision-maker is not allowed to reason about the costs of transforming a prior decision policy into a posterior decision policy. Furthermore, $U : \mathcal{X} \to \mathbb{R}$ is a real-valued mapping of the outcomes, called the utility function.
The decision maker's goal is to find the optimal posterior decision policy $P^*_{\mathcal{X}}(x)$ by optimizing this utility function over the probability distribution $P_{\mathcal{X}}(x)$, while facing limited information processing resources in the deliberation and planning process. Ortega et al. showed that, to an outsider, this limitedness will appear as if the decision-maker were explicitly optimizing the objective function $-\Delta F_\beta[P]$, known as the functional for negative free energy difference due to its origin in thermodynamics:

$$-\Delta F_\beta[P] = \sum_{x \in \mathcal{X}} P(x)\, U(x) \;-\; \frac{1}{\beta} \sum_{x \in \mathcal{X}} P(x) \log \frac{P(x)}{P_0(x)}. \qquad (2)$$

The second term in the formula expresses information cost due to limited resources measured in units of utility (i.e., utiles), and the boundedness parameter $\beta \in \mathbb{R}$ acts as a conversion factor between units of information and utiles. The functional in Equation (2) expresses information-theoretic bounded rationality as a tradeoff between utilities and information cost, and reflects the decision-maker's net utility. In the literature, this cost term also goes under other names, such as KL-control cost, and has been motivated in numerous ways [7,10]. The boundedness parameter not only acts as a conversion factor, but also scales how far $P(x)$ can deviate from $P_0(x)$, measured in terms of the KL-divergence. The parameter therefore controls how much a decision maker is in control of the action of selecting a choice alternative (see the limit cases in Table 3 below).

Table 3. Limit cases of the boundedness parameter.
Limit case | Actions
β → ∞ | Perfectly rational decision maker with unlimited resources
β → 0 | Decision maker without resources, who simply selects an action according to the prior decision policy $P_0(x)$
β → −∞ | Perfectly anti-rational decision maker, who always selects the action with the worst outcome
To find the bounded rational optimal posterior decision policy, Ortega et al. have formulated a variational principle for maximizing the functional over probability distributions $P(x)$. The general solution of this variational principle is the optimal posterior decision policy:

$$P^*(x) = \frac{1}{Z}\, P_0(x)\, e^{\beta U(x)}, \qquad Z = \sum_{x' \in \mathcal{X}} P_0(x')\, e^{\beta U(x')}.$$

For the probabilistic scenario in the decision-making of the patent application process (i.e. choice alternative selection decision A before observation of the decision outcome of decision B), the particular optimal solution over a finite action space $\mathcal{A}$ becomes:

$$P^*_A(a) = \frac{1}{Z}\, P_{0,A}(a)\, e^{\beta\, \mathbb{E}[U|a]}, \qquad Z = \sum_{a' \in \mathcal{A}} P_{0,A}(a')\, e^{\beta\, \mathbb{E}[U|a']}.$$

The reader can find the derivation of this formula in Appendix A.
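The role of the boundedness parameter in the limit cases of Table 3 can be illustrated numerically. The following is a minimal sketch, assuming the exponential-family form of the optimal posterior decision policy (prior weight times exp(β·U), normalized by the partition function); the three outcomes, their prior probabilities and their utilities are hypothetical illustration values, not taken from the study, and large finite values of β stand in for the limits β → ±∞.

```python
import math

def posterior_policy(prior, utility, beta):
    """Return P*(x) proportional to P0(x) * exp(beta * U(x))."""
    exponents = {x: beta * utility[x] for x in prior}
    shift = max(exponents.values())  # subtract the maximum for numerical stability
    weights = {x: prior[x] * math.exp(exponents[x] - shift) for x in prior}
    z = sum(weights.values())  # partition function (up to the common shift)
    return {x: w / z for x, w in weights.items()}

# Hypothetical outcome space with three outcomes (illustration only).
prior = {"x1": 0.5, "x2": 0.3, "x3": 0.2}
utility = {"x1": 1.0, "x2": 2.0, "x3": 3.0}

rational = posterior_policy(prior, utility, beta=50.0)        # mass on best outcome x3
prior_like = posterior_policy(prior, utility, beta=0.0)       # equals the prior policy
anti_rational = posterior_policy(prior, utility, beta=-50.0)  # mass on worst outcome x1

for name, policy in [("beta=50", rational), ("beta=0", prior_like), ("beta=-50", anti_rational)]:
    print(name, {x: round(p, 3) for x, p in policy.items()})
```

For β = 0 the posterior coincides with the prior, while for large positive (negative) β nearly all probability mass moves to the outcome with the highest (lowest) utility.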
From here on, the case of decision-making in complete absence of adversarial influencing is referred to as the default case, denoted by the superscript d. In addition, R and C represent the choice alternatives "region office" and "country office", respectively, and 0 and 1 represent the decision outcomes "patent request not granted" and "patent request granted", respectively.

Example 1. Let $\mathcal{A} = \{R, C\}$ and $\mathcal{O} = \{0, 1\}$. Let the utility function be defined as $U^d(0, R) = U^d(0, C) = 0$ if a patent has not been granted, and $U^d(1, R) = 45$ and $U^d(1, C) = 1$ if a patent has been granted. Let $P^d(0|R) = 0.92 = 1 - P^d(1|R)$ be the probability of a patent not being granted by a regional office in the default case, and $P^d(0|C) = 0.2 = 1 - P^d(1|C)$ the probability of a patent not being granted by a country office. Let the prior decision policy that the patent will be requested at a regional office be given by $P^d_{0,A}(a = R) = 0.1 = 1 - P^d_{0,A}(a = C)$. Choose a value for the boundedness parameter $\beta^d$ (here $\beta^d = 1$), representing how much the bounded rational decision maker X is in control of selecting choice alternative R or C. Now, the optimal decision policy becomes:

$$P^{d,*}_A(a; \beta^d) = \frac{1}{Z^d}\, P^d_{0,A}(a)\, e^{\beta^d\, \mathbb{E}^d[U|a]},$$

where $\mathbb{E}^d[U|a] = \sum_{o \in \mathcal{O}} P^d(o|a)\, U^d(o, a)$ is the expected utility in case choice action a is selected, and $Z^d = \sum_{a \in \mathcal{A}} P^d_{0,A}(a)\, e^{\beta^d\, \mathbb{E}^d[U|a]}$ is the partition function. With $\mathbb{E}^d[U|R] = 0.08 \cdot 45 = 3.6$ and $\mathbb{E}^d[U|C] = 0.8 \cdot 1 = 0.8$, this gives

$$P^{d,*}_A(a = R; \beta^d = 1) = \frac{0.1\, e^{3.6}}{0.1\, e^{3.6} + 0.9\, e^{0.8}} = 0.65 \quad \text{and} \quad P^{d,*}_A(a = C; \beta^d = 1) = 0.35.$$

A decision function based on this optimal decision policy is used to determine the outcome of decision A.

Preprints (www.preprints.org) | NOT PEER-REVIEWED | Posted: 28 June 2021 doi:10.20944/preprints202106.0676.v1

As expected, with differences of utility values for both types of patent offices as above, decision maker X adapts his/her strategy according to the optimal decision policy and decision rule, and decides to send the patent request to a regional patent office.
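The numbers in Example 1 can be checked with a few lines of code. The sketch below is illustrative only: the function name `optimal_policy` is hypothetical, and the prior values 0.1 and 0.9 are the weights that appear in the partition function of Example 1.

```python
import math

def optimal_policy(prior, p_grant, u_grant, beta):
    """Bounded rational policy over the choice alternatives.

    The utility of a non-granted request is 0 (as in Example 1), so the
    expected utility of alternative a is P(1|a) * U(1, a).
    """
    expected_u = {a: p_grant[a] * u_grant[a] for a in prior}
    weights = {a: prior[a] * math.exp(beta * expected_u[a]) for a in prior}
    z = sum(weights.values())  # partition function Z
    return {a: w / z for a, w in weights.items()}

prior = {"R": 0.1, "C": 0.9}      # prior decision policy
p_grant = {"R": 0.08, "C": 0.8}   # P(1|R) = 1 - 0.92, P(1|C) = 1 - 0.2
u_grant = {"R": 45.0, "C": 1.0}   # U(1, R), U(1, C)

policy = optimal_policy(prior, p_grant, u_grant, beta=1.0)
print({a: round(p, 2) for a, p in policy.items()})  # {'R': 0.65, 'C': 0.35}
```

Although the prior strongly favors the country office, the much higher utility of a regional grant shifts the optimal policy towards R.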

Repeated patent application decision-making
Repeated patent application decision-making involves repeating the decision-making process that is described in Subsection 3.1. Let r denote an individual patent request. To capture variety in repeated patent requests in the simulation study, the following bounded rational decision-making model will be applied in the default case: where the values of the parameters β d and U d (1, R) remain to be specified (as stated in the introduction), and randint(a, b) represents a uniform drawing from the interval [a, b].
The quotient $U^{d,r}(1,R) / U^{d,r}(1,C)$ expresses decision maker X's preference with regard to how much more important a grant at the regional office R is for a patent request r than a grant at a country office C. By lowering the boundedness parameter $\beta^{d,r}$ compared to $\beta^d$, decision maker X admits to having less control over selecting choice alternative R or C for patent request r. By raising the value of the prior decision policy parameter $P^{d,r}_{0,A}(a = R)$ compared to 0.43, decision maker X is more confident that patent request r will be granted by a patent officer at regional office R. By raising the utility value $U^{d,r}(1, C)$ compared to the value $U^d(1, C)$, decision maker X lowers his/her preference for getting the patent request granted at office R.

Observed proportions of patent application decision-making outcomes in the default case
The repeated decision-making process described in Subsection 3.2 yields two equally sized time ordered sequences (i.e. samples) of $N(W) \in \mathbb{N}$, $N(W) > 1$, decision outcomes from individual patent requests r in a time window W (expressed in years): a sample of R/C outcomes (from decision A) and a sample of 0/1 outcomes (from decision B). This study focuses on decision A outcomes R, and subsequent conditional decision B outcomes 1|R, and defines two observed unaffected proportions of decision outcomes R and 1|R for the default case, denoted $p^d_R(W; \beta^d, U^d(1, R))$ and $p^d_{1|R}(W; \beta^d, U^d(1, R))$.

Observed proportions of adversarial influenced patent application decision-making outcomes
In this study, six scenarios are defined by which adversarial actors may negatively influence patent application process decision-making outcomes, compared to the default case. An adversarial influencing scenario is denoted by the superscript •, followed by the number of the scenario.

Adversarial influencing scenario • = 1
An adversarial specialized patent officer in the regional patent office is able to create the opportunity to assess all the patent requests that are sent to the office by a company. The patent officer knows the observed granting chance $p^{d,r}(1|R)$ that decision maker X in the company is counting on for the sent patent requests in the default case, and tries to negatively influence the value of this chance, without raising suspicion. He/she first determines the lowest number of patent requests $N_{lowest} \in \mathbb{N}$ that will approximately result in the chance value $p^{d,r}(1|R)$ if just one of the $N_{lowest}$ patent requests is granted, by rounding down to the nearest integer:

$$N_{lowest} = \left\lfloor \frac{1}{p^{d,r}(1|R)} \right\rfloor.$$

Suppose $p^{d,r}(1|R) = 0.40$; then $N_{lowest} = \lfloor 1/0.40 \rfloor = 2$. This means that with just one granted patent on $N_{lowest} + 1$ patent requests, the resulting adversarially influenced granting chance is $p^{\bullet=1,r}(1|R) = \frac{1}{N_{lowest}+1} = 0.33$. So, the officer's strategy is to not grant $N_{lowest}$ patents on every $N_{lowest} + 1$ patent requests. This strategy takes into account that it may be hard for company security analysts to prove that there is a statistically significant difference between the observed unaffected proportion $p^d_{1|R}(W; \beta^d, U^d(1, R))$, defined in Equation (7), and the affected proportion $p^{\bullet=1}_{1|R}(W; \beta^d, U^d(1, R))$.
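The officer's calculation can be sketched as follows. The function names are hypothetical, and the choice to grant the last request of each block of N lowest + 1 requests is an assumption made purely for illustration; the text only fixes that one request per block is granted.

```python
import math

def officer_strategy(p_default):
    """Return (N_lowest, influenced granting chance) for a default granting chance."""
    n_lowest = math.floor(1.0 / p_default)   # round 1/p down to the nearest integer
    p_influenced = 1.0 / (n_lowest + 1)      # one grant per N_lowest + 1 requests
    return n_lowest, p_influenced

def grant_pattern(n_requests, n_lowest):
    """0/1 grant decisions: one grant in every block of n_lowest + 1 requests.

    Assumption: the granted request is the last one of each block.
    """
    block = n_lowest + 1
    return [1 if (i + 1) % block == 0 else 0 for i in range(n_requests)]

n_lowest, p_influenced = officer_strategy(0.40)
pattern = grant_pattern(12, n_lowest)
print(n_lowest, round(p_influenced, 2), sum(pattern) / len(pattern))
```

With a default granting chance of 0.40, the observed granting proportion under this strategy drops to one third, a modest shift that may be hard to detect in small samples.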

Adversarial influencing scenario • = 2
In this scenario, company decision maker X's decision A for individual patent requests r is influenced by either one of the two scenario options below:
• Scenario option $S^{\bullet=2}_{COW}$: An adversarial co-worker in the company's intellectual property department tries to persuade decision maker X to raise the value of the utility component $U^{d,r}(1, C)$ for a patent request by an integer value, with X being unaware of this.
• Scenario option $S^{\bullet=2}_{X}$: Decision maker X is the adversarial actor and raises the value of the utility component $U^{d,r}(1, C)$ for a patent request by an integer value himself/herself.
Here, $v_r$ denotes the integer value by which $U^{d,r}(1, C)$ is raised for patent request r. In scenario option $S^{\bullet=2}_{COW}$, the value of $v_r$ is drawn from the distribution $(P(v_r = 0) = 0.50, P(v_r = 1) = 0.30, P(v_r = 2) = 0.15, P(v_r = 3) = 0.05)$, meaning that the adversarial co-worker has a 50% chance that decision maker X is willing to accept a proposed raise of $U^{d,r}(1, C)$. In scenario option $S^{\bullet=2}_{X}$, the value of $v_r$ is drawn from the distribution $(P(v_r = 0) = 0.20, P(v_r = 1) = 0.45, P(v_r = 2) = 0.27, P(v_r = 3) = 0.08)$. Raising the value of $U^{d,r}(1, C)$ leads to a value of the chance component $P^{\bullet=2,r,*}_A(a = R; \beta^d, U^d(1, R))$ that is lower than the value of the chance component $P^{d,r,*}_A(a = R; \beta^d, U^d(1, R))$, which makes it more likely that the number of decision outcomes R will drop. Therefore, the affected observed proportion of decision A outcomes R is expected to be lower than the corresponding observed unaffected proportion. Though this scenario does not affect repeated decision B outcomes, in simulation runs the observed affected proportion of decision B outcomes 1|R may well differ from the corresponding observed unaffected proportion.

3.4.3. Adversarial influencing scenario • = 3
In this scenario, company decision maker X's decision A for individual patent requests r is influenced by either one of the two scenario options below:
• Scenario option $S^{\bullet=3}_{COW}$: An adversarial co-worker in the company's intellectual property department tries to persuade decision maker X to decrease the value of the boundedness parameter $\beta^{d,r}$ for a patent request by some percentage, with X being unaware of this.
• Scenario option $S^{\bullet=3}_{X}$: Decision maker X is the adversarial actor and decreases the value of the boundedness parameter $\beta^{d,r}$ for a patent request by some percentage himself/herself.
To capture both scenario options, $\beta^{d,r}$ is decreased by $p_r$ percent, with $p_r$ being a drawing from the distribution $(P(p_r = 0) = p_0, P(p_r = 30) = p_1, P(p_r = 40) = p_2)$. In scenario option $S^{\bullet=3}_{COW}$, the value of $p_r$ is drawn from the distribution $(P(p_r = 0) = 0.40, P(p_r = 30) = 0.40, P(p_r = 40) = 0.20)$, and in scenario option $S^{\bullet=3}_{X}$ from the distribution $(P(p_r = 0) = 0.20, P(p_r = 30) = 0.50, P(p_r = 40) = 0.30)$. A decrease of $\beta^{d,r}$ may lower the value of the chance component $P^{\bullet=3,r,*}_A(a = R; \beta^d, U^d(1, R))$, and may result in a lower number of decision A outcomes R. Therefore, the affected observed proportion of decision A outcomes R is expected to be lower than the corresponding observed unaffected proportion. As is the case for influencing scenario • = 2, this scenario does not affect repeated decision B outcomes. However, in simulation runs the observed affected proportion of decision B outcomes 1|R may well differ from the corresponding observed unaffected proportion.

3.4.4. Combined influencing scenario • = 2 + 3
In this combined influencing scenario, either the combination of scenario options $S^{\bullet=2}_{COW}$ and $S^{\bullet=3}_{COW}$ is active, or the combination of scenario options $S^{\bullet=2}_{X}$ and $S^{\bullet=3}_{X}$. Combining the individual influencing scenarios offers the adversarial actor the advantage that smaller value changes of $v_r$ and $p_r$ may be more effective. However, the risk of exposure may be higher than in case a single influencing scenario is deployed. The definitions of the two observed affected proportions for this scenario are identical to the definitions for the individual scenarios • = 2 and • = 3.
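The effect of the company-side scenarios on the chance of choosing the regional office can be sketched numerically. This is a rough illustration under stated assumptions, not the study's simulation framework: the raise of the utility component is modeled as additive, the decrease of the boundedness parameter as a multiplication by (1 − p_r/100), and the default values (prior 0.1, granting chances 0.08 and 0.8, utilities 45 and 1, β = 1) are borrowed from Example 1. All function names are hypothetical.

```python
import math
import random

# Draw distributions of scenario option COW, as stated in the text.
V_DIST_COW = {0: 0.50, 1: 0.30, 2: 0.15, 3: 0.05}   # integer raise of U(1, C)
P_DIST_COW = {0: 0.40, 30: 0.40, 40: 0.20}          # percentage decrease of beta

# Default-case values borrowed from Example 1 (assumption for this sketch).
PRIOR_R = 0.1
P_GRANT = {"R": 0.08, "C": 0.8}
U_GRANT = {"R": 45.0, "C": 1.0}

def prob_regional(beta, u_grant):
    """Optimal policy value P*(a = R) for given beta and grant utilities."""
    prior = {"R": PRIOR_R, "C": 1.0 - PRIOR_R}
    w = {a: prior[a] * math.exp(beta * P_GRANT[a] * u_grant[a]) for a in prior}
    return w["R"] / (w["R"] + w["C"])

def draw(dist, rng):
    """Draw one value from a discrete distribution given as {value: probability}."""
    values, probs = zip(*dist.items())
    return rng.choices(values, weights=probs, k=1)[0]

def influenced_prob_regional(rng, beta=1.0):
    """Apply the combined scenario 2 + 3 (option COW) to a single patent request."""
    v_r = draw(V_DIST_COW, rng)
    p_r = draw(P_DIST_COW, rng)
    u = dict(U_GRANT)
    u["C"] += v_r                          # assumption: additive raise of U(1, C)
    beta_r = beta * (1.0 - p_r / 100.0)    # assumption: beta decreased by p_r percent
    return v_r, p_r, prob_regional(beta_r, u)

rng = random.Random(0)
baseline = prob_regional(1.0, U_GRANT)
samples = [influenced_prob_regional(rng) for _ in range(1000)]
mean_influenced = sum(p for _, _, p in samples) / len(samples)
print(round(baseline, 3), round(mean_influenced, 3))
```

Under these assumptions every draw leaves the chance of choosing R at or below its default value, which is consistent with the expected drop in the number of decision A outcomes R.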

Combined influencing scenarios
In these two combined scenarios, the patent office-side adversarial actor and the company-side adversarial actor cooperate. Scenario $\bullet = 1 + 2$ is a combination of scenario $\bullet = 1$ with either scenario option $S^{\bullet=2}_{COW}$ or scenario option $S^{\bullet=2}_{X}$, and scenario $\bullet = 1 + 3$ is a combination of scenario $\bullet = 1$ with either scenario option $S^{\bullet=3}_{COW}$ or scenario option $S^{\bullet=3}_{X}$. If the company-side adversarial actor succeeds in lowering the number of patent requests that are sent to the regional patent office, then it may be statistically harder for company security analysts to test for the presence of scenario $\bullet = 1$, deployed by the patent office-side adversarial actor.

Testing for the presence of an adversarial influencing scenario
To find out whether or not an adversarial influencing scenario $\bullet$ has been active on decision A outcomes in a time window $W$, a paired proportions test will be conducted, namely the asymptotic McNemar test without continuity correction [8]. The distinguishable case outcomes for decision A are captured by the 2×2 contingency table shown in Table 4 below, where $(n_{11}, n_{12}, n_{21}, n_{22})$ denotes a combination of outcome pairs on a total of $N(W)$ pairs.

Table 4. 2×2 contingency table of paired decision A outcomes (rows: default case $d$; columns: case in which influencing scenario $\bullet$ is active).

| Default case $d$ | Scenario $\bullet$ active: Outcome R | Scenario $\bullet$ active: Outcome C | Totals |
|---|---|---|---|
| Outcome R | $n_{11}$ | $n_{12}$ | $n_{11} + n_{12}$ |
| Outcome C | $n_{21}$ | $n_{22}$ | $n_{21} + n_{22}$ |
| Totals | $n_{11} + n_{21}$ | $n_{12} + n_{22}$ | $N(W)$ |

The McNemar test procedure that is followed in the simulation study is explained below by means of Example 2.
Example 2. Suppose the simulation framework has generated $N(W) = 11$ pairs of binary decision outcomes R/C for the default case and for the case in which adversarial influencing scenario $\bullet$ has been deployed, with $(n_{11}, n_{12}, n_{21}, n_{22}) = (6, 3, 0, 2)$ being the generated combination of outcome pairs. This corresponds to the observed proportions $p^{d}_R(W; \beta^d, U^d(1, R)) = \frac{n_{11} + n_{12}}{N(W)} = \frac{9}{11} = 0.818$ and $p^{\bullet}_R(W; \beta^d, U^d(1, R)) = \frac{n_{11} + n_{21}}{N(W)} = \frac{6}{11} = 0.545$. Do these proportions differ significantly from each other, under the null hypothesis $H_0$ that the proportions of R-outcomes in the population are equal for both cases?
Because $n_{12} + n_{21} = 3 + 0 = 3 < 20$, the exact binomial version of the McNemar test is conducted; otherwise the normal ($\chi^2_1$) approximation would be used. Choose a significance level (here $\alpha = 0.05$), and let a software package compute the McNemar score statistic $M = \frac{n_{12} - n_{21}}{\sqrt{n_{12} + n_{21}}}$ and the two-tailed P-value under $H_0$, using the exact binomial distribution. The computed (two-sided) P-value is equal to 0.25. Because this value is greater than $\alpha = 0.05$, we fail to reject $H_0$ and assume there is no significant difference between the proportions.
To ensure that the conducted McNemar test does not lack sufficient power to demonstrate adversarial influencing for small and moderate sample sizes, as well as for larger sample sizes, a power analysis will be performed by means of a software package. The power value $\beta$ will be computed given the sample size $N(W) = 11$, the significance level $\alpha = 0.05$ and the observed effect size $\Delta(W; \beta^d, U^d(1, R))$. The computed power value is $\beta = 0.701$. In the performed simulation experiments, the calculated power value should not be far from the value 0.8, which is normally imposed on a statistical hypothesis test.
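The numbers of Example 2 can be reproduced with a short standard-library sketch. It is an illustration only, not the software package used in the study: the function name `mcnemar_exact` is hypothetical, and the two-sided P-value is assumed to be twice the smaller binomial tail with success probability 0.5, capped at 1.

```python
from math import comb, sqrt


def mcnemar_exact(n12, n21):
    """Return (score statistic M, two-sided exact binomial P-value).

    Under H0 the discordant pairs follow Binomial(n12 + n21, 0.5).
    """
    n = n12 + n21
    if n == 0:
        return 0.0, 1.0  # no discordant pairs: no evidence against H0
    m_stat = (n12 - n21) / sqrt(n)
    k = min(n12, n21)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return m_stat, min(1.0, 2.0 * tail)


# Outcome pairs of Example 2.
n11, n12, n21, n22 = 6, 3, 0, 2
total = n11 + n12 + n21 + n22

p_default = (n11 + n12) / total   # observed unaffected proportion
p_affected = (n11 + n21) / total  # observed affected proportion
m_stat, p_value = mcnemar_exact(n12, n21)

alpha = 0.05
print(round(p_default, 3), round(p_affected, 3))
print(round(m_stat, 3), p_value, "reject H0" if p_value < alpha else "fail to reject H0")
```

Only the discordant counts $n_{12}$ and $n_{21}$ enter the test, which is why a small number of discordant pairs limits the attainable significance regardless of $N(W)$.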
Instead of using the raw difference between the two proportions as effect size, the Cohen difference will be used, defined as the absolute difference between the arcsine-root-transformed values of the proportions [9], i.e. $\Delta_{Cohen} = \left| 2 \arcsin\sqrt{p^{d}_R} - 2 \arcsin\sqrt{p^{\bullet}_R} \right|$. Cohen suggested that $\Delta_{Cohen} = 0.2$ can be considered a small effect size, $\Delta_{Cohen} = 0.5$ a medium effect size and $\Delta_{Cohen} = 0.8$ a large effect size. This means that if two proportions do not differ by 0.2 (threshold) standard deviations or more, the difference is trivial, even if it is statistically significant. Here, the Cohen difference is equal to 0.599, representing a more than medium effect.
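The effect size reported for Example 2 can be checked with the following sketch, which applies the arcsine-root transformation to the two observed proportions 9/11 and 6/11. The function names and the labelling helper are hypothetical; the thresholds are the ones attributed to Cohen above.

```python
from math import asin, sqrt


def cohen_difference(p1, p2):
    """Absolute difference of the arcsine-root-transformed proportions."""
    return abs(2.0 * asin(sqrt(p1)) - 2.0 * asin(sqrt(p2)))


def effect_label(delta):
    """Name the largest of Cohen's thresholds that delta reaches."""
    if delta >= 0.8:
        return "large"
    if delta >= 0.5:
        return "medium"
    if delta >= 0.2:
        return "small"
    return "trivial"


delta = cohen_difference(9 / 11, 6 / 11)
print(round(delta, 3), effect_label(delta))
```

The transformation stabilizes the variance of a proportion, so the same difference is treated alike whether the proportions lie near 0.5 or near the boundaries.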
In a similar way as for decision A outcomes, the McNemar test can be conducted for decision B outcomes, that is, for the paired observed unaffected and affected proportions of $1|R$-outcomes, under the null hypothesis $H_0$ that the proportions of $1|R$-outcomes in the population are equal for both cases. The two McNemar tests for decision A and B will be referred to as MNT(X, W) and MNT(Y, W), respectively (see the test setup shown in Figure 9 below).
[Figure 9: test setup for the McNemar tests MNT(X, W) and MNT(Y, W)]

Building test statistics in the simulation study
In order to statistically examine the presence of a specific adversarial influencing scenario $\bullet$ in the three considered time windows in more detail, test statistics need to be built in the simulation study. Therefore, $N_{sim} = 50$ simulation runs with $N_s = 50$ sub-runs $s$ per simulation run will be performed for each time window. For each simulation run, the number of patent requests in an individual sub-run for the time window, $N^{(s)}(W)$, will be determined by the draw $N^{(s)}(W) \sim W(2 + \mathrm{randint}(1, 4))$. From these runs, statistical quantities are built for test MNT(X, W) and for test MNT(Y, W). Though not shown in the notation, all these statistical quantities are parameterized by the parameter pair $(\beta^d, U^d(1, R))$, because they are based on an observed unaffected and affected proportion pair and each proportion in this pair is parameterized by $\beta^d$ and $U^d(1, R)$.
Based on this directive and on Table 5, the attractiveness $A_{\bullet,W}$ of a scenario (option) in a time window is defined by Equation (15), where $\mu^{ptr}_{\bullet} = 0$, and $\epsilon \in \mathbb{R}$ is a small factor to prevent dividing by zero. The higher the value $A_{\bullet,W}$, the more attractive the scenario (option)/time window combination is for an adversarial actor, and the higher the risk for a patent requesting company. Though not shown in the notation, the attractiveness value $A_{\bullet,W}$ is parameterized by the parameter pair $(\beta^d, U^d(1, R))$, because the statistical quantities on the right side of Equation (15) are parameterized by this parameter pair (see Subsection 3.6).

3.8. Procedure for making patent application decision-making outcomes on average less vulnerable to negative adversarial influencing
As stated before, the values of the parameters $\beta^d$ and $U^d(1, R)$ of model $M_1$, defined in Equation (6), remain to be specified. Instead of choosing some pair of parameter values within the specified bounds, a procedure is provided to determine the most favorable parameter values in a time window with regard to the six adversarial influencing scenarios. By considering the two parameters as variables and by formulating the attractiveness objective function defined in Equation (15) for each of the six considered influencing scenarios, the multi-objective optimization problem stated below is used to determine a set of pairs of optimal parameter values. On this set, a refinement procedure will be applied to determine the most favorable parameter pair $(\beta^{d,*}_W, U^{d,*}_W(1, R))$ for a time window $W$.

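Model $M_2$:
Multi-objective optimization problem: minimize the attractiveness $A_{\bullet,W}$ of Equation (15) simultaneously for each of the six considered influencing scenarios, over the parameter pair $(\beta^d, U^d(1, R))$ within its specified bounds. (16)
Selection procedure: A selection procedure will be applied on the output set, in order to arrive at a single most favorable pair of optimal parameters $(\beta^{d,*}_W, U^{d,*}_W(1, R))$ for the time window.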
For each time window value $W$, the evolutionary optimization method NSGA-II [11,12] will be applied to Equation (16), with a population size of $N_{pop} = 50$ and $N_{gen} = 30$ generations as a termination criterion. This results in a set of $1 \le N_{pairs} \le N_{pop}$ favorable parameter pairs $(\beta^d, U^d(1, R))$, from which a single most favorable parameter value pair $(\beta^{d,*}_W, U^{d,*}_W(1, R))$ is selected. It is expected that implementing the latter parameter pair in model $M_1$ for a time window will discourage adversaries from deploying one of the six considered scenarios, especially because it is hard for them to guess not only the underlying mathematical decision support model a patent applying company is using, but also the most favorable parameter pair the company has selected.
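The step from a set of favorable parameter pairs to a single selected pair can be illustrated as follows. This is not NSGA-II itself, only the non-dominated filtering at its core, followed by one possible selection rule. Everything specific here is an assumption made for illustration: the candidate pairs, the objective values (three instead of six, for brevity), the minimization direction, and the worst-case selection rule, since the study's own selection procedure is not spelled out in this section.

```python
def dominates(f, g):
    """True if f is no worse than g in every objective and better in one
    (all objectives are minimized)."""
    return all(a <= b for a, b in zip(f, g)) and any(a < b for a, b in zip(f, g))


def non_dominated(candidates):
    """Keep the (parameters, objectives) entries no other entry dominates."""
    front = []
    for i, (params, f) in enumerate(candidates):
        others = (g for j, (_, g) in enumerate(candidates) if j != i)
        if not any(dominates(g, f) for g in others):
            front.append((params, f))
    return front


def select_min_worst_case(front):
    """HYPOTHETICAL rule: pick the pair whose most attractive scenario
    is the least attractive among all pairs on the front."""
    return min(front, key=lambda entry: max(entry[1]))


# Made-up (beta_d, U_d(1, R)) pairs with made-up attractiveness values.
candidates = [
    ((0.5, 1.0), (0.9, 0.2, 0.4)),
    ((1.0, 2.0), (0.3, 0.3, 0.5)),
    ((1.5, 1.5), (0.4, 0.4, 0.6)),  # dominated by the second entry
    ((2.0, 0.5), (0.2, 0.8, 0.3)),
]

front = non_dominated(candidates)
best_params, best_objectives = select_min_worst_case(front)
print(len(front), best_params, best_objectives)
```

A worst-case rule is a conservative choice: it protects against the scenario that remains most attractive to an adversary, at the cost of ignoring how the pair performs on the other scenarios.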

Discussion
In the literature, a lot of attention is paid to adversaries trying to exploit vulnerabilities of IT systems that support crucial business processes or infrastructure, and to how attempts to manipulate such systems can be detected. Considerably less attention is paid to adversaries trying to manipulate the decision outcomes of repeated decision-making processes with underlying parameterized decision support models. And no serious attention at all is paid to incorporating simulated statistics of repeated decision outcomes affected by a set of well-defined possible influencing scenarios into the parametrization of mathematical decision support models. The purpose of this study is to draw attention to this deficiency, and to set the stage for the topic by means of the proposed general simulation framework. The decision support model underlying the patent application decision-making process serves as an example, because of its interesting structure: a non-Bayesian bounded rational action-reward model with two successive binary decisions. Most mathematical decision support models have some parameters that remain to be specified, and usually an optimization problem is formulated to find the optimal parameter values with regard to some general objective function or loss function. A crucial contribution of the proposed simulation framework is that it provides a general definition of a measure that is fed by simulated statistical test outcomes and that expresses the attractiveness of a defined influencing scenario (from the perspective of an adversary), in terms of the decision support model parameters that remain to be specified. The present study has demonstrated that by considering this measure as an objective function, a multi-objective optimization problem can be formulated for a set of well-defined adversarial influencing scenarios, and that solving the optimization problem for a chosen time window, and applying some selection procedure on its solution set, provides the most favorable (and for adversaries hard to guess) support model parameter values for the time window. Parameterizing the decision support model according to these parameter values will on average make the considered set of influencing scenarios less attractive for adversaries to deploy in the chosen time window.
Of course, company security analysts cannot be held accountable for preventing adversaries from crafting and deploying adversarial influencing scenarios to manipulate decision outcomes of the repeated decision-making processes they are supposed to protect. However, they can be held accountable for taking countermeasures, such as implementing the proposed approach, that will make such scenarios on average less effective and that will on average raise the chance that adversarial influencing of decision outcomes is detected. Once adversaries suspect that company security analysts themselves craft and simulate influencing scenarios to make them less effective, and that this may raise the chance of being exposed, they may be discouraged from crafting and deploying such scenarios in the future.
The statistical theory underlying the presented mathematical model $M_2$ needs to be further developed. Company security analysts should be given stricter guarantees than that the effect of the considered scenarios will on average be smaller and that the detection chance of a deployed scenario will on average be higher. This is left to future research. The approach presented in this study is general and can be applied to a variety of repeated decision-making processes and underlying mathematical decision support models, for instance to decision support of repeated decision-making by means of machine learning models, in which case the presented approach needs to be included in the hyperparameter tuning of the machine learning model used. In forthcoming work, a case study of this will be presented, as well as a case study of the detection of adversarial influencing of decision outcomes of a repeated decision-making process supported by a bounded rational Bayesian decision support model.

The decision maker interacts with the environment in that the action the decision maker chooses according to the optimized $P_A(a)$ has a stochastic effect on the environment according to the distribution $P(o|a)$.

$$\mathcal{X} = \mathcal{A} \times \mathcal{O} = \underbrace{\{a_1, \cdots, a_M\}}_{\text{Actions}} \times \underbrace{\{o_1, \cdots, o_N\}}_{\text{Observations}},$$
$$P_X(x) = P_X(a, o) = P(o|a)\,P_A(a), \qquad Q_X(x) = P(o|a)\,P_{0,A}(a),$$
where $P_{0,X}(x)$ and $P_X(x)$ represent the prior decision policy and the posterior decision policy with respect to the space $\mathcal{X}$, respectively. Optimizing the above objective function is equivalent to optimizing an objective function in $P_A(a)$. Taking the derivative with respect to $P_A(a)$ for fixed $a \in \mathcal{A}$ and setting it to zero finally yields the optimal decision policy over the finite action space $\mathcal{A}$.
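The shape of the resulting policy can be illustrated numerically. The sketch below assumes that the objective is the free-energy functional of the adopted bounded rationality theory, i.e. expected utility minus $1/\beta$ times the Kullback-Leibler divergence between $P_X$ and $Q_X$. Because $P(o|a)$ appears in both, the divergence reduces to one over actions, and the optimum takes the form $P_A(a) \propto P_{0,A}(a)\exp\big(\beta \sum_o P(o|a)\,U(a, o)\big)$. The function name, the utilities, the prior and the likelihoods are hypothetical values chosen for illustration only.

```python
from math import exp


def optimal_policy(prior, likelihood, utility, beta):
    """Bounded rational policy, ASSUMING a free-energy objective.

    prior:      dict action -> P_0,A(a)
    likelihood: dict action -> dict observation -> P(o|a)
    utility:    dict (action, observation) -> U(a, o)
    beta:       boundedness parameter (0 keeps the prior unchanged)
    """
    weights = {}
    for a, p0 in prior.items():
        expected_u = sum(p_o * utility[(a, o)] for o, p_o in likelihood[a].items())
        weights[a] = p0 * exp(beta * expected_u)
    z = sum(weights.values())  # normalization constant
    return {a: w / z for a, w in weights.items()}


# Hypothetical two-action, two-observation example.
prior = {"R": 0.6, "C": 0.4}
likelihood = {"R": {1: 0.7, 0: 0.3}, "C": {1: 0.5, 0: 0.5}}
utility = {("R", 1): 1.0, ("R", 0): -0.5, ("C", 1): 0.2, ("C", 0): 0.2}

for beta in (0.0, 1.0, 50.0):
    print(beta, optimal_policy(prior, likelihood, utility, beta))
```

With $\beta = 0$ the policy stays at the prior, and as $\beta$ grows it concentrates on the action with the highest expected utility, which is why a decrease of the boundedness parameter can shift decision outcomes.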