Lessons Learned from Applying Computer Assisted Experimental Designs

Nikolay Angelov; Per Johansson

doi:10.20944/preprints202410.0575.v1

Submitted: 07 October 2024

Posted: 08 October 2024

Abstract

This paper discusses our experience with a computer assisted design for the analysis of an intervention on a digital platform, conducted by the Swedish Tax Agency. The aim was to analyze the possibilities of estimating the effect of a message (the intervention) on the subsequent tax compliance of taxpayers who had foreign income. The effect size was not expected to be large. However, as the cost of the intervention is very low, even a small effect could be economically important. For administrative reasons the number of treated was limited to 500 individuals. Because of the limited sample size and the small expected effect, it was imperative to consider an efficient experimental design. The population considered for the experiment consisted of 2,697 individuals. Instead of using all non-treated individuals as controls, only 500 controls were chosen, deliberately, in order to examine the efficiency gains of a rerandomization design with real world data. To this end, a simulation study was conducted on the same population from which the experimental sample was randomly drawn. An important conclusion from this exercise is that regression-based post-stratification on relevant covariates is as efficient an alternative as rerandomization in this setting. The results are of special interest when experiments are run in organizations where computer assisted designs may be costly.

Keywords: field experiment; tax compliance; rerandomization; foreign income; automatically exchanged financial information

Subject: Business, Economics and Management - Accounting and Taxation

MSC: 62-08; 62P20

1. Introduction

As a means of increasing efficiency, stratification on observed covariates is commonly used. An alternative, or complement, to stratification that has received attention lately is to utilize modern computational capabilities for finding assignments with balance in observed covariates (see e.g., [3,10,11,12,13,14,15,18]). This paper contributes to the methodological literature on computer assisted designs by providing guidance on how to make use of these new designs in practice. More importantly, the paper provides numerical evidence, based on real world data, on the efficiency gains of the 'Mahalanobis based rerandomization design' suggested in [18]. The numerical exercise allows for a discussion of the advantages and disadvantages of this specific design in applications.

In [18], assignments are repeatedly randomized until obtaining an assignment whose Mahalanobis distance between the covariate means of treated and control units is smaller than a given threshold. This assignment is then the treatment vector in the Mahalanobis based rerandomization design. To our knowledge, ours is the first study in which any of these new types of computer assisted designs is applied. An important conclusion from the exercise is that regression-based post-stratification on relevant covariates might be a reasonably efficient alternative to the Mahalanobis based rerandomization design. The results should be of particular interest for the analysis of experiments in organizations where it is costly to set up a computer assisted design.

The background of the experiment is that the Swedish Tax Agency was interested in analyzing the possibilities of estimating the effect of a low-cost intervention in a digital platform on the subsequent tax compliance of Swedish taxpayers who, according to the so-called Common Reporting Standard (CRS), had foreign dividends during the income year of 2018.1 Although the direct cost of sending a digital message is low, the number of treated was limited to 500 individuals. The reason for this decision was uncertainty on behalf of the team responsible for sending the message about how the taxpayers would react to the message. In particular, there were concerns about potential service desk overload. Furthermore, there is no information on whether the recipient has read the digital message or not. In combination with the non-intrusive nature of the message, the intent-to-treat (ITT) effect size cannot be expected to be large. For these reasons it was important to consider a more efficient experimental design than a completely randomized design.

As the Swedish Tax Agency was also interested in analyzing heterogeneous effects by gender and previous tax compliance, we decided to stratify the experiment on gender and being compliant in 2018. Six covariates, deemed to be important in reducing the heterogeneity, were then used to further reduce the imbalance using Mahalanobis based rerandomization. The population considered for the experiment was 2,697 individuals. Instead of choosing all remaining individuals as controls, only 500 were chosen as controls, as this choice provided the opportunity for analyzing the efficiency gains from the experimental design, deemed important by the tax office for future implementations. The design was built on a random sample of 1,000 individuals from this population.2 The digital message was sent to 500 of these individuals in March 2020, about six weeks before the final day to file income taxes for the income year of 2019.

The results from the experiment show no statistically significant effects on declared capital income or final taxes paid. However, the point estimates for capital income are larger in magnitude among the previously non-compliant taxpayers and this is true for both women and men. The only case of a confidence interval that does not cover zero is for non-compliant women. Given that this is the group for which it was expected to have the largest effect on compliance from receiving the digital message (see [2]), this was seen as a useful result by the Swedish Tax Agency.

The paper proceeds with a description of the study sample, the intervention, and the outcome data. The experimental design is described in Section 3, and in Section 4 we discuss the analysis and the results from the experiment. Section 5 provides the results from the simulation exploring the gains from the chosen experimental design. The paper concludes with a discussion in Section 6.

2. The Intervention, Study Population and Outcome Data

The intervention consisted of sending the following message to the treated individuals and leaving the controls untreated:

Declaring foreign dividends/interest

Hi!

We get many questions on how to declare foreign income and have therefore developed a new online app in order to make this easier.

The Swedish Tax Agency has obtained information from a foreign tax authority that you have received dividends or interest from abroad during 2018.

If you have received dividends or interest from abroad also during 2019, you can use the online app when you file your taxes. The app will help you with the correct amount to file and how much foreign tax offset you have the right to claim.

You can find the app here:

https://app.skatteverket.se/klient-sifu-segmentering/

Sincerely,

The Swedish Tax Agency

The message was sent in Swedish (see Appendix A for the original) and the message in English above is our translation. This message was sent in digital form to the treated taxpayers via a so called digital mailbox. In Sweden, about half of the population above 16 years of age have a digital mailbox.3 This is a free service making it possible to receive mail in digital form from Swedish authorities and some large private firms. With regards to taxes, having a digital box implies that all communication from the Swedish Tax Agency that otherwise would have been sent as paper mail is sent digitally in a secure app. This includes pre-filled tax returns as well as various other messages. Taxpayers can file their income tax declaration securely in the app.

The online app mentioned in the message does not require login and is best described as a specialized calculator. The user fills in the type of foreign income (dividends or interest), amount, currency, date of receiving the amount, country where the income was received, and if applicable, the amount of foreign tax paid. Upon clicking Calculate, the app converts the amount to SEK, calculates the foreign tax offset, and indicates the specific tax declaration boxes where the amounts should be filed.

The message was sent to each treated taxpayer’s digital box on 25 March, 2020. This is about one week after it became possible for taxpayers to send in their income declarations (17 March, 2020). Unfortunately, we have no data on the share of taxpayers who sent in their declarations between the 17th and 25th of March. However, the absolute majority of taxpayers in Sweden file their tax returns during the week preceding the deadline (4 May, 2020). Thus, in our opinion any attenuation of the impact from early filing is nonexistent or minor.

In addition to measuring the effect of receiving the message on the taxpayers’ tax compliance, it would have been of interest to assess whether receiving the message had an effect on using the online app mentioned in the message. However, since the app does not require the users to log in, this is not feasible.

Finally, as mentioned in the introduction, we have no means of knowing whether the taxpayers who were sent the digital message actually read it. Therefore, we can only draw inferences on the ITT effect.

2.1. The Study Population

The original population consisted of 10,344 individual taxpayers each of whom had a financial account with accrued foreign dividends summing to over 3,000 SEK (about 292 EUR) during tax year 2018, according to automatically exchanged information within the Common Reporting Standard (CRS). The study population was restricted to the population with a digital mailbox, aged 30 to 75 and having filed a tax declaration in 2018. In addition, some further selections were made with respect to missing data and outliers. For details on the restriction of the population see [2].

2.2. The Outcome Data

Data on the outcome variables comes from administrative data on taxpayers’ filed income tax returns. The latest day for sending in the income tax declaration for the income year 2019 was May 4, 2020. For various reasons, some taxpayers file their taxes later than the deadline. The most common reason for filing at a later date is that the individual has applied and been approved for respite with income tax returns. As of July 27, 2020 (the date when the estimation data set was collected from the registers), we had access to data on 99.3 percent of the study sample, i.e., the group of individuals who received the message and the control group.

There is no separate field in the income tax declaration for foreign dividends. Our main outcome variable is therefore capital income ($capinc$), where, according to the tax code, the taxpayer should include the amount of foreign dividends. The variable $capinc$ does not include capital gains and can be zero but cannot be negative. In addition, we estimate the effect on total tax paid ($tax$). An increase in declared foreign dividends in $capinc$ should lead to an increase in $tax$ unless offsetting adjustments are made in the income tax declaration. Since the treatment is expected to have an effect on $tax$ only through changes in foreign income, which is a minor part of $tax$, we do not expect a large effect on $tax$. It is nevertheless important to include the final tax as an outcome variable since it is a measure of compliance directly related to tax revenues.

Also, although perhaps far fetched, receiving the message could potentially have a positive effect on overall compliance through an unintentional nudge. In other words, merely receiving a message from the Swedish Tax Agency could possibly nudge taxpayers into higher general compliance level which could be manifested as an increase in total tax paid not necessarily stemming from increased declared capital income.

3. Experimental Design

Even though the estimators from well-conducted experiments are unbiased in expectation, the estimates from any single experiment may still be far from the population average treatment effect due to an unlucky, albeit random, assignment. For this reason the experimental design is stratified on gender and historical tax compliance; two variables that we believe can be important determinants for the two outcomes under investigation. A second reason for stratification is that we are interested in examining group differences in behavior as a consequence of the intervention. Hypotheses on the signs of effect differences along previous compliance and gender are discussed in [2]. As we also have continuous covariates (historical data on the outcomes, earnings, age, etc.) we also would like to balance them within each stratum. To this end, we used the stratified rerandomization design suggested in [11].

As mentioned previously, the number of treated was limited to 500 individuals, so given that we have 2,697 individuals in total in the population, there are 2,197 non-treated individuals. However, as we also want to evaluate the efficiency gains of the design, we do not want to use all of these as controls. For simplicity we considered only 500 of them in the experimental design. The reason for choosing a balanced design is that balanced designs are preferable to unbalanced designs in both Fisher [4] and Neyman-Pearson [6] inference with a fixed sample size. The standard error would in general be smaller if the number of controls were larger than 500. However, as the sampling variance of the treated group mean is constant, the reduction in standard error will in general be moderate in the unbalanced case. For example, under the assumption that the variance of the potential outcomes is the same under treatment and control, the standard errors would in expectation be reduced by only around 20 percent if all 2,197 potential controls were used (i.e., a quadrupling of the sample size of the controls).4 With heterogeneous treatment effects the variance is likely to be higher in the treatment state, and the efficiency gains are then in expectation even smaller.
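The order of magnitude of this reduction can be checked with a short calculation. The sketch below assumes the usual difference-in-means standard error with a common outcome standard deviation in both groups (set to one, since it cancels in the ratio); the function name `se_diff_in_means` is ours and purely illustrative.

```python
import math

def se_diff_in_means(n_treated, n_control, sigma=1.0):
    """Standard error of a difference in means with a common outcome s.d. sigma."""
    return sigma * math.sqrt(1.0 / n_treated + 1.0 / n_control)

# Balanced design used in the experiment: 500 treated, 500 controls.
se_balanced = se_diff_in_means(500, 500)
# Hypothetical alternative: all 2,197 non-treated individuals used as controls.
se_unbalanced = se_diff_in_means(500, 2197)

reduction_percent = 100.0 * (1.0 - se_unbalanced / se_balanced)
print(f"Reduction in standard error: {reduction_percent:.1f} percent")
```

Because the term $1/n_{1}$ is fixed by the 500 treated, adding controls can only shrink the $1/n_{0}$ part of the variance, which is why the gain levels off quickly.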

The idea behind rerandomization is the same as with stratification or blocking, that is, to remove from consideration assignments with imbalance in observed covariates between treated and control units and then randomize within the set of assignments with balance on these covariates. Call the set of all assignments within a stratum $\mathcal{A}$ and the set of acceptable assignments $\mathcal{A}_{a}$. Thus, $Card(\mathcal{A}_{a}) < Card(\mathcal{A})$, where $Card(\mathcal{A})$ and $Card(\mathcal{A}_{a})$ are the cardinalities of $\mathcal{A}$ and $\mathcal{A}_{a}$, respectively. Morgan and Rubin (2012) suggested calculating the Mahalanobis distance between the means of the $K$ covariates of the potentially treated and controls and then accepting a specific random allocation only if this measure is less than $a$, where $a$ is small. The Mahalanobis distance is asymptotically chi-square distributed with $K$ degrees of freedom, $χ_{K}^{2}$. The criterion $a$, determining the set $\mathcal{A}_{a}$, can thus be decided implicitly by accepting a random allocation if the calculated Mahalanobis distance between the two means is smaller than the quantile of the $χ_{K}^{2}$-distribution that corresponds to a pre-specified probability. If we let $\Pr(χ_{K}^{2} \leq a) = p_{a}$, the specific random allocation is one of the $100 \times p_{a}$% of allocations with the smallest difference in means between treated and controls. For details on the procedure, see Appendix B.

As shown in Appendix B, the percent reduction in variance relative to complete randomization is equal to

$$100 \times R^{2} (1 - ν_{a}), \tag{1}$$

where $R^{2}$ is the coefficient of determination of a regression of the outcome on $X$ and

$$ν_{a} = \frac{\Pr(χ_{K + 2}^{2} \leq a)}{\Pr(χ_{K}^{2} \leq a)}, \quad 0 < ν_{a} < 1. \tag{2}$$

$ν_{a}$ is non-decreasing in $K$ for a given $p_{a}$ and increasing in $p_{a}$ for a given $K$. The implication is that the choice of covariates is important. One should not add unnecessary covariates (i.e., covariates that are not partially correlated with the outcome) as this reduces the efficiency of the estimator. Furthermore, a strict criterion for a given $K$ will increase the efficiency in comparison to a less strict criterion.
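To make the dependence of $ν_{a}$ on $K$ and $p_{a}$ concrete, the following sketch evaluates Equation (2) on a small grid. It is an illustration only: the helper names (`chi2_cdf`, `threshold_for`, `nu`) and the grid values are our own choices, not taken from the paper, and the chi-square distribution function is computed with a plain series expansion that is adequate for moderate values of $a$.

```python
import math

def chi2_cdf(a, k):
    """Pr(chi2_k <= a), via the series for the regularized lower incomplete gamma."""
    if a <= 0:
        return 0.0
    s, x = k / 2.0, a / 2.0
    term = math.exp(s * math.log(x) - x - math.lgamma(s + 1.0))
    total = term
    for n in range(1, 10_000):
        term *= x / (s + n)
        total += term
        if term < 1e-16 * total:
            break
    return min(total, 1.0)

def threshold_for(p_a, k):
    """Find the threshold a with Pr(chi2_k <= a) = p_a by bisection."""
    lo, hi = 0.0, 1.0
    while chi2_cdf(hi, k) < p_a:
        hi *= 2.0
    for _ in range(200):
        mid = (lo + hi) / 2.0
        if chi2_cdf(mid, k) < p_a:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

def nu(a, k):
    """The ratio in Equation (2)."""
    return chi2_cdf(a, k + 2) / chi2_cdf(a, k)

# Illustrative grid (hypothetical values of K and p_a).
for k in (2, 6, 10):
    for p_a in (0.001, 0.01, 0.1):
        a = threshold_for(p_a, k)
        print(f"K={k:2d}  p_a={p_a:<5}  a={a:.3f}  nu_a={nu(a, k):.3f}")
```

Reading the printed grid along either dimension shows the trade-off described above: more covariates or a looser acceptance probability both push $ν_{a}$ towards one and thereby shrink the variance reduction in Equation (1).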

We define compliance by using information about whether the amount of foreign dividends obtained from the CRS-data for the income year of 2018 was less than or equal to total capital income during the same year, i.e., $1[compliant = 1] \equiv 1[fdiv \leq capinc]$, where $1[\cdot]$ takes the value one if the expression within brackets is true and zero otherwise.5 As we observe the sex of individuals in our data this allows us to define the four strata.

The following six covariates, measured for the pre-experiment income year 2018, were used to calculate the Mahalanobis distance:

$age$: the taxpayer’s age
$fdiv$: foreign dividends
$capinc$: capital income
$tax$: total tax paid
$earn$: earnings including labor income, sick pay, pension, etc.
$finc \equiv 1[\text{has foreign income}]$: categorical variable based on a check box in the tax declaration which equals one if the box is checked and zero otherwise.

Using the 2,697 individuals we proceeded as follows:

1. Divide the sample into a compliant group (1,759) and a non-compliant group (938), where the number of observations is given within parentheses.
2. Draw two simple random samples, each of size 500, from the two groups. These 1,000 individuals constitute the sampling frame of the trial.
3. Create four strata: compliant women, compliant men, non-compliant women, and non-compliant men.
4. Within each stratum, randomly select an allocation with a Mahalanobis distance between the treated and control means of the six covariates of less than 0.17. As $P (χ^{2} (6) < 0.17) = 0.0001$, this means that the specific random allocation is one of the 0.01% of allocations with the smallest differences in means between the treated and the controls.
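The rerandomization in the last step can be sketched as an accept/reject loop within each stratum. The code below is a minimal illustration under our own assumptions, not the code used in the experiment: the data are synthetic with two covariates, the threshold of 0.21 is chosen to accept roughly 10 percent of allocations under the $χ_{2}^{2}$ approximation so that the loop finishes quickly, and all function names are hypothetical. The distance uses the factor $n_{1} n_{0} / n$, which equals $n/4$ in a balanced design.

```python
import random

def mean_vector(rows):
    return [sum(r[j] for r in rows) / len(rows) for j in range(len(rows[0]))]

def covariance_matrix(rows):
    n, k, m = len(rows), len(rows[0]), mean_vector(rows)
    return [[sum((r[a] - m[a]) * (r[b] - m[b]) for r in rows) / (n - 1)
             for b in range(k)] for a in range(k)]

def solve(matrix, rhs):
    """Solve matrix @ z = rhs by Gaussian elimination with partial pivoting."""
    k = len(rhs)
    aug = [list(matrix[i]) + [rhs[i]] for i in range(k)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, k):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, k + 1):
                aug[r][c] -= factor * aug[col][c]
    z = [0.0] * k
    for i in reversed(range(k)):
        z[i] = (aug[i][k] - sum(aug[i][j] * z[j] for j in range(i + 1, k))) / aug[i][i]
    return z

def mahalanobis(assignment, rows, cov):
    """Mahalanobis distance between treated and control covariate means."""
    treated = [r for w, r in zip(assignment, rows) if w == 1]
    control = [r for w, r in zip(assignment, rows) if w == 0]
    diff = [t - c for t, c in zip(mean_vector(treated), mean_vector(control))]
    scale = len(treated) * len(control) / len(rows)  # n/4 when balanced
    return scale * sum(d * z for d, z in zip(diff, solve(cov, diff)))

def rerandomize(rows, threshold, rng, max_draws=100_000):
    """Redraw a balanced allocation until the distance is below the threshold."""
    n = len(rows)
    cov = covariance_matrix(rows)
    assignment = [1] * (n // 2) + [0] * (n - n // 2)
    for draw in range(1, max_draws + 1):
        rng.shuffle(assignment)
        if mahalanobis(assignment, rows, cov) <= threshold:
            return list(assignment), draw
    raise RuntimeError("no acceptable allocation found")

# Synthetic strata (hypothetical sizes and covariates, for illustration only).
rng = random.Random(2020)
strata = {name: [[rng.gauss(50, 10), rng.gauss(0, 1)] for _ in range(size)]
          for name, size in [("compliant women", 20), ("compliant men", 40)]}
allocation = {}
for name, rows in strata.items():
    allocation[name], draws = rerandomize(rows, threshold=0.21, rng=rng)
    print(f"{name}: accepted after {draws} draw(s)")
```

With a threshold as strict as the one used in the experiment, the expected number of draws per stratum is much larger, so an implementation intended for production use would likely vectorize the distance calculation.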

Table 1 shows descriptive statistics by stratum after performing the rerandomization as well as the resulting number of observations in each stratum. From this table we can see that we have $167 \times 2$ non-compliant men, $83 \times 2$ non-compliant women, $200 \times 2$ compliant men, and $50 \times 2$ compliant women. The mean differences of the covariates between the treated and the controls are, as expected, very small within each of the four strata.

4. Analysis and Results

A drawback with the rerandomization strategy is that the difference-in-means estimator is no longer asymptotically normally distributed (see [17]). However, [16] showed that standard asymptotic inference can be conducted using ordinary least squares (OLS). To be specific, let $x_{i}$ be the $K \times 1$ vector of covariates used in the Mahalanobis distance for individual $i$, $W_{i}$ be the treatment indicator, and $Y_{i}$ the outcome. The treatment effect estimate is the estimated coefficient on $W_{i}$ in the regression of $Y_{i}$ on $W_{i}$, $x_{i}$ and $W_{i} (x_{i} - \bar{x})$, where $\bar{x}$ is the vector of sample means of the covariates. To construct asymptotically valid confidence intervals one should use the Eicker-Huber-White (EHW) robust standard error estimator (see [5]; [8]; [22]).
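As an illustration of this estimator, the sketch below fits the interacted regression with a single centered covariate on synthetic data and computes EHW standard errors in their basic (HC0) form. The data-generating process, the variable names and the function `ols_ehw` are our own assumptions for the example; in practice one would normally rely on an established regression package.

```python
import math
import random

def solve(matrix, rhs):
    """Solve matrix @ z = rhs by Gaussian elimination with partial pivoting."""
    k = len(rhs)
    aug = [list(matrix[i]) + [rhs[i]] for i in range(k)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, k):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, k + 1):
                aug[r][c] -= factor * aug[col][c]
    z = [0.0] * k
    for i in reversed(range(k)):
        z[i] = (aug[i][k] - sum(aug[i][j] * z[j] for j in range(i + 1, k))) / aug[i][i]
    return z

def ols_ehw(X, y):
    """Return OLS coefficients and EHW (HC0) standard errors."""
    p = len(X[0])
    xtx = [[sum(row[a] * row[b] for row in X) for b in range(p)] for a in range(p)]
    xty = [sum(row[a] * yi for row, yi in zip(X, y)) for a in range(p)]
    beta = solve(xtx, xty)
    resid = [yi - sum(b * v for b, v in zip(beta, row)) for row, yi in zip(X, y)]
    # Columns of (X'X)^{-1}, obtained by solving against unit vectors.
    inv_cols = [solve(xtx, [1.0 if i == j else 0.0 for i in range(p)])
                for j in range(p)]
    se = []
    for j in range(p):
        # j-th diagonal element of (X'X)^{-1} [sum_i e_i^2 x_i x_i'] (X'X)^{-1}
        var_j = sum((e * sum(inv_cols[j][a] * row[a] for a in range(p))) ** 2
                    for row, e in zip(X, resid))
        se.append(math.sqrt(var_j))
    return beta, se

# Synthetic example (hypothetical data): one covariate, true effect of 2.
rng = random.Random(1)
n = 200
x = [rng.gauss(0, 1) for _ in range(n)]
w = [1] * (n // 2) + [0] * (n // 2)
rng.shuffle(w)
y = [1.0 + 2.0 * wi + 0.5 * xi + rng.gauss(0, 1) for wi, xi in zip(w, x)]
x_bar = sum(x) / n
# Regressors: intercept, W, centered x, and W times centered x.
X = [[1.0, wi, xi - x_bar, wi * (xi - x_bar)] for wi, xi in zip(w, x)]
beta, se = ols_ehw(X, y)
print(f"tau_hat = {beta[1]:.3f}, EHW standard error = {se[1]:.3f}")
```

Centering the covariate before interacting it with the treatment indicator is what makes the coefficient on $W_{i}$ interpretable as the average effect rather than the effect at $x_{i} = 0$.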

A balanced design simplifies the analysis and the tests for effect differences across strata. The reason is that the four effects estimated in the saturated linear regression model are the same as the four between-group differences in mean estimates, which are unbiased (see [11]). This holds also in a rerandomization design.

The following regression model is used in the analysis

$$\begin{aligned} Y_{i} = {} & α_{0} + τ W_{i} + α_{w} 1[woman = 0] + α_{c} 1[compliant = 1] \\ & + α_{wc} 1[woman = 0] \times 1[compliant = 1] + β_{0}^{'} \tilde{x}_{i} + β_{1}^{'} W_{i} \tilde{x}_{i}, \end{aligned} \tag{3}$$

where $\tilde{x}_{i} \equiv x_{i} - \bar{x}$.

We have two outcomes ($capinc$ and $tax$, measured in 1000s SEK) and the test for overall effect for each outcome is $H_{0} : τ = 0$ against the alternative $H_{1} : τ \neq 0$. We set the overall risk level for judging whether we have an effect or not at 5%, which means that each single test is conducted at the 2.5% risk level.

4.1. Results

The raw means of the two outcomes (capital income ($capinc$) and final tax ($tax$), respectively, for 2019, measured in 1000s SEK) for treated and controls, and their differences, overall and across the strata, are shown in the first six columns in Table 2. Overall we see a higher declared capital income for those receiving the digital message, but also a lower declared taxed income. In addition, we see large differences in outcome levels and treatment-control differences across the strata. In the last six columns, we also show the share of taxpayers in the various strata who have declared any capital income, or who had to pay any tax. The differences between treated and controls with respect to these shares vary in sign and are relatively close to zero in all strata and for both variables.

The OLS results from estimating Equation (3) on $capinc$ and $tax$ are presented in Table 3. The point estimate in column (1) shows an increase in the declared capital income of 5,225 SEK on average, or an increase of around 43% when evaluated at the mean declared capital income of 12,196 SEK. Column (2) shows a reduction in declared tax of 3,856 SEK on average, or a reduction of 1.23% when evaluated at the mean declared taxed income of 313,290 SEK. Both parameters are, however, imprecisely estimated and neither is statistically significant at the 10% level.

4.2. Sub-group analysis

In the pre-analysis plan we stated that we would only test for effect heterogeneity across groups if the parameter estimate $\hat{τ}$ from (3) was found to be statistically significant at the 2.5 percent level. The reason for this restriction was to have control over the size of the test and at the same time to have power to detect the ITT effect.6

Since the point estimate $\hat{τ}$ was not statistically significant, we have not included the heterogeneity analysis in this paper. We did, however, perform exploratory analyses. The point estimates for $capinc$ are in line with the discussion in the pre-analysis plan of larger effects for the non-compliant.7 We also conjectured that the effect for women would be larger in magnitude than for men, and this is also supported by the point estimates for non-compliant taxpayers. However, the only case of a confidence interval that does not cover zero is for non-compliant women.

5. Efficiency Gains from the Statistical Design

We perform the simulation using the subset of non-treated individuals from the original experiment. To recap, the experiment was based on 2,697 individuals (see Section 3), from which 1,000 individuals were sampled and 500 were treated. In the simulation, we exclude the 500 treated and remove those few who had not filed their declaration as of July 27, 2020, resulting in a sample of 2,190 taxpayers which we denote the simulation sample. As in the original experiment, we divide the sample into a compliant and a non-compliant group. The first group, which we denote (a), consists of 1,509 individuals and the second, (b), of 681 individuals. In each replication of the simulation ($r = 1, 2, \dots, 1000$), the following steps were performed:

1. Draw two simple random samples, each of size 500, from (a) and (b) respectively, resulting in 1,000 individuals.
2. Draw 500 individuals assumed to be treated according to three different experimental designs:
   - Rerandomization within stratum: Perform steps 3 and 4 described in Section 3.
   - Stratification, i.e., complete randomization within stratum: Create the four strata and randomly allocate 50% to be 'treated' within each stratum.
   - Complete randomization: Randomly allocate 500 individuals to be 'treated'.
3. Estimate $τ$ for the three designs, with and without covariates, and store the estimates.
4. For each design and estimator, calculate the standard deviation of the 1,000 estimates.
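The logic of comparing designs through the standard deviation of placebo estimates can be illustrated with a stripped-down version of this loop. The sketch below is a simplification under our own assumptions: it uses synthetic outcomes with two strata instead of the register data, keeps the sample fixed across replications, omits the rerandomization design and covariate adjustment, and uses the difference in means throughout (which coincides with the stratified estimator when exactly half of each stratum is treated). All names are hypothetical.

```python
import random
import statistics

def diff_in_means(y, w):
    treated = [yi for yi, wi in zip(y, w) if wi == 1]
    control = [yi for yi, wi in zip(y, w) if wi == 0]
    return statistics.fmean(treated) - statistics.fmean(control)

def complete_randomization(strata, rng):
    """Treat half of the sample, ignoring the strata."""
    n = len(strata)
    w = [1] * (n // 2) + [0] * (n - n // 2)
    rng.shuffle(w)
    return w

def stratified_randomization(strata, rng):
    """Treat half of the units within each stratum."""
    w = [0] * len(strata)
    for label in sorted(set(strata)):
        idx = [i for i, s in enumerate(strata) if s == label]
        rng.shuffle(idx)
        for i in idx[: len(idx) // 2]:
            w[i] = 1
    return w

def sd_of_estimates(y, strata, design, replications, rng):
    """Standard deviation of placebo estimates over repeated assignments."""
    estimates = [diff_in_means(y, design(strata, rng)) for _ in range(replications)]
    return statistics.stdev(estimates)

# Synthetic sample: the stratum strongly predicts the outcome, no treatment effect.
rng = random.Random(7)
strata = [0] * 20 + [1] * 20
y = [10.0 * s + rng.gauss(0, 1) for s in strata]

sd_complete = sd_of_estimates(y, strata, complete_randomization, 500, rng)
sd_stratified = sd_of_estimates(y, strata, stratified_randomization, 500, rng)
print(f"SD, complete randomization:   {sd_complete:.3f}")
print(f"SD, stratified randomization: {sd_stratified:.3f}")
```

Since no unit is actually treated, every estimate is a draw from the null distribution of the estimator, so a smaller standard deviation directly reflects a more efficient design.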

The results from the simulation with and without the covariates are presented in panel I and II in Table 4, respectively. The regression estimator is used in all analyses with covariates, based on the specification in Equation (3). Without covariates, the difference-in-means estimator is used in the completely randomized experiment while for the two other designs the regression estimator is used. Note that due to the balanced design the regression estimator with the stratum indicators and interaction with the treatment is equivalent to the stratified estimator, $\hat{τ}^{SE} = \sum_{g = 1}^{4} f_{g} \times (\bar{y}_{g1} - \bar{y}_{g0})$, where $\bar{y}_{gW}$ are the mean outcomes of the treated ($W = 1$) and controls ($W = 0$) in stratum $g$ and $f_{g}$ is the fraction of observations in stratum $g$. This is thus equivalent to Equation (3) without any covariates.
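A small worked example may help to fix ideas about the stratified estimator. The function below computes $\sum_{g} f_{g} (\bar{y}_{g1} - \bar{y}_{g0})$ for toy data with two strata; the data and the function name are invented for illustration. Because half of each stratum is treated, the result coincides with the overall difference in means.

```python
from collections import defaultdict

def stratified_estimate(y, w, g):
    """Sum over strata of f_g * (mean treated outcome - mean control outcome)."""
    cells = defaultdict(lambda: {0: [], 1: []})
    for yi, wi, gi in zip(y, w, g):
        cells[gi][wi].append(yi)
    n = len(y)
    estimate = 0.0
    for cell in cells.values():
        f_g = (len(cell[0]) + len(cell[1])) / n
        mean_treated = sum(cell[1]) / len(cell[1])
        mean_control = sum(cell[0]) / len(cell[0])
        estimate += f_g * (mean_treated - mean_control)
    return estimate

# Toy data (hypothetical): stratum A has four units, stratum B has two.
y = [3.0, 5.0, 1.0, 1.0, 10.0, 8.0]
w = [1, 1, 0, 0, 1, 0]
g = ["A", "A", "A", "A", "B", "B"]

tau_se = stratified_estimate(y, w, g)
overall_diff = (3.0 + 5.0 + 10.0) / 3 - (1.0 + 1.0 + 8.0) / 3
print(tau_se, overall_diff)
```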

From panel I we can see that the efficiency gains from stratification over complete randomization on capital income and income tax are 2.94 and 5.60 percent, respectively (see row I.6). The corresponding efficiency gains from rerandomization within strata are substantial; 3.23 and 47.19 percent, respectively (see row I.4). Finally, the total efficiency gains from the stratified rerandomization on these two variables are 6.27 and 55.45 percent, respectively (see row I.5).

The percentage reduction in variance in a rerandomization design is $100 \times R^{2} (1 - ν_{a})$ (cf. Equation (1)). As $ν_{a} = \Pr(χ_{8}^{2} \leq 0.17) / \Pr(χ_{6}^{2} \leq 0.17) = 0.12$, the efficiency gains in our design should be approximately $100 \times R^{2} (1 - 0.12)$ within each stratum. Using the $R^{2}$ from Table 3 we find an expected gain of 2.73% and 48.66%, respectively. The observed efficiency gains displayed in panel I of Table 4 against complete randomization were 6.27% and 55.44%, which are substantially larger. Most likely, this is a consequence of the stratification.

The results from the simulation with covariate adjustment are presented in panel II. As expected, the efficiency gains are much more modest. The total efficiency gains for capital income and income tax from stratified rerandomization compared to stratification are 5.02 and 5.32 percent, respectively (see row II.5). The majority of the efficiency gain stems from the stratification. As can be seen in row II.4, for final tax, rerandomization within stratum even reduced the efficiency in contrast to only stratification.

For inference, standard errors need to be estimated; a more relevant analysis of the efficiency gains is therefore to study the length of the estimated asymptotic confidence intervals given a correct coverage rate. For each replicate and for all cells, we estimate the standard errors and use them together with the point estimate to calculate the asymptotic confidence interval. The coverage rates are calculated as the fraction of replications where the estimated confidence interval covers the null of $τ = 0$ given a level of significance $α$ of 1%, 5% and 10%.
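The coverage and length calculations amount to a few lines once the point estimates and standard errors from the replications are available. The sketch below assumes normal-based intervals of the form estimate ± z × standard error; the function name and the two-replication input are made up for illustration.

```python
from statistics import NormalDist, fmean

def coverage_and_length(estimates, std_errors, alpha, null_value=0.0):
    """Share of intervals covering null_value and the average interval length."""
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    covered = [1.0 if abs(est - null_value) <= z * se else 0.0
               for est, se in zip(estimates, std_errors)]
    lengths = [2.0 * z * se for se in std_errors]
    return fmean(covered), fmean(lengths)

# Two hypothetical replications: one interval covers zero, the other does not.
coverage, avg_length = coverage_and_length([0.0, 3.0], [1.0, 1.0], alpha=0.05)
print(coverage, avg_length)
```

Comparing average lengths across designs is only meaningful when the coverage rates are close to the nominal level, which is why both quantities are reported together.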

The coverage together with the average length of the confidence interval is displayed in Table 5. From the table we can see that the coverage rates are, as expected, close to the nominal level for all estimators. For both outcomes the stratified design has the shortest length, followed by the stratified rerandomization design. However, the average lengths of the confidence intervals are very similar, so in practice there are hardly any efficiency gains from the designs in contrast to ex-post covariate adjustments in the completely randomized design.

6. Discussion

Based on a field experiment conducted by the Swedish Tax Agency, this paper contributes to the research community by providing a better understanding of the costs and benefits of using the Mahalanobis based rerandomization design. The Swedish Tax Agency was interested in analyzing the possibilities of precisely estimating the effect of a low-cost intervention using a digital platform on the subsequent tax compliance of Swedish taxpayers who had foreign dividends during the income year of 2018. Although the direct cost of sending a digital message is low, the number of treated was limited to 500 individuals by the Swedish Tax Agency. Furthermore, the effect size was not expected to be large. For these reasons it was imperative to consider an efficient experimental design. To gain an understanding for future implementations, the experiment was deliberately set up to examine the efficiency gains of the experimental design.

The results show that the efficiency gains from the stratified rerandomization design compared to complete randomization and a stratified design can be substantial. As compared with the completely randomized design and the stratified experiment, the efficiency gain for one of the outcomes, final tax paid, was 55.4 and 47.19 percent, respectively. The corresponding efficiency gains for reported capital income, where the covariates are less relevant for the outcome, were only 6.3 and 3.23 percent, respectively. There are gains also when the treatment effect is estimated using a regression estimator adjusting for the same covariates that were used in the design. In this case the efficiency gains against a completely randomized experiment were more modest (5.3 percent for tax and 5.0 percent for capital income). Furthermore, compared to the results with a stratified experiment there were essentially no efficiency gains from the stratified rerandomization design.

In the comparison of the asymptotic confidence intervals of the regression estimator we found for both outcomes that the average length of the confidence intervals is the shortest under the stratified experiment. However, the differences in lengths are marginal across the three designs.

When covariates are used, the results thus suggest marginal, if any, efficiency gains in inference to a population from the stratified rerandomization design in contrast to a completely randomized experiment, or a stratified experiment, under the null of no effect. It may be the case that there are efficiency gains using the stratified rerandomization design with smaller sample sizes or under the alternative, as has previously been shown in Monte Carlo simulations (see e.g., [9,10,11,23]). However, based on our experience with this experiment, in a large organization there might be considerable administrative costs beyond the relatively small cost of writing the source code for the experiment, meaning that a standard randomized experiment would be a more practical option. For this and similar applications, our conclusion is that regression-based post-stratification along with including relevant covariates might be a reasonably efficient alternative.

It should be stressed that if one chooses the covariates in the regression models after the experiment is conducted, the resulting inference may be flawed as researchers are prone to searching for statistically significant results (see e.g., [7]; [1]; [21]; [19]). To avoid difficult post-experiment decisions and for transparency, we recommend a careful choice process documented in a pre-analysis plan.

Institutional Review Board Statement

The randomized experiment (sending the message) was part of the Swedish Tax Agency’s regular operation. During the time of the experiment, Nikolay Angelov was employed at the Swedish Tax Agency.

Conflicts of Interest

None.

Appendix A. The Message in Swedish

Deklarera dina utländska inkomster/räntor

Hej!

Vi får många frågor om hur man deklarerar utländska inkomster och därför har vi utvecklat en ny tjänst för att göra det lättare.

Skatteverket har fått information från en utländsk skattemyndighet om att du kan ha haft utdelning eller ränta i utlandet under 2018.

Om du har haft utdelning eller ränta i utlandet även under 2019 kan du använda tjänsten när du deklarerar. Den hjälper dig med vilket belopp du ska ta upp i deklarationen och hur mycket avräkning av utländsk skatt du har rätt till.

Du hittar tjänsten här:

https://app.skatteverket.se/klient-sifu-segmentering/

Med vänlig hälsning,

Skatteverket

Appendix B. Experimental Design, Rerandomization and Inference

Consider a randomized controlled trial (RCT) with $n$ units in the sample, indexed by $i$, with $n_{1}$ assigned to treatment and $n_{0}$ assigned to control. Let $W_{i} = 1$ or $W_{i} = 0$ if unit $i$ is assigned to treatment or control, respectively, and define $W = (W_{1}, \ldots, W_{n})'$. Furthermore, let $X$ be the $n \times K$ matrix of fixed covariates in the sample, with rows $x_{i}$, $i = 1, \ldots, n$, and sample covariance matrix $cov(X)$. There are $\binom{n}{n_{1}} = A$ possible treatment allocation (assignment) vectors, labeled $W^{j} = (W_{1}^{j}, \ldots, W_{n}^{j})'$, $j = 1, \ldots, A$, where $A = card(\mathcal{A})$, i.e., the cardinality of the set $\mathcal{A}$ of all possible allocations. The Mahalanobis distance for allocation $j$ is

$$M(W^{j}, X) = \frac{n}{4} \hat{τ}_{X}^{j\prime} \, cov(X)^{-1} \, \hat{τ}_{X}^{j}, \quad j = 1, \ldots, A,$$

where

$$\hat{τ}_{X}^{j} = \frac{1}{n_{1}} \sum_{i=1}^{n} W_{i}^{j} x_{i}' - \frac{1}{n_{0}} \sum_{i=1}^{n} (1 - W_{i}^{j}) x_{i}' = \bar{X}_{T}^{j} - \bar{X}_{C}^{j}.$$

[18] proposed accepting the $j$th allocation when its treatment assignment vector $W^{j}$ satisfies $M(W^{j}, X) \leq a$, where $a$ is a positive constant.
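The acceptance rule can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the code used in the experiment: the function names (`covariance`, `solve`, `mahalanobis`, `rerandomize`), the simulated covariates, the threshold `a = 0.2` and the cap `max_draws` are all hypothetical. The factor $n/4$ is taken from the displayed formula, so the example uses equally sized groups.

```python
import random

def covariance(X):
    """Sample covariance matrix (divisor n - 1) of the rows of X."""
    n, K = len(X), len(X[0])
    means = [sum(row[k] for row in X) / n for k in range(K)]
    return [[sum((row[r] - means[r]) * (row[c] - means[c]) for row in X) / (n - 1)
             for c in range(K)] for r in range(K)]

def solve(A, b):
    """Solve A z = b by Gaussian elimination with partial pivoting."""
    K = len(b)
    M = [list(A[r]) + [b[r]] for r in range(K)]
    for c in range(K):
        p = max(range(c, K), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, K):
            f = M[r][c] / M[c][c]
            for j in range(c, K + 1):
                M[r][j] -= f * M[c][j]
    z = [0.0] * K
    for r in reversed(range(K)):
        z[r] = (M[r][K] - sum(M[r][j] * z[j] for j in range(r + 1, K))) / M[r][r]
    return z

def mahalanobis(W, X, cov):
    """M(W, X) = (n / 4) * tau' cov(X)^{-1} tau, tau = treated minus control means."""
    n, K = len(W), len(X[0])
    n1 = sum(W)
    n0 = n - n1
    tau = [sum(w * row[k] for w, row in zip(W, X)) / n1
           - sum((1 - w) * row[k] for w, row in zip(W, X)) / n0
           for k in range(K)]
    z = solve(cov, tau)  # z = cov(X)^{-1} tau
    return n / 4 * sum(t * v for t, v in zip(tau, z))

def rerandomize(X, n1, a, rng, max_draws=100_000):
    """Draw complete randomizations until one satisfies M(W, X) <= a."""
    n = len(X)
    cov = covariance(X)
    base = [1] * n1 + [0] * (n - n1)
    for draw in range(1, max_draws + 1):
        W = base[:]
        rng.shuffle(W)
        m = mahalanobis(W, X, cov)
        if m <= a:
            return W, m, draw
    raise RuntimeError("no acceptable allocation found within max_draws")

# Hypothetical data: 40 units, K = 2 simulated covariates.
rng = random.Random(2024)
X = [[rng.gauss(50, 10), rng.gauss(7, 5)] for _ in range(40)]
W, m, draws = rerandomize(X, n1=20, a=0.2, rng=rng)
print(sum(W), round(m, 4), draws)
```

The loop returns the first accepted allocation together with its distance and the number of draws it took, which makes it easy to compare the realized number of draws with the expected number implied by the chosen threshold.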

By the central limit theorem, the sample means of the covariates will be normally distributed across random samples, so that $M(W^{j}, X) \sim χ_{K}^{2}$. Letting

$$p_{a} = \Pr(χ_{K}^{2} \leq a) \simeq \Pr(M(W^{j}, X) \leq a), \tag{A1}$$

we see that $a$ is determined from the choice of $p_{a}$. Because the number of rerandomizations is geometrically distributed, the expected number of randomizations needed to obtain an acceptable allocation is $1/p_{a}$. This means, for instance, that for $p_{a} = 0.001$ the expected number of randomizations before drawing an allocation that fulfills the criterion is 1,000.
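To make the link between $p_{a}$ and $a$ in (A1) concrete, the following sketch computes the threshold as the $p_{a}$-quantile of the $χ_{K}^{2}$ distribution, using only the Python standard library. The helper names `chi2_cdf` and `chi2_quantile` and the value $K = 2$ are assumptions made for the illustration; in practice a statistics library would normally supply the quantile function.

```python
import math

def chi2_cdf(x, k):
    """P(chi2_k <= x), via the power series of the regularized lower incomplete gamma."""
    if x <= 0:
        return 0.0
    s, h = k / 2.0, x / 2.0
    term = 1.0 / s
    total = term
    i = 1
    while term > 1e-16 * total:
        term *= h / (s + i)
        total += term
        i += 1
    return total * math.exp(-h + s * math.log(h) - math.lgamma(s))

def chi2_quantile(p, k):
    """Smallest a with P(chi2_k <= a) >= p, found by bisection."""
    lo, hi = 0.0, 1.0
    while chi2_cdf(hi, k) < p:
        hi *= 2
    for _ in range(200):
        mid = (lo + hi) / 2
        if chi2_cdf(mid, k) < p:
            lo = mid
        else:
            hi = mid
    return hi

K = 2          # hypothetical number of covariates
p_a = 0.001    # acceptance probability used as the example in the text
a = chi2_quantile(p_a, K)
expected_draws = 1 / p_a  # mean of the geometric distribution
print(a, expected_draws)
```

For $K = 2$ the result can be checked by hand, since $\Pr(χ_{2}^{2} \leq a) = 1 - e^{-a/2}$ and hence $a = -2 \ln(1 - p_{a})$.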

[18] show that, since $M(W^{j}, X) \sim χ_{K}^{2}$,

$$Cov(\bar{X}_{T}^{j} - \bar{X}_{C}^{j} \mid X, M(W^{j}, X) \leq a) = ν_{a} \, Cov(\bar{X}_{T} - \bar{X}_{C} \mid X), \tag{A2}$$

with

$$ν_{a} = \frac{\Pr(χ_{K+2}^{2} \leq a)}{\Pr(χ_{K}^{2} \leq a)}, \quad 0 < ν_{a} < 1. \tag{A3}$$

This result implies that the variance of the covariate mean differences across the allocations in $\mathcal{A}_{a}$, the set of allocations that satisfy the acceptance criterion, is reduced relative to its variance across the allocations in $\mathcal{A}$ by the factor $ν_{a}$, and the percent reduction in variance of each of the covariates in $X$ (or any linear combination of them) is equal to $100(1 - ν_{a})$.
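The size of $ν_{a}$ in (A3) is easy to tabulate. The sketch below is restricted to even $K$, where the chi-square distribution function has an exact finite-sum form; odd $K$ would need an incomplete-gamma routine. The function names and the choice $K = 2$ are assumptions made for the illustration. For very small $a$ and larger $K$ the subtraction in `chi2_cdf_even` loses precision, so this is not meant as a general-purpose routine.

```python
import math

def chi2_cdf_even(a, k):
    """P(chi2_k <= a) for even k: 1 - exp(-a/2) * sum_{j < k/2} (a/2)^j / j!."""
    if k <= 0 or k % 2 != 0:
        raise ValueError("this sketch only handles even, positive k")
    h = a / 2.0
    term, total = 1.0, 1.0
    for j in range(1, k // 2):
        term *= h / j
        total += term
    return 1.0 - math.exp(-h) * total

def nu(a, k):
    """Variance ratio nu_a = P(chi2_{k+2} <= a) / P(chi2_k <= a) from (A3)."""
    return chi2_cdf_even(a, k + 2) / chi2_cdf_even(a, k)

K = 2  # hypothetical number of covariates
for p_a in (0.001, 0.01, 0.1):
    a = -2 * math.log(1 - p_a)  # exact chi2_2 quantile
    v = nu(a, K)
    print(f"p_a = {p_a}: a = {a:.4f}, nu_a = {v:.5f}, reduction = {100 * (1 - v):.2f} %")
```

The loop illustrates the trade-off discussed in the text: a smaller $p_{a}$ gives a smaller $ν_{a}$ and therefore better covariate balance, at the price of more draws on average.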

Let $Y_{i}(w)$ be the potential outcome under treatment $w$ for individual $i$. Under the Stable Unit Treatment Value Assumption (SUTVA, see [20]), the observed outcome when $i$ is assigned $W_{i}$ is equal to $Y_{i} = Y_{i}(W_{i})$. The difference-in-means estimator is defined as

$$\hat{τ} = \bar{Y}_{1} - \bar{Y}_{0}, \tag{A4}$$

where $\bar{Y}_{1} = \frac{1}{n_{1}} \sum_{i=1}^{n} W_{i} Y_{i}(1)$ and $\bar{Y}_{0} = \frac{1}{n_{0}} \sum_{i=1}^{n} (1 - W_{i}) Y_{i}(0)$.

Let $\hat{τ}^{CR}$ and $\hat{τ}^{RR}$ be the estimators defined in (A4) under complete randomization and Mahalanobis-based rerandomization, respectively. These estimators are unbiased for the sample average treatment effect (SATE) and, under random sampling of the $n$ units from the population, also for the population average treatment effect (PATE).

The variance of $\hat{τ}^{CR}$ is given by

$$V(\hat{τ}^{CR}) = \frac{S_{Y(1)}^{2}}{n_{1}} + \frac{S_{Y(0)}^{2}}{n_{0}} - \frac{S_{Y(1)Y(0)}}{n}, \tag{A5}$$

where

$$S_{Y(w)}^{2} = \frac{1}{n-1} \sum_{i=1}^{n} (Y_{i}(w) - \bar{Y}(w))^{2}, \quad \bar{Y}(w) = \frac{1}{n} \sum_{i=1}^{n} Y_{i}(w),$$

and

$$\begin{aligned} S_{Y(1)Y(0)} &= \frac{1}{n-1} \sum_{i=1}^{n} \left(Y_{i}(1) - Y_{i}(0) - (\bar{Y}(1) - \bar{Y}(0))\right)^{2} \\ &= \frac{1}{n-1} \sum_{i=1}^{n} (τ_{i} - τ^{s})^{2} = S_{τ}^{2}, \end{aligned}$$

that is, the sample variance of the unit-level treatment effects $τ_{i} = Y_{i}(1) - Y_{i}(0)$, with $τ^{s}$ denoting their sample mean. Note that with homogeneous treatment effects, i.e. $τ_{i} = τ$, we have $S_{Y(1)Y(0)} = 0$ and $S_{Y(1)}^{2} = S_{Y(0)}^{2} = S_{Y}^{2}$. This means that

$$V(\hat{τ}^{CR}) = \frac{n}{n_{0} n_{1}} S_{Y}^{2}.$$

Thus, all else equal, the variance of the estimator will be larger with heterogeneous effects than with homogeneous effects.
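The unbiasedness of the difference-in-means estimator for the SATE and the variance expression (A5) can be checked numerically on a toy example by enumerating every possible allocation. The potential outcomes below are invented for the illustration (in a real experiment only one of the two is observed per unit), and the variable names are assumptions.

```python
from itertools import combinations

def sample_var(values):
    """Sample variance with divisor n - 1, as in the definition of S^2."""
    n = len(values)
    mean = sum(values) / n
    return sum((v - mean) ** 2 for v in values) / (n - 1)

# Hypothetical potential outcomes for n = 6 units with heterogeneous effects.
y0 = [1.0, 3.0, 2.0, 5.0, 4.0, 7.0]
y1 = [2.0, 3.5, 4.0, 5.0, 7.0, 7.5]
n, n1 = len(y0), 3
n0 = n - n1

# Difference-in-means estimate (A4) for each of the C(6, 3) = 20 allocations.
estimates = []
for treated in combinations(range(n), n1):
    ybar1 = sum(y1[i] for i in treated) / n1
    ybar0 = sum(y0[i] for i in range(n) if i not in treated) / n0
    estimates.append(ybar1 - ybar0)

# Mean and variance of the estimator over the randomization distribution.
mean_estimate = sum(estimates) / len(estimates)
exact_var = sum((e - mean_estimate) ** 2 for e in estimates) / len(estimates)

# Sample average treatment effect and the variance formula (A5).
tau = [t - c for t, c in zip(y1, y0)]
sate = sum(tau) / n
formula_var = sample_var(y1) / n1 + sample_var(y0) / n0 - sample_var(tau) / n

print(mean_estimate, sate)
print(exact_var, formula_var)
```

Both printed pairs agree up to floating-point error, since under complete randomization the estimator is exactly unbiased for the SATE and (A5) is its exact randomization variance.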

The asymptotic distribution is given by

$$\sqrt{n}(\hat{τ}^{CR} - τ) \overset{d}{\to} N(0, V_{ττ}),$$

where

$$V_{ττ} = \frac{S_{Y(1)}^{2}}{n_{1}} + \frac{S_{Y(0)}^{2}}{n_{0}} - \frac{S_{τ}^{2}}{n}.$$

Under the superpopulation assumption and inference to the PATE, the third term vanishes since treated and controls are sampled independently.

Li et al. [17] derive the asymptotic results for Mahalanobis-based rerandomization. They show that the asymptotic distribution of the SATE and PATE (under random sampling) estimators after rerandomization is generally non-normal. Instead, it is a linear combination of a normally distributed variable and a truncated normal variable.

Let $Y(w) = (Y_{1}(w), Y_{2}(w), \ldots, Y_{n}(w))'$, $w = 0, 1$, and let $R^{2}$ be the squared multiple correlation of $Y(0)$ on $X$. Under the assumptions that (i) the residual in the linear projection of $Y(0)$ on $X$ is normally distributed and (ii) treatment effects are additive (so that $R^{2}$ is also the squared multiple correlation of $Y(1)$ on $X$), the percentage reduction in variance (PRIV) of $\hat{τ}^{RR}$ relative to the corresponding estimator under complete randomization is

$$PRIV = 100 \times \frac{V(\hat{τ}^{CR}) - V(\hat{τ}^{RR})}{V(\hat{τ}^{CR})} = 100 \times R^{2} (1 - ν_{a}), \tag{A6}$$

where $V(\cdot)$ denotes the variance of the estimators. From this expression together with Equations (A1) and (A3), it becomes clear that the variance reduction from Mahalanobis-based rerandomization relative to complete randomization is decreasing in $p_{a}$ (a smaller $p_{a}$ corresponds to a stricter rerandomization criterion) and non-increasing in $K$, the dimension of $X$.
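As a worked illustration of (A6), take the hypothetical values $K = 2$, $p_{a} = 0.001$ and $R^{2} = 0.5$; none of these numbers are taken from the experiment. For $K = 2$ the chi-square probabilities in (A1) and (A3) have closed forms, which gives:

```latex
\begin{aligned}
p_a &= 1 - e^{-a/2} \;\Rightarrow\; a = -2\ln(1 - p_a) \approx 0.0020, \\
\nu_a &= \frac{1 - e^{-a/2}\,(1 + a/2)}{1 - e^{-a/2}}
       = 1 + \frac{(1 - p_a)\ln(1 - p_a)}{p_a} \approx 0.0005, \\
PRIV &= 100 \times R^2\,(1 - \nu_a) \approx 100 \times 0.5 \times 0.9995 \approx 49.97 .
\end{aligned}
```

With the looser criterion $p_{a} = 0.1$, the same formulas give $ν_{a} \approx 0.052$ and $PRIV \approx 47.4$. In this hypothetical case most of the attainable gain is therefore already reached at moderate strictness, and the gain can never exceed $100 \times R^{2}$ percent.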

References

[1] Amrhein, V., Greenland, S., and McShane, B. (2019). Scientists rise up against statistical significance. Nature, 567(7748):305–307.
[2] Angelov, N. and Johansson, P. (2020). Using intelligence from international tax cooperation to improve voluntary tax compliance: Evidence from a Swedish field study. AEA RCT Registry, May 4 2020.
[3] Bertsimas, D., Johnson, M., and Kallus, N. (2015). The power of optimization over randomization in designing experiments involving small samples. Operations Research, 63(4):868–876.
[4] Chung, E. and Romano, J. P. (2013). Exact and asymptotically robust permutation tests. Annals of Statistics, 41(2):484–507.
[5] Eicker, F. (1967). Limit theorems for regressions with unequal and dependent errors. In Proceedings of the Fifth Berkeley Symposium on Mathematical Statistics and Probability, volume I, pages 59–82. University of California Press, Berkeley, CA.
[6] Freedman, D. (2008). On regression adjustments to experimental data. Advances in Applied Mathematics, 40(1):180–193.
[7] Gelman, A. and Loken, E. (2014). The statistical crisis in science. American Scientist, 102:460–465.
[8] Huber, P. J. (1967). The behavior of maximum likelihood estimates under nonstandard conditions. In Proceedings of the Fifth Berkeley Symposium on Mathematical Statistics and Probability, volume I, pages 221–233. University of California Press, Berkeley, CA.
[9] Johansson, P., Rubin, D. B., and Schultzberg, M. (2021). On optimal rerandomization designs. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 83(2):395–403.
[10] Johansson, P. and Schultzberg, M. (2020). Rerandomization strategies for balancing covariates using pre-experimental longitudinal data. Journal of Computational and Graphical Statistics, 29(4):798–813.
[11] Johansson, P. and Schultzberg, M. (2022). Rerandomization: A complement or substitute for stratification in randomized experiments? Journal of Statistical Planning and Inference, 218:43–58.
[12] Kallus, N. (2018). Optimal a priori balance in the design of controlled experiments. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 80(1):85–112.
[13] Kapelner, A., Krieger, A. M., Sklar, M., Shalit, U., and Azriel, D. (2020). Harmonizing optimized designs with classic randomization in experiments. The American Statistician, 0(0):1–12.
[14] Krieger, A. M., Azriel, D., and Kapelner, A. (2019). Nearly random designs with greatly improved balance. Biometrika, 106(3):695–701.
[15] Lauretto, M. S., Stern, R. B., Morgan, K. L., Clark, M. H., and Stern, J. M. (2017). Haphazard intentional allocation and rerandomization to improve covariate balance in experiments. AIP Conference Proceedings, 1853(June).
[16] Li, X. and Ding, P. (2019). Rerandomization and regression adjustment. To appear in Journal of the Royal Statistical Society, Series B.
[17] Li, X., Ding, P., and Rubin, D. B. (2018). Asymptotic theory of rerandomization in treatment-control experiments. Proceedings of the National Academy of Sciences of the United States of America, 115(37):9157–9162.
[18] Morgan, K. L. and Rubin, D. B. (2012). Rerandomization to improve covariate balance in experiments. Annals of Statistics, 40(2):1263–1282.
[19] Mutz, D. C., Pemantle, R., and Pham, P. (2019). The perils of balance testing in experimental design: Messy analyses of clean data. The American Statistician, 73(1):32–42.
[20] Rubin, D. B. (1980). Discussion of “Randomization analysis of experimental data: the Fisher randomization test” by D. Basu. Journal of the American Statistical Association, 75(2):591–593.
[21] Wasserstein, R. L., Schirm, A. L., and Lazar, N. A. (2019). Moving to a world beyond “p < 0.05”. The American Statistician, 73(sup1):1–19.
[22] White, H. (1980). Using least squares to approximate unknown regression functions. International Economic Review, 21(1):149–170.
[23] Zhang, J. L. and Johansson, P. (2022). Model-based Bayesian inference under computer assisted balance-improving designs. Statistics in Medicine, 41(21):4245–4265.

1	Within the CRS, tax authorities, including the Swedish Tax Agency, obtain information from financial institutions in their own jurisdiction and automatically exchange that information with other jurisdictions on an annual basis. The exchanged information covers many countries and a vast amount of assets. In 2019, nearly 100 countries carried out automatic exchange of information, enabling their tax authorities to obtain data on 84 million financial accounts held offshore by their residents. This covered total assets of EUR 10 trillion which is twice as much as the number during 2018, the first year in which such automatic information exchange took place. See http://www.oecd.org/tax/international-community-continues-making-progress-against-offshore-tax-evasion.htm. (retrieved on September 20, 2020).
2	The design together with a pre-analysis plan is published in [2].
3	Source: https://svenskarnaochinternet.se/rapporter/svenskarna-och-internet-2019/digitala-samhallstjanster/halften-av-svenskarna-har-en-digital-brevlada/. The information was retrieved on September 30, 2020.
4	Let $Y_{i}$ be an outcome variable for individual $i$ and let $n_{1}$ and $n_{0}$ be the number of treated ($W_{i} = 1$) and controls ($W_{i} = 0$), respectively. The variance of the difference-in-means estimator, $\hat{τ} = \frac{1}{n_{1}} \sum_{i : W_{i} = 1} Y_{i} - \frac{1}{n_{0}} \sum_{i : W_{i} = 0} Y_{i}$, is $Var(\hat{τ}) = \frac{σ_{1}^{2}}{n_{1}} + \frac{σ_{0}^{2}}{n_{0}} = \frac{n_{0} σ_{1}^{2} + n_{1} σ_{0}^{2}}{n_{0} n_{1}}$, where $σ_{1}^{2}$ and $σ_{0}^{2}$ are the population variances of the outcome under treatment and under no treatment, respectively. Let $σ_{1}^{2} = σ_{0}^{2} = σ^{2}$; then $Var(\hat{τ}) = σ^{2} (n_{0} + n_{1}) / (n_{0} n_{1})$. With $n_{0} = n_{1} = n / 2$ we get $\sqrt{Var(\hat{τ})} = \sqrt{2} σ / \sqrt{n_{1}}$. Now let $n_{0} = k n_{1}$, $k > 1$; then $\sqrt{Var(\hat{τ})} = σ \sqrt{(1 + k) / k} / \sqrt{n_{1}}$. A quadrupling of the number of controls (i.e. $k = 4$) would thus in expectation decrease the standard error by around 20 percent: $\frac{\sqrt{2} - \sqrt{(1 + k) / k}}{\sqrt{2}} \simeq 0.21$.
5	The logic behind this is that for compliance with the tax code, the amount of $fdiv$ should be included along with other capital income sources in the declared $capinc$. As mentioned previously, $capinc$ does not include capital gains and can be zero but cannot be negative. Therefore, although $fdiv \leq capinc$ is not necessarily a sign of compliance, $fdiv > capinc$ is a clear measure of non-compliance.
6	We have two outcomes and thus two main effects. With an additional four heterogeneous effects on two outcomes, a total of ten tests were of interest. Using the Bonferroni correction, the individual tests would have been at the 0.5 % level in order to have an overall risk of 5 %. Instead we decided on a sequential procedure and to test for heterogeneous effects only if we found an overall effect based on the Bonferroni correction.
7	The results can be obtained by contacting the authors by email.
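The arithmetic in footnote 4 can be reproduced with a short sketch. It assumes, as the footnote does, equal outcome variances in the two groups and $n_{0} = k n_{1}$; the function name `se_factor` and the set of values of `k` are chosen only for the illustration.

```python
import math

def se_factor(k):
    """Standard error of the difference in means, in units of sigma / sqrt(n1), when n0 = k * n1."""
    return math.sqrt((1 + k) / k)

# Relative reduction in the standard error compared with the balanced design (k = 1).
for k in (1, 2, 4, 10):
    reduction = 1 - se_factor(k) / se_factor(1)
    print(f"k = {k:2d}: standard error reduced by {100 * reduction:.1f} %")
```

Since `se_factor(k)` tends to 1 as `k` grows, the reduction relative to the balanced design is bounded by $1 - 1/\sqrt{2} \approx 0.29$, however many controls are added for a fixed number of treated.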

Table 1. Group averages and standard deviations (in parentheses) after rerandomization

		$age$		$fdiv$		$capinc$		$tax$		$earn$		$finc$		# obs
$1[compliant = 1]$	$1[woman = 1]$	T	C	T	C	T	C	T	C	T	C	T	C	T	C
0	0	50.6	50.9	7.4	7.5	1.8	1.8	253.1	248.6	625.6	615.5	0.08	0.08	167	167
		(11.2)	(11.3)	(5.2)	(5.4)	(3.7)	(3.1)	(226.7)	(241.9)	(464.9)	(458)
0	1	51.6	51.9	6.5	6.5	1.5	1.6	206.9	208	544.5	548.7	0.06	0.07	83	83
		(11.1)	(10.8)	(4.4)	(4.8)	(3.6)	(3.9)	(227.1)	(233.2)	(422.3)	(446)
1	0	52.7	52.8	6.4	6.4	11.2	11.2	341.6	339.8	829.6	829.5	0.07	0.07	200	200
		(7.6)	(8.9)	(3.9)	(4.3)	(10)	(9.5)	(208.4)	(205.1)	(374.9)	(383.7)
1	1	51.9	51.4	7.8	7.6	13.9	13.3	310.1	306.4	774.5	769.9	0.18	0.18	50	50
		(9.2)	(10.4)	(6)	(5.2)	(13.3)	(9.3)	(232.7)	(204.7)	(423.1)	(391.7)

Note: T denotes treated individuals and C denotes controls. The variables are measured during the pre-intervention income year of 2018 and defined as follows: $age$ is the taxpayer’s age measured in years, $fdiv$ is foreign dividends (1,000s SEK), $capinc$ is capital income (1,000s SEK), $tax$ is total tax paid (1,000s SEK), $earn$ is earnings including labor income, sick pay, pension, etc. (1,000s SEK), and $finc \equiv 1[has\ foreign\ income]$ is a categorical 1/0 variable based on a check box in the tax declaration.

Table 2. Group averages of the outcome variables

		$capinc$			$tax$
$1[compliant = 1]$	$1[woman = 1]$	T	C	$T - C$	T	C	$T - C$
0	0	$20.68$	$5.20$	$15.48$	$250.70$	$263.30$	$- 12.60$
0	1	$8.62$	$2.16$	$6.46$	$210.95$	$247.87$	$- 36.92$
1	0	$13.06$	$15.21$	$- 2.15$	$398.03$	$369.53$	$28.49$
1	1	$12.76$	$13.57$	$- 0.81$	$342.28$	$371.79$	$- 29.51$
All strata		$14.85$	$9.54$	$5.31$	$312.39$	$314.18$	$- 1.79$

Note: T denotes treated individuals, C denotes controls, and $T - C$ denotes the difference between treated and controls. The variables are measured post-intervention during the income year 2019. $capinc$ is capital income (1,000s SEK) and $tax$ is total tax paid (1,000s SEK). $1[\cdot]$ denotes the indicator function, valued one if the expression within brackets is true and zero otherwise.

Table 3. Effect estimates

	OLS (1000s SEK)
	(1)	(2)
	$capinc$	$tax$
$W$ (treatment effect)	5.225	−3.856
	(4.292)	(12.593)
$woman$	−6.108	14.356
	(5.446)	(21.153)
$compliant$	−9.768	34.853**
	(8.640)	(15.958)
$woman \times compliant$	2.537	−15.505
	(6.526)	(31.088)
$age$	−0.076	−0.896
	(0.108)	(0.621)
$fdiv$	0.011	3.973
	(0.362)	(3.097)
$capinc$	1.339***	−0.589
	(0.310)	(1.006)
$earn$	−0.003	0.238***
	(0.006)	(0.087)
$tax$	0.008	0.511***
	(0.010)	(0.179)
$finc$	8.343	61.637
	(7.545)	(59.788)
$W \times age$	0.405*	0.681
	(0.220)	(1.070)
$W \times fdiv$	1.034	−6.818*
	(1.005)	(3.739)
$W \times capinc$	−0.739	0.608
	(0.611)	(1.270)
$W \times earn$	0.025	0.012
	(0.017)	(0.118)
$W \times tax$	−0.010	0.023
	(0.020)	(0.232)
$W \times finc$	−9.955	−39.750
	(8.748)	(70.193)
Intercept	15.860***	295.468***
	(5.164)	(12.908)
Observations	998	998
Adjusted $R^{2}$	0.015	0.546

Note: *p<0.1; **p<0.05; ***p<0.01. Outcomes are valued during the income year of 2019 and are expressed either in 1000s SEK, which is approximately equal to 100s EUR, or as categorical variables ($1[\cdot]$). The covariates are demeaned and valued pre-intervention during 2018. The units of measurement are as follows: $age$ is measured in years, $finc$ is valued 0 or 1, and the rest of the variables are expressed in 1000s SEK. Eicker-Huber-White (EHW) robust standard errors are shown in parentheses.

Table 4. Standard deviations of the estimators under the null of no effect

	$capinc$	$tax$
	I. No covariates
I.1 Rerandomization within stratum, A	$2.8927$	$12.1143$
I.2 Stratification, B	$2.9861$	$17.8315$
I.3 Complete randomization, C	$3.0741$	$18.8310$
I.4 $100 \times (B - A) / A$	$3.2310$	$47.1940$
I.5 $100 \times (C - A) / A$	$6.2700$	$55.4452$
I.6 $100 \times (C - B) / B$	$2.9439$	$5.6056$
	II. Covariates included
II.1 Rerandomization within stratum, D	$2.5993$	$12.8921$
II.2 Stratification, E	$2.6041$	$12.7851$
II.3 Complete randomization, F	$2.7297$	$13.5777$
II.4 $100 \times (E - D) / D$	$0.1846$	$- 0.8295$
II.5 $100 \times (F - D) / D$	$5.0186$	$5.3184$
II.6 $100 \times (F - E) / E$	$4.8251$	$6.1993$

Note: The regression estimator is used in all analyses with covariates based on the specification in equation . The difference-in-means estimator and the stratified estimator are used in the analyses without covariates. The latter estimator is used in the stratified randomization design and the rerandomization design. The simulation is performed on non-treated individuals, i.e., under the null hypothesis of no effect.

Table 5. Average coverage and length of the estimated asymptotic confidence interval

	capinc 10%	capinc 5%	capinc 1%	tax 10%	tax 5%	tax 1%
Coverage
Rerandomization	$0.9040$	$0.9600$	$0.9990$	$0.8920$	$0.9500$	$0.9890$
Stratification	$0.9020$	$0.9610$	$0.9970$	$0.8970$	$0.9590$	$0.9930$
Complete randomization	$0.8910$	$0.9590$	$0.9970$	$0.8720$	$0.9350$	$0.9860$
Length
Rerandomization	$8.2917$	$9.8802$	$12.9847$	$41.6318$	$49.6073$	$65.1951$
Stratification	$8.1711$	$9.7365$	$12.7959$	$41.6130$	$49.5849$	$65.1656$
Complete randomization	$8.3017$	$9.8921$	$13.0004$	$42.0467$	$50.1017$	$65.8448$

Note: The regression estimator is used in all analyses, based on the specification in Equation (3). Standard errors are estimated using the robust covariance estimator. The simulation is performed on non-treated individuals, i.e., under the null hypothesis of no effect.

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

