Net impact of Interreg: statistical inference
As part of its transnational outreach efforts, ESPON has been working with the Austrian Institute for Regional Studies, ÖIR, on assessing counterfactual methods for their applicability to Interreg. The results have been published in a transnational brief for Interreg programming authorities, outlining popular causal inference methods typically applied to policy impact assessments. The paper offers conclusions about their applicability to Interreg, taking into account basic statistical assumptions.
Such conclusions are relevant beyond the design of the performance framework. Interreg programmes are often designed to support projects that internalise area-specific negative externalities occurring across jurisdictions. In such cases, eligible applicants (e.g. local administrations) are selected because they are affected by a negative externality, which leads to the problem of reverse causality, i.e. the condition of the locality causing the treatment. Counterfactual comparisons would then tend to be biased downwards. Is this a bad thing? Not necessarily: it may be more difficult to statistically infer improvements as a result of the treatment (i.e. the Interreg intervention), but this does not mean that the treatment has not been beneficial.
In considering externalities, we enter the domain of the famous Coase theorem: by lowering the transaction cost for border stakeholders, cross-border cooperation projects are the most efficient way of internalising negative border externalities. For example, nutrient outflows from agriculture in region A and fisheries in region B can be reconciled at a Pareto-efficient level and at the lowest possible cost. This can certainly be measured, and statistically significant results can be obtained, provided that there are enough treated and non-treated farmers and fishers. The problem is that border area stakeholders may identify a number of externality cases, each of which would require a different indicator, and these are most likely not compatible with deriving a single programme result indicator. Moreover, programming authorities do not possess ex-ante knowledge of all possible negative externalities; identifying them is typically the role of the supported actions.
This problem would typically force programmes to apply a teleological programme design, i.e. Interreg interventions that are a function of their goal (e.g. increasing the prosperity of the region, used here as a proxy for all possible generic result indicators). Although an admirable endeavour, this pathway will inevitably cross other funding streams pursuing similar objectives. Moreover, many Interreg beneficiaries are also beneficiaries of other funding schemes, so intangible benefits such as knowledge move from project to project regardless of the funding scheme. This makes it impossible to disentangle the effects of participation in the different treatments (various publicly funded projects), a problem known statistically as multicollinearity. This is where quasi-experimental counterfactual designs come into play. Here the problems are related not to ceteris paribus reliability but to statistical significance. However, even if statistically significant estimates can be obtained, the economic significance is likely to be negligible given the overlaps of Interreg with other schemes that are often more influential in pecuniary terms.
This would lead to the conclusion that the question behind the programme mission (‘what do we want to change?’) cannot be answered without taking into account the question of impact (‘is what we changed of economic significance?’) and the question of evidence for the net impact (‘can we make a statistically significant inference that what we have changed is a result of our Interreg intervention?’). So, discussions about territoriality and functional areas, i.e. the identification of territorial needs that cannot adequately be addressed by any other public policy (i.e. Interreg niches), would be better informed by an understanding of statistical and economic significance. Such an understanding can draw attention to measurable benefits for end-users rather than beneficiaries; to place-specific externalities rather than place-invariant challenges.
The transnational brief explains the counterfactual methods most frequently applied in policy impact assessments: difference in differences (DiD), propensity score matching (PSM) and regression discontinuity design (RDD), among others.
The DiD method measures the outcome of a treatment (i.e. participation in an Interreg project) by comparing a treatment group (beneficiaries) with a control group (non-beneficiaries) before and after treatment. Subtracting the before-after change in the control group, which captures the trend over time, from the before-after change in the treatment group yields an estimate of the net impact.
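As a rough illustration of the DiD arithmetic, the following minimal Python sketch works on group means only. The function name `did_estimate` and all outcome values are made up for this example and do not come from the brief; a real assessment would also need to check that both groups followed a common trend before treatment.

```python
from statistics import mean


def did_estimate(treated_before, treated_after, control_before, control_after):
    """Difference in differences computed on simple group means."""
    # Change observed among beneficiaries (treatment effect plus time trend).
    treated_change = mean(treated_after) - mean(treated_before)
    # Change observed among non-beneficiaries (time trend only, by assumption).
    control_change = mean(control_after) - mean(control_before)
    return treated_change - control_change


# Hypothetical result-indicator values, for illustration only.
treated_before = [10.0, 12.0, 11.0]
treated_after = [15.0, 17.0, 16.0]
control_before = [9.0, 10.0, 11.0]
control_after = [11.0, 12.0, 13.0]

effect = did_estimate(treated_before, treated_after, control_before, control_after)
print(effect)  # 3.0: the treated group gained 5, the control group gained 2
```

In this made-up case the beneficiaries improve by 5 units, of which 2 units are attributed to the general trend seen in the control group, leaving an estimated net impact of 3.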
Statistical matching seeks to attain an accurate estimate of the Average Treatment Effect (ATE). The starting point is to subtract the mean of a certain indicator for non-treated units (i.e. non-beneficiaries) from the mean for the treated units (i.e. beneficiaries). The result, however, contains both the ATE and a selection bias: the treated units have not been randomly selected, but certain observable traits suggest that they actually need the treatment (i.e. reverse causality). The propensity score is the probability of participation in the treatment, estimated from multiple observed traits that treated and non-treated units have in common. Treated units can then be matched with counterfactuals that exhibit similar probability levels (strata). An ATE can be obtained for every stratum, and the overall ATE is then the weighted average of the ATEs obtained within the individual strata.
The RDD method addresses the so-called endogeneity problem, i.e. beneficiaries may be selected on the basis of criteria that only well-performing organisations can meet. Such endogeneity would result in an upward bias in the impact estimates, as well-performing organisations might be better equipped to deal with a problem with or without Interreg treatment. Regression discontinuity relies on a threshold for eligibility or selection, and the units (e.g. public organisations, associations, individuals or, where applicable, firms) just above and below the threshold are expected to have similar observable and unobservable traits. Thus, the units just above the threshold can serve as the treatment group and those just below the threshold as the control group. Comparing the mean outcome (result indicator) of these treated and non-treated units would yield the net impact of the treatment.
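The stratified matching and threshold comparisons described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the procedure used in the brief: propensity scores are taken as already estimated (for example by a logistic regression, which is not shown), strata are equal-width score bands, stratum weights are shares of units, and the RDD comparison uses a plain difference in means within a hypothetical `bandwidth` around the threshold. All names and data are invented.

```python
from collections import defaultdict
from statistics import mean


def stratified_ate(units, n_strata=2):
    """Weighted average of within-stratum differences in mean outcomes.

    units: iterable of (propensity_score, treated, outcome) tuples.
    Strata are equal-width bands of the propensity score on [0, 1].
    """
    strata = defaultdict(lambda: {True: [], False: []})
    for score, treated, outcome in units:
        index = min(int(score * n_strata), n_strata - 1)
        strata[index][bool(treated)].append(outcome)

    weighted_sum, total = 0.0, 0
    for groups in strata.values():
        # A stratum lacking either group offers no counterfactual; skip it.
        if not groups[True] or not groups[False]:
            continue
        size = len(groups[True]) + len(groups[False])
        weighted_sum += size * (mean(groups[True]) - mean(groups[False]))
        total += size
    if total == 0:
        raise ValueError("no stratum contains both treated and non-treated units")
    return weighted_sum / total


def rdd_estimate(units, threshold, bandwidth):
    """Difference in mean outcomes just above and just below a threshold.

    units: iterable of (selection_score, outcome) tuples; a unit is assumed
    to be treated when its selection score is at or above the threshold.
    """
    above = [y for s, y in units if threshold <= s < threshold + bandwidth]
    below = [y for s, y in units if threshold - bandwidth <= s < threshold]
    # statistics.mean raises StatisticsError if either side is empty.
    return mean(above) - mean(below)


# Hypothetical units: (propensity score, treated?, outcome).
matched_units = [
    (0.2, True, 6.0), (0.4, True, 6.0), (0.1, False, 4.0), (0.3, False, 4.0),
    (0.6, True, 10.0), (0.7, True, 10.0), (0.8, True, 10.0), (0.9, True, 10.0),
    (0.6, False, 6.0), (0.7, False, 6.0),
]
ate = stratified_ate(matched_units)

# Hypothetical applicants: (selection score, outcome), threshold at 50 points.
applicants = [(20, 1.0), (46, 3.0), (48, 4.0), (49, 5.0),
              (50, 6.0), (52, 7.0), (54, 8.0), (90, 20.0)]
jump = rdd_estimate(applicants, threshold=50, bandwidth=5)

print(ate, jump)
```

In the invented data, the low-score stratum shows a difference of 2 and the high-score stratum a difference of 4, so the overall estimate lies between the two, weighted by stratum size. The RDD comparison ignores the applicants scoring 20 and 90, since they lie far from the threshold and are unlikely to resemble the units near it.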
The authors’ verdicts on the applicability of the above methods to Interreg, as well as of other methods, are available on the ESPON website.