Preprint
Article

This version is not peer-reviewed.

Picking Winners or Marking Them? Timing-Based Evaluation of Innovation Certification

Submitted: 19 June 2026

Posted: 22 June 2026


Abstract
Voluntary certification policies — which let firms self-nominate for innovative or privileged status — increasingly channel fiscal, financial, and regulatory support, yet evaluating them is deceptively hard: certified firms may outperform because certification works, or simply because firms apply when already rising. We show that voluntary certification can certify winners precisely because rational firms time entry to moments of transitory expansion: the measured premium largely reflects when firms participate, not what participation does. We formalise this as anticipatory take-up: because the benefits are most valuable while a firm scales, forward-looking firms register at the crest of a growth run-up. The mechanism yields three predictions — a cross-sectional premium, a premium that largely predates certification, and entry driven by recent growth, not profitability. We test them in the Italian innovative-SME regime, linking panel data for roughly 2,900 certified and 1,200 non-certified SMEs to each firm's registration date and sector, and treating performance as a six-dimensional vector. Certified firms grow far faster than balanced peers yet are financially less solid, especially when smaller and capital-intensive. Once registration timing is exploited through event-study and staggered difference-in-differences designs, the premium proves largely selection: about 81 per cent of the revenue advantage predates registration, and the residual merely continues a pre-existing trend. A hazard model confirms it — entry rises with recent growth and smaller size, not profitability. The paper reframes the evaluation of voluntary certification from average effects to observable dynamic selection: such schemes mark firms already on distinctive trajectories rather than create them. The lesson generalises — robustness to omitted variables is not robustness to selection on trends, and cross-sectional evaluations mislead unless they exploit the timing of take-up.

1. Introduction

Voluntary certification policies are increasingly used to identify firms, technologies, or organisations considered especially innovative, high-potential, sustainable, or socially valuable. These schemes typically grant preferential access to fiscal incentives, public guarantees, finance, reputational benefits, or simplified regulation. Their evaluation, however, faces a common problem: certified units may outperform because certification works, but also because high-performing units choose certification when their performance is already improving. The Italian innovative-SME regime provides an unusually informative setting in which to study this broader problem.
This paper asks why certified firms display a performance premium. Existing evidence leaves this question unresolved for three reasons. First, performance is often reduced to a single indicator or composite score, concealing the trade-offs between growth, profitability, productivity, volatility, and financial stability that shape SME competitiveness. Second, certified and non-certified firms are usually compared contemporaneously, although participation is voluntary and cross-sectional differences may reflect selection rather than certification effects. Third, little is known about when firms enter certification and which firms do so, even though voluntary take-up may be timed to moments of exceptional growth. As a result, we do not know whether the performance premium reflects a treatment effect or a dynamic selection process.
We address this question using the Italian innovative-SME regime as an empirical laboratory. The setting is especially useful because we can link an integrated firm-level dataset for roughly 2,900 certified and 1,200 non-certified SMEs to each firm’s exact date of first entry into the register and to its official industry code. The empirical design is organised around one mechanism: voluntary certification may create the appearance of a policy effect when firms enter certification while already on exceptional growth trajectories. We first document the cross-sectional premium, measuring performance across six dimensions — persistence, growth, productivity, profitability, volatility, and financial stability — rather than collapsing it into a composite index. We then exploit registration timing to test whether the premium appears before or after certification. Finally, we model the registration decision itself to determine whether recent firm growth predicts entry into certification.
The results identify a clear pattern of observable dynamic selection. Cross-sectionally, certified SMEs grow faster but are less financially solid, especially when they are smaller and capital-intensive. Timing evidence shows why this premium appears: roughly four-fifths — about 81 per cent — of the revenue advantage accumulated over the event window emerges before firms enter the register. The remaining post-registration difference is consistent with the continuation of this pre-existing growth trajectory. The registration model confirms the mechanism directly: firms are more likely to register after recent revenue growth and when they are smaller, but not when they are more profitable. Certification is therefore acquired at the crest of an expansion already underway.
The paper contributes to the broader literature on voluntary certification and policy participation by showing how certification schemes can generate observable selection on trends. Certification may mark firms already on distinctive trajectories rather than create those trajectories. Methodologically, the paper shows that robustness to omitted variables is not robustness to selection on trends: balanced cross-sectional comparisons can be internally consistent yet causally misleading unless treatment timing is exploited. Theoretically, it develops an anticipatory take-up mechanism in which firms time certification to the moment when its benefits are most valuable. For policy, the implication is general: voluntary certification schemes should not be credited with performance improvements that are already visible before certification; complementary support should instead target the vulnerabilities of the firms such schemes attract.
The paper proceeds as follows. Section 2 situates the study in the literatures on policy evaluation, voluntary participation, and firm performance, and introduces the Italian setting; Section 3 develops the anticipatory take-up mechanism and its predictions. Sections 4 to 7 describe the data, the empirical strategy, the sectoral taxonomy, and the case for treating performance as a vector rather than a composite index. Sections 8 to 10 establish the cross-sectional premium (faster growth, lower solidity) and show that it is robust to unobserved confounding and concentrated among smaller, capital-intensive firms. Sections 11 to 14 turn to identification: registration timing reveals that most of the premium predates entry, a battery of robustness checks confirms it, a hazard model shows that recent growth (not profitability) drives registration, and a simple model rationalises the pattern. Sections 15 to 18 discuss the findings, draw policy implications, set out limitations, and conclude. Two appendices support the analysis: Appendix A reports descriptive statistics by group, and Appendix B details the correlation and principal-component evidence behind treating performance as a vector.

3. Theoretical Mechanism and Testable Predictions

The evaluation of certification-based innovation policies often assumes that certification precedes and contributes to superior firm performance. This assumption is problematic when certification is voluntary. Firms choose not only whether to participate, but also when to participate. The timing decision can itself generate systematic differences between certified and non-certified firms.
The mechanism is anticipatory take-up. Consider a firm experiencing a temporary period of rapid expansion. During such periods, certification benefits — fiscal incentives, public guarantees, financing opportunities, reputational value, and regulatory simplification — become especially valuable because the firm faces greater financing needs, larger investment commitments, and stronger organisational pressure. If benefits are available immediately upon registration, a forward-looking firm has an incentive to apply when those benefits are most valuable: during a growth run-up, not at a random point in its life cycle.
Certification therefore becomes endogenous to firm dynamics. Firms do not necessarily grow because they are certified; rather, firms already experiencing exceptional growth are more likely to seek certification. Smaller firms should have stronger incentives to do so, because financing constraints make certification benefits more valuable for them. Profitability, by contrast, should matter less for the timing of registration, since the value of certification comes primarily from supporting expansion rather than rewarding operating margins.
This mechanism generates four testable hypotheses.
  • H1. Recent growth predicts registration. Firms with higher recent revenue growth should be more likely to enter certification, because expansion increases the value of certification benefits. This hypothesis is tested with the discrete-time hazard model.
  • H2. Profitability does not independently predict registration. If certification is timed to expansion rather than to operating surplus, profitability should play a weaker role than recent growth in explaining entry. This hypothesis is tested in the same hazard model.
  • H3. Smaller firms are more likely to register. Because certification benefits are more valuable when financing constraints bind, smaller firms should have a higher probability of entering certification, conditional on growth, profitability, sector, and region. This hypothesis is also tested with the hazard model.
  • H4. Pre-registration growth absorbs most of the apparent premium. If certification mainly reflects dynamic selection, the revenue premium observed in cross-sectional comparisons should largely emerge before registration. The event-study and staggered difference-in-differences designs test this hypothesis by examining whether the premium appears before or after entry.
The empirical strategy follows directly from these hypotheses. The hazard model tests whether recent growth, profitability, and size predict registration as the anticipatory take-up mechanism implies. The event-study and staggered difference-in-differences designs test whether the apparent growth premium is already present before certification. Together, these tests distinguish a certification effect from observable dynamic selection and assess whether innovative-SME status functions primarily as a motor of performance or as a marker of firms already on exceptional trajectories.

4. Data and Sample Construction

This section describes the data on which the analysis rests and the way the estimation sample is built. The empirical design has two layers — a cross-sectional comparison of innovative and non-innovative SMEs across six performance dimensions, and a difference-in-differences analysis that exploits the timing of entry into the innovative-SME register — and each makes distinctive demands on the data. The first requires comparable, multi-year performance measures and covariates for the two populations; the second requires, in addition, the exact date at which each treated firm acquired its status. Assembling a dataset that meets both demands is itself part of the contribution, and the three subsections that follow document how it is done.

4.1. Sources: the AIDA Financial Panel, the Business Register, and ATECO Codes

We combine three sources. The backbone is the AIDA database (Analisi Informatizzata delle Aziende Italiane, Bureau van Dijk), which provides harmonised financial statements for Italian limited companies (Bajgar et al., 2020) and is widely used in firm-level analyses of Italian innovative SMEs (Anderloni & Harasheh, 2025; Schifilliti & La Rocca, 2024). From AIDA we draw 2,919 innovative SMEs and 1,182 non-innovative SMEs from the same size range. For each firm, AIDA reports up to ten years of balance-sheet and income-statement items, with accounting years mostly ending in 2024 and, for a minority, in 2025. We extract revenue, EBITDA, operating cash flow, total assets, employees, and operating-cost components as full annual series, so that all performance measures are constructed from primitive accounting quantities rather than pre-computed ratios.
The AIDA extracts required two harmonisation steps. First, misaligned column headers in the innovative-SME extract were corrected by identifying fields from their content before merging. Second, because the two populations used different native margin conventions, we discarded pre-computed margins and recomputed EBITDA margins uniformly as EBITDA over revenue, winsorised to the [−100, 100] interval, following standard Bureau van Dijk data-cleaning practice (Kalemli-Özcan et al., 2024). Tax identifiers were normalised to eleven digits to restore leading zeros.
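The two cleaning rules described above can be illustrated with a minimal sketch. The function names are hypothetical, and winsorisation is read here as clipping to the fixed bounds; this is an assumption about the procedure, not a description of the authors' code.

```python
def normalise_tax_id(raw):
    """Left-pad a tax identifier to eleven digits, restoring leading zeros.

    Assumes the identifier arrives as a string or a number whose leading
    zeros were dropped by a spreadsheet export.
    """
    if isinstance(raw, float):
        raw = int(raw)  # e.g. 1234567890.0 read from a numeric column
    digits = "".join(ch for ch in str(raw) if ch.isdigit())
    return digits.zfill(11)


def ebitda_margin(ebitda, revenue, lower=-100.0, upper=100.0):
    """EBITDA over revenue in percentage points, clipped to [lower, upper].

    Returns None when EBITDA is missing or revenue is missing or zero,
    so the ratio is never computed on an undefined denominator.
    """
    if ebitda is None or not revenue:
        return None
    margin = 100.0 * ebitda / revenue
    return max(lower, min(upper, margin))
```

Recomputing the margin from primitive items in this way applies one convention to both populations, which is the point of discarding the pre-computed margins.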
The second source is the special section of the Italian Business Register for innovative SMEs, maintained by the Chambers of Commerce and used in recent analyses of the Italian innovative-firm regime (Biancalani et al., 2022; Cassinis et al., 2025; Manaresi et al., 2021). From the register we obtain each firm’s tax identifier, first registration date, incorporation date, municipality and province, legal form, and coarse size and capital classes. The extract covers 3,221 registered innovative SMEs; after parsing, 2,523 firms have usable records. Registration dates span 2015–2026, confirming that the field captures first entry into the regime — the economically relevant treatment date — rather than an administrative snapshot.
The third source is the official ATECO industry code, also recovered from the register. Because the AIDA extraction did not include sectoral information, we use ATECO codes to validate, and where coverage permits replace, the cost-based taxonomy used in the cross-sectional analysis. Since AIDA also lacks regional information, we map each firm’s province to the North, Centre, or South using the standard administrative correspondence. The three sources are linked by tax identifier: AIDA supplies outcomes and firm characteristics, the register supplies treatment timing, incorporation date and sector, and the province map supplies geography. The resulting dataset combines multi-year financial histories, exact registration timing and official sectoral classification, enabling the difference-in-differences identification used below.

4.2. The Six Performance Dimensions

Performance, for a small firm, is not a single quantity but a profile: a firm may grow rapidly and erratically, or expand slowly while generating stable cash flows, and these are different forms of performance that no single measure can represent (Csapi & Balogh, 2020; OECD, 2021). We therefore characterise each firm along six dimensions, constructed directly from annual series of revenue, EBITDA, operating cash flow, total assets, and employees, and computed over the full observation window. All measures are standardised across the pooled sample so that they are expressed in comparable standard-deviation units.
Growth is the mean annual log change in revenue, capturing the pace of expansion (OECD, 2021; Serrasqueiro et al., 2023). Persistence measures the first-order serial correlation of annual revenue growth, capturing whether growth carries forward over time. Volatility is the standard deviation of annual revenue growth, recording how erratic the expansion is. Productivity is mean log revenue per employee, locating the firm on the efficiency margin (OECD, 2021; Serrasqueiro et al., 2023). Profitability is the EBITDA margin, recomputed uniformly as EBITDA over revenue, expressed in percentage points and winsorised to the [−100, 100] interval to limit the influence of very small denominators (Csapi & Balogh, 2020; Rodríguez Valencia, 2025). Financial stability is a cash-flow stability index, defined as mean operating cash flow over its standard deviation, so that higher values indicate steadier and more predictable cash generation (Bakhtiari et al., 2020; Laghari et al., 2023).
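As an illustration of these definitions, the sketch below computes the six raw dimensions for a single firm from aligned annual series. It rests on assumptions the text does not settle: profitability is taken as the mean of annual clipped margins, standard deviations are sample standard deviations, and the pooled standardisation is left as a separate step. The function names are hypothetical.

```python
import math
import statistics


def _pearson(x, y):
    """Pearson correlation; NaN when either series has no variation."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return float("nan")
    return sxy / math.sqrt(sxx * syy)


def performance_profile(revenue, ebitda, cash_flow, employees):
    """Six raw performance dimensions from aligned annual series (one firm)."""
    # Annual log changes in revenue.
    g = [math.log(b / a) for a, b in zip(revenue, revenue[1:])]
    # Annual EBITDA margins in percentage points, clipped to [-100, 100].
    margins = [max(-100.0, min(100.0, 100.0 * e / r))
               for e, r in zip(ebitda, revenue)]
    log_rev_per_head = [math.log(r / n) for r, n in zip(revenue, employees)]
    return {
        "growth": statistics.fmean(g),
        # First-order serial correlation of growth: corr(g_t, g_{t-1}).
        "persistence": _pearson(g[1:], g[:-1]),
        "volatility": statistics.stdev(g),
        "productivity": statistics.fmean(log_rev_per_head),
        "profitability": statistics.fmean(margins),
        # Mean operating cash flow over its standard deviation.
        "financial_stability": statistics.fmean(cash_flow) / statistics.stdev(cash_flow),
    }
```

A firm whose growth alternates between 10 and 20 per cent, for example, would show positive growth but a persistence close to minus one, which is why the two dimensions are kept apart.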
We analyse these six dimensions as a multivariate vector rather than aggregating them into a composite competitiveness score. The reason is both substantive and methodological: growth, persistence, volatility, productivity, profitability, and cash solidity need not move together, and the central tensions of SME performance lie precisely in their trade-offs (Csapi & Balogh, 2020; OECD, 2021; Serrasqueiro et al., 2023). A composite index would net these movements against one another and obscure the growth-versus-solidity structure the paper seeks to identify. Section 7 therefore tests aggregability through principal component analysis and shows that the dimensions are weakly and, in places, oppositely correlated. This justifies the multivariate treatment adopted throughout and makes visible both the growth–stability trade-off and the selective nature of the apparent growth premium. See Table 2.

4.3. Matching of Registration Dates, Sample, and Descriptive Statistics

The estimation sample is built in three steps: enforcing the SME perimeter, matching registration dates, and constructing the comparison group. We first retain only firms that satisfy the European SME thresholds over the observation window (fewer than 250 employees, no more than EUR 50 million in revenue, and no more than EUR 43 million in total assets) and remove duplicates across extracts. This yields a cross-sectional sample of roughly 3,900 firms with complete information on performance dimensions, region, sector, and size, including about 2,700–2,900 innovative SMEs and about 1,160 non-innovative SMEs.
Registration dates are then matched to the AIDA panel by tax identifier. Among innovative firms, 2,040 are linked to a usable registration date; unmatched firms remain in the cross-sectional analysis but are excluded from the difference-in-differences sample. The 1,182 non-innovative firms never enter the register and therefore provide a clean never-treated comparison group, the preferred control in staggered designs (Roth et al., 2023; Sant’Anna & Zhao, 2020). Registration dates span 2015–2026, with entries increasing sharply after 2019 and concentrating in 2022–2025. This places the design in a staggered-adoption setting where estimator choice is consequential (Baker et al., 2022), while also implying that recent cohorts have limited post-registration accounting data.
The panel design depends on overlap between registration timing and the accounting window, which closes in 2024 for most firms. Of the matched treated firms, 1,421 have at least one post-registration year, 1,010 have at least two pre- and two post-registration years, and 685 have at least three on each side. The difference-in-differences estimates therefore rely on about one thousand treated firms with adequate two-sided coverage, at the cost of restricting identification to cohorts whose registration falls within the observable window (Athey & Imbens, 2022; Sun & Shapiro, 2022). We also compare matched and unmatched innovative firms on observable characteristics to clarify the population to which the causal estimates apply.
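The coverage counts above follow from comparing each firm's registration year with its accounting years. A minimal sketch, assuming (the text does not say) that the registration year itself counts as a post-registration year; the function names are hypothetical.

```python
def two_sided_coverage(registration_year, panel_years):
    """Return (pre, post): accounting years before, and from, registration."""
    years = list(panel_years)
    pre = sum(1 for y in years if y < registration_year)
    # Assumption: the registration year is treated as the first post year.
    post = sum(1 for y in years if y >= registration_year)
    return pre, post


def has_coverage(registration_year, panel_years, k):
    """True when the firm has at least k years on each side of registration."""
    pre, post = two_sided_coverage(registration_year, panel_years)
    return pre >= k and post >= k
```

With a panel that closes in 2024, a firm registering in 2024 contributes at most one post-registration year under this convention, which illustrates why recent cohorts fall out of the two-sided samples.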
Descriptive statistics already anticipate the paper’s central tension. Innovative SMEs grow much faster than comparison firms, with median revenue growth roughly three times higher, but display lower cash-flow stability and profitability that is at best comparable. Productivity is higher among comparison firms, while persistence and volatility differ only modestly. The two groups also differ geographically and structurally: innovative firms are more concentrated in the North and in capital- and knowledge-intensive activities, whereas comparison firms are more evenly distributed and more material-intensive. The subsequent cross-sectional analysis asks whether these gaps survive covariate balancing, while the difference-in-differences design asks whether they reflect the effect of innovative status or the type of firms that select into it. As later results show, the descriptive gaps are real, but their causal content is limited because registering firms are already on distinctive pre-treatment trajectories, precisely the setting in which parallel pre-trends require careful scrutiny (Roth, 2022; Roth & Sant’Anna, 2023).

5. Testing the Anticipatory Take-Up Mechanism

The empirical strategy is organised around the three predictions generated by the anticipatory take-up mechanism. The central question is whether innovative-SME certification creates superior performance or whether it selects firms already on superior trajectories. The design therefore proceeds sequentially: it first measures the performance profile associated with certification, then tests whether this profile can be explained by static differences between firms, then asks whether the premium predates registration, and finally examines whether recent growth predicts entry into certification.
The first step concerns measurement. We treat firm performance as a multidimensional profile rather than a single score because certification may affect, or select on, some dimensions but not others. The six dimensions — persistence, growth, productivity, profitability, volatility, and financial stability — are constructed from the financial panel. To compare firms within economically meaningful environments, we use official ATECO codes where available and, where sectoral information is missing, derive a cost-based taxonomy from operating-cost structure using k-means clustering (Hartigan & Wong, 1979). This taxonomy is a benchmarking device, not a separate contribution. We then use principal component analysis to test whether the six dimensions can be collapsed into a single index (Jolliffe & Cadima, 2016). Since no dominant same-signed component emerges, performance is analysed as a vector, allowing the growth-versus-solidity trade-off to remain visible.
The second step tests Prediction 1: certified firms should display a cross-sectional performance premium. We estimate the association between innovative status and each performance dimension after making certified and non-certified firms comparable on observed characteristics. Entropy balancing reweights the comparison group so that sector, size, and regional covariate moments match those of innovative firms, with balancing performed within macro-regions and inference obtained by bootstrap (Hainmueller, 2012). This step answers a descriptive question: once firms are observably comparable, how do certified firms differ from non-certified firms?
The third step asks whether the cross-sectional premium can be explained by omitted static characteristics. Oster bounds assess how strong unobserved confounding would need to be, relative to observed controls, to eliminate the main cross-sectional differences (Oster, 2019). This does not establish causality, because it cannot address selection on pre-treatment trends. Its role is to determine whether the premium is merely an artefact of unobserved time-invariant firm traits.
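For intuition, the sketch below implements the widely used linear approximation to the Oster bound, not the exact estimator (which solves a cubic). The function and argument names are hypothetical: "short" denotes the regression without controls and "long" the regression with controls.

```python
def oster_beta_star(beta_short, r2_short, beta_long, r2_long, r2_max, delta=1.0):
    """Approximate bias-adjusted coefficient.

    beta* ~= beta_long - delta * (beta_short - beta_long)
                       * (r2_max - r2_long) / (r2_long - r2_short)
    """
    movement = beta_short - beta_long
    return beta_long - delta * movement * (r2_max - r2_long) / (r2_long - r2_short)


def oster_delta_for_zero(beta_short, r2_short, beta_long, r2_long, r2_max):
    """Approximate delta at which the adjusted coefficient would equal zero."""
    movement = beta_short - beta_long
    return beta_long * (r2_long - r2_short) / (movement * (r2_max - r2_long))
```

With purely illustrative numbers (a coefficient moving from 0.5 to 0.4 as R-squared rises from 0.1 to 0.3, and a maximum R-squared of 0.39), the adjusted coefficient is about 0.355 and unobservables would need to be roughly nine times as important as observables to remove it. Both quantities speak only to static confounding, which is why they cannot rule out selection on trends.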
The fourth step identifies where the premium is concentrated. We use causal forests to recover flexible heterogeneity in the cross-sectional differences across firms (Athey et al., 2019; Wager & Athey, 2018), interpreted through feature-importance and SHAP analysis (Lundberg & Lee, 2017). This component is descriptive rather than decisive for identification: it shows which firm types drive the observed differences, especially whether the growth–solidity trade-off is concentrated among smaller and capital-intensive firms.
The fifth and decisive step tests Prediction 2: if certification selects firms already on superior trajectories, a substantial share of the growth premium should appear before registration. We exploit registration timing by treating innovative status as a staggered, time-varying event and comparing each firm with its own pre-registration trajectory, using non-innovative firms as a never-treated control group. A two-way fixed-effects event study provides the event-time profile and tests for pre-trends; the doubly robust staggered estimator of Callaway and Sant’Anna (2021) addresses the biases of conventional two-way fixed effects under heterogeneous treatment timing. Firm fixed effects absorb time-invariant heterogeneity, while year fixed effects absorb common shocks.
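In symbols, the two-way fixed-effects event study can be sketched as follows. The notation is ours rather than the paper's: y_it is a performance outcome for firm i in year t, G_i is the firm's registration year, alpha_i and lambda_t are firm and year fixed effects, and the choice of k = -1 as the omitted reference period is an assumption. For never-treated firms all event-time indicators are zero.

```latex
y_{it} = \alpha_i + \lambda_t
       + \sum_{k \neq -1} \beta_k \, \mathbf{1}\!\left[\, t - G_i = k \,\right]
       + \varepsilon_{it}
```

Coefficients with k < -1 trace the pre-registration profile, so jointly non-zero pre-period coefficients indicate that the premium is already forming before entry. The staggered estimator instead targets group-time average effects, which in the usual notation are

```latex
ATT(g, t) = \mathbb{E}\!\left[\, Y_t(g) - Y_t(0) \mid G_i = g \,\right]
```

for firms first registered in year g, evaluated in year t and then aggregated by event time.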
The final step tests Prediction 3: recent growth, rather than profitability, should predict entry into certification. A discrete-time hazard model estimates which firms enter the register and when. This step tests the mechanism directly. If anticipatory take-up drives certification timing, registration should be more likely after recent revenue growth and among firms for which certification benefits are most valuable, while profitability should play a weaker role.
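A discrete-time hazard model is typically estimated on person-period data, in which each firm contributes one row per year at risk. The sketch below shows that expansion and the logistic link; the field names, the start of the risk window, and the use of a logit rather than a complementary log-log link are assumptions for illustration.

```python
import math


def person_period_rows(firm_id, first_year, last_year, registration_year=None):
    """One row per firm-year at risk; event = 1 only in the registration year.

    A firm leaves the risk set once it registers. Firms that never register
    (registration_year=None) contribute every year with event = 0.
    """
    rows = []
    for year in range(first_year, last_year + 1):
        if registration_year is not None and year > registration_year:
            break
        rows.append({
            "firm": firm_id,
            "year": year,
            "event": int(year == registration_year),
        })
    return rows


def logistic_hazard(linear_index):
    """Probability of registering this year, given survival so far."""
    return 1.0 / (1.0 + math.exp(-linear_index))
```

Lagged revenue growth, profitability, and size would then be attached to each row as covariates, so that the estimated coefficients indicate which of them raises the yearly probability of entry.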
Taken together, the methods form a sequence of tests of a single theoretical mechanism rather than a collection of independent exercises. The cross-sectional layer tests whether a premium exists; sensitivity analysis asks whether it reflects omitted static characteristics; heterogeneity analysis identifies where it is concentrated; event-time evidence tests whether it predates certification; and the hazard model tests whether recent growth predicts entry. The common purpose is to distinguish what innovative-SME status marks from what it causes. See Table 3.
Figure 1 summarises the empirical strategy.

6. Sectoral Structure and Performance Measurement

Sector is an essential conditioning variable in comparisons of firm performance, since cost structures, capital intensity, and margins differ systematically across industries. Our primary sectoral classification is the official ATECO code recovered from the Business Register, which we use directly wherever it is available. We complement it with a taxonomy built from firms’ operating-cost composition — the economic structure that industry classifications are themselves meant to proxy — for two reasons: to validate the official codes against firms’ actual production technology, and to fill the residual gaps where register coverage is incomplete. This cost-based measure is particularly valuable because capital intensity, which later emerges as the main moderator of the differences between innovative and comparison firms, is captured more directly and continuously by cost structure than by discrete ATECO headings. The taxonomy therefore supplements and is disciplined by ATECO; it does not replace it, and all substantive results are unchanged when ATECO is used on its own.
The taxonomy is constructed as follows. For each firm we compute the shares of materials, services, personnel, and depreciation in total operating cost, obtaining a four-dimensional profile of whether production is materials-, labour-, service-, or capital-intensive; shares rather than levels are used so that the profile reflects production technology rather than firm size. We partition firms in this cost-share space using k-means clustering, with the number of clusters chosen by the average silhouette criterion and the algorithm initialised repeatedly to avoid local optima. The criterion selects four clusters, with an average silhouette width of 0.41, indicating a moderate but economically interpretable separation. The clusters map cleanly onto four production profiles: materials-intensive firms, where materials account for roughly three-fifths of operating cost; labour-intensive firms, dominated by personnel costs; services-intensive firms, where external services account for close to seventy per cent of cost; and capital-intensive firms, distinguished by a high depreciation share, the accounting footprint of investment in tangible and intangible fixed assets. Alternative cluster numbers, and the use of cost levels rather than shares, leave these core profiles essentially unchanged. See Table 4.
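Two ingredients of this construction can be sketched compactly: the cost-share profile and the average silhouette width used to choose the number of clusters. The sketch assumes that total operating cost is the sum of the four listed components and takes cluster labels as given; the k-means step itself is omitted, and the function names are hypothetical.

```python
import math


def cost_shares(materials, services, personnel, depreciation):
    """Four-dimensional cost-share profile (shares sum to one)."""
    total = materials + services + personnel + depreciation
    return [materials / total, services / total,
            personnel / total, depreciation / total]


def average_silhouette(points, labels):
    """Mean silhouette width of a given partition (Euclidean distance)."""
    scores = []
    for i, p in enumerate(points):
        dist_by_cluster = {}
        for j, q in enumerate(points):
            if i != j:
                dist_by_cluster.setdefault(labels[j], []).append(math.dist(p, q))
        own = dist_by_cluster.pop(labels[i], None)
        if not own or not dist_by_cluster:
            scores.append(0.0)  # singleton cluster, or only one cluster
            continue
        a = sum(own) / len(own)                    # cohesion
        b = min(sum(d) / len(d) for d in dist_by_cluster.values())  # separation
        denom = max(a, b)
        scores.append((b - a) / denom if denom else 0.0)
    return sum(scores) / len(scores)
```

Computing this average for each candidate number of clusters and keeping the maximum is the selection rule described above; values near one indicate tight, well-separated clusters, and values near zero indicate overlapping ones.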
We validate the cost-based taxonomy against the official ATECO codes, which we recover for 1,510 of the innovative firms. The two classifications are moderately and meaningfully associated (Cramér’s V ≈ 0.32), but the cost-based partition cuts across ATECO rather than reproducing it: materials-intensive firms concentrate in manufacturing and trade (about 68 per cent combined), while the labour-, services-, and capital-intensive clusters draw disproportionately on ICT and professional-scientific activities (around three-quarters combined). This is precisely the pattern a cost-structure taxonomy should display — it recovers genuine sectoral structure rather than statistical noise (Kim et al., 2022), yet encodes a technology dimension, namely capital and labour intensity, that discrete ATECO headings do not. The taxonomy therefore complements rather than replaces the official classification: we use ATECO wherever it is complete and the cost-based measure to fill the remaining gaps and to supply the continuous capital-intensity axis the analysis requires.
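For reference, Cramér's V is computed from the chi-squared statistic of the cluster-by-ATECO contingency table. The sketch below uses the uncorrected formula; whether the paper applies a bias correction is not stated, and the function name is hypothetical.

```python
import math


def cramers_v(table):
    """Cramér's V for a contingency table given as a list of rows of counts.

    Assumes no row or column is entirely empty.
    """
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    k = min(len(row_totals), len(col_totals)) - 1
    return math.sqrt(chi2 / (n * k))
```

The statistic runs from zero (independent classifications) to one (one classification fully determines the other), so a value around 0.32 is consistent with two partitions that overlap without coinciding.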
The cluster composition is itself economically telling. Innovative SMEs are over-represented in the capital- and services-intensive clusters, consistent with their reliance on intangible assets, research, and knowledge-intensive production (Horsch et al., 2021), whereas non-innovative firms concentrate in the materials-intensive cluster. Because capital intensity later emerges as the main moderator of the differences between the two populations (Le Mouel & Schiersch, 2024), the taxonomy — disciplined by ATECO but more continuous than it — serves both as a balancing covariate and as the sectoral axis along which the growth-versus-solidity trade-off is interpreted. See Figure 2.

7. Measuring Performance Without Aggregation

A common way to handle multidimensional firm performance is to collapse its indicators into a single composite score. This is justified only if the dimensions measure one underlying construct and move together. We therefore test, before estimating any effect, whether the six performance dimensions can legitimately be aggregated — a choice central to the paper, since a valid aggregation would license a single index whereas an invalid one requires a vector.
The evidence rejects aggregation. Pairwise correlations among the six dimensions are weak and, where non-trivial, of opposing sign: growth is negatively correlated with both profitability and financial stability, so the firms that grow fastest tend to be the least financially solid. Principal component analysis confirms this — the first component explains only 32.3 per cent of total variance and the first two together only 51.2 per cent, so no single factor summarises performance — and, crucially, that leading component is not a “more-is-better” competitiveness factor but a growth-versus-solidity contrast, loading positively on financial stability and profitability and negatively on growth and volatility (full loadings and diagnostics in Appendix B, Table B1 and Figure B1). Aggregation would therefore average away the very opposition that defines these firms — the compensability problem of composite indicators (Gibari et al., 2021; Greco et al., 2021; Mazziotta & Pareto, 2022) — making a dynamic but financially exposed firm resemble an average all-round performer. We accordingly reject a composite index and analyse the six dimensions as a multivariate outcome vector, estimating each separately and retaining the first principal component only as an interpretive growth-versus-solidity axis, never as a performance score.

8. The Cross-Sectional Premium

Having established that performance must be treated as a vector, we estimate how innovative SMEs differ from comparable non-innovative firms across the six dimensions. This is the cross-sectional layer of the design: it asks what premium is associated with innovative status once firms are made observably comparable. We stress at the outset that this layer is descriptive: innovative firms are self-selected, structurally different, and geographically concentrated, so a raw comparison would conflate innovative status with sectoral, size, and regional composition, and even a well-balanced comparison cannot, on its own, identify a causal effect. Its role is to establish what differs; the timing-based designs of Section 10 establish why.
We make firms observably comparable using entropy balancing (Hainmueller, 2012), which reweights non-innovative firms so that their covariate distribution matches that of innovative firms while retaining all observations. Balance is imposed on sector, firm size, and location, and because geography is structurally uneven, the procedure is performed within macro-regions and then aggregated into a pooled ATT using the treated firms’ regional distribution. Confidence intervals are obtained from 200 bootstrap replications.
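The reweighting step can be illustrated with a minimal entropy-balancing sketch on first moments. It is an assumption-laden toy, not the paper's implementation: the simulated covariates, the function names `entropy_balance` and `smd`, the damped Newton solver, and the choice to scale the standardised mean difference by the treated-group standard deviation are all introduced here for illustration. The within-macro-region step and the bootstrap are omitted.

```python
import numpy as np

def entropy_balance(X_control, target_means, max_iter=200, tol=1e-8):
    """Weights on controls, w_i proportional to exp(x_i'lam), matching target means."""
    Z = X_control - target_means              # centre covariates at the target moments
    lam = np.zeros(Z.shape[1])

    def objective_and_weights(l):
        s = Z @ l
        m = s.max()                           # stabilise the exponentials
        e = np.exp(s - m)
        return m + np.log(e.sum()), e / e.sum()

    f, w = objective_and_weights(lam)
    for _ in range(max_iter):
        grad = w @ Z                          # weighted control mean minus target
        if np.max(np.abs(grad)) < tol:
            break
        Zc = Z - grad
        hess = (Zc * w[:, None]).T @ Zc       # weighted covariance of the covariates
        step = np.linalg.solve(hess, grad)
        t = 1.0
        while True:                           # backtrack until the objective stops rising
            f_new, w_new = objective_and_weights(lam - t * step)
            if f_new <= f + 1e-12 or t < 1e-8:
                break
            t /= 2
        lam, f, w = lam - t * step, f_new, w_new
    return w

def smd(x_treated, x_control, w=None):
    """Standardised mean difference, scaled by the treated-group standard deviation."""
    mean_c = np.average(x_control, axis=0, weights=w)
    return (x_treated.mean(axis=0) - mean_c) / x_treated.std(axis=0, ddof=1)

# Hypothetical covariates (for example log size and a sector score).
rng = np.random.default_rng(1)
X_t = rng.normal(0.5, 1.0, size=(500, 2))     # innovative firms
X_c = rng.normal(0.0, 1.0, size=(1500, 2))    # comparison firms
w = entropy_balance(X_c, X_t.mean(axis=0))
print("SMD before:", smd(X_t, X_c))
print("SMD after: ", smd(X_t, X_c, w))
```

When overlap is thin, the solver may fail to reach the target moments or may place nearly all weight on a few comparison firms, which is why post-balancing standardised mean differences are worth reporting rather than assuming.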
The procedure substantially improves comparability for region, for firm revenue, and for the materials- and services-intensive sectors, and we are deliberately explicit about where it does not. Overlap is thin in some Northern cells, where genuinely comparable non-innovative firms are scarce, so a substantial residual imbalance remains on the capital-intensive share — whose post-balancing standardised mean difference, ≈ 0.44, is essentially unchanged from the unweighted 0.46 — and a smaller but non-trivial imbalance on firm size (≈ 0.33 for employees and ≈ 0.19 for revenue), all above the conventional 0.10 benchmark. We confront this rather than conceal it, for two reasons. First, it bounds the reading: the cross-sectional estimates are a precise descriptive comparison of observably similar firms, not causal evidence, and we label them as such throughout. Second, and decisively, the residual imbalance falls on time-invariant structural characteristics — capital intensity and size — which is precisely the heterogeneity that the within-firm design differences out. Identification in this paper therefore does not rest on cross-sectional overlap at all: it rests on the staggered difference-in-differences and event-study designs of Section 10, which compare each firm with its own pre-registration trajectory and are thus unaffected by the scarcity of comparable firms across the two populations — an identification whose own assumption, parallel trends, we probe directly in Section 11. Weak overlap therefore qualifies the cross-sectional portrait; it does not touch the identification, which is exactly why the causal weight rests on the timing-based designs rather than on the balanced comparison. See Table 5.
Innovative status is associated with a large, precisely estimated growth premium of +0.75 standard deviations (95% CI [0.69, 0.82]) and with lower profitability (−0.24, [−0.33, −0.18]) and financial stability (−0.44, [−0.67, −0.32]); the effects on productivity, persistence, and volatility are not statistically distinguishable from zero (Figure 3).
The pattern is therefore not a uniform improvement but a reconfiguration: innovative firms trade solidity for growth. The full set of pooled estimates and their confidence intervals is reported in Table 6.
Disaggregated by macro-region, the structure is striking. The growth premium is essentially universal (North +0.84, Centre +0.66, South +0.64), but the solidity penalty is geographically contingent: profitability falls by 0.50 standard deviations in the North, is flat in the Centre (−0.05), and turns positive in the South (+0.21), while financial stability follows the same gradient (North −0.72, Centre −0.29, South +0.07), as Table 7 reports.
In the South, innovative firms obtain the growth dividend without the solidity cost borne in the North — a contrast visible across the whole effect vector in Figure 4.
These estimates describe how innovative SMEs differ once observably comparable; they are not yet causal claims. The profitability result in particular is the most fragile, and the sensitivity analysis that follows subjects each estimate to a formal bound, while the within-firm identification asks how much of this cross-sectional structure reflects the status itself rather than the firms that select into it. Read in that light, the balanced contrasts are best understood as a precise portrait of difference rather than a measure of effect — the portrait the remainder of the paper progressively qualifies.

9. Is the Premium Explained by Omitted Characteristics?

The cross-sectional estimates rely on a selection-on-observables assumption: once sector, size, and region are controlled for, innovative and comparison firms are treated as comparable. The main remaining threat is unobserved confounding. Firms may differ in managerial quality, technological opportunities, or access to finance, and these unobserved factors may influence both registration and performance. We assess this risk using Oster’s coefficient-stability approach, which asks how strong selection on unobservables would need to be, relative to selection on observed controls, to overturn the estimated differences (Oster, 2019).
The intuition is simple. If adding observed controls greatly improves the model’s explanatory power but barely changes the estimated premium, the result is unlikely to be driven only by omitted firm characteristics. Oster’s method formalises this comparison and reports how influential unobserved factors would have to be to reduce each estimate to zero.
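The comparison can be made concrete with the simple approximation given in Oster (2019), in which the bias-adjusted coefficient is the controlled coefficient minus δ times the coefficient movement, scaled by the ratio of remaining to realised gains in R². The sketch below is illustrative only: it uses the approximation rather than the exact solution, the function names are introduced here, and every number is hypothetical rather than taken from the paper's tables.

```python
def oster_beta_star(beta_unc, r2_unc, beta_ctrl, r2_ctrl, r2_max, delta=1.0):
    """Approximate bias-adjusted coefficient for a given delta and R_max."""
    scale = (r2_max - r2_ctrl) / (r2_ctrl - r2_unc)
    return beta_ctrl - delta * (beta_unc - beta_ctrl) * scale

def oster_delta(beta_unc, r2_unc, beta_ctrl, r2_ctrl, r2_max):
    """Delta at which the approximate bias-adjusted coefficient equals zero."""
    movement = beta_unc - beta_ctrl
    if movement == 0:
        return float("inf")                   # coefficient did not move at all
    return beta_ctrl * (r2_ctrl - r2_unc) / (movement * (r2_max - r2_ctrl))

# Hypothetical inputs: coefficients and R-squared without and with controls.
beta_unc, r2_unc = 0.80, 0.05
beta_ctrl, r2_ctrl = 0.75, 0.20
r2_max = min(1.0, 1.3 * r2_ctrl)              # conventional choice of R_max

delta = oster_delta(beta_unc, r2_unc, beta_ctrl, r2_ctrl, r2_max)
beta_star = oster_beta_star(beta_unc, r2_unc, beta_ctrl, r2_ctrl, r2_max, delta=1.0)
print("delta needed to drive the estimate to zero:", round(delta, 2))
print("bias-adjusted coefficient at delta = 1:", round(beta_star, 3))
```

A δ well above one indicates that unobservables would have to be far more influential than the observed controls to remove the estimate; a δ near one indicates fragility.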
The three significant cross-sectional effects behave differently. The growth premium is highly robust: unobserved factors would have to be far stronger than sector, size, and region combined to explain it away. The financial-stability penalty is also robust, since unobservables would need to be nearly twice as influential as the observed controls to eliminate it. The profitability penalty, by contrast, is fragile: unobserved selection as strong as the observed controls would be enough to remove it. We therefore treat growth and financial stability as robust cross-sectional associations, while interpreting profitability more cautiously. See Figure 5.
We therefore downgrade the profitability result throughout the paper to suggestive, and rest the cross-sectional solidity story on financial stability, which the test supports, rather than on profitability, which it does not. The underlying coefficients, fit statistics, and bias-adjusted effects are reported in Table 8.
Two qualifications are essential, and they shape how the result feeds into the rest of the paper. First, Oster bounds address a specific threat: confounding by time-invariant unobserved characteristics that operate proportionally to the observables. They do not address reverse causation or selection on the outcome’s own trajectory. A large δ for growth therefore establishes that the cross-sectional growth premium is not an artefact of omitted firm characteristics; it does not establish that the status causes growth, because firms may enter the register precisely when their growth is accelerating, a dynamic the difference-in-differences analysis confirms. Robustness to unobserved confounding is thus necessary, but not sufficient, for a causal reading (Cinelli & Hazlett, 2020; Masten & Poirier, 2020). Second, the test is conditional on the proportional-selection assumption (Diegert et al., 2022) and on the assumed maximum R² (R_max); we adopt Oster’s conventional values rather than searching for bounds favourable to our estimates, and we report the implied bias-adjusted coefficients alongside the δ values so that readers can judge the sensitivity themselves. Taken together, the Oster analysis sharpens the cross-sectional findings without resolving their causal status: it retains growth and financial stability as robust associations, retires profitability to the status of a suggestive pattern, and motivates the within-firm identification to which we now turn.

10. Who Drives the Premium?

The cross-sectional difference vector is an average, and averages can conceal heterogeneity. The financial-stability penalty, for example, may be spread evenly across firms or concentrated among specific types. Because the policy interpretation depends on which firms drive the difference, we map how the gap between innovative and comparison firms varies by sector, size, and region using a flexible machine-learning approach. We stress at the outset that this is a descriptive decomposition of the balanced differences — it shows where the innovative-versus-comparison gap is largest, not how innovative status causally moderates outcomes.
We use a two-model forest estimator, fitting separate predictive models for innovative and comparison firms and comparing their predicted outcomes firm by firm (Wager & Athey, 2018; Künzel et al., 2019). This flexible approach allows the difference associated with innovative status to vary non-linearly across firm characteristics. We interpret the resulting heterogeneity through feature-importance measures and SHAP values, which identify which covariates explain the variation in predicted differences (Lundberg & Lee, 2017). As a check, the average predicted differences reproduce the cross-sectional estimates reported above, while also revealing how they are distributed across firms. We focus on financial stability, the dimension carrying the most robust solidity gap. The heterogeneity is pronounced and is driven primarily by sector rather than geography. Capital intensity is by far the strongest moderator, ahead of firm size, while region plays a secondary role. Capital-intensive innovative firms show the largest financial-stability gap, followed by materials-, services-, and labour-intensive firms. Size also matters: smaller firms are more exposed, consistent with thinner internal buffers. The apparent Northern gap largely reflects the concentration of capital-intensive innovative firms in the North; once sector and size are considered, residual regional moderation is modest. See Figure 6.
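The two-model logic (fit one predictive model per group, then difference the predictions firm by firm) can be sketched as follows. To keep the example dependency-free, ordinary least squares is swapped in for the forests, so this illustrates the two-model structure rather than the estimator used in the paper. The simulated data, the assumed gap function, and the helper names are all hypothetical.

```python
import numpy as np

def fit_ols(X, y):
    """Least-squares coefficients with an intercept (stand-in for a forest)."""
    Xd = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(Xd, y, rcond=None)
    return coef

def predict_ols(coef, X):
    return np.column_stack([np.ones(len(X)), X]) @ coef

# Hypothetical firms: a capital-intensity dummy and standardised log size.
rng = np.random.default_rng(2)
n = 4000
capital_intensive = rng.integers(0, 2, size=n)
log_size = rng.normal(size=n)
innovative = rng.integers(0, 2, size=n).astype(bool)
X = np.column_stack([capital_intensive, log_size])

# Assumed financial-stability gap: larger for capital-intensive and smaller firms.
true_gap = -0.2 - 0.5 * capital_intensive + 0.2 * log_size
y = 0.3 * log_size + innovative * true_gap + rng.normal(scale=0.5, size=n)

# One model per group, then predicted differences for every firm.
coef_1 = fit_ols(X[innovative], y[innovative])
coef_0 = fit_ols(X[~innovative], y[~innovative])
gap_hat = predict_ols(coef_1, X) - predict_ols(coef_0, X)

print("average predicted gap:", round(float(gap_hat.mean()), 3))
print("capital-intensive:", round(float(gap_hat[capital_intensive == 1].mean()), 3))
print("other sectors:    ", round(float(gap_hat[capital_intensive == 0].mean()), 3))
```

In this toy, status is assigned at random, so the predicted differences recover the assumed gap. With self-selected status the same calculation only describes where the balanced differences are largest, which is the reading the text adopts.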
The economic reading is coherent. The solidity cost associated with innovative status falls on firms that invest heavily in long-horizon, illiquid tangible and intangible assets and that lack the scale to absorb the resulting strain on cash generation — capital-intensive, smaller firms. Geography matters not in itself but as a proxy for this industrial composition. This is precisely the kind of targeting information a capability-development programme requires, since it identifies the population whose financial fragility is most acute; the full set of conditional effects by sector and macro-region is reported in Table 9.
Two cautions bound the interpretation. First, like the cross-sectional estimates, these conditional differences are conditional on the same observed characteristics and rest on the selection-on-observables assumption; they map how innovative firms differ across types, not necessarily how the status causally moderates outcomes. Second, the moderator set is deliberately limited to pre-determined structural variables — sector, size, region — to avoid conditioning on outcome-contaminated channels. Read together with the difference-in-differences evidence, the heterogeneity map is best understood as a precise description of where the cross-sectional differences are largest, and therefore of where the selection into innovative status is most consequential, rather than as a menu of causal margins.

12. Robustness of the Identification Strategy

The within-firm evidence rests on three assumptions that a referee will rightly press: that the post-registration revenue path can be read against a credible parallel-trends benchmark despite the pronounced pre-trend; that the staggered design is not distorted by the negative weighting that afflicts two-way fixed-effects estimators; and that testing six outcomes does not manufacture significance by chance. This section subjects the identification to all three tests and finds the central conclusion — that the performance differences of innovative SMEs are driven by selection — to be unusually robust.

12.1. Sensitivity to Violations of Parallel Trends

Figure 9 applies the HonestDiD sensitivity analysis of Rambachan and Roth (2023) to the average post-registration revenue effect, asking how large a violation of parallel trends the result can absorb before it ceases to be distinguishable from zero. Panel A imposes the relative-magnitudes restriction: it bounds any post-period deviation by a multiple M̄ of the largest pre-period deviation and traces the robust 95% confidence interval as that bound grows. The interval widens rapidly and first contains zero at a breakdown value of only M̄ ≈ 0.13 — an undemanding threshold, since it means that allowing post-treatment trend violations just 13 per cent as large as those already visible before registration is enough to overturn significance. Panel B imposes the smoothness restriction, bounding the curvature (second differences) of the trend by M; here the robust interval straddles zero across essentially the entire plausible range. Together the panels show that the modest post-registration effect is not robust to even mild departures from parallel trends. Far from weakening the paper, this is what the selection reading predicts: with a pronounced pre-trend, there is no clean post-registration effect to defend, and the apparent premium is best read as the continuation of a trajectory already under way. See Figure 9.

12.2. Heterogeneity-Robust Estimators

Two-way fixed-effects event studies can be contaminated under staggered adoption, because already-treated units serve as controls and effects are combined with potentially negative weights. We therefore re-estimate the revenue path with the interaction-weighted estimator of Sun and Abraham (2021), which aggregates cohort-specific event-time coefficients using clean cohort shares, and compare it with the Callaway–Sant’Anna estimator reported in the main text and with an imputation estimator in the spirit of Borusyak, Jaravel and Spiess (2024).
The conclusion does not depend on the estimator (Figure 14). The Sun–Abraham path reproduces the two-way fixed-effects path almost exactly: the same large, monotone pre-trend and the same modest post-registration coefficients. The heterogeneity-robust correction therefore changes nothing of substance — the pre-trend is not an artefact of negative weighting. Tellingly, the one estimator that imposes parallel trends rather than testing them — the imputation estimator, which extrapolates each firm’s pre-registration level forward — returns a larger and very imprecise post effect, precisely because it reads the continuation of the pre-trend as a treatment effect. This is the cleanest possible demonstration that the threat to identification here is selection on the outcome trajectory, not the mechanics of the estimator. See Table 10.

12.3. Multiple-Hypothesis Testing

Because the cross-sectional analysis tests innovative status against six performance dimensions, conventional significance levels overstate the evidence: with six independent tests, the probability of at least one spurious rejection at the 5% level is about 26 per cent (1 − 0.95⁶ ≈ 0.265). We therefore control the family-wise error rate across the six outcomes with the studentised step-down procedure of Romano and Wolf (2005), in which the adjusted significance of each effect is judged against the bootstrap distribution of the largest studentised statistic among the outcomes not yet rejected, accounting for the correlation among them.
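Both the family-wise error arithmetic and the step-down logic can be sketched briefly. The code below is a generic step-down max-t adjustment in the spirit of the procedure just described, not the paper's code: the observed statistics and the bootstrap draws are simulated, and the function names are introduced here for illustration.

```python
import numpy as np

def fwer_independent(alpha, m):
    """Probability of at least one false rejection across m independent tests."""
    return 1.0 - (1.0 - alpha) ** m

def stepdown_maxt_pvalues(t_obs, t_boot):
    """Step-down adjusted p-values from null-centred bootstrap statistics (B x m)."""
    t_obs = np.abs(np.asarray(t_obs, dtype=float))
    t_boot = np.abs(np.asarray(t_boot, dtype=float))
    n_boot = t_boot.shape[0]
    order = np.argsort(-t_obs)                # most significant outcome first
    p_adj = np.empty_like(t_obs)
    running_max = 0.0
    for step, j in enumerate(order):
        remaining = order[step:]              # outcomes not yet rejected
        max_stat = t_boot[:, remaining].max(axis=1)
        p = (1 + np.sum(max_stat >= t_obs[j])) / (n_boot + 1)
        running_max = max(running_max, p)     # enforce monotone adjusted p-values
        p_adj[j] = running_max
    return p_adj

print("FWER with six independent tests:", round(fwer_independent(0.05, 6), 4))

# Hypothetical studentised statistics for six outcomes and correlated bootstrap draws.
rng = np.random.default_rng(4)
t_obs = np.array([22.0, -6.5, -5.8, 0.6, -0.3, 1.0])
common = rng.normal(size=(999, 1))
t_boot = 0.5 * common + np.sqrt(0.75) * rng.normal(size=(999, 6))
p_adj = stepdown_maxt_pvalues(t_obs, t_boot)
print("adjusted p-values:", np.round(p_adj, 3))
```

Because each step compares a statistic with the maximum over the outcomes still in play, the adjustment uses the dependence among outcomes instead of assuming the worst case, which generally makes it less conservative than a Bonferroni correction.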
The correction leaves the headline findings intact (Table 9). The three effects that were individually significant (the growth premium and the profitability and financial-stability penalties) retain family-wise adjusted p-values below 0.01, while the three null effects remain indistinguishable from zero. Multiplicity is therefore not driving the results. One nuance deserves emphasis and reconciles this section with the sensitivity analysis: the profitability penalty survives the multiple-testing correction, so it is not a false positive thrown up by testing many outcomes, yet it remains fragile to confounding, with an Oster δ* of approximately one. The two tests address different threats, and profitability passes one while failing the other; this is exactly why we treat it as suggestive rather than established. See Table 11.
Taken together, the three tests place the identification on firm ground. The within-firm growth result is not an artefact of estimator choice; the cross-sectional growth premium survives multiple-testing correction; and, most importantly, the post-registration component of the within-firm result does not survive even a mild relaxation of parallel trends. Each test points the same way: the apparent performance advantages of innovative SMEs are a property of the firms that select into the status, exposed once treatment timing is taken seriously.

13. Direct Evidence on Selection: A Hazard Model of Registration Timing

The difference-in-differences evidence suggests dynamic selection: revenue rises before registration, implying that firms enter the register while already growing. We test this mechanism directly by modelling which firms register and when. If certification is sought at the crest of a growth trajectory, recent growth should predict registration even after accounting for size, profitability, sector, and region.
We estimate a discrete-time hazard model on the firm-year panel, where firms remain at risk until they register and never-registering firms are treated as censored. The model includes year, sector, and macro-region effects, together with lagged revenue growth, growth acceleration, firm size, and profitability. The estimation uses 13,559 firm-years and 1,321 registrations.
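A discrete-time hazard model of this kind amounts to a logit estimated on firm-years that are still at risk. The sketch below simulates such a panel and fits the logit by Newton's method. It is a simplified illustration under stated assumptions: the coefficients in `true_beta` are invented, the year, sector, and macro-region effects and the acceleration term are omitted, and none of the numbers correspond to the paper's estimates.

```python
import numpy as np

def fit_logit(X, y, n_iter=30):
    """Logit coefficients by Newton's method."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        grad = X.T @ (y - p)
        hess = (X * (p * (1 - p))[:, None]).T @ X
        beta = beta + np.linalg.solve(hess, grad)
    return beta

rng = np.random.default_rng(5)
n_firms, n_years = 3000, 6
# Assumed coefficients: intercept, lagged growth, log size, profitability.
true_beta = np.array([-2.3, 0.8, -0.3, 0.0])

rows, events = [], []
at_risk = np.ones(n_firms, dtype=bool)
log_size = rng.normal(size=n_firms)
for year in range(n_years):
    lag_growth = rng.normal(size=n_firms)
    profit = rng.normal(size=n_firms)
    X_year = np.column_stack([np.ones(n_firms), lag_growth, log_size, profit])
    hazard = 1.0 / (1.0 + np.exp(-X_year @ true_beta))
    registers = rng.random(n_firms) < hazard
    rows.append(X_year[at_risk])              # only firms still at risk contribute
    events.append(registers[at_risk])
    at_risk &= ~registers                     # registered firms leave the risk set

X = np.vstack(rows)
y = np.concatenate(events).astype(float)
beta_hat = fit_logit(X, y)
print("firm-years:", X.shape[0], "registrations:", int(y.sum()))
print("coefficients (const, growth, size, profit):", np.round(beta_hat, 2))
```

Firms that never register contribute a row for every year and are censored at the end of the panel, mirroring the treatment of never-registering firms described above.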
The results confirm the mechanism. Recent revenue growth strongly predicts registration: faster-growing firms are substantially more likely to enter the register, and the implied probability of registration rises from below 8 per cent among the slowest-growing firms to nearly 13 per cent among the fastest-growing firms. Smaller firms are also more likely to register. The mildly negative acceleration term suggests that firms enter as growth reaches its crest and begins to level off, matching the concave pre-registration trajectory observed in the event study. Profitability, by contrast, does not predict registration. Certification is therefore sought by firms already on a growth trajectory, not by firms that are simply more profitable. See Figure 10.
This first-stage evidence converts the selection mechanism from an inference into a measurement. The pre-trend told us that revenue was rising before registration; the hazard model shows that revenue growth is precisely what draws firms into the register, and that they enter as small, fast-growing firms at the peak of their expansion. The full estimates are reported in Table 12.
The implication for interpretation is decisive. Because entry into the register is itself driven by the outcome’s own trajectory, a contemporaneous comparison of registered firms with their peers necessarily compares firms selected for growth with firms that were not — and reads that selection as an effect. The hazard model thus supplies, from the entry side, the same conclusion the within-firm design reaches from the outcome side: innovative status marks firms already on a distinctive trajectory rather than placing them on one.

14. A Model of Anticipatory Take-Up

The empirical results point to a single mechanism: firms enter certification at the crest of a temporary growth surge. The apparent performance premium is therefore not simply a missing causal effect, but the result of observable dynamic selection. This section provides a simple theoretical explanation for that pattern.
The logic is straightforward. Firm growth is partly transitory: unusually rapid expansion tends to slow down over time. Certification is most valuable precisely during such expansion, because growing firms face stronger financing needs, larger investment commitments, and greater organisational pressure. These benefits are especially valuable for smaller firms, which are more financially constrained. By contrast, profitability is not central to the timing decision, because the value of certification comes mainly from supporting expansion rather than rewarding margins.
A firm therefore has an incentive to apply when growth is high and the benefits of certification are greatest. Since growth is temporary, the optimal moment is near the crest of the expansion: the firm has already experienced a strong run-up, but the trajectory is beginning to level off. This generates three observable implications. First, recent growth should raise the probability of registration. Second, smaller firms should register more readily. Third, profitability should not strongly predict entry. These are precisely the patterns recovered by the hazard model.
The same mechanism also explains the event-study profile. Conditional on registration, revenue has already risen before entry because firms select into certification after a growth run-up. After registration, growth naturally slows as the temporary surge reverts. The observed post-registration premium is therefore the tail of a pre-existing trajectory rather than a new effect caused by certification. When calibrated to the panel, the model reproduces the empirical event-study pattern closely: most of the cumulative revenue gain occurs before registration, about 79 per cent in the model against 81 per cent in the data. A contemporaneous comparison would misread this run-up as a policy effect; the model shows that it is instead the signature of firms selecting into certification at the peak of a temporary expansion. See Figure 11.
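The event-study signature of this mechanism can be reproduced in a toy simulation. The sketch below is not the paper's calibrated model: it assumes mean-reverting AR(1) growth, a deterministic entry rule (a firm registers the first time its trailing three-year average growth exceeds a threshold), and parameter values chosen only for illustration. Certification has no effect on growth in the simulation by construction.

```python
import numpy as np

rng = np.random.default_rng(3)
n_firms, T, K = 5000, 30, 3
rho, sigma, threshold = 0.5, 0.10, 0.12       # assumed, for illustration only

# Transitory growth: shocks decay at rate rho, so surges revert over time.
g = np.zeros((n_firms, T))
for t in range(1, T):
    g[:, t] = rho * g[:, t - 1] + rng.normal(0.0, sigma, n_firms)
log_rev = np.cumsum(g, axis=1)

# Trailing K-year average growth, defined from year K onwards.
trail = np.full((n_firms, T), -np.inf)
trail[:, K:] = (log_rev[:, K:] - log_rev[:, :-K]) / K

# Entry rule: register at the first eligible year with a strong recent run-up.
eligible = np.zeros((n_firms, T), dtype=bool)
eligible[:, K:T - K] = trail[:, K:T - K] > threshold
registered = eligible.any(axis=1)
idx = np.flatnonzero(registered)
t0 = eligible.argmax(axis=1)[idx]             # first year the rule is met

# Revenue gain in the K years before and the K years after registration.
pre = log_rev[idx, t0] - log_rev[idx, t0 - K]
post = log_rev[idx, t0 + K] - log_rev[idx, t0]
share_pre = pre.mean() / (pre.mean() + post.mean())

print("registrants:", int(registered.sum()))
print("share of the windowed revenue gain before registration:", round(float(share_pre), 2))
```

Even with a zero treatment effect, most of the revenue gain in the event window falls before registration and the remainder is the decaying tail of the same surge, which is the pattern a contemporaneous comparison would misread as a policy effect.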
The model is disciplined by the data it was built to explain, but it also makes sharp qualitative predictions about who registers and when that were not used in its construction, and each is borne out by the first-stage hazard estimates and the within-firm evidence. See Table 13.
The contribution of the model is conceptual rather than quantitative. It turns the paper’s central empirical finding into a statement about behaviour: when a policy status carries benefits that are most valuable to firms in the midst of a transitory expansion, voluntary and anticipatory take-up will concentrate registration at the crest of that expansion, and any evaluation that compares registrants with non-registrants at a point in time will mistake the resulting selection for an effect. This is why robustness to unobserved confounding and robustness to selection on trends are distinct properties — and why only a design that exploits the timing of take-up can tell them apart.

16. From Rewarding Momentum to Building Resilience

The reframing at the heart of this paper changes the policy question. If innovative-SME status mainly marks firms already on a growth trajectory, the question is not how much the status raises performance, but what a qualification that certifies already-dynamic firms accomplishes, and what its design and evaluation can credibly claim. Two implications follow, and both are general to voluntary certification rather than specific to the Italian regime: caution in attributing aggregate growth to the policy, and a positive agenda built on the robust fact that such schemes attract financially fragile firms.
First, the growth of certified firms cannot be read as the policy’s effect. Because the growth premium predates registration, much of the apparent dividend reflects self-selection into a voluntary status whose fiscal, financing, and regulatory benefits are most valuable during scaling; the hazard analysis confirms this, with registration rising with recent revenue growth and smaller size while profitability plays no role. A substantial share of support is therefore likely to be inframarginal — the classic deadweight concern of “picking winners” (Mina et al., 2021) — and the take-up model indicates that this is structural to applicant-initiated certification rather than an implementation flaw: rational firms time application to the crest of expansion. A scheme whose objective were to cause growth would thus have to reach firms earlier or decouple benefits from self-selected timing. This is not a verdict against the scheme, but a caution against reading the growth of certified firms as its causal effect (Ghanem et al., 2022; Marx et al., 2024).
Second, the analysis points to where such a scheme can add value: in addressing the vulnerability of the firms it attracts. The descriptive analysis identifies a financially fragile segment — smaller, capital-intensive firms investing in long-horizon tangible and intangible assets with limited internal buffers, which show lower cash-flow stability — consistent with evidence on financing constraints, innovation investment, and SME vulnerability (Bakhtiari et al., 2020; Cecere et al., 2020). Because the within-firm analysis finds no sign that the status itself causes this fragility, complementary support is the natural margin, with instruments such as patient and equity-like capital, working-capital and guarantee facilities aligned with intangible-investment payback periods, and advisory support in cash-flow and risk management.
Such support gains from following the heterogeneity rather than the register as a whole: capital intensity and small scale, not geography, identify the fragile segment, so capability-development and reskilling in financial management, treasury, and risk-control competences are most valuable when routed to smaller, capital-intensive firms rather than distributed uniformly.
A third implication concerns evaluation and governance. Contemporaneous comparisons of certified and non-certified firms can mislead even when robust to unobserved confounding, multiple-testing correction, and heterogeneity-robust estimators, because none of these detects selection on the growth trajectory itself — consistent with work on staggered adoption, heterogeneous treatment effects, and endogenous treatment timing (Callaway & Sant’Anna, 2021; Ghanem et al., 2022; Marx et al., 2024). Credibly identifying the effect of such a status therefore requires exploiting entry timing through within-firm and staggered difference-in-differences designs, disciplined by honest bounds on parallel trends (Rambachan & Roth, 2023); the implication for data collection is that first-registration dates and pre-entry trajectories are the relevant records, and that a multidimensional view, rather than a composite score, is needed to keep the growth-versus-solidity trade-off visible.
Taken together, the evidence suggests that a scheme certifying momentum adds most value not by rewarding growth already under way, but by strengthening the financial solidity that scaling erodes among its most exposed firms — an agenda to which capability- and reskilling-oriented interventions, such as the LUCE programme in the Italian setting, can contribute when targeted at smaller, capital-intensive firms and evaluated with designs able to distinguish selection from effect (Callaway & Sant’Anna, 2021; Rambachan & Roth, 2023). See Figure 12.
Figure 12. The growth–solidity trade-off and the selection behind the growth premium. Note. Innovative-SME status reconfigures performance — about +0.75 SD growth against −0.44 SD solidity (Section 8) — yet roughly 81 per cent of the revenue advantage predates registration, reflecting selection, not policy effect.

17. Caveats, Coverage, and the Road Ahead

Several limitations qualify the findings and define the agenda for further work. The first concerns identification. The cross-sectional estimates rest on selection on observables: although Oster bounds show that the growth and financial-stability associations are robust to time-invariant unobserved confounding, this does not make them causal. The difference-in-differences design improves on this but cannot recover a clean growth effect because revenue violates parallel trends, a central concern in staggered designs where pre-trends reveal selection into treatment (Roth et al., 2023; Sant’Anna & Zhao, 2020). What the design identifies rigorously is selection on the growth trajectory. Honest bounds show that the small post-registration component does not survive even mild relaxations of parallel trends, so any genuine growth effect is small and not robustly distinguishable from continuation of the pre-trend (Rambachan & Roth, 2023; Masten & Poirier, 2020).
A second limitation is outcome coverage. Financial stability, the most robust cross-sectional penalty, is based on the multi-year dispersion of operating cash flow and has no natural single-year analogue. We test an annual proxy, operating cash flow scaled by revenue, and find no robust within-firm effect, strengthening the marker-not-motor interpretation. Yet this proxy captures the level of cash generation rather than its steadiness, so the question of whether the steadiness cost is causal is narrowed but not fully closed.
Third, the panel sample is timing-selected. Of roughly 3,200 registered innovative SMEs, 2,040 match to the financial panel and about 1,000 have the two-sided coverage required for an event study. Identification therefore applies to cohorts whose registration falls within the observable accounting window, a standard constraint in event-study designs requiring sufficient pre- and post-treatment support (Borusyak et al., 2024). Although we compare matched and unmatched firms on observables, representativeness remains a concern. The comparison group is also treated as never-treated, though some firms may register later, and the register’s current-stock nature means that past entrants who exited are unobserved.
Fourth, measurement introduces noise. Where ATECO codes are incomplete, the cost-based sectoral taxonomy is a validated but imperfect proxy; residual imbalance remains on capital intensity and size in Northern cells with scarce comparison firms. The outcomes are annual and accounting-based: revenue per employee is a coarse productivity proxy, EBITDA margins are noisy for small and young firms, and the administrative data required non-trivial harmonisation. These choices matter because balancing and modern estimators rely on numerical optimisation routines whose properties may become relevant when samples are thin or constraints bind (Hsieh et al., 2022).
Fifth, the pre-trend complicates the baseline. Because firms self-select at the crest of expansion, there is no uncontaminated pre-registration period against which to benchmark. We argue in Section 10 that this pre-trend reflects selection rather than anticipation — the regime’s benefits accrue only at registration, and recent growth predicts entry rather than the reverse — though a small anticipatory component cannot be fully excluded. The hazard model and take-up model characterise the broad mechanism — registration timed to recent growth, concentrated among smaller firms, and unrelated to profitability — but the model is stylised, and the micro-trigger, whether financing rounds, investment surges, or expected eligibility, is identified only indirectly. Leverage, which other studies link to registration, is observed only as a contaminated latest-year value rather than a time-varying covariate, so its role cannot be tested directly.
Finally, the evidence concerns one country, one legal instrument — the Italian innovative-SME regime under Decree-Law 3/2015 — and one period. Future work should extend the panel, construct annual steadiness measures, use complete entry-and-exit records, collect application-level data to identify the selection mechanism and leverage channel, extend coverage to the full register, and replicate the analysis across countries and certification schemes.

18. Conclusions

Innovative-SME status marks firms already in motion rather than setting them in motion. Certified firms grow faster than their peers but are financially less solid, especially when small and capital-intensive; yet once each firm is compared with its own history, the headline growth premium is largely selection. Most of it predates registration, the residual does not survive even mild departures from parallel trends, and firms enter the register when recent growth is high, not when profitability improves. Profitability shows no within-firm effect, and neither does an annual proxy for solidity: the apparent advantages of certification are properties of the firms that choose it, not consequences of the status itself.
The broader lesson outlasts the Italian case. Robustness to omitted variables is not robustness to selection on trends: a cross-sectional design can be balanced, bounded, and multiplicity-corrected and still mistake the selection of dynamic firms for the effect of a status, because none of those checks sees the pre-trend. Wherever a voluntary status carries benefits most valuable during expansion, rational firms time entry to the crest of their growth — so certified firms will tend to be firms already rising, and only identification built on the timing of take-up can separate the two.
That is the paper’s contribution in a sentence: distinguishing the marker from the motor. What remains open is whether the variance-based solidity cost is itself causal, what micro-events trigger entry, and how far the matched sample speaks for the full register.

Acknowledgments

This research was supported by the project “LUtech Campus Ecosystem – LUCE”, Project Code 22ROJB5, funded under a subsidized financing scheme of the Puglia Region within the framework of a Program Agreement, Contratto di Programma. The authors gratefully acknowledge this financial support, which made this study possible.

Appendix A

Table A1. Descriptive statistics by group.
| Variable | Innovative: median | Innovative: mean (SD) | Control: median | Control: mean (SD) | SMD |
|---|---|---|---|---|---|
| Panel A. Performance dimensions | | | | | |
| Persistence (EBITDA autocorrelation) | 0.38 | 0.33 (0.43) | 0.43 | 0.39 (0.37) | -0.15 |
| Revenue growth (CAGR) | 0.23 | 0.43 (0.54) | 0.07 | 0.11 (0.15) | 0.79 |
| Productivity (revenue per employee, €) | 112,564 | 168,208 (192,545) | 181,171 | 348,776 (452,898) | -0.52 |
| Profitability (EBITDA margin, %) | 10.25 | 2.65 (34.30) | 6.5 | 9.49 (12.27) | -0.27 |
| Volatility (CV of margin) | 0.74 | 2.32 (5.31) | 0.54 | 1.84 (4.68) | 0.09 |
| Financial stability (cash-flow stability) | 0.8 | 0.71 (1.18) | 1.36 | 1.58 (1.23) | -0.72 |
| Panel B. Size and structure | | | | | |
| Employees | 8 | 17.2 (25.8) | 24 | 45.7 (50.5) | -0.71 |
| Revenue (€ thousand) | 900 | 2,731 (4,900) | 7,543 | 12,585 (13,192) | -0.99 |
| Total assets (€ thousand) | 1,738 | 4,137 (6,153) | 6,216 | 9,669 (10,145) | -0.66 |

| Variable | Innovative | Control | SMD |
|---|---|---|---|
| Panel C. Macro-region (% of firms) | | | |
| North | 54.30% | 33.80% | 0.41 |
| Centre | 22.60% | 32.30% | -0.22 |
| South | 23.00% | 33.90% | -0.25 |
| Panel D. Sector (% of firms) | | | |
| Materials | 15.40% | 42.20% | -0.63 |
| Labour | 41.60% | 33.70% | 0.16 |
| Services | 29.90% | 22.70% | 0.16 |
| Capital | 13.10% | 1.50% | 0.39 |
Note. Descriptive statistics for the estimation sample (N = 2786 innovative, N = 1160 non-innovative firms). Continuous variables are winsorised at the 1st and 99th percentiles; cells report the group median and the group mean with standard deviation in parentheses. SMD is the standardised mean difference between innovative and control firms (difference in means divided by the pooled standard deviation); for categorical variables it is the standardised difference in proportions. Productivity is revenue per employee; revenue and assets are in thousands of euro. The table reports raw, unbalanced differences; covariate balancing is applied in the treatment-effect analysis.
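The SMD definition in the note can be illustrated with a minimal sketch. The note does not spell out the pooling convention, so two assumptions are made here and labelled in the code: for continuous variables the pooled standard deviation is taken as the square root of the equal-weight average of the two group variances, and for shares the denominator uses the sample-size-weighted pooled proportion. The function names are hypothetical. With the rounded inputs from Table A1, these conventions give values that round to the reported SMDs for the rows checked, which suggests, but does not establish, that they match the convention actually used.

```python
import math

def smd_continuous(mean_t, sd_t, mean_c, sd_c):
    """Standardised mean difference for a continuous variable.

    Assumption: pooled SD = sqrt of the equal-weight average of group variances.
    """
    pooled_sd = math.sqrt((sd_t ** 2 + sd_c ** 2) / 2)
    return (mean_t - mean_c) / pooled_sd

def smd_proportion(p_t, n_t, p_c, n_c):
    """Standardised difference in proportions for a categorical indicator.

    Assumption: denominator uses the sample-size-weighted pooled proportion.
    """
    p_pool = (n_t * p_t + n_c * p_c) / (n_t + n_c)
    return (p_t - p_c) / math.sqrt(p_pool * (1 - p_pool))

# Inputs copied from Table A1 (rounded summary statistics, not firm-level data).
N_INNOVATIVE, N_CONTROL = 2786, 1160

employees = smd_continuous(17.2, 25.8, 45.7, 50.5)      # about -0.71
revenue = smd_continuous(2731, 4900, 12585, 13192)      # about -0.99
north = smd_proportion(0.543, N_INNOVATIVE, 0.338, N_CONTROL)    # about 0.41
capital = smd_proportion(0.131, N_INNOVATIVE, 0.015, N_CONTROL)  # about 0.39

for name, value in [("Employees", employees), ("Revenue", revenue),
                    ("North", north), ("Capital", capital)]:
    print(f"{name}: SMD = {value:.2f}")
```

Because the table reports rounded means and standard deviations, recomputed values for other rows can differ from the reported SMD in the second decimal.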

Appendix B. Aggregability of the six performance dimensions

This appendix details the correlation and principal-component analysis behind Section 6’s decision to treat performance as a vector. Correlations are weak and of opposing sign; the first component explains only 32.3 per cent of variance (51.2 per cent for two) and is itself a growth-versus-solidity contrast (Table B1), not a common performance factor.
Table B1. Principal component analysis of the six performance dimensions.
| Dimension | PC1 | PC2 |
|---|---|---|
| Persistence | 0.13 | 0.68 |
| Growth | −0.41 | 0.39 |
| Productivity | 0.18 | 0.47 |
| Profitability | 0.56 | −0.19 |
| Volatility | −0.29 | −0.36 |
| Fin. stability | 0.62 | −0.02 |
| Variance explained | 32.30% | 18.90% |
Note. Loadings on the first two principal components of the standardised dimensions. PC1 explains 32.3% of variance and the first two components together 51.2%; PC1 is a growth-versus-solidity contrast (positive on solidity, negative on growth), not a common performance factor.
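For readers who want to reproduce this kind of check, the following is a minimal sketch of a principal-component analysis on standardised dimensions, computed from the eigendecomposition of the correlation matrix. It is not the code behind Table B1: the function name is hypothetical and the input is synthetic placeholder data, so the printed shares will not match the table. The equal-split benchmark (one sixth of the variance per component) is included because it is the natural yardstick for judging whether a single factor dominates.

```python
import numpy as np

def pca_explained_variance(X):
    """Return (explained-variance ratios, loadings) for standardised columns of X."""
    X = np.asarray(X, dtype=float)
    # Standardise each dimension so the analysis runs on the correlation matrix.
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    corr = np.cov(Z, rowvar=False)
    # eigh is appropriate for symmetric matrices; it returns ascending eigenvalues.
    eigvals, eigvecs = np.linalg.eigh(corr)
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    ratios = eigvals / eigvals.sum()
    return ratios, eigvecs

# Synthetic placeholder: 500 hypothetical firms, six dimensions. NOT the paper's data.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 6))

ratios, loadings = pca_explained_variance(X)
equal_split = 1 / X.shape[1]

print("Share explained by each component:", np.round(ratios, 3))
print("Cumulative share, first two:", round(float(ratios[:2].sum()), 3))
print("Equal-split benchmark:", round(equal_split, 3))
```

The sign of each loading vector is arbitrary in any eigendecomposition, so a contrast such as "positive on solidity, negative on growth" is identified only up to a joint sign flip of the component.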
Figure B1. Principal component structure of the six performance dimensions. Note. (A) Scree plot: variance explained by each principal component, none far above an equal split, so no single factor dominates. (B) Correlation heatmap: the dimensions are weakly and oppositely correlated, with growth opposing profitability and financial stability.
Figure B2. Revenue event study under alternative estimators. Note. Event-time coefficients on log revenue from the two-way fixed-effects event study and the heterogeneity-robust Sun–Abraham estimator, relative to k = −1. The two paths coincide: the pre-trend and the modest post-registration response are invariant to the estimator.

References

  1. Acs, Z. J., & Audretsch, D. B. (1988). Innovation in large and small firms: an empirical analysis. The American economic review, 678-690.
  2. Albanese, G., & Bronzini, R. (2026). The impact of public incentives on the birth of innovative start-ups. Small Business Economics, 66(3), 1309-1332. [CrossRef]
  3. Anderloni, L., & Harasheh, M. (2025). Innovative startups and their traditional peers: Further evidence using performance and survival analysis. Review of Financial Economics, 43(3), 317-335. [CrossRef]
  4. Ashenfelter, O. (1978). Estimating the effect of training programs on earnings. The Review of Economics and Statistics, 47-57. [CrossRef]
  5. Athey, S., & Imbens, G. W. (2022). Design-based analysis in difference-in-differences settings with staggered adoption. Journal of econometrics, 226(1), 62-79. [CrossRef]
  6. Athey, S., Tibshirani, J., & Wager, S. (2019). Generalized random forests. The Annals of Statistics, 47(2), 1148–1178. [CrossRef]
  7. Bajgar, M., Berlingieri, G., Calligaris, S., Criscuolo, C., & Timmis, J. (2020). Coverage and representativeness of Orbis data.
  8. Baker, A. C., Larcker, D. F., & Wang, C. C. (2022). How much should we trust staggered difference-in-differences estimates?. Journal of financial economics, 144(2), 370-395. [CrossRef]
  9. Bakhtiari, S., Breunig, R., Magnani, L., & Zhang, J. (2020). Financial constraints and small and medium enterprises: A review. Economic Record, 96(315), 506-523. [CrossRef]
  10. Biancalani, F., Czarnitzki, D., & Riccaboni, M. (2021). The Italian start up act: A microeconometric program evaluation. Small Business Economics, 58(3), 1699. [CrossRef]
  11. Bianchini, M., & Kwon, I. (2020). Blockchain for SMEs and entrepreneurs in Italy. OECD SME and Entrepreneurship Papers.
  12. Borusyak, K., Jaravel, X., & Spiess, J. (2024). Revisiting event-study designs: robust and efficient estimation. Review of Economic Studies, 91(6), 3253-3285. [CrossRef]
  13. Bottazzi, G., & Secchi, A. (2006). Explaining the distribution of firm growth rates. The RAND Journal of Economics, 37(2), 235-256. [CrossRef]
  14. Bronzini, R., & Piselli, P. (2016). The impact of R&D subsidies on firm innovation. Research Policy, 45(2), 442-457.
  15. Cabral, L. M. B., & Mata, J. (2003). On the evolution of the firm size distribution: Facts and theory. American economic review, 93(4), 1075-1090. [CrossRef]
  16. Callaway, B., & Sant’Anna, P. H. (2021). Difference-in-differences with multiple time periods. Journal of econometrics, 225(2), 200-230.
  17. Cantner, U., & Kösters, S. (2012). Picking the winner? Empirical evidence on the targeting of R&D subsidies to start-ups. Small Business Economics, 39(4), 921-936. [CrossRef]
  18. Cassinis, M. G., Cintolesi, A., Formai, S., Locatelli, A., Manaresi, F., Manzoli, E., ... & Zuccolala, S. (2025). Innovative firms unveiled: economic and financial insights from Italian start-ups. Bank of Italy Occasional Paper, (967).
  19. Cecere, G., Corrocher, N., & Mancusi, M. L. (2020). Financial constraints and public funding of eco-innovation: Empirical evidence from European SMEs. Small Business Economics, 54(1), 285-302. [CrossRef]
  20. Cefis, E., & Marsili, O. (2006). Survivor: The role of innovation in firms’ survival. Research policy, 35(5), 626-641. [CrossRef]
  21. Cerulli, G. (2010). Modelling and measuring the effect of public subsidies on business R&D: A critical review of the econometric literature. economic record, 86(274), 421-449. [CrossRef]
  22. Cinelli, C., & Hazlett, C. (2020). Making sense of sensitivity: Extending omitted variable bias. Journal of the Royal Statistical Society Series B: Statistical Methodology, 82(1), 39-67. [CrossRef]
  23. Civera, A., Meoli, M., & Vismara, S. (2020). Engagement of academics in university technology transfer: Opportunity and necessity academic entrepreneurship. European economic review, 123, 103376. [CrossRef]
  24. Coad, A. (2009). The growth of firms: A survey of theories and empirical evidence. Edward Elgar.
  25. Coad, A., & Rao, R. (2008). Innovation and firm growth in high-tech sectors: A quantile regression approach. Research policy, 37(4), 633-648. [CrossRef]
  26. Coad, A., Daunfeldt, S. O., & Halvarsson, D. (2018). Bursting into life: firm growth and growth persistence by age. Small Business Economics.
  27. Colombo, M. G., & Grilli, L. (2010). On growth drivers of high-tech start-ups: Exploring the role of founders’ human capital and venture capital. Journal of business venturing, 25(6), 610-626. [CrossRef]
  28. Combs, J. G., Crook, R. T., & Shook, C. L. (2005). The dimensionality of organizational performance and its implications for strategic management research. Research Methodology in Strategy and Management, 259.
  29. Comin, D., & Philippon, T. (2005). The rise in firm-level volatility: Causes and consequences. NBER macroeconomics annual, 20, 167-201. [CrossRef]
  30. Csapi, V., & Balogh, V. (2020). A financial performance-based assessment of SMEs’ competitiveness–an analysis of Hungarian and US small businesses. Problems and Perspectives in Management, 18(3), 452. [CrossRef]
  31. Czarnitzki, D., & Delanote, J. (2013). Young innovative companies: the new high-growth firms?. Industrial and Corporate change, 22(5), 1315-1340. [CrossRef]
  32. Czarnitzki, D., & Lopes-Bento, C. (2013). Value for money? New microeconometric evidence on public R&D grants in Flanders. Research policy, 42(1), 76-89. [CrossRef]
  33. Daunfeldt, S. O., & Halvarsson, D. (2015). Are high-growth firms one-hit wonders? Evidence from Sweden. Small Business Economics, 44(2), 361-383. [CrossRef]
  34. De Chaisemartin, C., & d’Haultfoeuille, X. (2020). Two-way fixed effects estimators with heterogeneous treatment effects. American economic review, 110(9), 2964-2996.
  35. Del Monte, A., & Papagni, E. (2003). R&D and the growth of firms: empirical analysis of a panel of Italian firms. Research policy, 32(6), 1003-1014. [CrossRef]
  36. Delmar, F., Davidsson, P., & Gartner, W. B. (2003). Arriving at the high-growth firm. Journal of business venturing, 18(2), 189-216.
  37. Diegert, P., Masten, M. A., & Poirier, A. (2022). Assessing omitted variable bias when the controls are endogenous. arXiv preprint arXiv:2206.02303.
  38. Finaldi Russo, P., Magri, S., & Rampazzi, C. (2016). Innovative start-ups in Italy: their special features and the effects of the 2012 law. Bank of Italy Occasional Paper, (339).
  39. Ghanem, D., Sant’Anna, P. H., & Wüthrich, K. (2022). Selection and parallel trends. arXiv preprint arXiv:2203.09001.
  40. Gibari, S. E., Cabello, J. M., Gómez, T., & Ruiz, F. (2021). Composite indicators as decision making tools: The joint use of compensatory and noncompensatory schemes. International Journal of Information Technology & Decision Making, 20(03), 847-879. [CrossRef]
  41. Goodman-Bacon, A. (2021). Difference-in-differences with variation in treatment timing. Journal of econometrics, 225(2), 254-277.
  42. Greco, S., Ishizaka, A., Tasiou, M., & Torrisi, G. (2019). On the methodological framework of composite indices: A review of the issues of weighting, aggregation, and robustness. Social indicators research, 141(1), 61-94. [CrossRef]
  43. Greco, S., Ishizaka, A., Tasiou, M., & Torrisi, G. (2021). The ordinal input for cardinal output approach of non-compensatory composite indicators: The PROMETHEE scoring method. European Journal of Operational Research, 288(1), 225-246. [CrossRef]
  44. Grilli, L., & Murtinu, S. (2014). Government, venture capital and the growth of European high-tech entrepreneurial firms. Research Policy, 43(9), 1523-1543. [CrossRef]
  45. Grilli, L., Mrkajic, B., & Giraudo, E. (2023). Industrial policy, innovative entrepreneurship, and the human capital of founders. Small Business Economics, 60(2), 707-728. [CrossRef]
  46. Hainmueller, J. (2012). Entropy balancing for causal effects: A multivariate reweighting method to produce balanced samples in observational studies. Political analysis, 20(1), 25-46. [CrossRef]
  47. Hall, B. H., Lotti, F., & Mairesse, J. (2009). Innovation and productivity in SMEs: empirical evidence for Italy. Small business economics, 33(1), 13-33. [CrossRef]
  48. Hamilton, R. T., & Ng, P. Y. (2025). What we know about high-growth firms, and what we do not: A systematic review. International Small Business Journal, 43(4), 420-451. [CrossRef]
  49. Hartigan, J. A., & Wong, M. A. (1979). Algorithm AS 136: A k-means clustering algorithm. Journal of the royal statistical society. series c (applied statistics), 28(1), 100-108.
  50. Heckman, J. J. (1979). Sample selection bias as a specification error. Econometrica: Journal of the econometric society, 153-161. [CrossRef]
  51. Heckman, J. J., LaLonde, R. J., & Smith, J. A. (1999). The economics and econometrics of active labor market programs. In Handbook of labor economics (Vol. 3, pp. 1865-2097). Elsevier.
  52. Horsch, P., Longoni, P., & Oesch, D. (2021). Intangible capital and leverage. Journal of financial and quantitative analysis, 56(2), 475-498. [CrossRef]
  53. Hsieh, Y. W., Shi, X., & Shum, M. (2022). Inference on estimators defined by mathematical programming. Journal of Econometrics, 226(2), 248-268. [CrossRef]
  54. Imbens, G. W., & Rubin, D. B. (2015). Causal inference for statistics, social, and biomedical sciences. Cambridge University Press.
  55. Jolliffe, I. T., & Cadima, J. (2016). Principal component analysis: a review and recent developments. Philosophical transactions. Series A, Mathematical, physical, and engineering sciences, 374(2065), 20150202.
  56. Kalemli-Özcan, Ş., Sørensen, B. E., Villegas-Sanchez, C., Volosovych, V., & Yeşiltaş, S. (2024). How to construct nationally representative firm-level data from the orbis global database: New facts on smes and aggregate implications for industry concentration. American Economic Journal: Macroeconomics, 16(2), 353-374. [CrossRef]
  57. Kim, D., Kang, H. G., Bae, K., & Jeon, S. (2022). An artificial intelligence-enabled industry classification and its interpretation. Internet Research, 32(2), 406-424. [CrossRef]
  58. Laghari, F., Ahmed, F., & López García, M. D. L. N. (2023). Cash flow management and its effect on firm performance: Empirical evidence on non-financial firms of China. Plos one, 18(6), e0287135. [CrossRef]
  59. Lerner, J. (2000). The government as venture capitalist: the long-run impact of the SBIR program. The Journal of Private Equity, 55-78.
  60. Lerner, J. (2009). Boulevard of broken dreams: why public efforts to boost entrepreneurship and venture capital have failed--and what to do about it. Princeton University Press.
  61. Lundberg, S. M., & Lee, S. I. (2017). A unified approach to interpreting model predictions. Advances in neural information processing systems, 30.
  62. Manaresi, F., Menon, C., & Santoleri, P. (2021). Supporting innovative entrepreneurship: an evaluation of the Italian “Start-up Act”. Industrial and Corporate Change, 30(6), 1591-1614. [CrossRef]
  63. Marcus, M., & Sant’Anna, P. H. (2021). The role of parallel trends in event study settings: An application to environmental economics. Journal of the Association of Environmental and Resource Economists, 8(2), 235-275. [CrossRef]
  64. Marx, P., Tamer, E., & Tang, X. (2024). Parallel trends and dynamic choices. Journal of Political Economy Microeconomics, 2(1), 129-171. [CrossRef]
  65. Masten, M. A., & Poirier, A. (2020). Inference on breakdown frontiers. Quantitative Economics, 11(1), 41-111.
  67. Matricano, D. (2024). Designing effective policies for innovative start-ups: Lessons learned in Italy. Journal of Business Venturing Insights, 22, e00486. [CrossRef]
  68. Mazziotta, M., & Pareto, A. (2022). Composite indices construction: The performance interval approach. Social indicators research, 161(2), 541-551. [CrossRef]
  69. Mazzoni, L., Riccaboni, M., & Stam, E. (2025). Entrepreneurial ecosystems and interregional flows of entrepreneurial talent. Small Business Economics, 65(3), 1327-1361. [CrossRef]
  70. Mazzucato, M. (2015). The green entrepreneurial state. In The politics of green transformations (pp. 134-152). Routledge.
  71. Menon, C., DeStefano, T., Manaresi, F., Soggia, G., & Santoleri, P. (2018). The evaluation of the Italian “Start-up Act”. OECD Science, Technology and Industry Policy Papers.
  72. Mina, A., Di Minin, A., Martelli, I., Testa, G., & Santoleri, P. (2021). Public funding of innovation: Exploring applications and allocations of the European SME Instrument. Research Policy, 50(1), 104131. [CrossRef]
  73. Mouel, M. L., & Schiersch, A. (2024). Intangible capital and productivity divergence. Review of Income and Wealth, 70(3), 605-638.
  74. Nicoletti, G., & Smiderle, I. (2025). Where has all the productivity gone? Italy’s missing growth in the XXIst century: issues and pro-productivity policies.
  75. OECD. (2021). Understanding Firm Growth. OECD Publishing.
  76. Onesti, G., Monaco, E., & Palumbo, R. (2022). Assessing the Italian innovative start-ups performance with a composite index. Administrative sciences, 12(4), 189.
  77. Oster, E. (2019). Unobservable selection and coefficient stability: Theory and evidence. Journal of Business & Economic Statistics, 37(2), 187-204. [CrossRef]
  78. Penrose, E. T. (1959). The theory of the growth of the firm. Oxford University Press.
  79. Rambachan, A., & Roth, J. (2023). A more credible approach to parallel trends. Review of Economic Studies, 90(5), 2555-2591. [CrossRef]
  80. Richard, P. J., Devinney, T. M., Yip, G. S., & Johnson, G. (2009). Measuring organizational performance: Towards methodological best practice. Journal of management, 35(3), 718-804. [CrossRef]
  81. Rodríguez Valencia, L. (2025). Financial performance and corporate governance on firm value: Evidence from Spain. International Journal of Financial Studies, 13(3), 123. [CrossRef]
  82. Rosenbaum, P. R., & Rubin, D. B. (1983). The central role of the propensity score in observational studies for causal effects. Biometrika, 70(1), 41-55.
  83. Rosenbusch, N., Brinckmann, J., & Bausch, A. (2011). Is innovation always beneficial? A meta-analysis of the relationship between innovation and performance in SMEs. Journal of Business Venturing, 26(4), 441-457.
  84. Roth, J. (2022). Pretest with caution: Event-study estimates after testing for parallel trends. American Economic Review: Insights, 4(3), 305-322.
  85. Roth, J., & Sant’Anna, P. H. (2023). Efficient estimation for staggered rollout designs. Journal of Political Economy Microeconomics, 1(4), 669-709. [CrossRef]
  86. Roth, J., Sant’Anna, P. H., Bilinski, A., & Poe, J. (2023). What’s trending in difference-in-differences? A synthesis of the recent econometrics literature. Journal of econometrics, 235(2), 2218-2244. [CrossRef]
  87. Sant’Anna, P. H., & Zhao, J. (2020). Doubly robust difference-in-differences estimators. Journal of econometrics, 219(1), 101-122.
  88. Santoleri, P. (2020). Innovation and job creation in (high-growth) new firms. Industrial and Corporate Change, 29(3), 731-756. [CrossRef]
  89. Schifilliti, V., & La Rocca, E. T. (2024). Board gender diversity in innovative SMEs: An investigation across industrial sectors. European Journal of Innovation Management, 27(9), 461–486. [CrossRef]
  90. Serrasqueiro, Z., Pinto, B., & Sardo, F. (2023). SMEs growth and profitability, productivity and debt relationships. Journal of Economics, Finance and Administrative Science, 28(56), 404-419.
  91. Sun, L., & Abraham, S. (2021). Estimating dynamic treatment effects in event studies with heterogeneous treatment effects. Journal of econometrics, 225(2), 175-199.
  92. Sun, L., & Shapiro, J. M. (2022). A linear panel model with heterogeneous coefficients and variation in exposure. Journal of Economic Perspectives, 36(4), 193–204.
  93. Tarantola, S., Vertesy, D., & Albrecht, D. (2012). Composite Indicators measuring the progress in the construction and integration of a European Research Area.
  94. Valente, S., & Pisoni, A. (2024). Entrepreneurial Ecosystems: exploring the Italian Tech Scaleups scenario. Management of sustainability and well-being for individuals and society, 89.
  95. Wager, S., & Athey, S. (2018). Estimation and inference of heterogeneous treatment effects using random forests. Journal of the American Statistical Association, 113(523), 1228-1242. [CrossRef]
  96. Zúñiga-Vicente, J. Á., Alonso-Borrego, C., Forcadell, F. J., & Galán, J. I. (2014). Assessing the effect of public subsidies on firm R&D investment: a survey. Journal of economic surveys, 28(1), 36-67. [CrossRef]
Figure 1. Testing the Anticipatory Take-Up Mechanism: From Cross-Sectional Premium to Observable Dynamic Selection. Note. The figure summarises the paper’s empirical logic. Cross-sectional comparisons establish a performance premium among certified firms; event-time evidence shows that most of this premium predates certification; hazard estimates confirm that recent growth predicts entry. Together, the results support observable dynamic selection rather than a certification-induced performance effect.
Figure 2. Cost-based sectoral taxonomy: cluster centroids. Note. Stacked cost-share profiles of the four clusters. The capital-intensive cluster (96% innovative) and the services-intensive cluster (76%) contrast sharply with the materials-intensive cluster (47% innovative), where non-innovative firms concentrate, foreshadowing the heterogeneity analysis.
Figure 3. Cross-sectional effect vector of innovative status. Note. Pooled region-stratified ATT of innovative status on each of the six performance dimensions, in standard-deviation units, with 95% confidence intervals. Red marks statistically significant effects: a large growth premium against lower profitability and financial stability.
Figure 4. Effect of innovative status by macro-region. Note. ATT of innovative status by macro-region across the six dimensions, with 95% confidence intervals. Growth is positive everywhere, whereas the profitability and financial-stability penalties fall on Northern firms and disappear or reverse in the South.
Figure 5. Oster (2019) sensitivity of each cross-sectional effect. Note. Bias-adjusted coefficient β*(δ) as the assumed selection on unobservables (δ) rises relative to observables. Growth never crosses zero (robust); financial stability crosses only at δ ≈ 1.85; profitability crosses at δ ≈ 0.92, hence fragile.
Figure 6. Heterogeneity in the balanced financial-stability difference. Note. (A) Feature-importance ranking of the drivers of financial-stability heterogeneity: the capital-intensive indicator dominates, far ahead of firm size and region. (B) Conditional difference by sector: the solidity gap is concentrated in capital-intensive innovative firms.
Figure 7. Dynamic Selection into Certification: Revenue Trajectories Before and After Registration. Note. Panel A reports event-study estimates of revenue around registration, showing a pronounced pre-registration growth trajectory. Panel B decomposes the cumulative revenue premium, indicating that roughly 80% of the observed advantage is already in place before firms enter certification.
Figure 8. Financial solidity under within-firm identification. Note. Event-time coefficients for operating cash flow scaled by revenue, from the two-way fixed-effects event study and the Callaway–Sant’Anna estimator, with 95% confidence intervals. Near-event pre-trends are flat and no post-registration effect is significant.
Figure 9. HonestDiD sensitivity of the post-registration revenue effect. Note. Robust 95% confidence interval for the average post-registration effect as the assumed parallel-trends violation grows, under relative-magnitude (A) and smoothness (B) restrictions. The interval includes zero from a relative magnitude of roughly 0.13, an undemanding threshold.
Figure 10. Determinants of registration timing. Note. (A) Odds ratios from a discrete-time logit hazard of entry into the register, with year, sector, and region fixed effects; red marks effects whose 95% interval excludes one. (B) Model-implied registration probability across the revenue-growth distribution.
Figure 11. The anticipatory take-up model reproduces the observed pattern. Note. (A) The calibrated model’s event-study path against the empirical estimates.
Table 1. Positioning relative to prior evaluations of innovative-firm policies.
| Study | Setting / instrument | Performance treated as | Identification strategy | Selection addressed? | Key finding |
|---|---|---|---|---|---|
| Menon et al. (2018) | Italian Startup Act (DL 179/2012) | Multiple outcomes, levels | Conditional DiD / matching | On observables | Positive effects on access to equity, debt and employment |
| Anderloni & Harasheh (2025) | Italian Startup Act, innovative startups | Financial structure, survival (multiple) | Matching, survival & probit (cross-section) | Documented, not timing-based | Stronger finances and survival, slower to profit; leverage predicts registration |
| Onesti et al. (2022) | Italian innovative startups | Composite index | Descriptive (no causal design) | No | Builds a single composite performance score |
| Albanese & Bronzini (2026) | Public incentives, Italy | Firm birth | Quasi-experimental | On observables | Incentives raise the birth rate of innovative start-ups |
| Grilli, Mrkajic & Giraudo (2023) | Industrial policy, Italy | Financing / performance | Cross-section / matching | On observables | Founders’ human capital is a key margin of policy effect |
| Colombo & Grilli (2010) | Italian high-tech start-ups | Growth | Cross-section, selection-corrected | On observables | Human and venture capital drive start-up growth |
| Bronzini & Iachini (2014) | Italian R&D incentives | R&D investment | Regression discontinuity | Design-based (threshold) | Effect concentrated among smaller firms |
| Grilli & Murtinu (2014) | Government venture capital, Europe | Growth (sales, employment) | Panel, selection-corrected | On observables | Public-VC growth effect smaller than private VC |
| Lerner (2000) | SBIR programme, US | Growth, employment | Matched long-run comparison | On observables | Awardees grow faster over the long run |
| Cantner & Kösters (2012) | R&D subsidies to start-ups, Germany | Innovation / targeting | Matching | Studies the targeting itself | Subsidies partly reach firms that would have grown anyway (“picking winners”) |
| Czarnitzki & Delanote (2013) | Young innovative companies, Belgium | Growth | Cross-section regression | No | Young innovative firms are the new high-growth firms |
| Bottazzi, Secchi & Tamagni (2008) | Italian firms (general) | Multidimensional (productivity, profitability, finance) | Descriptive joint distribution | n/a | Performance dimensions are weakly associated; no single ladder |
| Coad & Rao (2008) | US high-tech firms | Innovation & growth | Quantile regression (cross-section) | No | Innovation matters most for the fastest-growing firms |
Note. Selected evaluations of innovative-firm status and comparable policies, classified by how performance is measured, the identification strategy, and whether selection is addressed. None combines a multidimensional performance vector with timing-based identification of selection — the gap this paper fills.
Table 2. The six performance dimensions: definitions and formulas.
| Dimension | Definition | Formula (per firm $i$) |
| --- | --- | --- |
| Growth | Mean annual log growth of revenue: the pace of expansion | $g_i = \operatorname{mean}_t(g_t)$ |
| Persistence | First-order serial correlation of the annual growth series: how regularly growth carries forward | $\rho_i = \operatorname{Corr}(g_t, g_{t-1})$ |
| Volatility | Standard deviation of annual revenue growth: how erratic, rather than how fast, the expansion is | $\sigma_i = \operatorname{sd}_t(g_t)$ |
| Productivity | Mean log labour productivity (revenue per employee) | $p_i = \operatorname{mean}_t \ln(R_t / L_t)$ |
| Profitability | Mean EBITDA margin in percentage points, winsorised to $[-100, 100]$ | $m_i = \operatorname{mean}_t(100 \cdot EBITDA_t / R_t)$ |
| Financial stability | Cash-flow stability index: mean operating cash flow over its standard deviation (an inverse coefficient of variation); higher = steadier cash | $s_i = \overline{CF} / \operatorname{sd}_t(CF_t)$ |
Note. The annual growth rate is $g_t = \ln R_t - \ln R_{t-1}$. Each measure is computed per firm over its full observation window and then standardised across the pooled sample, $z = (x - \bar{x}) / \hat{\sigma}$. $R$ = revenue, $L$ = employees, $CF$ = operating cash flow, $EBITDA$ = earnings before interest, taxes, depreciation and amortisation. The six dimensions are analysed jointly as a vector; their aggregability is tested (and rejected) in Section 6.
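The definitions in Table 2 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' code: the function names, the input layout (one list of annual values per variable), the use of the sample standard deviation, and the example firm are all hypothetical, and the final standardisation across the pooled sample is omitted.

```python
import math
from statistics import mean, stdev


def pearson(x, y):
    """Sample Pearson correlation of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def performance_vector(revenue, employees, ebitda, cash_flow):
    """Raw (unstandardised) six-dimension profile for one firm.

    Each argument is a list of annual values over the firm's window.
    """
    # Annual log growth: g_t = ln R_t - ln R_{t-1}
    g = [math.log(r1) - math.log(r0) for r0, r1 in zip(revenue, revenue[1:])]
    # EBITDA margin in percentage points, winsorised to [-100, 100]
    margins = [max(-100.0, min(100.0, 100.0 * e / r))
               for e, r in zip(ebitda, revenue)]
    return {
        "growth": mean(g),
        "persistence": pearson(g[1:], g[:-1]),  # Corr(g_t, g_{t-1})
        "volatility": stdev(g),                 # sample sd (assumption)
        "productivity": mean(math.log(r / l) for r, l in zip(revenue, employees)),
        "profitability": mean(margins),
        "financial_stability": mean(cash_flow) / stdev(cash_flow),
    }


# Hypothetical firm with five years of made-up data
profile = performance_vector(
    revenue=[100.0, 120.0, 150.0, 165.0, 170.0],
    employees=[10, 11, 13, 14, 14],
    ebitda=[8.0, 10.0, 9.0, 12.0, 11.0],
    cash_flow=[5.0, 7.0, 4.0, 8.0, 6.0],
)
```

Because annual log growth telescopes, the growth dimension for this example equals ln(170/100) divided by the four growth years, whatever path revenue takes in between.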
Table 3. Empirical strategy as a sequence of linked tests. Graphical abstract of the empirical logic.
| Block | Question addressed | Evidence used | Role in the argument |
| --- | --- | --- | --- |
| Measurement | What does “performance” mean for SMEs? | Six performance dimensions; sectoral taxonomy; PCA | Shows that performance is a multidimensional profile, not a single score. |
| Cross-sectional premium | Do certified firms differ from comparable non-certified firms? | Region-stratified entropy balancing | Establishes the descriptive premium: certified firms grow faster but are less financially solid. |
| Static selection | Could the premium be explained by omitted firm characteristics? | Oster bounds; covariate balance checks | Shows that static omitted characteristics do not easily explain the growth premium, but this is not yet causal evidence. |
| Timing and dynamic selection | Does the premium emerge before or after certification? | Event study; staggered DiD; Callaway–Sant’Anna estimator; honest bounds | Provides the decisive test: most of the growth premium predates certification and is consistent with dynamic selection. |
| Entry mechanism | Which firms enter certification, and when? | Discrete-time hazard model; anticipatory take-up mechanism | Confirms the mechanism directly: recent growth predicts entry, while profitability does not. |
Table 4. Cost-based sectoral taxonomy: cluster centroids (mean cost shares).
| Cluster | Materials | Services | Personnel | Depreciation | N | % innovative |
| --- | --- | --- | --- | --- | --- | --- |
| Materials-intensive | 0.62 | 0.18 | 0.16 | 0.04 | 918 | 47% |
| Labour-intensive | 0.10 | 0.33 | 0.49 | 0.08 | 1,551 | 75% |
| Services-intensive | 0.07 | 0.69 | 0.16 | 0.07 | 1,096 | 76% |
| Capital-intensive | 0.06 | 0.33 | 0.20 | 0.41 | 381 | 96% |
Note. Cluster centroids from k-means (k = 4; average silhouette width 0.41) on operating-cost shares. Clusters are labelled by their dominant cost category; “N” is cluster size and “% innovative” the share of innovative SMEs within it.
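Once the centroids are fixed, placing a firm in the taxonomy amounts to finding its nearest centroid. The sketch below uses the Table 4 centroids and Euclidean distance, which is the usual k-means assignment rule; the function name and the example cost shares are hypothetical, and any preprocessing the authors applied before clustering is not reproduced.

```python
import math

# Centroids from Table 4: (materials, services, personnel, depreciation)
CENTROIDS = {
    "Materials-intensive": (0.62, 0.18, 0.16, 0.04),
    "Labour-intensive": (0.10, 0.33, 0.49, 0.08),
    "Services-intensive": (0.07, 0.69, 0.16, 0.07),
    "Capital-intensive": (0.06, 0.33, 0.20, 0.41),
}


def assign_cluster(shares):
    """Return the label of the centroid closest to a firm's cost shares."""
    return min(CENTROIDS, key=lambda name: math.dist(shares, CENTROIDS[name]))


# Hypothetical firm whose costs are dominated by materials
example_firm = (0.55, 0.22, 0.18, 0.05)
label = assign_cluster(example_firm)
```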
Table 5. Covariate balance before and after region-stratified entropy balancing.
| Covariate | Abs. SMD (before) | Abs. SMD (after) |
| --- | --- | --- |
| Region: Centre | 0.219 | 0.000 |
| Region: North | 0.423 | 0.000 |
| Region: South | 0.242 | 0.000 |
| Sector: Capital-intensive | 0.458 | 0.442 |
| Sector: Labour-intensive | 0.164 | 0.251 |
| Sector: Materials-intensive | 0.619 | 0.019 |
| Sector: Services-intensive | 0.165 | 0.003 |
| log(Employees) | 0.637 | 0.326 |
| log(Revenue) | 0.920 | 0.193 |
Note. Absolute standardised mean differences between innovative and reweighted comparison firms, before and after balancing. Region is matched by construction; residual imbalance on the capital-intensive share and firm size reflects sparse Northern comparison cells.
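The balance statistic reported in Table 5 can be illustrated as follows. This is a sketch under assumptions: the paper does not state its denominator in this section, so the code scales by the treated-group sample standard deviation, one common convention for ATT-type weighting. The function name, data, and weights are hypothetical.

```python
from statistics import mean, stdev


def abs_smd(treated, control, weights=None):
    """Absolute standardised mean difference of one covariate.

    `weights` are comparison-group weights (e.g. from entropy balancing);
    when omitted, the comparison group is unweighted.
    Assumption: the difference is scaled by the treated-group sample sd.
    """
    if weights is None:
        weights = [1.0] * len(control)
    weighted_mean = sum(w * x for w, x in zip(weights, control)) / sum(weights)
    return abs(mean(treated) - weighted_mean) / stdev(treated)


# Made-up covariate values for illustration only
treated = [1.0, 2.0, 3.0, 4.0, 5.0]
control = [2.0, 4.0, 6.0, 8.0]

smd_before = abs_smd(treated, control)
# Hypothetical weights that pull the comparison mean onto the treated mean
smd_after = abs_smd(treated, control, weights=[0.6, 0.3, 0.1, 0.0])
```

In this toy case the weights reproduce the treated mean exactly, so the statistic falls to zero, in the same way the region dummies in Table 5 are balanced by construction.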
Table 6. Effect of innovative status on the six performance dimensions (pooled region-stratified ATT).
| Dimension | ATT (SD) | 95% CI | Significant |
| --- | --- | --- | --- |
| Growth | +0.751 | [+0.702, +0.810] | Yes |
| Persistence | +0.088 | [−0.019, +0.189] | No |
| Volatility | +0.012 | [−0.118, +0.128] | No |
| Productivity | −0.198 | [−0.280, +0.031] | No |
| Profitability | −0.237 | [−0.348, −0.186] | Yes |
| Financial stability | −0.443 | [−0.678, −0.346] | Yes |
Note. ATT in standard-deviation units, pooled across macro-regions and weighted by the regional distribution of treated firms; 95% confidence intervals from 200 bootstrap replications. Growth, profitability, and financial stability are significant; the other three are not.
Table 7. Effect of innovative status by macro-region (ATT, SD units).
| Dimension | North | Centre | South |
| --- | --- | --- | --- |
| Growth | +0.836 | +0.661 | +0.637 |
| Persistence | −0.021 | +0.197 | +0.239 |
| Volatility | +0.031 | −0.148 | +0.124 |
| Productivity | −0.284 | −0.113 | −0.080 |
| Profitability | −0.503 | −0.051 | +0.206 |
| Financial stability | −0.722 | −0.293 | +0.066 |
Note. Region-specific ATT in standard-deviation units (treated firms: North 1,514; Centre 630; South 642). The growth premium is essentially universal, while the solidity penalty is concentrated in the North and reverses in the South.
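The pooled estimates in Table 6 are described as the regional estimates weighted by the regional distribution of treated firms. As a worked check, the sketch below averages the Table 7 columns with the treated counts given in the note above; the variable and function names are illustrative only.

```python
# Treated firms per macro-region (from the note to Table 7)
TREATED = {"North": 1514, "Centre": 630, "South": 642}

# Region-specific ATT in SD units (Table 7)
REGIONAL_ATT = {
    "Growth": {"North": 0.836, "Centre": 0.661, "South": 0.637},
    "Persistence": {"North": -0.021, "Centre": 0.197, "South": 0.239},
    "Volatility": {"North": 0.031, "Centre": -0.148, "South": 0.124},
    "Productivity": {"North": -0.284, "Centre": -0.113, "South": -0.080},
    "Profitability": {"North": -0.503, "Centre": -0.051, "South": 0.206},
    "Financial stability": {"North": -0.722, "Centre": -0.293, "South": 0.066},
}


def pooled_att(att_by_region, treated=TREATED):
    """Average of regional ATTs, weighted by each region's treated count."""
    total = sum(treated.values())
    return sum(att_by_region[region] * n for region, n in treated.items()) / total


pooled = {dim: round(pooled_att(att), 3) for dim, att in REGIONAL_ATT.items()}
```

Rounded to three decimals, the weighted averages coincide with the pooled ATT column of Table 6 for all six dimensions, for example +0.751 for growth and −0.443 for financial stability.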
Table 8. Sensitivity to unobserved confounding (Oster, 2019).
| Dimension | β (uncontr.) | β (contr.) | R² (contr.) | R_max | δ* | Bias-adj. β (δ = 1) |
| --- | --- | --- | --- | --- | --- | --- |
| Growth | +0.660 | +0.685 | 0.118 | 0.153 | > 10 | +0.729 |
| Profitability | −0.230 | −0.054 | 0.110 | 0.143 | +0.92 | +0.005 |
| Financial stability | −0.693 | −0.365 | 0.198 | 0.258 | +1.85 | −0.168 |
Note. δ* is the selection on unobservables, relative to observables, that would drive each effect to zero (R_max = 1.3 × controlled R², capped at 1). Values above 1 indicate robustness; the bias-adjusted coefficient is evaluated at δ = 1.
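The R_max rule and the robustness reading of δ* stated in the note can be written out directly. This sketch covers only those two steps; it does not recompute δ* or the bias-adjusted coefficients, which need quantities (such as the uncontrolled R²) not reported here. Names are illustrative, and the growth entry stores the reported lower bound of 10, not an exact value.

```python
def r_max(r2_controlled, multiplier=1.3):
    """Oster-style upper bound on R²: multiplier x controlled R², capped at 1."""
    return min(1.0, multiplier * r2_controlled)


# Controlled R² and delta* as reported in Table 8
R2_CONTROLLED = {"Growth": 0.118, "Profitability": 0.110, "Financial stability": 0.198}
DELTA_STAR = {"Growth": 10.0, "Profitability": 0.92, "Financial stability": 1.85}  # Growth: "> 10"

r_max_by_dim = {dim: r_max(r2) for dim, r2 in R2_CONTROLLED.items()}

# An effect counts as robust when unobservables would need to matter more than
# observables (delta* > 1) to drive it to zero
robust = {dim: delta > 1.0 for dim, delta in DELTA_STAR.items()}
```

The computed bounds agree with the R_max column up to rounding of the reported R² values, and the δ* > 1 rule flags growth and financial stability as robust but not profitability.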
Table 9. Heterogeneity in balanced differences (forest-based conditional differences, SD units), by sector and macro-region.
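| Group | CATE growth | CATE profitability | CATE fin. stability | N |
| --- | --- | --- | --- | --- |
| By sector | | | | |
| Capital | +0.93 | −1.09 | −2.09 | 381 |
| Materials | +0.74 | −0.16 | −0.67 | 918 |
| Services | +0.91 | −0.23 | −0.43 | 1,096 |
| Labour | +0.56 | −0.17 | −0.28 | 1,551 |
| By macro-region | | | | |
| North | +0.81 | −0.52 | −0.81 | 1,906 |
| Centre | +0.71 | −0.10 | −0.47 | 1,005 |
| South | +0.60 | +0.01 | −0.29 | 1,035 |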
Note. Conditional average differences between innovative and comparison firms, in standard-deviation units, from a two-model gradient-boosted forest. These are descriptive conditional differences, not causal treatment-effect estimates. The financial-stability gap is concentrated in capital-intensive and smaller firms; macro-region is a secondary moderator that proxies industrial composition.
Table 10. Average post-registration revenue effect across estimators.
| Estimator | Parallel-trends treatment | Average post effect (k ≥ 0) |
| --- | --- | --- |
| Callaway–Sant’Anna (main text) | heterogeneity-robust | small, all coefficients ≈ +0.12 to +0.25 |
| Two-way fixed effects (event study) | diagnostic only | +0.231 (0.031) |
| Sun–Abraham (interaction-weighted) | heterogeneity-robust | +0.182 (0.033) |
| Borusyak–Jaravel–Spiess (imputation) | imposed | +0.29 (wide, imprecise) |
Note. Average of the post-registration event-time coefficients on log revenue. Heterogeneity-robust estimators (Sun–Abraham, Callaway–Sant’Anna) agree with two-way fixed effects; the imputation estimator, which imposes parallel trends, inflates the estimate by absorbing the pre-trend.
Table 11. Family-wise error-rate correction across the six dimensions (Romano–Wolf).
| Dimension | ATT (SD) | p (raw) | p (Romano–Wolf) | Significant |
| --- | --- | --- | --- | --- |
| Growth | +0.751 | < 0.001 | < 0.001 | Yes |
| Profitability | −0.237 | < 0.001 | < 0.001 | Yes |
| Financial stability | −0.443 | < 0.001 | < 0.001 | Yes |
| Persistence | +0.088 | 0.097 | 0.184 | No |
| Productivity | −0.198 | n.s. | n.s. | No |
| Volatility | +0.012 | 0.848 | 0.848 | No |
Note. Studentised step-down Romano–Wolf adjusted p-values controlling the family-wise error rate across the six outcomes. The three significant cross-sectional effects survive the correction; the three null effects remain null. Productivity’s bootstrap confidence interval includes zero.
Table 12. Hazard model of entry into the innovative-SME register.
| Covariate (one-year lag) | Odds ratio | 95% CI | Significant |
| --- | --- | --- | --- |
| Recent revenue growth | 1.46 | [1.30, 1.64] | Yes |
| Revenue acceleration | 0.91 | [0.84, 0.98] | Yes |
| Size (log revenue) | 0.79 | [0.77, 0.82] | Yes |
| Profitability (operating margin) | 1.00 | [1.00, 1.00] | No |
Note. Discrete-time logit hazard of registration, with calendar-year, sector, and macro-region fixed effects and standard errors clustered by firm; 13,559 firm-years and 1,321 registrations. Registration is timed to recent revenue growth and concentrated among smaller firms; profitability is not a selection margin.
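To give the odds ratios in Table 12 a probability scale, the sketch below applies each ratio to a baseline annual registration probability. The baseline is an assumption made for illustration: it is the raw share of registrations in firm-years (1,321 / 13,559), not the model's fitted baseline hazard, and the scaling of each lagged covariate is not stated in this section, so each figure is "per unit of the covariate as entered in the model". The function name is hypothetical.

```python
def shift_probability(p0, odds_ratio):
    """Probability implied by multiplying the odds of p0 by an odds ratio."""
    odds = p0 / (1.0 - p0) * odds_ratio
    return odds / (1.0 + odds)


# Illustrative baseline: raw registrations per firm-year (assumption, see above)
baseline = 1321 / 13559

# Odds ratios from Table 12
p_after_growth = shift_probability(baseline, 1.46)  # recent revenue growth
p_after_size = shift_probability(baseline, 0.79)    # size (log revenue)
p_after_margin = shift_probability(baseline, 1.00)  # profitability
```

Under this baseline, a one-unit increase in lagged growth moves the annual registration probability from just under 10 per cent to roughly 13.6 per cent, while an odds ratio of 1.00 on profitability leaves it unchanged.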
Table 13. Model predictions and their empirical counterparts.
| Model prediction | Empirical counterpart | Confirmed |
| --- | --- | --- |
| Hazard increasing in recent growth | Growth odds ratio 1.46 (p < 0.001) | Yes |
| Hazard decreasing in firm size | Size odds ratio 0.79 (p < 0.001) | Yes |
| Hazard invariant to profitability | Margin odds ratio ≈ 1.00 (n.s.) | Yes |
| Entry at the crest (growth decelerating) | Acceleration odds ratio 0.91 (p = 0.01) | Yes |
| Pre-registration run-up, post-registration plateau | Event study; ≈ 81% of gain pre-entry | Yes |
| No genuine post-registration effect | HonestDiD breakdown at M 0.13 | Yes |
Note. Qualitative predictions of the anticipatory take-up model against the corresponding empirical estimates from the hazard model and the difference-in-differences analysis. Every prediction is confirmed in the direction and, where applicable, the magnitude implied by the model.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
Copyright: This open access article is published under a Creative Commons CC BY 4.0 license, which permits free download, distribution, and reuse, provided that the author and preprint are cited in any reuse.