Global surgery, obstetric, and anesthesia indicator definitions and reporting: an Utstein Consensus Report

Background: Indicators to evaluate progress towards timely access to safe surgical, anaesthesia, and obstetric (SAO) care were proposed in 2015 by the Lancet Commission on Global Surgery. Although they were rapidly taken up by practitioners, the data points from which to derive them were not defined, limiting comparability across time or settings. We convened global experts to evaluate and explicitly define, for the first time, the indicators to improve comparability and support achievement of 2030 goals to improve access to safe affordable surgical and anaesthesia care. Methods and findings: The Utstein process for developing and reporting guidelines through a consensus building process was followed. In-person discussions at a two-day meeting were followed by an iterative process conducted by email and virtual group meetings until consensus was reached. Participants consisted of experts in surgery, anaesthesia, and obstetric care, data science, and health indicators from high, middle, and low income countries. Considering each of the six indicators in turn, we refined overarching descriptions and agreed upon data points needed for construction of each indicator at the current time (basic data points), and as each evolves over 2-5 year (intermediate) and >5 year (full) timeframes. We removed one of the original six indicators (one of two financial risk protection indicators was eliminated) and refined descriptions and defined data points required to construct the 5 remaining indicators: geospatial access, workforce, surgical volume, perioperative mortality, and catastrophic expenditure. Conclusions: To track global progress toward timely access to quality SAO care, these indicators, at the basic level, should be implemented universally. Intermediate and full evolutions will assist in developing national surgical plans and collecting data for research studies.

Introduction
In 2015, The Lancet Commission on Global Surgery (LCoGS), Disease Control Priorities-3 Surgery, and World Health Assembly Resolution 68/15 on "Strengthening Emergency and Essential Surgical Care and Anaesthesia as a Component of Universal Health Coverage" showed the dire state of surgical and anaesthesia care provision globally and the necessity for large and rapid improvements in many low and middle income countries (LMICs). (1,2) Given that there were no widely accepted indicators to track progress towards improved timely access to quality surgical and anaesthetic care, members of LCoGS proposed a set of six indicators (appendix table 1) for this purpose. These were to be used as a set to illustrate access and quality, and were broadly classified under preparedness for care (access to timely surgery and workforce density), delivery of surgical and anaesthesia care (surgical volume and perioperative mortality), and effect of surgery and anaesthesia (protection against catastrophic expenditure and protection against impoverishing expenditure). These indicators were rapidly adopted into the WHO's 100 Basic Global Health Indicators and the World Bank's World Development Indicators. (3, 4) They have also been used in research studies to assess the state of provision of surgical care in multiple country settings and proposed for use by ministries of health to assess progress towards improving surgical care nationally. (5)(6)(7)(8) However, although the indicators were widely accepted as valuable, the LCoGS defined them only broadly, leaving much flexibility in the choice of data points from which to derive them. Given that each indicator is formed from multiple data points (for example, perioperative mortality requires assessment of death, the time of death, and, potentially, the risk of death for patients undergoing surgery), lack of clarity has resulted in confusion and delays in data collection, and difficulty in comparing results among countries and over time.
Indeed, a recent assessment of country-level indicator reporting found poor availability and heterogeneous definitions that limited the comparability and utility of the indicators. When using the indicators put forward by LCoGS, although 154 countries out of the WHO member states had data on workforce, only 19 had data on timely access to a facility capable of providing surgical care, 72 had data on the numbers of procedures done, and 9 had data on perioperative mortality. No country had empirical data on the 2 indicators of effect of surgery and anaesthesia. Even for workforce, the most widely available indicator, definitional issues limited its comparability across countries and its utility. (9) For perioperative mortality, several different reporting times are in use, i.e. 24-hour mortality, 7-day mortality, in-hospital mortality, 30-day mortality, or surgical mortality. This greatly hinders the ability to assess achievement of global targets for surgery or to combine results from research studies in meta-analyses. (9)(10)(11) Our aim was therefore to bring experts in surgical, obstetric, and anaesthesia clinical care and academia together with global indicators experts, data scientists, and policy makers to appraise the existing indicators, refine their descriptions, and define the data points needed for their derivation. The intention was to both reinforce and clarify the global indicator set for use in research and development purposes.

Methodology
We assembled an international group of experts in policy; surgery, anaesthesia, and obstetrics; and data science for an in-person meeting to develop consensus using the principles of the Utstein Process. (12)(13)(14)(15)(16) Previous Utstein initiatives have focused on defining core outcome sets for out-of-hospital cardiac arrest and cardiopulmonary resuscitation, the time-points at which they should be collected, and the way in which they should be reported. Our aim was to bring this Utstein evidence-based rigour and consensus-informed consolidation to Global Surgery Indicators.
The meeting took place at the Utstein Abbey, Mosteroy Island, Norway on June 16-18, 2019. The meeting was followed up by email correspondence amongst all members of the panel and virtual group discussions to resolve ongoing issues.

Panel selection
The steering committee (JD, JM, AG, JMO, JGB) identified potential participants with relevant expertise as well as experience of working in multiple settings and country income strata. Snowball sampling was then used to identify further participants within these areas. Sixty potential participants were identified; however, the available meeting space was limited to 40. Therefore, 40 were shortlisted by the steering committee based upon the relevance of their expertise and the need to achieve a balance across areas of expertise. All 40 shortlisted participants were invited.

Preparation
Prior to the meeting, relevant literature on global surgery/anaesthesia and indicators was sent to participants. (1, 7-9, 11, 17-25) In addition, all participants were sent information on guiding principles previously used to establish global surgery indicators (table 1). (20) Members were informed that the purpose of the meeting was to appraise, revise, and define, but not necessarily abandon, the existing indicators, which have already garnered global momentum. Members were also encouraged to share their own experience of indicator collection in their specialty fields; their recommendations, successes, and failures informed development of these indicators. (3, 26, 27)

Table 1: Guiding Principles for Global Surgery Indicators. (20)
Simplicity: Indicators should be simple, clear, and inexpensive to obtain from hospitals, providers, professional societies, and governmental agencies. Health resources should not be diverted or unduly burdened by demands for data collection.
Wide applicability: Indicators should use definitions relevant to the span of surgical care worldwide. They should also be meaningful to health professionals, researchers, and policy makers, and provide information allowing reasonable conclusions on the state of surgical services within a country.
Relevance to public health: Surgical indicators should incorporate measures of access and outcome. They should provide indicators likely to respond to substantial changes in the delivery or quality of surgical care.
Unintended negative consequences of measurement reduced to a minimum: Potentially negative consequences should be considered, since scrutiny can result in perverse effects, driving practice patterns that bolster statistics at the expense of patient care.

Consensus process
Methods were in accordance with Utstein methodology on developing reporting guidelines, (12)(13)(14)(15) and other guidelines for developing reporting criteria.(16) Utstein-style conferences use an established consensus process to consolidate definitions and reporting criteria to improve comparability of outcomes reported in studies, databases, demographic surveys, and administrative reports. The resulting outputs are guidelines and templates which can subsequently be adopted by governments, policymakers, journals, demographers, and researchers as unifying reporting criteria. This ensures global consistency and comparability across data types, definitions, and reporting style.
The steering committee assigned attendees to one of 6 working groups based on each attendee's knowledge and expertise. Each working group related to one indicator: Access, Volume, Workforce, and Perioperative Mortality Rate (POMR). Catastrophic and Impoverishing Expenditure were discussed by a single group, given their similarity. An additional group, entitled the 'Parking Lot', was included to address gaps in the current set of indicators that should be further developed in future iterations of the Utstein consensus and/or through future research. The results of this group are not presented here, but will be used for future evolutions of the indicators.
Each group was assigned a lead and a deputy, based on their previous leadership in the indicator under consideration. The group lead presented an outline of the current definition of the indicator (1) and issues found in its availability, comparability, and utility. (9) The groups were then asked to develop a clear overall definition for each indicator and consider the overarching data points needed to derive it. Then, given the potential levels of granularity and complexity inherent in each indicator, each group was asked to consider the minimum basic data points (Basic) needed to allow global comparisons using nationally-led data collection initiatives. To be defined as Basic, we agreed that reporting at the country level should be feasible within the next 2 years. We then asked groups to consider how these data points should evolve to Intermediate (2-5 years) and Full (>5 years) sets which can be used to guide research studies and aid policy making at the national level.
Each indicator working group was initially divided in half to address the indicator independently, and then re-convened as a complete group to compare notes and recommendations. After agreement within the working group, each presented their suggestions to the full panel for a plenary session to build consensus across all attendees. Thus, all participants contributed to the discussions on each of the indicators. Key points of discussion and the outcomes of the plenary discussions were recorded by the working group lead and deputy.
After the meeting, each working group lead and deputy entered the discussion results for their indicator into a template. Templates were compiled by the steering group and then circulated to all attendees for feedback. Comments were again compiled by the steering group who, on discussion, further refined indicators or their data points to ensure that there was consistency in reporting across all indicators. After this process, any adjustments were sent to all attendees for further feedback and then correction by the writing group, until consensus was built. Where disagreement remained after this process, small working groups of panel members with relevant expertise were assembled to enable consensus to be reached. These groups were facilitated by one of the steering group members.

Results
Thirty-eight participants attended the meeting; country of origin and speciality are shown in table 2. Working group members and leads are shown in appendix table 2. Small group discussions to achieve consensus were completed by August 2020. There was consensus that the overarching descriptions and data points used to derive all indicators required further clarification in order to improve their availability, comparability, and utility for research, reporting, and national and international planning. The meeting resulted in changes to the overarching descriptions of the indicators. Importantly, the panel reached consensus on data points and how to use these to derive all indicators across three progressive levels: Basic, Intermediate, and Full, whereas these were previously not defined.
The Basic data sets are for use for global reporting at the macro-level only, since they provide insufficient granularity to inform national planning or service refinement at the meso- or micro-level. For example, the Basic data set does not provide meaningful comparison of POMR across settings since the results are not adjusted for baseline patient risk or type of procedure.
* For comparability, travel time means ideal time to travel between a location and a facility. It does not mean experienced travel time from recognition of the need for surgery to arriving at a facility, which may incorporate delays in seeking care or delays in obtaining transport.
** We have not provided a definition of what a surgery, anaesthetic, or obstetric provider is; we agreed these should be defined by each country, with recognition that the definitions are likely to vary locally. Providers are persons directly involved in delivering the surgery, obstetric, or anaesthetic care, i.e. the person doing the operation or giving the anaesthetic.
*** Certified means completion of a government and/or professionally approved advanced education program that leads to a nationally recognised qualification to provide surgery, anaesthesia, or obstetric care.
**** Specialist physicians are providers who have obtained a medical degree (physician) and undergone specialty post-graduate training (certification).
***** This recognises that, at the current time, definitions of procedures that constitute surgery differ between countries and data sources. We have therefore agreed upon a broad definition of procedures for the Basic data set (<2 year timeframe), without defining a list. This definition includes incision, excision, or manipulation of tissue needing anaesthesia in an operating theatre. This includes day-cases, but excludes procedures in other locations, i.e. outside of the operating theatre. Definition of anaesthesia is regional or general anaesthesia, or profound sedation to control pain. Multiple surgical codes within a single anaesthesia procedure are counted as one case. If only a subset of procedures is feasible to collect for this indicator, then the type of procedures included should be transparently reported.
****** Catastrophic expenditure is usually calculated at the individual level (with data collected on out-of-pocket (OOP) and household expenditure for each individual undergoing a medical admission episode). However, many people do not access surgical care because of fear of catastrophic expenditure. This indicator thus uses individual OOP expenditure for those who seek surgery in combination with national average level household expenditure to estimate the proportion of people who would suffer catastrophic expenditure if they were to need surgery.
******* Direct OOP costs could, in reality, include pre-hospital direct medical costs. However, they are not included here as they are small relative to the hospitalisation episode and patients may not recall these as readily as hospitalisation costs. This does not include direct non-medical costs (lodging, food, transport to and from facility). This does not include indirect costs (e.g. loss of earnings).
******** We note that, as per SDG Target 3.8.2, there are two recognised thresholds, >10% and >25%; however, we have chosen 10%.
Table 3 shows, for each indicator, the original LCoGS overarching description, the Utstein revised description, a summary description of the data points required to construct the indicator, and the Basic (<2 year) data points needed to construct the indicator. The appendix tables 3-7 contain these parameters for Intermediate and Full data sets.
Two indicators on effect of care were condensed into one: risk of catastrophic expenditure on requirement for surgery replaces protection against catastrophic expenditure and protection against impoverishing expenditure. Use of catastrophic expenditure aligns with the expenditure indicator used in the Sustainable Development Goals and is a key indicator to monitor progress towards Universal Health Coverage.
Regarding changes to the indicator descriptions, the panel agreed that the original indicator Access to Timely Essential Surgery should be changed to Geospatial Access to a facility that has capacity to deliver surgery and anaesthesia care for Bellwether procedures. This is in order to reduce the potential dimensions inherent in the broad concept of access (for example, cultural, quality, and financial), noting that quality and financial dimensions are covered by other indicators. We agreed that the data points for constructing this indicator at the Basic level should allow estimation of the proportion of the population who would have geospatial access to a facility were they to need care. Whilst realised access (a person who needs care actually accesses it) may be feasible to measure in some countries, given the complexity of collecting these data in countries with under-developed health systems, the consensus was that this should not form part of the Basic data set. Information on whether a facility provides the Bellwether procedures (originally caesarean section, laparotomy, treatment of an open fracture) is a necessary component of this indicator. The Bellwethers were developed as a marker of a hospital which, if all three were provided, could deliver a broad base of surgical care. (1) Although these procedures have been collected as part of research studies, we agreed that their utility for national reporting was limited by lack of definitional clarity, especially for treatment of an open fracture. We discussed this issue at length, including whether we should remove the concept of Bellwethers from this indicator, but ultimately reached consensus that they should be included, with the clarification that treatment of an open fracture should become surgical management of an open long bone fracture.
The main consensus change to the Specialist Surgical Workforce Density indicator was to include all cadres providing surgical, obstetric, and anaesthesia care in the definition, broadening it from the physician workforce alone to include other nationally certified (non-physician) practitioners. We also improved clarity in the definition of providers in order to allow evolution of granularity.
The panel agreed that the potential breadth of procedures that can be defined under the umbrella of surgery limits the comparability and utility of the Surgical Volume indicator.(10) (19) For the Basic data set, we have therefore defined surgical procedures in broad terms as procedures done in an operating theatre. These include incision, excision, or manipulation of tissue using anaesthesia in an operating theatre, including day-cases but excluding surgical procedures in other locations i.e. outside of the operating room. Definition of anaesthesia is regional or general anaesthesia, or profound sedation to control pain during the procedure. We agreed a structure to increase the granularity of the data collected over time, acknowledging that whilst providers and operating theatres often capture detailed data on procedures done, these data are held in handwritten log books and are difficult to extract for monitoring purposes.
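The Basic counting rule can be illustrated with a minimal sketch. It assumes a hypothetical theatre log in which each row is one procedure code tagged with the anaesthesia episode it belongs to; the record fields and category labels below are illustrative assumptions, not part of the consensus data points.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ProcedureRecord:
    episode_id: str             # hypothetical ID of the anaesthesia episode
    procedure_code: str         # hypothetical local procedure code
    in_operating_theatre: bool  # True if performed in an operating theatre
    anaesthesia: str            # illustrative labels, see the set below

# Anaesthesia types named in the Basic definition (labels are assumptions).
QUALIFYING_ANAESTHESIA = {"general", "regional", "profound_sedation"}

def count_basic_surgical_cases(records):
    """Count cases for the Basic surgical volume data set.

    Several procedure codes within one anaesthesia episode count as a
    single case; procedures outside the operating theatre are excluded.
    """
    episodes = {
        r.episode_id
        for r in records
        if r.in_operating_theatre and r.anaesthesia in QUALIFYING_ANAESTHESIA
    }
    return len(episodes)

log = [
    ProcedureRecord("e1", "A01", True, "general"),
    ProcedureRecord("e1", "A02", True, "general"),   # same episode as above
    ProcedureRecord("e2", "B10", True, "regional"),
    ProcedureRecord("e3", "C05", False, "general"),  # outside the theatre
    ProcedureRecord("e4", "D07", True, "local"),     # not qualifying here
]
basic_case_count = count_basic_surgical_cases(log)
```

In this example the two codes recorded under episode e1 contribute one case, so the log yields two cases in total.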
Regarding Perioperative Mortality Rate (POMR), the only clinical indicator in the LCoGS indicators, we had disagreement about whether the indicator should simply be that countries are collecting information on POMR, given the potential adverse consequences of reporting a poor POMR. However, after discussion, we reached the consensus that there was utility in reporting POMR, although for global accountability processes, this should be at a national rather than hospital level. There was consensus that for the Basic data set, the time period for reporting should be in-hospital mortality, rather than 30-day mortality, which is a standard indicator reported in high income countries. This is due to strong evidence that mortality out to 30 days is currently not available in many countries. We also agreed that risk-adjustment is not currently possible for many countries which lack data related to procedure type and patient risk (derived using the American Society of Anesthesiologists [ASA] score); therefore, at the Basic level, POMR will not be risk adjusted. We noted that lack of risk-adjustment will limit comparisons across countries given the presence of differences in risk between country populations. Comparisons may become feasible at the Intermediate and Full levels, for which we agreed that covariates for risk-adjustment at the patient level should also be collected.
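As a worked illustration, a crude (not risk-adjusted) in-hospital POMR at the Basic level could be computed as below. This is a sketch under stated assumptions: the function name is hypothetical, and the choice of denominator (number of procedures rather than number of patients) is an assumption, not a restatement of the consensus tables.

```python
def crude_in_hospital_pomr(deaths_before_discharge: int, procedures: int) -> float:
    """Crude in-hospital perioperative mortality as a proportion.

    Assumption: the denominator is the number of procedures done in an
    operating theatre over the reporting period. No risk adjustment.
    """
    if procedures <= 0:
        raise ValueError("procedures must be a positive count")
    if not 0 <= deaths_before_discharge <= procedures:
        raise ValueError("deaths must lie between 0 and the number of procedures")
    return deaths_before_discharge / procedures

# Hypothetical national totals for one reporting year.
pomr = crude_in_hospital_pomr(deaths_before_discharge=12, procedures=4000)
pomr_percent = 100 * pomr  # about 0.3%
```

Because no covariates enter the calculation, two countries with the same crude value may still differ substantially in case mix, which is why the panel cautions against cross-country comparison at the Basic level.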
Discussions around the indicator on Catastrophic Expenditure centred on the nearly universal lack of data points from which to derive this indicator, especially in LMICs. For example, documented hospital costs of procedures often grossly under-represent the full extent of direct medical costs, patients may not be aware of their household expenditure, and people who are impoverished may not access surgical care at all. Rigorous collection of these data requires exit interviews with patients. However, we recommend that, at the least, data on costs of surgery are collected using nationally representative surveys where reliable information on costs of care is not available from other sources. To overcome the difficulty in ascertaining individuals' household expenses, we recommend the use of national household expenditure, which will allow estimation of the proportion of the population who would be at risk of catastrophic expenditure if they were to need surgery. We thus agreed to change the overarching description of this indicator to "Percentage of the population at risk of catastrophic expenditure if they were to require care for a surgical procedure". We recognised that it does not capture non-medical direct and indirect costs, e.g. those incurred in accessing care, but the difficulty in collecting these means they are not feasible for the Basic or Intermediate data sets.
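One simple reading of this approach can be sketched as follows. It assumes that surveyed direct medical OOP costs are representative of what people would pay if they needed surgery, and it compares each cost with 10% of the national average household expenditure. The function name, the use of annual expenditure, and the example figures are all hypothetical.

```python
def share_at_risk_of_catastrophic_expenditure(
    oop_costs, national_avg_household_expenditure, threshold=0.10
):
    """Share of surveyed surgical episodes whose direct medical OOP cost
    exceeds `threshold` times national average household expenditure.

    Assumptions: costs and expenditure are in the same currency and
    refer to the same period (for example, one year).
    """
    costs = list(oop_costs)
    if not costs:
        raise ValueError("at least one surveyed OOP cost is required")
    if national_avg_household_expenditure <= 0:
        raise ValueError("household expenditure must be positive")
    cutoff = threshold * national_avg_household_expenditure
    at_risk = sum(1 for cost in costs if cost > cutoff)
    return at_risk / len(costs)

# Hypothetical survey: four OOP costs and an average expenditure of 1000.
share = share_at_risk_of_catastrophic_expenditure([50, 120, 300, 80], 1000)
```

With these illustrative numbers the cutoff is 100, two of the four costs exceed it, and the estimated share at risk is 0.5. A fuller treatment would use the distribution of household expenditure rather than a single national average.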

Discussion
The meeting attendees agreed that the LCoGS indicators as initially listed were too vague to allow for comparability across or within multiple settings, and that their data elements had never been defined. However, we were unanimous that the indicators themselves were useful, especially when used together as a set to assess timely access to quality surgical, obstetric, and anaesthesia care. We also agreed that data points should evolve over time and account for the development of countries' ability to collect data. This "evolution" also enables different uses of those data, with the Basic data points, which should be collectable by most countries, used for international or national comparisons and the Intermediate or Full data points being of greater utility for national planning or research studies. (5) Given the broader utility of indicators derived from and disaggregated according to the Full data points, we urge researchers working at local, national, or regional levels to use these definitions in order to later allow compilation of data from across multiple countries using systematic methodologies and meta-analyses. Additionally, although we recognise that these more granular data points (of Intermediate and Full) may not be feasible to collect in countries where data-systems are nascent, we strongly recommend they are collected for national planning purposes as soon as possible.
Ensuring political priority for anaesthesia and surgery requires that four elements are in place in the broad areas of i) actor power, ii) ideas, iii) political context, and iv) issue characteristics (broadly, the capture of data to show the issues that need to be addressed). (28)(29)(30) The global surgery movement has been shown to be deficient in all of these areas, especially in comparison to the movement to improve maternal health. (28) This Utstein meeting was convened to address, in particular, the area of issue characteristics. Harmonised data collection should facilitate coherent presentation of ideas and their internal and external framing, and, with strong joined-up actors, enable a shift in the political context.
Access to surgical and anaesthetic care is crucial for ensuring the health and wealth of populations. Global reporting of accessible, comparable, and utilizable data is central to ensuring accountability and advocacy, and the newly defined indicators will facilitate such data collection. More granular data for national policy-making will also be improved by these refinements. These updated indicator definitions applied at the international and national level will facilitate progress towards timely access to safe, affordable care and we advocate for their use in all surgery related data collection initiatives. An intended eventual output of this process should be directly actionable by individual countries and the United Nations Statistical Commission, with broad and long-term international impact.
Funding and organisational support for the meeting was provided by the WFSA and Laerdal Foundation.
Contributors: Authors are listed alphabetically. JD, AWG, JGB, JM, and JMO organised the meeting; JD and JM wrote the first draft of the manuscript before review by the members of the writing group and subsequent review and approval by the Utstein Global Surgery Indicators Group participants (see Appendix).