Retention in care and viral suppression in differentiated service delivery models for HIV treatment in sub-Saharan Africa: a rapid systematic review

Introduction: Differentiated service delivery (DSD) models for antiretroviral treatment (ART) for HIV are being scaled up in the expectation that they will improve the quality and efficiency of treatment delivery and reduce costs while maintaining at least equivalent clinical outcomes. Even this minimum requirement of equivalent clinical outcomes is poorly documented for most models and settings, however. We reviewed the recent literature on DSD models to describe what is known about clinical outcomes. Methods: We conducted a rapid systematic review of peer-reviewed publications in PubMed, Embase, and the Web of Science and major international conference abstracts that reported outcomes of DSD models for the provision of ART in sub-Saharan Africa from January 1, 2016 to September 12, 2019. Sources reporting standard clinical HIV treatment metrics, primarily retention in care and viral load suppression, were reviewed and categorized by DSD model and source quality assessed. Results and discussion: Twenty-nine papers and abstracts describing 37 DSD 52 discrete met


Introduction
Throughout sub-Saharan Africa, most national HIV programs are striving to achieve the 95-95-95 targets for HIV diagnosis, treatment, and viral suppression. 1 The rapid expansion of antiretroviral therapy (ART) programs to reach these targets has created shortfalls in health system capacity and quality. 2 In response, many countries are scaling up alternative service delivery approaches, or differentiated service delivery (DSD) models. DSD models differ from conventional HIV care in the location and frequency of interactions with the healthcare system, cadre of provider involved, and/or types of services provided. 3 Grimsrud and colleagues broadly categorize DSD models as individual or group models, with service delivery at a facility or in the community. 4 DSD models aim to achieve a wide range of potential benefits to both providers and patients. The attractiveness of DSD models is generally considered to be conditional on maintaining clinical outcomes at least equivalent to those of conventional care; assuming no deterioration in clinical outcomes, the hope is that DSD models will generate greater patient satisfaction, lower costs for both providers and patients, and make service delivery more efficient and convenient.
Despite the large-scale rollout of DSD models in various formats across multiple countries, there is a dearth of evidence to document the purported benefits of the new models in routine implementation. Even the minimum requirement of equivalent clinical outcomes is poorly documented for most models and settings. The studies and evaluations available are widely inconsistent in their designs, methods, and outcomes, making it difficult to draw an overall picture of the impact of the models. Monitoring and evaluation systems have not kept up with DSD model implementation, and DSD participation is poorly captured in routine records, making it challenging to compare outcomes in DSD models with those in conventional care. 5 The information available to policy makers, funders, and program implementers is thus incomplete and difficult to interpret.
To help fill this gap and create a baseline to guide future research, we conducted a comprehensive, rapid review of the most recent peer-reviewed reports of the outcomes of DSD model implementation in sub-Saharan Africa. In view of the importance of achieving non-inferior clinical outcomes as a condition for adopting DSD models, we report here the results of our search for retention in care, viral suppression, and related clinical outcomes.

Methods
Following World Health Organization guidance for rapid reviews, 6 we conducted a rapid systematic review of peer-reviewed publications and conference abstracts that reported outcomes of differentiated service delivery (DSD) models for the provision of antiretroviral treatment (ART) in sub-Saharan Africa since 2016. 7 The search protocol was previously presented, 7 and the review was registered on the International Prospective Register of Systematic Reviews (PROSPERO CRD42019118230).
Although the full review included a wide range of outcomes for both providers and patients, the most widely available information pertained to patient-level clinical outcomes, specifically retention in care and viral suppression. In this report we focus on these outcomes only, to allow for a more detailed examination and discussion of consistently-defined indicators. The full report of the review is available online. 8

Search strategy and study selection
For this review, we adopted and modified the widely-cited frameworks put forward by Grimsrud et al 4 and Duncombe et al 3 and defined as a "differentiated model of service delivery" any approach to providing ART that focused on a specific population, the location of service delivery, the frequency of patient interaction with the healthcare system, or the cadre of health care provider involved. We did not consider a change in services provided, without adjustment of any other characteristics, to constitute a DSD model. Inclusion and exclusion criteria for the review are shown in Supplementary Table 1. We searched the PubMed, Embase, and Web of Science databases with a search string developed to identify publications that reported on HIV treatment delivery models in sub-Saharan Africa from 1 January 2016 until 12 September 2019. The final search was conducted on 12 September 2019. We supplemented the peer-reviewed publications by manually searching peer-reviewed abstracts from major conferences for the same period. Search strings and a full list of conferences included can be found in Supplementary Table 2. Limiting eligible articles and abstracts to those published or presented since 2016 was intended to ensure that results come as close as possible to reflecting the current state of DSD model implementation and to avoid repeating the efforts of previous reviews. 2,9-12 If a source reported patient follow-up data collected both before and after January 1, 2016, we included it only if the majority of follow-up time, as stated in the source or estimated by the authors, occurred after that date.
We excluded sources that reported interventions aimed at improving conventional care that we judged did not in themselves comprise DSD models, such as adherence interventions that strengthened existing counseling or offered incentives for retention within the conventional model of care. We also excluded cross-sectional surveys of patients or providers who were asked to comment on DSD models but did not have personal experience with them. If two source documents described what we determined to be the same cohort of patients enrolled in the same instance of the model, we counted only one model but cited both references for it. If one source document superseded another, e.g. by reporting more complete data or longer-term outcomes, we kept only the more informative source.
All peer-reviewed references identified using the respective search strings from PubMed, Embase and Web of Science were imported into an EndNote™ library, where deduplication occurred. An initial, independent, blinded review (reviewers were not aware of each other's decisions) of the titles and abstracts was conducted by three study team members (SK, RC, CG) using Rayyan QCRI. 13 A full text review was then conducted for all publications remaining after the initial review by two study team members (SK, CG). Reasons for excluding publications were recorded during the full text review. As a quality check, another author (LL) also checked a sample (10%) of the excluded sources against exclusion criteria. At each stage of the review process any conflicts between reviewers were assessed and resolved through consensus of two authors (LL, SR). The results of the search were documented in accordance with the PRISMA-P reporting checklist (Supplementary Text 1). 6,14

Data extraction
The data extraction tool was designed to capture each DSD model separately, regardless of whether the source publication described one or many models. In addition to standard bibliographic descriptors, we collected two types of data: a) a detailed description of the model of service delivery; and b) the outcomes that were reported for the model. We categorized each model according to the taxonomy described by Grimsrud, 4 with four categories: facility based individual models, out of facility based individual models, healthcare worker led groups, and client led groups. We then used the adapted Duncombe 3 schema to describe the model in terms of population, provider, location, frequency, and services provided, as well as its outcomes. Where a comparison was provided with the pre- or non-differentiated standard of care, we also extracted data about these comparison models, henceforth referred to as conventional care.

Outcomes
We report here standard clinical HIV treatment metrics, including retention in care, viral load suppression, adherence, and pharmacy refill rates. We used each source's own definition and timing of these outcomes, accepting that definitions for "retention in care" vary widely, as do thresholds for determining viral suppression. Retention usually referred to the proportion of patients enrolled in a DSD model and retained in the ART program at a specific time point after enrollment in the study. The point at which a patient was considered no longer in care (i.e. not retained) varied by study or country. Where a loss to follow up (LTFU) proportion was reported, we converted it to a retention rate (calculated as 100% minus the LTFU percentage). Most sources defined viral suppression as <1000 copies/mL. Adherence and prescription refill frequency were uncommon outcomes but are included in this analysis when reported. Other outcomes from the full review, such as costs to providers and patients, can be found in the previously cited report. 15
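The conversion from a reported LTFU percentage to a retention percentage can be sketched as follows. This is a minimal illustration only: the function name and the example value of 12.5% are hypothetical and are not taken from any included source, and the calculation assumes, as the conversion itself does, that every patient not reported as lost to follow up is counted as retained.

```python
def retention_from_ltfu(ltfu_percent: float) -> float:
    """Convert a loss to follow up (LTFU) percentage into a retention percentage.

    Assumes everyone not reported as LTFU is counted as retained,
    i.e. retention = 100 - LTFU%.
    """
    if not 0.0 <= ltfu_percent <= 100.0:
        raise ValueError("LTFU percentage must be between 0 and 100")
    return 100.0 - ltfu_percent


# Hypothetical example value, not drawn from any study in the review.
print(retention_from_ltfu(12.5))
```

Because sources differ in how they define the point at which a patient is no longer in care, a converted value of this kind remains comparable only within the definition used by its own source.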

Analysis
To structure the results, we first divided the models into the four categories mentioned above: facility based individual models (FBIM), out of facility based individual models (OFBIM), client led groups (CLG), and healthcare worker led groups (HCWLG). In publications where more than one model was described, we reported each model separately. We report outcomes as stated in the original publications, adjusted where possible to utilize uniform metrics (for example, by converting a reported percentage of patients lost to follow up to the percentage of patients retained). As explained in the search protocol 7 , we feared that it would be misleading to conduct aggregate analyses due to the heterogeneity of model designs, participating populations, and study settings, even where outcomes themselves were similar. We thus report only the disaggregated results.
We assessed the quality of the cohort studies using the Newcastle-Ottawa scale. 16,17 The quality rating covered a review of selection, comparability, and outcome domains and generated a score out of 9. There are no standardized quality rating categories, but to simplify interpretation of scores, studies that scored 7 or above were categorized as high quality, those scoring 4 to 6 as moderate quality, and those scoring below 4 as low quality, as done in previous studies. 17 Randomized controlled trials were assessed using the Cochrane Collaboration's tool for assessing risk of bias in cluster randomized controlled trials. 18 We assessed sequence generation, participant recruitment with respect to randomization timing, deviation from intended intervention, completeness of outcome data for each main outcome, bias in measurement of the outcome, and bias in selection of the reported result. Risk of bias assessment for the one remaining cross-sectional study was not conducted. 19
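The score thresholds described above can be expressed as a short sketch. The function name is hypothetical; the cut-offs (7 or above, 4 to 6, below 4) are those stated in the text, and the sketch assumes integer scores on the 0 to 9 scale.

```python
def nos_quality_category(score: int) -> str:
    """Map a Newcastle-Ottawa score (0-9) to the quality category used in this review."""
    if not 0 <= score <= 9:
        raise ValueError("Newcastle-Ottawa scores range from 0 to 9")
    if score >= 7:
        return "high"
    if score >= 4:
        return "moderate"
    return "low"


# Print the category assigned to every possible score.
for s in range(10):
    print(s, nos_quality_category(s))
```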

Sources identified
The results of the systematic search are shown in Figure 1. A total of 3,498 non-duplicate abstracts of peer reviewed journal articles and 12,822 abstracts from the selected conferences were screened. After the initial title and abstract review, 16,092 articles and abstracts were excluded, leaving 228 documents for full review. During the full review, an additional 181 were excluded. Reasons for exclusions are reported in Supplementary Table 3. The primary reason (60%) for excluding articles was date: most or all of the underlying data were collected prior to 2016. The main reason for excluding conference abstracts (33%) was insufficient information to adequately describe the model and at least one of the outcomes of interest.
Nine peer reviewed articles and 38 conference abstracts (47 total) were retained in the final data set for the full review. Of these, 29 included one or more clinical outcomes and were included in the analysis reported here. Three quarters of these sources (76%) reported observational cohort studies; most of the rest (21%) were randomized trials. South Africa (27%) and Zambia (22%) jointly accounted for nearly half the sample (Supplementary Table 3).
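The screening counts reported above are internally consistent, which can be checked with a few lines of arithmetic. The sketch below uses only the figures stated in the text; the function and key names are hypothetical labels chosen for illustration.

```python
def screening_flow() -> dict:
    """Recompute the screening flow from the counts reported in the text."""
    journal_abstracts = 3498        # non-duplicate journal article abstracts screened
    conference_abstracts = 12822    # conference abstracts screened
    excluded_at_screening = 16092   # excluded after title and abstract review
    excluded_at_full_text = 181     # excluded during full review

    screened = journal_abstracts + conference_abstracts
    full_text_reviewed = screened - excluded_at_screening
    retained = full_text_reviewed - excluded_at_full_text
    return {
        "screened": screened,
        "full_text_reviewed": full_text_reviewed,
        "retained": retained,
    }


flow = screening_flow()
print(flow)
# The retained total should equal the 9 articles plus 38 abstracts reported.
print(flow["retained"] == 9 + 38)
```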

Differentiated models included in the review
The 29 sources described outcomes for a total of 37 discrete differentiated service delivery models, excluding conventional care models for comparison. Models are described briefly in Table 1 below and in full in Supplementary Table 4. In the tables, each model is assigned a model identifier (ID), which is used to reference that model throughout the review. If a source document (article or abstract) reports on more than one DSD model, multiple model IDs will be associated with it in Table 2. Each model identifier contains an acronym for the model category (FBIM, OFBIM, CLG, or HCWLG) followed by a number. For example, client led groups have model IDs CLG1 through CLG5, indicating that there were 5 distinct CLG models identified. In one instance (HCWLG11), the same model is referred to in more than one source document. 20,21
Table notes: *Most models where age was not specified appeared to be limited to adults. ¥The authors used associated documents (e.g. published study protocols, unpublished reports) relevant to these source documents to supplement the DSD model description, if insufficient detail was provided in the publication itself. §Sample sizes pertain to the entire study population rather than to a specific DSD model. For publications that evaluated different DSD models in each arm, we report the total N for the study cohort rather than the N in each study arm. ⱡFor most models, stable was defined per national guidelines, though clinicians used clinical criteria to define stability when necessary laboratory tests were not available. FBO, faith-based organization; FGD, focus group discussion.
In addition to the models listed in Table 1, 11 source documents reported comparative results for a conventional care model, creating a total of 48 model-instances with clinical outcomes included in this review (37 DSD + 11 conventional models).
Out of facility based individual models (32%) and healthcare worker led group models (35%) were the most commonly reported categories (Supplementary Table 5).
Three quarters (76%) of the models were limited to clinically stable patients, and most (59%) were for adults (Table 1). Definitions of stability varied. Some models required prior evidence of viral suppression while others relied on clinical condition, for example, and minimum duration on ART prior to model entry also differed. Details of how a stable patient is defined are presented elsewhere. 48 Additional model characteristics are described in Supplementary Table 6. Most models provided basic clinical care, antiretroviral medications (ARVs), and laboratory monitoring only (78%). Almost half (46%) included services delivered both in the clinic and in the community, rather than solely one or the other. For those that identified clinical care and pharmacy refill providers, nearly all clinical care (96%) was provided by trained clinicians, though few sources specified the clinical cadre involved; more than two thirds of medication refills (70%) were provided by non-clinician staff (community health workers, designated patients, or lay counselors). More than half the models (57%) required patients to have a total of 4-8 clinic visits or DSD model interactions per year; most of the rest required more than 8 visits or interactions per year, though a few (18%) were structured for 3 or fewer per year (Supplementary Table 6).
Quantitative results for each study are shown in Table 3. Some studies included effect sizes in comparison with conventional care, while others did not provide comparison values at all, but simply reported the outcomes of the DSD models. Table 4 provides additional information, including effect sizes, for studies that did report these measures. More detailed versions of both tables, including any estimates or calculations by the authors, can be found in Supplementary Table 7.

Retention in care
Although retention in care was the most commonly reported outcome, only a few sources provided a comparison to conventional care. For those that did, retention in the DSD model was generally within 5% of that in conventional care, with the exception of a healthcare worker led group model in the Democratic Republic of Congo, which greatly improved retention. 22 Among those not providing a comparison, retention generally exceeded 80%. For the few sources (n=4) which reported retention outcomes with an effect size, effects varied widely, from much better than conventional care to somewhat worse.

Viral load suppression
Among the 21 models that reported viral load suppression, eleven included a comparison with conventional care (including one that reported only an effect estimate and not actual values). All those with a comparison reported a small increase in suppression in the DSD model. Reported suppression exceeded 90% in 11/21 models. Five models reported viral suppression with an effect size estimate. Three of these found no difference in suppression when adjusting for baseline differences. Streamlined care in Uganda and Kenya 23 and CAGs in Mozambique 38 both reported approximately 15% (prevalence ratio=1.15 and unadjusted odds ratio=1.16, respectively) improvements in suppression.

Adherence and prescription refill rates
Few sources (n=4) used adherence to ARVs or prescription refill rates as outcomes; results are shown in Table 3. Rates of adherence (n=1) and prescription refill (n=3) were >90% across the models. Only two reported a comparison with conventional care and the DSD model outperformed conventional care in both instances. No effect sizes were reported for adherence or prescription refill measures.

Quality of evidence
Among the three quarters of the sources included that were cohort studies and thus evaluated on the Newcastle-Ottawa scale, the quality of the evidence was generally low to moderate (Supplementary Table 8). Only two of the 22 cohort studies received a score of 7 points (high quality) on the 9-point scale. The relatively low quality of evidence among cohort studies was due mainly to the absence of comparators in many of the studies and the scarcity of detail found in conference abstracts. Most of the remaining studies (n=6) were randomized controlled trials, for which we assessed quality using the Cochrane Collaboration's tool for assessing risk of bias in cluster randomized trials (Supplementary Table 9). 18 All three full-length articles (four models) were at low risk of bias, 21,23,33 but a concern about bias applied to the two abstracts, driven mainly by the fact that the conference abstracts did not contain full information on study methodology. 37,46

Discussion
We systematically reviewed and synthesized the current evidence related to clinical outcomes of differentiated service delivery models for HIV treatment in sub-Saharan Africa between 2016 and 2019. While we identified 29 sources that described one or more clinical outcomes of 37 DSD models in 11 countries, only a minority (28%) compared the alternative models to conventional care or to one another, making it difficult to draw strong conclusions about the overall impact of DSD models on clinical outcomes. Because of the heterogeneity of outcome definitions and timing and the highly variable quality, size, and scope of the studies included, we opted to present outcomes individually for each model, stratified by model category and outcome, rather than to estimate aggregate statistics.
For those models that did provide a comparison with conventional care, retention in care in DSD models was generally within 5% of that in conventional care, with a few exceptions that reported much better retention. Similarly, viral suppression was generally equivalent or slightly higher in the DSD models. We did not expect to see a marked improvement in clinical impact (retention or viral suppression) because most DSD models are limited to already-stable patients, for whom outcomes can be sustained but cannot improve. Where comparisons with conventional care were provided and effect sizes reported, effects on retention and suppression varied widely, from slightly worse than conventional care to moderately better. In general, DSD models were not associated with a meaningful deterioration in patient outcomes, despite in many cases having fewer interactions with patients or relying on lower cadres of clinicians than did conventional care.
As is evident from the discussion above, this review had many limitations. While we believe that our search of the peer-reviewed, published literature and abstracts was thorough, the lack of standard terminology for describing DSD models hampered the creation of precise search strings, and it is possible that some sources were missed. Most sources did not describe procedures for recruiting patients into DSD models, but it is possible that self- and provider-selection biased participation toward the most motivated and empowered patients, among all those who met formal eligibility criteria. More important, the extreme heterogeneity of the sources that did meet inclusion criteria would have made any attempt to aggregate results or produce summary statistics misleading. This heterogeneity manifested itself in multiple ways. DSD models are highly diverse in themselves. Evaluation methods ranged from single-site, single-arm observational cohorts to large randomized trials. The majority of sources did not provide comparisons with conventional care, and metrics for assessing outcomes varied widely and were in many cases poorly defined. The underlying patient populations were often poorly described or differed by design between models even within countries.
Stemming from these limitations, the search reported here identifies gaps in the evidence base and research priorities for DSD model implementation in the coming years. In particular, rigorous evaluation of clinical outcomes, with relevant comparisons, is needed if we are to fully understand the implications of DSD models for HIV control. Longer-term follow-up under routine care settings, beyond the first 12 or 24 months, should be undertaken, as it is critical to know what happens to retention and viral suppression three, five, or ten years after entry into a DSD model. This is especially important when DSD models are focused on stable patients and large changes in treatment outcomes are unlikely in the short term. Evaluation reports on the outcomes of DSD models should consistently include a description of the population served, as models limited to already-stable patients are likely to have different outcomes from those that enroll a cross-section of the ART patient population. There is also a need for electronic medical record systems to evolve to capture data on DSD model participation, as this is an essential step towards understanding the true clinical and other impacts of DSD models.

Conclusions
We note that there is a difference between the clinical outcomes of the patients enrolled in DSD models and the "impact" of implementing DSD models as part of national HIV programs. In many of the studies included in this review, only a small proportion of eligible patients were enrolled in a DSD model, and only those patients' outcomes were reported. The effect of those patients' outcomes on the overall, aggregate outcomes of the healthcare facilities at which the DSD models were implemented may have been modest, or even trivial, if large numbers of other patients remained in conventional care. Future evaluations of the outcomes of DSD models would be of greater value if they considered the entire, relevant patient population (for example, all the ART patients served by a facility, or all the ART patients in a catchment area) as the denominator for assessing success.