Ridge to Reef Management Implications for the Development of an Open-Source Dissolved Inorganic Nitrogen-Loading Model in American Samoa

Excessive nutrient discharge to tropical island coastlines drives eutrophication and algal blooms with significant implications for reef ecosystem condition and provision of ecosystem services. Management actions to address nutrient pollution in coastal ecosystems include setting water-quality standards for surface waters discharging to the coast. However, these standards do not account for the effects of groundwater discharge, variability in flow, or dilution, all of which may influence the assessment of true nutrient impacts on nearshore reef habitats. We developed a method to estimate dissolved inorganic nitrogen (DIN) loads to coastal zones by integrating commonly available datasets within a geospatial modeling framework for Tutuila, American Samoa. The DIN-loading model integrated an open-source water budget model, water-sampling results, and publicly available streamflow data to predict watershed-scale DIN loading to the island’s entire coastline. Submarine groundwater discharge (SGD) was found to deliver more terrigenous DIN to the coastal zone than surface water pathways, supporting findings from other tropical islands. On-site wastewater disposal systems were also found to be the primary anthropogenic sources of DIN to coastal waters. Our island-wide DIN-loading model provides a simple and robust metric to define spatially explicit sources and delivery mechanisms of nutrient pollution to nearshore reef habitats. Understanding the sources and primary transport modes of inorganic nitrogen to nearshore reef ecosystems can help coastal resource managers target the most impactful human activities in the most vulnerable locations, thereby increasing the adaptive capacity of unique island ecosystems to environmental variation and disturbances.


Introduction
On tropical islands, excessive nutrient discharge to naturally oligotrophic coastal waters can significantly disrupt the nearshore nutrient balance, potentially causing algal blooms or eutrophication (McCook 1999;Morton et al. 2011). In these environments, excessive nitrogen (N) loading, and in particular, high dissolved inorganic nitrogen (DIN) concentrations significantly affect phytoplankton, turf algae, and macroalgae growth (e.g., Smith et al. 1981;Pendleton 1995;Rodgers et al. 2015;Amato et al. 2016). Nutrient enrichment can potentially lead to a transition from a coralto an algal-dominated state on reefs (Hughes et al. 2007a;Littler et al. 2006;McCook 1999), with significant impacts to coral reef functions and ecosystem services (Hughes et al. 2007b).
Human population and development are dominant drivers of increased coastal nitrate and DIN concentrations (e.g., Caraco and Cole 1999;Waterhouse et al. 2017). Because coastal water DIN concentrations are reliable predictors of nearshore reef ecological condition Delevaux et al. 2018), management actions often include water-quality monitoring and defining nutrient-based water-quality standards for terrestrial and coastal surface waters (AS-EPA-American Samoa Environmental Protection Agency 2013; Hawai'i Administrative Rules 2013; SWRCB-State Water Resources Control Board 2015). Water-quality standards typically focus on surface water pathways of nutrient transport. However, this strategy does not account for the important effects of nutrient transport via submarine groundwater discharge (SGD). In tropical island settings across the globe, SGD delivers an equivalent or significantly higher nutrient load to coastal ecosystems than streamflow (e.g., D'Elia et al. 1981;Moosdorf et al. 2015;Bishop et al. 2017;). In addition, water-quality standards are typically defined for nutrient concentrations in environmental waters, which may not be representative of the true impact of terrestrial discharge into the coastal zone, as flow rates of streams or SGD may vary greatly. Instead, nutrient loading provides a better indication of how nutrients in terrestrial discharge impact nearshore reef ecosystems. While estimation of loads can be complicated by factors such as hysteresis (Lloyd 2016) or biases such as heteroscedasticity and seasonal variability in regression models (Hirsch 2014), in its most basic form, coastal nutrient loading is typically calculated by multiplying nutrient concentration by water discharge rate (e.g., Swarzenski et al. 2013;Pellerin et al. 2014;Delevaux et al. 2018). There are also many other factors affecting the fate of dissolved nutrients upon reaching the ocean, such as wind, nearshore circulation, and wave-driven currents. Also, the specific benthic habitat and substrate, as well as the water column depth at stream and coastal groundwater spring outlets, may be a factor in the ecosystem's response to any given nutrient load (Moore 2010). While it is beyond the scope of this study to constrain these influences as well, improving estimates of terrestrial nutrient loading is one important step in developing a holistic understanding of how human land use affects the coastal zone.
In this study, we integrated commonly available datasets within a geospatial modeling framework designed to estimate island-wide DIN-loading rates at a watershed scale. We developed a DIN-loading model for Tutuila, American Samoa, by combining output from an existing Tutuila-based open-source water budget model, existing water sample data for 1 year from nearly a third of the island's numerous watersheds, and publicly available streamflow data. We then identified the most impacted watersheds and prioritized the impact from different nonpoint sources to inform nutrient-management efforts on Tutuila. This was accomplished by summarizing model results through a newly developed watershed-impact ranking score to help coastal and terrestrial land managers assess which of the island's watersheds are subject to the greatest anthropogenic impacts, based on DIN loads. We examined the relative importance of different hydrologic pathways (e.g., baseflow, surface runoff, and SGD) as nutrient-transport mechanisms, and the impacts of loading from individual nonpoint DIN sources, including wastewater, livestock manure, and agriculture, were also assessed. Finally, we directly related the results of the nutrient-loading model to a management decision-making framework through stakeholder engagement and by producing innovative and opensource model outputs, thereby improving accessibility, reproducibility, and transparency to benefit stakeholders and others who may wish to apply this method in different locales.

Study Location
The island of Tutuila is the largest and most populous island in the Territory of American Samoa (Fig. 1). Tutuila is located in the South Pacific Ocean near the coordinates of 14°20′S and 170°40′W and has a land area of 142 km 2 and a population of 56,000 (AS-DOC-American Samoa Department of Commerce 2013). Tutuila's climate is warm and humid, and due to its position within the South Pacific Convergence Zone, the island receives significant rainfall, between 180 and 500 cm/year, depending on location and elevation (Daly et al. 2006). The majority of Tutuila's human development is located on the Tafuna-Leone Plain, with development off the plain concentrated on narrow strips of coastal land and in steep-sided valleys. The three primary nonpoint nutrient sources previously identified on Tutuila include on-site wastewater disposal systems (OSDS), piggeries (small backyard-scale livestock-rearing operations), and agricultural fertilizers (Shuler et al. 2017).

Modeling Framework
Our Tutuila DIN-loading model accounted for three hydrologic pathways from land to sea, including (1) stream baseflow from shallow aquifers, (2) surface runoff generated during rainfall events, and (3) SGD into the coastal zone. The anthropogenic DIN sources accounted for in the model were OSDS, livestock pigs, and synthetic fertilizer inputs to agricultural lands. The model followed a four-part workflow (Fig. 2). First, we used an existing open-source Tutuila-based water budget model calibrated with historical streamflow data, to calculate island-wide water discharge rates from all three hydrologic pathways. Second, we multiplied measured DIN concentrations in each hydrologic pathway by water discharge rates from the previous step to calculate observed DIN loads. For the third step, we used high-resolution geospatial data to calculate the prevalence of anthropogenic DIN sources in every watershed by identifying the total numbers of OSDS units, numbers of pigs, and estimated synthetic fertilizer inputs to agricultural lands within each watershed. Finally, we used the coastal DIN-loading model, which was calibrated with the measured DIN fluxes, to calculate a modeled DIN discharge rate for each of the island's watersheds. Watershed boundaries were defined by using boundaries in an existing dataset (AS-DOC-American Samoa Department of Commerce 2002) and merging nonsampled subwatersheds. This resulted in a total of ninety-three individual watersheds considered by the model.
Step 1. Island-Wide Water Discharge Rates from All Three Hydrologic Pathways Annual island-wide water discharge rates to the coastal zone were estimated for all hydrologic pathways (i.e., stream baseflow, surface runoff, and groundwater discharge) using an existing water budget model developed for Tutuila Fig. 2 Schematic of DIN-loading model workflow. The SWB2 component represents the water budget model used to determine water discharge rates, and observed streamflow and nutrient fluxes are from field data used to calibrate and validate the model. The DIN release rates were initially parameterized with values from Shuler et al. (2017), and the final rates were determined through loading model calibration.
Steps marked with Xs indicate multiplication of values used to calculate derived components  (Shuler and El-Kadi 2018). The water budget model used the Soil-Water Balance 2 (SWB2) code (Westenbroek et al. 2018), originally developed by the U.S. Geological Survey (USGS), which is based on the soil-water balance formulation of Thornthwaite-Mather (1955). The Shuler and El-Kadi water budget model is publicly accessible, published under an open-access license, and is available for download by following the link in Shuler and El-Kadi (2019). The model can be easily modified to develop estimates of discharge from any hydrologic pathway within user-set geographic boundaries. Inputs to the Tutuila Water Budget Model included precipitation data (Daly et al. 2006), land use data (Meyer et al. 2016), soil-type data (Nakamura 1984), spatially distributed estimates of direct infiltration (Shuler et al. 2017), runoff-to-rainfall ratios (Shuler and Mariner 2019), potential evapotranspiration data (Izuka et al. 2005), maximum and minimum temperature data (Daly et al. 2006), and mountain front recharge information (Izuka et al. 2007). The model output included island-wide estimates of water budget components, including precipitation, evapotranspiration, runoff, streamflow infiltration, and groundwater recharge (GW-R). Shuler and El-Kadi (2018) provide detailed information regarding the construction of the water budget model.
Because SWB2 was not originally developed to model subsurface processes, it does not partition the recharge fraction into baseflow or SGD components. We did this partitioning outside of the SWB2 model by calculating the average baseflow to recharge ratio in watersheds where stream-gaging data were available, and applied this ratio to the SWB2-calculated recharge to estimate baseflow and SGD rates in all watersheds. We computed average-annual baseflow discharge (Q-BF) rates in the gauged basins by applying a commonly used baseflow separation tool (Wahl and Wahl 1995) to existing streamflow data from 15 watersheds (Online Resources Fig. OR1 and Table OR1). We obtained streamflow data from two sources, historical USGS data (Wong 1996;Perreault 2010; https://waterdata. usgs.gov/nwis), and a local streamflow measurement network (American Samoa Power Authority and University of Hawaii; Shuler and Mariner 2019).
In each of the gauged watersheds, we then calculated the ratio of Q-BF to the SWB2-modeled GW-R rate for these watersheds to determine a Q-BF/GW-R ratio for each. To account for the amount of water pumped from municipal wells in each watershed (Online Resources Table OR3; ASPA-American Samoa Power Authority 2017), the calculation of GW-R also necessitated subtraction of the annual average pump rates of all wells in each watershed. For the watersheds with available stream gauge data, the measured Q-BF/GW-R ratios ranged from 0.08 to 0.49 and the weighted average (weights based on streamflow data period of record) from all gauged watersheds was 0.33 ± 0.17, which can be interpreted to mean that an estimated 33 ± 17% of the island's recharge discharges to the ocean as baseflow in streams, and 1.0-33% or 67% discharges as SGD, assuming that the system is at steady state and all recharge not discharged as baseflow discharges as SGD. We applied this average ratio to the SWB2-calculated GW-R in ungauged watersheds to estimate island-wide SGD and baseflow rates, respectively. This method was used to calculate island-wide modeled SGD and baseflow rates for all of Tutuila's watersheds, except for those located on the Tafuna-Leone Plain (Fig. 1), where no perennial streams flow, and young and fractured lava flows promote rapid infiltration and high aquifer permeabilities (Bentley 1975;Izuka et al. 2007). Therefore, we classified 100% of the SWB2-calculated net infiltration as SGD in the plain area.
Uncertainties on water fluxes were determined in different ways for each pathway. The above calculated 17% uncertainty represents the standard deviation of the weighted average Q-BF/GW-R and SGD/GW-R ratios for all gauged watersheds. However, we also compared a set of independently calculated, snapshot-resolution SGD measurements from four of Tutuila's bays , which suggested that this uncertainty may be too low.
While comparison between SGD rates as calculated by this study and from  suggests that our SWB2-estimated SGD fluxes are reasonable (Online Resource Table OR2), the relative percent difference between the two datasets was 48%, which is significantly higher than the aforementioned 17% uncertainty. The  data are the only available information for validating these estimates, and comparison to the measured SGD rates is the most conservative way to determine the uncertainty on our SGD/baseflow partitioning. Therefore, we used 48% as the uncertainty on our SWB2calculated SGD and baseflow rates. Uncertainty on SWB2 surface runoff rates could be directly calculated because SWB2 does directly calculate surface runoff fractions. Comparison between baseflow-separated measured flows and the SWB2-calculated surface runoff totals (Online Resource Fig. OR2) yielded a mean absolute percentage error of 21%, which was used to represent the uncertainty on the SWB2 surface runoff values.

Water Sampling and Nutrient Concentrations
We obtained existing water-quality data from stream and coastal spring samples taken throughout Tutuila by Comeros-Raynal et al. consisted of a total of 424 individual stream samples. Stream samples were collected at low tide and were taken at stream outlets to ensure that samples were representative of the discharge directly affecting coastal waters. A staff gauge was installed at each site to document the water height (stage) during sampling. While we lacked rating curves to estimate streamflow from the stage measurements, these data provided a relative measure of the flow. To distinguish between baseflow and surface runoff flow regimes, we used the baseflow separation tool discussed for step 1, to quantify the percentage of time-measured streams were under baseflow conditions (defined as days when baseflow discharge exceeded surface runoff discharge) or surface runoff conditions. The average percentage of time Tutuila streams ran under baseflow conditions was 68%, which is reasonable considering Bassiouni and Oki (2013) found that in Hawaii, perennial streams run under baseflow conditions between 60 and 80% of the time. Therefore, we used the 68th percentile of the stage height record at each sample site to classify each sample as either a surface runoff sample (if taken at a stage above this threshold) or a baseflow sample (if taken at a stage below this threshold).
The Comeros-Raynal et al. (2019) dataset also included coastal spring samples collected from 26 locations around Tutuila (Fig. 3), with sampling conducted approximately every 3 months. We grouped additional coastal spring data collected at 31 locations from 2013 to 2018 (Shuler et al. 2017 with the Comeros-Raynal coastal spring data to supplement the groundwater end-member dataset

Observed DIN Loads
We calculated the observed watershed-scale DIN loads by multiplying measured DIN concentrations with water discharge rates. These DIN loads were then used to calibrate the nutrient-loading model. Because most water-sampling sites were sampled repeatedly, DIN concentrations at each site were grouped by hydrologic pathways (surface runoff, baseflow, or SGD), and the average DIN concentrations for each pathway at each site were individually calculated. To estimate the average individual DIN-loading rates for each sampled pathway in every sampled watershed, we multiplied the average DIN concentrations by pathway-specific SWB2-modeled water fluxes for each watershed. Of the ninety-three (93) watersheds delineated for this project, thirty-four (34) were sampled for surface water nutrients, and coastal springs were sampled in twenty-two (22). However, of these, only thirteen (13) were sampled for both surface waters and coastal springs (Fig. 3), partially due to the fact that hydrogeologic regions with more SGD (i.e., more prevalent coastal springs) often have fewer surface water features. Therefore, the total observed coastal DIN fluxes (summed from all three pathways) could only be calculated in these thirteen watersheds. All observed watershed-scale DIN fluxes for each hydrologic pathway  Table 1. Uncertainties on total DIN loading were calculated by propagating error from the SWB2 fluxes (as described in step 1) and the standard deviation of DIN concentrations in sample waters.
Step 3. Identification of Terrestrial DIN Sources We considered four terrestrial DIN sources as drivers for the model: three anthropogenic sources (OSDS, piggeries, and agricultural fertilizer input), and DIN from natural sources such as leaf litter, animal waste, and atmospheric deposition (Savoie et al. 1987;Shuler et al. 2017). To resolve the spatial distribution of anthropogenic DIN sources to the watershed scale, we obtained the locations of every OSDS unit, every pig, and all known agricultural land on the island, and geospatially intersected these anthropogenic DIN sources with the watershed boundaries (Fig. 4). The OSDS units were located using the methods of Shuler et al. (2017), by identifying all buildings located more than 50 m from a sewer main or service line and under 120 m 2 in size. These buildings were assumed to rely on an OSDS unit for wastewater disposal. Small buildings were excluded since sheds or outbuildings typically do not contain facilities requiring an OSDS unit. Building locations were obtained from the American Samoa Department of Commerce (AS-DOC-American Samoa Department of Commerce 2009), and sewer line locations were obtained directly from ASPA. Locations of piggeries and the number of pigs in each was obtained directly from the American Samoa Environmental Protection Agency (AS-EPA). While there exists no direct data regarding fertilizer application in American Samoa, we considered agricultural areas from land use map of Meyer et al. (2016) to be the most likely locations for fertilizer applications. To initialize the model calibration, we used DIN release rates from Shuler et al. (2017) for the equivalent source activities on Tutuila.

. Model Calibration Process
We calibrated the DIN-loading model by parameterizing an individual DIN release rate for each of the anthropogenic sources. Parameter optimization was performed using the Nelder-Mead unconstrained minimization method (Nelder and Mead 1965). The optimization was implemented in Python using the scipy.optimize.minimize function (https:// docs.scipy.org/doc/scipy-0.14.0/reference/generated/scipy. optimize.minimize.html). The selected objective function was a linear regression between calculated and modeled nutrient loads, with the slope of the regression fixed at 1, so the optimization would be forced to minimize error without biasing the model toward over-or underprediction, which could occur with an unconstrained least-squares regression. We note that this optimization step acts as a "black-box" to conceptually represent all attenuation and N-transformation processes between sources and sinks, by using a single lumped parameter for each DIN release rate. In reality, these processes are complex, nonlinear, interdependent, and spatially distributed, which makes it extremely difficult to model them directly.

Watershed Prioritization Ranking
Absolute DIN loads from each watershed are fundamentally biased toward watershed size, as larger watersheds contain larger numbers of individual sources and thus produce more DIN. To control for this "area bias", modeled loads were scaled by watershed area and coastline length. Area-scaled loads essentially show the density of sources, which are more representative of human effects within terrestrial areas. Source density may also contribute to nonlinearity in the amount of DIN transported from source to sink, as higher source densities may overwhelm the natural attenuation capacity of soils or riparian zones. However, area scaling still does not provide the representation of how source waters are affected by dilution upon discharge to coastal areas. One way to more accurately predict terrestrial-hydrologic impacts on coastal waters is to scale absolute DIN loads by the coastline length of each watershed, although this method does not capture the effects of oceanographic circulation and biological activity, which were beyond the scope of this modeling effort to consider. Also, coastline-length scaling is biased by the selection of watershed boundaries, whereas watersheds fronted by more convex parts of the island coastline will have longer coastlines, and watersheds fronted by more concave coastlines will have much shorter coastlines, in some cases significantly increasing the length-scaled DIN release rates. In reality, both of these metrics provide a different and unique presentation of impacts, while at the same time being limited by different biases. To incorporate information from both of these metrics, we developed a single watershed prioritization scheme to synthesize the information from each scaling method into a single metric. The ranking scheme was developed by calculating the island-wide means of the area-scaled loads, and coastline-length-scaled loads, and computing anomalies from the island-wide mean for the two metrics in each watershed. Anomalies were ranked from 1 to 93 with the highest absolute anomaly (i.e., the highest DIN impact) from each metric getting assigned the lowest rank. These ranks were then summed, and the sums were ranked from 1 to 93 to incorporate the importance of each metric into a single prioritization-ranking score, again with the lowestscoring watersheds being the ones with the highest degree of impact.

Sensitivity Testing
To examine sensitivity within the optimization routine to each of the starting values for anthropogenic DIN release rates, we ran sensitivity tests on the model calibration by calculating island-wide DIN loading for a variety of "scenarios" representing a shift in the starting values for each of the individual DIN release rates. Loading rates for OSDS, pigs, and agricultural sources were independently reduced to 1/10th, 1/5th, and half of the base value, and were also increased by 2 times, 5 times, and 10 times. A fourth set of scenarios was also run where all three starting values were modified by the aforementioned shifts at the same time. Sensitivity test results were compared by computing the % difference between the scenario and the base case for islandwide DIN loading. The results showed the optimization is fairly robust for values within an order of magnitude of the base case ( Table 2). The model performance started to diverge from the base case, once the starting values were increased 5-10 times.

Management Decision-Making Framework
To maximize the potential for integration of this work into management decision-making, we conducted numerous conference calls with representatives from American Samoan Government agencies, including AS-EPA, the National Marine Sanctuaries, and the Coral Reef Advisory Group to obtain their input in prioritizing the desired outputs of the loading model. This model is one component within a larger Ridge to Reef project sponsored by the American Samoa EPA that is also focused on revising criteria for setting water-quality standards and formalizing links between land use impacts and resilience indicators on adjacent coral reefs. We implemented the modeling process using an open-source platform to make the results as integrable with other components of the project and also as readily accessible to other coastal managers as possible. The model workflow is documented as an interactive Jupyter Notebook (Kluyver et al. 2016) that contains results and explanations, as well as fully executable code, which can be used to reproduce or modify the model output. We made all model input data, code, and results available to anyone, including stakeholders or other researchers in an open-access repository (https://doi.org/10.5281/zenodo.3462869). These resources were developed to be understandable and accessible to stakeholders through publication-quality annotation and inclusion of metadata. Developing the project in an open-source framework provides for easy reproduction and update of the nutrient-loading model as additional watersampling data, better-constrained land use data, or other management considerations become available in the future.

Modeled DIN Loads
The DIN-loading model provides calibrated estimates of coastal DIN loading for every watershed on the island, and allows partitioning of the loads into each of the modeled nutrient source components. Calibration resulted in the final modeled DIN release rates of 0.041-kg-DIN/day per OSDS unit, 0.009-kg-DIN/day per pig, and 0.484-kg-DIN/day per km 2 of agricultural land. Multiplication of these release rates by the number of anthropogenic DIN sources in each watershed yielded absolute loading rates ranging from 0.1kg-DIN/day (0.3-kg-DIN/day per km 2 ) for some of the smallest watersheds, to 92.5-kg-DIN/day (8.4-kg-DIN/day per km 2 ) for the most impacted watershed on the Tafuna-Leone Plain. When error was minimized by the optimization routine, the model's mean average error (MAE) was ±1.09-kg-DIN/day and the r 2 of the regression was 0.74 (Fig. 5). Comparing observed and modeled DIN loads yielded an MAE of ±1.09-kg-DIN/day, which if divided by the modeled loading rate for the 13 watersheds used for calibration, yields an average percent error of 74%. For comparison between watersheds, DIN-loading rates were scaled both by watershed area (Fig. 6) and by length of watershed coastline (Fig. 7). These approaches remove the watershed-size dependence of absolute loading rates, and are more representative of the relative impact of DIN sources within each watershed on the terrestrial landscape or in the coastal zone. Because we considered each of the modeled DIN sources separately, their relative impact on the total load could be separated and examined. The fraction of the total DIN loading originating from each of the individual sources is shown graphically in Fig. 8, where the total modeled DIN loads from each watershed are shown in the upper-left panel, and the other three panels (clockwise from upper right) show the proportion of DIN loaded to each watershed from OSDS units, pigs, and agriculture, respectively. When summed, island-wide, model-estimated DIN load from all sources equaled approximately 428-kg-DIN/day. Of this, the model estimated that about 285 (67%) was from OSDS units, about 104 (24%) from pigs, 34 (8%) from natural sources, and only 5 (1%) kg-DIN/day from agriculture. While uncertainties on modeled loading rates could not be directly propagated through the optimization routine, these include uncertainty in the observed loading rates used as calibration data (shown in Table 1), as well as uncertainty derived from optimization of each of the model parameters. The average percent error on the observed total DIN loads in the 13 watersheds used for calibration was 55%, and the average percent error of the regression between observed and modeled DIN loads, shown graphically in Fig. 5, was 74%. Although the total uncertainty of the model result cannot be assessed through a standard error propagation formula, these values indicate the general order of the magnitude of uncertainty to be considered when interpreting the model results. The effects of the model parameters on the overall results were also examined though sensitivity testing.

Exploration of Correlation Relationships
Although the model could only be calibrated using the 13 watersheds for which measurements of both surface and groundwater were collected, we also produced two-variable linear regressions to explore how variability in modeled loading from individual DIN sources related to variability in observed loads for individual hydrologic pathways. Figure 9 provides a scatterplot matrix showing least-squares regressions between observed DIN and modeled DIN loads. Observed loads can be separated by the hydrologic pathway, each of which were plotted in a dedicated column of the scatterplot matrix. While modeled loads cannot be separated by pathway, they can be separated by different sources, each of which were plotted in a dedicated row of the matrix. Additive total observed and modeled loads for each DIN source and each hydrologic pathway were plotted on the rightmost column and the bottom row, respectively. Compared across rows, the slopes of regression relationships indicate the relative importance of each source to the total DIN load in each pathway. Regressions with higher slopes suggest that more of the DIN in the given pathway originates from the cross-plotted source. Similarly, when plots are compared across columns, those with higher r 2 coefficients suggest that the given nutrient source controls more of the variability in pathway-specific nutrient loads, which helps to indicate which sources are more impactful.
Specific conclusions that can be drawn from these relationships are (1) high slopes and r 2 values in plots with OSDS loading on the y-axis suggest that OSDS is a primary control on baseflow and SGD DIN loads, (2) low slopes in agricultural source plots suggest that this is the least important anthropogenic source, (3) large DIN loads in SGD suggest that groundwater is an important nutrientdelivery pathway that deserves additional management attention, (4) lower r 2 in regressions, including surface runoff, suggests that runoff-stage nutrient fluxes are not as well predicted by the model as baseflow-stage fluxes, which is reasonable because of typically larger variability in surface runoff nutrient concentrations, and (5) high r 2 (0.74) for the total modeled vs. total observed loading plot indicates that the model calibration performed satisfactorily.

Watershed Prioritization-Ranking Results
We developed the watershed prioritization-ranking scheme to synthesize model results and make them as relevant to coastal and land use managers as possible. The scheme incorporates both the area-scaled loads (Fig. 6) and the coastline-length-scaled loads (Fig. 7) and weighs them equally in the output. The results of the prioritizationranking system are shown in Fig. 10, and indicated that Tutuila's most heavily DIN-impacted areas are on the Tafuna-Leone Plain, in the Pago Pago Harbor area, and on the eastern side of the island where various villages, including Aoa, Fagaitua, and Tula, show a higher degree of DIN impact than the surrounding watersheds.

Discussion and Conclusions
The cumulative impacts of sediments, nutrients, acidification, and toxic compounds, all conspire to drive coastal ecosystem health. However, direct assessment of many of these factors is difficult at best. Because of their low cost and relative simplicity to measure, nutrient concentrations and loads are commonly examined to serve as indicators of human impact. In particular, DIN loading has been found to be one of the most robust metrics for understanding coastal ecosystem impacts, and thus remains a major consideration for coastal resource management (Caraco and Cole 1999;Waterhouse et al. 2017;Comeros-Raynal et al. 2019).

Identification of Tutuila DIN Hotspots
Prior to this study, the only watershed-scale impact classification available to local management agencies was developed with a basic assessment of population density within each "major" watershed ( Fig. 11) (DiDonato 2004). In contrast, our DIN-loading approach not only incorporates more direct metrics, but also provides a prioritization scheme at a higher "minor" watershed resolution. While population density is a good first-order indicator of human impacts, it is by nature an indirect indicator. By incorporating impacts from all known DIN sources, including livestock and agriculture, and scaling these impacts by their relative DIN loads, we provide a more direct assessment of how human populations physically affect coastal  Scatterplot matrices of all observed and modeled DIN-loading rates. All values are in kg-DIN/day. Abbreviation key: Obs. observed loading rates, Comp. computed loading rates, OSDS on-site disposal systems, AG agriculture, NAT natural, BF baseflow, RO surface runoff ecosystems. Nonetheless, the results from this study align with the DiDonato classifications fairly well, especially in areas such as the Tafuna-Leone Plain and Pago Harbor, which have been previously reported to be some of the highest human-impact hotspots on Tutuila (Whitall and Holst 2015;Polidoro et al. 2017;Shuler et al. 2017. The greater proportion of flat, developable land on the plain likely leads to higher population densities that drive the high DIN-loading rates from these watersheds even after scaling by watershed area or coastline length. Both this and the 2004 assessments also agree in the island's unpopulated north-draining watersheds, simply because these areas are fairly pristine and generally uninhabited. However, in Eastern Tutuila and on the Southwestern coast, the comparison is more nuanced. The 2004 population-density results were produced at a lower resolution and classified these areas mostly under an "intermediate" level of impact. Our model produced higher-resolution estimates and predicted particular villages in Eastern Tutuila, such as Tula, Aoa, Amouli, and Alao Villages, and the Seetaga area of Western Tutuila to be DIN-loading hotspots as well. These villages may be subject to DIN-loading impacts that are higher than expected in comparison with adjacent watersheds, highlighting the need for additional land use or nutrient source management attention in particular villages. This also exemplifies how summarization of multiple datasets into a single prioritization-ranking score can help managers to quickly and easily identify hotspots while integrating a large and diverse array of information into their decision-making.

Implications for Water-Quality Monitoring
Groundwater-quality monitoring and its inclusion in environmental, and not just drinking water, regulatory standards could improve the effectiveness of coastal resource management on Tutuila or other islands with similar hydrogeology. Throughout the United States, comparatively little management attention has been given to understanding the effects of groundwater discharge on coastal zones (Carlson et al. 2019). The American Samoa EPA has been progressive in implementing environmental water-quality standards for coastal waters and for territorial surface waters. However, this study and others (Whitall and Holst 2015; show that groundwater (SGD) is a significant nutrient-transport mechanism to Tutuila's coastal areas. Coastal spring data and SGD-loading rates compiled in this work are a useful start for developing a coastal groundwater-quality baseline, as this work presents some of the first known, and the most comprehensive set of coastal spring water-quality measurements for the territory. Baseline water-quality information is not only helpful for detecting acute changes in water quality, but is also important in effectively understanding and setting standards for managing nonpoint nutrient sources in a way that protects reef health. Nonetheless, as of this writing, it is rare for U.S. states to define groundwater-quality standards outside of the drinking water context (Kimsey 2005; N.J.A.C.-New Jersey Administrative Code 2018). However, as has been shown in European Union countries such as Denmark, groundwater-quality monitoring and regulatory standards are an integral part of land use management strategies that focus on drinking water and human uses, along with protection of the environment and associated aquatic ecosystems (Hinsby and Jørgensen 2009;Jørgensen and Stockmarr 2009). American Samoa is relatively small and has a significant motivation to protect reef health, which could make the territory an ideal pilot location to experiment with developing a coastal groundwater-monitoring and water-quality regulatory program.
Another consideration in the design of water-quality sampling protocols is the poorly studied, but clearly important mechanism of coastal groundwater discharge occurring within the boundaries of stream mouths and in estuaries. In Fagaalu Stream, which drains to Pago Harbor, Shuler (2019) found that surface water nutrient concentrations in the nearshore portions of streams, were significantly elevated relative to concentrations in upper-stream reaches. This was attributed to discharging baseflow originating from the nearshore basal-lens aquifer, which, like many other nearshore aquifers on Tutuila, is subject to contamination from cesspools and piggeries that are co-located with human development. We considered this phenomenon when selecting sample data for this and the precursor to this study , and although it would be most informative to sample streams in multiple locations, including both stream mouths and in higher reaches, budget limitations often preclude such efforts. Where limitations exist, the sampling methodology of always sampling near the stream mouth during low tide, can be recommended to ensure that measurements remain comparable between different streams.

The Importance of Different Sources and Hydrologic Pathways
One of the major benefits of developing the model in a way that accounts for loading from individual sources and through individual pathways, is the ability to separate impacts from specific activities for prioritization of management actions. While we promoted the synthesis of the results through the simplified watershed prioritization ranking, the model output can also be expanded into individual components to gain a more refined picture of where nutrients originate and how they are transported. Specifically, the model shows that OSDS loading is generally two to four times higher than DIN loading from other sources. This has also been shown on similar islands through individual site investigations (Bishop et al. 2017;Richardson et al. 2017). Overall, this supports trends in water-quality management throughout the territory, which are beginning to focus on OSDS impacts more. During the last decade in American Samoa, management of pig waste originating from widespread traditional piggeries became a top priority for water resource managers. Coordinated scientific, educational, and regulatory efforts have since significantly reduced the number of pigs and modified pig waste management practices (AS-EPA-American Samoa Environmental Protection Agency 2005, 2014. Because pigs are no longer the management priority they once were, it has become clear that other human activities, such as wastewater discharge from cesspools, also require management attention. Shuler et al. (2017) found a similar result, suggesting that on the Tafuna-Leone Plain, OSDS units account for 300% more total N loading to groundwater than N sourced from pigs. The results of this study and Shuler et al. (2017) both indicate that N loads from agriculture were relatively insignificant, suggesting that OSDS units are currently the highest-priority anthropogenic nutrient source for coastal resource manager's attention.
The effects of each DIN source on each hydrologic pathway can also be considered separately. On a conceptual level, piggeries should disproportionately affect surface water quality, and likely have a stronger effect on surface runoff quality, since pig waste is disposed directly on the land surface (Menzi et al. 2010;Shuler et al. 2017). On the other hand, the water-quality threats posed by cesspools and septic tanks are more underreported and unknown because OSDS effluent does not directly affect surface water quality. Instead, it impacts the groundwater, which then discharges to streams or to the ocean as SGD. Nonetheless, linear regressions between loading from different sources and loading in different hydrologic pathways in Fig. 9 suggest that all of Tutuila's anthropogenic DIN sources may control a portion of the variability in all hydrologic pathways. However, this result is confounded by significant collinearity between the three anthropogenic nutrient sources. Thus, we are not currently able to discount any of the sampled hydrologic pathways from being important coastal DIN transport mechanisms.

Assumptions and Limitations of the DIN-Loading Model
It is well established that surface water nutrient concentrations and thus DIN loads are highly temporally variable and subject to effects that complicate the relationship between flow and concentration such as hysteresis (Pellerin et al. 2014) or bias in regression equations (Medalie et al. 2012;Hirsch 2014). However, the high costs, time, and effort required to sample at an appropriate resolution for characterizing this variability are often prohibitive. Therefore, in this study, we had to statistically characterize the natural variability of stream flows and DIN concentrations by averaging often low numbers of samples, creating wide uncertainty bounds in our results. In addition, streams have discrete outlets, and SGD is often channelized into individual springs. However, the watershedscale resolution of the model is also unable to account for the spatial variability of discharge within each watershed. It should be kept in mind that although this work is useful for assessing the relative differences between watersheds, and for showing "hotspots" where additional management actions may be warranted, there are nonetheless high uncertainties on absolute modeled DIN loads, likely on the order of 55-74%. The relatively low number of surface runoff samples compared with baseflow-stage samples collected for this work is another limitation. This likely reduces the accuracy of the model's representation of runoff-stage DIN fluxes. Unfortunately, runoff-stage data are more difficult to collect than baseflow-stage data because runoff conditions occur less frequently than baseflow conditions, and also because fieldwork conditions are usually less pleasant during runoff conditions, thereby adding human bias to sampling schedules. In addition, nutrient concentrations in runoff are subject to hysteretic patterns where peak nutrient concentrations may lag behind or precede the runoff peak (depending on nutrient species), making stream-stage-concentration relationships significantly more complicated (e.g., Evans and Davies 1998;Kumar 2011). These issues might be alleviated through additional collection of runoff-stage data.
Other specific assumptions and limitations of the loading model included: (1) our partitioning of baseflow and SGD in gauged watersheds relied on the assumption that all net infiltration was either extracted by wells or lost as baseflow or SGD, and neglected any potential for flow between watersheds.
(2) For separation of baseflow samples and surface runoff samples, we relied on a stage-based indicator of streamflow. While this was the best available information, stage is clearly not equivalent to discharge. For future work, collecting direct measurements of discharge would be far more reliable for this task. (3) Extrapolation of SGD/GW-R ratios from gauged to ungauged watersheds relied on the assumption that these ratios were similar for all watersheds. Again, collection of additional discharge measurements would allow for a more spatially distributed assessment of this parameter. (4) We assumed the three modeled anthropogenic DIN sources: OSDS, pigs, and agriculture are the island's only significant anthropogenic sources of N. It was assumed that all other natural sources release DIN evenly throughout the landscape. (5) While DIN source locations were reasonably well known, loading rates, and actual fractions of released DIN escaping attenuation processes, such as sorption, volatilization, denitrification, and uptake, were represented by a single lumped parameter for each DIN release rate. Though this method appeared empirically robust, it should be remembered that the processes represented by this simplified approach are complex, nonlinear, interdependent, and spatially distributed.
Despite the model's limitations, the loading model achieved a reasonable linear-regression fit to available data, with an r 2 of 0.74 and a standard error of 1.09-kg-DIN/day when compared with modeled DIN loads. While each loading rate interpretation is subject to different biases, integrating them all into a single prioritization-ranking scheme helped to smooth out these biases and also provided land use managers with a single easy-to-understand metric to identify DIN-loading "hotspots" (Fig. 10).

Data Availability
Recent advancements in cloud-computing technologies, particularly in open-source sharing of online projects, significantly increase methodological transparency and reproducibility of models. We provided dynamic opensource access to this project for managers, researchers, regulators, and others by developing the project with GitHub, which manages versioning and retains all necessary raw data files, model code, descriptive information, and output files in a public repository (Shuler and Comeros-Raynal 2019). We also archived the most recent version of the model for long-term storage in an open-access digital artifact repository (https://doi.org/10. 5281/zenodo.3462869). Note that sensitive information or datasets not intended to be publicly available, are not posted in raw forms. The model code is licensed under the GNU General Public License v3.0, which is an open-access license designed to explicitly affirm any user's unlimited permission to run, copy, and use the unmodified code from this repository. By designing the model as an interactive live-code document (Jupyter Notebook), it can be modified to potentially address new management questions that come up as we continue to work closely with stakeholders and managers toward the betterment of American Samoa's terrestrial and marine environments. Most importantly, we hope that applying this open-source paradigm will provide a blueprint for others to improve upon and translate this method to other locations throughout Oceania.