1. Introduction
The Upper Hunter Valley is located within the northern end of the Hunter region, around 200 km north of Sydney and 50 km northwest of Newcastle in the State of New South Wales (NSW), Australia (
Figure 1; Olguin and Everingham, 2015). On a broad scale the valley is oriented northwest-southeast (NW-SE), is approximately 30 km wide, and has a terrain elevation of around 300~380 m from the lower (Singleton South) end to the upper (Merriwa) end (Connor et al., 2008; OEH, 2017a). It is a major coal mining area in NSW, with significant coal mines located from Aberdeen in the north to Bulga in the south, containing approximately 40 percent of the currently identified total coal reserves in the state (ABARES, 2023). It also has a significant agriculture industry (dairy, beef, horse breeding, and viticulture) and two large coal-fired electricity generation plants (DPI, 2018; Olguin and Everingham, 2015), with the Liddell Power Station fully decommissioned between April 2022 and April 2023. There are multiple medium-to-small towns in the region. The largest population centres are Singleton (population ~ 24,577) and Muswellbrook (population ~ 16,357) (ABS, 2023), with many small settlements and isolated rural residences scattered throughout the valley.
The prevailing surface winds tend to follow the NW-SE orientation of the valley. The most frequent winds are north-westerlies in winter and south-easterlies in summer, with wind directions less defined in autumn and spring (DPE, 2022; OEH, 2017a; Holmes, 2008; Bridgman and Chambers, 1981). Nocturnal and early morning down-valley drainage flows, daytime up-valley winds or south-easterly sea breezes are also observed from time to time (OEH, 2017a; Hibberd et al., 2013; Holmes, 2008; Physick et al., 1991; Hyde et al., 1981). Wind strengths vary across the valley with most locations experiencing annual average wind speeds in the range of around 2~4 m/s. The precipitation in the region is generally low (compared to coastal regions to the east), and varies significantly across years, for example with the annual rainfall ranging from around 549 mm to 853 mm at Singleton and Muswellbrook during 2011–2015 (OEH, 2017). Higher rainfall tends to occur in summer and early autumn, and lower rainfall in winter and early spring.
Particle pollution is known to have adverse effects on human health and the environment (e.g., Hertzog et al., 2024; Huang et al., 2018; Segersson et al., 2017; Bartnet et al., 2006; Keywood et al., 2016; CSIRO, 2013; EPHC, 2010; Anderson et al., 2012). PM10 (airborne particles with an aerodynamic diameter less than 10 micrometres) pollution has been a major air quality concern for local communities in the Upper Hunter Valley (NSWEPA, 2023; POEO Regulation, 2021; Keywood et al., 2020; NSWEPA, 2019; Morrison and Nelson, 2011; Conner et al., 2008). The main sources of PM10 emissions in the region include local open-cut coal mining activities (e.g., wheel-generated dust, windblown dust from overburden) and surface soil erosion (windblown dust). For example, coal mining contributes around 88% of PM10 emissions in the combined Muswellbrook, Singleton and Upper Hunter local government areas (NSWEPA, 2019). Coal-fired electricity generation, agriculture, bushfires, prescribed hazard reduction burns and state-wide dust storms also contribute to PM10 pollution in the region (DPE, 2022).
There have been a few studies examining PM10 pollution in the Upper Hunter Valley. Of these, most early investigations were based on data collected (with different types of instruments) at a limited (small) number of locations through short-term campaign monitoring projects. These include those reported in SPCC (1982, 1983), Holmes and Associates (1996), Morrison and Nelson (2011), Hibberd et al. (2013) and relevant references therein. For example, SPCC (1982) reported a few observational and modelling studies, which concluded that localised dust pollution from open-cut coal mines and related developments continued to be an issue of concern, and that there were unlikely to be serious cumulative, region-wide problems resulting from dust emissions from mines. Holmes and Associates (1996) found that the increase in both deposition and concentration levels of dust over 1984-1994 was due to the increase in coal production and the severe drought affecting much of eastern Australia, and that the land affected by cumulative effects appeared to be primarily that owned by the coal mines. Holmes (2008) and Hyde et al. (1981) also suggested that local PM10 emissions in the valley can impact air quality in areas away from sources; that is, north-westerlies can transport dust generated in the upper end of the valley to areas near the bottom of the valley or further down over the metropolitan areas of Newcastle.
In partnership with the Upper Hunter coal and power industries, the NSW Government commissioned the Upper Hunter Air Quality Monitoring Network (UHAQMN) during 2010-2012 (
Figure 1), to provide the community, industry and government with reliable and up-to-date information on air quality within the valley (OEH, 2017b). Pollution (including PM10) data from the network are reported as air quality categories (AQCs) in near real-time and in quarterly or annual data reports on the NSW government website (
https://www.airquality.nsw.gov.au). Multi-year PM10 data summaries were available in two main reports, i.e., OEH (2017b) for data in 2011-2015 and DPE (2022) for data in 2011-2021. Three main findings are worth highlighting: 1) the annual PM10 concentrations in the UHAQMN were observed to be amongst the highest across the NSW Air Quality Monitoring Network (NSW AQMN, over 90 stations across NSW; Riley et al., 2020); 2) PM10 levels at some sites can exceed the national benchmark (i.e., Australian standard for PM10; NEPC, 2021) from time to time, and vary significantly from year to year due to impacts of drought conditions and the occurrence of hazard reduction burns or bushfires; 3) emissions from open-cut mines can lead to elevated PM10 pollution typically in the lower (southeast) end of the valley, in particular under north-westerly winds.
OEH (2017a) performed a detailed analysis of the 2012-2015 (4-year) PM10 data from the UHAQMN in an air quality management campaign project. The results showed that: 1) poor air quality days generally occurred in spring and secondarily summer and autumn, with winter and February/March having relatively good air quality; 2) there were more poor air quality days in 2012–2013 but a significantly reduced number of events in 2014–2015; and 3) the correlations between high PM10 pollution and individual meteorological variables were complex and non-linear, varying with time and location. The author also proposed the existence of two air quality clusters, i.e., the south-east (SE) and west-north-west (WNW) air quality subregions in the valley. Drawing on the project findings, the author indicated that the application of more sophisticated (holistic) analytic methods such as pattern recognition techniques may provide increased understanding of PM10 pollution in the study region.
Globally there are relatively few studies on air quality issues in rural valley environments, with much of the air quality literature primarily focusing on urban air pollution problems. Of those few studies, most work was undertaken with data for a small number of sites and based on correlation analysis, e.g., on spatial and/or temporal variations of PM10 by Mohd Shafie et al. (2022) and NPS (2023), and on correlations between PM10 levels and selected atmospheric parameters by Mannis (1988), Giri et al. (2008), Fortelli et al. (2016), Reisen et al. (2017), Czernecki et al. (2017) and Quimbayo-Duarte (2021). In summary, elevated PM10 pollution in those valleys is associated with (prolonged) dry conditions (low rainfall and humidity), low winds, thermal inversions (low mixing heights), and/or the influence of high-pressure systems. To date, to the best of our knowledge, there is little or no research in the literature examining the spatial and temporal variability modes of PM10 pollution.
This project extends OEH (2017a), based on the 11-year (2012-2022) PM10 data from 14 stations in the UHAQMN and by applying advanced analytic methods, to holistically examine: 1) the spatial-temporal variation patterns of PM10 pollution in the Upper Hunter Valley; and 2) how elevated pollution days are related to local- and synoptic-scale meteorological configurations in the region. This paper focuses on presenting the investigation results of 1), i.e., on the spatial and temporal variation modes from the long-term multi-site PM10 data. The investigation is unique in at least three aspects: a) the air quality subregions initially proposed in OEH (2017a) are verified and analysed based on a longer dataset using the rotated principal component analysis (RPCA); b) the temporal modes of PM10 pollution are identified for the air quality subregions through wavelet analysis and illustrated via heat map visualisations; and c) the impacts of exceptional events including dust storms and bushfires on PM10 pollution are also examined in some depth for the subregions. The results on the linkage between PM10 pollution and local and synoptic meteorological features will be reported separately in a companion paper. The findings will be used for air quality forecasting in NSW, and the methods and results can also be useful for air quality research in similar regions elsewhere.
3. Methods
The investigation was conducted in four steps to examine the spatial and temporal variability modes of daily PM10 pollution. The first step was to describe the general properties of PM10 pollution in the Upper Hunter Valley, based on boxplots and general summary statistics for the long-term (2012-2022) PM10 measurements from 14 stations in the UHAQMN (
Section 2.1). This sets out the context for the subsequent analyses and interpretation of results from other steps. The second step was to identify the spatial co-variation structure (i.e., spatial regionalisation) of PM10 pollution by applying the RPCA (rotated principal component analysis) to the multi-site PM10 data in 2012-2022 (
Section 3.1). This analysis verifies the air quality clustering (subregions) previously identified in OEH (2017a) from a shorter (2012-2015) dataset. In the third step, wavelet analysis was applied to the principal component (PC) time series obtained from RPCA to determine the dominant temporal modes of PM10 pollution in each subregion, and how those modes changed over time (
Section 3.2). Then in the fourth step, heat maps were used to illustrate the variation patterns of average PM10 levels and number of exceedance days in each subregion on the annual and interannual scales.
Throughout steps 1 to 4, the effects of exceptional events (vegetation fires and dust storms) on the findings were examined by repeating the relevant analyses for the all-day dataset and where appropriate the dataset for exceptional event days only. Due to the space limit, this text is focused on findings from the normal-day dataset (exceptional event days excluded), with some results from the all-day dataset given as
Supplementary Materials to support our interpretations of findings. The implementation of RPCA and wavelet analysis is described next.
3.1. Rotated Principal Component Analysis (RPCA)
Principal component analysis (PCA) can be used for data reduction, variation mode identification, or feature classification (Jiang, 2010). In mode identification or feature classification, a rotation technique is often applied to PCs for easy interpretation of results (Richman, 1986). Here PCA was applied to the correlation matrix of the daily PM10 concentration time series (2012–2022) for 14 air quality stations to examine the structure of inter-site covariations, i.e., spatial regionalisation of PM10 pollution in the valley. Missing data points were not used, i.e., ignored listwise across stations, when calculating the correlation matrix to suppress the potential for introducing new noise due to the data imputation process (note: PCA was applied differently in
Section 3.2 for the purpose of wavelet analysis). The retention of the number of principal components (PCs) was decided through a scree-test, following the combined use of the Cattell (1966) and North et al. (1982) methods (Jiang, 2011). A plot of initial eigenvalue and sampling error by PC-number was used for this purpose, keeping in mind that
degenerate multiplets (i.e., unrotated PCs) should not be separated from one another and as much variance in the dataset as possible should be explained by the PCs retained. Varimax (orthogonal) rotation was then applied to the retained PCs to facilitate physically meaningful interpretations, while preserving the linear independence between PCs (Jiang et al., 2005).
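The sampling-error check used alongside the scree plot can be sketched as follows. This is a minimal illustration of the North et al. (1982) rule of thumb, in which the sampling error of an eigenvalue is approximately the eigenvalue times sqrt(2/N). The function names, the example eigenvalues and the sample size are hypothetical, and N is taken here as the raw number of days, whereas an effective sample size may be more appropriate for serially correlated data.

```python
import math


def north_sampling_errors(eigenvalues, n_samples):
    """Approximate sampling error of each eigenvalue: lambda * sqrt(2 / N)."""
    factor = math.sqrt(2.0 / n_samples)
    return [lam * factor for lam in eigenvalues]


def degenerate_pairs(eigenvalues, n_samples):
    """Index pairs (k, k+1) whose eigenvalue gap does not exceed the
    sampling error of eigenvalue k, i.e. candidates for a degenerate
    multiplet that should be retained or dropped together."""
    errors = north_sampling_errors(eigenvalues, n_samples)
    pairs = []
    for k in range(len(eigenvalues) - 1):
        gap = eigenvalues[k] - eigenvalues[k + 1]
        if gap <= errors[k]:
            pairs.append((k, k + 1))
    return pairs


# Hypothetical eigenvalues (descending order), for illustration only.
example_eigenvalues = [6.0, 5.9, 0.6, 0.4]
example_errors = north_sampling_errors(example_eigenvalues, n_samples=200)
example_pairs = degenerate_pairs(example_eigenvalues, n_samples=200)
```

In this made-up example the first two eigenvalues are closer than their sampling error, so they would be kept together, while the clear gap after the second eigenvalue suggests a cut-off there.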
In addition, an obliquely rotated PCA (without the orthogonality constraint for PC rotation; Jiang, 2011) was performed on the same data to confirm if the linearly independent PCs from the Varimax rotation could approximate the true (underlying) simple structure in the data (Harman, 1976). The RPCA was repeated for the daily PM10 data with or without exceptional event days included, i.e., all-day dataset vs normal-day dataset, to test the stability of the RPCA findings.
3.2. Wavelet Analysis
Wavelet analysis can generate a representation of a signal (stationary or non-stationary) simultaneously in the time and frequency domains, thereby allowing access to localised information about the signal (Adamowski and Chan, 2011). Wavelet analysis was performed to identify the dominant temporal variability modes of PM10 pollution, if any, in the study region. Wavelet analysis requires that no missing data exist in the time series. Hence, an RPCA was performed on a processed daily PM10 dataset where the missing records (if any) at each station were filled with the overall median of the PM10 measurements in the 2012-2022 period for that station, resulting in PC time series (scores) without missing values but otherwise identical to those from the RPCAs described in
Section 3.1. A wavelet analysis was then applied to these PC scores to determine the dominant modes of co-variability in PM10 pollution for each subregion, and how those modes changed over time.
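The gap-filling step can be illustrated with a short sketch that uses only the Python standard library. The function name and the example values are hypothetical; missing records are assumed to be encoded as None or NaN.

```python
import math
import statistics


def fill_with_station_median(series):
    """Replace missing values (None or NaN) in one station's daily series
    with the median of that station's observed values."""
    def is_missing(value):
        return value is None or math.isnan(value)

    observed = [v for v in series if not is_missing(v)]
    median = statistics.median(observed)
    return [median if is_missing(v) else v for v in series]


# Hypothetical daily PM10 values for one station, for illustration only.
example_series = [12.0, None, 20.0, 16.0, float("nan")]
example_filled = fill_with_station_median(example_series)
```

Filling with a single station-level median keeps the series complete for the wavelet transform without adding new extremes, although it slightly reduces the variance of the filled series.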
We followed the procedure outlined in Torrence and Compo (1998), by using the Morlet wavelet with the following parameter configurations: sampling rate/resolution = 1 (day); frequency resolution = 0.25, i.e., four suboctaves (voices) per octave; lower and upper Fourier periods (scales) for wavelet decomposition set to 2 days and 2048 days (~5.6 years), respectively. The lower and upper Fourier periods were chosen this way so that variability modes on the sub-weekly, annual (seasonality) and multi-year (5-6 years) scales, and anything in between, are considered. We assumed a red-noise process with the lag-1 serial correlation for each PC time series when testing the significance of wavelet spectrum power at the 0.05 level for a chi-square test.
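The parameter configuration above can be made concrete with a small sketch based on the formulas in Torrence and Compo (1998): the grid of Fourier periods, the corresponding Morlet scales, and the red-noise significance level for normalised wavelet power. The function names are hypothetical, and the Morlet nondimensional frequency of 6 is an assumption (it is the common default, but is not stated in the text). The wavelet transform itself is not shown.

```python
import math

OMEGA0 = 6.0  # assumed Morlet nondimensional frequency (not stated in the text)


def period_grid(min_period=2.0, max_period=2048.0, dj=0.25):
    """Fourier periods (days) spaced at `dj` octaves from min to max period."""
    n_steps = int(round(math.log2(max_period / min_period) / dj))
    return [min_period * 2.0 ** (j * dj) for j in range(n_steps + 1)]


def morlet_scale(period, omega0=OMEGA0):
    """Wavelet scale whose equivalent Fourier period equals `period`."""
    return period * (omega0 + math.sqrt(2.0 + omega0 ** 2)) / (4.0 * math.pi)


def lag1_autocorrelation(x):
    """Lag-1 serial correlation of a series, used as the red-noise parameter."""
    mean = sum(x) / len(x)
    dev = [v - mean for v in x]
    numerator = sum(a * b for a, b in zip(dev[:-1], dev[1:]))
    return numerator / sum(d * d for d in dev)


def red_noise_level(period, alpha, dt=1.0, p=0.95):
    """Threshold on (wavelet power / series variance) at confidence p,
    for an AR(1) background with lag-1 correlation `alpha`."""
    background = (1.0 - alpha ** 2) / (
        1.0 + alpha ** 2 - 2.0 * alpha * math.cos(2.0 * math.pi * dt / period)
    )
    # Quantile of a chi-square distribution with 2 degrees of freedom
    # has the closed form -2 * ln(1 - p).
    chi2_two_dof = -2.0 * math.log(1.0 - p)
    return background * chi2_two_dof / 2.0


periods = period_grid()  # 2, 2.38, 2.83, ..., 2048 days
```

With four voices per octave and ten octaves between 2 and 2048 days, the grid contains 41 periods. For a positively autocorrelated series the red-noise threshold rises with period, so a peak at the annual scale must clear a higher bar than one at the sub-weekly scale.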
The above analysis process was repeated on the PCs derived from the RPCA on the normal-day dataset, where both the missing data and the readings for exceptional event days were replaced with the median of PM10 measurements for normal days by individual stations. This was to assess the (potential) impact from the inclusion of high particle measurements in exceptional event days on the wavelet analysis results.
4. Results and Discussions
4.1. General Description of Daily PM10 Pollution and Impacts of Exceptional Events
Box plots and summary statistics for the normal-day dataset (i.e., exceptional event days excluded) are given in
Figure 2 and
Figure 3 to illustrate the distributional properties of the long-term (2012-2022) PM10 concentration data from 14 stations in the UHAQMN. In general, relatively higher daily PM10 levels were recorded at four (direct) source-impacted locations, i.e., the Camberwell, Maison Dieu, Mt Thorley and Singleton NW stations, which are relatively close to open-cut mining sites in the southeast (lower end) of the valley (
Figure 2). These stations had larger variabilities (indicated by larger inter-quartile ranges, i.e., box lengths) and more outlier/extreme values (indicated by dots and asterisks). PM10 levels at the Warkworth station were comparably high. In contrast, Merriwa (background station near the top of the valley) and secondarily Wybong, Jerrys Plains, Bulga and Aberdeen recorded relatively lower and less variable PM10 pollution, as indicated by the lower box positioning and relatively fewer outlier/extreme values. PM10 levels at the two largest population centres, Singleton and Muswellbrook, appeared in between the above two cases, with extreme values also observed at these locations. Minimum daily PM10 levels were generally similar across 14 stations, ranging 1.5 ~ 3.5 µg/m³ (
Figure 3). Overall, these results are generally consistent with the distributional properties reported by OEH (2017a) for a shorter (normal days in 2012-2015) PM10 dataset. Holmes and Associates (1996) noted that the land affected by cumulative effects appeared to be primarily that owned by the coal mines. Essentially, as for other regions (e.g., Jiang et al., 2014; Jiang et al., 2017), the significant inter-site variability in PM10 pollution reflects the combined effects of changes in local emissions (primarily from open-cut mining and soil erosion) and environmental factors such as meteorological conditions on different time scales.
An analysis was also performed for the all-day (full) dataset (including exceptional event days), as illustrated in
Figure 3 and
Figure S1 in
Supplementary Material. The distributional properties of PM10 data were found to be very similar to those described above for the normal-day dataset. The exception is that the full dataset showed slightly higher means, medians and standard deviations, significantly (around two or more times) higher maximum values (
Figure 3), and many more days with outlier or extreme values (
Figure S1). These differences between datasets were expected, essentially reflecting the significant impact of exceptional events on regional air quality, such as the widespread dust storms in 2018 and the spring-summer bushfires across large areas of NSW in 2019-2020 (DPE, 2019, 2020; Watt et al., 2019), as is further illustrated on the interannual scale in
Section 4.3 and
Section 4.4.
We also compared the summary statistics between the normal-day and exceptional event-day datasets (
Figure 4). The mean and median of PM10 concentrations for exceptional event days were above the national benchmark (50 µg/m³) at all stations, equivalent to around a 2~3 times increase over those for normal days. The minimum PM10 pollution for exceptional event days reached around half of the national benchmark level, or around a 5~10 times increase over that for normal days. The maximum and standard deviation values for exceptional event days were also around 1~4 and 2~7 times higher, respectively, compared to normal days. Overall, the four source-impacted stations were most significantly affected, but with Merriwa recording the highest maximum PM10 level due to the impact of a widespread dust storm on 11 January 2020 (DPE, 2021). This finding indicates the accumulated (combined) impacts on air quality from local (primarily associated with open-cut mining and soil erosion), remote (mainly continental dust storms) and incidental (hazard reduction burns or bushfires) particle emissions.
4.2. Spatial Pattern - Identification of Two Air Quality Subregions
The RPCA for the normal-day dataset (
Section 3.1) led to the retention of two leading principal components (PC1 and PC2). The two Varimax rotated PCs explain around 88% of the total variance in the dataset, with 45% by PC1 and 43% by PC2.
Figure 5 shows the loadings of the two PCs, equivalent to Pearson correlations between each PC time series and daily PM10 concentrations at individual stations. The PC loadings (correlations) identify two distinct air quality clusters: 1) the WNW subregion, with high loadings on PC1 for stations in the northern and western parts of the valley; 2) the SE subregion, with high loadings on PC2 for stations in the south-eastern part of the valley. Hence, the variability of PM10 pollution in the valley is summarised into two linearly independent (dimensionless) time series, i.e., PC1 and PC2 scores, despite the significant inter-site difference in PM10 concentrations (as shown in the previous section).
RPCA on the all-day dataset (including exceptional event days) produced similar results, except that the two leading PCs are reversed in order and the two clusters show slightly reduced separation (
Figure S2). Obliquely rotated PCA (Jiang, 2011) on the relevant datasets generated very similar (almost identical) results, confirming the existence of a strong
underlying simple structure (clusters) (Harman, 1976) in the daily PM10 dataset, i.e., the existence of two distinct air quality subregions in the valley.
These results are highly consistent with OEH (2017a), who initially proposed the two-subregion property of PM10 pollution from a shorter (4-year) normal-day dataset for the study region. Hence, based on the longer (11-year) dataset, the present analysis has verified the stability and robustness of two air quality subregions in the Upper Hunter Valley, despite the impact of (a small number of) exceptional events.
The division of the valley into two air quality subregions is to some extent expected, primarily due to the valley’s NW-SE oriented slope terrain and the prevalence of valley-following air flows in the lower boundary layer. Previous work revealed that the most frequent winds are north-westerlies and south-easterlies in the valley (e.g., DPE, 2022; OEH, 2017b; Holmes, 2008; Bridgman and Chambers, 1981). Northwesterly winds can blow dust generated in the upper (NW) part of the valley southeastward (down slope), contributing to elevated PM10 concentrations in areas near the bottom end (i.e., the SE subregion) of the valley (Hyde et al., 1981). In contrast, southeasterly winds may transport pollutants from the lower valley northwestward (up-slope), resulting in elevated PM10 pollution at the upper end (i.e., the WNW subregion) of the valley. Some recent observational case studies (OEH, 2017b; DPE, 2022) also indicated that emissions from open-cut mines can lead to elevated PM10 pollution typically in the lower (southeast) part of the valley, in particular under north-westerly winds.
4.3. Temporal Pattern – Identification of Key Variation Modes
The previous section has shown that the variability of PM10 pollution in the study region can be expressed with two linearly independent PC time series, that is, PC1 and PC2 scores represent the distinct covariational features of PM10 pollution at stations in the WNW and SE subregions. Wavelet analysis was used to identify the dominant variability modes in the PC time series and how those modes changed over time.
Figure 6 and
Figure 7 illustrate the wavelet analysis results for PC1 (
Figure 6a; WNW subregion) and PC2 (
Figure 7a; SE subregion) scores from the RPCA of the normal-day dataset (
Section 3.2). It is most distinguishable that the wavelet power spectrum peaks near the 360-day time scale (annual mode, significant at the 0.05 level for a Chi-square test) persistently across all years for both subregions (
Figure 6b and
Figure 7b). The higher variance occurred in spring to summer for the WNW subregion (
Figure 6c), but in winter to spring for the SE subregion (
Figure 7c).
The annual variability mode in PM10 pollution can be linked to the seasonal changes of weather and climatic conditions that influence pollutant (PM10) emission and dispersion conditions in the valley. Higher rainfall (hence lower dust generation and higher wet deposition) tends to occur in summer and early autumn, and lower rainfall (hence higher dust generation and lower wet deposition) in winter and early spring (OEH, 2017b). The most frequent winds in the valley are north-westerlies in winter and south-easterlies in summer, with wind directions less defined in autumn and spring (Holmes, 2008; OEH, 2017a). Consequently, the prevailing northwesterly winds and higher PM10 emissions in winter and spring provide a high potential for transporting or accumulating PM10 pollution over the SE subregion. In contrast, the prevailing southeasterly flows or sea breezes in summer tend to transport pollutants northwestward, potentially resulting in elevated PM10 pollution in the upper part of the valley (i.e., in the WNW subregion). This seasonality in PM10 pollution will be further illustrated in
Section 4.4 for two air quality subregions.
There are also intermittent wavelet power peaks at around 120 days, 30~90 days, and shorter (less than 30 days) time scales (
Figure 6b,
Figure 7b), indicating the signals of triannual, intra-seasonal and shorter-term variation modes in the PC time series. These variation modes appeared stronger in the SE subregion (PC2 time series) than in the WNW subregion (PC1 time series). The signal strengths changed dramatically across years, manifesting a phase difference between the two subregions. For example, the signal for the variability mode of around 120 days in the WNW subregion was more intense in summertime across years 2013-2015 and 2017-2018, but a lot weaker during some periods in 2021-2022 (
Figure 7b). In comparison, this variability mode in the SE subregion showed relatively higher power during wintertime in 2012, 2013 and 2018, but lower power during some periods in 2016 and 2019-2022. Of note is the high wavelet power at around 5-6 years for both subregions (
Figure 6b;
Figure 7b), implying a signal at the interannual scale in the PC time series. However, most of the spectra at this time scale fall within the cone of influence (COI) region (under the curved line), which are within the uncertainty range associated with the edge effect in Fourier transforms (Torrence and Compo, 1998).
The variability modes at time scales of 30-90 days and 120 days in PM10 pollution are not yet readily understood. Many studies have examined the signal of intraseasonal oscillations at time scales of 30-90 or 30-60 days in tropical atmospheric or oceanic variables, which can often be linked to the Madden-Julian Oscillation (MJO) in the tropics and the extra-tropical teleconnection patterns such as El Niño-Southern Oscillation (ENSO) (e.g., Zhang, 2005; Weickmann and Berry, 2009; Gushchina and Dewitte, 2012). Such oscillations are also investigated extensively in atmospheric variables associated with monsoon activities (e.g., Zhou and Chan, 2008; Kikuchi et al., 2012) and to a lesser extent in those at higher latitudes (e.g., Rimbu et al., 2012; 2013). For example, Rimbu et al. (2012) revealed two intraseasonal variability patterns in synoptic observations (temperature, wind speed, sea level pressure) at a high-latitude Antarctic research station, Neumayer (70°S, 8°W). The dominant pattern manifests with out-of-phase variations of temperature and wind speed with sea level pressure at time scales of around 40 and 80 days, which can be related to tropical intraseasonal oscillations via large-scale eastward propagating atmospheric circulation wave-trains. In contrast, the second pattern was characterised by the in-phase variations between temperature, wind and sea level pressure at time scales of around 35, 60 and 120 days, which can be attributed to the Southern Annular Mode/Antarctic Oscillation (SAM/AAO). Drawing upon these studies, one may speculate whether the PM10 variability modes at time scales of 30-90 and 120 days (and potentially at the multi-year time scale) could be related to the influence of large-scale climate drivers such as MJO, ENSO and SAM, which are known to modulate the weather and climate in Australia. This aspect deserves further attention in future work.
A wavelet analysis was also performed on the two leading PCs from the RPCA on the all-day PM10 dataset (measurements for exceptional event days included), with results shown in
Figures S3 and S4 (note that the order of the two PCs was reversed). Overall, the wavelet power of the PC time series identified dominant modes generally similar to those described above. However, there are clearly significant distortions (increases) in the wavelet power peaks during mid-2017 to mid-2020 for the SE subregion (PC1 time series;
Figure S3) and during mid-2019 to mid-2020 for the WNW subregion (PC2 time series;
Figure S4). The distortions appear more intense in the SE subregion than the WNW subregion, essentially reflecting the significant impact of high PM10 measurements during the widespread dust storms in 2018 and the unprecedented Black Summer bushfires in 2019-2020 across large areas of NSW.
In addition, the shorter-term (under 30 days) variability in PM10 pollution is attributable to day-to-day changes in emission conditions and the effects of local and synoptic weather variability, as is demonstrated in some previous studies for other regions (e.g., Jiang et al., 2014; Jiang et al., 2017). This aspect will be discussed in a companion paper of this text.
4.4. Illustrating the Annual and Interannual Variability in PM10 Pollution
The knowledge of the spatial and temporal variability patterns (
Sections 4.2 and 4.3) can be used to facilitate the summarisation of PM10 data with increased clarity. To demonstrate this utility, we use PC scores to visualise the general annual (seasonality) and interannual variation patterns of PM10 pollution for the two subregions in the same (one) framework.
Figure 8 shows the heat maps of mean PC1 and PC2 scores for normal days (exceptional event days excluded), by month and year separately for each subregion. The analysis on PC scores for the all-day (full) dataset makes no significant difference to the findings reported here.
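The year-by-month grid underlying such a heat map can be sketched as follows, using only the Python standard library. The function name and the example scores are hypothetical; the grid would then be passed to any plotting routine that draws a coloured matrix.

```python
from collections import defaultdict
from datetime import date


def monthly_mean_grid(dates, scores):
    """Average daily PC scores into a (year x month) grid.

    Returns the sorted list of years and a list of 12-element rows;
    cells with no data are None.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for day, score in zip(dates, scores):
        key = (day.year, day.month)
        sums[key] += score
        counts[key] += 1
    years = sorted({year for year, _ in counts})
    grid = [
        [
            sums[(year, month)] / counts[(year, month)]
            if (year, month) in counts
            else None
            for month in range(1, 13)
        ]
        for year in years
    ]
    return years, grid


# Hypothetical daily PC scores, for illustration only.
example_dates = [date(2012, 1, 1), date(2012, 1, 2), date(2012, 6, 1), date(2013, 1, 1)]
example_scores = [1.0, 3.0, -1.0, 0.5]
example_years, example_grid = monthly_mean_grid(example_dates, example_scores)
```

Each row of the grid corresponds to one year and each column to one calendar month, so seasonality reads along a row and interannual change reads down a column.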
It is visually clear that mean PC scores identify very distinct annual variation patterns (seasonality) in PM10 pollution between two subregions. In the WNW subregion (
Figure 8a), positive PC1 mean scores (hence high mean PM10 levels) occurred mostly in warmer months (late spring to summer, highest in November to February), but negative scores (low mean PM10 levels) in cooler months (May to September, lowest in June). In the SE subregion, however, the seasonal variability pattern appears complex (
Figure 8b). Positive mean PC2 scores (thus higher mean PM10 levels) tended to occur in some cooler months (in particular, May and July-October) and negative scores (thus lower PM10 levels) in some warmer months (November-April) in most years. As expected, the mean scores for November and December in 2019 are also high, reflecting the broad-scale higher mean PM10 pollution associated with impacts of the unprecedented 2019-2020 spring-summer bushfires in NSW (DPE, 2020).
The interannual changes in PC scores appear similar between two subregions (
Figure 8). Higher values occurred in 2018-2019 and secondarily 2012-2013 (this variability signal is slightly weaker in the WNW subregion), but lower values in 2021-2022 and secondarily 2015-2016, with 2022 being the cleanest year on record (DPE, 2023). Of note is that June has generally negative mean scores across all years in both subregions, indicating generally better air quality for this time of the year.
4.4.1. Mean PM10 Levels and Total Number of Poor Air Quality Days
In this section, the annual and interannual variability patterns in mean PM10 concentrations and total number of poor air quality days are illustrated at the station level for two air quality subregions (
Figure 9 and
Figure 10). The results confirm the above identified variability patterns in PC scores, but with greater details on PM10 pollution at individual locations/stations.
In the WNW subregion (Figure 9a), the mean PM10 levels and the number of poor air quality days were generally higher in warmer months (i.e., late spring to summer, highest in November to January) but lower in cooler months (lowest in June). In the SE subregion (Figure 9b), however, the two statistics were generally higher in August-November (late winter to spring) and May (highest in September and October) but lower in February, March and June (lowest in June), with near-average values in December to January.
Broadly, the interannual variation patterns are similar between the two subregions, with higher values in 2018-2019 and, secondarily, in 2012-2013 (this variability signal is slightly weaker in the WNW subregion) but lower values in 2021-2022 and, secondarily, in 2015-2016 (Figure 10). Consistently, OEH (2017a) also noted higher PM10 pollution in 2012 and 2013 than in other years during 2012-2015. OEH (2014) and DPE (2019, 2020) suggested that the (prior) higher-than-average temperatures and prolonged drought conditions across NSW (and, more broadly, most of the Australian continent) had contributed to the poorer air quality observed in 2012-2013 and 2018-2019. DPE (2022, 2023) noted that cooler and wetter climate conditions contributed to improved air quality in 2021-2022.
Overall, as expected, the SE subregion has generally higher mean PM10 pollution and more exceedance days (highest at the four source-impacted sites) than the WNW subregion, which is consistent with local air quality experience (DPE, 2022; Holmes and Associates, 1996). Of the three larger population centre sites, higher PM10 pollution occurred at Muswellbrook and Singleton. In contrast, Aberdeen experienced relatively better air quality, with mean PM10 levels and a total number of exceedance days comparable to those at Merriwa (background site). Also of note is that June appears to be the cleanest month, recording the lowest mean pollution and the fewest poor air quality days across almost all stations.
The distribution patterns identified for the normal-day dataset were further confirmed through a similar analysis of the all-day dataset (Figures S5 and S6). In comparison, however, the mean PM10 levels and poor air quality days for the all-day dataset show significant increases for November-January and 2018-2020, primarily associated with the broad-scale impact of dust storms in 2018 and the Black Summer bushfires in the 2019-2020 spring-summer months (DPE, 2020).
Figure 1.
Upper Hunter Valley - locations of air quality monitoring stations. Base map source: Google.
Figure 2.
Box plots by site for daily PM10 measurements in 2012–2022 (excluding exceptional event days). The lower and upper boundaries of the box are the 25th and 75th percentiles, respectively; the horizontal line inside the box represents the median; asterisks represent extreme values, i.e., cases with values more than 3 box-lengths from the upper or lower edge of the box; dots denote outliers, i.e., cases with values between 1.5 and 3 box-lengths from the upper or lower edge of the box; the horizontal lines connected to the two ends of the box correspond to the largest and smallest observed values that are not outliers. Red dashed line: the Australian national standard of 50 µg/m3 for daily PM10.
Figure 3.
Summary statistics by station for the normal-day dataset (exceptional days excluded listwise) compared with the all-day dataset (exceptional days included). Values are colour coded for each dataset separately: green – relatively low value; yellow – near medium value; red – relatively high value. Data: daily PM10 concentrations (µg/m3) for stations in the UHAQMN.
Figure 4.
Summary statistics by station for exceptional event days (totalling N=130 days) and proportional increases over the normal-day dataset given in Figure 3. Values are colour coded separately for the exceptional event dataset and the proportional increase ratios: green – relatively low value; yellow – near medium value; red – relatively high value. Data: daily PM10 measurements in 2012-2022.
Figure 5.
Identification of two air quality subregions in the Upper Hunter Valley based on Varimax rotated principal component analysis (RPCA) of daily PM10 data in 2012-2022 (exceptional event days excluded). Left panel: key to station numbers; middle panel: scatter plot of loadings for the first two rotated principal components (PC1, PC2); right panel: map showing the two air quality subregions, with red balloons indicating station locations in the UHAQMN. Base map source: Google.
Figure 6.
WNW subregion temporal variability patterns. (a) The first principal component (PC1) scores used for the wavelet analysis, derived from RPCA on the normal-day dataset where missing data and data for exceptional event days were replaced with the overall median for each station. (b) The local normalised wavelet power spectrum of (a) using the Morlet wavelet. The contour lines are at normalised variances of low to high values shown in dark (blue) to bright (light) colours. The thick black contour encloses regions of greater than 95% confidence for a red-noise process with lag-1 serial correlation coefficient. Regions under the bowl-shaped curve on either end indicate the “cone of influence”, where edge effects become important. (c) The scale-averaged wavelet power (variance) over the 1-365 days band for PC1 scores.
Figure 7.
SE subregion temporal variability patterns. (a) The second principal component (PC2) scores used for the wavelet analysis, derived from RPCA on the normal-day dataset where missing data and data for exceptional event days were replaced with the overall median for each station. (b) The local normalised wavelet power spectrum of (a) using the Morlet wavelet. The contour lines are at normalised variances of low to high values shown in dark (blue) to bright (light) colours. The thick black contour encloses regions of greater than 95% confidence for a red-noise process with lag-1 serial correlation coefficient. Regions under the bowl-shaped curve on either end indicate the “cone of influence”, where edge effects become important. (c) The scale-averaged wavelet power (variance) over the 1-365 days band for PC2 scores.
Figure 8.
Mean PC1 and PC2 scores by month and year for the two subregions. PC1 and PC2 are from RPCA on the normal-day PM10 dataset (exceptional event days excluded). Blank cells indicate that there are insufficient data points for a valid calculation. Colour scale: green – relatively low value; yellow – near medium value; red – relatively high value.
Figure 9.
Monthly mean PM10 levels and total number of poor air quality days (with PM10 levels > 50 µg/m3) for stations in the (a) WNW and (b) SE subregions. Rows are sorted by the “All months” column (multi-year station means or total number of poor air quality days). Data: daily PM10 measurements in 2012–2022 (excluding exceptional event days). Colour scale: green – relatively low value; yellow – near medium value; red – relatively high value.
Figure 10.
Annual mean PM10 levels and total number of poor air quality days (with PM10 levels > 50 µg/m3) for stations in the (a) WNW and (b) SE subregions. Rows are sorted by the “All years” column (multi-year station means or total number of poor air quality days). Data: daily PM10 measurements in 2012–2022 (excluding exceptional event days). Colour scale: green - relatively low value; yellow – near medium value; red - relatively high value.
Table 1.
Air quality station details and PM10 data used in this study. Source: adapted from OEH (2012).
| Station type | Station purpose | Station name* | Total number of days with valid daily PM10 data in 2012-2022 | Total number of days with invalid or missing daily PM10 data in 2012-2022 |
|---|---|---|---|---|
| Larger population centre | Monitoring air quality in the larger population centres | Aberdeen | 3984 | 34 |
| | | Muswellbrook | 3968 | 50 |
| | | Singleton | 3978 | 40 |
| Smaller population centre | Monitoring air quality in the smaller communities | Bulga | 3967 | 51 |
| | | Camberwell | 3970 | 48 |
| | | Jerrys Plains | 3950 | 68 |
| | | Maison Dieu | 3953 | 65 |
| | | Warkworth | 3940 | 78 |
| | | Wybong | 3962 | 56 |
| Diagnostic | Providing diagnostic data that helps to diagnose the likely sources and movement of particles across the region as a whole; they do not provide information about air quality at population centres | Mt Thorley | 3876 | 142 |
| | | Muswellbrook NW | 3986 | 32 |
| | | Singleton NW | 3978 | 40 |
| Background | Providing background data at the top (Merriwa) and bottom (Singleton South) ends of the valley | Merriwa | 3862 | 156 |
| | | Singleton S | 3951 | 67 |
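The day counts in Table 1 can be cross-checked with a short script. The sketch below assumes that the study period runs from 1 January 2012 to 31 December 2022 inclusive (an assumption, since the exact start and end dates are not stated in the table); under that assumption each station's valid and missing counts should sum to 4018 days. The function names are hypothetical and the counts are copied from Table 1.

```python
from datetime import date

# Assumption: the study period is 1 Jan 2012 to 31 Dec 2022 inclusive.
STUDY_START = date(2012, 1, 1)
STUDY_END = date(2022, 12, 31)
TOTAL_DAYS = (STUDY_END - STUDY_START).days + 1

# (valid days, invalid or missing days) per station, copied from Table 1.
TABLE_1 = {
    "Aberdeen": (3984, 34),
    "Muswellbrook": (3968, 50),
    "Singleton": (3978, 40),
    "Bulga": (3967, 51),
    "Camberwell": (3970, 48),
    "Jerrys Plains": (3950, 68),
    "Maison Dieu": (3953, 65),
    "Warkworth": (3940, 78),
    "Wybong": (3962, 56),
    "Mt Thorley": (3876, 142),
    "Muswellbrook NW": (3986, 32),
    "Singleton NW": (3978, 40),
    "Merriwa": (3862, 156),
    "Singleton S": (3951, 67),
}


def inconsistent_stations(table, total_days=TOTAL_DAYS):
    """List stations whose valid + missing counts do not sum to total_days."""
    return [name for name, (valid, missing) in table.items()
            if valid + missing != total_days]


def completeness_percent(valid_days, total_days=TOTAL_DAYS):
    """Share of days in the study period with valid daily PM10 data."""
    return 100.0 * valid_days / total_days


if __name__ == "__main__":
    print("Days in study period:", TOTAL_DAYS)
    print("Inconsistent rows:", inconsistent_stations(TABLE_1))
    for name, (valid, _missing) in TABLE_1.items():
        print(f"{name}: {completeness_percent(valid):.1f}% valid")
```

An empty list from `inconsistent_stations` indicates that the two count columns are mutually consistent for every station.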