Submitted: 04 November 2024
Posted: 06 November 2024
Abstract
This paper describes the specification of spatially varying coefficient (SVC) regression models using Generalized Additive Models (GAMs). The GAMs include a Gaussian Process (GP) spline (smooth) for each covariate, parameterised with observation location, and generate SVC estimates that capture spatially varying relationships. The ability of the proposed GAM approach to estimate true spatially varying coefficients was compared with that of Multiscale Geographically Weighted Regression (MGWR) using simulated data with complex spatial heterogeneities. The geographical GP GAM (GGP-GAM) outperformed MGWR across a range of fit metrics, yielding more accurate coefficient estimates and lower residual errors. The model for one of the simulated datasets was investigated in detail to illustrate GAM diagnostics, model checks, spline / smooth convergence and basis evaluations, and tuning via the number of knots. A larger simulated case study was investigated to explore the trade-offs between GGP-GAM complexity, performance and computation. Finally, the GGP-GAM and MGWR approaches were applied to an empirical case study. The resulting models had very similar accuracies and fits, and generated subtly different spatially varying coefficient estimates. A number of areas of further work are identified.
Keywords:
1. Introduction
2. Literature Review
2.1. SVC Models
2.2. GAM Based SVC Models
3. Methodology
3.1. Data
3.2. GAM-Based SVC Models
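As a rough illustration of the model form summarised in the abstract (a separate smooth of location for each covariate), the following Python sketch builds a squared-exponential basis over a grid of knots and multiplies each covariate by its own copy of that basis. It is not the authors' implementation: the function names, the kernel, the knot grid, the length scale and the fixed ridge penalty are all assumptions made for illustration, and the fixed penalty stands in for the smoothing parameter selection that a GAM would perform.

```python
import numpy as np


def gp_basis(coords, knots, length_scale):
    """Squared-exponential basis: one column per knot, one row per location."""
    sq_dist = ((coords[:, None, :] - knots[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-sq_dist / (2.0 * length_scale ** 2))


def fit_svc(coords, X, y, knots, length_scale=0.3, ridge=1e-3):
    """Fit y = sum_j x_j * f_j(location) by ridge-penalised least squares.

    Returns an (n, p) array of local coefficient estimates, one column per
    covariate. The fixed ridge penalty is an illustrative simplification.
    """
    n_cov, n_knots = X.shape[1], knots.shape[0]
    B = gp_basis(coords, knots, length_scale)
    # Each covariate multiplies its own copy of the spatial basis.
    design = np.hstack([X[:, [j]] * B for j in range(n_cov)])
    gram = design.T @ design + ridge * np.eye(n_cov * n_knots)
    theta = np.linalg.solve(gram, design.T @ y)
    # One row of basis weights per covariate; evaluate each surface at the data.
    return B @ theta.reshape(n_cov, n_knots).T


# Simulated example (all values hypothetical): a constant intercept and a
# slope that increases from south-west to north-east.
rng = np.random.default_rng(0)
n = 400
coords = rng.uniform(0.0, 1.0, size=(n, 2))
X = np.column_stack([np.ones(n), rng.normal(size=n)])
true_beta = np.column_stack([np.full(n, 2.0), coords.sum(axis=1)])
y = (X * true_beta).sum(axis=1) + rng.normal(scale=0.1, size=n)

grid = np.linspace(0.0, 1.0, 5)
knots = np.array([(gx, gy) for gx in grid for gy in grid])

beta_hat = fit_svc(coords, X, y, knots)
slope_corr = np.corrcoef(beta_hat[:, 1], true_beta[:, 1])[0, 1]
print(f"correlation between estimated and true slope surface: {slope_corr:.3f}")
```

Increasing the number of knots gives each coefficient surface more flexibility at a higher computational cost, which is the trade-off that Analysis II examines.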
3.3. Analysis I: Comparing GGP-GAM and MGWR
3.4. Analysis II: GGP-GAM Tuning with a Larger Dataset
3.5. Empirical Example: Brexit Vote
4. Results
4.1. Comparing GGP-GAM and MGWR
4.2. A Single GGP-GAM in Detail
4.3. GGP-GAM Tuning with a Larger Dataset

4.4. Empirical Example: Brexit Vote
- Intercept: This is not locally significant in the GGP-GAM model, although it is significant in its parametric form (not shown). The MGWR model indicates a highly localised (i.e. spatially varying) relationship, with a relatively small bandwidth (69 km). Both sets of coefficient estimates are positive and have similar values and ranges.
- Christian: Both sets of coefficient estimates indicate a negative association with the Leave vote in Scotland and parts of North Wales and a positive one in England. It is locally significant in the GGP-GAM model and exhibits moderate local variation in the MGWR model, with a bandwidth of 159 km.
- Degree: This is locally significant in the GGP-GAM model and is negatively associated with the Leave vote throughout the study area in both models. The MGWR bandwidth (204 km) indicates moderate local variation.
- No Car: This is locally significant in the GGP-GAM model. It is mostly negatively associated with the Leave vote share in both models, and both show similar areas of positive association in the North. The MGWR bandwidth (172 km) indicates moderate local variation.
- Younger: This is not locally significant in the GGP-GAM model. In the MGWR model, its bandwidth (1196 km) indicates a global (spatially uniform) association with the Leave vote share.
5. Discussion
6. Conclusion
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest

| Model | Min. | 1st Qu. | Median | Mean | 3rd Qu. | Max. |
|---|---|---|---|---|---|---|
| GGP-GAM | -0.130 | -0.019 | 0.000 | 0.000 | 0.019 | 0.133 |
| MGWR | -0.520 | -0.052 | 0.000 | 0.001 | 0.052 | 0.600 |

| Model | Min. | 1st Qu. | Median | Mean | 3rd Qu. | Max. |
|---|---|---|---|---|---|---|
| GGP-GAM | -0.199 | -0.155 | -0.002 | -0.054 | 0.000 | 0.000 |
| MGWR | -0.002 | -0.002 | 0.000 | 0.109 | 0.313 | 0.413 |

| | Min. | 1st Qu. | Median | 3rd Qu. | Max. |
|---|---|---|---|---|---|
| GGP-GAM SP | 9.5e-04 | 1.5e-02 | 8.7e+00 | 2.5e+01 | 4.5e+01 |
| GGP-GAM SP | 4.7e-06 | 7.1e-06 | 7.9e-06 | 8.6e-06 | 1.2e-05 |
| GGP-GAM SP | 7.0e-07 | 8.0e-07 | 8.8e-07 | 9.7e-07 | 1.2e-06 |
| GGP-GAM SP | 1.5e-07 | 1.8e-07 | 1.9e-07 | 2.1e-07 | 2.8e-07 |
| MGWR BW | 8 | 10 | 10 | 10 | 12 |
| MGWR BW | 22 | 30 | 34 | 42 | 73 |
| MGWR BW | 17 | 24.75 | 30 | 34 | 42 |
| MGWR BW | 4 | 17 | 17 | 21 | 598 |

| | k' | edf | k-index | p-value |
|---|---|---|---|---|
| s(X,Y): | 154 | 2.748 | 1.192 | 1 |
| s(X,Y): | 155 | 47.469 | 1.192 | 1 |
| s(X,Y): | 155 | 93.353 | 1.192 | 1 |
| s(X,Y): | 155 | 136.112 | 1.192 | 1 |

| Estimate | Std. Error | t value | Pr(>\|t\|) |
|---|---|---|---|
| 2.136 | 0.012 | 180.69 | 0 |
| 0.000 | 0.000 | NaN | NaN |
| 0.000 | 0.000 | NaN | NaN |
| 0.000 | 0.000 | NaN | NaN |

| | edf | Ref.df | F | p-value |
|---|---|---|---|---|
| s(X,Y): | 2.748 | 3.307 | 0.638 | 0.596 |
| s(X,Y): | 47.469 | 60.164 | 121.679 | 0.000 |
| s(X,Y): | 93.353 | 110.613 | 77.813 | 0.000 |
| s(X,Y): | 136.112 | 142.870 | 80.537 | 0.000 |

| Min. | 1st Qu. | Median | Mean | 3rd Qu. | Max. |
|---|---|---|---|---|---|
| 2.091 | 2.123 | 2.137 | 2.136 | 2.149 | 2.164 |
| -1.942 | -0.711 | 0.010 | 0.010 | 0.744 | 1.929 |
| -1.919 | -0.893 | 0.024 | -0.005 | 0.833 | 1.924 |
| -2.121 | -0.695 | -0.013 | -0.020 | 0.666 | 1.995 |

| k | Min. | 1st Qu. | Median | Mean | 3rd Qu. | Max. |
|---|---|---|---|---|---|---|
| 100 | -0.3007 | -0.0257 | 4e-04 | 0 | 0.0263 | 0.2395 |
| 250 | -0.2508 | -0.0209 | 1e-04 | 0 | 0.0212 | 0.2003 |
| 500 | -0.2069 | -0.0194 | 0e+00 | 0 | 0.0194 | 0.1828 |
| 750 | -0.1748 | -0.0186 | 0e+00 | 0 | 0.0187 | 0.1476 |
| 1000 | -0.1510 | -0.0182 | 0e+00 | 0 | 0.0183 | 0.1357 |
| 1500 | -0.1284 | -0.0179 | -1e-04 | 0 | 0.0179 | 0.1278 |
| 2000 | -0.1208 | -0.0177 | -1e-04 | 0 | 0.0178 | 0.1256 |

| k | | | | | | | | |
|---|---|---|---|---|---|---|---|---|
| 100 | 2.119 | 0.000 | 0.000 | NaN | 62.613 | 0.000 | 0.000 | NaN |
| 250 | 2.119 | 0.000 | 0.826 | 0.789 | 24.283 | 0.010 | 130.407 | 0.000 |
| 500 | 2.119 | 0.000 | -0.567 | 0.847 | 0.000 | NaN | 5.441 | 0.784 |
| 750 | 2.120 | 0.000 | 0.228 | 0.938 | 0.000 | NaN | 87.764 | 0.000 |
| 1000 | 2.120 | 0.000 | 0.000 | NaN | 0.000 | NaN | 0.000 | NaN |
| 1500 | 2.120 | 0.000 | -0.617 | 0.831 | 5.618 | 0.547 | -5.470 | 0.818 |
| 2000 | 2.120 | 0.000 | -0.830 | 0.775 | 0.000 | NaN | -20.802 | 0.381 |

| k | | | | | | | | |
|---|---|---|---|---|---|---|---|---|
| 100 | 2.011 | 0.521 | 95.002 | 0.000 | 97.872 | 0.000 | 99.889 | 0.000 |
| 250 | 4.512 | 0.594 | 196.866 | 0.000 | 239.852 | 0.000 | 247.733 | 0.000 |
| 500 | 3.983 | 0.705 | 266.541 | 0.000 | 429.945 | 0.000 | 490.232 | 0.000 |
| 750 | 2.006 | 0.636 | 294.026 | 0.000 | 553.743 | 0.000 | 724.258 | 0.000 |
| 1000 | 2.022 | 0.643 | 309.444 | 0.000 | 636.787 | 0.000 | 938.711 | 0.000 |
| 1500 | 2.012 | 0.807 | 329.487 | 0.000 | 771.339 | 0.000 | 1344.097 | 0.000 |
| 2000 | 2.017 | 0.718 | 338.441 | 0.000 | 839.436 | 0.000 | 1674.279 | 0.000 |

| k | s(X,Y): | s(X,Y): | s(X,Y): | s(X,Y): |
|---|---|---|---|---|
| 100 | 15.8204 | 4.86e-07 | 3.01e-08 | 2.95e-09 |
| 250 | 0.0403 | 5.07e-07 | 4.93e-08 | 5.49e-09 |
| 500 | 0.0577 | 4.57e-07 | 5.05e-08 | 5.51e-09 |
| 750 | 32.2305 | 4.20e-07 | 4.19e-08 | 4.46e-09 |
| 1000 | 8.0443 | 4.06e-07 | 3.88e-08 | 4.48e-09 |
| 1500 | 15.3684 | 3.89e-07 | 3.69e-08 | 5.28e-09 |
| 2000 | 10.4797 | 3.83e-07 | 3.66e-08 | 5.29e-09 |

| Model | R2 | MAE | RMSE | AIC |
|---|---|---|---|---|
| MGWR | 0.940 | 0.017 | 0.025 | -1701.7 |
| GGP-GAM | 0.938 | 0.018 | 0.026 | -1685.1 |
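
The accuracy measures in the table above can be computed from observed and predicted values along the lines sketched below. The definitions are the conventional ones (R2 as one minus the ratio of residual to total sum of squares) and are an assumption here, as is the function name. AIC is omitted because it depends on how each package counts the effective number of parameters. The input values are hypothetical and are not taken from the case study.

```python
import numpy as np


def fit_metrics(observed, predicted):
    """Return R2, MAE and RMSE for paired observed and predicted values."""
    observed = np.asarray(observed, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    residuals = observed - predicted
    ss_res = float(np.sum(residuals ** 2))
    ss_tot = float(np.sum((observed - observed.mean()) ** 2))
    return {
        "R2": 1.0 - ss_res / ss_tot,
        "MAE": float(np.mean(np.abs(residuals))),
        "RMSE": float(np.sqrt(np.mean(residuals ** 2))),
    }


# Hypothetical values, not taken from the case study.
example = fit_metrics([1.0, 2.0, 3.0, 4.0], [1.1, 1.9, 3.2, 3.8])
print(example)
```
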

| Covariate | Min. | 1st Qu. | Median | Mean | 3rd Qu. | Max. | Smooth p-value |
|---|---|---|---|---|---|---|---|
| Intercept | 0.346 | 0.778 | 0.826 | 0.829 | 0.886 | 1.143 | 0.617 |
| Christian | -0.326 | 0.082 | 0.146 | 0.137 | 0.217 | 0.429 | 0.001 |
| Degree | -1.532 | -1.193 | -1.088 | -1.084 | -0.958 | -0.029 | 0.000 |
| No Car | -0.497 | -0.229 | -0.122 | -0.154 | -0.078 | 0.254 | 0.001 |
| Younger | -0.597 | -0.181 | -0.130 | -0.138 | -0.083 | 0.385 | 0.102 |

| Covariate | Min. | 1st Qu. | Median | Mean | 3rd Qu. | Max. | Bandwidth (km) |
|---|---|---|---|---|---|---|---|
| Intercept | 0.421 | 0.807 | 0.841 | 0.850 | 0.903 | 1.196 | 68.8 |
| Christian | -0.403 | 0.092 | 0.133 | 0.118 | 0.189 | 0.805 | 158.5 |
| Degree | -1.275 | -1.121 | -0.995 | -1.027 | -0.951 | -0.766 | 204.0 |
| No Car | -1.510 | -0.156 | -0.028 | -0.092 | -0.009 | 0.113 | 171.8 |
| Younger | -0.250 | -0.244 | -0.243 | -0.244 | -0.243 | -0.243 | 1196.4 |

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).