The purpose of this study was to use spatiotemporal data to measure China's provincial employment quality, to determine its spatial correlation, and to determine the size and direction of the factors affecting it. (1) The PTA3 method was designed to decompose the joint interaction of multiple dimensions and multiple features, and it can reduce the dimensionality of high-order, high-dimensional data without changing its order. It overcomes the disadvantage of processing spatiotemporal multidimensional data sequentially while ignoring its multilinear structure. This study therefore applied the PTA3 method to measure employment quality from a spatiotemporal perspective for the first time, addressing a shortcoming of existing research, which ignored the joint role of multiple dimensions and features in the employment quality evaluation system. (2) Global spatial autocorrelation was selected to analyze the spatial correlation of employment quality. (3) A spatial econometric model was used to analyze, from a spatial perspective, the direction and magnitude of the effects of the influencing factors on employment quality in a province itself and in neighboring provinces.
3.1. PTAk Model of k-order Tensor (k>2)
PTAk represents a technique for decomposing complex, high-dimensional array data, known as N (N>2)-order tensors. This approach extends the well-known principal component analysis to higher dimensions. PTAk's core objective is to approximate intricate high-order tensors using simpler, low-order ones, facilitating feature extraction from intricate high-dimensional datasets. Essentially, it functions as a generalized singular value decomposition model. The method relies on the alternating least squares technique to compute the principal tensor efficiently. This process enables the extraction of orthogonal sub-tensors from the original data, providing an effective approximation of the high-dimensional space. The significance of the principal tensor is evaluated based on the magnitude of its singular value, ensuring reliability in its selection.
Historically, the processing of spatiotemporal multidimensional data has often involved converting it into a sequential vector format for analysis within a linear subspace. However, this approach neglects the inherent multilinearity of such data. Analogous to the limitations of a two-way analysis table that can only be collapsed or expanded in two dimensions, traditional methods often overlook the rich interactions present in higher-order data. PTAk's framework addresses this by incorporating duality principles, thereby broadening the scope of multidimensional analysis to better capture the complexities of spatiotemporal datasets. The structural form of PTAk for tensors of order k is described as follows:
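The first principal tensor and its singular value are obtained from the following optimization, written here for $k=3$:

$$\sigma_1 = \max_{\lVert\alpha\rVert=\lVert\beta\rVert=\lVert\gamma\rVert=1} X\,..\,(\alpha\otimes\beta\otimes\gamma) \tag{1}$$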
In this context, "⊗" denotes the tensor product and ".." denotes the contraction operation, which is analogous to the inner product in the tensor space. "X" denotes the data tensor. The unit vectors that attain the maximum are the first principal components, and their tensor product is the first principal tensor. Solving this problem yields the best rank-one approximation of X together with its singular value. The calculation uses the RPVSCC algorithm, which is equivalent to TUCKALS3 restricted to a single component per mode. The uniqueness of the solution of Eq. (1) under orthogonal transformation was first established by Leibovici (Leibovici D., 1999).
The PTAk model offers a more convenient calculation method because, instead of directly employing algebraic methods to calculate the tensor product of vectors, it employs the contraction operator, as described in Eq.(2). This approach streamlines the computational process and makes it more efficient.
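As a minimal illustration of the contraction operation, the following NumPy sketch contracts a third-order array with two vectors, once directly and once by first forming their tensor product. The array sizes and the names `X`, `beta`, and `gamma` are hypothetical and chosen only for this example.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((4, 3, 5))   # hypothetical 3-way data array
beta = rng.standard_normal(3)        # vector acting on mode 2
gamma = rng.standard_normal(5)       # vector acting on mode 3

# Contraction X..(beta ⊗ gamma): sum over modes 2 and 3, leaving a mode-1 vector.
contracted = np.einsum("ijk,j,k->i", X, beta, gamma)

# Same quantity obtained by building the tensor product explicitly first.
explicit = np.tensordot(X, np.outer(beta, gamma), axes=([1, 2], [0, 1]))
```

The direct contraction avoids materializing the tensor product, which is the convenience the paragraph above refers to.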
To solve for the second principal tensor, an orthogonality constraint is added to Eq. (1). The optimization proceeds as in the preceding step, but it is applied to the projection of tensor X onto the orthogonal tensor space of the first principal tensor.
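The projection step can be sketched as follows for $k=3$. This is an illustration under the assumption that the orthogonal tensor space is the tensor product of the orthogonal complements of the three first principal components; the function name `project_out` is hypothetical.

```python
import numpy as np

def project_out(X, a, b, c):
    """Project a 3-way array onto a-perp ⊗ b-perp ⊗ c-perp.

    a, b, c are assumed to be unit-length components of the first
    principal tensor, one per mode.
    """
    Pa = np.eye(a.size) - np.outer(a, a)  # projector onto the complement of a
    Pb = np.eye(b.size) - np.outer(b, b)
    Pc = np.eye(c.size) - np.outer(c, c)
    # Apply one projector to each mode of X.
    return np.einsum("ip,jq,kr,pqr->ijk", Pa, Pb, Pc, X)
```

Running the rank-one optimization again on the projected array then yields a second principal tensor whose components are orthogonal, mode by mode, to those of the first.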
According to Eq. (3), the PTAk decomposition synthesizes the data from a set of uncorrelated components. In the schemas used for PTA3(X) and PTAk(X), principal tensors can be distinguished from their associated principal tensors. An associated principal tensor is linked to a principal tensor because it shares one or more components with that principal tensor. After a tensor of order k is contracted with a given component, the associated principal tensors can be obtained by applying PTA(k-1) to the result, which makes PTAk a recursive algorithm. When k=3, the contracted tensor is a matrix, so the recursion reduces to PTA2, that is, the ordinary singular value decomposition.
The notation $\otimes_i$ means that, in the full tensor product, the vector on the left occupies the $i$th of the $k$ positions; for example, $\beta\otimes_2(\alpha\otimes\gamma)=\alpha\otimes\beta\otimes\gamma$.
PTA3 Model Algorithm
Before introducing the algorithm of the PTAk model, a brief overview of the RPVSCC algorithm is provided. The objective of the RPVSCC algorithm is to identify the first principal tensor of the initial tensor X. Its pseudocode is as follows:
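| RPVSCC algorithm (k=3) |
| Input: tensor X, maximum number of iterations MAX, and stopping threshold ɛ |
| Output: singular value σ and its principal tensor components α, β, γ |
| Steps: |
| 1: Initialize a set of principal tensor components $\alpha_0$, $\beta_0$, $\gamma_0$ |
| 2: for i from 1 to MAX: |
|   $\alpha_i = X..(\beta_{i-1}\otimes\gamma_{i-1}) \,/\, \lVert X..(\beta_{i-1}\otimes\gamma_{i-1})\rVert$ |
|   $\beta_i = X..(\alpha_i\otimes\gamma_{i-1}) \,/\, \lVert X..(\alpha_i\otimes\gamma_{i-1})\rVert$ |
|   $\gamma_i = X..(\alpha_i\otimes\beta_i) \,/\, \lVert X..(\alpha_i\otimes\beta_i)\rVert$ |
|   If the maximum of $\lVert\alpha_i-\alpha_{i-1}\rVert$, $\lVert\beta_i-\beta_{i-1}\rVert$, $\lVert\gamma_i-\gamma_{i-1}\rVert$ is less than ɛ, exit the loop and output $\sigma = X..(\alpha_i\otimes\beta_i\otimes\gamma_i)$, $\alpha_i$, $\beta_i$, $\gamma_i$ |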
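The alternating scheme above can be sketched in NumPy as follows. This is an illustrative implementation under stated assumptions, not the authors' code: the function names are hypothetical, the initialization from the leading singular vectors of the three unfoldings is an assumed choice (the pseudocode does not specify one), and the default values of `max_iter` and `eps` are arbitrary.

```python
import numpy as np

def leading_left_singular_vector(M):
    """Return the first left singular vector of a matrix."""
    U, _, _ = np.linalg.svd(M, full_matrices=False)
    return U[:, 0]

def rpvscc(X, max_iter=500, eps=1e-10):
    """Rank-one approximation of a 3-way array by alternating contractions."""
    I, J, K = X.shape
    # Assumed initialization: leading singular vector of each mode unfolding.
    a = leading_left_singular_vector(X.reshape(I, -1))
    b = leading_left_singular_vector(np.moveaxis(X, 1, 0).reshape(J, -1))
    c = leading_left_singular_vector(np.moveaxis(X, 2, 0).reshape(K, -1))
    for _ in range(max_iter):
        # Update each component by contracting X with the other two, then normalize.
        a_new = np.einsum("ijk,j,k->i", X, b, c)
        a_new /= np.linalg.norm(a_new)
        b_new = np.einsum("ijk,i,k->j", X, a_new, c)
        b_new /= np.linalg.norm(b_new)
        c_new = np.einsum("ijk,i,j->k", X, a_new, b_new)
        c_new /= np.linalg.norm(c_new)
        # Largest change among the three components in this sweep.
        change = max(np.linalg.norm(a_new - a),
                     np.linalg.norm(b_new - b),
                     np.linalg.norm(c_new - c))
        a, b, c = a_new, b_new, c_new
        if change < eps:
            break
    # Singular value: full contraction of X with the three components.
    sigma = float(np.einsum("ijk,i,j,k->", X, a, b, c))
    return sigma, a, b, c
```

Like any alternating least squares scheme, this converges to a stationary point that depends on the starting values, and the sketch does not guard against a contraction that is exactly zero.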
The pseudocode of the complete PTAk algorithm is as follows:
| PTAk algorithm (k=3) |
| Input: tensor X, order k=3, maximum number of iterations MAX, and stopping threshold ɛ |
| Output: principal tensors (i=1,2,…,m) and associated principal tensors (j=1,2,…,k) |
| Steps: |
| 1: Run the RPVSCC algorithm to obtain the first principal tensor |
| 2: In the orthogonal tensor space of the solution obtained in step 1, repeat step 1 to obtain all principal tensors |
| 3: for i from 1 to m: |
|   for j from 1 to k: |
|     contract X with the jth component of the ith principal tensor |
|     apply PTA(k-1) to the contracted tensor to obtain the associated principal tensors |
| return the principal tensors and the associated principal tensors |
In the pseudocode of the PTAk algorithm, the result for a given pair (i, j) is not necessarily a single associated principal tensor, because the optimization can select several associated principal tensors. In addition, the singular value of a principal tensor is not necessarily larger than the singular values of all associated principal tensors: the singular value of an associated principal tensor of the ith principal tensor may exceed that of the (i+1)th principal tensor.
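Step 3 of the algorithm can be sketched for $k=3$, where PTA(k-1) is the singular value decomposition of a matrix. The sketch assumes that, after contracting X with one component, the resulting matrix is projected onto the orthogonal complements of the other two components before the SVD is taken; this projection and the function name `associated_singular_values` are assumptions made for illustration.

```python
import numpy as np

def associated_singular_values(X, a, b, c):
    """Singular values of the associated principal tensors of (a, b, c).

    a, b, c are assumed to be the unit-length components of a principal
    tensor of the 3-way array X, already computed elsewhere.
    """
    Pa = np.eye(a.size) - np.outer(a, a)  # projector onto the complement of a
    Pb = np.eye(b.size) - np.outer(b, b)
    Pc = np.eye(c.size) - np.outer(c, c)
    # Contract with one component, project the remaining two modes.
    Ma = Pb @ np.einsum("ijk,i->jk", X, a) @ Pc
    Mb = Pa @ np.einsum("ijk,j->ik", X, b) @ Pc
    Mc = Pa @ np.einsum("ijk,k->ij", X, c) @ Pb
    # PTA2 is the ordinary SVD; keep only the singular values here.
    return [np.linalg.svd(M, compute_uv=False) for M in (Ma, Mb, Mc)]
```

Each returned array lists, in decreasing order, the singular values of the tensors associated with one component, which is why several associated principal tensors can arise for a single pair (i, j).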
3.2. Global Spatial Autocorrelation Analysis
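Global spatial autocorrelation analysis examines whether a phenomenon is spatially correlated across the study area, that is, it describes the spatial distribution of employment quality in China. Moran's I is a commonly used index, calculated as follows:

$$I = \frac{\sum_{i=1}^{n}\sum_{j=1}^{n} w_{ij}\,(x_i-\bar{x})(x_j-\bar{x})}{S^2\sum_{i=1}^{n}\sum_{j=1}^{n} w_{ij}}$$

In this formula, $w_{ij}$ is an element of the binary spatial weight matrix, $x_i$ and $x_j$ are the employment quality scores of regions $i$ and $j$, $n$ is the total number of regions, $\bar{x}$ is the mean score, and $S^2=\frac{1}{n}\sum_{i=1}^{n}(x_i-\bar{x})^2$ is the sample variance. In this study, $w_{ij}$ is defined as follows:

$$w_{ij} = \begin{cases} 1, & \text{if regions } i \text{ and } j \text{ are adjacent} \\ 0, & \text{otherwise} \end{cases}$$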
The value of Moran's I lies in the range -1 ≤ I ≤ 1. A positive Moran's I indicates that provincial employment quality is positively spatially correlated, a negative value indicates negative spatial correlation, and a value of zero indicates no spatial correlation. The Moran scatter plot divides provincial employment quality into four patterns of spatial dependence, one per quadrant. In the first quadrant, provinces with high employment quality are surrounded by provinces that also have high employment quality (HH); in the second quadrant, provinces with low employment quality are surrounded by provinces with high employment quality (LH); in the third quadrant, provinces with low employment quality are surrounded by provinces that also have low employment quality (LL); and in the fourth quadrant, provinces with high employment quality are surrounded by provinces with low employment quality (HL).
The significance of Moran's I can be tested under either of two assumptions, asymptotic normality or randomization, using the standardized statistic:

$$Z = \frac{I - E(I)}{\sqrt{\mathrm{Var}(I)}}$$

The expected value of Moran's I is given below; the variance $\mathrm{Var}(I)$ depends on which of the two assumptions is adopted:

$$E(I) = -\frac{1}{n-1}$$
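The index and its standardized statistic can be sketched in NumPy as follows. The variance used is the standard expression under the normality assumption; the function name `morans_i_normal` is hypothetical, and the sketch is an illustration rather than the procedure used in this study.

```python
import numpy as np

def morans_i_normal(x, W):
    """Moran's I, its expected value, and the z-score under normality.

    x : scores for n regions
    W : n-by-n spatial weight matrix with zero diagonal
    """
    x = np.asarray(x, dtype=float)
    W = np.asarray(W, dtype=float)
    n = x.size
    d = x - x.mean()                      # deviations from the mean
    s0 = W.sum()                          # sum of all weights
    moran = (n / s0) * (d @ W @ d) / (d @ d)
    expected = -1.0 / (n - 1)
    # Variance of I under the normality assumption.
    s1 = 0.5 * ((W + W.T) ** 2).sum()
    s2 = ((W.sum(axis=1) + W.sum(axis=0)) ** 2).sum()
    variance = ((n**2 * s1 - n * s2 + 3 * s0**2)
                / ((n**2 - 1) * s0**2)) - expected**2
    z = (moran - expected) / np.sqrt(variance)
    return moran, expected, z
```

For four regions arranged on a line with scores 1, 2, 3, 4 and binary adjacency weights, the function returns I = 1/3 and E(I) = -1/3.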
3.3. Spatial Econometric Model
According to the "first law of geography", everything is related to everything else, but near things are more related than distant things. Spatial econometrics studies this kind of spatial correlation. Commonly used spatial econometric models include the spatial lag model (SLM), the spatial error model (SEM), and the spatial Durbin model (SDM). In this study, a spatial econometric model was established to analyze the factors influencing China's inter-provincial employment quality.
(1) Spatial Lag Model (SLM)
The SLM mainly analyzes the spillover of employment quality from neighboring provinces to a given province, that is, the size and direction of the spatial influence of employment quality in neighboring provinces on employment quality in that province. The model is as follows:

$$y_{it} = \rho\sum_{j=1}^{N} w_{ij}\,y_{jt} + \sum_{k=1}^{K}\beta_k x_{itk} + \mu_i + \lambda_t + \varepsilon_{it}$$

In the formula, $y_{it}$ is the explained variable; $i$ and $j$ index the provinces; $N$ is the number of provinces; $t$ is the year; $w_{ij}$ is an element of the spatial weight matrix; $\rho$ is the spatial autoregressive coefficient; $x_{itk}$ is the $k$th explanatory variable; $\beta_k$ is the coefficient of the $k$th explanatory variable; $K$ is the number of explanatory variables; $\mu_i$ is the spatial fixed effect; $\lambda_t$ is the time fixed effect; and $\varepsilon_{it}$ is the random error.
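To illustrate how the spatial lag term couples the provinces, the sketch below generates data from the reduced form of the SLM, $y = (I - \rho W)^{-1}(X\beta + \varepsilon)$. It is a simplified illustration under several assumptions: a single cross-section without fixed effects, a row-standardized weight matrix, and hypothetical values for every parameter. The function name `simulate_slm` is also hypothetical.

```python
import numpy as np

def simulate_slm(W, X, beta, rho, eps):
    """Solve y = rho * W y + X beta + eps for y (reduced form of the SLM)."""
    n = W.shape[0]
    return np.linalg.solve(np.eye(n) - rho * W, X @ beta + eps)

rng = np.random.default_rng(0)
n = 5
# Hypothetical binary contiguity matrix: five regions on a line.
A = np.zeros((n, n))
for i in range(n - 1):
    A[i, i + 1] = A[i + 1, i] = 1.0
W = A / A.sum(axis=1, keepdims=True)      # row-standardization (assumption)

X = rng.standard_normal((n, 2))           # two hypothetical explanatory variables
beta = np.array([1.0, -0.5])              # hypothetical coefficients
eps = 0.1 * rng.standard_normal(n)        # random error
y = simulate_slm(W, X, beta, rho=0.4, eps=eps)
```

Because $y$ appears on both sides of the structural equation, a shock in one region propagates to its neighbors through $(I - \rho W)^{-1}$, which is the spillover the SLM is meant to capture.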
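(2) Spatial Error Model (SEM)
The SEM mainly analyzes differences in employment quality among provinces that arise from geographical location, and it represents the effect of error shocks to employment quality in neighboring provinces on employment quality in a given province. The model is as follows:

$$y_{it} = \sum_{k=1}^{K}\beta_k x_{itk} + \mu_i + \lambda_t + \phi_{it}, \qquad \phi_{it} = \delta\sum_{j=1}^{N} w_{ij}\,\phi_{jt} + \varepsilon_{it}$$

In the formula, $\phi_{it}$ is the spatially autocorrelated error term and $\delta$ is the spatial autocorrelation coefficient of the error term.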
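(3) Spatial Durbin Model (SDM)
The SDM mainly analyzes the effect of the influencing factors in a province on employment quality in that province and in neighboring provinces, and it also captures the effect of the spatial lag term on employment quality. The model is as follows:

$$y_{it} = \rho\sum_{j=1}^{N} w_{ij}\,y_{jt} + \sum_{k=1}^{K}\beta_k x_{itk} + \sum_{k=1}^{K}\theta_k\sum_{j=1}^{N} w_{ij}\,x_{jtk} + \mu_i + \lambda_t + \varepsilon_{it}$$

In the formula, $\theta_k$ is the coefficient of the spatially lagged $k$th explanatory variable.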
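One common way to summarize the effect of an explanatory variable on the local province and on neighboring provinces in an SDM is the direct and indirect effect decomposition based on the matrix $(I-\rho W)^{-1}(\beta_k I + \theta_k W)$. The source does not state which summary it uses, so the sketch below is an assumption offered for illustration; the function name `sdm_effects` and all parameter values are hypothetical.

```python
import numpy as np

def sdm_effects(W, rho, beta_k, theta_k):
    """Average direct, indirect, and total effects of one explanatory variable."""
    n = W.shape[0]
    # Matrix of partial derivatives of y with respect to the k-th variable.
    S = np.linalg.solve(np.eye(n) - rho * W,
                        beta_k * np.eye(n) + theta_k * W)
    direct = np.trace(S) / n       # average own-region effect
    total = S.sum() / n            # average effect summed over all regions
    indirect = total - direct      # average spillover to other regions
    return direct, indirect, total

# Hypothetical row-standardized weights for five regions on a line.
n = 5
A = np.zeros((n, n))
for i in range(n - 1):
    A[i, i + 1] = A[i + 1, i] = 1.0
W = A / A.sum(axis=1, keepdims=True)

direct, indirect, total = sdm_effects(W, rho=0.4, beta_k=1.0, theta_k=0.5)
```

With a row-standardized weight matrix the total effect equals $(\beta_k+\theta_k)/(1-\rho)$, which gives a quick check on the computation.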