Geometric Residual Projection in Linear Regression: Rank-Aware Operators and a Geometric Multicollinearity Index

Abstract
Residuals play a central role in linear regression, but their geometric structure is often obscured by formulas built from matrix inverses and pseudoinverses. This paper develops a rank-aware geometric framework for residual projection that makes the underlying orthogonality explicit. When the column space of the design matrix has codimension one, the unexplained part of the response lies along a single unit normal to the predictor space, so the residual projector collapses to the rank-one operator $n n^\top$ and no matrix inversion is needed. For general, possibly rank-deficient, designs the residual lies in a higher-dimensional orthogonal complement spanned by an orthonormal basis $N$, and the residual projector factorizes as $N N^\top$. Using generalized cross products, wedge products, and Gram determinants, we give a basis-independent characterization of this residual space. On top of this, we introduce the Geometric Multicollinearity Index (GMI), a scale-invariant diagnostic derived from the polar sine that measures how the volume of the predictor space shrinks as multicollinearity increases. Synthetic examples and an illustrative real-data experiment show that the proposed projectors reproduce ordinary least squares residuals, that GMI responds predictably to controlled collinearity, and that the geometric viewpoint clarifies the different roles of regression projection and principal component analysis in both full-rank and rank-deficient settings.

1. Introduction

Linear regression remains one of the most widely used modeling tools in statistics, data science, and machine learning [2]. In its familiar form
$$Y = X\beta + \varepsilon, \qquad \mathbb{E}[\varepsilon] = 0, \qquad X \in \mathbb{R}^{n \times p}, \quad Y \in \mathbb{R}^{n}, \quad \beta \in \mathbb{R}^{p},$$
the ordinary least squares (OLS) estimator chooses coefficients that minimize the squared discrepancy between observations and predictions. When the design matrix $X$ has full column rank, the fitted values and the associated orthogonal projector onto the column space $\mathcal{C}(X)$ take the classical form
$$\hat{\beta} = (X^\top X)^{-1} X^\top Y, \qquad \hat{Y} = X \hat{\beta}, \qquad P = X (X^\top X)^{-1} X^\top,$$
with $P$ symmetric and idempotent [5]. More generally, the orthogonal projector onto $\mathcal{C}(X)$ can be written as $P = X X^{+}$ in terms of the Moore–Penrose pseudoinverse $X^{+}$ [5].
The fitted component and residual are
$$\hat{Y} = P Y, \qquad r = (I - P)\, Y,$$
so that $r \in \mathcal{C}(X)^{\perp}$ and
$$Y = P Y + r$$
is the orthogonal decomposition of the response into a part explained by $\mathcal{C}(X)$ and a part orthogonal to it. This algebraic formulation is standard, but it can obscure the underlying geometry that governs the behavior of the residuals, and inversion-based formulas can be fragile in high-dimensional or nearly singular regimes [5].
A classical identity from numerical linear algebra makes the geometry explicit:
$$I - P = N N^\top,$$
where the columns of $N$ form an orthonormal basis of $\mathcal{C}(X)^{\perp}$ [5]. When $\rho = n - 1$, the orthogonal complement is one-dimensional and the projector reduces to the rank-one form
$$I - P = n n^\top,$$
where $n$ is the unit vector (unique up to sign) perpendicular to $\mathcal{C}(X)$. In this codimension-one setting, the residual is the projection of $Y$ onto a single direction, and the operator (5) completely avoids matrix inversion. For higher codimension, the residual lies in a subspace of dimension $k = n - \rho$, and the factorization (4) makes this structure explicit.
The construction of normal directions motivates generalizations of the classical cross-product to $\mathbb{R}^n$. Belhaouari et al. formalize a determinant-based vector product that yields a vector orthogonal to $n - 1$ linearly independent vectors in $\mathbb{R}^n$ [7]. In regression problems with $\dim \mathcal{C}(X)^{\perp} = 1$, a single normal vector uniquely defines the rank-one projector (5). For higher codimensions, these constructions naturally combine with null-space and QR-based methods to form an orthonormal residual basis $N$.
Multicollinearity is a persistent challenge in regression modeling, as highly dependent predictors inflate variances, destabilize coefficient estimates, and impede interpretation [11]. Classical diagnostics such as the variance inflation factor (VIF) quantify this effect but offer limited geometric insight. We therefore introduce the Geometric Multicollinearity Index (GMI), a scale-invariant, volume-based measure derived from the polar sine and the Gram determinant. For a full column-rank matrix $X = [x_1, \ldots, x_p]$ with $p \le n$, the polar sine normalizes $\sqrt{\det(X^\top X)}$ by the product of the column norms, taking values in $[0, 1]$: it equals 1 for orthogonal predictors and approaches 0 as multicollinearity increases.
In rank-deficient or over-parameterized regimes, the Gram determinant vanishes and the GMI saturates at 1, indicating extreme geometric degeneracy. As a normalized volume measure of the parallelotope spanned by the columns of X, the GMI provides a compact, scale-invariant geometric diagnostic across dimensions. In this regime, GMI = 1 should be interpreted as a diagnostic flag for complete geometric collapse, rather than as a measure of severity beyond rank deficiency.
Although both regression and principal component analysis (PCA) rely on orthogonal projections, their objectives differ fundamentally [4]. PCA projects $X$ onto directions of maximum variance, yielding residuals $(I - P_{\mathrm{PCA}})\, X$ that remain in $\mathcal{C}(X)$. By contrast, regression projects the response, decomposing $Y$ into a fitted component $P Y$ and a residual $r = (I - P)\, Y \in \mathcal{C}(X)^{\perp}$ [5]. This distinction between feature-space and response-space residuals is essential for interpreting linear-model diagnostics.
Scope of the present paper. The identities $I - P = N N^\top$ and $I - P = n n^\top$ are standard results from orthogonal projection theory. The contribution of this paper lies in a rank-aware geometric interpretation of these identities via multivector and cross-product constructions, together with the introduction of the volume-based Geometric Multicollinearity Index (GMI) as a complementary geometric diagnostic.
This work is theoretical in nature. We develop a rank-aware residual projector, characterize the associated residual subspaces, and formalize GMI as an intrinsic measure of multicollinearity. Algorithmic, large-scale, regularized, and nonlinear extensions are beyond the scope of this paper and are left for future work.
Contributions. The main contributions of this theoretical paper are as follows:
  • Rank-aware geometric formulation. We develop a unified, rank-aware description of the residual projector based on the identity $I - P = N N^\top$, which reduces to the rank-one form $n n^\top$ when $\dim \mathcal{C}(X)^{\perp} = 1$ and generalizes to arbitrary rank.
  • Multivector perspective on residual spaces. We reformulate $\mathcal{C}(X)^{\perp}$ and the associated projector using multivectors, wedge products, and cross-products, yielding a basis-independent geometric characterization valid in both codimension-one and higher-codimension settings.
  • Geometric Multicollinearity Index (GMI). We introduce a scale-invariant, volume-based diagnostic derived from the Gram determinant and polar sine, which quantifies geometric degeneracy in both full-rank and rank-deficient regimes and complements classical VIF-type measures.
  • Regression versus PCA residuals. We clarify the conceptual distinction between regression residuals and PCA reconstruction residuals, emphasizing the impact of projecting the response versus the predictors on residual geometry and interpretation.
  • Illustrative examples. We provide simple numerical examples illustrating rank-one and multivector residual projections and the behavior of GMI under controlled perturbations.
Organization of the paper. Section 2 introduces the notation and basic geometric concepts. Section 3 develops the main theoretical results on residual projectors and the geometric structure of $\mathcal{C}(X)^{\perp}$. Section 4 presents numerical illustrations, including the behavior of GMI under controlled multicollinearity. Section 5 and Section 6 discuss implications and conclude.

2. Notation and Preliminaries

We briefly fix notation and recall the algebraic and geometric concepts that underpin the proposed framework. We follow standard linear algebra conventions as in Strang [1], Lay et al. [3], and Golub and Van Loan [5].

2.1. Basic Notation and Subspaces

The transpose of a matrix or vector is denoted by the superscript $\top$. Lowercase letters (e.g., $x$, $n$) denote column vectors, and uppercase letters (e.g., $X$, $N$) denote matrices. For a matrix $X \in \mathbb{R}^{n \times p}$, we write $\mathcal{C}(X)$ for its column space and denote its rank by $\rho$. Throughout, $r$ denotes the residual vector, while $\rho$ denotes the rank of $X$, to avoid notational ambiguity. All notation is defined once and used consistently throughout the manuscript; in particular, $N$ denotes an orthonormal basis of the orthogonal complement $\mathcal{C}(X)^{\perp}$.
We reserve r exclusively for the residual vector in (8).
The orthogonal complement of a subspace $V \subseteq \mathbb{R}^n$ is denoted by $V^{\perp}$, and the null-space of a matrix $A$ is
$$\mathcal{N}(A) = \{ z \in \mathbb{R}^{m} : A z = 0 \}.$$
A summary of all symbols and subspace conventions is provided in Abbreviations.
Given $n \in \mathbb{R}^n$, the matrix $n n^\top \in \mathbb{R}^{n \times n}$ denotes its outer product. The identity matrix in $\mathbb{R}^{n \times n}$ is denoted by $I$.

2.2. Projection and Residual Operators

Let $X \in \mathbb{R}^{n \times p}$ have rank $\rho \le \min\{n, p\}$; when $X$ has full column rank ($\rho = p$), the orthogonal projector admits the classical closed form $P = X (X^\top X)^{-1} X^\top$.
The orthogonal projector onto the column space $\mathcal{C}(X)$ is the unique symmetric idempotent matrix $P$ that satisfies
$$\mathcal{C}(P) = \mathcal{C}(X), \qquad P^\top = P, \qquad P^2 = P.$$
When $X$ has full column rank ($\rho = p$), this projector admits the well-known closed form
$$P = X (X^\top X)^{-1} X^\top,$$
which is symmetric and idempotent [5]. In the general case, including rank-deficient designs, P can be written as
$$P = X X^{+},$$
where $X^{+}$ denotes the Moore–Penrose pseudoinverse of $X$ [5]; throughout, $X^{+}$ is computed via an SVD-based routine.
Both (6) and (7) implement the same geometric operation: orthogonal projection onto C ( X ) .
When $X^\top X \succ 0$, the same orthogonal projector can equivalently be constructed by Cholesky factorization of $X^\top X$, producing identical residuals while requiring positive definiteness of the Gram matrix.
For any response vector $Y \in \mathbb{R}^n$, the fitted component is
$$\hat{Y} = P Y,$$
and the residual is
$$r = (I - P)\, Y,$$
which obeys $X^\top r = 0$ and hence lies in the orthogonal complement $\mathcal{C}(X)^{\perp}$ [5].
Let $\rho = \operatorname{rank}(X)$ and set $k = n - \rho = \dim \mathcal{C}(X)^{\perp}$. Choose any matrix $N \in \mathbb{R}^{n \times k}$ whose columns form an orthonormal basis of $\mathcal{C}(X)^{\perp}$:
$$N^\top N = I_k, \qquad X^\top N = 0.$$
Then the residual projector admits the geometric factorization
$$I - P = N N^\top,$$
and the null-space of $X^\top$ is
$$\mathcal{N}(X^\top) = \{ z \in \mathbb{R}^n : X^\top z = 0 \} = \mathcal{C}(X)^{\perp}.$$
This characterization is standard in numerical linear algebra and regression theory [5] and forms the algebraic backbone of our geometric residual formulation. Equation (10) makes explicit that the residual r lies in a k-dimensional orthogonal complement spanned by the columns of N.

2.3. Cross-Product and Cross–Wedge–QR Method

To construct explicit orthogonal directions, especially in the codimension-one case where $\dim \mathcal{C}(X)^{\perp} = 1$, we employ the cross-product in $\mathbb{R}^n$ [7]. Given $n - 1$ linearly independent vectors
$$x_1, \ldots, x_{n-1} \in \mathbb{R}^n,$$
the determinant-based construction of Belhaouari et al. produces
$$\operatorname{cross}(x_1, \ldots, x_{n-1}) \in \mathbb{R}^n,$$
which is orthogonal to each $x_i$:
$$x_i^\top \operatorname{cross}(x_1, \ldots, x_{n-1}) = 0, \qquad i = 1, \ldots, n-1.$$
This extends earlier n-dimensional vector products and related constructions in eigenanalysis and geometry [8,9].
In the regression setting, when $\rho = n - 1$ the predictor space $\mathcal{C}(X)$ has codimension one and $\mathcal{C}(X)^{\perp}$ is spanned by a single unit vector $n$. Choosing any basis of $\mathcal{C}(X)$ and applying the cross-product yields a nonzero vector orthogonal to $\mathcal{C}(X)$; normalizing it gives a unit normal $n$ spanning $\mathcal{C}(X)^{\perp}$. In this case, the residual projector reduces to the rank-one form
$$I - P = n n^\top,$$
and the residual is simply $r = (n^\top Y)\, n$.
In this codimension-one setting, the rank-one projector $I - P = n n^\top$ coincides with the classical cross-product construction, and we refer to this case as the cross-product (rank-one) residual projector.
A key geometric property of the cross-product is that its norm equals the $(n-1)$-dimensional volume of the parallelotope spanned by its arguments [7]:
$$\| \operatorname{cross}(v_1, v_2, \ldots, v_{n-1}) \| = \operatorname{vol}_{n-1}(v_1, \ldots, v_{n-1}).$$
Thus, the cross-product simultaneously provides an orthogonal direction and encodes a volume. When $\dim \mathcal{C}(X)^{\perp} > 1$, a single cross-product yields at most one null direction; in that case additional null directions must be obtained by other means (e.g., null-space methods and QR), as discussed later.
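To make the construction concrete, the following is a minimal NumPy sketch (our illustration, not the reference implementation of [7]) of a determinant-based cross product: each component is a signed cofactor of the $(n-1) \times n$ matrix of input vectors, and the result is checked for orthogonality and for the volume identity above.

```python
import numpy as np

def cross_nd(vectors):
    """Determinant-based generalized cross product.

    Given n-1 linearly independent vectors in R^n (rows of `vectors`),
    return a vector orthogonal to all of them. Component i is the signed
    cofactor obtained by deleting column i of the (n-1) x n matrix.
    """
    V = np.asarray(vectors, dtype=float)          # shape (n-1, n)
    m, n = V.shape
    assert m == n - 1, "expected n-1 vectors in R^n"
    out = np.empty(n)
    for i in range(n):
        minor = np.delete(V, i, axis=1)           # drop column i
        out[i] = (-1) ** i * np.linalg.det(minor)
    return out

# Example in R^4: three vectors, one normal direction.
rng = np.random.default_rng(0)
V = rng.standard_normal((3, 4))
c = cross_nd(V)

print(np.allclose(V @ c, 0.0))                    # orthogonal to every input vector
vol = np.sqrt(np.linalg.det(V @ V.T))             # (n-1)-volume via the Gram determinant
print(np.isclose(np.linalg.norm(c), vol))         # norm of the cross product equals the volume
```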

2.4. Wedge Products, Gram Matrices, and Polar Sine

To describe volumes and linear independence in a coordinate-free manner, we rely on standard constructions from linear algebra and multilinear geometry [1,3,5]. Given vectors $u_1, \ldots, u_k \in \mathbb{R}^n$, their wedge product $u_1 \wedge \cdots \wedge u_k$ encodes the oriented $k$-dimensional parallelotope spanned by $\{u_i\}$. Geometrically, $u_1 \wedge \cdots \wedge u_k = 0$ if and only if the vectors are linearly dependent, and its norm $\| u_1 \wedge \cdots \wedge u_k \|$ equals the $k$-dimensional volume of the associated parallelotope. This interpretation underlies many geometric treatments of subspaces, volumes, and orthogonality in $\mathbb{R}^n$ [1,5].
For computational purposes, such volumes can be expressed using Gram matrices. Given vectors $v_1, \ldots, v_m \in \mathbb{R}^n$, the associated Gram matrix is
$$G = \begin{pmatrix} v_1^\top v_1 & \cdots & v_1^\top v_m \\ \vdots & \ddots & \vdots \\ v_m^\top v_1 & \cdots & v_m^\top v_m \end{pmatrix} = [v_1, \ldots, v_m]^\top [v_1, \ldots, v_m].$$
The $m$-dimensional volume of the parallelotope spanned by $\{v_i\}$ is given by
$$\operatorname{vol}_m(v_1, \ldots, v_m) = \sqrt{\det(G)},$$
a classical identity that plays a central role in covariance analysis and principal component analysis [4,5].
We quantify the angular separation of $p$ predictor vectors using the polar sine. Let $X = [x_1, \ldots, x_p] \in \mathbb{R}^{n \times p}$ with Gram matrix $G = X^\top X$. Following geometric constructions based on determinants and cross-products [7,8,9], the polar sine is defined as
$$\operatorname{psin}(x_1, \ldots, x_p) = \frac{\sqrt{\det(X^\top X)}}{\|x_1\| \cdots \|x_p\|}.$$
When $p \le n$ and the columns of $X$ are linearly independent, $\det(X^\top X) > 0$ and $\operatorname{psin}(x_1, \ldots, x_p) \in (0, 1]$; it equals 1 if and only if the vectors are mutually orthogonal. If the columns of $X$ are linearly dependent or $p > n$, then $X^\top X$ is singular and $\det(X^\top X) = 0$, and we adopt the convention $\operatorname{psin}(x_1, \ldots, x_p) = 0$. Thus, the polar sine takes values in $[0, 1]$, is scale-invariant, and decreases toward zero as the set of columns becomes geometrically degenerate.
Within the present framework, the polar sine provides a normalized geometric measure of predictor-space collapse under multicollinearity and forms the basis of the Geometric Multicollinearity Index (GMI) introduced in the following section.

2.5. Projection Strategies: QR and Cross-Product Formulations

Two projection strategies play a central role in the remainder of the paper.
First, QR decomposition is used when an entire orthonormal basis of $\mathcal{C}(X)$ or $\mathcal{C}(X)^{\perp}$ is required. If $X = Q R$ is a (thin) QR factorization with orthonormal columns in $Q$, then
$$P = Q Q^\top$$
is the orthogonal projector onto $\mathcal{C}(X)$, and orthonormal complements can be obtained by extending $Q$ to an orthogonal basis of $\mathbb{R}^n$ [5]. QR-based projectors are numerically stable and well suited for high-dimensional or nearly singular problems.
Second, when the residual space is one-dimensional (for example, when $\rho = n - 1$), the cross-product provides a compact alternative. It produces a normal vector $n$ directly, from which the residual projector is derived:
$$I - P = n n^\top.$$
This follows without forming or inverting $X^\top X$ [7]. In higher-codimension settings ($\dim \mathcal{C}(X)^{\perp} > 1$), cross-products can still be used to generate individual null directions, but a full residual basis is obtained more naturally from $\mathcal{N}(X^\top)$ and orthonormalised via QR.
When null directions are generated sequentially via cross-products and accumulated into a multivector basis that is subsequently orthonormalised using QR, we refer to the resulting construction as the Recursive Cross–Wedge–QR method.
These geometric primitives—orthonormal bases, cross-products, wedge products, Gram matrices, and polar sine—constitute the toolkit on which the residual projection framework developed in the following sections is built.
Throughout, $s$ denotes the sketch size (the number of random probe vectors) used in the Gaussian range-finding procedure described below.

2.5.0.1. Sketch–QR Baseline (Reproducibility)

The Sketch–QR baseline follows a standard randomized range-finding procedure. Given $X \in \mathbb{R}^{n \times p}$ and a sketch size $s$, we draw a Gaussian sketching matrix $\Omega \in \mathbb{R}^{p \times s}$ with i.i.d. $\mathcal{N}(0, 1)$ entries and form $Y = X \Omega \in \mathbb{R}^{n \times s}$. We then compute a thin QR factorization $Y = \tilde{Q} R$, where $\tilde{Q} \in \mathbb{R}^{n \times s}$ has orthonormal columns. The approximate projector is $\tilde{P} = \tilde{Q} \tilde{Q}^\top$, and the approximate residual is $\tilde{r} = (I - \tilde{P})\, y$. Unless stated otherwise, we use a single sketch without power iterations or oversampling [6,16].
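The following is a short NumPy sketch of this baseline under the stated assumptions (a single Gaussian sketch, no power iterations or oversampling); the function name and the toy check are ours and do not come from the released code [13].

```python
import numpy as np

def sketch_qr_residual(X, y, s, seed=42):
    """Approximate residual via a Gaussian range finder (Sketch-QR baseline).

    X : (n, p) design matrix, y : (n,) response, s : sketch size.
    Returns r_tilde = (I - Q Q^T) y, where Q spans the range of X @ Omega.
    """
    rng = np.random.default_rng(seed)
    n, p = X.shape
    Omega = rng.standard_normal((p, s))    # Gaussian sketching matrix
    Z = X @ Omega                           # sketched range sample, shape (n, s)
    Q, _ = np.linalg.qr(Z)                  # thin QR: Q has orthonormal columns
    return y - Q @ (Q.T @ y)                # apply (I - Q Q^T) without forming P

# Toy check against the exact OLS residual (full-rank X, s = p recovers C(X) exactly).
rng = np.random.default_rng(1)
X = rng.standard_normal((200, 10))
y = rng.standard_normal(200)
r_exact = y - X @ np.linalg.lstsq(X, y, rcond=None)[0]
r_tilde = sketch_qr_residual(X, y, s=10)
cos = r_exact @ r_tilde / (np.linalg.norm(r_exact) * np.linalg.norm(r_tilde))
print(round(cos, 6))
```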

3. Core Theory

3.1. Rank-One Residual Projection

Assume that $X \in \mathbb{R}^{n \times p}$ has rank $n - 1$. Then $\dim \mathcal{C}(X) = n - 1$ and $\dim \mathcal{C}(X)^{\perp} = 1$. There exists a (unique up to sign) unit vector $n \in \mathbb{R}^n$ such that
$$X^\top n = 0, \qquad \|n\| = 1.$$
Proposition 1 (Rank-One Residual Projection).
Let $X \in \mathbb{R}^{n \times p}$ have $\rho = n - 1$, and let $P$ denote the orthogonal projector onto $\mathcal{C}(X)$ (for example, $P = X X^{+}$ in terms of the Moore–Penrose pseudoinverse $X^{+}$). Then the orthogonal residual projector is
$$I - P = n n^\top.$$
Proof. 
Because $\mathcal{C}(X)^{\perp}$ is one-dimensional, any residual $r \in \mathcal{C}(X)^{\perp}$ can be written as $r = \alpha n$ for some scalar $\alpha$. Since $n \perp \mathcal{C}(X)$ and $P$ is the orthogonal projector onto $\mathcal{C}(X)$, we have $P n = 0$ and therefore $(I - P)\, n = n$. For any $Y \in \mathbb{R}^n$, we may write $Y = u + v$ with $u \in \mathcal{C}(X)$ and $v \in \mathcal{C}(X)^{\perp} = \operatorname{span}\{n\}$. Then
$$(I - P)\, Y = (I - P)(u + v) = v,$$
while
$$n n^\top Y = n n^\top (u + v) = n n^\top u + n n^\top v = 0 + v = v,$$
because $n^\top u = 0$ and $v$ is a scalar multiple of $n$. Thus $(I - P)\, Y = n n^\top Y$ for all $Y$, and therefore $I - P = n n^\top$, which is symmetric and idempotent.    □

3.2. Generalization to n Dimensions

The rank-one projector $I - P = n n^\top$ extends naturally to the general setting in which the predictor matrix $X \in \mathbb{R}^{n \times p}$ has $\rho < n$. In this case the orthogonal complement $\mathcal{C}(X)^{\perp}$ has dimension $k = n - \rho$, and the residual is no longer confined to a single direction but lies in a $k$-dimensional subspace.
Theorem 2 (Multivector Residual Projection).
Let $X$ have rank $\rho < n$ and set $k := n - \rho$. Let $N \in \mathbb{R}^{n \times k}$ have orthonormal columns spanning $\mathcal{C}(X)^{\perp}$ (so $N^\top N = I_k$). Then the residual projector is $I - P = N N^\top$ and $r = (I - P)\, y = N N^\top y$.
Proof. 
Since $N^\top N = I_{n - \rho}$, the matrix $N N^\top$ is symmetric and satisfies
$$(N N^\top)^2 = N (N^\top N) N^\top = N I_{n - \rho} N^\top = N N^\top,$$
so $N N^\top$ is an orthogonal projector. Its range equals $\mathcal{C}(N) = \mathcal{C}(X)^{\perp}$.
On the other hand, $P$ is the orthogonal projector onto $\mathcal{C}(X)$, so $I - P$ is the orthogonal projector onto $\mathcal{C}(X)^{\perp}$. For any $Y \in \mathbb{R}^n$, write $Y = u + v$ with $u \in \mathcal{C}(X)$ and $v \in \mathcal{C}(X)^{\perp}$. Then
$$(I - P)\, Y = v.$$
Since $v \in \mathcal{C}(N)$ and the columns of $N$ are orthonormal, we also have
$$N N^\top Y = N N^\top (u + v) = N N^\top u + N N^\top v = 0 + v = v,$$
because $N^\top u = 0$ and $N N^\top$ acts as the identity on $\mathcal{C}(N)$. Thus $(I - P)\, Y = N N^\top Y$ for all $Y$, and hence $I - P = N N^\top$.    □
Figure 1 illustrates the situation in a simple three-dimensional example. The predictor space occupies a lower-dimensional subspace (blue), while the residual lies in its orthogonal complement (orange), spanned by multiple orthonormal directions when $\dim \mathcal{C}(X)^{\perp} > 1$.

3.2.1. Geometric Multicollinearity Index (GMI)

The multivector formulation of the residual projector reflects the angular configuration of the columns of X. When the predictors are nearly linearly dependent, the volume of the parallelotope spanned by these columns collapses, signaling instability. To capture this behavior geometrically, we introduce the Geometric Multicollinearity Index (GMI), a scale-invariant measure based on the polar sine.
Definition 1 (Polar Sine and GMI).
Let $X = [x_1, \ldots, x_p] \in \mathbb{R}^{n \times p}$ have nonzero columns $x_j \neq 0$, and let $\operatorname{psin}(x_1, \ldots, x_p)$ be the polar sine defined in Equation (16). We adopt the shorthand
$$\operatorname{psin}(X) := \operatorname{psin}(x_1, \ldots, x_p),$$
and define the Geometric Multicollinearity Index by
$$\operatorname{GMI}(X) = 1 - \operatorname{psin}(X).$$
Lemma 3 (Basic properties).
Let $X = [x_1, \ldots, x_p] \in \mathbb{R}^{n \times p}$ have nonzero columns, and let $G = X^\top X$ denote its Gram matrix. Then:
(i) $\operatorname{psin}(X) \in [0, 1]$ and hence $\operatorname{GMI}(X) \in [0, 1]$;
(ii) $\operatorname{psin}(X) = 1$ (and $\operatorname{GMI}(X) = 0$) if and only if the columns of $X$ are mutually orthogonal;
(iii) $\operatorname{psin}(X) = 0$ (and $\operatorname{GMI}(X) = 1$) whenever the columns are linearly dependent or $p > n$.
Proof. 
By Hadamard’s inequality,
$$\det(G) \le \prod_{j=1}^{p} \|x_j\|^2,$$
with equality if and only if the columns of $X$ are mutually orthogonal. Substituting this bound into the polar-sine definition Equation (16) yields
$$0 \le \operatorname{psin}(X) = \frac{\sqrt{\det(G)}}{\|x_1\| \cdots \|x_p\|} \le 1,$$
which proves (i). Moreover, $\operatorname{psin}(X) = 1$ holds precisely when equality in Hadamard’s inequality holds, i.e., when the columns are pairwise orthogonal, giving (ii).
If the columns of $X$ are linearly dependent or $p > n$, then $G$ is singular and $\det(G) = 0$, so by definition $\operatorname{psin}(X) = 0$ in that case, and (iii) follows immediately. The corresponding statements for $\operatorname{GMI}(X)$ are then direct consequences of the definition $\operatorname{GMI}(X) = 1 - \operatorname{psin}(X)$.    □
Lemma 4 (Scale invariance).
Let $A = \operatorname{diag}(a_1, \ldots, a_p)$ with each $a_j \neq 0$. Then $\operatorname{psin}(X A) = \operatorname{psin}(X)$ and hence $\operatorname{GMI}(X A) = \operatorname{GMI}(X)$.
Proof. 
We have $(X A)^\top (X A) = A G A$ and $\det(A G A) = \det(A)^2 \det(G)$, while $\|a_j x_j\| = |a_j|\, \|x_j\|$ for each $j$. The scale factors cancel in the quotient (17), so $\operatorname{psin}(X A) = \operatorname{psin}(X)$, and (18) yields the claim for GMI.    □
Remark 1.
For orthonormal predictors, $\operatorname{psin}(X) = 1$ and $\operatorname{GMI}(X) = 0$. For nearly collinear predictors, $\det(G) \approx 0$, so $\operatorname{psin}(X) \approx 0$ and $\operatorname{GMI}(X) \approx 1$. Intermediate angular configurations (e.g., vectors separated by $45^\circ$) yield values such as $\operatorname{psin} \approx 0.71$ and $\operatorname{GMI} \approx 0.29$, as illustrated in Figure 2. In this sense, GMI is a normalized version of determinant-based multicollinearity measures (such as the generalized variance $\det(G)$ or the determinant of the correlation matrix), rescaled to lie in $[0, 1]$ and made explicitly scale-invariant via the polar-sine normalization.
GMI provides a direct geometric diagnostic that complements classical VIF-type measures. Whereas VIF assesses how much the variance of each individual regression coefficient is inflated by linear dependence among the predictors, GMI treats the entire predictor set as a single geometric object and summarizes its angular degeneracy in a single number in [ 0 , 1 ] . Because it depends only on the angular configuration of the predictors and is invariant under independent rescaling of the columns, it can be compared meaningfully across differently scaled datasets.
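As an illustration, a minimal NumPy sketch of the polar sine and GMI computed directly from the Gram determinant (Equations (16)–(18)) is given below; the function names are ours, and for severely ill-conditioned designs the log-domain variant of Section 4.3 should be preferred.

```python
import numpy as np

def polar_sine(X):
    """psin(X) = sqrt(det(X^T X)) / prod_j ||x_j||, with psin = 0 for singular Gram matrices."""
    X = np.asarray(X, dtype=float)
    G = X.T @ X
    detG = max(np.linalg.det(G), 0.0)        # clip tiny negative round-off
    norms = np.linalg.norm(X, axis=0)
    return np.sqrt(detG) / np.prod(norms)

def gmi(X):
    """Geometric Multicollinearity Index: GMI(X) = 1 - psin(X)."""
    return 1.0 - polar_sine(X)

# Orthogonal columns give GMI = 0; nearly collinear columns give GMI close to 1.
X_orth = np.array([[1.0, 0.0], [0.0, 2.0], [0.0, 0.0]])
X_coll = np.array([[1.0, 1.0], [1.0, 1.001], [1.0, 0.999]])
print(round(gmi(X_orth), 3), round(gmi(X_coll), 3))
```

Note that scaling a column (e.g., the factor 2 in the orthogonal example) leaves the value unchanged, consistent with Lemma 4.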

3.2.2. Uniqueness of the Residual Projection

The multivector formulation $I - P = N N^\top$ expresses the residual projector in terms of any orthonormal basis of the orthogonal complement $\mathcal{C}(X)^{\perp}$. The next result confirms that this projector is independent of the specific basis chosen.
Remark 2 (Independence of the residual projector from the basis).
Let $X \in \mathbb{R}^{n \times p}$ have rank $\rho < n$, and let $N, \tilde{N} \in \mathbb{R}^{n \times (n - \rho)}$ have orthonormal columns spanning $\mathcal{C}(X)^{\perp}$. Then there exists an orthogonal $Q \in \mathbb{R}^{(n - \rho) \times (n - \rho)}$ such that $\tilde{N} = N Q$, and hence
$$\tilde{N} \tilde{N}^\top = N Q Q^\top N^\top = N N^\top.$$
In particular, the representation $I - P = N N^\top$ from Theorem 2 is independent of the chosen orthonormal basis of $\mathcal{C}(X)^{\perp}$: it is simply the (unique) orthogonal projector onto $\mathcal{C}(X)^{\perp}$.

3.3. Generalized Projection in Rank-Deficient Spaces via Null-Spaces and Wedge Products

When the design matrix $X$ is rank-deficient or when $n - \rho > 1$, the orthogonal complement $\mathcal{C}(X)^{\perp}$ has dimension greater than one, and a single normal vector, as in the rank-one case, is no longer sufficient to describe the residual direction. In such settings, a full orthonormal basis of $\mathcal{C}(X)^{\perp}$ is required. This subsection explains how standard null-space computations, combined with QR orthonormalization, yield such a basis, and how wedge products provide a coordinate-free way to reason about independence and volume.
Definition 2 (Residual Projection in the Rank-Deficient Case).
Let $X \in \mathbb{R}^{n \times p}$ have rank $\rho < n$, and set $k = n - \rho = \dim \mathcal{C}(X)^{\perp}$. If $N \in \mathbb{R}^{n \times k}$ has orthonormal columns spanning $\mathcal{C}(X)^{\perp}$, then the residual projector is
$$I - P = N N^\top.$$
Proposition 5 (Limitation of the cross-product).
The cross-product of Belhaouari [7] produces a single vector orthogonal to $n - 1$ independent vectors in $\mathbb{R}^n$. This suffices when $\dim \mathcal{C}(X)^{\perp} = 1$, but for $n - \rho > 1$ it yields only one null direction. Additional directions are needed to span the entire residual space.
Proof. 
If $n - \rho = 1$, then $\mathcal{C}(X)^{\perp}$ is one-dimensional, and a single cross-product applied to a basis of $\mathcal{C}(X)$ recovers the unique (up to sign) normal vector $n$, giving $I - P = n n^\top$ as in the rank-one case. When $n - \rho > 1$, the orthogonal complement has dimension $k > 1$, so multiple linearly independent null directions are required. A single cross-product produces at most one such direction and therefore cannot span a $k$-dimensional space. Additional vectors must be constructed, for instance, via null-space computations and subsequent orthonormalization.    □
Theorem 6 (Construction of a Residual Basis via null-space and QR).
Let $X \in \mathbb{R}^{n \times p}$ have rank $\rho$ and set $k = n - \rho$, so that $\mathcal{C}(X)^{\perp} = \mathcal{N}(X^\top)$ has dimension $k$. Then there exists an orthonormal basis $N \in \mathbb{R}^{n \times k}$ of $\mathcal{C}(X)^{\perp}$ such that
$$I - P = N N^\top,$$
where $P$ is the orthogonal projector onto $\mathcal{C}(X)$. Moreover, such an $N$ can be obtained by the null-space + QR procedure in Algorithm 1.
Proof. 
By construction, the columns of $B$ span $\mathcal{N}(X^\top)$, which is equal to $\mathcal{C}(X)^{\perp}$. If $b_1 \wedge \cdots \wedge b_k \neq 0$ then the $b_i$'s are linearly independent and form a basis of this $k$-dimensional subspace. The thin QR factorization $B = Q R$ with orthonormal columns in $Q$ then produces $N := Q$ with $N^\top N = I_k$ and $\mathcal{C}(N) = \mathcal{C}(B) = \mathcal{C}(X)^{\perp}$. The matrix $N N^\top$ is therefore the orthogonal projector onto $\mathcal{C}(X)^{\perp}$, so $I - P = N N^\top$ by the uniqueness of the orthogonal projector onto a fixed subspace.    □
Algorithm 1 Residual Projector via Null-Space and QR (Explicit SVD/RRQR)
[Algorithm pseudocode reproduced as an image in the original. Summary of the steps described in Theorem 6 and Remarks 3–4: compute a basis $B$ of $\mathcal{N}(X^\top)$ via SVD or rank-revealing QR; verify independence ($b_1 \wedge \cdots \wedge b_k \neq 0$); orthonormalize $B$ by a thin QR factorization to obtain $N$; return $I - P = N N^\top$.]
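Because the pseudocode above is reproduced as an image, the following NumPy sketch of the same null-space + QR construction (the SVD route of Remark 4) is included for readability; it is our rendering under the stated tolerance convention, not the authors' verbatim Algorithm 1.

```python
import numpy as np

def residual_basis(X, tol=1e-12):
    """Orthonormal basis N of C(X)^perp = N(X^T) via SVD, as in Remark 4(a).

    Returns N with N^T N = I_k and X^T N = 0, so that I - P = N N^T.
    """
    X = np.asarray(X, dtype=float)
    # SVD of X^T: the null space of X^T is spanned by trailing right singular vectors.
    _, sigma, Vt = np.linalg.svd(X.T, full_matrices=True)
    rank = int(np.sum(sigma >= tol * (sigma[0] if sigma.size else 1.0)))
    B = Vt[rank:, :].T                      # (n, n - rank) basis of N(X^T)
    N, _ = np.linalg.qr(B)                  # thin QR orthonormalization
    return N

def geometric_residual(X, y, tol=1e-12):
    """Residual r = (I - P) y computed as N (N^T y), without inverting X^T X."""
    N = residual_basis(X, tol)
    return N @ (N.T @ y)

# Rank-deficient example: the geometric residual matches the pseudoinverse-based one.
rng = np.random.default_rng(0)
X = rng.standard_normal((50, 6))
X[:, 5] = X[:, 0] + X[:, 1]                 # force rank deficiency (rank 5)
y = rng.standard_normal(50)
r_pinv = y - X @ (np.linalg.pinv(X) @ y)
print(np.allclose(geometric_residual(X, y), r_pinv))
```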
Remark 3.
In practice, the null-space basis in Step 1 is typically obtained from a numerically stable factorization such as SVD or a rank-revealing QR. The wedge product need not be computed explicitly; it serves primarily as a geometric way to interpret the independence check. When $\dim \mathcal{C}(X)^{\perp} = 1$, the construction reduces to the rank-one case, where a normal vector can be obtained either from the null-space of $X^\top$ or by applying a cross-product to a basis of $\mathcal{C}(X)$. The numerical tolerance used for rank determination (e.g., $\tau = 10^{-12}$ relative to the largest singular value) follows standard practice and does not affect the conceptual geometric results, which hold exactly at the algebraic level.
Remark 4 (Implementation note: computing N ( X ) in floating-point arithmetic).
The instruction “compute a basis of N ( X ) ” requires (i) a numerical rank decision and (ii) an explicit extraction rule. In this work we use either:
(a) SVD. Compute $X^\top = U \Sigma V^\top$, choose the numerical rank $\rho = \#\{ i : \sigma_i \ge \tau \sigma_1 \}$, and set $B = V(:, \rho + 1 : n)$.
(b) RRQR. Compute a column-pivoted QR factorization $X^\top \Pi = Q R$, choose the rank $\rho$ from the diagonal of $R$ (e.g., $|R_{ii}| \ge \tau |R_{11}|$), partition $R = \begin{pmatrix} R_{11} & R_{12} \\ 0 & R_{22} \end{pmatrix}$, and set $B = \Pi \begin{pmatrix} -R_{11}^{-1} R_{12} \\ I_{n - \rho} \end{pmatrix}$.
SVD is typically more robust under severe ill-conditioning, while RRQR is often faster when pivoted QR is sufficient. Row-reduction is used only for symbolic/toy demonstrations.
Figure 3 illustrates the residual subspace and two independent null directions before orthonormalization.

3.4. Projection Method Variants and Complexity

Definition 3 (Variants of Residual Projection).
Let $X \in \mathbb{R}^{n \times p}$ be a real-valued design matrix of rank $\rho < n$. The residual projection operator $I - P$ onto the orthogonal complement $\mathcal{C}(X)^{\perp}$ can be constructed by several exact procedures, each suited to different regimes of dimension, conditioning, and interpretability.
The following variants highlight that null-space, wedge-based, and determinant-based approaches provide geometrically transparent alternatives to inversion-based formulas in linear regression.
Proposition 7 (Wedge–QR Null-Space Variant).
Let $X \in \mathbb{R}^{n \times p}$ have rank $\rho$, and let $k = n - \rho$. Using the null-space + QR construction of Theorem 6, we obtain an orthonormal residual basis $N \in \mathbb{R}^{n \times k}$ spanning $\mathcal{C}(X)^{\perp}$ and hence the projector $I - P = N N^\top$. If, in addition, we compute a thin QR factorization $X = Q R$, then the Gram matrix satisfies
$$G = X^\top X = R^\top R, \qquad \det(G) = \det(R)^2,$$
so the squared volume of the parallelotope spanned by the columns of $X$ is given directly by $\det(R)^2$. The general complexity of forming $N$ and $R$ is $O(n p^2 + n k)$.
Proof. 
The existence of an orthonormal basis $N$ of $\mathcal{C}(X)^{\perp}$ and the identity $I - P = N N^\top$ follow from Theorem 6. The factorization $X = Q R$ implies $G = X^\top X = R^\top R$, so $\det(G) = \det(R)^2$, which yields the stated volume formula. The complexity estimate is inherited from the null-space + QR construction.    □
When geometric interpretations involving volume are needed, such as evaluating multicollinearity through the Geometric Multicollinearity Index (GMI), factorizations of the Gram matrix are particularly convenient.
Proposition 8 (Cholesky-Based Volume and Projection).
Let $G = X^\top X$ and assume that $X$ has full column rank so that $G \succ 0$. Compute the Cholesky factorization $G = L L^\top$, which yields
$$\det(G) = \det(L)^2 = \prod_i L_{ii}^2.$$
Let $N$ be any orthonormal basis of $\mathcal{C}(X)^{\perp}$ constructed as in Theorem 6. Then $I - P = N N^\top$ is the orthogonal residual projector. The dominant cost is $O(p^3 + n k)$: $O(p^3)$ for the Cholesky factorization and $O(n k)$ for forming and applying $N$.
Proof. 
The positive definiteness of $G$ ensures the existence of a unique Cholesky factor $L$ with positive diagonal entries, and the determinant formula follows from $\det(G) = \det(L)^2 = \prod_i L_{ii}^2$. The construction of $N$ and the identity $I - P = N N^\top$ is provided by Theorem 6; any such $N$ produces an orthogonal projector onto $\mathcal{C}(X)^{\perp}$.    □
In the special case where the residual subspace is one-dimensional, the projector admits an especially simple representation.
Proposition 9 (cross-product (Rank-One Case)).
If $\dim \mathcal{C}(X)^{\perp} = 1$, then
$$I - P = n n^\top,$$
where $n \in \mathbb{R}^n$ satisfies $X^\top n = 0$ and $\|n\| = 1$. The one-time cost of building $n$ is $O(n^2)$, and each subsequent projection $(I - P)\, Y$ costs $O(n)$.
Proof. 
When $\mathcal{C}(X)^{\perp}$ is one-dimensional, any nonzero vector in this space is a scalar multiple of a unit normal $n$. The orthogonal projector onto the line spanned by $n$ is precisely $n n^\top$. Symmetry and idempotence follow from $\|n\| = 1$, and orthogonality is ensured by $X^\top n = 0$.    □
Example 1.
Consider $n = 4$, $p = 3$, and residual codimension $k = n - p = 1$. In this codimension-one setting, the classical OLS residual projector based on $(X^\top X)^{-1}$ (assuming that $X$ has full column rank) involves a matrix inversion together with several matrix–vector multiplications, which require on the order of a few hundred floating-point operations in this small example. By contrast, computing a unit normal vector $n$ via the geometric cross-product costs $O(n^2)$ operations, and each application of the rank-one projector $n n^\top$ to a new response $Y$ costs only $O(n)$ operations. Once $n$ has been constructed, the projection of additional responses is therefore substantially cheaper than recomputing residuals through $(X^\top X)^{-1}$, illustrating the efficiency of the geometric formulation in symbolic or resource-limited regimes.
Unless stated otherwise, we use a single Gaussian sketch without power iterations or oversampling; orthonormalization is performed via a standard thin QR factorization.
Algorithm 2: Sketch–QR Baseline (Gaussian Range Finder)
[Algorithm pseudocode reproduced as an image in the original. Summary of the procedure described in Section 2.5: draw a Gaussian sketching matrix $\Omega \in \mathbb{R}^{p \times s}$; form $Y = X \Omega$; compute a thin QR factorization $Y = \tilde{Q} R$; return the approximate projector $\tilde{P} = \tilde{Q} \tilde{Q}^\top$ and residual $\tilde{r} = (I - \tilde{P})\, y$.]
Table 1 compares four exact ways to build the residual projector $(I - P)$ for a given design matrix $X$. Classical OLS and the Cholesky-based variant both work through $(X^\top X)^{-1}$ and treat the residual subspace only implicitly, with a cubic dependence on the number of predictors $p$ and the restriction that $X^\top X$ be invertible. In contrast, the Wedge–QR and rank-one cross-product methods construct $(I - P)$ from an explicit basis of $\mathcal{C}(X)^{\perp}$; the former handles general rank-deficient cases, while the latter is tailored to the codimension-one setting and achieves the lowest per-response cost when many responses must be projected.
In summary, Wedge–QR and Cholesky-based constructions are well suited when one needs both an exact residual projector and access to volume-based quantities such as $\det(X^\top X)$ or the GMI. The rank-one cross-product formulation is preferable whenever $\dim \mathcal{C}(X)^{\perp} = 1$, as it produces a simple geometric projector $n n^\top$ with linear complexity per response and, when $p \approx n$, is usually the cheapest option for repeatedly projecting many responses. For general rank-deficient designs, a QR/null-space construction offers a numerically safer alternative to explicitly forming and inverting $X^\top X$. Together, the four variants in Table 1 provide a practical menu of exact geometric projectors that avoid unnecessary matrix inversion and can be chosen according to the rank of $X$, the conditioning of $X^\top X$, and the available computational budget.
We emphasize that the exact residual projector $I - P$ can be constructed via several algebraically equivalent routes (normal-equations OLS when applicable, rank-one cross-products, null-space/QR, wedge-based constructions, and Cholesky-based variants), which differ only in numerical assumptions and computational cost; see Table 1.

4. Illustrative Numerical Examples

When the design matrix $X$ is rank-deficient, the normal equations $X^\top X \beta = X^\top Y$ do not admit a unique solution. In all rank-deficient experiments, we therefore compute the minimum-norm least-squares solution using the Moore–Penrose pseudoinverse, i.e., $\hat{\beta} = X^{+} Y$, and refer to this baseline as OLS (Moore–Penrose pseudoinverse).
Evaluation protocol. Predictive performance is assessed using $K$-fold cross-validation with $K = 5$, reporting CV MSE and CV $R^2$. Unless stated otherwise, all randomized procedures use a fixed random seed (seed = 42) to ensure reproducibility.
Coefficient-stability summary. Coefficient stability is evaluated via B = 100 perturbation trials with the design matrix X held fixed. In each trial, the response is perturbed as
$$\tilde{y}^{(b)} = y + \eta^{(b)}, \qquad \eta^{(b)} \sim \mathcal{N}\!\left(0, \sigma_y^2 I\right), \qquad \sigma_y = 0.01 \cdot \operatorname{std}(y), \qquad b = 1, \ldots, B.$$
The model is refit for each trial to obtain coefficients $\hat{\beta}^{(b)}$. Let $\mathrm{SD}_j$ denote the sample standard deviation of the $j$th coefficient $\{ \hat{\beta}_j^{(b)} \}_{b=1}^{B}$ across trials. We summarize coefficient stability by
$$\operatorname{medianSD}(\hat{\beta}) := \operatorname{median}_{j = 1, \ldots, p}(\mathrm{SD}_j),$$
i.e., the median across coefficients of their across-trial standard deviations.
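A compact sketch of this perturbation protocol is shown below, under the stated settings ($B = 100$, $\sigma_y = 0.01 \cdot \operatorname{std}(y)$, pseudoinverse-based refitting); the synthetic data and function name are ours, and dataset loading is omitted.

```python
import numpy as np

def median_sd(X, y, B=100, scale=0.01, seed=42):
    """medianSD(beta_hat): median over coefficients of their across-trial standard deviations."""
    rng = np.random.default_rng(seed)
    Xp = np.linalg.pinv(X)                      # minimum-norm LS fit: beta = X^+ y
    sigma_y = scale * np.std(y)
    betas = np.empty((B, X.shape[1]))
    for b in range(B):
        y_tilde = y + rng.normal(0.0, sigma_y, size=y.shape)   # perturbed response
        betas[b] = Xp @ y_tilde                                # refit on perturbed y
    sd_per_coef = betas.std(axis=0, ddof=1)     # SD_j across the B trials
    return np.median(sd_per_coef)

# Example: stability degrades when two predictors become nearly collinear.
rng = np.random.default_rng(3)
X = rng.standard_normal((200, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.standard_normal(200)
X_coll = X.copy()
X_coll[:, 2] = X_coll[:, 1] + 1e-3 * rng.standard_normal(200)  # near-collinear pair
print(median_sd(X, y), median_sd(X_coll, y))
```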
To validate the geometric projector, we verify agreement between the classical OLS residual $r_{\mathrm{OLS}} = (I - P)\, y$ and the geometric residual $r_{\mathrm{geo}} = N N^\top y$ by monitoring $\| r_{\mathrm{OLS}} - r_{\mathrm{geo}} \|_2$ up to numerical tolerance.
Sketch–QR [6] yields an approximate basis $\tilde{Q}$ and projector $\tilde{P} = \tilde{Q} \tilde{Q}^\top$, producing the approximate residual $\tilde{r} = (I - \tilde{P})\, y$. Agreement with the exact residual is summarized using the cosine similarity $\cos(r_{\mathrm{OLS}}, \tilde{r})$ and the residual discrepancy $\| \tilde{r} - r_{\mathrm{OLS}} \|_2$ (and, when reported, a principal-angle distortion between $\mathcal{C}(X)$ and $\mathcal{C}(\tilde{Q})$).

4.1. Rank-One Residual Projection in $\mathbb{R}^3$

This example shows explicitly how the cross-product normal reproduces the ordinary least-squares (OLS) residual in the rank-one case, in line with Proposition 1.
We consider a rank-one residual setting where the orthogonal complement of the column space is one-dimensional. Let
$$X = \begin{pmatrix} 1 & 2 \\ 1 & 3 \\ 1 & 5 \end{pmatrix}, \qquad Y = \begin{pmatrix} 1 \\ 2 \\ 11/2 \end{pmatrix},$$
with $X \in \mathbb{R}^{3 \times 2}$ of full column rank (so $X^\top X$ is invertible). The ordinary least-squares projection matrix is
$$P = X (X^\top X)^{-1} X^\top, \qquad r_{\mathrm{OLS}} = (I - P)\, Y.$$
A direct calculation yields
$$r_{\mathrm{OLS}} = \begin{pmatrix} 3/14 \\ -9/28 \\ 3/28 \end{pmatrix}, \qquad r_{\mathrm{OLS}} \in \mathcal{C}(X)^{\perp}.$$
To recover the same residual in purely geometric form, we construct a unit normal vector to the column space of $X$. Let
$$x_1 = \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix}, \quad x_2 = \begin{pmatrix} 2 \\ 3 \\ 5 \end{pmatrix}, \quad n_0 = x_1 \times x_2 = \begin{pmatrix} 2 \\ -3 \\ 1 \end{pmatrix}, \quad n = \frac{n_0}{\| n_0 \|} = \frac{1}{\sqrt{14}} \begin{pmatrix} 2 \\ -3 \\ 1 \end{pmatrix}.$$
Then $X^\top n = 0$ and $\| n \| = 1$, so Proposition 1 applies, and
$$I - P = n n^\top.$$
The residual can therefore be written as the rank-one projection
$$r = (I - P)\, Y = n n^\top Y = (n^\top Y)\, n = \begin{pmatrix} 3/14 \\ -9/28 \\ 3/28 \end{pmatrix},$$
which coincides with $r_{\mathrm{OLS}}$ and lies on the line spanned by $n$.
From a computational point of view, this example highlights the appeal of the geometric formulation in low dimensions: a cross-product and a scalar–vector multiplication are sufficient to obtain the residual, avoiding explicit matrix inversion. Once $n$ is known, applying the projector $(I - P) = n n^\top$ to a new response vector reduces to one dot product and one scaled copy of $n$, i.e., $O(n)$ work per residual.
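The computation can be checked numerically with the following short script (our verification sketch, not part of the paper's code release):

```python
import numpy as np

X = np.array([[1.0, 2.0],
              [1.0, 3.0],
              [1.0, 5.0]])
Y = np.array([1.0, 2.0, 5.5])

# Classical OLS residual via the pseudoinverse.
r_ols = Y - X @ (np.linalg.pinv(X) @ Y)

# Geometric residual via the cross-product normal.
n0 = np.cross(X[:, 0], X[:, 1])          # (2, -3, 1)
n = n0 / np.linalg.norm(n0)
r_geo = (n @ Y) * n

print(np.allclose(r_ols, r_geo))          # True: the two residuals coincide
print(r_geo * 28)                         # [6, -9, 3], i.e. (3/14, -9/28, 3/28)
```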

4.2. Multivector Residual Projection in $\mathbb{R}^4$

This example illustrates the multivector residual projection via an orthonormal basis of $\mathcal{C}(X)^{\perp}$ as in Theorem 6, and links it to the Geometric Multicollinearity Index.
We now consider a multivector setting in which the residual subspace has dimension greater than one. Let
$$X = \begin{pmatrix} 1 & 1 \\ 1 & 2 \\ 1 & 3 \\ 1 & 4 \end{pmatrix}, \qquad Y = \begin{pmatrix} 1 \\ 2 \\ 3 \\ 5 \end{pmatrix}, \qquad X \in \mathbb{R}^{4 \times 2}, \quad \rho = 2.$$
The ordinary least-squares projector is again
$$P = X (X^\top X)^{-1} X^\top, \qquad r_{\mathrm{OLS}} = (I - P)\, Y,$$
and a straightforward computation gives
$$r_{\mathrm{OLS}} = \begin{pmatrix} 1/5 \\ -1/10 \\ -2/5 \\ 3/10 \end{pmatrix}.$$
Since $\rho = 2$, the orthogonal complement has dimension $\dim \mathcal{C}(X)^{\perp} = 2$. A basis for the null-space of $X^\top$ is
$$n_1 = \begin{pmatrix} 1 \\ -2 \\ 1 \\ 0 \end{pmatrix}, \qquad n_2 = \begin{pmatrix} 2 \\ -3 \\ 0 \\ 1 \end{pmatrix}, \qquad X^\top n_i = 0, \quad i = 1, 2.$$
Applying Gram–Schmidt yields an orthonormal basis
$$u_1 = \frac{1}{\sqrt{6}} \begin{pmatrix} 1 \\ -2 \\ 1 \\ 0 \end{pmatrix}, \qquad u_2 = \frac{1}{\sqrt{30}} \begin{pmatrix} 2 \\ -1 \\ -4 \\ 3 \end{pmatrix}, \qquad N = [\, u_1 \ \ u_2 \,] \in \mathbb{R}^{4 \times 2},$$
with $N^\top N = I_2$ and $X^\top N = 0$. The general residual projection theorem then gives
$$I - P = N N^\top, \qquad r_{\mathrm{geom}} = N N^\top Y.$$
Evaluating $r_{\mathrm{geom}}$ for this $X$ and $Y$ reproduces the OLS residual:
$$r_{\mathrm{geom}} = r_{\mathrm{OLS}} = \begin{pmatrix} 1/5 \\ -1/10 \\ -2/5 \\ 3/10 \end{pmatrix},$$
making explicit that the residual lives in the two-dimensional subspace spanned by $u_1$ and $u_2$ and realizing the null-space + QR construction of Theorem 6 in a concrete low-dimensional setting.
From a complexity point of view, building $N$ from a null-space basis followed by QR (or Gram–Schmidt) is a one-off cost of order $O(n p^2 + n k)$ for $X \in \mathbb{R}^{n \times p}$ with residual dimension $k = n - \rho$, while applying the projector $(I - P) = N N^\top$ to a new response vector requires only $O(n k)$ operations.
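An analogous numeric check for this multivector example (again our own verification sketch) is:

```python
import numpy as np

X = np.array([[1.0, 1.0],
              [1.0, 2.0],
              [1.0, 3.0],
              [1.0, 4.0]])
Y = np.array([1.0, 2.0, 3.0, 5.0])

# Orthonormal basis N of C(X)^perp from the null-space vectors n1, n2 given above.
B = np.array([[1.0, 2.0],
              [-2.0, -3.0],
              [1.0, 0.0],
              [0.0, 1.0]])                 # columns n1, n2 with X^T n_i = 0
N, _ = np.linalg.qr(B)                     # Gram-Schmidt / thin QR

r_ols = Y - X @ (np.linalg.pinv(X) @ Y)
r_geo = N @ (N.T @ Y)

print(np.allclose(r_ols, r_geo))           # True: multivector projection matches OLS
print(r_geo * 10)                          # [2, -1, -4, 3], i.e. (1/5, -1/10, -2/5, 3/10)
```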

4.2.0.1. GMI for the $\mathbb{R}^4$ Example

For this design matrix $X = [x_1, x_2]$, the Gram matrix is
$$G = X^\top X = \begin{pmatrix} 4 & 10 \\ 10 & 30 \end{pmatrix}, \qquad \det(G) = 20.$$
The column norms are $\| x_1 \| = 2$ and $\| x_2 \| = \sqrt{30}$, so the polar sine and the Geometric Multicollinearity Index are
$$\operatorname{psin}(x_1, x_2) = \frac{\sqrt{\det(X^\top X)}}{\| x_1 \| \, \| x_2 \|} = \frac{\sqrt{20}}{2 \sqrt{30}} = \frac{\sqrt{6}}{6}, \qquad \operatorname{GMI} = 1 - \frac{\sqrt{6}}{6} \approx 0.59.$$
In this small example, the value $\operatorname{GMI} \approx 0.59$ indicates a moderate degree of collinearity between the two regressors; the residual subspace $\mathcal{C}(X)^{\perp}$ has dimension two (as $n - \rho = 2$), and, compared to the orthogonal case ($\operatorname{GMI} = 0$), the fitted coefficients are more sensitive to perturbations of $Y$.
This example isolates the effect of controlled collinearity on GMI in a minimal symbolic setting.
Consider the family of matrices
$$X_\varepsilon = \begin{pmatrix} 1 & 0 \\ 0 & 1 \\ \varepsilon & \varepsilon \end{pmatrix}, \qquad \varepsilon \in [0, 1].$$
For every $\varepsilon \in [0, 1]$ the two columns are linearly independent, so $X_\varepsilon^\top X_\varepsilon$ is positive definite, and the polar sine is well-defined. A direct computation gives
$$\operatorname{psin}(X_\varepsilon) = \frac{\sqrt{1 + 2 \varepsilon^2}}{1 + \varepsilon^2},$$
and hence
$$\operatorname{GMI}(\varepsilon) = 1 - \frac{\sqrt{1 + 2 \varepsilon^2}}{1 + \varepsilon^2}.$$
As ε increases, the columns become more collinear, psin decreases, and GMI increases, reflecting the growing multicollinearity in a purely geometric way. In families where columns eventually become exactly collinear, the Gram determinant vanishes, the polar sine drops to zero, and GMI reaches 1, in agreement with the rank-deficient discussion in Section 3.
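The closed form can be checked numerically with a few lines (our sketch):

```python
import numpy as np

def gmi(X):
    """GMI(X) = 1 - sqrt(det(X^T X)) / prod of column norms."""
    G = X.T @ X
    psin = np.sqrt(max(np.linalg.det(G), 0.0)) / np.prod(np.linalg.norm(X, axis=0))
    return 1.0 - psin

for eps in (0.0, 0.25, 0.5, 1.0):
    X_eps = np.array([[1.0, 0.0],
                      [0.0, 1.0],
                      [eps, eps]])
    closed_form = 1.0 - np.sqrt(1.0 + 2.0 * eps**2) / (1.0 + eps**2)
    # The Gram-determinant computation agrees with the closed form GMI(eps).
    print(eps, round(gmi(X_eps), 6), round(closed_form, 6))
```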

4.2.0.2. Large-Scale Multivector Residual Projection

To complement the low-dimensional symbolic example above, we consider a larger synthetic design in which the residual subspace has dimension greater than one and the multivector formulation $I - P = N N^\top$ is essential.
We generate a rank-deficient design matrix $X \in \mathbb{R}^{n \times p}$ with $n = 1000$, $p = 50$, and $\rho = 45$, so that $\dim \mathcal{C}(X)^{\perp} = n - \rho = 955$. All predictors are standardized, and the response is constructed so that the exact residual $r_{\mathrm{exact}}$ has a prescribed norm, ensuring a controlled comparison between methods.
Table 2 reports residual norms, cosine similarity to the exact residual, and run times for the Article 1 projector variants only. The exact constructions (null-space/QR, rank-one cross-product when applicable, and the Moore–Penrose pseudoinverse baseline for OLS) reproduce the same residual up to numerical precision, confirming the multivector identity $I - P = N N^\top$ in a high-dimensional setting.
Sketch–QR yields an approximate residual, whose agreement with the exact residual is summarized by cosine similarity. Entries marked NaN or N/A indicate methods that are not applicable in this regime (e.g., rank-one cross-product when $\dim \mathcal{C}(X)^{\perp} \neq 1$), while FAIL indicates that the required algebraic conditions (e.g., $X^\top X \succ 0$) are violated.
This experiment demonstrates that the geometric residual formulation extends naturally from symbolic low-dimensional examples to realistic large-n, rank-deficient designs, without relying on regularization or nonlinear modeling.
Together with the symbolic $\mathbb{R}^4$ example, this large-scale synthetic experiment confirms that the multivector residual projector $N N^\top$ is exact across dimensions, while Sketch–QR provides a principled approximation when a reduced basis is desired, setting the stage for the real-data illustrations in Section 4.3.

4.3. Illustrative Real Data Use of GMI

Real-data experiments use two regression benchmarks: Boston Housing and Auto MPG. Both are standardized via centering and scaling before forming X, and they provide realistic multicollinearity patterns suitable for illustrating GMI as a purely geometric, response-independent diagnostic.
This example illustrates how GMI can serve as a purely geometric multicollinearity diagnostic on real data and how it can be interpreted alongside coefficient stability.
We use the Boston Housing dataset because it is a widely used regression benchmark with well-known predictor correlations, making it a convenient setting for illustrating multicollinearity diagnostics and residual geometry on realistic data.
To illustrate the geometric diagnostics on real data, we use the classical Boston Housing dataset, with median house price as the response and a set of standardized socio-economic and structural predictors. The protocol used in this subsection is as follows:
(a)
Standardise all predictors to zero mean and unit variance, and form the design matrix X.
(b)
For selected subsets of predictors, form the corresponding design matrix $X$, compute the Gram matrix $G = X^\top X$, and evaluate the polar sine and GMI using the definitions in (18).
(c)
Report predictive metrics using 5-fold cross-validation (CV) so that MSE and $R^2$ are out-of-sample summaries rather than in-sample fit statistics.
(d)
Fit ordinary least-squares models on these subsets and monitor the variation of the estimated coefficients under small perturbations of Y (e.g., additive Gaussian noise or resampling).
The purpose of this subsection is to demonstrate (i) that the geometric residual projector $(I - P) = N N^\top$ reproduces the classical OLS residuals on a realistic dataset, and (ii) that GMI provides a compact, response-independent summary of predictor-space degeneracy.

4.3.0.3. Boston Housing and Auto MPG: Increasing Model Size

Table 3 and Table 4 report results for increasing model size ($p = 1$, $p = 4$, $p = 6$, and the full model where applicable). For each subset, we report out-of-sample predictive metrics using 5-fold cross-validation (CV MSE and CV $R^2$), geometric multicollinearity (GMI), and classical diagnostics (max VIF and $\kappa(X^\top X)$), along with a coefficient-stability summary.
We restrict algorithmic comparisons to the projector variants defined in Table 1 (OLS/Cholesky when applicable, null-space/QR, rank-one cross-product when codim = 1 , and Sketch–QR).
Table 3 and Table 4 show that increasing the model size improves predictive performance (lower CV MSE, higher CV $R^2$) while typically increasing multicollinearity. In Boston, moving from $p = 1$ to the full model reduces CV MSE from 44.02 to 23.67 and increases CV $R^2$ from 0.467 to 0.712, while GMI increases from 0 to 0.988 and $\kappa(X^\top X)$ increases from 1 to 96.47. In Auto MPG the collinearity is even stronger: the full model attains CV MSE $\approx$ 11.31 and CV $R^2 \approx 0.810$, but the corresponding diagnostics are GMI $\approx$ 0.980, max VIF $\approx$ 21.84, and $\kappa(X^\top X) \approx$ 136.84.
These results illustrate the intended role of GMI: it is computed purely from X (independent of Y) and tracks familiar multicollinearity measures (VIF and condition number), while the coefficient-stability summary provides a concrete link between predictor-space degeneracy and sensitivity of fitted parameters.

4.3.0.4. Implementation Note (Stable GMI Computation)

When the Gram matrix $G = X^\top X$ is poorly conditioned, direct evaluation of determinants is numerically fragile. We therefore compute the polar sine and GMI in the log-domain with a small jitter $\varepsilon > 0$:
$$\log \operatorname{psin}_\varepsilon(G) = \tfrac{1}{2} \log \det(G + \varepsilon I) - \tfrac{1}{2} \sum_{i=1}^{p} \log(G_{ii} + \varepsilon), \qquad \operatorname{GMI}_\varepsilon(G) = 1 - \exp\!\bigl( \log \operatorname{psin}_\varepsilon(G) \bigr).$$
In implementation, $\log \det(G + \varepsilon I)$ is evaluated using a numerically stable log-determinant routine (equivalently, a Cholesky factorization when $G + \varepsilon I \succ 0$). Unless stated otherwise, we use $\varepsilon = 10^{-12}$.
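A minimal sketch of this log-domain computation, using NumPy's stable log-determinant and the jitter $\varepsilon = 10^{-12}$ stated above (function name ours), is:

```python
import numpy as np

def gmi_stable(X, eps=1e-12):
    """GMI via the log-domain polar sine: robust for ill-conditioned Gram matrices."""
    G = X.T @ X
    p = G.shape[0]
    sign, logdet = np.linalg.slogdet(G + eps * np.eye(p))   # stable log-determinant
    if sign <= 0:                         # numerically singular even after the jitter
        return 1.0
    log_psin = 0.5 * logdet - 0.5 * np.sum(np.log(np.diag(G) + eps))
    return 1.0 - np.exp(log_psin)

# Severely collinear design: the Gram determinant is vanishingly small,
# but the log-domain evaluation remains well behaved.
rng = np.random.default_rng(0)
base = rng.standard_normal(500)
X = np.column_stack([base + 1e-8 * rng.standard_normal(500) for _ in range(8)])
print(round(gmi_stable(X), 6))            # close to 1, signalling geometric collapse
```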
Figure 4 makes the equivalence of the exact geometric projector and the classical OLS residuals explicit: the discrepancy curve for the exact QR-basis construction is at numerical noise level in both the $p = 1$ and $p = 4$ models. In contrast, Sketch–QR constructs an approximate projector $\tilde{P}$, so its residual $\tilde{r}$ differs slightly from $r_{\mathrm{OLS}}$; the magnitude of this discrepancy decreases as the sketch size $s$ increases. Overall, this visualization complements Table 3 and Table 4 by separating the exact residual-geometry identities (Article 1 methods) from the approximate Sketch–QR construction while keeping the presentation compact and reproducible.

5. Discussion

This paper presents a rank-aware geometric interpretation of linear regression residuals via orthogonal projectors. In the case of codimension one, the residual is fully determined by a unit normal $n \in \mathcal{C}(X)^{\perp}$, resulting in $r = (I - P)\, Y = n n^\top Y$; in general rank settings, the residual lies in a higher-dimensional orthogonal complement spanned by an orthonormal basis $N$, with $(I - P) = N N^\top$. This formulation makes the residual subspace explicit and decouples basis construction from repeated projection.
An immediate consequence is that all exact computational routes (normal-equation OLS, Cholesky-based solvers, and null-space or QR constructions) must yield identical residuals up to numerical tolerance, as they implement the same projector onto $\mathcal{C}(X)$. We confirm this numerically through residual agreement metrics and discrepancy visualizations, showing that differences arise only from conditioning and the computational regime. In contrast, Sketch–QR yields an approximate projector and residual, whose deviation from the exact solution is quantified using cosine similarity and norm-based discrepancies.
We further link residual geometry to predictor-space conditioning through the Geometric Multicollinearity Index (GMI), a scale-invariant, normalized volume measure derived from the polar sine. Across the Boston Housing and Auto MPG benchmarks, higher GMI values consistently align with larger max VIF and $\kappa(X^\top X)$, while coefficient-stability summaries illustrate increased parameter sensitivity under geometric degeneracy.
Although both regression and PCA rely on orthogonal projections, their objectives differ fundamentally: PCA projects predictors to minimize reconstruction error, with residuals remaining in $\mathcal{C}(X)$, whereas regression projects the response, yielding residuals in $\mathcal{C}(X)^{\perp}$.
Finally, this work focuses on unregularized least squares and exact orthogonal projection. With regularization, the operators are no longer orthogonal and the residual subspace becomes non-Euclidean or data-adaptive, motivating future work on regularized and sketch-aware residual geometry while preserving the geometric diagnostics enabled by GMI.

6. Conclusions

This paper presented a rank-aware geometric interpretation of linear regression residuals. In the case of codimension one, the residual projector reduces to the rank-one form $(I - P) = n n^\top$, where $n$ spans $\mathcal{C}(X)^{\perp}$. For general rank, the residual lies in a $k = n - \rho$ dimensional orthogonal complement, with projector $(I - P) = N N^\top$ for an orthonormal basis $N$ of $\mathcal{C}(X)^{\perp}$. Although algebraically equivalent to the classical OLS formulation, this representation makes the residual subspace explicit and decouples basis construction from repeated projection.
Building on this viewpoint, we introduced the Geometric Multicollinearity Index (GMI), a scale-invariant diagnostic derived from the polar sine and Gram determinant that quantifies predictor-space degeneracy. Synthetic experiments confirm predictable behavior under controlled perturbations, while real-data benchmarks on Boston Housing and Auto MPG show that increasing model size typically improves predictive performance while increasing multicollinearity; GMI captures this trade-off and aligns with max VIF and $\kappa(X^\top X)$. Across all experiments, exact projector constructions yield identical residuals up to numerical tolerance, whereas Sketch–QR produces a controlled approximation whose deviation is quantified via cosine similarity and residual discrepancy norms.
In general, these results elevate C ( X ) to a central geometric object in regression analysis and position GMI as a concise indicator of departure from orthogonal designs. Extensions to sketch-based, large-scale, streaming, and nonlinear settings are deferred to companion work.

Author Contributions

Conceptualization, M.H.A. and S.B.B.; methodology, M.H.A.; formal analysis, M.H.A.; writing—original draft preparation, M.H.A.; writing—review and editing, M.H.A. and S.B.B.; supervision, S.B.B. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Publicly available datasets were analysed in this study. The Boston Housing dataset is available from Kaggle [12] and was originally introduced by Harrison and Rubinfeld [14]. The Auto MPG dataset is available from the UCI Machine Learning Repository [15]. No new data were created. The code used to reproduce the experiments and tables in this manuscript is publicly available as a GitHub repository [13].

Acknowledgments

The authors gratefully acknowledge the academic support of Hamad Bin Khalifa University and Lusail University. They also thank Qatar National Library (QNL) for access to scholarly resources and research databases essential to this study.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:
OLS Ordinary Least Squares
GMI Geometric Multicollinearity Index
PCA Principal Component Analysis
VIF Variance Inflation Factor
CV Cross-Validation (K-fold)
QR QR (Orthogonal–Triangular) Decomposition
RRQR Rank-Revealing QR Decomposition
SVD Singular Value Decomposition
FLOPs Floating-Point Operations
GRP Geometric Residual Projection

References

  1. Strang, G. Introduction to Linear Algebra; Wellesley–Cambridge Press: Wellesley, MA, USA, 1993. [Google Scholar]
  2. Hastie, T.; Tibshirani, R.; Friedman, J. The Elements of Statistical Learning, 2nd ed.; Springer: New York, NY, USA, 2009. [Google Scholar] [CrossRef]
  3. Lay, D.C.; Lay, S.R.; McDonald, J.J. Linear Algebra and Its Applications, 5th ed.; Pearson: Boston, MA, USA, 2016. [Google Scholar]
  4. Jolliffe, I.T.; Cadima, J. Principal component analysis: A review and recent developments. Philos. Trans. R. Soc. A 2016, 374, 20150202. [Google Scholar] [CrossRef] [PubMed]
  5. Golub, G.H.; Van Loan, C.F. Matrix Computations, 4th ed.; Johns Hopkins University Press: Baltimore, MD, USA, 2013. [Google Scholar]
  6. Mahoney, M.W. Randomized algorithms for matrices and data. Found. Trends Mach. Learn. 2011, 3, 123–224. [Google Scholar] [CrossRef]
  7. Belhaouari, S.; Kahalan, Y.C.; Aksikas, I.; Hamdi, A.; Belhaouari, I.; Haoudi, E.M.N.; Bensmail, H. Generalizing the cross-product to N Dimensions: A Novel Approach for Multidimensional Analysis and Applications. Mathematics 2025, 13, 514. [Google Scholar] [CrossRef]
  8. Mortari, D. n-Dimensional cross-product and its application to matrix eigenanalysis. J. Guid. Control Dyn. 1996, 20, 509–515. [Google Scholar] [CrossRef]
  9. Walsh, B. The scarcity of cross-products on Euclidean spaces. Am. Math. Mon. 1967, 74, 188–194. [Google Scholar] [CrossRef]
  10. Kim, J.H. Multicollinearity and misleading statistical results. Korean J. Anesthesiol. 2019, 72, 558–569. [Google Scholar] [CrossRef] [PubMed]
  11. Chan, J.Y.-L.; Leow, S.M.H.; Bea, K.T.; Cheng, W.K.; Phoong, S.W.; Hong, Z.-W.; Chen, Y.-L. Mitigating the multicollinearity problem and its machine learning approach: A review. Mathematics 2022, 10, 1283. [Google Scholar] [CrossRef]
  12. Kaggle. Boston Housing Dataset. Available online: https://www.kaggle.com/datasets/altavish/boston-housing-dataset (accessed on 21 November 2025).
  13. Alkhateeb, M.H.; Belhaouari, S.B. Geometric Residual Projection (GRP): Reproducible Code and Experiments; GitHub repository, 2025. Available online: https://github.com/MaisAlkhateeb/Geometrics-Residual-Projection (accessed on 16 January 2026).
  14. Harrison, D.; Rubinfeld, D.L. Hedonic prices and the demand for clean air. J. Environ. Econ. Manag. 1978, 5, 81–102. [Google Scholar] [CrossRef]
  15. UCI Machine Learning Repository. Auto MPG Data Set. University of California, Irvine, School of Information and Computer Sciences, 1993. Available online: https://archive.ics.uci.edu/dataset/9/auto+mpg (accessed on 16 January 2026).
  16. Halko, N.; Martinsson, P.-G.; Tropp, J.A. Finding structure with randomness: Probabilistic algorithms for constructing approximate matrix decompositions. SIAM Rev. 2011, 53, 217–288. [Google Scholar] [CrossRef]
Figure 1. Geometric interpretation of the residual projection in a three-dimensional example with $\dim \mathcal{C}(X) = 1$ and $\dim \mathcal{C}(X)^{\perp} = 2$. The design matrix $X$ has a one-dimensional column space $\mathcal{C}(X)$ (blue line), while the residual space $\mathcal{C}(X)^{\perp}$ is a two-dimensional plane (orange), spanned by an orthonormal basis $n_1, n_2$. For a response vector $Y$, the residual $r = (I - P)\, Y = N N^\top Y$ is its orthogonal projection onto $\mathcal{C}(X)^{\perp}$.
Figure 2. Geometric Multicollinearity Index (GMI) as a function of the polar sine. The index is zero for orthonormal predictors, increases steadily as angular separation decreases, and approaches one under near-collinearity.
Figure 3. Two-dimensional schematic of the residual subspace generated by the cross–wedge–QR construction. The vectors $x_1$ and $x_2$ span the predictor space $\mathcal{C}(X)$, while $n_1$ and $n_2$ indicate two independent directions in the orthogonal complement $\mathcal{C}(X)^{\perp}$ prior to QR orthonormalization.
Figure 3. Two-dimensional schematic of the residual subspace generated by the cross–wedge–QR construction. The vectors x 1 and x 2 span the predictor space C ( X ) , while n 1 and n 2 indicate two independent directions in the orthogonal complement C ( X ) prior to QR orthonormalization.
Preprints 196389 g003
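The construction sketched in Figure 3 can be realized in several ways. The following Python sketch shows one plausible recursive cross–wedge–QR realization for a full-column-rank design (an assumption for this illustration; the paper's exact recursion may differ): each new normal direction is the generalized cross product of the predictor columns, the normals found so far, and random fill vectors, and a final QR step orthonormalizes the collected directions.

```python
import numpy as np

def generalized_cross(rows):
    """Generalized cross product of n-1 vectors in R^n (the rows of an (n-1) x n
    array): returns a vector orthogonal to every row, via the cofactor expansion
    w_i = (-1)^i det(rows with column i deleted)."""
    m, n = rows.shape
    assert m == n - 1, "need exactly n-1 vectors in R^n"
    return np.array([(-1) ** i * np.linalg.det(np.delete(rows, i, axis=1))
                     for i in range(n)])

def residual_basis_cross_wedge_qr(X, rng):
    """Sketch of a cross/wedge + QR construction of an orthonormal basis of
    C(X)^perp for full-column-rank X. Each new direction is orthogonal to the
    predictor columns and to all previously found normals by construction;
    a final QR turns the collected directions into an orthonormal basis N."""
    n, p = X.shape
    normals = []
    while len(normals) < n - p:
        fill = rng.standard_normal((n - 1 - p - len(normals), n))
        rows = np.vstack([X.T, np.asarray(normals).reshape(-1, n), fill])
        normals.append(generalized_cross(rows))
    N, _ = np.linalg.qr(np.column_stack(normals))
    return N

rng = np.random.default_rng(3)
X = rng.standard_normal((6, 2))                 # n = 6, p = 2, so k = 4
N = residual_basis_cross_wedge_qr(X, rng)
print(np.allclose(N.T @ X, 0, atol=1e-8))       # every column of N is orthogonal to C(X)
```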
Figure 4. Residual agreement diagnostics on Boston Housing (standardized). For each model, we plot the pointwise absolute residual discrepancy $|r_{\mathrm{method}} - r_{\mathrm{OLS}}|$ across observations. Exact projector variants (null-space/QR and, when applicable, Cholesky/OLS) agree with OLS up to numerical precision, while Sketch–QR (Gaussian sketch, $s = 150$) yields a small but nonzero approximation error. For readability, we visualize discrepancies rather than overlaying multiple residual traces.
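For the approximate variant in Figure 4, one plausible reading of "Sketch–QR (Gaussian sketch, $s = 150$)", in the spirit of randomized least squares [16], is sketch-and-solve: compress the problem with an $s \times n$ Gaussian sketch, solve the small least-squares problem via QR, and measure the pointwise discrepancy against the exact OLS residual. The sketch below uses synthetic stand-in data of the same shape as Boston Housing; it illustrates the diagnostic, not the paper's exact pipeline.

```python
import numpy as np

rng = np.random.default_rng(4)

# Synthetic stand-in for a standardized design of the Boston Housing shape.
n, p, s = 506, 13, 150
X = rng.standard_normal((n, p))
Y = X @ rng.standard_normal(p) + rng.standard_normal(n)

# Exact OLS residual (reference).
beta_ols, *_ = np.linalg.lstsq(X, Y, rcond=None)
r_ols = Y - X @ beta_ols

# Assumed "Sketch-QR" variant: solve min ||S(X beta - Y)|| with a Gaussian
# sketch S of size s x n, via QR of the sketched design SX.
S = rng.standard_normal((s, n)) / np.sqrt(s)
Q, R = np.linalg.qr(S @ X)
beta_sk = np.linalg.solve(R, Q.T @ (S @ Y))
r_sk = Y - X @ beta_sk

# Pointwise discrepancy |r_method - r_OLS|, as visualized in Figure 4.
disc = np.abs(r_sk - r_ols)
print(disc.max(), disc.mean())      # small but nonzero approximation error
```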
Table 1. Exact residual projection variants for $X \in \mathbb{R}^{n \times p}$ with rank $\rho$ and residual dimension $k = n - \rho$. The "one-off" cost refers to constructing the projector, while the "per-response" cost refers to applying the residual mapping $(I - P)Y$ to a new response vector $Y$. Methods involving $(X^{\top}X)^{-1}$ assume that $X$ has full column rank, so that $X^{\top}X$ is invertible. Notes: Classical OLS and the Cholesky-based variant treat the residual subspace implicitly. The QR/null-space basis method also applies to rank-deficient $X$ and explicitly constructs a basis $N$ of $C(X)^{\perp}$. The rank-one cross-product method applies when $\dim C(X)^{\perp} = 1$ and is particularly efficient once the normal $n$ is known.

| Method | Residual $(I - P)Y$ | One-off cost | Per-response cost |
| Classical OLS (via $(X^{\top}X)^{-1}$) | $(I - X(X^{\top}X)^{-1}X^{\top})Y$ | $O(np^2 + p^3)$ | $O(np)$ |
| QR / null-space basis | $NN^{\top}Y$ | $O(np^2 + nk)$ | $O(nk)$ |
| Cholesky-based (when $X^{\top}X \succ 0$) | $(I - X(X^{\top}X)^{-1}X^{\top})Y$ | $O(p^3)$ (form $X^{\top}X$ and $LL^{\top}$) | $O(np)$ |
| Rank-one cross-product ($\dim C(X)^{\perp} = 1$) | $nn^{\top}Y$ | $O(n^2)$ (construct $n$) | $O(n)$ |
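As a companion to Table 1, the following minimal Python sketch (NumPy/SciPy assumed; not drawn from the repository [13]) realizes the Cholesky-based and rank-one cross-product variants on a small codimension-one example and checks both against the classical OLS residual.

```python
import numpy as np
from scipy.linalg import cho_factor, cho_solve

rng = np.random.default_rng(0)

# Codimension-one design: n = 3 observations, p = 2 full-rank predictors.
X = rng.standard_normal((3, 2))
Y = rng.standard_normal(3)

# Classical OLS residual via the pseudoinverse (reference solution).
r_ols = Y - X @ (np.linalg.pinv(X) @ Y)

# Cholesky-based variant: factor X'X = LL' and solve instead of inverting.
c, low = cho_factor(X.T @ X)              # requires X'X positive definite
beta = cho_solve((c, low), X.T @ Y)
r_chol = Y - X @ beta

# Rank-one cross-product variant: with dim C(X)^perp = 1, the unit normal n
# spans the residual space and (I - P)Y = n n' Y, with no matrix inversion.
n = np.cross(X[:, 0], X[:, 1])            # ordinary cross product in R^3
n /= np.linalg.norm(n)
r_cross = n * (n @ Y)

print(np.allclose(r_chol, r_ols), np.allclose(r_cross, r_ols))
```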
Table 2. Large-scale synthetic multivector residual projection (core methods). The design matrix has $n = 1000$, $p = 50$, $\rho = 45$, and residual dimension $\dim C(X)^{\perp} = 955$. All predictors are standardized. Exact projector variants reproduce the same residual up to numerical precision, while Sketch–QR produces a controlled approximation whose agreement is summarized by cosine similarity. Entries marked as FAIL indicate that the algebraic assumptions of the method (e.g., positive definiteness of $X^{\top}X$ for Cholesky-based constructions) are violated, rather than an algorithmic failure.

| # | Method | $\lVert r \rVert_2$ | $\cos(r, r_{\mathrm{exact}})$ | Runtime (s) | Status |
| 1 | OLS (Moore–Penrose pseudoinverse) | 0.5092 | 1.000 | 0.0002 | OK |
| 2 | Cross-product (rank-one) | NaN | NaN | NaN | N/A |
| 3 | Multivector QR (Exact) | 0.5092 | 1.000 | 0.0000 | OK |
| 4 | Recursive Cross–Wedge–QR | 0.5092 | 1.000 | 0.0008 | OK |
| 5 | Wedge–QR (+GMI) | 0.5092 | 1.000 | 0.0000 | OK |
| 6 | Cholesky–QR | NaN | NaN | 0.0016 | FAIL |
| 7 | Sketch–QR (Gaussian, $s = 150$) | 0.5092 | 1.000 | 0.0111 | OK |

Dataset diagnostics: $\operatorname{cond}(X) \approx 6.28 \times 10^{13}$, $\kappa(X^{\top}X) \approx 3.95 \times 10^{15}$, $\mathrm{GMI}(X) = 1.000$.
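A reduced version of the Table 2 comparison can be scripted directly. The sketch below generates a standardized rank-deficient design of the same shape ($n = 1000$, $p = 50$, rank 45), computes the exact residual via the Moore–Penrose pseudoinverse and via an explicit orthonormal basis of $C(X)^{\perp}$, and reports the cosine-similarity agreement; the numerical values will differ from Table 2 because the data are regenerated rather than taken from the paper's experiment.

```python
import numpy as np
from scipy.linalg import null_space

rng = np.random.default_rng(5)

# Rank-deficient synthetic design: n = 1000, p = 50, rank 45.
n, p, rho = 1000, 50, 45
X = rng.standard_normal((n, rho)) @ rng.standard_normal((rho, p))
X = (X - X.mean(axis=0)) / X.std(axis=0)        # standardized predictors
Y = rng.standard_normal(n)

# Exact residual via the Moore-Penrose pseudoinverse.
r_pinv = Y - X @ (np.linalg.pinv(X) @ Y)

# Exact residual via an explicit orthonormal basis N of C(X)^perp.
N = null_space(X.T)                              # shape (n, n - rank(X))
r_null = N @ (N.T @ Y)

# Both exact variants agree: cosine similarity is 1 up to numerical precision.
cos = r_null @ r_pinv / (np.linalg.norm(r_null) * np.linalg.norm(r_pinv))
print(N.shape[1], round(float(cos), 6), round(float(np.linalg.norm(r_null)), 4))
```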
Table 3. Boston Housing illustration (standardized predictors). Predictive metrics are reported using 5-fold cross-validation (CV MSE and CV $R^2$). Geometric diagnostics (GMI) are response-independent and depend only on the design matrix $X$. We additionally report max VIF and $\kappa(X^{\top}X)$, and summarize coefficient stability by median $\mathrm{SD}(\hat{\beta})$ under resampling/perturbation, to connect GMI with classical multicollinearity measures.

| Model | p | CV MSE | CV $R^2$ | GMI | max VIF | $\kappa(X^{\top}X)$ | median $\mathrm{SD}(\hat{\beta})$ |
| $p = 1$ (Area/RM; response Price/MEDV) | 1 | 44.02 | 0.467 | 0.000 | 1.00 | 1.00 | 0.342 |
| $p = 4$ (Area/RM, TAX, PTRATIO, LSTAT) | 4 | 28.02 | 0.655 | 0.434 | 2.09 | 7.85 | 0.257 |
| $p = 6$ (Area/RM, LSTAT, PTRATIO, TAX, NOX, INDUS) | 6 | 28.20 | 0.653 | 0.790 | 3.25 | 17.22 | 0.266 |
| Full model (all predictors) | 13 | 23.67 | 0.712 | 0.988 | 9.01 | 96.47 | 0.235 |
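The classical diagnostics reported in Table 3 (and, with the same protocol, Table 4 below) can be computed as in the following minimal scikit-learn sketch on synthetic stand-in data of the Boston Housing shape. In particular, bootstrap resampling is an illustrative choice for the resampling/perturbation protocol behind median $\mathrm{SD}(\hat{\beta})$, and GMI itself is computed from the design matrix as defined earlier in the paper (not reproduced here).

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score
from sklearn.preprocessing import StandardScaler
from sklearn.utils import resample

rng = np.random.default_rng(6)

# Synthetic stand-in; in the paper X, y are the standardized Boston Housing
# predictors and the Price/MEDV response.
n, p = 506, 13
X = rng.standard_normal((n, p))
y = X @ rng.standard_normal(p) + rng.standard_normal(n)
X = StandardScaler().fit_transform(X)

# 5-fold cross-validated MSE and R^2.
cv_mse = -cross_val_score(LinearRegression(), X, y, cv=5,
                          scoring="neg_mean_squared_error").mean()
cv_r2 = cross_val_score(LinearRegression(), X, y, cv=5, scoring="r2").mean()

# Classical multicollinearity diagnostics: max VIF (diagonal of the inverse
# correlation matrix for standardized predictors) and condition number of X'X.
max_vif = np.diag(np.linalg.inv(np.corrcoef(X, rowvar=False))).max()
kappa = np.linalg.cond(X.T @ X)

# Coefficient stability: median standard deviation of OLS coefficients
# over bootstrap resamples of the rows (illustrative protocol).
betas = []
for _ in range(200):
    Xb, yb = resample(X, y)
    betas.append(LinearRegression().fit(Xb, yb).coef_)
median_sd = np.median(np.std(np.array(betas), axis=0))

print(round(cv_mse, 2), round(cv_r2, 3), round(max_vif, 2),
      round(kappa, 2), round(median_sd, 3))
```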
Table 4. Auto MPG illustration (standardized predictors), using the same preprocessing and evaluation protocol as in Section 4.3. Predictive metrics are reported using 5-fold cross-validation (CV MSE and CV $R^2$). Geometric diagnostics (GMI) are response-independent and depend only on the design matrix $X$. We additionally report max VIF and $\kappa(X^{\top}X)$, and summarize coefficient stability by median $\mathrm{SD}(\hat{\beta})$ under resampling/perturbation.

| Model | p | CV MSE | CV $R^2$ | GMI | max VIF | $\kappa(X^{\top}X)$ | median $\mathrm{SD}(\hat{\beta})$ |
| $p = 1$ model (Auto) | 1 | 18.91 | 0.682 | 0.000 | 1.00 | 1.00 | 0.118 |
| $p = 3$ engine-core (Auto) | 3 | 18.22 | 0.693 | 0.843 | 10.31 | 45.06 | 0.367 |
| $p = 6$ model (Auto) | 6 | 12.05 | 0.797 | 0.973 | 19.64 | 117.03 | 0.325 |
| Full model (Auto) | 7 | 11.31 | 0.810 | 0.980 | 21.84 | 136.84 | 0.355 |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
Copyright: This open access article is published under a Creative Commons CC BY 4.0 license, which permits free download, distribution, and reuse, provided that the author and preprint are cited in any reuse.