A Global Spatio-Temporal Relationship- and Dynamic Friendship-Aware POI Recommendation Method

Xiaoyu Ji; Yibing Cao; Jiangshui Zhang; Minjie Chen; Pengyu Cui; Yuan Yang

doi:10.20944/preprints202606.1956.v1

Submitted:

25 June 2026

Posted:

26 June 2026


Abstract

Point-of-interest (POI) recommendation in location-based social networks (LBSNs) predicts a user's next visit from check-in records. Geographic and social contexts are critical factors in user decision-making and are therefore incorporated to improve recommendation quality. However, current methods suffer from two limitations: geographical modeling is confined by distance thresholds and ignores long-range spatio-temporal transitions, and social graphs remain static and fail to capture dynamic behavioral similarities among unconnected users. To address these gaps, we propose GSTRDFA (Global Spatio-Temporal Relationship- and Dynamic Friendship-Aware), a novel model comprising three layers. First, we construct spatio-temporal knowledge graphs (STKGs) that encode four relationship types: global spatio-temporal and local geospatial POI-POI links, dynamic user-user friendships, and static social ties. Second, four dedicated encoders, namely STSEncoder (spatio-temporal state embedding), GeoEncoder (geographical convolution), DFEncoder (graph attention network), and SocEncoder (GraphSAGE), propagate and aggregate user and POI embeddings along these STKG relations. Third, a GRU-based sequence predictor uses the fused embeddings to match candidate POIs to the user. Evaluations on Foursquare datasets (NYC, JK, CA) show that GSTRDFA outperforms existing methods, improving Acc@1/5/10 and MRR by 2.79-6.67%. Key contributions include: (1) unifying spatial, temporal, and dynamic social signals via STKGs; (2) jointly modeling global spatio-temporal transitions and dynamic friendships; and (3) enabling balanced short-/long-range and short-/long-term transition prediction.

Keywords: point-of-interest recommendation; spatio-temporal knowledge graph; spatio-temporal state embedding (STSE); global spatio-temporal relationship modeling; dynamic friendship modeling

Subject:

Environmental and Earth Sciences - Other

1. Introduction

Driven by breakthroughs in artificial intelligence and ubiquitous sensing technologies, LBSNs have evolved into comprehensive application platforms that integrate spatial perception and intelligent decision-making, delivering more precise location-based services to users [1,2]. As the core component of LBSNs, POI recommendation captures how diverse factors in real-time scenarios influence user decision-making, and its performance directly determines the user experience and commercial viability of LBSN applications such as Foursquare and Gowalla [3].

Traditional POI recommendation approaches model check-ins as user-based or POI-based matrices and apply matrix factorization (MF) [4,5] to obtain the embeddings of users and POIs, or directly predict the embedding of the next POI with sequence models such as the hidden Markov model (HMM) [6], diffusion models (DM) [7,8,9], LSTM [10,11,12], Transformer [13,14,15], or LLMs [16,17]. A user's decision is typically the outcome of a complex interplay of subjective and objective factors, especially geospatial relations (e.g., nearby POIs are more likely to be chosen by the current user) and social links (e.g., socially linked users tend to visit the same POI at certain "agreed-upon" times). Nevertheless, the above methods either rely solely on check-in data or model other factors only implicitly. They reflect user preferences and habits but ignore the significant influence of other relational data, such as POI-POI geospatial relationships and user-user social links, on the final decision [18,19,20].

Knowledge graphs (KGs) can explicitly transform disordered and unstructured relational data into coherent, interpretable graph representations, which integrate efficiently with Graph Neural Networks (GNNs) for relation propagation and vector space learning [21]. As a result, an increasing number of studies model relational data as knowledge graphs to achieve more accurate predictions. Representative approaches include LBSN2Vec++ [22], Graph-Flashback [23], DSGNN [24], GKGMSTC [25], KGNext [26], and SNPM [27]. While KGs have driven advances in POI recommendation, two critical challenges remain unresolved: (1) geospatial distance constraints on POI-POI relationships: geographical modeling is limited by distance thresholds and neglects long-range spatio-temporal transitions; (2) the non-real-time nature of user-user social links: static social graphs fail to capture dynamic behavioral similarities among non-connected users.

To this end, we propose an STKG-enhanced model, GSTRDFA (short for Global Spatio-Temporal Relationship- and Dynamic Friendship-Aware), whose main innovations are illustrated in Figure 1. To tackle the first gap, we design two types of POI-POI relationships: global spatio-temporal and local geospatial. Inspired by STSE [28], we construct a spatio-temporal state representation space based on time and geospatial ranges and project the (POI, time) tuple of each check-in record to the corresponding spatio-temporal state, so that the global spatio-temporal relationships between POIs are represented indirectly via the relationships among spatio-temporal states. A local POI-POI graph based on geospatial proximity is also constructed to preserve the constraints governing short-distance transitions. To bridge the second gap, we establish two types of user-user relationships: dynamic friendship and static social link. We construct dynamic user-user friendship subgraphs among users who visit the same POI within a given time interval, capturing spatio-temporal proximity from users who share short-term interest similarity with the current user. A static user-user social graph is also built to preserve the constraints of long-term agreements among users. Four graph encoders, namely an STSEncoder based on STSE (Spatio-Temporal State Embedding), a GeoEncoder based on Geographical Convolution (GeoConv), a DFEncoder based on GAT (Graph Attention Network), and a SocEncoder based on GraphSAGE (Graph Sample and AggregatE), are designed to propagate and aggregate the embeddings of users and POIs along the four relationships, respectively. A sequence predictor based on GRU (Gated Recurrent Unit) follows the encoders and generates the matching errors between the candidate POIs and the current user. The contributions of this paper are summarized as follows:

STKG, a prior knowledge representation architecture in which four types of relationships (global spatio-temporal POI-POI relationships, local geospatial POI-POI relationships, dynamic user-user friendships, and static user-user social links) are organized to represent prior geographical and social knowledge;
A global spatio-temporal relationship-aware encoder, shown to be effective in trading off short- and long-range transitions, and a dynamic friendship-aware encoder, shown to be effective in balancing short- and long-term agreements;
GSTRDFA, an STKG-enhanced model that simultaneously captures global/local geospatial relationships between POIs and dynamic/static friendship relationships between users, shown to achieve a 2.79-6.67% improvement in Acc@1/5/10 and MRR.

2. Related Work

GNNs have been established as a powerful framework for jointly modeling complex contextual relationships in POI recommendation systems [29,30]. STGCN [29] pioneered the integration of GCNs (Graph Convolutional Networks) with spatio-temporal dynamic modeling by constructing a heterogeneous spatio-temporal graph encompassing four types of relational structures: User-POI, POI-POI, POI-Region, and Region-Region. ADQ-GNN [31] enhances modeling capacity through hierarchical relation modeling and time-interval-aware weighting mechanisms, while Dynaposgnn [32] introduces two dynamically evolving graphs—namely, the "User-POI graph" and the "POI-POI graph"—to better capture the temporal dynamics of relational structures. GATs incorporate the attention mechanism, enabling dynamic modeling of node and edge importance and thereby enhancing the capacity to capture complex graph structures [33,34]. ASGNN [30] employs a hierarchical attention network to simultaneously model users' long-term behavioral preferences and short-term contextual interests. Zhang et al. [35] propose a hybrid approach that applies GNNs with attention to aggregate user representations over heterogeneous social relations, while using Bi-LSTM to model POI representations along sequential check-in trajectories. HS-GAT [36] adopts a dual-attention architecture within a GNN framework to facilitate feature propagation across both heterogeneous attribute graphs (User-POI) and homogeneous structural graphs (POI-POI, User-User). MobGT [37] employs separate temporal and spatial graph encoders to disentangle spatio-temporal movement patterns and utilizes a Graph Transformer to aggregate higher-order relational information among POIs.

KG-enhanced methods uniformly model diverse prior relationships as KGs, derive node embeddings via KGE (Knowledge Graph Embedding) techniques or GNNs, and feed these embeddings into sequence models for next-POI prediction. Graph-Flashback [23] and KGNext [26] explicitly model heterogeneous relationships as an STKG to jointly enhance the social and geographic dimensions: Graph-Flashback uses visiting, temporal, spatial, and social relations, while KGNext uses POI-POI, category-category, user-POI, and user-category relations. SNPM [27] introduces POI-POI and POI-region dynamic subgraphs to describe time-varying knowledge. HyperKGR [38] defines mobility-pattern, social, and POI side-information relations as a hyper-relational KG. KG-enhanced methods have become the dominant paradigm in POI recommendation by enabling comprehensive modeling of multi-source, heterogeneous relationships, thereby significantly alleviating data sparsity. Nevertheless, two critical challenges remain underdeveloped: the representation of high-dimensional, implicit spatial relationships, particularly global spatio-temporal dependencies among POIs; and the temporal evolution of relational structures, such as dynamic shifts in user social influence.

3. Problem Formulation

The next-POI recommendation task is defined as follows: given a user set $U$, a POI set $I$, and the check-in sequence of a user $u$ with length $L$, $\langle u, i_{1}, t_{1} \rangle, \langle u, i_{2}, t_{2} \rangle, \dots, \langle u, i_{L}, t_{L} \rangle$, solve for $\langle u, i_{L+1}, t_{L+1} \rangle$. GSTRDFA integrates the relationships between users and POIs as prior knowledge into the sequence prediction model for POI recommendation. The social relationship graph between users is denoted as $G_{soc}$, the local geospatial relationship graph based on the latitude and longitude coordinates of each POI is denoted as $G_{geo}$, the dynamic friendship graph between users based on the similarity of short-term check-in behaviors is denoted as $G_{df}$, the set of spatio-temporal states obtained by dividing the spatio-temporal range of all check-in behaviors is denoted as $P$, and the global spatio-temporal relationship between spatio-temporal states is denoted as $G_{sts}$. Let the embeddings of $U$, $I$, and $P$ be $\vec{U} = \{\vec{u}_{1}, \vec{u}_{2}, \dots, \vec{u}_{|U|}\}$, $\vec{I} = \{\vec{i}_{1}, \vec{i}_{2}, \dots, \vec{i}_{|I|}\}$, and $\vec{P} = \{\vec{p}_{1}, \vec{p}_{2}, \dots, \vec{p}_{|P|}\}$, respectively.

In information propagation, the embeddings $\vec{u}$ after propagation through $G_{soc}$ and $G_{df}$ are denoted as $\vec{u}_{soc}$ and $\vec{u}_{df}$, respectively, and the two are coupled as $\vec{u}_{encoded}$. The embeddings $\vec{i}$ after propagation through $G_{geo}$ and $G_{sts}$ are denoted as $\vec{i}_{geo}$ and $\vec{i}_{sts}$, respectively, and the two are coupled as $\vec{i}_{encoded}$. In sequence prediction, the prediction result derived from $\vec{u}_{encoded}$ and $\vec{i}_{encoded}$ is denoted as $i_{pred}$, and the true result is denoted as $i_{true}$. The negative sampling candidate sets are denoted as $\{O_{G}^{k} \mid G \in \{G_{soc}, G_{df}, G_{geo}, G_{sts}\}, k \in \{1, 2\}\}$.

The total error for fitting the user relationships of $G_{soc}$ and $G_{df}$ is defined as $L_{u}$, the error for fitting the POI relationships of $G_{geo}$ is defined as $L_{i}$, and the error for fitting the spatio-temporal state relationships of $G_{sts}$ is defined as $L_{p}$. The sequence prediction error is denoted as $L_{s}$, and the total error is denoted as $L_{total}$. Then, for a specific user $u$, the objective of GSTRDFA is to find the $i$ from $I$ that minimizes the total error $L_{total}$.

4. Methodology

GSTRDFA consists of three layers as illustrated in Figure 2: Representation Layer, Propagation Layer, and Prediction Layer. The Representation Layer describes two parts: symbolic representation and vector representation. The symbolic representation refers to three sets of prior knowledge graphs constructed based on existing data, using symbols to model the social relationships and dynamic friend relationships among users, as well as the local geographical spatial relationships and global spatio-temporal relationships among POIs. The vector representation refers to the embeddings of users, POIs, and spatio-temporal state sets, using numerical values to model the similarities or differences of nodes on the graph in the recommendation scenario. The Propagation Layer describes the propagation and interaction methods of embeddings on various relationships, integrating social and geographical information into the embeddings through the computational flow of the embeddings in the representation layer on each graph. The Prediction Layer performs temporal modeling of the historical spatio-temporal trajectories of specific users and predicts the embedding of the next POI that the user is most likely to access using the embeddings output by the Propagation Layer.

4.1. Representation Layer

1) POI-POI Relationships: global spatio-temporal and local geospatial relationships. We split the time and geospatial coordinate ranges of all check-in records into temporal slices and geospatial grids to produce the spatio-temporal state set $P$ according to a custom time interval $\Delta T_{sts}$, longitude interval $\Delta \lambda$, and latitude interval $\Delta \phi$. Relations between spatio-temporal states with the same temporal slice but different spatial grids are defined as the temporal proximity relationship $p \leftrightarrow_{\Delta T_{sts}}^{temporal} q$. Relations between spatio-temporal states with the same spatial grid but different temporal slices are defined as the spatial proximity relationship $p \leftrightarrow_{\Delta \lambda, \Delta \phi}^{spatio} q$. Relations between spatio-temporal states that differ in both temporal slice and spatial grid are defined as the spatio-temporal proximity relationship $p \leftrightarrow_{\Delta T_{sts}}^{temporal} m \leftrightarrow_{\Delta \lambda, \Delta \phi}^{spatio} q$ or $p \leftrightarrow_{\Delta \lambda, \Delta \phi}^{spatio} n \leftrightarrow_{\Delta T_{sts}}^{temporal} q$, expressed indirectly through the unique spatio-temporal state $m$ that shares the temporal state with $p$ and the spatial state with $q$, or the unique spatio-temporal state $n$ that shares the spatial state with $p$ and the temporal state with $q$. As shown in Figure 3, all the relationships $p \leftrightarrow_{\Delta T_{sts}}^{temporal} q$, $p \leftrightarrow_{\Delta \lambda, \Delta \phi}^{spatio} q$, $p \leftrightarrow_{\Delta T_{sts}}^{temporal} m \leftrightarrow_{\Delta \lambda, \Delta \phi}^{spatio} q$, and $p \leftrightarrow_{\Delta \lambda, \Delta \phi}^{spatio} n \leftrightarrow_{\Delta T_{sts}}^{temporal} q$ among all spatio-temporal states jointly form the global spatio-temporal relationship graph $G_{sts}$.

Each check-in record $\langle u, i, t \rangle$ can be mapped to a unique $p$ through $P$, as shown in Figure 4. The global spatio-temporal relationship between any two different POIs is mapped to the direct temporal proximity, direct spatial proximity, and indirect spatio-temporal proximity between their spatio-temporal states $p$. It is worth noting that the same POI can be represented by different spatio-temporal states in different check-in records, which effectively addresses its time-varying nature. Meanwhile, different $\langle u, i, t \rangle$ may be mapped to the same $p$, which fails to reflect the spatio-temporal differences between POIs.

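The mapping from a check-in to its spatio-temporal state, and the relation type between two states, can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the function names `to_state` and `relation`, the origin arguments `lon0`, `lat0`, `t0`, the floor-based binning, and the extra `"identical"` case are all assumptions introduced here.

```python
import math

def to_state(lon, lat, t, lon0, lat0, t0, d_lon, d_lat, d_t):
    """Map a check-in location/time to a grid index (x, y, tau).

    lon0, lat0, t0 are the assumed lower bounds of the check-in range;
    d_lon, d_lat, d_t play the roles of the longitude, latitude, and
    time intervals. Floor binning is an illustrative choice.
    """
    x = math.floor((lon - lon0) / d_lon)
    y = math.floor((lat - lat0) / d_lat)
    tau = math.floor((t - t0) / d_t)
    return (x, y, tau)

def relation(p, q):
    """Classify the relation between two states as defined in the text."""
    same_space = p[:2] == q[:2]
    same_time = p[2] == q[2]
    if same_space and same_time:
        return "identical"          # added case, not named in the text
    if same_time:
        return "temporal"           # same slice, different grid cell
    if same_space:
        return "spatial"            # same grid cell, different slice
    # Indirect link through m (time of p, space of q)
    # or n (space of p, time of q).
    m = (q[0], q[1], p[2])
    n = (p[0], p[1], q[2])
    return ("spatio-temporal", m, n)
```

For example, `relation((3, 2, 1), (5, 6, 4))` returns the two intermediate states `(5, 6, 1)` and `(3, 2, 4)`, which correspond to $m$ and $n$ in the definition above.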

The local geospatial relationship graph between POIs is built with the distance-threshold method of DisenPOI [39], using the threshold $\Delta D_{geo}$. A geospatial relationship constrained by this distance is denoted as $i \leftrightarrow_{\Delta D_{geo}}^{geo} j$, and all such relationships form the local geospatial relationship graph $G_{geo}$ among POIs, which is a globally unique undirected weighted graph.
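A threshold-based construction of this kind might look like the sketch below. Several details are assumptions, since the text does not specify them: the haversine great-circle distance as the distance measure, the function names `haversine_km` and `build_geo_graph`, and the edge weight $e^{-d^{2}}$ (borrowed from the similarity function used later by the GeoEncoder).

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km (an assumed distance measure)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))

def build_geo_graph(pois, threshold_km):
    """Return {(i, j): weight} for POI pairs within threshold_km.

    pois maps a POI id to (lat, lon). Each undirected edge is stored
    once with i < j; the weight exp(-d^2) is an illustrative choice.
    """
    ids = sorted(pois)
    edges = {}
    for idx, i in enumerate(ids):
        for j in ids[idx + 1:]:
            d = haversine_km(*pois[i], *pois[j])
            if d <= threshold_km:
                edges[(i, j)] = math.exp(-d ** 2)
    return edges
```

The pairwise loop is quadratic in the number of POIs; for city-scale data a spatial index would be the more practical choice, but the loop keeps the thresholding rule easy to read.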

2) User-User Graphs: dynamic friendships and static social links. Inspired by the sequential relationship of the same user's dynamic visits to different POIs in Diff-POI [7], we set a time interval $\Delta T_{df}$. For the current check-in record $\langle u, i, t \rangle$, the set of users who visited POI $i$ within the time range $[t - \Delta T_{df}, t + \Delta T_{df}]$ is $U_{df}$ (necessarily including the current $u$). The relationship between $u$ and any user $v$ in $U_{df}$ is defined as the dynamic friendship relationship $u \leftrightarrow_{\Delta T_{df}}^{df} v$, and all such relationships for the current $\langle u, i, t \rangle$ form the user dynamic friendship relationship graph $G_{df}$, an undirected weighted graph. On the one hand, different users have different $G_{df}$ because their check-in records differ. On the other hand, even the same user may have different $G_{df}$ in different check-in records. This effectively expresses the differences between users and models the dynamic similarity among users with similar check-in behaviors within a short time range.

Taking the real check-in record $\langle u_{1}, i_{37}, \text{2012-04-19} \rangle$ in the Foursquare dataset as an example and assuming $\Delta T_{df} = 7$ days, the left and right parts of Figure 5 show the $G_{df}$ of $u_{1}$ for $\langle u_{1}, i_{37}, \text{2012-04-19} \rangle$ and $\langle u_{1}, i_{37}, \text{2012-04-23} \rangle$, respectively.
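The windowed selection of dynamic friends can be illustrated with a short sketch. The function name `dynamic_friends`, the tuple layout of `checkins`, and the use of the smallest time gap as a candidate edge weight are assumptions made for illustration; the text does not state how the edge weights of $G_{df}$ are computed.

```python
from datetime import datetime, timedelta

def dynamic_friends(checkins, u, i, t, delta):
    """Users who visited POI i within [t - delta, t + delta].

    checkins is an iterable of (user, poi, time) tuples. The result maps
    each such user to the smallest absolute time gap to t, which could
    serve as the edge weight (an assumption, see the lead-in).
    """
    friends = {}
    for v, j, s in checkins:
        if j != i:
            continue
        gap = abs(s - t)
        if gap <= delta and (v not in friends or gap < friends[v]):
            friends[v] = gap
    friends.setdefault(u, timedelta(0))  # the current user is always included
    return friends
```

With a made-up log in which `u2` visits the POI four days before 2012-04-19 and `u3` nine days after, a 7-day window around 2012-04-19 selects `u1` and `u2`, while the same window around 2012-04-23 selects `u1` and `u3`. This mirrors the point that one user can have different $G_{df}$ for different check-in records.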

A static social graph $G_{soc}$ comprises all social links $u \overset{soc}{\leftrightarrow} v$ derived from the native dataset, representing the static social relationships between users. Unlike $G_{df}$, $G_{soc}$ is a globally unique undirected and unweighted graph.

3) Embeddings. The embeddings of $U$, $I$, and $P$ are denoted as $\vec{U} \in \mathbb{R}^{|U| \times d}$, $\vec{I} \in \mathbb{R}^{|I| \times d}$, and $\vec{P} \in \mathbb{R}^{|X| \times |Y| \times |T| \times d}$, respectively, where $|U|$ and $|I|$ are the numbers of users and POIs, $|X|$ and $|Y|$ are the numbers of spatial grid cells along the longitude and latitude directions, $|T|$ is the number of time slices, the total number of spatio-temporal states is $|P| = |X| \times |Y| \times |T|$, and $d$ is the feature dimension. The embeddings of a single user $u$, POI $i$, and spatio-temporal state $p$ are denoted as $\vec{u}, \vec{i}, \vec{p} \in \mathbb{R}^{d}$.

4.2. Propagation Layer

1) Global Spatio-Temporal Relationship-Aware Encoder. As illustrated in Figure 6, Global Spatio-Temporal Relationship-Aware Encoder (GSTRAE) is an encoder designed to jointly capture the global spatio-temporal and local geospatial relationships among POIs, comprising two sub-encoders (STSEncoder and GeoEncoder) and a coupling component.

The STSEncoder illustrated in Figure 7 propagates POI embedding information along global spatio-temporal relationship paths. Inspired by the way the Transformer combines word vectors with positional encodings, we construct a POI vector representation that encodes global spatio-temporal relationships by adding the POI embedding $\vec{i}$ and the STS embedding $\vec{p}$. Specifically, $\vec{i}$ captures the individual differences among POIs, while $\vec{p}$ encodes the global spatio-temporal position of the POI in the current check-in record, thus reflecting group-level differences.

The GeoEncoder propagates POI embedding information along the local geospatial relationship paths. We adopt the method for capturing the similarity between POIs from DisenPOI [39], which has been shown to effectively improve recommendation accuracy, as given in (1). $W_{1}^{(l)}, W_{2}^{(l)} \in \mathbb{R}^{d \times d}$ denote two learnable weight matrices for the $l$-th layer, $w(d_{ij}) = e^{-d_{ij}^{2}}$ is a geospatial distance-based similarity function, and $\odot$ represents the Hadamard product.

$$\vec{i}_{geo} = \mathrm{LeakyReLU}\Big( W_{1}^{(l)} \vec{i}^{(l-1)} + \sum_{j \in N(i)} \frac{1}{\sqrt{|N(i)| \, |N(j)|}} \big( W_{1}^{(l)} \vec{j}^{(l-1)} + w(d_{ij}) \, W_{2}^{(l)} (\vec{j}^{(l-1)} \odot \vec{i}^{(l-1)}) \big) \Big)$$

(1)

2) Dynamic Friendship-Aware Encoder. The Dynamic Friendship-Aware Encoder (DFAE) is designed to jointly capture dynamic friendships and static social links among users, combining two sub-encoders (DFEncoder and SocEncoder) in the same way as GSTRAE. The DFEncoder incorporates time intervals into a multi-head GAT (Graph Attention Network), so that dynamic friends with smaller time intervals share greater similarity with the current user. The attention score for the $k$-th head is computed as in (2), where $s$ denotes the similarity function for users $u$ and $v$ with time interval $\Delta t_{uv}$ on $G_{df}$, $K$ denotes the number of attention heads, $e_{uv}^{k}$ represents the attention score between user $u$ and its neighboring node $v$ for the $k$-th head, $\vec{\alpha}^{k} \in \mathbb{R}^{2d/K}$ is the learnable attention coefficient for the $k$-th head, $W^{k} \in \mathbb{R}^{(d/K) \times (d/K)}$ is the learnable weight matrix for the $k$-th head, and $\vec{u}^{k}$ and $\vec{v}^{k}$ denote the components of $\vec{u}$ and $\vec{v}$ on the $k$-th head, respectively.

$$\begin{cases} s = e^{-|\Delta t_{uv}|} \\ a = \mathrm{LeakyReLU}\big( \vec{\alpha}^{k\top} \cdot [\, W^{k} \vec{u}^{k} \,\|\, W^{k} \vec{v}^{k} \,] \big) \\ e_{uv}^{k} = \mathrm{Softmax}(s \cdot a) \\ \vec{u}_{df} = \big\|_{k=1}^{K} \, \sigma\Big( \sum_{v \in N(u)} e_{uv}^{k} \cdot W^{k} \vec{v}^{k} \Big) \end{cases}$$

(2)
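The effect of the time-decay factor $s$ on the attention weights in (2) can be checked with a small single-head sketch. It assumes that the softmax is taken over the neighbor set $N(u)$, that time gaps are expressed in a common unit, and that LeakyReLU uses a negative slope of 0.2; none of these is stated in the text, and the function names are hypothetical.

```python
import math

def leaky_relu(x, slope=0.2):
    """LeakyReLU with an assumed negative slope of 0.2."""
    return x if x > 0 else slope * x

def time_decayed_attention(raw_scores, time_gaps):
    """Normalize s * a over the neighbors of one user for one head.

    raw_scores[v] stands for alpha^T [W u || W v] before the activation;
    time_gaps[v] is the interval between u and v. Taking the softmax
    over the neighbor set is an assumed reading of (2).
    """
    logits = {}
    for v, score in raw_scores.items():
        s = math.exp(-abs(time_gaps[v]))    # s = exp(-|dt|)
        logits[v] = s * leaky_relu(score)   # s * a
    peak = max(logits.values())             # subtract max for stability
    exps = {v: math.exp(x - peak) for v, x in logits.items()}
    total = sum(exps.values())
    return {v: e / total for v, e in exps.items()}
```

When two neighbors have the same positive raw score, the one with the smaller time gap receives the larger weight, which matches the stated intent of the DFEncoder.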

The SocEncoder employs two layers of GraphSAGE [40] to capture first- and second-order social relationships on $G_{soc}$, as formulated in (3), where $l$ denotes the layer index, $\vec{u}^{(l-1)}, \vec{v}^{(l-1)} \in \mathbb{R}^{d}$ are the output features of nodes $u$ and $v$ at the $(l-1)$-th layer and serve as the input features for the $l$-th layer, $\sigma(\cdot)$ denotes the nonlinear activation function ReLU, $W^{(l)} \in \mathbb{R}^{2d \times d}$ is the learnable weight matrix for the $l$-th layer, $\|$ denotes the vector concatenation operation, and $N(u)$ denotes the set of neighbors of node $u$.

$$\vec{u}_{soc} = \sigma\Big( W^{(l)} \cdot \Big( \vec{u}^{(l-1)} \,\Big\|\, \frac{1}{|N(u)|} \sum_{v \in N(u)} \vec{v}^{(l-1)} \Big) \Big)$$

(3)
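A single mean-aggregation layer in the spirit of (3) can be written in plain Python as below. This is a sketch under stated assumptions: the function name `sage_mean_layer` is hypothetical, the weight is stored as output rows of length $2d$ (a layout choice that may differ from the $2d \times d$ notation above), and nodes without neighbors fall back to a zero neighborhood vector.

```python
def sage_mean_layer(features, neighbors, weight):
    """One mean-aggregator layer: ReLU(W . (h_u || mean of neighbor h_v)).

    features: {node: list of d floats}; neighbors: {node: list of nodes};
    weight: list of output rows, each with 2*d entries. Nodes without
    neighbors use a zero vector for the neighborhood part (an assumption).
    """
    out = {}
    for u, h_u in features.items():
        d = len(h_u)
        nbrs = neighbors.get(u, [])
        if nbrs:
            mean = [sum(features[v][k] for v in nbrs) / len(nbrs)
                    for k in range(d)]
        else:
            mean = [0.0] * d
        concat = h_u + mean  # vector concatenation
        out[u] = [max(0.0, sum(w * x for w, x in zip(row, concat)))
                  for row in weight]
    return out
```

Applying the function twice, feeding the first output back in as `features`, lets information from second-order neighbors reach each node, which is the role of the two stacked layers.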

4.3. Prediction Layer

We propose a Spatio-Temporal Gated GRU (STGRU) that incorporates time intervals, spatial distances, and dynamic weights to implement sequential modeling of dynamic user-POI interaction relationships. As detailed in (4), STGRU extends each vanilla GRU unit by adding influence factors of time intervals and spatial distances, along with learnable dynamic weights, to regulate the reset and update processes.

$$\begin{cases} \vec{r}_{s} = \sigma\big( W_{r} [\, (e^{-\Delta t_{s, s-1}} \vec{t}_{s}) \odot (e^{-\Delta d_{s, s-1}} \vec{d}_{s}) \odot \vec{h}_{s-1}, \ \vec{i}_{s} \,] \big) \\ \vec{z}_{s} = \sigma\big( W_{z} [\, (e^{-\Delta t_{s, s-1}} \vec{t}_{s}) \odot (e^{-\Delta d_{s, s-1}} \vec{d}_{s}) \odot \vec{h}_{s-1}, \ \vec{i}_{s} \,] \big) \end{cases}$$

(4)

4.4. Negative Sampling

Inspired by the adversarial space approach from DIG [39,41], negative sampling sets $O_{sts}$, $O_{geo}$, $O_{df}$, and $O_{soc}$ of $G_{sts}$, $G_{geo}$, $G_{df}$, and $G_{soc}$ are built to update the embeddings of users, POIs, and spatio-temporal states, as well as the encoder parameters, during backpropagation. Each of $O_{sts}$, $O_{geo}$, $O_{df}$, and $O_{soc}$ consists of two adversarial subsets, $O^{1}$ and $O^{2}$. $O^{1}$ contains samples where the user-POI interaction exists but the user-user or POI-POI relationship does not hold, while $O^{2}$ contains samples where the user-POI interaction is absent but the user-user or POI-POI relationship holds. $O_{sts}$ is detailed in (5), where $p$ and $q$ denote a pair of positive and negative spatio-temporal states, $S_{u}^{+}$ and $S_{u}^{-}$ represent the sets of spatio-temporal states visited and unvisited by $u$, $\land$ and $\lor$ are the logical and/or operators, and $G_{sts}^{t}$ and $G_{sts}^{s}$ indicate the temporal and spatial relations between spatio-temporal states on $G_{sts}$.

$$\begin{cases} O_{sts}^{1} = \{ (u, p, q) \mid u \in U, \ p \in S_{u}^{+}, \ q \in S_{u}^{+} \land ((p, q) \notin G_{sts}^{t} \land (p, q) \notin G_{sts}^{s}) \} \\ O_{sts}^{2} = \{ (u, p, q) \mid u \in U, \ p \in S_{u}^{+}, \ q \in S_{u}^{-} \land ((p, q) \in G_{sts}^{t} \lor (p, q) \in G_{sts}^{s}) \} \end{cases}$$

(5)
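For a single user, the two subsets in (5) can be enumerated as in the following sketch. It assumes states are `(x, y, tau)` grid indices, that a temporal relation means the same time slice in a different cell and a spatial relation means the same cell in a different slice, and that pairs with $p = q$ are skipped; the function name and the exhaustive enumeration are illustrative only.

```python
def sts_negative_samples(visited, all_states):
    """Build the two adversarial subsets of (5) for a single user.

    visited: set of (x, y, tau) states the user has checked into;
    all_states: set of every state in P. Exhaustive enumeration is used
    for clarity; a practical pipeline would sample instead.
    """
    def related(p, q):
        temporal = p[2] == q[2] and p[:2] != q[:2]  # (p, q) in G_sts^t
        spatial = p[:2] == q[:2] and p[2] != q[2]   # (p, q) in G_sts^s
        return temporal or spatial

    unvisited = all_states - visited
    # O^1: both states visited, but no direct relation between them.
    # Skipping p == q is an assumption made here.
    o1 = {(p, q) for p in visited for q in visited
          if p != q and not related(p, q)}
    # O^2: q unvisited, but directly related to the visited state p.
    o2 = {(p, q) for p in visited for q in unvisited if related(p, q)}
    return o1, o2
```

The user index is left implicit because the function handles one user at a time; prepending it to each pair yields the $(u, p, q)$ triples of (5).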

$O_{df}$ is detailed in (6), where $u$ and $v$ denote a pair of positive and negative users, $U_{i}^{+}$ and $U_{i}^{-}$ represent the sets of users who have and have not visited $i$, and $G_{df}(u, i, t)$ indicates the dynamic friendship graph of $u$ at the check-in record $\langle u, i, t \rangle$.

$$\begin{cases} O_{df}^{1} = \{ (u, v, i) \mid i \in I, \ u \in U_{i}^{+}, \ v \in U_{i}^{+} \land (u, v) \notin G_{df}(u, i, t) \} \\ O_{df}^{2} = \{ (u, v, i) \mid i \in I, \ u \in U_{i}^{+}, \ v \in U_{i}^{-} \land (u, v) \in G_{df}(u, i, t) \} \end{cases}$$

(6)

4.5. Loss and Optimization

$$\begin{cases} L_{p} = - \dfrac{1}{2L} \sum_{k=1}^{2} \sum_{(u, p, q) \in O_{sts}^{k}} \ln \sigma\big( \langle \vec{u}, \vec{p} \rangle - \langle \vec{u}, \vec{q} \rangle \big) \\ L_{i} = - \dfrac{1}{2L} \sum_{k=1}^{2} \sum_{(u, i, j) \in O_{geo}^{k}} \ln \sigma\big( \langle \vec{u}, \vec{i} \rangle - \langle \vec{u}, \vec{j} \rangle \big) \\ L_{u} = - \dfrac{1}{4L} \sum_{k=1}^{2} \sum_{G \in \{G_{soc}, G_{df}\}} \sum_{(u, v, i) \in O_{G}^{k}} \ln \sigma\big( \langle \vec{u}, \vec{i} \rangle - \langle \vec{v}, \vec{i} \rangle \big) \\ L_{s} = \beta \big( \langle \vec{u}, \vec{i}_{pred} \rangle - \langle \vec{u}, \vec{i}_{true} \rangle \big)^{2} + \begin{cases} \frac{1}{2} ( \vec{i}_{pred} - \vec{i}_{true} )^{2} & \text{if } | \vec{i}_{pred} - \vec{i}_{true} | < 1.0 \\ | \vec{i}_{pred} - \vec{i}_{true} | - \frac{1}{2} & \text{otherwise} \end{cases} \\ L_{total} = \gamma L_{s} + (1 - \gamma) \dfrac{L_{u} + L_{i} + L_{p}}{3} \end{cases}$$

(7)

The total loss $L_{total}$ is composed of four components, $L_{p}$, $L_{i}$, $L_{u}$, and $L_{s}$, based on the BPR (Bayesian Personalized Ranking) and Huber losses, as shown in (7), and is minimized with Adam to optimize the embeddings and parameters across all modules. The BPR loss is used for $L_{p}$, $L_{i}$, and $L_{u}$, where $L$ indicates the length of the current trajectory. The Huber loss is used for $L_{s}$, where $\vec{i}_{pred}$ and $\vec{i}_{true}$ denote the predicted and true values, and $\beta$ denotes the weight assigned to the interaction factor component, which is typically set to 0.1. $L_{total}$ is a linearly weighted combination, where $\gamma$ is empirically set to 0.5.
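The loss components in (7) can be expressed with a few plain-Python helpers. The sketch makes two assumptions not stated in the text: the Huber term is applied element-wise and averaged over the embedding dimensions, and the BPR term is evaluated through an equivalent softplus form for numerical stability. All function names are hypothetical.

```python
import math

def dot(a, b):
    """Inner product <a, b> of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def bpr_term(anchor, positive, negative):
    """-ln sigmoid(<anchor, positive> - <anchor, negative>) for one triple."""
    diff = dot(anchor, positive) - dot(anchor, negative)
    # softplus(-diff) equals -ln(sigmoid(diff)) and avoids overflow.
    return math.log1p(math.exp(-abs(diff))) + max(-diff, 0.0)

def huber(pred, true, delta=1.0):
    """Element-wise Huber loss, averaged over dimensions (averaging assumed)."""
    total = 0.0
    for p, t in zip(pred, true):
        r = abs(p - t)
        total += 0.5 * r * r if r < delta else delta * (r - 0.5 * delta)
    return total / len(pred)

def sequence_loss(user, pred, true, beta=0.1):
    """L_s: beta * (interaction gap)^2 plus the Huber term."""
    gap = dot(user, pred) - dot(user, true)
    return beta * gap * gap + huber(pred, true)

def total_loss(l_s, l_u, l_i, l_p, gamma=0.5):
    """L_total = gamma * L_s + (1 - gamma) * (L_u + L_i + L_p) / 3."""
    return gamma * l_s + (1.0 - gamma) * (l_u + l_i + l_p) / 3.0
```

Summing `bpr_term` over the triples of a negative sampling set and dividing by the normalizer in (7) gives the corresponding relation-fitting loss; for $L_{u}$ the POI embedding plays the role of `anchor`, since the inner product is symmetric.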

5. Experiments

5.1. Datasets

To comprehensively assess the performance of GSTRDFA with respect to geographical and social factors, three representative subsets (NYC, JK, and CA) were extracted from Foursquare according to geographic scope; their statistics are presented in Table 1. NYC corresponds to New York City, USA, and is characterized by extremely high check-in density, a dense geographic distribution of POIs, and dense social connections among users. JK refers to Jakarta, the capital of Indonesia, which features a small regional area, low check-in density, a moderate geographic distribution of POIs, and extremely sparse social ties between users. CA denotes the State of California, USA, with a broad geographic scope; its check-in density, POI geographic distribution, and social relationship density all fall between those of NYC and JK. All three subsets cover the check-in time window from December 8, 2011, to April 23, 2012, with longitude and latitude coordinates referenced to the WGS84 geographic coordinate system. In terms of regional area, NYC is slightly larger than JK, both being at the city scale, while CA, at the state level, is dozens of times the size of NYC and JK combined.

5.2. Metrics

Accuracy at K (Acc@K) and Mean Reciprocal Rank (MRR) are employed to evaluate the performance of our model and baseline methods. As shown in (8), Acc@K (K=1, 5, 10) reflects the practical utility of the recommendation system (e.g., whether users can find the target POI in a short list). MRR emphasizes the ranking precision, which is critical for applications like navigation where the top recommendation matters most.

$Q$ represents the set of all check-in sequences of users in the test set, $|Q|$ is the size of $Q$, and $rank_{i}$ is the rank of the true POI in the model's prediction for the $i$-th sequence. $\Pi(\cdot)$ is the indicator function, which outputs 1 if its condition is true and 0 otherwise.

$$\begin{cases} Acc@K = \dfrac{1}{|Q|} \sum_{i=1}^{|Q|} \Pi( rank_{i} \leq K ) \\ MRR = \dfrac{1}{|Q|} \sum_{i=1}^{|Q|} \dfrac{1}{rank_{i}} \end{cases}$$

(8)
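Both metrics in (8) reduce to a few lines once the rank of the true POI is known for each test sequence. The sketch below assumes 1-based ranks; the function names are illustrative.

```python
def acc_at_k(ranks, k):
    """Fraction of test sequences whose true POI is ranked within the top k."""
    return sum(1 for r in ranks if r <= k) / len(ranks)

def mrr(ranks):
    """Mean reciprocal rank; ranks are 1-based."""
    return sum(1.0 / r for r in ranks) / len(ranks)

# Hypothetical ranks of the true POI for four test sequences.
ranks = [1, 3, 12, 5]
print(acc_at_k(ranks, 1))   # 0.25
print(acc_at_k(ranks, 5))   # 0.75
```

For these made-up ranks, MRR is the average of 1, 1/3, 1/12, and 1/5, so a single badly ranked sequence lowers MRR less than it lowers Acc@1.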

5.3. Baselines

Baseline models are categorized into four groups in Table 2 based on whether the user-user and POI-POI relationships were encoded. Specifically, group (a) focuses on capturing sequential patterns but omits explicit modeling of user-user and POI-POI relationships. Group (b) integrates user-user interaction information yet disregards POI-POI relationships. Group (c) encodes POI-POI relationships while neglecting user-user relationships. Group (d) consists of cutting-edge approaches that align with the research focus of this study.

5.4. Experimental Settings

GSTRDFA was implemented using PyTorch 2.2.0+cu118 and executed on an NVIDIA GeForce RTX 4090 GPU. All models were trained with identical hyperparameters: feature vector dimension $d = 128$, number of training epochs $= 500$, and learning rate $lr = 5 \times 10^{-4}$. For model-specific structural parameters: (1) the time interval for dynamic friendship relations was set to $\Delta T_{df} = 7$ days; (2) the distance threshold for local geospatial relations was set to $\Delta D_{geo} = 1$ km for the NYC and JK datasets and $\Delta D_{geo} = 3$ km for the CA dataset; (3) the spatio-temporal state partitioning intervals of $G_{sts}$ were aligned with those of $G_{df}$ and $G_{geo}$, respectively, i.e., $\Delta T_{sts} = \Delta T_{df}$ and $\Delta D_{sts} = \Delta D_{geo}$.

5.5. Results & Analysis

Detailed experimental results are presented in Table 3. On the NYC and CA datasets, SNPM achieved the highest performance among all baseline models; on the JK dataset, GeoCo yielded the best results among all baseline models. GSTRDFA outperformed both SNPM and GeoCo on all datasets and all evaluation metrics. Relative to the state-of-the-art baseline models (SNPM and GeoCo), GSTRDFA achieved performance gains of 2.79%-4.48% on Acc@1, 2.91%-6.55% on Acc@5, 3.06%-6.67% on Acc@10, and 2.84%-5.17% on MRR.

1) Cross-model analysis: performance of different models on identical datasets. On the NYC and CA datasets, comparisons of LBSN2Vec++ vs. STGN and DSGNN vs. STAN show that social relationship-enhanced (soc-enhanced) models outperform pure sequential models, which indicates that explicitly modeling user-user relationships boosts recommendation performance. On the JK dataset, however, DSGNN falls behind STAN, which can be attributed to the extremely sparse social links of JK (Table 1). Across all datasets, comparisons of GeoSAN vs. STGN and GETNext vs. STAN indicate that geographical relationship-enhanced (geo-enhanced) models outperform pure sequential models, supporting the view that explicit modeling of POI-POI relationships improves recommendation performance. Further comparisons (GeoSAN vs. LBSN2Vec++, GETNext vs. DSGNN, and GeoCo vs. Graph-Flashback) show that geo-enhanced models generally outperform soc-enhanced models, suggesting that explicit modeling of POI-POI geospatial relationships yields a larger improvement than explicit modeling of user-user social relationships. Notably, Graph-Flashback, KGNext, and SNPM deliver standout results: Graph-Flashback outperforms GETNext on both the NYC and CA datasets; KGNext outperforms all baseline models except SNPM and GeoCo across all datasets; and SNPM achieves the best performance among baseline models on the NYC and CA datasets.

These superior results can be attributed to the fact that all three models leverage knowledge graphs (KGs) to explicitly model rich user-user, POI-POI, and user-POI relationships. Performance comparisons of the top-performing models in each group are presented in Figure 8. Among these, SNPM outperforms GETNext and KGNext, suggesting that dynamic graphs offer a larger performance boost than static graphs. GSTRDFA not only jointly and explicitly models user-user social relationships and POI-POI geospatial relationships but also incorporates global spatio-temporal relationships between POIs and dynamic friendship relationships between users. The experimental results validate the effectiveness of these enhancements.

2) Cross-dataset analysis: performance of the same model across different datasets. The three datasets have distinct characteristics. NYC: moderate geographical coverage, large-scale users and POIs, rich user social relationships, and high POI density. JK: small geographical coverage, moderate user/POI scale, extremely sparse user social relationships, and moderate POI density. CA: large geographical coverage, moderate user/POI scale, moderate social relationship density and POI density, but uneven POI distribution.

Social-aware models (LBSN2Vec++, DSGNN, Graph-Flashback) achieve significantly better performance on NYC and CA than on JK. This indicates that the effectiveness of user social relationship modeling depends on at least a moderate social relationship density: on the JK dataset, with its extremely sparse social ties, soc-enhanced models exhibit suboptimal performance. GSTRDFA integrates both static social relationships and dynamic friendship relationships, so its performance is jointly influenced by social relationship density and check-in density, forming a specific trade-off pattern. Even on the JK dataset, GSTRDFA still outperforms soc-enhanced models by a large margin, which validates the effectiveness of dynamic friendship relationship modeling.

Geo-aware models (GeoSAN, GETNext, GeoCo) perform better on CA than on NYC and JK. This suggests that the enhancement effect of POI geographical relationship modeling is correlated with POI spatial density, where a moderate density is more conducive to leveraging geographical factors. Notably, this effect does not grow monotonically with POI density: geo-enhanced models do not achieve their best performance on NYC (the dataset with the highest POI density), indicating that both POI spatial density and the distribution characteristics of check-in-dense areas affect the effectiveness of geographical enhancement.

GeoCo adopts geographical grids to capture global spatial relationships between POIs, which mitigates, to a certain extent, the performance degradation caused by uneven POI spatial distribution. Its outstanding performance across all datasets confirms the value of modeling global spatial relationships between POIs. GSTRDFA leverages geographical distance thresholds and spatio-temporal state partitioning to jointly model global and local relationships between POIs, alleviating the issues of high POI density and uneven spatial distribution. Its superior performance on NYC (compared to CA and JK) supports the robustness of GSTRDFA across datasets with varying spatial densities and distribution patterns.

5.6. Ablation Study

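The ablation results on the NYC dataset are summarized in Table 4, and the ablation results on the NYC, JK, and CA datasets are visualized in Figure 9. The percentages quoted below are the relative MRR drops reported in Table 4.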

1) Impact of POI-POI modules. Removing GeoEncoder reduces MRR by 5.8%, confirming that local geospatial modeling (i.e., explicit distance-based graph encoding) is important for capturing POI proximity and user mobility patterns. The largest single-encoder drop (13.2% MRR) occurs when STSEncoder is removed, underscoring its dominant role in modeling global spatio-temporal relationships between POIs. Removing GSTRAE causes a 16.9% MRR decrease, which is greater than the drop for either individual POI-POI relationship encoder (5.8% for GeoEncoder and 13.2% for STSEncoder), indicating the effectiveness of jointly modeling and encoding global spatio-temporal relationships and local geospatial relationships between POIs.

2) Impact of user-user modules. Removing SocEncoder leads to a 3.7% drop in MRR, indicating that modeling static user-user social links is important for capturing social homophily in LBSNs. The 7.6% MRR decline caused by removing DFEncoder highlights the importance of dynamic user-user friendship modeling, which adapts to temporal changes in user preferences. Removing DFAE leads to a greater MRR decline (10.6%) than removing either individual user-user relationship encoder (7.6% for DFEncoder and 3.7% for SocEncoder), demonstrating that coupled modeling and encoding of static and dynamic user-user relationships is more effective than considering static or dynamic relationships alone.

3) Comprehensive impact of all modules. Removing GSTRDFAE, which is composed of all the proposed encoders, leads to the largest drop (32.1% MRR), emphasizing their joint role in aggregating multiple kinds of relational features (static social links, dynamic friendships, local geospatial relationships, and global spatio-temporal relationships) via graph structures. All components contribute, with STSEncoder being the most critical single encoder. The module ablation results on the NYC dataset are listed in Table 4 and those on all three datasets are shown in Figure 9. The superiority of the full model supports our design choices, particularly the integration of POI-POI global spatio-temporal relationships and user-user dynamic friendships through graph-based aggregation.
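The drop percentages discussed above are relative decreases in MRR with respect to the full model. The sketch below recomputes them from the MRR column of Table 4 (NYC); the names are illustrative and the numbers are copied from the table.

```python
# MRR values copied from Table 4 (NYC dataset).
FULL_MODEL_MRR = 0.3497
ABLATED_MRR = {
    "w/o SocEncoder": 0.3367,
    "w/o DFEncoder": 0.3232,
    "w/o DFAE": 0.3126,
    "w/o GeoEncoder": 0.3295,
    "w/o STSEncoder": 0.3036,
    "w/o GSTRAE": 0.2905,
    "w/o GSTRDFAE": 0.2374,
}


def relative_drop_pct(full: float, ablated: float) -> float:
    """Relative MRR decrease with respect to the full model, in percent."""
    return round(100.0 * (full - ablated) / full, 1)


drops = {name: relative_drop_pct(FULL_MODEL_MRR, value)
         for name, value in ABLATED_MRR.items()}
for name, drop in drops.items():
    print(f"{name}: -{drop}%")

# Joint removal hurts more than removing either member encoder alone.
assert drops["w/o GSTRAE"] > max(drops["w/o GeoEncoder"], drops["w/o STSEncoder"])
assert drops["w/o DFAE"] > max(drops["w/o SocEncoder"], drops["w/o DFEncoder"])
```

In both groups the joint drop is larger than either individual drop but smaller than their sum (16.9% vs. 5.8% + 13.2%, and 10.6% vs. 3.7% + 7.6%), which suggests that the contributions of the paired encoders partially overlap.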

5.7. Hyperparameter Sensitivity Analysis

We evaluated the impact of three key hyperparameters on model performance: the embedding vector dimension $d \in \{32, 64, 128, 256\}$; the time interval $\Delta T \in \{1, 5, 7, 14\}$ days, used to determine dynamic friendship relations and spatio-temporal states; and the distance threshold $\Delta D \in \{0.5, 1, 3, 5\}$ km, used to determine local geospatial relations and spatio-temporal states. The experiments were conducted on the three datasets (NYC, JK, and CA), with MRR as the evaluation metric. The default values were $d = 128$, $\Delta T = 7$ days, and $\Delta D = 1$ km on NYC/JK or $\Delta D = 3$ km on CA. When one hyperparameter was varied, the other two were held at their default values.
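This one-at-a-time protocol can be sketched as follows. The grid values and defaults come from the text above; the function and key names are hypothetical, and training or evaluation itself is outside the scope of the sketch.

```python
# Candidate values for each hyperparameter, as listed in the text.
GRID = {
    "d": [32, 64, 128, 256],
    "delta_t_days": [1, 5, 7, 14],
    "delta_d_km": [0.5, 1, 3, 5],
}


def defaults_for(dataset: str) -> dict:
    """Default setting; only the distance threshold depends on the dataset."""
    return {"d": 128, "delta_t_days": 7, "delta_d_km": 3 if dataset == "CA" else 1}


def one_at_a_time(dataset: str) -> list:
    """Vary one hyperparameter at a time, holding the others at their defaults."""
    base = defaults_for(dataset)
    runs = []
    for name, values in GRID.items():
        for value in values:
            cfg = dict(base)
            cfg[name] = value
            if cfg not in runs:  # the default setting is shared by all three sweeps
                runs.append(cfg)
    return runs


nyc_runs = one_at_a_time("NYC")
print(len(nyc_runs))
for cfg in nyc_runs:
    print(cfg)
```

Each dataset therefore requires 10 distinct configurations (4 + 4 + 4 settings, minus the two repeated copies of the default), rather than the 64 configurations of a full grid search.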

1) Impact of Embedding Dimension $d$. As illustrated in Figure 10, when $d$ increases from 32 to 128, model performance improves significantly on all datasets: the MRR on NYC, JK, and CA increases by 12.3%, 9.8%, and 8.5%, respectively. Higher embedding dimensions enable more effective encoding of complex relational patterns (e.g., user-user, POI-POI, and user-POI interactions). At $d = 128$, the model achieves the best balance between computational efficiency and predictive performance, yielding the highest MRR across all datasets. When $d$ exceeds 128, however, performance plateaus or even degrades (e.g., a 2.1% drop in MRR on CA). A plausible explanation is overfitting caused by excessively high dimensionality: redundant features in large embeddings lead the model to capture noise rather than generalizable patterns in the data.

2) Impact of Time Interval $\Delta T$. As depicted in Figure 11, the influence of $\Delta T$ on model performance is non-monotonic, driven by a trade-off between the effectiveness of two core modules (DFEncoder and STSEncoder). When $\Delta T < 7$ days (e.g., $\Delta T = 1$ day), the explicit dynamic friendship graph $G_{df}$ becomes extremely sparse because of the narrowed time window, which sharply attenuates DFEncoder's enhancement effect. At the same time, the temporal dependencies between spatio-temporal states become excessively dense; this over-densification fails to leverage STSEncoder's modeling capability and introduces noise that increases model complexity and degrades recommendation accuracy (e.g., a 22.1% MRR drop on NYC). When $\Delta T = 7$ days, the model strikes the best balance between capturing users' short-term preferences and long-term spatio-temporal pattern similarity (e.g., weekly commuting behaviors), yielding the highest MRR on all datasets: NYC (0.3497), JK (0.3366), and CA (0.3468). When $\Delta T > 7$ days (e.g., $\Delta T = 14$ days), $G_{df}$ grows denser, strengthening the contribution of DFEncoder. However, the overly sparse (coarser-grained) spatio-temporal states lose fine-grained details of the global spatio-temporal relationships between POIs, weakening the effectiveness of STSEncoder. Given STSEncoder's dominant role (validated in the ablation study), the net effect is a performance decline (e.g., a 9.6% MRR drop on JK).

3) Impact of Spatial Distance Threshold $\Delta D$. As illustrated in Figure 12, the effect of $\Delta D$ on model performance varies across datasets, primarily driven by POI spatial density and the complementary interplay between GeoEncoder and STSEncoder. For the POI-dense, small-area datasets (NYC, JK), the model achieves its best performance at $\Delta D = 1$ km, which aligns with the empirical finding that users' daily movement radius is concentrated within 0.5-1 km. A smaller threshold ($\Delta D = 0.5$ km) narrows the local spatial scope, producing a sparser local geospatial relationship graph $G_{geo}$ and thus attenuating GeoEncoder's effectiveness. At the same time, the increased spatial grid density of the spatio-temporal states enhances STSEncoder's ability to capture fine-grained spatial patterns, so the two modules trade off against each other. Notably, NYC and JK yield higher MRR at $\Delta D = 0.5$ km than at $\Delta D = 3$ km or $\Delta D = 5$ km, suggesting that STSEncoder's performance contribution outweighs GeoEncoder's, a conclusion consistent with the ablation study. For the POI-sparse, large-area dataset (CA), a larger $\Delta D$ is required to capture long-range POI associations. The model reaches peak performance at $\Delta D = 3$ km, as this threshold balances coverage of sparse POIs against over-dense spatial connections. At $\Delta D = 5$ km, the overly expanded spatial scope leads to a denser $G_{geo}$ (strengthening GeoEncoder) but coarser-grained spatio-temporal states (weakening STSEncoder). Given STSEncoder's dominant role (validated in the ablation study), the net effect is a 5.0% MRR decline on CA.
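To make the effect of the threshold on graph density concrete, the sketch below builds a distance-threshold POI graph for several values of the threshold. It assumes that two POIs are linked when their great-circle (haversine) distance does not exceed the threshold; the distance function used by the authors is not stated in this section, and the POI coordinates below are hypothetical.

```python
import math
from itertools import combinations

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance between two (lat, lon) points in kilometres."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def geo_edges(pois: dict, threshold_km: float) -> list:
    """Undirected POI pairs whose distance is within the threshold (assumed rule)."""
    return [
        (a, b)
        for (a, pa), (b, pb) in combinations(sorted(pois.items()), 2)
        if haversine_km(pa[0], pa[1], pb[0], pb[1]) <= threshold_km
    ]


# Hypothetical POIs on one meridian, roughly 0.56, 2.22 and 4.45 km north of p1.
pois = {
    "p1": (40.7500, -73.9900),
    "p2": (40.7550, -73.9900),
    "p3": (40.7700, -73.9900),
    "p4": (40.7900, -73.9900),
}
edge_counts = [len(geo_edges(pois, t)) for t in (0.5, 1, 3, 5)]
print(edge_counts)
```

In this toy example the number of edges grows from none at 0.5 km to a complete graph at 5 km, mirroring the sparse-to-dense behavior of $G_{geo}$ described above.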

4) General Insight. Neither a smaller nor a larger $\Delta D$ uniformly improves performance: $\Delta D = 0.5$ km reduces the MRR on NYC/JK by 9.9%/9.7%, while $\Delta D = 5$ km degrades the MRR on CA. This non-monotonic pattern indicates that $\Delta D$ must be calibrated to dataset-specific spatial characteristics (density, scale) to optimize the GeoEncoder-STSEncoder trade-off.

5.8. Case Study

1) Dynamic Friendship. Mitigating geospatial distance limitations: We take a 4-step trajectory of user 48274: <77335, 2011-12-13> → <132792, 2011-12-23> → <90472, 2011-12-25> → <873490, 2011-12-26> (the first three points as input, the fourth as ground truth). Dynamic friend identification: Per the definition of dynamic friendship, user 48274’s dynamic friends at the 4th check-in (<873490, 2011-12-26>) are 1419576, 432267, 1205137, and 1674387. Geospatial context: As shown in Figure 13a, the ground truth POI 873490 is far from the user’s first three visited POIs (77335, 132792, 90472)—exceeding the 1 km distance threshold—indicating weak geographical correlation. Prediction outcome: GSTRDFA’s top-10 predictions include: 6345 (co-occurring in user 48274’s and dynamic friend 1674387’s recent visit history); 873490 (present in all dynamic friends’ recent visit records); 8 additional POIs from the union of the user’s and dynamic friends’ historical visits. This case confirms that dynamic friendship modeling enables GSTRDFA to transcend geospatial distance constraints: even when the target POI is geographically distant, the model leverages friend-shared behavioral patterns to make accurate predictions.
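The dynamic friend identification step in this case can be sketched as below. The sketch assumes that a dynamic friend is any other user who checked in at the same POI within the preceding time window (7 days by default), which follows the informal description in the Conclusions; the formal definition may differ in details such as boundary handling. The check-in log and all IDs in the example are hypothetical and are not taken from the dataset.

```python
from datetime import date, timedelta


def dynamic_friends(checkins, user, poi, day, window_days=7):
    """Other users who visited `poi` within `window_days` up to and including `day`.

    `checkins` is an iterable of (user_id, poi_id, date) tuples.
    The inclusive window [day - window_days, day] is an assumption.
    """
    start = day - timedelta(days=window_days)
    return sorted(
        {u for (u, p, d) in checkins if u != user and p == poi and start <= d <= day}
    )


# Hypothetical check-in log.
checkins = [
    ("u1", "poiA", date(2011, 12, 26)),  # the target user's current check-in
    ("u2", "poiA", date(2011, 12, 24)),  # same POI, inside the window
    ("u3", "poiA", date(2011, 12, 10)),  # same POI, outside the window
    ("u4", "poiB", date(2011, 12, 25)),  # inside the window, different POI
    ("u5", "poiA", date(2011, 12, 19)),  # same POI, exactly 7 days earlier
]
friends_week = dynamic_friends(checkins, "u1", "poiA", date(2011, 12, 26))
friends_day = dynamic_friends(checkins, "u1", "poiA", date(2011, 12, 26), window_days=1)
print(friends_week, friends_day)
```

Shrinking the window from 7 days to 1 day removes both friends in this toy log, which is the sparsification of $G_{df}$ discussed in the sensitivity analysis.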

2) Global Spatio-Temporal Relationships. Enhancing medium-distance prediction accuracy: We take a 4-step trajectory of user 17154: <12968, 2011-12-08> → <19766, 2011-12-09> → <60383, 2011-12-10> → <11492, 2011-12-11> (the first three points as input, the fourth as ground truth). We compare predictions from the full GSTRDFA model and the ablation variant w/o STSEncoder (see Table 5 for the data and Figure 13b for the visualization). Ranking disparity: 8 of the 10 predicted POIs overlap between the two models, but the ground truth POI 11492 ranks 3rd in the full model vs. 7th in w/o STSEncoder. Root cause analysis: The w/o STSEncoder variant relies solely on historical visit records and ranks POIs by geographical distance to the last visited POI (60383). Since 11492 is far from 60383 (beyond the distance threshold of $G_{geo}$), it is downranked. The full model integrates $G_{sts}$ (the spatio-temporal state graph) and STSEncoder, which together (i) expand the model's receptive field to include medium-distance POIs (beyond the $G_{geo}$ threshold but captured by $G_{sts}$); (ii) compute the loss for medium-distance POIs via a joint embedding of the previous POI and spatio-temporal states, compensating for geographical distance bias; and (iii) preserve short-distance prediction accuracy, because POIs in the same spatio-temporal state use the loss from the previous POI's intrinsic features, avoiding interference.
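The overlap and rank figures quoted in this case can be checked directly against the two top-10 lists in Table 5. In the sketch below the lists are copied from the table, and the variable and function names are illustrative.

```python
# Top-10 predictions copied from Table 5, ordered by rank.
FULL_MODEL = [19766, 12968, 11492, 10936, 782, 33767, 30931, 818, 1096, 29769]
WO_STS_ENCODER = [12968, 19766, 10936, 30931, 33767, 782, 11492, 299105, 1096, 260210]
GROUND_TRUTH = 11492


def rank_of(poi: int, ranked: list) -> int:
    """1-based rank of `poi` in a ranked list (raises ValueError if absent)."""
    return ranked.index(poi) + 1


overlap = set(FULL_MODEL) & set(WO_STS_ENCODER)
rank_full = rank_of(GROUND_TRUTH, FULL_MODEL)
rank_ablated = rank_of(GROUND_TRUTH, WO_STS_ENCODER)

print(len(overlap))             # number of POIs shared by the two lists
print(rank_full, rank_ablated)  # rank of the ground truth in each list
print(round(1 / rank_full, 4), round(1 / rank_ablated, 4))  # reciprocal ranks
```

For this single query, both variants count as a hit at Acc@5 only for the full model and as a hit at Acc@10 for both, while the reciprocal rank (the per-query term of MRR) rises from 1/7 to 1/3.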

6. Conclusions

Findings. Through systematic experiments and analysis, we found that GSTRDFA outperforms existing SOTA methods on all metrics, improving the accuracy of next-POI recommendation. (1) By capturing global spatio-temporal transition patterns among POIs (at the granularity of spatio-temporal states rather than individual POIs), the model can identify macroscopic patterns in users' long-distance movements, alleviating the cold-start problem. Experiments show that introducing the global spatio-temporal relationship expands the model's geospatial field of view during prediction, improving the prediction accuracy for next POIs at medium and long distances without degrading accuracy at short distances. (2) By capturing real-time behavioral similarities among users (such as checking in at the same POI within the last week), the model can discover potential POIs of interest through the dynamic friend network, easing the prediction of infrequently visited POIs at long distances. Experiments show that, with dynamic friendship relationships, the model can predict the correct POI through dynamic friends even when the geospatial relationship is weak (long-distance transitions). (3) By jointly modeling global and local geospatial relationships among POIs, together with dynamic friendships and static social relationships among users, GSTRDFA achieves a significant improvement in recommendation accuracy. This indicates that POI-relationship modeling and user-relationship modeling complement each other and together yield a comprehensive improvement in recommendation performance.

Limitations. (1) Data limitations: The city-level data were extracted from the global dataset by filtering on longitude and latitude ranges, so the data may be biased. (2) Method limitations: Parameter settings such as the geospatial distance threshold and the time interval were chosen empirically, and the spatio-temporal state partitioning is not fine-grained enough.

Future work. (1) Multi-granularity partitioning of spatio-temporal states, in which divisions at several granularities (e.g., quarter and sixteenth divisions) coexist. (2) A spatial-range definition of dynamic friendship, in which dynamic friends are identified within a certain spatial range of the current POI rather than at the current POI itself. (3) Extension to other application scenarios, such as epidemic spread prediction and tourism route planning.

Author Contributions

Xiaoyu Ji: Writing - original draft, Methodology, Data curation, Software, Formal analysis. Yibing Cao: Writing - review & editing, Conceptualization, Funding acquisition, Project administration. Jiangshui Zhang: Writing - review & editing, Supervision. Minjie Chen: Data curation, Software. Pengyu Cui: Data curation, Software. Yuan Yang: Software, Validation.

Funding

This research was funded by the National Key Laboratory of Avionics Integration and Aviation System-of-Systems Synthesis (Grant No. 2025AIASS0601).

Acknowledgments

This research was supported by the National Key Laboratory of Avionics Integration and Aviation System-of-Systems Synthesis (Grant No. 2025AIASS0601) and a Ministry-Province Cooperative Project under the Ministry of Natural Resources (Grant No. 2024ZRBSHZ152).

Conflicts of Interest

The authors declare no conflicts of interest.

References

[1] Xu, L.; Liu, Y.; Xu, T.; Chen, E.; Tang, Y. Graph augmentation empowered contrastive learning for recommendation. ACM Trans. Inf. Syst. 2025, 43(2), 1–27.
[2] Zhang, Q.; et al. A survey on point-of-interest recommendation: Models, architectures, and security. IEEE Trans. Knowl. Data Eng. 2025, 37(6), 3153–3172.
[3] Zhang, Q.; Xia, L.; Cai, X.; Yiu, S.-M.; Huang, C.; Jensen, C.S. Graph augmentation for recommendation. In: IEEE International Conference on Data Engineering (ICDE), Utrecht, Netherlands, 2024; pp. 557–569.
[4] Lian, D.; Zheng, K.; Ge, Y.; Cao, L.; Chen, E.; Xie, X. GeoMF++: Scalable location recommendation via joint geographical modeling and matrix factorization. ACM Trans. Inf. Syst. 2018, 36(3), 1–29.
[5] Rahmani, H.A.; Aliannejadi, M.; Ahmadian, S.; Baratchi, M.; Afsharchi, M.; Crestani, F. LGLMF: Local geographical based logistic matrix factorization model for POI recommendation. In: Asia Information Retrieval Symposium (AIRS), Hong Kong, China, 2019; pp. 66–78.
[6] Kang, W.; McAuley, J. Self-attentive sequential recommendation. In: IEEE International Conference on Data Mining (ICDM), Singapore, 2018; pp. 197–206.
[7] Qin, Y.; Wu, H.; Ju, W.; Luo, X.; Zhang, M. A diffusion model for POI recommendation. ACM Trans. Inf. Syst. 2023, 42(2), 1–27.
[8] Zuo, J.; Zhang, Y. Diff-DGMN: A diffusion-based dual graph multi-attention network for POI recommendation. IEEE Internet Things J. 2024, 11(23), 38393–38409.
[9] Long, J.; Ye, G.; Chen, T.; Wang, Y.; Wang, M.; Yin, H. Diffusion-based cloud-edge-device collaborative learning for next POI recommendations. In: 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (SIGKDD), Barcelona, Spain, 2024; pp. 2026–2036.
[10] Zhan, G.; Xu, J.; Huang, Z.; Zhang, Q.; Xu, M.; Zheng, N. A semantic sequential correlation based LSTM model for next POI recommendation. In: IEEE International Conference on Mobile Data Management (MDM), Hong Kong, China, 2019; pp. 128–137.
[11] Zhang, Y.; Lan, P.; Wang, Y.; Xiang, H. Spatio-temporal mogrifier LSTM and attention network for next POI recommendation. In: IEEE International Conference on Web Services (ICWS), Barcelona, Spain, 2022; pp. 17–26.
[12] Lai, Y.; Zeng, X. A POI recommendation model for intelligent systems using AT-LSTM in location-based social network big data. Int. J. Semant. Web Inf. Syst. 2023, 19(1), 1–15.
[13] Zang, H.; Han, D.; Li, X.; Wan, Z.; Wang, M. CHA: Categorical hierarchy-based attention for next POI recommendation. ACM Trans. Inf. Syst. 2021, 40(1), 1–22.
[14] Xie, J.; Chen, Z. Hierarchical transformer with spatio-temporal context aggregation for next point-of-interest recommendation. ACM Trans. Inf. Syst. 2023, 42(2), 1–30.
[15] Feng, S.; Meng, F.; Chen, L.; Shang, S.; Ong, Y. Rotan: A rotation-based temporal attention network for time-specific next POI recommendation. In: 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (SIGKDD), Barcelona, Spain, 2024; pp. 759–770.
[16] Li, P.; De, R.M.; Xue, H.; Ao, S.; Song, Y.; Salim, F.D. Large language models for next point-of-interest recommendation. In: 47th ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR), Washington DC, USA, 2024; pp. 1463–1472.
[17] Wang, D.; Huang, Y.; Gao, S.; Wang, Y.; Huang, C.; Shang, S. Generative next POI recommendation with semantic ID. In: 31st ACM SIGKDD Conference on Knowledge Discovery and Data Mining (SIGKDD), Toronto, Canada, 2025; pp. 2904–2914.
[18] Rahmani, H.A.; Aliannejadi, M.; Baratchi, M.; Crestani, F. A systematic analysis on the impact of contextual information on point-of-interest recommendation. ACM Trans. Inf. Syst. 2022, 40(4), 1–35.
[19] Xiong, F.; Sun, H.; Luo, G.; Pan, S.; Qiu, M.; Wang, L. Graph attention network with high-order neighbor information propagation for social recommendation. In: 33rd International Joint Conference on Artificial Intelligence (IJCAI), Jeju, Korea, 2024; pp. 2478–2486.
[20] Yu, C.; Shi, L.; Zhao, Y. Trajectory- and friendship-aware graph neural network with transformer for next POI recommendation. ISPRS Int. J. Geo-Inf. 2025, 14(5), 192.
[21] Li, X.; et al. Beyond individual and point: Next POI recommendation via region-aware dynamic hypergraph with dual-level modeling. In: 34th International Joint Conference on Artificial Intelligence (IJCAI), Guangzhou, China, 2025; pp. 3081–3089.
[22] Yang, D.; Qu, B.; Yang, J.; Cudré-Mauroux, P. LBSN2Vec++: Heterogeneous hypergraph embedding for location-based social networks. IEEE Trans. Knowl. Data Eng. 2020, 34(4), 1843–1855.
[23] Rao, X.; Chen, L.; Liu, Y.; Shang, S.; Yao, B.; Han, P. Graph-flashback network for next location recommendation. In: 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (SIGKDD), Washington DC, USA, 2022; pp. 1463–1471.
[24] Wei, X.; Liu, Y.; Sun, J.; Jiang, Y.; Tang, Q.; Yuan, K. Dual subgraph-based graph neural network for friendship prediction in location-based social networks. ACM Trans. Knowl. Discov. Data 2023, 17(3), 1–28.
[25] Li, J.; et al. Trajectory prediction enhanced by geographic knowledge graph and multi-spatio temporal constraints. Acta Geod. Cartogr. Sin. 2024, 53(10), 2021–2033.
[26] Kong, X.; Chen, Z.; Li, J.; Bi, J.; Shen, G. KGNext: Knowledge-graph-enhanced transformer for next POI recommendation with uncertain check-ins. IEEE Trans. Comput. Soc. Syst. 2024, 11(5), 6637–6648.
[27] Yin, F.; Liu, Y.; Shen, Z.; Chen, L.; Shang, S.; Han, P. Next POI recommendation with dynamic graph and explicit dependency. In: AAAI Conference on Artificial Intelligence (AAAI), Washington DC, USA, 2023; pp. 4827–4834.
[28] Ji, X.; Cao, Y.; Zhang, J.; Zhao, X. STSE: Spatio-temporal state embedding for knowledge graph completion. Knowl.-Based Syst. 2025, 317, 113469.
[29] Han, H.; et al. STGCN: A spatial-temporal aware graph learning method for POI recommendation. In: IEEE International Conference on Data Mining (ICDM), Sorrento, Italy, 2020; pp. 1052–1057.
[30] Wang, D.; Wang, X.; Xiang, Z.; Yu, D.; Deng, S.; Xu, G. Attentive sequential model based on graph neural network for next POI recommendation. World Wide Web 2021, 24(6), 2161–2184.
[31] Wang, Y.; Liu, A.; Fang, J.; Qu, J.; Zhao, L. ADQ-GNN: Next POI recommendation by fusing GNN and area division with quadtree. In: International Conference on Web Information Systems Engineering (WISE), Melbourne, Australia, 2021; pp. 177–192.
[32] Kim, J.; Jeong, S.; Park, G.; Cha, K.; Suh, I.; Oh, B. DynaPosGNN: Dynamic-Positional GNN for Next POI Recommendation. In: IEEE International Conference on Data Mining Workshops (ICDMW), Auckland, New Zealand, 2021; pp. 36–44.
[33] Sun, C.; et al. Attention-based graph neural networks: A survey. Artif. Intell. Rev. 2023, 56(2), 2263–2310.
[34] He, X.; He, W.; Liu, Y.; Lu, X.; Xiao, Y.; Liu, Y. ImNext: Irregular interval attention and multi-task learning for next POI recommendation. Knowl.-Based Syst. 2024, 293, 111674.
[35] Zhang, J.; Liu, X.; Zhou, X.; Chu, X. Leveraging graph neural networks for point-of-interest recommendations. Neurocomputing 2021, 462, 1–13.
[36] Zhang, J.; Ma, W. Hybrid structural graph attention network for POI recommendation. Expert Syst. Appl. 2024, 248, 123436.
[37] Xu, X.; et al. Revisiting mobility modeling with graph: A graph transformer model for next point-of-interest recommendation. In: 31st ACM International Conference on Advances in Geographic Information Systems (SIGSPATIAL), Hamburg, Germany, 2023; pp. 1–10.
[38] Zhang, J.; et al. Hyper-relational knowledge graph neural network for next POI recommendation. World Wide Web 2024, 27(4), 46.
[39] Qin, Y.; et al. DisenPOI: Disentangling sequential and geographical influence for point-of-interest recommendation. In: 16th ACM International Conference on Web Search and Data Mining (WSDM), Singapore, 2023; pp. 508–516.
[40] Liu, T.; Jiang, A.; Zhou, J.; Li, M.; Kwan, H.K. GraphSAGE-Based Dynamic Spatial-Temporal Graph Convolutional Network for Traffic Prediction. IEEE Trans. Intell. Transp. Syst. 2023, 24(10), 11210–11224.
[41] Qin, Y.; et al. Disentangling Geographical Effect for Point-of-Interest Recommendation. IEEE Trans. Knowl. Data Eng. 2023, 35(8), 7883–7897.
[42] Zhao, P.; et al. Where to Go Next: A Spatio-Temporal Gated Network for Next POI Recommendation. IEEE Trans. Knowl. Data Eng. 2022, 34(5), 2512–2524.
[43] Luo, Y.; Liu, Q.; Liu, Z. STAN: Spatio-temporal attention network for next location recommendation. In: ACM Web Conference (WWW), Ljubljana, Slovenia, 2021; pp. 2177–2185.
[44] Lian, D.; Wu, Y.; Ge, Y.; Xie, X.; Chen, E. Geography-aware sequential location recommendation. In: 26th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (SIGKDD), Virtual Event, 2020; pp. 2009–2019.
[45] Yang, S.; Liu, J.; Zhao, K. GETNext: Trajectory flow map enhanced transformer for next POI recommendation. In: 45th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR), Madrid, Spain, 2022; pp. 1144–1153.
[46] Pan, X.; Cai, X.; Xu, S.; Zhang, Y.; Nie, P.; Yuan, X. GeoCo: Geographical Correlation Enhanced Network for POI Recommendation. IEEE Trans. Knowl. Data Eng. 2024, 36(12), 8362–8376.

Figure 1. Description of global spatio-temporal relationships and dynamic friendships.

Figure 2. Framework of GSTRDFA.

Figure 3. Illustration of global spatio-temporal relationship graph $G_{sts}$.

Figure 4. Mapping global spatio-temporal relationship between POIs to $G_{sts}$.

Figure 5. Illustration of dynamic friendship graph $G_{df}$.

Figure 6. Structure of Global Spatio-Temporal Relationship-Aware Encoder.

Figure 7. Structure of STSEncoder.

Figure 8. Performance comparison of the best models in each group.

Figure 9. Impact of module ablation on NYC, JK, and CA datasets.

Figure 10. MRR on 3 datasets with $d = \{32, 64, 128, 256\}$.

Figure 11. MRR on 3 datasets when $\Delta T = \{1, 3, 5, 7, 14\}$ days.

Figure 12. MRR on 3 datasets when $\Delta D = \{0.5, 1, 3, 5\}$ km.

Figure 13. Case study on the impact of global spatio-temporal relationships and dynamic friendships.

Table 1. Statistics of datasets.

Statistic	NYC	JK	CA
#Users	81948	8657	109674
#POIs	73995	6376	173292
#Check-ins	156066	10357	202486
#SocialLinks	298606	248	295900
Time span	2011/12/08 to 2012/04/23	2011/12/08 to 2012/04/23	2011/12/08 to 2012/04/23
Latitude extent	40.501545 to 40.916965	-6.379921 to -6.102930	32.50144 to 41.99948
Longitude extent	-74.25498 to -73.70002	106.730095 to 106.979420	-124.265854 to -114.002945

Table 2. Summary of baseline models.

Group	Model	Description
(a) Direct sequence prediction	STGN [42]	Spatio-temporal RNN
(a) Direct sequence prediction	STAN [43]	Spatio-temporal attention
(b) User-user enhanced	LBSN2Vec++ [22]	Social heterogeneous hypergraph
(b) User-user enhanced	DSGNN [24]	Dual GNN incorporating social links
(b) User-user enhanced	Graph-Flashback [23]	KG without dynamic friendships
(c) POI-POI enhanced	GeoSAN [44]	Geo-enhanced with GPS constraints
(c) POI-POI enhanced	GETNext [45]	Trajectory flow KG enhanced
(c) POI-POI enhanced	GeoCo [46]	Fine-grained hierarchical sequences
(d) Recent	KGNext [26]	Hybrid soc- and geo-enhanced based on KG and Transformer
(d) Recent	SNPM [27]	Dynamic graph and explicit dependency

Table 3. Performance comparison in Acc@K and MRR on the NYC, JK, and CA datasets. The last row gives the relative improvement of GSTRDFA (Ours) over the best baseline on each dataset.

Model	NYC				JK				CA
	Acc@1	Acc@5	Acc@10	MRR	Acc@1	Acc@5	Acc@10	MRR	Acc@1	Acc@5	Acc@10	MRR
STGN	0.1423	0.3281	0.4124	0.2215	0.1282	0.3104	0.3957	0.2053	0.1357	0.3112	0.3983	0.2104
STAN	0.1736	0.3819	0.4738	0.2657	0.1638	0.3587	0.4472	0.2519	0.1687	0.3679	0.4604	0.2573
LBSN2Vec++	0.1587	0.3562	0.4389	0.2438	0.1398	0.3246	0.4075	0.2202	0.1539	0.3428	0.4286	0.2361
DSGNN	0.1852	0.4027	0.4924	0.2783	0.1473	0.3391	0.4268	0.2335	0.1812	0.3926	0.4798	0.2718
Graph-Flashback	0.2035	0.4319	0.5237	0.2984	0.1703	0.3785	0.4689	0.2594	0.2054	0.4298	0.5229	0.2992
GeoSAN	0.1674	0.3725	0.4613	0.2582	0.1625	0.3583	0.4492	0.2498	0.1876	0.4032	0.4937	0.2775
GETNext	0.1928	0.4186	0.5079	0.2984	0.1745	0.3784	0.4653	0.2637	0.1978	0.4156	0.5084	0.2886
GeoCo	0.2189	0.4574	0.5523	0.3217	0.2184	0.4537	0.5486	0.3273	0.2132	0.4417	0.5368	0.3114
KGNext	0.2112	0.4458	0.5386	0.3093	0.1907	0.4043	0.4937	0.2824	0.2059	0.4289	0.5183	0.3042
SNPM	0.2256	0.4689	0.5638	0.3325	0.1983	0.4175	0.5064	0.2926	0.2219	0.4623	0.5582	0.3351
Ours	0.2357	0.4996	0.6014	0.3497	0.2245	0.4669	0.5654	0.3366	0.2291	0.4828	0.5844	0.3468
Improvements	4.48%	6.55%	6.67%	5.17%	2.79%	2.91%	3.06%	2.84%	3.24%	4.43%	4.69%	3.49%

Table 4. Ablation study on the NYC dataset. Drop is the relative decrease in MRR with respect to the full model.

Variant	Acc@1	Acc@5	Acc@10	MRR	Drop
Full Model	0.2357	0.4996	0.6014	0.3497	-
w/o SocEncoder	0.2259	0.4815	0.5813	0.3367	-3.7%
w/o DFEncoder	0.2164	0.4636	0.5612	0.3232	-7.6%
w/o DFAE	0.2085	0.4493	0.5442	0.3126	-10.6%
w/o GeoEncoder	0.2211	0.4724	0.5706	0.3295	-5.8%
w/o STSEncoder	0.2022	0.4368	0.5306	0.3036	-13.2%
w/o GSTRAE	0.1928	0.4189	0.5106	0.2905	-16.9%
w/o GSTRDFAE	0.1592	0.3706	0.4547	0.2374	-32.1%

Table 5. Top-10 predicted POIs of the full model (FM) and the w/o STSEncoder variant (WS), ordered by rank.

Rank	1	2	3	4	5	6	7	8	9	10
FM	19766	12968	11492	10936	782	33767	30931	818	1096	29769
WS	12968	19766	10936	30931	33767	782	11492	299105	1096	260210

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Copyright: This open access article is published under a Creative Commons CC BY 4.0 license, which permits free download, distribution, and reuse, provided that the author and preprint are cited in any reuse.
