1. Introduction
The Triangle Finding Problem is a cornerstone of graph theory and computer science, with applications spanning social network analysis, computational biology, and database optimization. Given an undirected graph , where represents the number of vertices and the number of edges, the problem has two primary variants:
Decision Version: Determine whether G contains at least one triangle—a set of three vertices where edges all exist.
Listing Version: Enumerate all such triangles in G.
This problem is significant in fields such as social network analysis (e.g., identifying tightly-knit communities), computational biology (e.g., detecting protein interaction motifs), and database systems (e.g., optimizing multi-way joins). Efficient algorithms for this problem are essential for processing the large-scale graphs common in these applications.
Traditional approaches to triangle detection range from brute-force
enumeration to matrix multiplication-based methods in
time, where
is the fast matrix multiplication exponent [
1]. For sparse graphs (
), combinatorial algorithms like those by Chiba and Nishizeki achieve
, where
is the graph’s arboricity [
2]. However, fine-grained complexity theory introduces conjectured barriers: the sparse triangle hypothesis posits no combinatorial algorithm can achieve
for
, while the dense triangle hypothesis suggests
as a lower bound.
This paper presents the Aegypti algorithm, a novel Depth-First Search (DFS)-based solution that detects triangles in time-linear in the graph’s size-potentially challenging these barriers. Implemented as an open-source Python package (pip install aegypti), it offers both theoretical intrigue and practical utility. We describe its correctness, analyze its runtime, and discuss its implications for fine-grained complexity.
2. State-of-the-Art Algorithms
2.1. Naive Approach
The simplest method to solve the Triangle Finding Problem is to examine every possible triplet of vertices in the graph. For a graph with n vertices, there are such triplets. For each triplet , we check if the edges , , and exist in E. This approach has a time complexity of , making it computationally expensive and impractical for large graphs, though it is conceptually straightforward.
2.2. Matrix Multiplication-Based Methods
A more sophisticated approach uses matrix multiplication. Let
A be the adjacency matrix of the graph, where
if
and 0 otherwise. Computing
, the cube of the adjacency matrix, allows triangle detection because the trace of
(the sum of its diagonal elements) is non-zero if and only if a triangle exists. This follows from the fact that
counts the number of length-3 closed walks starting and ending at vertex
i, which includes triangles. With the fastest known matrix multiplication algorithms, this method runs in
time, where
is the matrix multiplication exponent [
1]. This is efficient for dense graphs but does not achieve linear time.
2.3. Algorithms for Sparse Graphs
For sparse graphs, where the number of edges
m is much less than
, tailored algorithms perform better. One notable example is the algorithm by Chiba and Nishizeki (1985) [
2]. It exploits the graph’s arboricity
, a measure of how densely edges are distributed, and runs in
time. Since
(the maximum degree of the graph), this approach is particularly effective for graphs with low maximum degree. It works by iteratively processing vertices and their neighbors to identify triangles, optimizing for sparsity.
2.4. Algorithms for Dense Graphs
In dense graphs, where , matrix multiplication-based methods are often preferred due to their sub-cubic complexity. However, combinatorial algorithms also exist that leverage the high edge density to achieve competitive performance, typically matching the bound. These methods often involve preprocessing the graph to reduce redundant checks. Fine-grained complexity theory suggests lower bounds, conjecturing that no combinatorial algorithm can solve the problem in time for sparse graphs or for dense graphs, where , based on reductions from problems like 3SUM.
3. The Aegypti Algorithm
Our algorithm introduces a novel solution to the Triangle Finding Problem using Depth-First Search (DFS). It begins at each unvisited vertex and explores the graph by maintaining a stack of (node, parent) pairs. For each vertex v being explored and its parent u, it examines v’s neighbors. If a neighbor w of v has an edge back to u, the triplet forms a triangle. This process ensures triangles are detected efficiently.
Unlike the naive approach, matrix multiplication’s bound, or sparse graph methods like , our algorithm achieves a time complexity of —linear in the size of the graph. This is possible because each vertex and edge is processed a constant number of times during the DFS traversal, avoiding redundant computations. This linear-time performance holds across all graph densities, distinguishing it from existing methods that optimize for either sparse or dense graphs but not both.
This algorithm challenges the fine-grained complexity conjectures, suggesting that the Triangle Finding Problem may be solvable faster than previously assumed. Practically, it has been implemented as a Python package named aegypti, installable via pip, making it accessible for testing and deployment. This approach promises significant advancements in both theoretical graph algorithms and real-world applications involving large graphs.
The algorithm, implemented in the
Aegypti Python package, operates on an undirected NetworkX graph
G.
Figure A1 presents the core implementation.
3.1. Key Features
Input Validation: Ensures G is a NetworkX graph.
Visited Tracking: Uses a set to mark processed nodes, preventing cycles.
DFS Traversal: Employs a stack for iterative DFS, tracking each node’s parent.
Triangle Detection: Identifies a triangle when is visited and an edge exists between and .
Avoiding Duplicates: Stores triangles as in a set.
Early Termination: Returns after the first triangle if .
Disconnected Graphs: Outer loop ensures all components are explored.
3.2. Correctness
Theorem 1. The DFS-based algorithm correctly identifies all triangles in an undirected graph when , and detects at least one triangle if any exist when .
Proof. We prove the theorem by establishing two properties: completeness (all triangles are detected) and soundness (only actual triangles are reported). □
3.2.1. Completeness
To show that every triangle in G is detected when , consider an arbitrary triangle with edges .
Since G may be disconnected, the outer loop over all nodes ensures that DFS is initiated from an unvisited node in each connected component. Thus, the component containing is fully explored.
During DFS, a spanning tree T is constructed for the component. Because forms a cycle of length 3, at least one edge in T is a back edge (an edge from a node to an ancestor in T). We demonstrate that the algorithm detects the triangle in any DFS traversal order:
-
Case 1: DFS processes .
- −
u is visited, v is processed with .
- −
w is processed with , .
- −
Since and (as G is undirected, ), is detected.
-
Case 2: DFS processes .
- −
w is visited, u is processed with .
- −
v is processed with , .
- −
Since and , is detected.
-
Case 3: DFS processes .
- −
u is visited, w is processed with .
- −
v is processed with , .
- −
Since and , is detected.
In each case, the triangle is identified when processing the node that encounters the back edge closing the cycle. The condition ensures are distinct. Since the algorithm explores all nodes and edges, and G is undirected, every triangle is detected at least once. The use of a set for eliminates duplicates.
For , the algorithm terminates after detecting the first triangle, ensuring at least one is found if any exist.
3.2.2. Soundness
To prove that only actual triangles are reported, examine the detection condition. A set is added to only if:
(via DFS tree edge),
(neighbor relation),
(explicit check),
(distinct nodes).
This precisely matches the definition of a triangle. Thus, no false positives occur.
3.2.3. Conclusion
The algorithm is complete (detects all triangles when , or at least one when ) and sound (reports only triangles). Hence, it correctly identifies all triangles in G.
4. Runtime Analysis
The algorithm’s complexity is evaluated under two configurations: first_triangle=True, where it terminates upon detecting the first triangle, and first_triangle=False, where it identifies all triangles in the graph.
Theorem 2. Let be an undirected graph with nodes and edges, and let t represent the number of triangles in G. The DFS-based triangle detection algorithm exhibits the following runtime complexities:
-
Forfirst_triangle=True:
- −
Best-case runtime:
- −
Worst-case runtime:
-
Forfirst_triangle=False:
- −
Best-case runtime: (when )
- −
Worst-case runtime:
Proof. The proof is structured by analyzing the algorithm’s behavior in both cases, focusing on graph traversal, edge checks, and triangle processing. □
4.1. General Observations
Graph Representation: The graph is stored as an adjacency list (e.g., using NetworkX), enabling edge existence checks and efficient neighbor traversal.
DFS Traversal: DFS visits each node and edge exactly once in the worst case, yielding a baseline complexity of .
Triangle Detection: For each edge , the algorithm checks for a triangle by verifying an edge between v and the parent of u, an operation per edge.
4.2. Case 1: first_triangle=True
Here, the algorithm halts after detecting the first triangle.
Worst-Case Runtime:
The algorithm may traverse the entire graph if no triangle exists or if the first triangle is found late, visiting all n nodes and m edges in time.
For each edge, an triangle check is performed, contributing time in total.
Thus, the worst-case runtime is .
Best-Case Runtime:
If a triangle is detected early (e.g., after exploring only a few nodes), the algorithm terminates in as little as time, depending on the traversal depth at termination.
Hence, the runtime ranges from (best case) to (worst case).
4.3. Case 2: first_triangle=False
Here, the algorithm detects all triangles in the graph.
Worst-Case Runtime:
DFS traversal covers all n nodes and m edges, requiring time.
Each edge undergoes an triangle check, totaling time across all edges.
Each of the t detected triangles is stored in a set (e.g., as a frozenset), with an average insertion time of per triangle, adding time.
Thus, the total worst-case runtime is .
Best-Case Runtime:
Thus, the runtime is when and otherwise.
4.4. Summary of Runtime Analysis
The runtime complexities are summarized as follows:
| Case |
Best-Case Runtime |
Worst-Case Runtime |
| first_triangle=True |
|
|
| first_triangle=False |
|
|
4.5. Key Observations
Early Termination (first_triangle=True): The algorithm’s efficiency shines when a triangle is found early, potentially avoiding a full graph traversal.
Full Exploration (first_triangle=False): The runtime grows linearly with t, becoming significant in dense graphs with many triangles.
DFS Efficiency: Processing each node and edge at most once ensures efficiency, particularly in sparse graphs.
Space Complexity: The algorithm uses space for tracking visited nodes and space for storing triangles.
4.6. Practical Implications
For first_triangle=True, the algorithm is ideal when only one triangle is needed, offering potential early termination.
For first_triangle=False, large t values in dense graphs may necessitate optimizations, such as depth limits or approximate methods.
5. Research Data
A Python implementation,
(in memory of pioneering epidemiologist Carlos Juan Finlay), has been developed for the Triangle Finding Problem. This implementation is publicly accessible through the Python Package Index (PyPI) [
3]. By setting the
parameter to
, the algorithm can identify and count all triangles.
Table 1 presents the ancillary data for code metadata.
6. Illustrative Examples
To illustrate the correctness of the algorithm, we walk through several examples and verify that it correctly identifies all triangles in various graphs. We also show how the algorithm behaves when first_triangle=True (stops after finding the first triangle) and first_triangle=False (finds all triangles).
6.1. Example 1: Simple Triangle
Graph:
Nodes:
Edges:
This graph contains a single triangle: .
Execution:
Start DFS traversal from node 1.
Process neighbors of 1: 2 is visited next.
-
Process neighbors of 2: 3 is visited next.
- −
When processing 3, its neighbor 1 is already visited. The algorithm checks if 2 (parent of 3) is connected to 1. Since exists, the triangle is detected.
Results:
If first_triangle=True: Returns .
If first_triangle=False: Returns .
6.2. Example 2: Two Disconnected Triangles
Graph:
Nodes:
Edges:
This graph contains two disconnected triangles: and .
Execution:
Start DFS traversal from node 1:
-
Detects triangle as described in Example 1.
- −
Continue DFS traversal from node 4:
- −
Process 5, then 6.
- −
When processing 6, its neighbor 4 is already visited. The algorithm checks if 5 (parent of 6) is connected to 4. Since exists, the triangle is detected.
Results:
If first_triangle=True: Returns (stops after finding the first triangle).
If first_triangle=False: Returns (finds both triangles).
6.3. Example 3: Overlapping Triangles
Graph:
Nodes:
Edges:
This graph contains two overlapping triangles:
Execution:
Results:
If first_triangle=True: Returns (stops after finding the first triangle).
If first_triangle=False: Returns (finds both triangles).
6.4. Example 4: Graph Without Triangles
Graph:
Nodes:
Edges:
This graph does not contain any triangles.
Execution:
Results:
6.5. Example 5: Complex Graph with Multiple Triangles
Graph:
Nodes:
Edges:
This graph contains three triangles:
Execution:
Results:
If first_triangle=True: Returns (stops after finding the first triangle).
If first_triangle=False: Returns (finds all triangles).
6.6. Summary of Results
| Example |
first_triangle=True |
first_triangle=False |
| Simple Triangle |
|
|
| Two Disconnected |
|
|
| Overlapping Triangles |
|
|
| No Triangles |
None |
None |
| Complex Graph |
|
|
6.7. Conclusion
The algorithm consistently detects all triangles in the graph when first_triangle=False and stops early when first_triangle=True. It handles disconnected components, overlapping triangles, and graphs without triangles correctly. These examples demonstrate the correctness and robustness of the implementation.
7. Impact
The triangle-finding problem is a fundamental task in graph theory and network analysis, with applications in social network analysis, bioinformatics, and computational geometry. The development of linear-time algorithms for this problem has had significant theoretical and practical implications.
7.1. Theoretical Impact
Linear-time algorithms for triangle finding represent a breakthrough in computational complexity. Traditionally, the best-known algorithms for detecting triangles in a graph have a time complexity of , where is the matrix multiplication exponent ( as of current knowledge). However, linear-time algorithms achieve complexity for sparse graphs, where m is the number of edges and n is the number of nodes. This improvement reduces the computational overhead significantly for large-scale sparse graphs, which are common in real-world applications.
7.2. Practical Impact
Efficiency in Large-Scale Graphs: Linear-time algorithms enable efficient triangle detection in massive graphs, such as social networks or web graphs, where m and n can grow into the millions or billions.
Scalability: These algorithms allow real-time or near-real-time analysis of dynamic graphs, which is crucial for applications like fraud detection, recommendation systems, and community detection in social networks.
Foundation for Advanced Algorithms: Linear-time triangle finding serves as a building block for more complex graph algorithms, such as clustering coefficient computation, motif counting, and dense subgraph discovery.
7.3. Applications
The impact of linear-time triangle finding extends to various domains:
7.4. Conclusion
Linear-time algorithms for triangle finding have revolutionized graph analysis by providing a computationally efficient solution to a classical problem. Their impact spans theoretical advancements, practical scalability, and diverse real-world applications, making them a cornerstone of modern graph algorithms.
8. Conclusion
The Aegypti algorithm offers a linear-time solution to triangle finding, potentially revolutionizing our understanding of graph algorithm complexity. The Aegypti algorithm’s runtime is striking against fine-grained complexity conjectures:
Sparse Triangle Hypothesis: For , beats , suggesting , violating the conjecture.
Dense Triangle Hypothesis: For , outperforms , with .
A linear-time algorithm for the triangle finding problem would imply that numerous other problems can also be solved in subquadratic time, as this problem is 3SUM-hard [
6]. Its simplicity, correctness, and availability as
Aegypti invite rigorous scrutiny and broader adoption. Deployed via aegypti (PyPI), the algorithm is accessible for real-world use [
3].
Acknowledgments
The author would like to thank Iris, Marilin, Sonia, Yoselin, and Arelis for their support.
Appendix A
Figure A1.
A linear-time solution to the Triangle Finding Problem in Python.
Figure A1.
A linear-time solution to the Triangle Finding Problem in Python.
References
- Alon, N.; Yuster, R.; Zwick, U. Finding and counting given length cycles. Algorithmica 1997, 17, 209–223. [Google Scholar] [CrossRef]
- Chiba, N.; Nishizeki, T. Arboricity and Subgraph Listing Algorithms. SIAM Journal on computing 1985, 14, 210–223. [Google Scholar] [CrossRef]
- Vega, F. Aegypti: Triangle-Free Solver. https://pypi.org/project/aegypti. Accessed August 4, 2025.
- Milo, R.; Shen-Orr, S.; Itzkovitz, S.; Kashtan, N.; Chklovskii, D.; Alon, U. Network Motifs: Simple Building Blocks of Complex Networks. Science 2002, 298, 824–827. [Google Scholar] [CrossRef] [PubMed]
- Newman, M.E. The Structure and Function of Complex Networks. SIAM Review 2003, 45, 167–256. [Google Scholar] [CrossRef]
- Patrascu, M. Towards polynomial lower bounds for dynamic problems. In Proceedings of the Proceedings of the Forty-Second ACM Symposium on Theory of Computing, New York, NY, USA, 2010; STOC ’10, p. 603–610. [CrossRef]
Table 1.
Code metadata for the Aegypti package.
Table 1.
Code metadata for the Aegypti package.
| Nr. |
Code metadata description |
Metadata |
| C1 |
Current code version |
v0.3.3 |
| C2 |
Permanent link to code/repository used for this code version |
https://github.com/frankvegadelgado/finlay |
| C3 |
Permanent link to Reproducible Capsule |
https://pypi.org/project/aegypti/ |
| C4 |
Legal Code License |
MIT License |
| C5 |
Code versioning system used |
git |
| C6 |
Software code languages, tools, and services used |
python |
| C7 |
Compilation requirements, operating environments & dependencies |
Python ≥ 3.12 |
|
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content. |
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).