Submitted: 03 December 2025
Posted: 07 December 2025
Abstract
Keywords:
1. Introduction
1.1. Background
1.2. Motivation
- Fragmented Dependency and Coordination Bottlenecks
- Technical Debt and Productivity Loss
- Performance and Scalability Constraints
- Cognitive Overhead and Developer Friction
1.3. Contributions
- 1. Graph-Theoretic Formal Verified Development Framework
- 2. Three-Level Encapsulation for Hierarchical Data
- 3. Machine-Checked Formal Verification
- 4. Rigorous Industrial Validation
2. Related Work
2.1. Domain-Driven Design, Collaborative Modeling, and Low-Code Platforms
2.2. Formal Methods, LTL, and Model-Driven Engineering
2.3. State-Based and Traversal-Oriented Approaches
2.4. Encoded Data Structures and Hierarchical Storage
2.5. Synthesis and Positioning of PBFD/PDFD
3. Formal Framework and Methodologies
3.1. Introduction and Motivation
- Termination: The development process completes in finite time, visiting all reachable vertices.
- Deadlock freedom: No circular dependency chains prevent progress (i.e., the graph is acyclic or cycles are explicitly managed).
- Dependency satisfaction: All prerequisite vertices are processed before their dependents, respecting the partial order imposed by E.
- Completeness: All vertices representing required system components are eventually processed and verified.
- Structural diagrams visualize workflow architecture and traversal paths.
- State machines define precise operational semantics and control logic.
- Unified transition tables specify deterministic rules linking states, conditions, and actions.
- Pseudocode encodes algorithmic logic for traversal, validation, and refinement.
- Communicating Sequential Processes (CSP) [45] model concurrent execution and inter-process communication, with execution traces serving as the semantic basis for temporal verification.
- Linear Temporal Logic (LTL) [60] specifies global temporal properties—such as liveness, termination, and rollback safety—to be proven over all possible CSP traces.
3.2. Formal Notation and Communication Conventions
- Pseudocode: Defined as Procedure [Name](...) with explicit inputs, outputs, and traversal logic.
- CSP Specifications: All formal models use synchronous channels to represent communication and control flow. Each specification is validated in FDR 4.2.7, with complete source code and verification scripts available in Appendices A.2–A.7 and the linked GitHub repositories.
- Unified Transition Tables: Specify formal transition rules between states, including conditions, actions, and branching logic.
- Structural Diagrams: Mermaid-based diagrams visualize workflow structure and state transitions. Source code is provided in the respective appendices.
- Cross-Representational Mappings: Appendices A.2–A.7 include full mappings between pseudocode, CSP specifications, and transition tables, ensuring consistency and enabling reproducibility across diverse implementation contexts.
3.3. Basic Methodologies
- Directed Acyclic Development (DAD): Enforces strict, non-cyclic dependencies to ensure monotonic progress and traceability. Its full formal specification is provided in Appendix A.2.
- Depth-First Development (DFD): Derived from depth-first search (DFS). Prioritizes vertical exploration by completing deep dependency chains before addressing sibling units. Its full formal specification is provided in Appendix A.3.
- Breadth-First Development (BFD): Derived from breadth-first search (BFS). Promotes horizontal, level-wise traversal to maintain cross-component consistency at each stage. Its full formal specification is provided in Appendix A.4.
- Cyclic Directed Development (CDD): Based on cyclic directed graphs. Incorporates bounded feedback loops within otherwise acyclic workflows, supporting structured reprocessing for iterative refinement. Its full formal specification is provided in Appendix A.5.
3.3.1. Directed Acyclic Development (DAD)
- Definition and Formalization
- Nodes represent components (e.g., modules, tasks).
- Edges represent irreversible dependencies ((u, v) means u must complete before v).
- Acyclicity ensures no cycles exist, preventing deadlocks or circular dependencies.
- 2. Key Characteristics
- 3. Workflow Representation
- 4. State Descriptions
- 5. Unified State Transition Table
- 6. State Machine Diagram
- 7. CSP Formal Verification Results and Guarantees for DAD
- Interpretation & Contributions
- Nodes are processed immediately after being dequeued (DA2).
- Dependency validation occurs immediately after processing (DA2 → DA3/DA4).
- Children are generated only once all dependencies are completed (DA3).
- Generated children are properly enqueued for subsequent processing (DA3).
- Missing dependencies properly trigger DAG extension while preserving acyclicity (DA4 & DA5).
- Final validation occurs only after complete processing (DA6).
- System can always reach a successful or error termination state.
- Supports correct dependency-first construction of hierarchical software components
- Ensures topological order execution and integrity of the DAG
- Allows incremental graph extension while maintaining acyclic structure
- Avoids deadlocks, livelocks, and nondeterministic processing
- 8. LTL Properties
- 9. Advantages
- 10. Example Use Case
- Root: Continent (e.g., “Africa”)
- Hierarchy: Country → Province → Commune
- Termination: Process completes at leaf nodes (communes)
- Dependencies: Unidirectional (e.g., Africa → Algeria → Adrar Province)
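The dependency-first ordering that DAD requires can be illustrated with a standard topological sort (Kahn's algorithm). This is a minimal sketch under stated assumptions, not the formal specification in Appendix A.2: the function name `dad_order` and the edge-list input format are hypothetical, and the example edges reuse the hierarchy from the use case above.

```python
from collections import deque


def dad_order(edges):
    """Return a dependency-first processing order for a DAG.

    edges: iterable of (u, v) pairs meaning u must complete before v.
    Raises ValueError if a cycle prevents a complete ordering.
    (Hypothetical helper; not taken from the paper's appendices.)
    """
    successors = {}
    indegree = {}
    for u, v in edges:
        successors.setdefault(u, []).append(v)
        successors.setdefault(v, [])
        indegree.setdefault(u, 0)
        indegree[v] = indegree.get(v, 0) + 1

    # Start with nodes that have no unmet dependencies.
    queue = deque(n for n, d in indegree.items() if d == 0)
    order = []
    while queue:
        node = queue.popleft()
        order.append(node)
        for child in successors[node]:
            indegree[child] -= 1
            if indegree[child] == 0:  # all dependencies completed
                queue.append(child)

    if len(order) != len(indegree):
        raise ValueError("cycle detected: graph is not a DAG")
    return order


edges = [("Africa", "Algeria"), ("Algeria", "Adrar Province")]
order = dad_order(edges)
print(order)  # ['Africa', 'Algeria', 'Adrar Province']
```

A node is enqueued only once its last outstanding dependency has been processed, which mirrors the "children are generated only once all dependencies are completed" guarantee; the final length check plays the role of an acyclicity assertion.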
3.3.2. Depth-First Development (DFD)
- Definition and Formalization
- 2. Key Characteristics
- 3. Workflow Representation
- 4. State Descriptions
- 5. Unified State Transition Table
- 6. State Machine Diagram
- 7. CSP Formal Verification Results and Guarantees for DFD
- Interpretation & Contributions
- Nodes are processed as soon as they are dequeued (DF2–DF3).
- Non-leaf nodes correctly push their children before descent.
- Leaf processing reliably initiates the backtracking sequence.
- The system cannot stall in backtracking or validation cycles (DF5–DF7).
- All hierarchical paths are completed before termination.
- Final termination is guaranteed once traversal is exhausted.
- Supports correct recursive descent through hierarchical structures using deterministic stack operations
- Ensures subtree completion before parent-level progression
- Avoids deadlocks, livelocks, and nondeterministic backtracking
- 8. LTL Properties
- 9. Advantages
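The stack-driven descent and backtracking that DFD relies on can be sketched as an iterative depth-first traversal. The code below is illustrative only and assumes a tree given as a dictionary of child lists; `dfd_order` and its return format are hypothetical names, not part of the specification in Appendix A.3.

```python
def dfd_order(tree, root):
    """Visit a rooted tree depth-first using an explicit stack.

    tree: dict mapping each node to a list of its children.
    Returns (visit order, completion order); a node completes only
    after its whole subtree has completed.
    """
    visited, completed = [], []
    stack = [(root, False)]
    while stack:
        node, expanded = stack.pop()
        if expanded:
            completed.append(node)  # backtracking: subtree finished
            continue
        visited.append(node)
        stack.append((node, True))  # revisit this node after its children
        # Push children in reverse so the leftmost child is processed first.
        for child in reversed(tree.get(node, [])):
            stack.append((child, False))
    return visited, completed


tree = {"A": ["B", "C"], "B": ["D"], "C": [], "D": []}
visited, completed = dfd_order(tree, "A")
print(visited)    # ['A', 'B', 'D', 'C']
print(completed)  # ['D', 'B', 'C', 'A']
```

The completion order shows the "subtree completion before parent-level progression" property: `B` completes only after `D`, and the root completes last.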
3.3.3. Breadth-First Development (BFD)
- Definition and Formalization
- 2. Key Characteristics
- 3. Workflow Representation
- 4. State Descriptions
- 5. Unified State Transition Table
- 6. State Machine Diagram
- 7. CSP Formal Verification Results and Guarantees for BFD
- Interpretation & Contributions
- Each node in the current level queue is dequeued and processed before moving to the next node.
- Level advancement occurs only after all nodes in the current level are validated.
- BFD can always successfully reach the termination state terminate_successfully_actual.
- All nodes and levels are fully processed, ensuring liveness and preventing livelock (BF5).
- Supports safe, level-by-level processing of hierarchical structures
- Guarantees full completion and validation of each level before moving to the next
- Prevents deadlocks or livelocks while ensuring predictable, deterministic behavior
- Ensures internal consistency and milestone integrity through explicit assertions on processing order, validation, and termination
- 8. LTL Properties
- 9. Advantages
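The level-wise rule of BFD, where a level must be fully validated before the next one begins, can be sketched as follows. This is a simplified illustration assuming a tree stored as a dictionary of child lists; `bfd_levels` and the `validate` callback are hypothetical and stand in for the formal model in Appendix A.4.

```python
def bfd_levels(tree, root, validate=lambda node: True):
    """Process a rooted tree one level at a time.

    Every node in the current level is processed and validated before
    the next level is derived. Raises ValueError on a failed validation.
    """
    levels = []
    current = [root]
    while current:
        for node in current:
            if not validate(node):
                raise ValueError(f"validation failed for {node!r}")
        levels.append(current)
        # Advance only after the whole level has passed validation.
        current = [child for node in current for child in tree.get(node, [])]
    return levels


tree = {"A": ["B", "C"], "B": ["D"], "C": [], "D": []}
levels = bfd_levels(tree, "A")
print(levels)  # [['A'], ['B', 'C'], ['D']]
```

Because the next level is computed only after the loop over the current level finishes, a validation failure stops progression before any deeper node is touched.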
3.3.4. Cyclic Directed Development (CDD)
- Definition and Formalization
- 2. Key Characteristics
- 3. Workflow Representation
- 4. State Descriptions
- 5. Unified State Transition Table
- 6. State Machine Diagram
- 7. CSP Formal Verification Results and Refinement Guarantees for CDD
- Interpretation & Contributions
- N4 dependency: N4 cannot start until both N2 and N3 are complete.
- N5 dependency: N5 cannot start until N4 is complete.
- After Rₘₐₓ = 3 failed refinements, the process issues the error termination event terminate_with_error_actual and does not deadlock or livelock.
- Supports safe, concurrent processing under explicit dependencies
- Provides a provable defense against infinite refinement cycles by bounding retries and enforcing termination in worst-case conditions
- Ensures internal consistency and milestone completion integrity through both guards and dependency assertions
- 8. LTL Properties
- 9. Advantages
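The two mechanisms verified for CDD, dependency guards and a bounded number of refinements, can be sketched in a few lines. The node names and the bound of 3 come from the verification results above; the function names, the `validate` callback, and the choice to count every attempt toward the bound are assumptions made for illustration.

```python
R_MAX = 3  # refinement bound used in the verified CDD model

# Dependency guards from the verified model: N4 waits for N2 and N3,
# and N5 waits for N4.
DEPENDENCIES = {"N4": {"N2", "N3"}, "N5": {"N4"}}


def can_start(node, completed):
    """True when every dependency of node is in the completed set."""
    return DEPENDENCIES.get(node, set()) <= set(completed)


def run_with_bounded_refinement(validate, r_max=R_MAX):
    """Call validate(attempt) until it succeeds or r_max attempts fail.

    Returns ("success", attempts_used) or ("error", r_max).
    Assumption: each attempt, including the first, counts toward r_max.
    """
    for attempt in range(1, r_max + 1):
        if validate(attempt):
            return ("success", attempt)
    return ("error", r_max)


print(can_start("N4", {"N2"}))                         # False
print(run_with_bounded_refinement(lambda k: k == 2))   # ('success', 2)
print(run_with_bounded_refinement(lambda k: False))    # ('error', 3)
```

The loop has a fixed upper bound, so the sketch cannot cycle indefinitely: it returns either a success or an explicit error result.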
3.4. Hybrid Methodologies
- DFD and BFD lack mechanisms for iterative adaptability.
- CDD accommodates iteration but sacrifices hierarchical scaffolding.
- Primary Depth-First Development (PDFD): An adaptive, vertical progression model optimized for recursive, dependency-heavy systems requiring early risk resolution. It integrates depth-first traversal with bounded parallelism (Kᵢ) and cyclic refinement (Rₘₐₓ) to manage local complexity while securing critical paths.
- Primary Breadth-First Development (PBFD): A scalable, horizontal progression model optimized for large-scale systems where architectural stability is paramount. It utilizes pattern-driven modularity (e.g., Three-Level Encapsulation) to establish architectural scaffolds before engaging in selective depth-oriented refinement.
3.4.1. Primary Depth-First Development (PDFD)
- Foundational Concepts and Definitions
- Depth-First Development (DFD): Enables vertical progression through the hierarchy, adapted from graph traversal theory [62] for systematic elaboration of dependencies
- Cyclic Directed Development (CDD): Enables iterative, validation-driven refinement with bounded limit Rₘₐₓ, providing corrective feedback without infinite loops [74]
- 2. Key Characteristics
- 3. Workflow Representation
- Depth-oriented progression through successive levels
- Iterative refinement cycles via backward jumps
- Completion sweep through bottom-up and top-down finalization
- 4. State Descriptions
- 5. Unified State Transition Table
- 6. State Machine Diagram
- 7. CSP Formal Verification Results and Refinement Guarantees
- Interpretation & Contributions
- No infinite refinement loops occur.
- On exceeding Rₘₐₓ, the system transitions to terminate_error, enforcing bounded failure handling.
- Ensures termination by always reaching either T (success) or safely halting at S5 (error)
- Provides consistency through six validated conditional soundness checks
- Guarantees predictability via globally deterministic control flow
- 8. LTL Properties
- 9. Advantages
3.4.2. Primary Breadth-First Development (PBFD)
- Definition and Pattern Encapsulation
- Instance 1 (Continent-anchored): Continent → Country → State
- Instance 2 (Country-anchored): Country → State → County
- Instance 3 (State-anchored): State → County → City
- Breadth-First Development (BFD): PBFD's primary progression is breadth-first, facilitating sequential, level-by-level processing of the layered directed acyclic graph. Nodes within the same level share structural characteristics defined by discrete structural signatures (e.g., bitmask encoding), enabling efficient pattern-driven initial development and horizontal batch processing. Because BFD processes nodes level-by-level, a single pattern implementation is reused across all nodes sharing the same signature (e.g., bitmask-defined level sets, shared data schemas, or common processing logic).
- Depth-First Development (DFD): DFD complements the breadth-first structure by enabling selective vertical traversal. Within the TLE structure, DFD is operationalized through selective promotion of parent nodes to grandparent positions. This allows the system to refine specific hierarchical paths (critical subtrees) without processing all branches uniformly.
- Cyclic Directed Development (CDD): CDD governs validation-driven refinement by introducing bounded iterative cycles. This permits systematic re-entry into development based on feedback, continuing until predefined resolution criteria or refinement limits are met [78].
- Selection and Advancement: At level i, specific patterns (denoted Patternᵢ, a subset of nodes at level i; see Table A.1.4) are selected and processed based on dependency structure or criticality [65,79]. Advancement to level i+1 is permitted only when all nodes within Patternᵢ reach finalized status (P(n) = 2), enabling the derivation of Patternᵢ₊₁ from the children of those finalized nodes.
- Selective Refinement: Pattern progression to Patternᵢ₊₁ is governed by selective advancement via function select_critical_children(Patternᵢ) (Table A.1.5). This mechanism concentrates refinement along critical paths while preserving completeness guarantees through the S₄ completion phase (Table 39). This modularity follows principles of minimizing coupling and maximizing cohesion [80].
- Validation-driven refinement: When validation fails at level i, the function trace_origin(i) identifies the earliest affected level Jᵢ. This triggers reprocessing across the range [Jᵢ, i]. This backtracking capability allows previously finalized nodes to be revisited when validation errors originate from earlier levels, ensuring systemic coherence and architectural integrity across the hierarchy [82].
- Bounded refinement: CDD enforces the per-level limit Rₘₐₓ; this limit and the iteration-tracking indices adhere to the formal model introduced for PDFD (Section 3.4.1), enforcing termination consistent with lifecycle principles [83]. The PBFD MVP implementation demonstrates this with Rₘₐₓ = 50 (Appendix A.14).
- Top-down finalization: Upon reaching the leaf level, PBFD initiates a top-down completion phase [81]. Remaining unprocessed patterns are finalized sequentially from level 1 through level L. This ensures comprehensive system completion while preserving the architectural consistency established during pattern-driven progression.
- 2. Key Characteristics
- 3. Workflow Representation
- 4. State Descriptions
| State ID | Phase | Description |
|---|---|---|
| S₀ | Initialization | Load tree and initialize patterns |
| S₁(i) | Current Pattern | Processes nodes in Patternᵢ |
| S₁(i+1) | Next Pattern (Children) | Represents the state of actively processing Patternᵢ₊₁, which is derived from children of Patternᵢ |
| S₁(j) | Refinement Level | Reprocess Patternⱼ due to failure propagated from a later level |
| S₂(i) | Pattern Validation | Validate processed nodes in Patternᵢ |
| S₂(j) | Refinement Validation | Validate reprocessed nodes in Patternⱼ during refinement |
| S₃(i) | Depth-Oriented Resolution | Normal context: load required data and resolve node implementation before descending |
| S₃(j) | Refinement Depth-Oriented Resolution | Refinement context: load required data and resolve node implementation for Patternⱼ before descending or returning to the original context |
| S₄(i) | Completion Level | Finalize unprocessed nodes in Patternᵢ during the top-down pass |
| S₅ | Error | Terminates due to unresolved validation failures after exhausting Rₘₐₓ |
| T | Termination | All patterns processed and finalized |
- 5. Unified State Transition Table
- 6. State Machine Diagram
- 7. CSP Formal Verification Results and Refinement Guarantees
- Interpretation & Contributions
- Initialization (S0, S1 at each level L1, L2, L3)
- Validation (S2_ValidationInitial and S2_ValidationRefinement for all valid (j,i) combinations)
- Depth progression (S3_DepthProgression and S3_RefinementDepthResolution for all valid (j,i) combinations)
- Completion (S4 at all levels L1, L2, L3)
- Terminal states (S5 for error, T for success)
- Guaranteed Termination: The process always reaches either T (success) or S5 (controlled failure), eliminating the risk of system hangs.
- Bounded Recovery: Infinite refinement cycles are prevented via enforcement of the Rₘₐₓ threshold, ensuring resource-bounded execution.
- Fault Tolerance: The model maintains correctness under adversarial inputs, supporting deployment in mission-critical environments.
- 8. LTL Properties
- 9. Advantages
- Cross-Paradigm References:
- PDFD refinement mechanics (Section 3.4.1) apply to PBFD’s Jᵢ, Rᵢ, and Rₘₐₓ parameters.
- trace_origin(i) follows the PDFD specification (Appendix A.1, Table A.1.5). For details on trace_origin, see PDFD’s dependency-tracing logic in Section 3.4.1.
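The interplay of level-wise progression, trace_origin(i), and the per-level bound Rₘₐₓ described in this section can be sketched as a small control loop. This is a deliberately reduced illustration, not the PBFD state machine: it omits depth-oriented resolution (S₃) and the completion pass (S₄), and the function `pbfd_progress`, its callbacks, and the log format are all hypothetical.

```python
def pbfd_progress(num_levels, validate, trace_origin, r_max):
    """Level-wise progression with validation-driven, bounded refinement.

    validate(i) -> bool     : hypothetical validation of Pattern_i
    trace_origin(i) -> int  : hypothetical earliest affected level J_i <= i
    Returns ("T", log) on success or ("S5", log) once r_max is exceeded.
    """
    refinements = {i: 0 for i in range(1, num_levels + 1)}  # R_i per level
    log = []
    i = 1
    while i <= num_levels:
        log.append(("process", i))
        if validate(i):
            i += 1  # advance to Pattern_{i+1}
            continue
        refinements[i] += 1
        if refinements[i] > r_max:
            return "S5", log  # bounded failure: refinement limit exhausted
        j = trace_origin(i)
        log.append(("refine", j, i))
        i = j  # reprocess the range [j, i]
    return "T", log


# Level 2 fails once; the failure is traced back to level 1.
pending_failures = {2: 1}


def validate(i):
    if pending_failures.get(i, 0) > 0:
        pending_failures[i] -= 1
        return False
    return True


status, log = pbfd_progress(3, validate, trace_origin=lambda i: 1, r_max=3)
print(status)  # T
```

Every failed validation increments a per-level counter, so the loop ends after a bounded number of failures, either at the success result or at the error result.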
3.5. Methodological Synergy and Graph Theory in Practice
- Directional Rigor: Methodologies like DAD enforce strict hierarchies to prevent cycles, while DFD/BFD prioritize vertical/horizontal progression for early validation.
- Iterative Resilience: CDD enables controlled iterative refinement through structured feedback loops, essential for managing complexity and evolving requirements.
- Hybrid Efficiency: PDFD and PBFD apply hybrid traversal strategies, balancing depth-first and breadth-first techniques, and integrating CDD's iterative refinement to meet different scalability and modularity requirements.
4. Bitmask Encoding and Three-Level Encapsulation
- Overview
- Compact representation of child node selections
- Each child corresponds to a single bit in an integer
- Enables O(1) set operations (union, intersection, membership testing)
- Analogous to bitmap-index encoding in relational systems [91]
- Hierarchical pattern organizing data into Grandparent-Parent-Children levels
- Applies bitmask encoding at the Children level
- Enables O(1) relationship queries without joins
- Combines relational structure with bitmask efficiency
- Grandparent = Table (root context)
- Parent = Columns (intermediate entities)
- Children = Bitmask-encoded values (using Section 4.1 technique)
4.1. Bitmask-Based Pattern Encoding
4.1.1. Motivation and Encoding Mechanism
- The Problem
- The Solution
- Key characteristics:
- O(1) operations for n ≤ w (where w is machine word size, typically 64 bits)
- O(⌈n/w⌉) operations for n > w (multi-word bitmasks with minimal constant factor)
- Other lifecycle states (e.g., 'processed,' 'validated,' 'finalized') tracked using separate auxiliary bitmask fields
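The two complexity cases above can be illustrated with a bitmask split into fixed-size words. The sketch assumes a word size of 64 bits and uses a hypothetical class `MultiWordBitmask`; Python integers are arbitrary precision, so the word boundary is simulated here rather than imposed by hardware.

```python
W = 64  # assumed machine word size in bits


class MultiWordBitmask:
    """Bitmask over n children stored as ceil(n / W) words of W bits."""

    def __init__(self, n):
        self.n = n
        self.words = [0] * ((n + W - 1) // W)

    def set(self, i):
        # Touches exactly one word, so the cost does not depend on n.
        self.words[i // W] |= 1 << (i % W)

    def test(self, i):
        return bool((self.words[i // W] >> (i % W)) & 1)

    def union(self, other):
        """Word-by-word OR: ceil(n / W) word operations."""
        result = MultiWordBitmask(max(self.n, other.n))
        for k in range(len(result.words)):
            a = self.words[k] if k < len(self.words) else 0
            b = other.words[k] if k < len(other.words) else 0
            result.words[k] = a | b
        return result


a = MultiWordBitmask(100)
a.set(3)
a.set(70)
b = MultiWordBitmask(100)
b.set(64)
u = a.union(b)
print(len(a.words), u.test(3), u.test(64), u.test(70))  # 2 True True True
```

For n ≤ 64 the word list has a single entry and every operation reduces to one integer operation; for larger n, set-level operations such as `union` loop over the words.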
4.1.2. Structure and Operations
- Bit Assignment
- Core Operations
4.1.3. Application in PBFD
- Node Selection and Tracking
- Check if a child node is selected: (parent_bitmask & child_node_mask) != 0
- Mark a child node as processed/selected: parent_bitmask |= child_node_mask
- A child node is “active” (selected) if its corresponding bit is set in the node's bitmask.
- Once processing for a child node is finalized, additional bits can be toggled to record completion status.
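The selection check and update listed above can be written as small helper functions. The variable names `parent_bitmask` and `child_node_mask` follow the text; the helper names and the `deselect` operation are assumptions added for illustration.

```python
def child_mask(bit_index):
    """Mask with only the bit assigned to one child set."""
    return 1 << bit_index


def is_selected(parent_bitmask, child_node_mask):
    # Parentheses make the intent explicit; in C-like languages
    # `a & b != 0` would otherwise be parsed as `a & (b != 0)`.
    return (parent_bitmask & child_node_mask) != 0


def select(parent_bitmask, child_node_mask):
    return parent_bitmask | child_node_mask


def deselect(parent_bitmask, child_node_mask):
    return parent_bitmask & ~child_node_mask


mask = 0
mask = select(mask, child_mask(2))
print(mask, is_selected(mask, child_mask(2)), is_selected(mask, child_mask(0)))
# 4 True False
```

Completion status would be kept in a second integer handled with the same helpers, in line with the separate auxiliary bitmask fields mentioned in Section 4.1.1.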
- Integration into the PBFD Lifecycle
- Pattern matching: Selects relevant groups of nodes at each level based on their bitmask representation
- Validation and refinement: Uses the encoded selection status to avoid redundant node checks
- Finalization: Ensures complete coverage for all required node selections before progressing downward or exiting
- State machine control: Enables conditional transitions (e.g., transition from S₃ to S₄ only if all required children within a pattern are selected in the relevant parent's bitmask)
4.1.4. Performance Characteristics
- Storage and Computational Efficiency
- Key Advantages:
- Compact representation: Up to w distinct child nodes can be encoded in a single w-bit word (e.g., w = 64) by assigning each node a unique bit position, enabling simultaneous updates and queries via single-cycle bitwise operations [95].
- Atomic updates: Selection flags within a parent's bitmask can be updated using atomic bitwise operations if concurrency is involved.
- Pattern combination: Bitwise OR or AND across multiple parent nodes supports group operations (e.g., finding all parent nodes that share a common set of selected children).
- Composable filtering: Parent nodes can be filtered based on complex combinations of child node selections via simple bitwise comparisons.
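Pattern combination and composable filtering can be illustrated with ordinary integer operations. The parent names and bit patterns below are made-up example data, and the helper functions are hypothetical.

```python
from functools import reduce
from operator import and_, or_

# Made-up example: each parent holds a bitmask of its selected children.
parents = {"P1": 0b0111, "P2": 0b0101, "P3": 0b1000}


def union_of(masks):
    """Children selected under at least one of the given parents."""
    return reduce(or_, masks, 0)


def common_to(masks):
    """Children selected under every one of the given parents."""
    masks = list(masks)
    return reduce(and_, masks) if masks else 0


def parents_with_all(parent_masks, required_mask):
    """Parents whose selected children include every bit in required_mask."""
    return [name for name, mask in parent_masks.items()
            if (mask & required_mask) == required_mask]


print(bin(union_of(parents.values())))          # 0b1111
print(bin(common_to([0b0111, 0b0101])))         # 0b101
print(parents_with_all(parents, 0b0101))        # ['P1', 'P2']
```

Each filter is a single comparison per parent, so more complex conditions can be composed by combining required masks before the scan.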
4.2. Three-Level Encapsulation (TLE)
- Grandparent level: Table (root context)
- Parent level: Columns (intermediate entities)
- Children level: Bitmask-encoded cell values (using the technique from Section 4.1)
4.2.1. Pattern Definition and Core Concepts
- Pattern Definition
- Relational Mapping
- Recursive Extension
- Level 1: [North America] (table) → [United States] (column) → States (bitmask)
- Level 2: [United States] (table) → [Maryland] (column) → Counties (bitmask)
- Level 3: [Maryland] (table) → [Allegany County] (column) → Cities (bitmask)
- Implementation Variants
- Canonical pattern (MVP): One table per grandparent entity, maximizing modularity and independent evolution
- Consolidated pattern (Enterprise): Multiple grandparent entities combined into wide tables, optimizing for query performance and reduced I/O overhead
- Bitmask Semantics
- Bit 0 (LSB) = 1 → Allegany County is active
- Bit 1 = 0 → Anne Arundel County is inactive
- Bit 2 = 1 → Baltimore County is active
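The three bit assignments above correspond to the binary value 101, that is, the integer 5. The sketch below encodes and decodes that value; it covers only the three counties named in the example, and the function names are hypothetical.

```python
# Bit positions follow the example: bit 0 is the least significant bit.
COUNTIES = ["Allegany County", "Anne Arundel County", "Baltimore County"]


def decode(bitmask, names=COUNTIES):
    """Names whose bit is set in bitmask."""
    return [name for bit, name in enumerate(names) if (bitmask >> bit) & 1]


def encode(active, names=COUNTIES):
    """Bitmask with one bit set per active name."""
    return sum(1 << names.index(name) for name in active)


print(decode(0b101))  # ['Allegany County', 'Baltimore County']
print(encode(["Allegany County", "Baltimore County"]))  # 5
```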
4.2.2. Hybrid Architecture and Implementation
- Architecture Components
- Source hierarchy table: Maintains normalized parent-child relationships using traditional foreign keys. This serves as the authoritative data source and ensures referential integrity.
- Derived TLE table: A denormalized, bitmask-encoded representation materialized from the source table. Structured according to Table 47's mapping, this provides O(1) hierarchical access without joins.
- Operational Workflow
- Core Operations
- LOAD(Grandparent): Load the TLE-encoded data for a given grandparent context
- READ(Parent, Child): Check the state (selected/active) of a specific Child within a Parent's bitmask
- WRITE(Parent, Child, State): Set or clear the state of a specific Child within a Parent's bitmask
- COMMIT(Grandparent): Persist the updated TLE-encoded data for the grandparent context
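The four core operations can be sketched against an in-memory dictionary that stands in for the derived TLE table. Everything here is an assumption for illustration: the class `TLEContext`, the use of a bit index to identify a child, and the dictionary store are not taken from the paper's implementation.

```python
class TLEContext:
    """In-memory stand-in for one grandparent's TLE-encoded row."""

    def __init__(self, store, grandparent):
        self._store = store  # hypothetical persistent store (a dict here)
        self._grandparent = grandparent
        self._row = {}

    def load(self):
        # Copy so uncommitted writes do not leak into the store.
        self._row = dict(self._store.get(self._grandparent, {}))

    def read(self, parent, child_bit):
        return bool((self._row.get(parent, 0) >> child_bit) & 1)

    def write(self, parent, child_bit, state):
        mask = 1 << child_bit
        current = self._row.get(parent, 0)
        self._row[parent] = (current | mask) if state else (current & ~mask)

    def commit(self):
        self._store[self._grandparent] = dict(self._row)


store = {"United States": {"Maryland": 0b101}}
ctx = TLEContext(store, "United States")
ctx.load()
before = ctx.read("Maryland", 1)
ctx.write("Maryland", 1, True)
uncommitted = store["United States"]["Maryland"]
ctx.commit()
print(before, bin(uncommitted), bin(store["United States"]["Maryland"]))
# False 0b101 0b111
```

Working on a copy between `load` and `commit` keeps the store unchanged until the grandparent context is persisted as a whole, which is one simple way to model the LOAD and COMMIT boundary.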
- Performance Characteristics
- Key advantages:
- Eliminated joins: Parent-child relationships accessed via bitmask operations
- Predictable I/O: Fixed-width rows enable efficient memory layout and caching
- Constant-time operations: Bitwise operations replace recursive traversals
4.2.3. Formal Specification and Verification
- Abstract State Descriptions
- Unified State Transitions
- Formal Verification and Refinement Guarantees for TLE
- Interpretation and Technical Contributions
- Implementation specification: S₀ (1) + S₁–S₆ across u₁, u₂, u₃ (18) = 19 assertions
- Abstract specification: Abstract_S₀ (1) + Abstract_S₁–S₆ across u₁, u₂, u₃ (18) = 19 assertions
- Total: 19 + 19 = 38
- Isolation: Parameterized state and channel definitions maintain separation between concurrent units.
- Robustness: The system remains safe under adversarial scheduling or unexpected event ordering.
- Event-Driven Correctness: Synchronization via parameterized channels mirrors the intended event-driven semantics.
- Continuous Operation: The S₆ → S₀ recurrence supports unbounded execution without termination or deadlock.
4.2.4. Performance Characteristics and Complexity Analysis
- Computational Complexity
- Formal Properties
4.3. Summary of Advantages
5. Evaluation of PBFD and PDFD: From Controlled MVPs to Production Deployment
- Evidence from MVP Implementations
- From Architectural Validation to Production Performance
- Focus of This Section
5.1. Problem Context
- Complex data requirements: The system was designed to support the structured capture of incident locations, timelines, multi-tiered classification codes, and detailed employment data, including union affiliations, employment status, and employer information.
- Deep hierarchical dependencies: The form structure includes up to eight levels of conditionally dependent elements, which are formally modeled as an n-ary tree. This depth leads to a combinatorial explosion of possible states, making traditional row-based storage and retrieval inefficient [91].
- Performance and Delivery Demands: The system required real-time validation and responsive user interaction under production load, with complete feature delivery within three weeks—a timeline incompatible with conventional iterative development approaches.
5.2. Solution: Adoption of PBFD Methodology
- Hierarchical modeling
- Bitmask-based representation
- Database Optimization via Consolidated TLE Schema
- Consolidation Approach
- Hierarchy flattening: The 8-level hierarchy (Figure 16) was flattened by representing grandparent entities as columns within a single table, rather than as separate tables in the canonical TLE design. This creates a recursive column promotion pattern:
  - Parent columns at level N contain bitmask values encoding their children
  - These parent columns are promoted to grandparent columns at level N+1
  - Each column–bitmask pair preserves the parent→child relationship within a unified table structure
- Preserved semantics: The core TLE logic remains unchanged—for any parent value, a bitmask column encodes its selected children. Parent–child relationship semantics and bitwise operations are identical to canonical TLE; only the physical storage model differs.
- Performance outcome: This consolidation reduced the transactional schema to two tables, minimizing I/O overhead and join complexity while guaranteeing production-scale performance [54].
- UI integration
5.3. Implementation Outcomes
5.4. Technical Observations
- Rapid Development and Onboarding: PBFD enabled one developer to deliver a production system in a single month, at least 9× faster than traditional methods and at least 20× faster than low-code tools (see the analysis in Appendix A.20). The graph-driven structure also fostered rapid onboarding, aligning with evidence on the role of coherent mental models in comprehension [108].
- Compact Storage and Schema Simplification: Encoding relationships into fixed-width bitmask fields reduced schema complexity from 13 tables (6 factor and 7 junction tables) to 2, while achieving 11.7× overall storage reduction and 85.7× index reduction (Appendix A.22).
- Optimized Write and Query Performance: Bitwise O(1) updates replaced traditional O(n) multi-row operations. This explains the 7–8× page-load improvement and lower tail latency (Appendix A.21), mitigating known bottlenecks in hierarchical queries [91].
- Production-Stable Hybrid Semantics: PBFD illustrates a hybrid relational–NoSQL design through TLE: SQL Server is used to achieve document-like modeling within a relational system. Eight years of production stability demonstrate that PBFD balances hierarchical flexibility with ACID integrity [109].
5.5. Limitations and Threats to Validity
- Single-case Generalizability: The findings come from one enterprise case, which offers strong ecological validity but limited statistical generalization.
- Construct Validity – Developer Expertise: While all implementations were led by expert developers, expertise levels and domain familiarity vary across individuals. The PBFD vs. relational comparison involves the same expert (PBFD's inventor) leading both, introducing additional confounds from learning effects and problem familiarity. A detailed analysis appears in Appendix A.20.5.
- Construct Validity – Baseline Heterogeneity: Baseline comparisons rely on heterogeneous systems, which provides ecological realism and may underestimate PBFD’s performance advantage (see Appendices A.21.6, A.22.4).
- Temporal and Maturation Threats: The data span 2016–2024, introducing potential history and maturation effects that the longitudinal design mitigates.
6. PDFD and PBFD Comparative Analysis
6.1. Traditional FSSD: Situational Advantages and Trade-Offs
6.2. Methodological Comparison: FSSD vs PDFD vs PBFD
6.3. PBFD vs. Conventional Relational Models (Including PDFD)
6.4. Comparison with Modern Database Paradigms
6.5. Comparison to Traditional Bitmap Indexing
6.6. Comparison to Multi-Column or Multi-Row
6.7. Key Takeaways: Advancing FSSD with Directed Graph-Based Methodologies
- Methodological Fit: PBFD excels in layered or dependency-driven domains (e.g., claims processing, product taxonomies), while PDFD suits feature-centric work that needs quick end-to-end testing, consistent with the iterative, feature-focused delivery principles of Extreme Programming [110].
- Complexity Management: Both reduce maintenance burdens by decoupling dependencies and enforcing structure, addressing common software evolution challenges [111].
- Adoption Potential: Their conceptual clarity facilitates onboarding and modular scaling, supporting integration into low-code and DSL-based workflows.
- Scalability: Empirical results confirm stability at large user scales, affirming their suitability for evolving, long-lived systems.
6.8. Limitations of PDFD and PBFD
- Learning Curve: Understanding bitmasks (PBFD) or state transitions and directed graph slicing (PDFD) can be nontrivial for teams used to traditional relational models.
- Tooling and Middleware: PBFD may require custom middleware to support cross-shard aggregation of TLE-encoded bitmasks. Both PBFD and PDFD rely on dependency- or hierarchy-aware tooling to manage their underlying traversal graphs (e.g., DAG slicing in PDFD and TLE-based parent–child graph navigation in PBFD).
- Model Rigidity: PDFD assumes well-isolated features; PBFD assumes a relatively stable hierarchy—both may be challenged in dynamic, unstructured domains (e.g., social graphs).
- Initial Overhead: Upfront modeling and pattern definition require more investment than ad hoc FSSD approaches, consistent with the trade-offs of plan-driven methodologies [111].
7. Discussion
7.1. Significance of the Study
7.2. Mechanisms Underpinning PBFD and PDFD Efficiency
- Graph-Based Abstraction for Business Logic: Modeling business processes as directed graphs (Figure 3 and Figure 16) reduced cognitive load and streamlined development, leading to a speedup of over 20× compared to conventional tools (Table 54, Appendix A.20) [119].
- Context Consistency in Sequential Development: Disciplined sequential development across refinement layers minimized context switching and cross-module regressions (Appendices A.11 & A.14), improving modular testability and reducing verification cycles [120].
- Encoded Data Optimization: The combination of Three-Level Encapsulation (TLE) and bitmask techniques (Section 4) yielded substantial space savings (11.7× compression; Appendix A.22) and dramatically improved lookup speed (O(1) complexity, Table 61). The efficiency gains from such encoding are a well-understood principle in database systems, where optimized data structures are critical for high-performance query execution [53,55]. The use of bitmask techniques in PBFD aligns with established indexing strategies such as bitmap indexes, which are widely used in data warehouses to accelerate query processing over low-cardinality columns [54].
7.3. Early Adoption Challenges for PBFD
7.4. Adapting TLE to Non-Relational Database Systems
7.5. Relational Constraints and Design Trade-Offs in PBFD Deployments
7.6. Study Limitations
7.7. Unexpected Benefits
7.8. Additional Future Research Directions
- Domain Generalization: Extend methodologies to other contexts (e.g., ETL, BI, rules engines) by mapping abstract nodes to domain primitives and refining traversal semantics
- Distributed and Modular Systems: Investigate utility in microservice and edge computing, focusing on runtime synchronization, orchestration, and modular validation
- Tooling and Developer Ecosystem: Develop companion tooling (e.g., IDE plugins, visualizers) to translate abstract process models into accessible engineering workflows
- Rigorous Empirical Validation: Conduct controlled comparative studies against conventional methods across performance, scalability, maintainability, and defect density. Future empirical work could build upon the comprehensive frameworks for evaluating database system performance as laid out in standard texts [54,118]
8. Conclusion
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Conflicts of Interest
Data Availability Statement
Acknowledgments
Appendix A
Appendix A.1. Formal Notation and Semantic Symbols
| Symbol | Meaning |
|---|---|
| □φ | Always φ (globally true) — “Globally” in LTL |
| ◯φ | Next state φ — φ will be true in the very next state |
| ◊φ | Eventually φ — φ will be true at some future time |
| φ ⇒ ψ | Implication — if φ holds, then ψ must also hold |
| ¬φ | Negation — φ does not hold |
| φ ∧ ψ | Conjunction — both φ and ψ hold |
| φ ∨ ψ | Disjunction — at least one of φ or ψ holds |
| <_{lex} | Lexicographical comparison. The operator evaluates if the tuple on the left is strictly less than the tuple on the right. Comparison proceeds from left to right, element by element. |
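The temporal operators in the table above can be made concrete with a small evaluator. This is only a sketch under a simplifying assumption: LTL is defined over infinite traces, whereas the helpers below (whose names are hypothetical) check a finite recorded trace, and ◯φ is taken to be false at the last position.

```python
def always(pred, trace, i=0):
    """□φ on a finite trace: φ holds at every position from i onward."""
    return all(pred(s) for s in trace[i:])


def eventually(pred, trace, i=0):
    """◊φ on a finite trace: φ holds at some position from i onward."""
    return any(pred(s) for s in trace[i:])


def next_state(pred, trace, i=0):
    """◯φ on a finite trace: φ holds at position i + 1, if that position exists."""
    return i + 1 < len(trace) and pred(trace[i + 1])


# Hypothetical run of a workflow that reaches termination T without entering S5.
trace = ["S0", "S1", "S2", "T"]
terminates = eventually(lambda s: s == "T", trace)
never_errors = always(lambda s: s != "S5", trace)
```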
| Expression | Meaning |
|---|---|
| ∀x ∈ X | Universal quantifier: for all x in set X |
| ∃x ∈ X | Existential quantifier: there exists x in set X |
| ∄ | There does not exist (e.g., no cycles, no path) |
| X ⊆ Y | Set inclusion: X is a subset of Y |
| X ∖ Y | Set difference: elements in X but not in Y |
| Notation | Meaning |
|---|---|
| P(n) = 0 | Node n is unprocessed |
| P(n) = 1 | Node n is in progress |
| P(n) = 2 | Node n is fully processed and validated |
| processed(n) | P(n)=1 or P(n)=2 |
| validated(n) | P(n) = 2 |
| finalized(n) | P(n) = 2. Used interchangeably with validated(n) |
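The status function P and the predicates derived from it can be written down directly. The enum and the dictionary-based representation of P below are illustrative assumptions.

```python
from enum import IntEnum


class Status(IntEnum):
    UNPROCESSED = 0  # P(n) = 0
    IN_PROGRESS = 1  # P(n) = 1
    VALIDATED = 2    # P(n) = 2


def processed(P, n):
    """processed(n): P(n) = 1 or P(n) = 2."""
    return P[n] in (Status.IN_PROGRESS, Status.VALIDATED)


def validated(P, n):
    """validated(n): P(n) = 2."""
    return P[n] == Status.VALIDATED


# finalized(n) is used interchangeably with validated(n).
finalized = validated

# Hypothetical status map for three nodes.
P = {"a": Status.VALIDATED, "b": Status.IN_PROGRESS, "c": Status.UNPROCESSED}
```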
| Term | Definition / Description |
|---|---|
| G=(V,E) | A Directed Acyclic Graph (DAG) with vertex set V and edge set E |
| children(v) | The set of direct successor nodes to node v in the graph or tree |
| D(v) | Direct dependencies of node v: the set of nodes u such that there is a directed edge from u to v (i.e., {u \| (u,v) ∈ E}) |
| Tr | Rooted, finite, acyclic tree structure with nodes V and edges E |
| Cᵢ | The current node being processed in the traversal |
| Bⱼ | A backtrack point (a node on the current path with unvisited siblings) |
| Q | Global queue tracking nodes to process |
| Nₖ | Set of nodes at level k |
| Iₖ | Incremental delivery milestone k, representing a validated subset of the system |
| Fₖ | Feedback trigger mechanism (e.g., validation failure, stakeholder input) associated with milestone k |
| depth(v) | The length of the longest path from a root node to node v |
| ancestors(v) | The set of all nodes from which node v is reachable in the graph (i.e., {u ∈ V \| there exists a path from u to v}) |
| descendants(v) | The set of all nodes reachable from node v in the graph (i.e., {u ∈ V \| there exists a path from v to u}) |
| level(k) | The set of all nodes at a specific depth k in a tree or layered graph (i.e., {v ∈ V \| depth(v)=k}) |
| Path(v) | A directed path from a root node to node v |
| state(Bⱼ) | A function mapping node Bⱼ to its processing state |
| Subtree(Bⱼ) | All descendants of node Bⱼ |
| invalid(s) | True if state s violates the state machine constraints or invariant conditions |
| ReachableStates | The set of all states reachable from the initial state through legal transitions |
| follows_rules(t) | True if the transition t complies with the transition rules |
| consistent(n, a, d) | True if node n is consistent with its ancestor a and descendant d in terms of structure/data |
| valid_state(s) | A state is considered valid if and only if it is not invalid(s) |
| succ(L) | Returns the successor level to L |
| pred(L) | Returns the predecessor level to L |
| Next(level) | Returns the logically next level from the current level (e.g., level + 1), capped at the maximum depth L. Used for sequential level progression |
| Patternᵢ | A formal model: a cohesive, feature/function-grouped subset of nodes (comprising data, logic, and UI artifacts) at hierarchical level i, encapsulating a distinct unit of business logic or system functionality (See Section 3.4.2 for detailed discussion) |
| roots(G) | The set of root nodes in graph G: {v ∈ V \| ¬∃u: (u,v) ∈ E} |
| leaves(G) | The set of leaf nodes in graph G: {v ∈ V \| ¬∃u: (v,u) ∈ E} |
| L | The maximum depth of the graph/tree hierarchy: max{depth(v) \| v ∈ V} |
| [P] | Iverson bracket: [P] = 1 if predicate P is true, 0 otherwise |
| bitmask | Binary representation of child relationships under a parent, supporting constant-time access |
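The graph terms defined above translate almost literally into set comprehensions. The following is a minimal sketch on a hypothetical four-node DAG; it favors readability over efficiency, and `depth` assumes the graph is acyclic.

```python
def children(E, v):
    """Direct successors of v."""
    return {w for (u, w) in E if u == v}


def D(E, v):
    """Direct dependencies of v: {u | (u, v) in E}."""
    return {u for (u, w) in E if w == v}


def roots(V, E):
    return {v for v in V if not D(E, v)}


def leaves(V, E):
    return {v for v in V if not children(E, v)}


def depth(E, v):
    """Length of the longest path from a root to v (terminates only on a DAG)."""
    deps = D(E, v)
    return 0 if not deps else 1 + max(depth(E, u) for u in deps)


def descendants(E, v):
    """All nodes reachable from v."""
    seen, stack = set(), list(children(E, v))
    while stack:
        u = stack.pop()
        if u not in seen:
            seen.add(u)
            stack.extend(children(E, u))
    return seen


def ancestors(V, E, v):
    return {u for u in V if v in descendants(E, u)}


def level(V, E, k):
    return {v for v in V if depth(E, v) == k}


# Hypothetical diamond-shaped DAG: a -> b, a -> c, b -> d, c -> d.
V = {"a", "b", "c", "d"}
E = {("a", "b"), ("a", "c"), ("b", "d"), ("c", "d")}
max_depth = max(depth(E, v) for v in V)  # the constant L in the table
```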
| Term | Type | Description | Methodologies |
|---|---|---|---|
| processed(n) | Predicate | Evaluates to True if node n has undergone its core processing or development action | DAD, DFD, BFD, CDD |
| Rₘₐₓ | Constant | The maximum number of refinement attempts allowed for any specific level or pattern before an error state is triggered | PDFD, PBFD |
| Jᵢ | Constant | Start of refinement: Earliest level impacted by failures at i, where Jᵢ = trace_origin(i) | PDFD, PBFD |
| Rᵢ | Constant | Refinement range: The number of levels to reprocess, calculated as Rᵢ = i - Jᵢ + 1 (bounded by L) | PDFD, PBFD |
| Kᵢ | Constant | Progression Threshold: Minimum finalized nodes (P(n)=2) at level i required before advancing to i+1. Acts as a configurable WIP limit enforcing structured synchronization points | PDFD, PBFD |
| rⱼ | Constant | Current refinement attempt index for Patternⱼ | PDFD |
| Reset(n) | Predicate | Evaluates to True if node n's processing status or validation state is reverted, requiring re-evaluation or re-processing. | PDFD, PBFD |
| refinement_attempts(j) | Counter | Tracks the number of refinement attempts for a specific level/pattern j. Resets when a new refinement cycle begins | PDFD, PBFD |
| trace_origin(i) | Function | Determines the root cause level Jᵢ (or pattern Jᵢ) based on a validation failure detected at level i | PDFD, PBFD |
| trace(i) | Function | The path or sequence of levels leading to level i, used to constrain progression and ensure bounded advancement | PDFD |
| selected_subtree | Set | The subset of nodes selected for processing within a level or pattern, constrained by trace and eligibility criteria | PDFD |
| max_batch_size | Constant | The maximum number of nodes that can be processed in a single batch within a level | PDFD |
| validated(n) | Predicate | Evaluates to True if node n has successfully passed all its associated validation criteria | DFD, BFD, CDD, PDFD, PBFD |
| critical(n) | Predicate | True if node n requires vertical processing (children must be processed) | PBFD |
| start(i) | Pseudocode | Initial state transition (idle → active) | DAD, DFD, BFD, CDD |
| terminate(i) | Pseudocode | Terminal state (all nodes processed) | DAD, DFD, BFD |
| refine(c) | Function | Applies iterative improvement to node c, i.e., a node that needs refinement | CDD |
| finalize(i) | Function | Finalizes a single node | CDD |
| processing_complete(i) | Predicate | Evaluates to True when processing at level i is complete | PDFD |
| refining(j) | Predicate | True when the system is executing a refinement cycle targeting level j (state = S₁(j) ∧ refinement_attempts(j) > 0) | PDFD, PBFD |
| affected_nodes(j) | Function | Returns the set of nodes {n ∈ G \| ∃k ∈ [j, L]: n ∈ level(k)} that may be reset during refinement at level j | PDFD, PBFD |
| consistent(n) | Predicate | True if node n satisfies all internal consistency constraints and validation criteria specific to its domain | PDFD, PBFD |
| dependencies_satisfied(n) | Predicate | True if node n satisfies all architectural dependencies and interface contracts with related nodes | PDFD, PBFD |
| all_descendants_validated(n) | Predicate | True if all descendant nodes of n have been validated | PDFD, PBFD |
| processed_subtree(n) | Function | Returns the set of nodes selected for processing in the subtree of n | PDFD, PBFD |
| dequeue(v) | Predicate | True when node v is dequeued for processing | DAD |
| process(v) | Function | Initiates core processing for node v | DAD |
| select_critical_children(Patternᵢ) | Function | Returns a subset of ∪_{n∈Patternᵢ} children(n) selected based on critical path analysis, dependency ordering, and resource constraints. Ensures architectural coherence while allowing efficient progression, with remaining nodes handled in S₄ completion phase | PBFD |
| k₁ (unfinalized_nodes) | Function | Returns the count of nodes with P(n) ≠ 2 | PDFD, PBFD |
| k₂ (remaining_attempts) | Function | Returns ∑_{j∈ActiveLevels} (Rₘₐₓ − refinement_attempts(j)) | PDFD, PBFD |
| k₃ (phase_ordinal) | Function | Maps state phases to ordinals: S₀ = 4, S₁=3, S₂=2, S₃=1, S₄=0 | PDFD, PBFD |
| k₄ (intra_phase_progress) | Function | Tracks progress within the current phase | PDFD, PBFD |
| M | Function | Lexicographic measure M = (k₁, k₂, k₃, k₄) | PDFD, PBFD |
| enabled_transition(s) | Predicate | True if at least one transition is enabled in state s | PDFD |
| eligible(n) | Predicate | True if node n meets all local validation and architectural criteria, allowing it to be part of the set considered for the Kᵢ threshold in S₂ progression. (Implies validated(n) and consistent(n)) | PDFD |
| Structural Invariants | Set/Term | The set of all fundamental structural properties required for correct termination, including: Global Consistency, Descendant Finalization Invariant, and dependencies_satisfied for all nodes | PDFD, PBFD |
| test_failed(Cᵢ) | Predicate | True if testing of node Cᵢ fails | CDD |
| feedback_triggered(Cᵢ) | Predicate | True if feedback is triggered for node Cᵢ | CDD |
| refinement_complete(Cᵢ) | Predicate | True if refinement of node Cᵢ is complete | CDD |
| refinement_failed(Cᵢ) | Predicate | True if refinement of node Cᵢ fails | CDD |
| refinement_count(Cᵢ) | Counter | Tracks the number of refinements for node Cᵢ | CDD |
| all_components_written(Iₖ) | Predicate | True if all components in milestone Iₖ are written | CDD |
| feedback_received(Iₖ) | Predicate | True if feedback is received for milestone Iₖ | CDD |
| validation_failed(Iₖ) | Predicate | True if validation of milestone Iₖ fails | CDD |
| all_increments_validated | Predicate | True if all increments are validated | CDD |
| validation_successful(Iₖ) | Predicate | True if validation of milestone Iₖ is successful | CDD |
| initiate_workflow(Grandparent) | Function / Operation | Starts the TLE workflow for a given grandparent unit (loads context, registers processing unit) | TLE |
| LOAD(Grandparent) | Operation | Atomic load of grandparent data and metadata into TLE context | TLE |
| resolve_hierarchy() | Function / Operation | Internal resolution that computes parent/child relationships and prepares traversal order | TLE |
| evaluate_children(Parent) | Predicate / Operation | Iteratively evaluates each child of Parent for processing eligibility (reads child state, bitmask tests) | TLE |
| READ(Parent, Child) | Operation | Read access to Parent and Child data (used during evaluate_children) | TLE |
| update_required(Parent, Child) | Predicate | True iff a child/parent pair requires an update (e.g., bitmask change or state change) | TLE |
| apply_update(Parent, Child, State) | Operation | Apply the computed update to Parent/Child in-memory state (pre-commit) | TLE |
| persist_changes() | Operation | Flush pending updates to durable storage (pre-commit stage) | TLE |
| WRITE(Parent, Child, State) | Operation | Durable write of Parent/Child state (used when persisting updates) | TLE |
| COMMIT(Grandparent) | Operation | Commit the grandparent-level changes (atomic commit of bitmask / selection) | TLE |
| has_next_unit() | Predicate | True if there is another TLE processing unit (grandparent) to process in the workload | TLE |
| has_unprocessed_unit() | Predicate | True if there exists at least one grandparent unit not yet processed | TLE |
| finalize_process() | Operation | Finalize the overall TLE workflow (cleanup, release resources, produce summary) | TLE |
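The lexicographic measure M = (k₁, k₂, k₃, k₄) and the refinement range Rᵢ = i − Jᵢ + 1 from the table above can be sketched as follows. Python tuples already compare lexicographically, which matches the `<_{lex}` operator. How k₄ is counted is an assumption here (remaining work items in the current phase), and the dictionary layouts are hypothetical.

```python
# Phase ordinals k3 as listed in the table: S0 = 4, S1 = 3, S2 = 2, S3 = 1, S4 = 0.
PHASE_ORDINAL = {"S0": 4, "S1": 3, "S2": 2, "S3": 1, "S4": 0}


def measure(P, attempts, r_max, phase, intra_phase_remaining):
    """Build M = (k1, k2, k3, k4) for one system state."""
    k1 = sum(1 for status in P.values() if status != 2)  # unfinalized nodes
    k2 = sum(r_max - a for a in attempts.values())       # remaining attempts
    k3 = PHASE_ORDINAL[phase]                            # phase ordinal
    k4 = intra_phase_remaining                           # assumed: work left in phase
    return (k1, k2, k3, k4)


def refinement_range(i, j_i):
    """R_i = i - J_i + 1: number of levels to reprocess."""
    return i - j_i + 1


# Hypothetical step: node b is finalized while the system moves from S1 to S2.
before = measure({"a": 2, "b": 1, "c": 0}, {1: 0}, 3, "S1", 2)
after = measure({"a": 2, "b": 2, "c": 0}, {1: 0}, 3, "S2", 5)
decreased = after < before  # tuple comparison is lexicographic
```

In this example k₄ grows from 2 to 5, yet the measure still decreases because k₁ drops first; this is the property that makes a lexicographic tuple usable as a progress measure.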
| State ID | Global Label | Description | Methodologies Using This State |
|---|---|---|---|
| S₀ | Initialization | The initial state, involving loading foundational structures (e.g., DAGs, trees, or graphs) and initializing necessary parameters, queues, or dependency structures | All (DAD, DFD, BFD, CDD, PDFD, PBFD, TLE) |
| S₁ | Active Processing | Represents the core development or processing phase where active work is performed on nodes, levels, or components (e.g., enqueuing, pushing, resolving patterns) | DAD, DFD, BFD, CDD |
| S₁(i) | Current Pattern/Level | Indicates active processing of nodes within Patternᵢ or level i | PDFD, PBFD |
| S₁(i+1) | Next Level/Pattern Progression | Processing of Patternᵢ₊₁ or level i+1, typically derived from children of Patternᵢ or level i | PDFD, PBFD |
| S₁(j) | Refinement Level | Reprocessing Patternⱼ or level j due to a validation failure detected in a later stage | PDFD, PBFD |
| S₁ (TLE) | Parent Batch Loaded | Indicates the parent node batch has been loaded and is ready for context-aware evaluation | TLE |
| S₂ | General Validation / Dependency Check/Refinement | A non-parameterized validation phase. Examples include verifying dependency completeness (DAD), backtracking to a parent node (DFD), validating an entire level (BFD), or refining nodes and levels (CDD) | DAD, DFD, BFD, CDD |
| S₂(i) | Pattern/Level Validation | Validates the processed nodes within Patternᵢ or level i | PDFD, PBFD |
| S₂(j) | Refinement Validation | Validates the reprocessed nodes in Patternⱼ or level j during an active refinement cycle | PDFD, PBFD |
| S₂ (TLE) | Context Established | Resolves grandparent-level context to support child node resolution and bitmask evaluation | TLE |
| S₃ | Graph Extension / Validation | General adaptation including node/edge addition and iterative design validation | DAD, DFD, CDD |
| S₃(i) | Depth-Oriented Process / Resolution | Bottom-up subtree validation and subtree resolution before descent | PDFD, PBFD |
| S₃(j) | Refinement Depth-Oriented Resolution | Refinement Depth Resolution - Load required data and resolve node implementation for Patternⱼ during refinement before descending or returning to the original context | PBFD |
| S₃ (TLE) | Ancestor Data Prepared | Loads ancestor-level metadata to support bitmask-based child node resolution | TLE |
| S₄ | Completion Phase | A top-down traversal phase used to finalize unprocessed nodes or patterns, ensuring full coverage and correctness prior to termination | PDFD, PBFD |
| S₄(i) | Level / Pattern Completion Phase | Completes all unprocessed nodes within Patternᵢ or level i during top-down finalization | PDFD, PBFD |
| S₄ (TLE) | Children Evaluated | Child Node Evaluation via Bitmask Logic – Determines structural inclusion or filtering | TLE |
| S₅ | Error / Failure Termination | Triggered when validation or refinement fails irrecoverably, or Rₘₐₓ (maximum refinement attempts) is exceeded | PDFD, PBFD |
| S₅ (TLE) | Bitmask Committed | Ancestor-Level Bitmask Update – Writes finalized selection to ancestor or top-level structure | TLE |
| S₆ (TLE) | Traversal Finalized | Indicates that the traversal is complete and no further node evaluation remains for the current resolution pass. | TLE |
| T | Termination | The successful conclusion of all phases: all nodes, patterns, and components are validated and finalized. Applies to both flat and hierarchical methods, including hybrid workflows (PBFD, PDFD). | All (DAD, DFD, BFD, CDD, PDFD, PBFD, TLE) |
| Symbol | Meaning |
|---|---|
| -> | Action Prefix / Event Sequencing: Defines sequential event occurrences where event a occurs then process P executes (Example: a -> P) |
| [] | External Choice: Lets the environment select between processes; either branch can occur depending on external input (Example: (event1 -> P1) [] (event2 -> P2)) |
| ; | Process Sequencing: Ensures process P completes (reaches SKIP) before process Q begins (Example: P ; Q) |
| SKIP | Successful Termination: Represents successful completion of an event or process |
| ? | Input Parameter: Receives input from the environment for parameterized events (Example: ?node) |
| ! | Output Parameter: Sends output to the environment for parameterized events (Example: !result) |
| [] x:S @ P | Indexed External Choice: Offers the environment a choice over all elements x of set S, each of which initiates process P (Example: [] c:NodeID @ process_c) |
| STOP | Deadlock / Halt: Represents a blocked state where no events are possible |
| ?x / !x | Channel Input / Output: Receives values via ?x or sends values via !x |
| if ... then ... else ... | Conditional Branching: Enables guard-based process selection |
| let ... within ... | Local Variable Assignment: Defines local variables for intermediate computation |
| RUN(A) | Infinite Acceptance: Accepts any event from alphabet A indefinitely |
| [T= P] | Trace Refinement: Verifies that process behavior conforms to specification P |
| \ | Hiding: Makes specified events internal and unobservable |
| [\| X \|] | Synchronized Parallel Composition: Executes two processes in parallel with required synchronization on events in set X while allowing independent execution of events outside X |
| \|~\| | Internal Non-deterministic Choice: Enables system-internal selection among multiple options without environment influence |
| \|\|\| | Interleaving / Independent Parallel: Executes processes independently without event synchronization |
| Symbol | Meaning |
|---|---|
| n | Number of root entities (grandparent units) |
| | Maximum number of children for any parent entity |
| c_id | Identifier of a specific child within a parent bitmask; used for bitwise indexing |
| | Variable number of parent entities for grandparent unit i |
| | Total number of parent entities across all grandparents |
| | Time complexity of a single lookup query (Theorem A.10.2) |
| | Time complexity of a single update operation (Theorem A.10.3) |
| | Total time complexity of processing all relationships (Theorem A.10.4) |
| | Variable bitmask size in bits for a parent entity j (e.g., 8, 16, 32, 64, or varchar(n)) |
| k | Bit length of a traditional foreign key used in the baseline relational representation |
| m | Total number of child relationships in the hierarchy |
| ĉ | The average number of children per parent across all parent entities |
| Ć | The average bitmask size (in bits) across all parent entities |
| w | Machine word size used for bitmask storage (e.g., 64 for BIGINT) |
| | Total storage size (in bits) required by the TLE model |
| | Total storage size (in bits) required by the traditional foreign key representation |
| Grandparent | Root-level entity that encapsulates multiple parent entities and their hierarchical context |
| Parent | Intermediate entity that manages child relationships through bitmask-based selection |
| Child | Leaf-level entity evaluated for inclusion/exclusion via parent's bitmask logic |
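To make the Grandparent, Parent, and Child roles concrete, the sketch below stores one bitmask per parent under a grandparent and updates it by `c_id`. The nested-dictionary layout, the entity names, and the fixed word size are assumptions for illustration, not the storage schema described in the paper.

```python
W = 64  # assumed machine word size w (e.g., a BIGINT column)


def set_child(mask, c_id):
    """Select child c_id under a parent by setting its bit."""
    if not 0 <= c_id < W:
        raise ValueError(f"c_id {c_id} does not fit in a {W}-bit mask")
    return mask | (1 << c_id)


def clear_child(mask, c_id):
    """Deselect child c_id by clearing its bit."""
    return mask & ~(1 << c_id)


def child_ids(mask):
    """List the ids of all selected children, lowest bit first."""
    ids, c_id = [], 0
    while mask:
        if mask & 1:
            ids.append(c_id)
        mask >>= 1
        c_id += 1
    return ids


# Hypothetical grandparent unit holding two parents, each with its own bitmask.
grandparent = {"north": 0b0101, "south": 0b0000}
grandparent["south"] = set_child(grandparent["south"], 3)
grandparent["north"] = clear_child(grandparent["north"], 0)
```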
Appendix A.2. DAD Mermaid Code, Algorithm, and Process Algebra
Appendix A.2.1. Structural Workflow Mermaid Code
Appendix A.2.2. State Machine Mermaid Code
Appendix A.2.3. Algorithm (Pseudo Code)
Appendix A.2.4. CSP Implementation and Formal Verification
- GitHub: https://github.com/IBM-Consulting-Formal-Methods/CDD_CSP (commit: 03b972d)
Appendix A.2.5. DAD (Directed Acyclic Development) Methodology Tables
| Pseudocode Term | Type | Description | Pseudocode Lines | CSP Mapping |
|---|---|---|---|---|
| Initialization | ||||
| LoadDAG(G) | Function | Initializes the DAD process by loading the Directed Acyclic Graph structure G | 1 | load_dag_actual!g_initial |
| queue Q ← [v₁] | Function | Initializes the processing queue Q with the root node v₁ | 2 | initialize_queue_actual!v1_root |
| Node Processing Loop | ||||
| Q is not empty | Condition | True if the processing queue Q still contains at least one node (loop continuation condition) | 3 | queue_not_empty |
| v ← Dequeue(Q) | Function | Removes and returns a node v from the front of the processing queue Q | 3a | dequeue_actual!node |
| Process(v) | Function | Perform core processing action for node v | 3b | process_actual!node |
| Dependency Validation | ||||
| ValidateDependencies(D(v)) | Function | Verify completeness of v's dependencies | 3c | validate_dependencies_actual!node |
| all_u_in_Dv_are_processed(v) | Condition | True if all direct dependencies of v are processed | 3d | all_dependencies_processed!node |
| Enqueue(children(v)) | Function | Add children of v to the queue for next iteration | 3e | generate_children_actual!node / enqueue_nodes_actual!children(node) |
| Graph Extension (Missing Dependencies) | ||||
| Else (missing dependency) | Control | Handles unresolved dependencies | 3f | missing_dependency!node |
| ExtendGraph(v_new) | Function | Add new node v_new and its necessary edges to the DAG to resolve dependency | 3g | extend_graph_actual!node!v_new_param |
| Enqueue(v_new) | Function | Enqueue new node v_new for future processing | 3h | enqueue_nodes_actual!{v_new} |
| Termination | ||||
| FinalValidation() | Function | Perform final validation and conclude workflow | 4 | perform_final_validation_actual |
| CSP Process | Key Transitions | Pseudocode Lines | CSP Events |
|---|---|---|---|
| S0 (Initialization) | DA1: →S1 (Load DAG & Init Queue) | 1-2 | load_dag_actual!g_initial, initialize_queue_actual!v1_root |
| S1 (Node Processing) | DA2: →S2ValidateOutcome(v) (Dequeue & Process) | 3a-3c | queue_not_empty, dequeue_actual!node, process_actual!node, validate_dependencies_actual!node |
| | DA6: →T_SUCCESS (All Nodes Processed) | 3, 4 | all_nodes_processed, perform_final_validation_actual |
| S2ValidateOutcome(v) | DA3: →S1 (Dependencies Processed) | 3d-3e | all_dependencies_processed!node, generate_children_actual!node, enqueue_nodes_actual!(children(node)) |
| | DA4: →S3ExtendCompletion(v_new) (Missing Dependency) | 3f-3g | missing_dependency!node, extend_graph_actual!node!v_new_param |
| S3ExtendCompletion(v_new) | DA5: →S1 (Enqueue New Node) | 3h | enqueue_nodes_actual!{v_new} |
| T_SUCCESS (Successful Termination) | N/A | N/A | terminate_successfully_actual |
| T_ERROR (Error Termination) | N/A | N/A | terminate_with_error_actual |
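The transition table above can be read as a small deterministic state machine. The sketch below encodes transitions DA1 to DA6 as a lookup table and replays one hypothetical run; state names are abbreviated (S2 for S2ValidateOutcome, S3 for S3ExtendCompletion), and the `run` helper is an illustrative assumption rather than part of the CSP model.

```python
# (current state, transition label) -> next state, following the table above.
TRANSITIONS = {
    ("S0", "DA1"): "S1",         # load DAG and initialize queue
    ("S1", "DA2"): "S2",         # dequeue and process a node
    ("S1", "DA6"): "T_SUCCESS",  # all nodes processed
    ("S2", "DA3"): "S1",         # dependencies processed, enqueue children
    ("S2", "DA4"): "S3",         # missing dependency, extend graph
    ("S3", "DA5"): "S1",         # enqueue the new node
}


def run(labels, state="S0"):
    """Replay a sequence of transition labels, rejecting any that is not enabled."""
    for label in labels:
        key = (state, label)
        if key not in TRANSITIONS:
            raise ValueError(f"transition {label} is not enabled in state {state}")
        state = TRANSITIONS[key]
    return state


# Hypothetical run: one node needs a graph extension, the next one does not.
final_state = run(["DA1", "DA2", "DA4", "DA5", "DA2", "DA3", "DA6"])
```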
Appendix A.2.6. Formal Verification Details for DAD Model and Guarantees
- Compression: default behavioral reduction (e.g., diamond elimination, sbisim).
- Search order: Breadth-first exploration (default, ensures shortest counterexample discovery).
- The model state space was fully explored. Verification confirms tractability and correctness for all ten critical assertions.
- Core safety and liveness (Assertions 1–3): Confirm predictable, non-blocking dependency-first traversal.
- Local processing and dependency control (Assertions 4–8): Enforce strict adherence to DA2–DA3 sequencing.
- Validation and termination (Assertions 9–10): Guarantee that traversal, final validation, and termination complete correctly.
Appendix A.3. DFD Mermaid Code, Algorithm, and Process Algebra
Appendix A.3.1. Structural Workflow Mermaid Code
Appendix A.3.2. State Machine Mermaid Code
Appendix A.3.3. Algorithm (Pseudo Code)
Appendix A.3.4. CSP Implementation and Formal Verification
- GitHub: https://github.com/IBM-Consulting-Formal-Methods/DFD_CSP (commit: b421b32)
Appendix A.3.5. DFD (Depth-First Development) Methodology Tables
| Pseudocode Term | Type | Description | Pseudocode Lines | CSP Mapping |
|---|---|---|---|---|
| Initialization | ||||
| LoadProject(T) | Function | Initializes tree structure | 1 | load_tree_actual!t_initial |
| stack ← [C₁] | Function | Initializes DFS stack | 2 | initialize_stack_actual!c_root |
| Node Processing Loop | ||||
| stack is not empty | Condition | Loop continuation | 4 | stack_not_empty!c |
| stack is empty | Condition | Termination check | 4 | stack_is_empty |
| C ← pop(stack) | Function | Pops node from stack | 4a | dequeue_actual!c |
| Process(C) | Function | Core processing | 4b | process_actual!c |
| Add C to Processed | Operation | Mark node as processed | 4c | Tracked in processed set parameter |
| Non-Leaf Processing | ||||
| C is a non-leaf | Condition | Node has children | 4d | is_non_leaf!c |
| push(reverse(children(C)), stack) | Function | Push children for DFS traversal | 4e | process_child_actual!c → push_children_actual!c → PushChildren process |
| Leaf Processing & Backtracking | ||||
| C is a leaf | Condition | Node is leaf | 4f | is_leaf!c |
| Bⱼ ← parent(C) | Function | Set backtrack point to parent | 4g | set_backtrack_point_actual!parent(c) |
| Bⱼ is not null | Condition | Backtracking loop continuation | 4h | Implicit in S2/S3 recursion |
| has_unprocessed_sibling(Bⱼ) | Condition | Check for unprocessed siblings | 4i | has_unprocessed_sibling!b_j |
| push(get_unprocessed_sibling(Bⱼ), stack) | Function | Push sibling to stack | 4j | get_unprocessed_sibling_actual!b_j → push_sibling_actual!sibling |
| no alternative siblings at Bⱼ | Condition | No unprocessed siblings remain | 4l | no_unprocessed_sibling!b_j |
| ValidateSubtree(Bⱼ) | Function | Subtree validation | 4m | validate_subtree_actual.Bⱼ |
| Termination Checks | ||||
| stack is empty and no_more_backtrack_points_above(Bⱼ) | Condition | Final termination check | 4n | no_more_backtrack_points_above!b_j |
| Terminate() | Function | Final termination | 4o, 5 | terminate_successfully_actual |
| Bⱼ ← parent(Bⱼ) | Function | Backtrack upward to parent | 4r | backtrack_to_actual!b_j!parent(b_j) |
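The stack-based traversal and upward backtracking described in the table can be sketched as follows. This is a simplified reading, not the paper's pseudocode: children are pushed in reverse so the leftmost child is processed first, and a backtrack point is validated once all of its children have been processed. The tree and all function names are hypothetical.

```python
def dfd(children, root):
    """Depth-first processing with bottom-up subtree validation."""
    parent = {c: p for p, kids in children.items() for c in kids}
    processed, validated = [], []
    done = set()
    stack = [root]
    while stack:
        c = stack.pop()
        processed.append(c)  # Process(C) and add C to Processed
        done.add(c)
        kids = children.get(c, [])
        if kids:
            # Non-leaf: push children in reverse so the first child is popped next.
            stack.extend(reversed(kids))
        else:
            # Leaf: backtrack upward while the backtrack point has no unprocessed child.
            b = parent.get(c)
            while b is not None and all(k in done for k in children[b]):
                validated.append(b)  # ValidateSubtree(B)
                b = parent.get(b)
    return processed, validated


# Hypothetical tree: A has children B and C; B has children D and E.
tree = {"A": ["B", "C"], "B": ["D", "E"]}
processed, validated = dfd(tree, "A")
```

Nodes are processed in pre-order, while subtrees are validated bottom-up, so B is validated before its parent A.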
| CSP Process | Key Transitions | Pseudocode Lines | CSP Events |
|---|---|---|---|
| S0 (Initialization) | DF1: →S1 (Load tree & initialize stack) | 1-2 | load_tree_actual!t_initial, initialize_stack_actual!c_root |
| S1 (Vertical Processing) | DF7: →T (Stack empty termination) | 4,5 | stack_is_empty, terminate_successfully_actual |
| | DF2: →S1 (Non-leaf processing) | 4a-4e | stack_not_empty!c, dequeue_actual!c, process_actual!c, is_non_leaf!c, process_child_actual!c, push_children_actual!c, PushChildren process (iterates over children) |
| | DF3: →S2 (Leaf processing) | 4a-4g | stack_not_empty!c, dequeue_actual!c, process_actual!c, is_leaf!c, set_backtrack_point_actual!parent(c) |
| S2(Bⱼ) (Backtracking) | DF4: →S1 (Process unprocessed sibling) | 4h-4j | has_unprocessed_sibling!b_j, get_unprocessed_sibling_actual!b_j, push_sibling_actual!sibling |
| | DF5: →S3 (No siblings, validate subtree) | 4h, 4l-4m | no_unprocessed_sibling!b_j, validate_subtree_actual!b_j |
| S3(Bⱼ) (Validation) | DF7: →T (Terminate at root) | 4n-4o | no_more_backtrack_points_above.Bⱼ, terminate_successfully_actual |
| | DF6: →S2 (Continue backtracking upward) | 4q-4r | subtree_validated.Bⱼ, backtrack_to_actual.parent(Bⱼ) |
| T (Termination) | Final state | 5 | terminate_successfully_actual |
Appendix A.3.6. Formal Verification Details for DFD Model and Guarantees
- Compression: default behavioral reduction (e.g., diamond elimination, sbisim)
- Search order: Breadth-first exploration (default, ensures shortest counterexample discovery)
- Core safety and liveness (Assertions 1–3): Confirm predictable, non-blocking traversal
- Local processing and control flow (Assertions 4–6, 8): Enforce strict adherence to stack-based sequencing (DF2→DF3)
- Validation and termination (Assertion 7): Guarantee that traversal and validation complete before halting
Appendix A.4. BFD Mermaid Code, Algorithm, and Process Algebra
Appendix A.4.1. Structural Workflow Mermaid Code
Appendix A.4.2. State Machine Mermaid Code
Appendix A.4.3. Algorithm (Pseudo Code)
Appendix A.4.4. CSP Implementation and Formal Verification
- GitHub: https://github.com/IBM-Consulting-Formal-Methods/BFD_CSP (commit: 2dd71de)
Appendix A.4.5. BFD (Breadth-First Development) Methodology Tables
| Pseudocode Term | Type | Description | Pseudocode Lines | CSP Mapping |
|---|---|---|---|---|
| Initialization | ||||
| LoadProject(T) | Function | Initializes tree structure | 1 | load_tree_actual!t_initial |
| level_queues ← [[C₁]] | Function | Initializes level queue structure | 2 | initialize_level_queues_actual!c_root |
| k ← 0 | Variable | Current level index | 3 | (tracked implicitly in S1 parameter lv) |
| Level Processing | ||||
| k < len(level_queues) | Condition | Check whether more levels remain | 5 | get_level_queue_actual!k |
| Qₖ is not empty | Condition | Nodes available at current level k | 7 | level_queue_not_empty!k |
| Qₖ is empty | Condition | Current level finished — trigger validation | 7 | level_queue_empty!k |
| Node Operations | ||||
| C ← Dequeue(Qₖ) | Function | Dequeues node from level k | 7a | dequeue_actual!k!C |
| Process(C) | Function | Perform core processing action for node C | 7b | process_actual!C |
| Add C to Processed | Operation | Mark node C as processed for validation/ordering | 7c | tracked in processed parameter of S1/S2 |
| for each child in children(C) → enqueue(child, level_queues[k+1]) | Function | Add C's children to next level queue (create next queue if needed) | 7d–7g | append_new_queue_actual!(k+1) (if needed) then enqueue_child_actual!(k+1)!child for each child |
| Validation & Level Transition | ||||
| ValidateLevel(k) | Function | Validate all nodes at level k; enter S2 (Validation) | 8 | validate_level_actual!k → (S2 entry) → level_validated!k |
| k ← k + 1 | Operation | Advance to next level after successful validation | 9a | level_validated!k → advance_level_actual!k |
| Termination | ||||
| k + 1 < len(level_queues) | Condition | Check for next level existence (Advance case) | 9 | level_validated!k → advance_level_actual!k |
| k + 1 ≥ len(level_queues) / no_more_levels | Condition | No further levels — final termination case | 10 | level_validated!k → no_more_levels!k |
| Terminate() | Function | Final termination of the algorithm | 10a, 10b | terminate_successfully_actual |
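The level-queue mechanics in the table above can be sketched as a short loop: each level is drained, children are appended to the next level's queue (created on demand), and the level is validated before k advances. The `validate_level` callback and the example tree are assumptions for illustration.

```python
from collections import deque


def bfd(children, root, validate_level=lambda k, nodes: True):
    """Breadth-first processing with one queue and one validation step per level."""
    level_queues = [deque([root])]
    processed, levels = [], []
    k = 0
    while k < len(level_queues):
        level_nodes = []
        queue = level_queues[k]
        while queue:
            c = queue.popleft()  # C <- Dequeue(Q_k)
            processed.append(c)  # Process(C) and add C to Processed
            level_nodes.append(c)
            for child in children.get(c, []):
                if k + 1 == len(level_queues):
                    level_queues.append(deque())  # create the next level queue
                level_queues[k + 1].append(child)
        if not validate_level(k, level_nodes):  # ValidateLevel(k)
            raise ValueError(f"validation failed at level {k}")
        levels.append(level_nodes)
        k += 1  # advance to the next level
    return processed, levels


# Hypothetical tree: A has children B and C; B has children D and E.
tree = {"A": ["B", "C"], "B": ["D", "E"]}
processed, levels = bfd(tree, "A")
```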
| CSP Process | Key Transitions | Pseudocode Lines | CSP Events |
|---|---|---|---|
| S0 | BF1: →S1 | 1-4 | load_tree_actual!t_initial, initialize_level_queues_actual!c_root |
| S1(k) | BF2: →S1 (process node) | 7a-7g | get_level_queue_actual!k, level_queue_not_empty!k, dequeue_actual!k!C, process_actual!C, [append_new_queue_actual!(k+1)]?, enqueue_child_actual!(k+1)!child* (* means repeated per child; ? means conditional append if the next level is not present) |
| | BF3: →S2 (Enter validation) | 7, 8 | get_level_queue_actual!k, level_queue_empty!k, validate_level_actual!k (enters S2; validation result is emitted from S2 as level_validated!k) |
| S2(k) | BF4: →S1 (advance level) | 9, 9a | level_validated!k, advance_level_actual!k, then continue at S1(k+1) |
| | BF5: →T (terminate) | 10, 10a | level_validated!k, no_more_levels!k, terminate_successfully_actual |
| T | N/A | final | terminate_successfully_actual |
Appendix A.4.6. Formal Verification Details for BFD Model and Guarantees
- Compression: Default behavioral reduction (e.g., diamond elimination, sbisim)
- Search order: Breadth-first state exploration
- Core safety and liveness (Assertions 1–2) guarantee no deadlocks or livelocks.
- Determinism (Assertion 3) ensures unique execution paths for any given state.
- Dequeue implies process and level validation (Assertions 4–5) ensure correct breadth-first hierarchical processing.
- Post-validation behavior and termination correctness (Assertions 6–8) guarantee that BFD completes all levels and nodes.
Appendix A.5. CDD Mermaid Code, Algorithm, and Process Algebra
Appendix A.5.1. Structural Workflow Mermaid Code
Appendix A.5.2. State Machine Mermaid Code
Appendix A.5.3. Algorithm (Pseudo Code)
Appendix A.5.4. CSP Implementation and Formal Verification
- GitHub: https://github.com/IBM-Consulting-Formal-Methods/CDD_CSP (commit: 03b972d)
Appendix A.5.5. CDD (Cyclic Directed Development) Methodology Tables
| Pseudocode Term | Type | Description | Pseudocode Lines | CSP Mapping |
|---|---|---|---|---|
| Initialization | ||||
| LoadGraph(G) | Function | Loads project graph | 1 | load_graph_actual!Graph |
| InitializeDependencies() | Function | Initializes dependencies | 2 | initialize_dependencies_actual |
| current_milestone ← 1 | Variable | Set initial milestone | 3 | (Implied in S1(M1) parameter) |
| Internal State | ||||
| refinement_counts | Variable | Tracks refinement attempts (parameter attempts in S2) | 4, 6o | (Abstracted as attempts parameter in S2) |
| Component Processing | ||||
| SelectAndProcessNode() | Function | Node processing action | 6e-6f | process_node_actual!NodeID |
| test_failed(C) | Condition | Test failure → S2 (CD3a) | 6h | test_failed_actual!NodeID |
| feedback_triggered(C) | Condition | Feedback detected → S2 (CD3b) | 6h | feedback_triggered_actual!NodeID |
| all_components_written(k) | Condition | Milestone complete check | 6b | all_components_written_actual!MilestoneID |
| Refinement | ||||
| RefineComponent(C) | Function | Initiates refinement attempt | 6p | refine_component_actual!NodeID → refinement_confirmed_actual!NodeID |
| refinement_successful(C) | Condition | Refinement successful | 6q | refinement_complete_actual!NodeID |
| refinement_failed(C) | Condition | Refinement failed → check Rmax | 6s | refinement_failed_actual!NodeID |
| Validation | ||||
| ValidateIncrement(k) | Function | Validates milestone increment k | 6v | validate_increment_actual!MilestoneID |
| validation_failed | Condition | Validation failed → S2 (CD6) | 6w | validation_failed_actual!MilestoneID |
| feedback_received | Condition | Feedback received after validation → S2 (CD6) | 6w | feedback_received_actual!MilestoneID |
| IdentifyFlaw() | Function | Identifies flawed component | 6x | identify_flaw_actual?NodeID |
| Termination | ||||
| current_milestone < L | Condition | Advance to next milestone check | 6aa | milestone_lt(k, L_max) (Implied in S3 logic) |
| current_milestone += 1 | Variable Assignment | Increments milestone counter | 6ab | advance_milestone_actual!Next_Milestone(k) |
| FinalDeployment() | Function | Final deployment | 6ae | final_deployment_actual |
| TerminateSuccess() | Function | Successful termination | 7, 6ae | final_development_actual → terminate_successfully_actual |
| TerminateWithError() | Function | Error termination (Rmax exceeded) | 8, 6m, 6t | terminate_with_error_actual!NodeID |
| CSP Process | Key Transitions | Pseudocode Lines | CSP Events |
|---|---|---|---|
| S0 | CD1: →S1 (Load & init) | 1-5 | load_graph_actual!Graph, initialize_dependencies_actual |
| S1(k, n1..n5) | CD2: →S1 (Process success) | 6e-6g | process_node_actual!C → mark_completed → S1 self-loop |
| | CD3a: →S2 (Test failure) | 6h-6j | process_node_actual!C → test_failed_actual!C → S2(C, k, n1..n5, 0) |
| | CD3b: →S2 (Feedback) | 6h-6j | process_node_actual!C → feedback_triggered_actual!C → S2(C, k, n1..n5, 0) |
| | CD5: →S3 (Milestone complete) | 6b-6c | all_components_written_actual!k → validate_increment_actual!k → S3(k, n1..n5) |
| S2(c, k, n1..n5, attempts) | CD4a: →S1 (Refinement success) | 6p-6r | refine_component_actual!c → refinement_confirmed_actual!c → refinement_complete_actual!c → S1(k, n1..n5) |
| | CD4b: →S0 (Error termination; S0 is used instead of T for FDR liveness verification) | 6m, 6t | refine_component_actual!c → refinement_confirmed_actual!c → refinement_failed_actual!c → [Rmax check] → terminate_with_error_actual!c → S0 |
| S3(k, n1..n5) | CD6: →S2 (Validation failure) | 6w-6y | (validation_failed_actual!k → identify_flaw_actual?c → mark_not_completed) □ (feedback_received_actual!k → identify_flaw_actual?c → mark_not_completed) → S2(c, k, n1..n5, 0) |
| | CD8: →S1 (Advance milestone) | 6z-6ac | milestone_lt(k, L_max) → advance_milestone_actual!Next_Milestone(k) → S1(Next_Milestone(k), NotCompleted, ...) |
| | CD7: →S0 (Final success) | 6ad-6ae | ¬ milestone_lt(k, L_max) → final_development_actual → terminate_successfully_actual → S0 |
| T | Termination | final | Not explicitly used as a final state; replaced by → S0 for liveness verification. |
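
The transitions tabulated above can be illustrated with a small simulation. The following is a minimal sketch, not the authors' CSP model: the function names `run_cdd` and `refine`, the callback parameters `test_fails` and `refine_ok`, and the default bound `r_max=3` are all hypothetical. Event names follow the table but use dotted suffixes, and the sketch omits the validation-failure path (CD6), `refinement_confirmed_actual`, and `final_development_actual` for brevity.

```python
from typing import Callable, List, Tuple


def refine(c: str, refine_ok: Callable[[str, int], bool],
           r_max: int, trace: List[str]) -> bool:
    """S2: retry refinement of component c at most r_max times (CD4a/CD4b)."""
    for attempt in range(1, r_max + 1):
        trace.append(f"refine_component_actual.{c}")
        if refine_ok(c, attempt):
            trace.append(f"refinement_complete_actual.{c}")
            return True
        trace.append(f"refinement_failed_actual.{c}")
    return False  # attempts exhausted


def run_cdd(milestones: List[List[str]],
            test_fails: Callable[[str], bool],
            refine_ok: Callable[[str, int], bool],
            r_max: int = 3) -> Tuple[str, List[str]]:
    """Walk the milestones in order and return (outcome, event trace)."""
    trace = ["load_graph_actual", "initialize_dependencies_actual"]  # CD1
    for k, components in enumerate(milestones, start=1):
        for c in components:                                         # S1(k)
            trace.append(f"process_node_actual.{c}")
            # CD3a/CD3b lead to S2; CD4b fires once the bound is exhausted.
            if test_fails(c) and not refine(c, refine_ok, r_max, trace):
                trace.append(f"terminate_with_error_actual.{c}")
                return "error", trace
        trace.append(f"validate_increment_actual.{k}")               # CD5
        if k < len(milestones):
            trace.append(f"advance_milestone_actual.{k + 1}")        # CD8
    trace.append("terminate_successfully_actual")                    # CD7
    return "success", trace
```

Calling `run_cdd([["N1"]], lambda c: True, lambda c, a: False)` models an environment in which every test and every refinement fails; under the assumptions of this sketch the run ends with `terminate_with_error_actual.N1` after exactly `r_max` refinement attempts.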
Appendix A.5.6. Formal Verification Details for CDD Model and Guarantees
- Compression: Default behavioral reduction (e.g., diamond elimination, sbisim)
- Search order: Breadth-first state exploration
- N4 (Assertion 6): Verified that N4 cannot execute until both N2 and N3 complete. Trace refinement confirms all observable behaviors respect this dependency.
- N5 (Assertion 7): Verified that N5 cannot execute until N4 completes. Trace refinement confirms strict sequential enforcement.
- Using the Hostile Environment technique, the system is exposed to persistent refinement failures:
  - Always triggers validation_failed_actual
  - Always triggers refinement_failed_actual
- Passing deadlock and divergence checks confirms:
  - Maximum Rₘₐₓ attempts are enforced.
  - System terminates with terminate_with_error_actual.
  - Infinite refinement loops are prevented.
- Core safety and liveness (Assertions 1–2) guarantee no deadlocks or livelocks.
- Protocol compliance (Assertions 3–4) ensures deployment sequences conform to the expected events.
- Initial guard (Assertion 5) prevents premature shutdown before initialization.
- Internal consistency (Assertion 10) ensures mutually exclusive event sequences cannot occur.
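
The dependency assertions for N4 and N5 can be illustrated on a single execution trace. The sketch below is an assumption-laden illustration only: the helper `respects_dependencies` and the `DEPS` mapping are hypothetical names, and the mapping encodes just the two dependencies stated above. It checks one trace at a time, whereas the FDR trace-refinement checks cover all traces of the model.

```python
from typing import Dict, List, Set

# Prerequisites taken from the text: N4 needs N2 and N3; N5 needs N4.
DEPS: Dict[str, Set[str]] = {"N4": {"N2", "N3"}, "N5": {"N4"}}


def respects_dependencies(trace: List[str], deps: Dict[str, Set[str]]) -> bool:
    """Return True if each node appears only after all of its prerequisites."""
    seen: Set[str] = set()
    for node in trace:
        if not deps.get(node, set()) <= seen:
            return False  # node executed before a prerequisite completed
        seen.add(node)
    return True
```

For example, the trace `["N1", "N2", "N4", "N3", "N5"]` is rejected because N4 runs before N3 has completed.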
Appendix A.6. PDFD Mermaid Code, Algorithm, and Process Algebra
Appendix A.6.1. Structural Workflow Mermaid Code
Appendix A.6.2. State Machine Mermaid Code
Appendix A.6.3. Algorithm (Pseudo Code)
Appendix A.6.4. CSP Implementation and Formal Verification
- GitHub: https://github.com/IBM-Consulting-Formal-Methods/PDFD_CSP (commit: b5107ac)
Appendix A.6.5. PDFD (Primary Depth-First Development) Methodology Tables
| Pseudocode Term | Type | Description | Pseudocode Lines | CSP Mapping |
|---|---|---|---|---|
| Initialization | ||||
| Load T, initialize | Procedure | Initializes tree T and refinement attempt counters to zero. | 1-3 | (Implicit) |
| call S1_InitialProcess(L1) | Call | Starts the process at the initial level L1. | 6 | PD1: process_level!L1 → S1_InitialProcess(L1) |
| S₁: Level Processing | ||||
| Process_Level(i) | Procedure | Performs the core processing for the given level i or j. | 14, 25 | process_level!i |
| if refinement_attempts[i] ≥ R_MAX | Condition | Checks if refinement attempts for the current level are exhausted. | 11, 22 | PD8: cond_refinement_exhausted?i → S5 |
| S₂: Validation | ||||
| is_threshold_met = Validate_Level(i) | Function | Performs the level validation check. | 32, 54 | validate_level!i |
| if is_threshold_met | Condition | Threshold met (PD2b) or refinement success (PD3a/3b). | 34, 56 | cond_threshold_met?i |
| call S1_InitialProcess(Next(i)) | State Transition | Advances to process the next level. | 40 | PD2b: S1_InitialProcess(Next(i)) |
| if j < i_orig | Condition | Successful refinement continues deeper. | 59 | PD3a: cond_j_lt_i.j.i_orig |
| else: call S2_LevelValidation(i_orig) | State Transition | Successful refinement resumes validation context. | 63-64 | PD3b: cond_j_eq_i.j.i_orig → S2_LevelValidation(i_orig) (CSP uses S2_LevelValidation which includes S3 call) |
| Refinement / Bottom-Up Logic | ||||
| if (i=L) OR ... (has_no_children(i)) | Condition | Checks if Bottom-Up is mandatory or an option (PD4). | 36 | cond_has_no_children?i |
| j = Find_Refinement_Origin(i, L) | Function | Identifies the root cause level j for refinement backtracking. | 44, 67, 89, 113 | cond_refinement_available?j (Non-deterministic choice) |
| refinement_attempts[j] += 1 | Action | Increments refinement attempt counter for level j. | 46, 69, 91, 115 | increment_attempts!j |
| call S1_RefinementProcess(j, i_orig) | State Transition | Transitions to the Level Processing state for refinement. | 47, 70, 92, 116 | S1_RefinementProcess(j, i_orig) |
| S₃: Bottom-Up Completion | ||||
| Finalize_Subtrees(i) | Procedure | Processes and validates subtrees at the current level. | 77 | finalize_subtrees!i |
| if is_validated | Condition | Checks if all nodes in a subtree are successfully validated. | 80 | cond_all_descendants_validated?i |
| if i != L1: call S3_BottomUpCompletion(Prev(i)) | State Transition | Continues bottom-up to the previous level (PD4a). | 81-83 | S3_BottomUpCompletion(Prev(i)) |
| else: call S4_TopDownCompletion(L1) | State Transition | Transitions to the Top-Down Completion state (PD5). | 84-86 | S4_TopDownCompletion(L1) |
| S₄: Top-Down Completion | ||||
| Finalize_Unprocessed_Nodes(i) | Procedure | Finalizes and validates any remaining unprocessed nodes. | 99 | finalize_unprocessed!i |
| if i != L5: call S4_TopDownCompletion(Next(i)) | State Transition | Continues top-down to the next level (PD6). | 103-105 | S4_TopDownCompletion(Next(i)) |
| else: call T | State Transition | Transitions to the successful termination state (PD7). | 106-108 | T |
| if Trace_Origin_Exists(i) | Condition | Checks if refinement is possible after failure (PD6a). | 111 | cond_trace_origin_exists?i |
| else: call S5 | State Transition | Transitions to the terminal error state (PD6b). | 121-122 | cond_trace_origin_not_exists?i → S5 |
| Final Outcome | ||||
| call T | Termination | The system terminates successfully. | 125-126 | terminate_success → T |
| call S5 | Termination | The system terminates with an error. | 129-130 | terminate_error → S5 |
| CSP Process | Key Transitions | Pseudocode Lines | CSP Events (Simplified) |
|---|---|---|---|
| S₀ | PD1: Initial start | 1–6 | process_level!L1 → S1_InitialProcess(L1) |
| S₁_InitialProcess(i) | PD2: Core sequence start | 9–14 | process_level!i → S2_LevelValidation(i) |
| | PD8: Exhaustion check | 11 | cond_refinement_exhausted?i → S5 |
| S₁_RefinementProcess(j, i_orig) | PD3: Core sequence start | 20–25 | process_level!j → S2_RefinementValidation(j, i_orig) |
| | PD8: Exhaustion check | 22 | cond_refinement_exhausted?j → S5 |
| S₂_RefinementValidation(j, i_orig) | PD3 (Entry) | 53–54 | validate_level!j → ... |
| | PD3a/PD3b: Refinement success | 56–64 | cond_threshold_met?j → S3_RefinementResolution(...) |
| | PD3c: Refinement failure | 66–73 | cond_threshold_not_met?j → (refinement choice) |
| S₃_RefinementResolution(j, i_orig) | PD3a: Continue deep refinement | 58–61 | cond_j_lt_i.j.i_orig → S1_RefinementProcess(Next(j), i_orig) |
| | PD3b: Resume validation context | 62–64 | cond_j_eq_i.j.i_orig → S2_LevelValidation(i_orig) |
| S₂_LevelValidation(i) | PD2b: Advance level | 39–40 | cond_threshold_met?i → S1_InitialProcess(Next(i)) |
| | PD4: Go bottom-up (mandatory) | 48–50 | cond_has_no_children?i → S3_BottomUpCompletion(i) |
| | PD2a: Refine (failure path) | 44–47 | cond_refinement_available?j → increment_attempts!j → S1_RefinementProcess(j, i) |
| S₃_BottomUpCompletion(i) | PD4a: Move up | 80–83 | finalize_subtrees!i → cond_all_descendants_validated?i → S3_BottomUpCompletion(Prev(i)) |
| | PD5: Start top-down | 84–86 | finalize_subtrees!i → cond_all_descendants_validated?i → S4_TopDownCompletion(L1) |
| | PD4b: Refine (failure) | 88–95 | cond_not_all_descendants_validated?i → SimpleRefinementHandler(i) |
| S₄_TopDownCompletion(i) | PD6: Move down | 102–105 | finalize_unprocessed!i → cond_all_descendants_validated?i → S4_TopDownCompletion(Next(i)) |
| | PD7: Success | 106–108 | finalize_unprocessed!i → cond_all_descendants_validated?i → T |
| | PD6a: Refine (failure) | 110–119 | cond_not_all_descendants_validated?i → cond_trace_origin_exists?i → SimpleRefinementHandler(i) |
| | PD6b: Error | 120–122 | cond_not_all_descendants_validated?i → cond_trace_origin_not_exists?i → S5 |
| S₅ / T | Termination | 125–130 | terminate_error → S5 / terminate_success → T |
Appendix A.6.6. Formal Verification Details for PDFD Model and Guarantees
- Five core levels (L1–L5)
- Core and refinement transitions
- The refinement attempt counter
1. Structural Integrity (1 Assertion)
2. Consistency and Soundness (6 Assertions)
3. Liveness and Bounded Termination (4 Assertions)
Appendix A.7. PBFD Mermaid Code, Algorithm, and Process Algebra
Appendix A.7.1. Structural Workflow Mermaid Code
Appendix A.7.2. State Machine Mermaid Code
Appendix A.7.3. Algorithm (Pseudo Code)
Appendix A.7.4. CSP Implementation and Formal Verification
- GitHub: https://github.com/IBM-Consulting-Formal-Methods/PBFD_CSP (commit: ea1a3bc)
Appendix A.7.5. PBFD (Primary Breadth-First Development) Methodology Tables
| Pseudocode Term | Type | Description | Pseudocode Lines | CSP Mapping |
|---|---|---|---|---|
| Initialization | ||||
| Load T | System Function | Initializes tree structure and pattern hierarchy | PBFD: 1 | load_tree_actual |
| initialize refinement_attempts | System Function | Sets all level refinement counters to 0 | PBFD: 1 | initialize_refinement_attempts_actual |
| currentState ← S1_InitialProcess | State Transition | Begins main pattern processing (PB1) | PBFD: 2 | S1_InitialProcess(L1) |
| Pattern Processing | ||||
| Process Patternᵢ | Pattern Function | Executes core pattern processing (PB2) | PBFD: 6 | process_pattern_actual.i |
| Validate Patternᵢ | Validation Action | Performs pattern validation (PB4 Action) | PBFD: 12, 27 | validate_pattern_actual.i |
| ∃n ∈ Patternᵢ: ¬validated(n) | Validation Condition | Pattern validation failed (PB2) | PBFD: 7, 22, 29, 32 | cond_not_all_validated?i |
| ∀n ∈ Patternᵢ: validated(n) | Validation Condition | Pattern validation succeeded (PB2a, PB4) | PBFD: 9, 13, 24, 27 | cond_all_validated?i |
| Refinement Control | ||||
| Find j | Trace Function | Identifies minimal root cause level j (PB3/PB7a) | HandlePBFDFailureRefinement: 1 | (Implicit in TryTraceOrigin using cond_trace_origin) |
| affected_by_unprocessed | Trace Function | Finds patterns affecting unprocessed nodes | PBFD: 57 | (Implicit in TryTraceOrigin_Completion) |
| refinement_attempts[j]++ | Counter Operation | Increments refinement attempts for level j (PB3/PB3a2/PB7a) | HandlePBFDFailureRefinement: 3, PBFD: 30 | increment_refinement_attempts_actual.j |
| refinement_attempts[j] ≥ Rₘₐₓ | Limit Check | True when refinement attempts for level j ≥Rmax (PB3c/PB3a3/PB7b/PB9) | HandlePBFDFailureRefinement: 5 (else branch), PBFD: 18, 32 | cond_ref_attempts_ge_Rmax?j |
| refinement_attempts[j] < Rₘₐₓ | Limit Check | True when refinement attempts for level j <Rmax (PB3/PB3a2/PB7a) | HandlePBFDFailureRefinement: 2, PBFD: 29 | cond_ref_attempts_lt_Rmax?j |
| HandlePBFDFailureRefinement | Procedure | Handles PB3/PB3c/PB7a/PB7b logic | PBFD: 16, 57 | TryTraceOrigin_Initial/Completion |
| Critical Children Selection | ||||
| available_children(Patternᵢ) | Function | Returns set of direct child nodes: {c ∈ V | ∃n ∈ Patternᵢ: (n,c) ∈ E} | PBFD: 37 | (Implied by resolve_depth_actual) |
| is_on_critical_path(c) | Predicate | True if node c lies on critical path from roots to leaves | select_critical_children | (Not directly mapped, external logic) |
| has_high_fanout(c) | Predicate | True if node c has ≥3 dependents | select_critical_children | (Not directly mapped, external logic) |
| is_foundational_component(c, level) | Predicate | True if node c provides foundational services for its level | select_critical_children | (Not directly mapped, external logic) |
| select_critical_children(available_children, level) | Procedure | Selects architecturally critical nodes for Patternᵢ₊₁ | PBFD: 38 | select_critical_children_actual.i |
| Depth Processing | ||||
| Patternᵢ₊₁ ≠ ∅ | Existence Check | True when the next level has pattern entries (PB4a) | PBFD: 39 | cond_pattern_next_nonempty.i |
| i < L | Boundary Check | True when not at max level (PB4a/PB7) | PBFD: 39, 52 | cond_i_lt_L?i |
| i = L | Boundary Check | True at max level (PB4b/PB8) | PBFD: 41, 54 | cond_i_eq_L?i |
| Patternᵢ₊₁ = ∅ | Existence Check | True when the next level has no pattern entries (PB4b) | PBFD: 41 | cond_pattern_next_empty?i |
| Completion Phase | ||||
| Finalize Patternᵢ | Completion Function | Processes remaining nodes (PB7/PB8) | PBFD: 50 | finalize_pattern_actual.i |
| processed(n) | State Predicate | True when node n is fully processed (P(n)=1 ∨ P(n)=2) | Implied by PBFD: 51, 56 | (Implied by cond_all_processed) |
| ∃n∈Patternᵢ:¬processed(n) | Validation Condition | Pattern has unprocessed nodes (PB7a/PB7b) | PBFD: 56 | cond_not_all_processed?i |
| ∀n∈Patternᵢ:processed(n) | Validation Condition | All nodes processed (PB7/PB8) | PBFD: 51 | cond_all_processed?i |
| Termination | ||||
| S5 | Error State | Terminal state for all error conditions (PB3c/PB3a3/PB7b/PB9) | PBFD: 60 | terminate_failure_actual → S5 |
| T | Success State | Terminal state for successful completion (PB8) | PBFD: 61 | terminate_success_actual → T |
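
The critical-children terms in the table above can be sketched in a few lines. This is an illustrative sketch under stated assumptions, not the authors' implementation: edges are (prerequisite, dependent) pairs, "dependents of c" is read as the out-neighbours of c, the two externally defined predicates are passed in as callables, and the three predicates are combined by disjunction (the source does not specify how they combine).

```python
from typing import Callable, Set, Tuple

Edge = Tuple[str, str]  # (n, c): c is a direct child (dependent) of n


def available_children(pattern: Set[str], edges: Set[Edge]) -> Set[str]:
    """{c in V | exists n in pattern with (n, c) in E}."""
    return {c for (n, c) in edges if n in pattern}


def has_high_fanout(c: str, edges: Set[Edge], threshold: int = 3) -> bool:
    """True if node c has at least `threshold` dependents (out-edges)."""
    return sum(1 for (n, _) in edges if n == c) >= threshold


def select_critical_children(children: Set[str], edges: Set[Edge],
                             is_on_critical_path: Callable[[str], bool],
                             is_foundational: Callable[[str], bool]) -> Set[str]:
    """Keep children satisfying any criticality predicate (assumed OR)."""
    return {c for c in children
            if is_on_critical_path(c)
            or has_high_fanout(c, edges)
            or is_foundational(c)}
```

With edges a→b, a→c, b→d, b→e, b→f, and c→g, the children of the pattern {a} are {b, c}; only b has three dependents, so it is the one selected when the other two predicates return false.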
| CSP Process | Key Transitions (PB Ref.) | Pseudocode Lines | CSP Events (Simplified) |
|---|---|---|---|
| S0 | PB1: → S1_InitialProcess(L1) | PBFD: 1-2 | load_tree_actual → initialize_refinement_attempts_actual → S1_InitialProcess(L1) |
| S1_InitialProcess(i) | PB2: False → S2; PB2a: True → S3 | PBFD: 6-10 | process_pattern_actual.i → (cond_not_all_validated?i → S2_ValidationInitial(i) [] cond_all_validated?i → S3_DepthProgression(i)) |
| S2_ValidationInitial(i) | PB4: True → S3; PB3/PB3c: False → TryTraceOrigin | PBFD: 12-16 | validate_pattern_actual.i → (cond_all_validated?i → S3_DepthProgression(i) [] cond_not_all_validated?i → TryTraceOrigin_Initial(i)) |
| S1_RefinementProcess(j,i_orig) | PB9: attempts ≥ Rmax → S5; PB3a: attempts < Rmax → S2 | PBFD: 18-25 | (cond_ref_attempts_ge_Rmax?j → S5) [] cond_ref_attempts_lt_Rmax?j → process_refinement_pattern_actual.j → … |
| S2_ValidationRefinement(j,i_orig) | PB3a1: True → S3; PB3a2: False, attempts < Rmax → S1; PB3a3: False, attempts ≥ Rmax → S5 | PBFD: 27-33 | validate_refinement_pattern_actual.j → (cond_all_validated?j → S3_RefinementDepthResolution(j, i_orig) [] cond_not_all_validated?j → …) |
| S3_DepthProgression(i) | PB4a: i < L, Patternᵢ₊₁ ≠ ∅ → S1(i+1); PB4b: i = L ∨ Patternᵢ₊₁ = ∅ → S4(1) | PBFD: 37-42 | resolve_depth_actual.i → select_critical_children_actual.i → (cond_pattern_next_nonempty?i ∧ cond_i_lt_L?i → S1_InitialProcess(i+1) [] … → S4(L1)) |
| S3_RefinementDepthResolution(j,i_orig) | PB5: j < i_orig → S1(Next(j)); PB6: j = i_orig → S3(i_orig) | PBFD: 44-48 | resolve_refinement_depth_actual.j → (if LessThan(j, i_orig) then S1_RefinementProcess(Next(j), i_orig) else S3_DepthProgression(i_orig)) |
| S4(i) | PB7: i < L, processed → S4(i+1); PB8: i = L, processed → T; PB7a/PB7b: ¬processed → TryTraceOrigin | PBFD: 50-57 | finalize_pattern_actual.i → (cond_all_processed?i → (cond_i_lt_L?i → S4(i+1) [] cond_i_eq_L?i → T) [] cond_not_all_processed?i → TryTraceOrigin_Completion(i)) |
| S5 | N/A (Terminal Failure State) | PBFD: 60 | terminate_failure_actual → S5 |
| T | N/A (Terminal Success State) | PBFD: 61 | terminate_success_actual → T |
Appendix A.7.6. Formal Verification Details for PBFD Model and Refinement Guarantees
- Three depth levels: L1, L2, L3. The verification guarantees correctness up to this depth.
- State set: S0 through S5 and T
- Full transition set: PB1–PB9 from Table 40
- Bounded refinement: R_max = 5
- Complete conditional environment: Both legal and hostile variants
| Category | Count | Coverage |
|---|---|---|
| Core Safety/Liveness | 5 | System deadlock/divergence freedom plus initialization safety |
| State-Level Safety | 26 | All operational and terminal states across all level combinations |
| Conditional Soundness | 1 | Mutual exclusivity of conditional predicates |
| Hostile Environment | 2 | Adversarial robustness under non-cooperative inputs |
| Total | 33 | Complete verification |
- Bounded progression through at most 3 levels
- Bounded refinement with at most R_max = 5 attempts per level
- Guaranteed termination at either T (success) or S5 (error)
- Load pbfd_model.csp in FDR 4.2.7
- Run all 33 assertions
- Expected outcome: all checks pass with no warnings or counterexamples
Appendix A.8. Formal Proofs

Appendix A.8.1. Termination Measure and State Transition Analysis
| Term / Invariant Name | Type | Formal Definition / Condition |
|---|---|---|
| processing_complete(i) | Predicate | All nodes n in level(i) have been processed by the current phase's validation logic. |
| descendants_validated(n) | Predicate | All nodes in the processed subtree rooted at n have been permanently finalized (P(n) = 2). |
| nrl(j) | Function | The Next Refinement Level function, returning the lowest level k > j that still requires validation. |
| Kᵢ | Constant | A fixed batch size threshold for level i, used to trigger a batch commit in transition PD2b. |
| Descendant Finalization Invariant | Invariant | A node n is finalized only if all its processed descendants are finalized. |
| Refinement Locality Invariant | Invariant | Any backtrack targets j = trace_origin(i) and the refinement scope is contiguous. |
| Level-wise Ordering Invariant | Invariant | New patterns at level i+1 are produced only after Patternᵢ is validated. (Ensured by PB4a guard.) |
| Top-down Finalization Invariant | Invariant | The S₄ completion phase proceeds sequentially from level 1 up to L, ensuring no level is skipped. (PB7) |
| Refinement Locality Invariant (PBFD) | Invariant | Any backtrack targets j = trace_origin(i) and the refinement scope is limited to levels k ∈ [j, i]. (PB3) |
- k₁: Count of unfinalized nodes — k₁ = |{n ∈ G | P(n) ≠ 2}|. (Highest priority.)
- k₂: Remaining refinement attempts across all levels — k₂ = ∑_{j ∈ ActiveLevels} (Rₘₐₓ − refinement_attempts(j)). (Finite, >0 in non-terminal states while attempts remain.)
- k₃: Phase ordinal, k₃ ∈ {4, 3, 2, 1, 0}, with the phases mapped as S₀ = 4, S₁ = 3, S₂ = 2, S₃ = 1, S₄ = 0. A transition to a later phase reduces the numerical value of k₃.
- k₄: Intra-phase progress measure, k₄ ∈ ℕ (e.g., remaining nodes in a batch or pattern).
- k₁ decreases only on commit/finalization transitions (when nodes are permanently set P(n)=2).
- k₂ strictly decreases on refinement-entry transitions (each such transition consumes one refinement attempt for a level).
- k₃, k₄ measure local progress within phases and provide the necessary descent when k₁, k₂ remain unchanged for short steps. Multiple-component (lexicographic or multi-ranking) proofs remain a mainstream tool in termination analysis [125].
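
As an illustration of how the measure M = (k₁, k₂, k₃, k₄) could be computed from a concrete state, consider the following sketch. The function `measure`, its parameters, and the dictionary-based state encoding are hypothetical; in particular, every key of `attempts` is treated as an active level. Python compares tuples lexicographically, which matches the ordering used here.

```python
from typing import Dict, Tuple

FINALIZED = 2  # P(n) = 2 marks a permanently finalized node
PHASE_ORDINAL = {"S0": 4, "S1": 3, "S2": 2, "S3": 1, "S4": 0}


def measure(P: Dict[str, int], attempts: Dict[int, int],
            r_max: int, phase: str, k4: int) -> Tuple[int, int, int, int]:
    """Build M = (k1, k2, k3, k4) for a state (assumed encoding)."""
    k1 = sum(1 for status in P.values() if status != FINALIZED)  # unfinalized nodes
    k2 = sum(r_max - a for a in attempts.values())               # remaining attempts
    return (k1, k2, PHASE_ORDINAL[phase], k4)


# A refinement entry from S2 back to S1 consumes one attempt at level 1.
before = measure({"a": 0, "b": 0}, {1: 0, 2: 0}, r_max=5, phase="S2", k4=0)
after = measure({"a": 0, "b": 0}, {1: 1, 2: 0}, r_max=5, phase="S1", k4=0)
assert after < before  # k2 drops from 10 to 9 although k3 rises from 2 to 3
```

The example shows the compensation pattern used in the tables that follow: the phase ordinal increases on backtracking, but the higher-priority component k₂ has already decreased.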
| Rule | Transition | ΔM (Δk₁,Δk₂,Δk₃,Δk₄) | Key Condition | Type | Progress Justification (first non-zero component) |
|---|---|---|---|---|---|
| PD1 | S₀ → S₁(1) | — | i = 1 (initial) | Initial | Initialization (not used in lexicographic descent) |
| PD2 | S₁(i) → S₂(i) | (0,0,↓,↓) | processing_complete(i) ∧ ∃ n ∈ level(i): ¬validated(n) | Non-terminal | k₃ decreases (S₁→S₂) → progress |
| PD2a | S₂(i) → S₁(j) | (0,↓,↑,0) | j = trace_origin(i) ∧ refinement_attempts(j) < Rₘₐₓ (backtrack/refinement entry) | Non-terminal | k₂ decreases (attempt consumed) → progress |
| PD2b | S₂(i) → S₁(i+1) | (↓,0,↑,0) | ∑_{n ∈ level(i)} [P(n)=2] ≥ Kᵢ (commit/finalize batch) | Non-terminal | k₁ decreases (batch commit) → progress |
| PD3 | S₁(j) → S₂(j) | (0,0,↓,↓) | processing_complete(j) ∧ ∃ n ∈ level(j): ¬validated(n) | Non-terminal | k₃ decreases (S₁→S₂) → progress |
| PD3a | S₂(j) → S₁(nrl(j), i_orig) | (0,0,0,↓) | ∀ n ∈ level(j): validated(n) ∧ j < i (advance to next refinement level nrl(j)) | Non-terminal | k₄ decreases (intra-phase progress) → progress — PD3a treated intra-phase for M |
| PD3b | S₂(j) → S₂(i) | (0,0,0,↓) | ∀ n ∈ level(j): validated(n) ∧ j = i (resume original validation at level i) | Non-terminal | k₄ decreases (intra-phase progress) → progress |
| PD3c | S₂(j) → S₁(j) | (0,↓,↑,0) | processing_complete(j) ∧ ∃ n ∈ level(j): ¬validated(n) ∧ refinement_attempts(j) < Rₘₐₓ (retry refinement — consumes attempt) | Non-terminal | k₂ decreases (attempt consumed) → progress |
| PD4 | S₂(i) → S₃(i) | (0,0,↓,0) | processing_complete(i) ∧ (i = L ∨ level(i+1) = ∅) | Non-terminal | k₃ decreases (S₂→S₃) → progress |
| PD4a | S₃(i) → S₃(i−1) | (0,0,0,↓) | ∀ n ∈ level(i): validated(n) ∧ descendants_validated(n) | Non-terminal | k₄ decreases (intra-phase progress) → progress |
| PD4b | S₃(i) → S₁(j) | (0,↓,↑,0) | ∃ n ∈ level(i): ¬validated(n) ∧ j = trace_origin(i) ∧ refinement_attempts(j) < Rₘₐₓ (backtrack from bottom-up) | Non-terminal | k₂ decreases (attempt consumed) → progress |
| PD5 | S₃(2) → S₄(1) | (0,0,↓,↓) | i = 2 (bottom-up progress boundary) | Non-terminal | k₃ decreases (S₃→S₄) → progress |
| PD6 | S₄(i) → S₄(i+1) | (↓,0,0,0) | ∀ n ∈ level(i): validated(n) | Non-terminal | k₁ decreases (commit/finalize of level i). |
| PD6a | S₄(i) → S₁(j) | (0,↓,↑,0) | ∃ n ∈ level(i): ¬validated(n) ∧ j = trace_origin(i) ∧ refinement_attempts(j) < Rₘₐₓ (backtrack from completion) | Non-terminal | k₂ decreases (attempt consumed) → progress |
| PD6b | S₄(i) → S₅ | — | ∃ n ∈ level(i): ¬validated(n) ∧ (no refinement path remains for trace_origin(i)) (equivalently refinement_attempts(trace_origin(i)) ≥ Rₘₐₓ) | Terminal | Terminal (error) |
| PD7 | S₄(L) → T | — | ∀ i ∈ [1,L], ∀ n ∈ level(i): validated(n) | Terminal | Terminal (complete) |
| PD8 (generalized) | From ∈ {S₁(j), S₂(j), S₃(j)} → S₅ | — | refinement_attempts(j) ≥ Rₘₐₓ (no further attempts remain for level j) | Terminal | Terminal (exhaustion) |
- k₁ Strict Decrease: The finalization transitions PD2b and PD6 strictly reduce k₁ (unfinalized nodes), overriding any changes in lower-priority components.
- k₂ Strict Decrease: The refinement-entry transitions PD2a, PD3c, PD4b, and PD6a strictly reduce k₂ (remaining refinement attempts), ensuring lexicographic progress even when backtracking causes k₃ to increase temporarily.
- k₃ Decrease Role: Phase-progression transitions (PD2, PD3, PD4, PD5) strictly reduce k₃, ensuring forward progress. Although k₃ may temporarily increase in the commit transition PD2b and the backtracking transitions (PD2a, PD3c, PD4b, PD6a), the overall lexicographic decrease is maintained by the strict reduction of the higher-priority components k₁ or k₂.
- k₄ Strict Decrease: The intra-phase traversals PD3a, PD3b, and PD4a strictly reduce k₄ (intra-phase progress), providing the necessary descent when all higher-priority components remain unchanged.
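
The ΔM annotations of the non-terminal PDFD rules can be checked mechanically: a step decreases M lexicographically exactly when the first non-zero component of its ΔM vector is a decrease. The sketch below transcribes the ΔM column of the PDFD table; the names `PDFD_DELTAS` and `decreases_lexicographically` are hypothetical. It validates the annotations only, not the underlying model.

```python
from typing import Dict, Optional, Tuple

Delta = Tuple[str, str, str, str]  # (Δk1, Δk2, Δk3, Δk4) as "0", "↓", or "↑"

# Transcribed from the ΔM column for the non-terminal PDFD rules.
PDFD_DELTAS: Dict[str, Delta] = {
    "PD2":  ("0", "0", "↓", "↓"),
    "PD2a": ("0", "↓", "↑", "0"),
    "PD2b": ("↓", "0", "↑", "0"),
    "PD3":  ("0", "0", "↓", "↓"),
    "PD3a": ("0", "0", "0", "↓"),
    "PD3b": ("0", "0", "0", "↓"),
    "PD3c": ("0", "↓", "↑", "0"),
    "PD4":  ("0", "0", "↓", "0"),
    "PD4a": ("0", "0", "0", "↓"),
    "PD4b": ("0", "↓", "↑", "0"),
    "PD5":  ("0", "0", "↓", "↓"),
    "PD6":  ("↓", "0", "0", "0"),
    "PD6a": ("0", "↓", "↑", "0"),
}


def first_nonzero(delta: Delta) -> Optional[str]:
    """Return the first component that changes, or None if none does."""
    return next((d for d in delta if d != "0"), None)


def decreases_lexicographically(delta: Delta) -> bool:
    """A step is a lexicographic descent iff its first change is a decrease."""
    return first_nonzero(delta) == "↓"


violations = [rule for rule, delta in PDFD_DELTAS.items()
              if not decreases_lexicographically(delta)]
```

Under this transcription `violations` is empty, which agrees with the four observations listed above.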
| Rule | Transition | ΔM (Δk₁,Δk₂,Δk₃,Δk₄) | Key Condition | Type | Progress Justification |
|---|---|---|---|---|---|
| PB1 | S₀ → S₁(1) | — | i = 1 | Initial | — |
| PB2 | S₁(i) → S₂(i) | (0,0,↓,↓) | ∃n ∈ Patternᵢ: ¬validated(n) | Non-terminal | k₃ decreases (3→2). |
| PB2a | S₁(i) → S₃(i) | (0,0,↓,0) | ∀n ∈ Patternᵢ: validated(n) | Non-terminal | k₃ decreases (3→1). |
| PB3 | S₂(i) → S₁(j) | (0,↓,↑,0) | (∃n ∈ Patternᵢ: ¬validated(n)) ∧ j = trace_origin(i) ∧ refinement_attempts(j) < Rₘₐₓ (refinement entry) | Non-terminal | k₂ decreases (attempt consumed). |
| PB3a | S₁(j) → S₂(j) | (0,0,↓,↓) | ∃n ∈ Patternⱼ: ¬validated(n) | Non-terminal | k₃ decreases (3→2). |
| PB3a1 | S₂(j) → S₃(j) | (0,0,↓,0) | ∀n ∈ Patternⱼ: validated(n) | Non-terminal | k₃ decreases (2→1). |
| PB3a2 | S₂(j) → S₁(j) | (0,↓,↑,0) | ∃n ∈ Patternⱼ: ¬validated(n) ∧ refinement_attempts(j) < Rₘₐₓ (retry refinement) | Non-terminal | k₂ decreases (attempt consumed). |
| PB3a3 | S₂(j) → S₅ | — | ∃n ∈ Patternⱼ: ¬validated(n) ∧ refinement_attempts(j) ≥ Rₘₐₓ (refinement exhausted) | Terminal | — |
| PB3b | S₁(j) → S₃(j) | (0,0,↓,0) | ∀n ∈ Patternⱼ: validated(n) | Non-terminal | k₃ decreases (3→1). |
| PB3c | S₂(i) → S₅ | — | (∃n ∈ Patternᵢ: ¬validated(n)) ∧ (trace_origin(i) undefined ∨ refinement_attempts(trace_origin(i)) ≥ Rₘₐₓ) (no valid trace_origin or attempts exhausted) | Terminal | — |
| PB4 | S₂(i) → S₃(i) | (0,0,↓,0) | ∀n ∈ Patternᵢ: validated(n) (refinement validated) | Non-terminal | k₃ decreases (2→1). |
| PB4a | S₃(i) → S₁(i+1) | (↓,0,↑,0) | i < L ∧ Pattern_{i+1} ≠ ∅ (commit/finalize) | Non-terminal | k₁ decreases (commit/finalize of Patternᵢ). |
| PB4b | S₃(i) → S₄(1) | (0,0,↓,0) | i = L ∨ Pattern_{i+1} = ∅ (enter completion) | Non-terminal | k₃ decreases (1→0). |
| PB5 | S₃(j) → S₁(j+1) | (0,0,0,↓) | j < i (refinement-range progress) | Non-terminal | k₄ decreases (refinement-range progress). |
| PB6 | S₃(j) → S₃(i) | (0,0,0,↓) | j = i (return from refinement) | Non-terminal | k₄ decreases (intra-phase progress/return). |
| PB7 | S₄(i) → S₄(i+1) | (↓,0,0,0) | ∀n ∈ Patternᵢ: processed(n) | Non-terminal | k₁ decreases (commit/finalize of Patternᵢ). |
| PB7a | S₄(i) → S₁(j) | (0,↓,↑,0) | ∃n ∈ Patternᵢ: ¬processed(n) ∧ j = trace_origin(i) ∧ refinement_attempts(j) < Rₘₐₓ (backtrack from completion) | Non-terminal | k₂ decreases (attempt consumed). |
| PB7b | S₄(i) → S₅ | — | ∃n ∈ Patternᵢ: ¬processed(n) ∧ ¬(j = trace_origin(i) ∧ refinement_attempts(j) < Rₘₐₓ) (unprocessed nodes and no refinement option) | Terminal | — |
| PB8 | S₄(L) → T | — | ∀i ∈ [1,L], ∀n ∈ Patternᵢ: validated(n) (all validated) | Terminal | — |
| PB9 | S₁(j) → S₅ | — | refinement_attempts(j) ≥ Rₘₐₓ (attempts exhausted) | Terminal | — |
- Transitions that decrement k₂ (remaining refinement attempts) are PB3, PB3a2, and PB7a. Each consumes exactly one attempt.
- k₁ (unfinalized nodes) is strictly reduced only by the commit/finalization transitions PB4a (forward pass) and PB7 (completion phase). These dominate all lower-priority changes.
- PB4a is the forward commit step finalizing Patternᵢ before moving to Patternᵢ₊₁.
- PB5 and PB6 represent intra-refinement navigation and strictly reduce k₄, not k₁.
- k₁ Strict Decrease: PB4a and PB7 finalize nodes, reducing the highest-priority component.
- k₂ Strict Decrease: PB3, PB3a2, and PB7a consume refinement attempts and strictly reduce k₂, ensuring lexicographic progress even when backtracking causes k₃ to increase temporarily.
- k₃ Decrease Role: The phase-progression transitions PB2, PB2a, PB3a, PB3a1, PB3b, PB4, and PB4b strictly reduce k₃ (phase ordinal), ensuring forward progress through the main execution path. Although k₃ may temporarily increase in commit transition PB4a and refinement/backtracking transitions (PB3, PB3a2, PB7a), the overall lexicographic decrease is guaranteed by the strict reduction of higher-priority components k₁ or k₂.
- k₄ Strict Decrease: PB5 and PB6 strictly reduce k₄ (intra-phase progress) when higher-priority components remain unchanged.
Appendix A.8.2. Lemma (Bounded Refinement)
- Base Case. At the initial state S₀, ∀k: refinement_attempts(k) = 0 ≤ Rₘₐₓ, so the invariant holds trivially.
- Inductive Step. Assume the invariant holds in state S and consider a transition S → S′. Only refinement-entry rules increment refinement_attempts(j). In Tables A.8.2 and A.8.3, these rules are explicitly guarded by refinement_attempts(j) < Rₘₐₓ (PD2a, PD3c, PD4b, PD6a for PDFD; PB3, PB3a2, PB7a for PBFD), and each consumes exactly one attempt. Hence any increment preserves refinement_attempts(j) ≤ Rₘₐₓ. All other rules leave the refinement counters unchanged. In particular, the terminal rules (e.g., PD6b, PD8, PB3a3, PB9, PB7b, PB3c), which fire only when refinement_attempts(j) ≥ Rₘₐₓ for some j, do not increment counters and therefore preserve the invariant.
- Conclusion. By induction on transitions, every counter is bounded by Rₘₐₓ at all times. Since each of the at most L levels can undergo at most Rₘₐₓ increments, the total number of refinement attempts is bounded by L ⋅ Rₘₐₓ. Thus k₂ is finite and strictly decreases on each refinement entry until exhaustion.
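
The induction above can be exercised with a small simulation of guarded refinement entries. This is a sketch under assumptions: `try_refine` is a hypothetical helper standing in for any refinement-entry rule, and the values L = 3 and R_max = 5 are borrowed from the PBFD verification model purely as example parameters. The schedule requests refinement at every level until no guard is enabled.

```python
from typing import Dict

L, R_MAX = 3, 5  # example parameters (three levels, five attempts per level)


def try_refine(attempts: Dict[int, int], level: int, r_max: int) -> bool:
    """Refinement entry guarded by attempts < r_max; consumes one attempt."""
    if attempts[level] < r_max:
        attempts[level] += 1
        return True
    return False  # guard disabled: the level is exhausted


attempts = {j: 0 for j in range(1, L + 1)}
entries = 0
progress = True
while progress:  # keep requesting refinement until every guard is disabled
    progress = False
    for j in attempts:
        if try_refine(attempts, j, R_MAX):
            entries += 1
            progress = True
        assert attempts[j] <= R_MAX  # the invariant of the lemma

assert entries == L * R_MAX  # total refinement entries reach the bound L * R_max
```

Even under this worst-case schedule the loop stops after L ⋅ Rₘₐₓ entries, which is the bound stated in the conclusion.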
Appendix A.8.3. Lemma (Finalization Monotonicity)
- Base Case. Initially no node is finalized (P(n) ≠ 2 for all n). The statement holds vacuously in the initial state.
- Finalization Step. Per Tables A.8.2 and A.8.3, the rules that set nodes to finalized (i.e., produce committed P(n) = 2) are the commit/finalize transitions (PDFD: PD2b and PD6; PBFD: PB4a and PB7). In both algorithms, these transitions strictly reduce k₁. No other transition creates P(n) = 2.
- Reset rules. The only rules that may reset previously finalized nodes to non-finalized ones (i.e., potentially Δk₁ > 0) are refinement-entry/backtrack rules (PD2a, PD3c, PD4b, PD6a; PB3, PB3a2, PB7a). Each such rule has the guard refinement_attempts(j) < Rₘₐₓ and the operational semantics of attempting correction. On taking such a rule, k₂ strictly decreases (since refinement_attempts(j) is incremented). No non-refinement rule resets finalized nodes.
- Lexicographic compensation. Therefore, any transition that reverses finalization (i.e., a reset that potentially increases k₁) is guaranteed to be a refinement-entry transition that strictly decreases k₂. Hence the pair (k₁, k₂) is lexicographically non-increasing across transitions: a rise in k₁ is strictly compensated by a fall in k₂.
- Conclusion. k₁ is monotone non-increasing unless a bounded, recorded refinement reset occurs; such resets are bounded by Lemma A.8.2. Thus the finalization invariant holds.
Appendix A.8.4. Lemma (Termination Guarantee)
- Success T: all nodes finalized (∀n ∈ V: P(n) = 2), or
- Bounded failure S₅: refinement exhausted for some level (∃j: refinement_attempts(j) = Rₘₐₓ).
- Well-foundedness. Each component of M = (k₁, k₂, k₃, k₄) ranges over a well-founded (finite or well-ordered) set:
  - 0 ≤ k₁ ≤ |V|.
  - 0 ≤ k₂ ≤ L ⋅ Rₘₐₓ.
  - k₃ ∈ {0, 1, 2, 3, 4}.
  - k₄ is bounded by the finite batch sizes (≤ |V|).
- Measure descent on transitions. From the exhaustive ΔM annotations in Tables A.8.2–A.8.3, every non-terminal transition strictly decreases M in lexicographic order:
  - If a non-terminal transition finalizes nodes, it decreases k₁.
  - If it is a refinement entry, it decreases k₂.
  - Otherwise the phase/intra-phase components (k₃, k₄) strictly decrease.
- No infinite execution sequences. Since M decreases on every non-terminal step and M is well-founded, the system cannot execute infinitely many non-terminal moves. Therefore, every execution sequence reaches a terminal state.
- Terminal classification. Terminal rules in Tables A.8.2-A.8.3 correspond exactly to either all nodes validated (PD7, PB8) or to a bounded failure from exhausted refinements (PD6b, PD8, PB3a3, PB3c, PB7b, PB9). These cases partition all terminal states. Hence termination leads to either T or S₅.
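The descent argument can be checked mechanically on a recorded run. The following is a minimal sketch, assuming a run is logged as a list of (k₁, k₂, k₃, k₄) tuples; the function names and the sample trace are hypothetical and are not taken from Tables A.8.2–A.8.3.

```python
from typing import Iterable, Tuple

Measure = Tuple[int, int, int, int]  # (k1, k2, k3, k4), as in M above


def strictly_descends(trace: Iterable[Measure]) -> bool:
    """Return True if every consecutive pair decreases lexicographically."""
    steps = list(trace)
    # Python compares tuples lexicographically, matching the order on M.
    return all(later < earlier for earlier, later in zip(steps, steps[1:]))


def max_chain_length(num_nodes: int, levels: int, r_max: int) -> int:
    """Crude bound on the length of any strictly descending chain of M values."""
    # k1 in [0, |V|], k2 in [0, L*Rmax], k3 in {0..4}, k4 in [0, |V|]
    return (num_nodes + 1) * (levels * r_max + 1) * 5 * (num_nodes + 1)


# Hypothetical trace: finalize a node, change phase, then enter a refinement.
example = [(3, 2, 4, 1), (2, 2, 4, 1), (2, 2, 3, 0), (2, 1, 4, 3)]
print(strictly_descends(example))
```

Because each component ranges over a finite set, the number of distinct values of M is finite, which is what rules out an infinite strictly descending run.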
Appendix A.8.5. Lemma (Invariant Preservation for PDFD)
- Descendant finalization invariant. A node at level i is not considered finally complete unless all nodes in its processed subtree are finalized (guards enforced by PD4a/PD6/PD7).
- Refinement locality. Backtracks always target j = trace_origin(i) with j ≤ i; refinement scope is contiguous and anchored.
- Base Case. The initial state S₀ satisfies both invariants vacuously: no nodes are finalized yet, and no refinement operations have been initiated. Therefore, both the descendant finalization invariant and refinement locality invariant hold trivially.
- Inductive Step. Assume both invariants hold in state S. Consider any transition S → S′ according to Table A.8.2. We show that S′ preserves both invariants:
  - Descendant finalization invariant. Transitions that finalize nodes or advance levels (PD4a, PD6, PD7) are strictly guarded by conditions requiring validated(n) or descendants_validated(n) to be true. These guards explicitly enforce that a node is finalized only when its processed descendants are already finalized. All other transitions either do not affect finalization status or are refinement backtracks that temporarily reset nodes (addressed by refinement locality).
  - Refinement locality invariant. Backtrack transitions (PD2a, PD3c, PD4b, PD6a) compute the target level j using the trace_origin function, which by definition satisfies j ≤ i. The guard conditions ensure that refinement scope remains contiguous within the range [j, i]. Non-backtrack transitions do not modify refinement relationships.
- Conclusion. By induction on the transition sequence, both invariants are preserved across all reachable states. The exhaustive nature of the state transitions in Table A.8.2 guarantees that no invariant-violating state is reachable.
Appendix A.8.6. Lemma (Invariant Preservation for PBFD)
- Level-wise ordering. Children/pattern at level i+1 are produced only after Patternᵢ is validated (PB4a).
- Top-down finalization in completion. PB7/PB8 iterate from level 1 upward without skipping.
- Refinement locality. Backtracks always target j = trace_origin(i) with j ≤ i; refinement scope is contiguous and anchored (PB3).
- Base Case. The initial state S₀ satisfies all three invariants vacuously: no patterns have been processed, no finalization has begun, and no refinement operations have been initiated. Therefore, all invariants hold trivially in the initial state.
- Inductive Step. Assume all three invariants hold in state S. Consider any transition S → S′ according to Table A.8.3. We show that S′ preserves all invariants:
  - Level-wise Ordering Invariant. The transition PB4a, which advances from Patternᵢ to Patternᵢ₊₁, is strictly guarded by the condition that Patternᵢ is fully validated. This guard ensures that no pattern at level i+1 is produced unless the preceding pattern has been successfully validated. All other transitions either operate within a single level or do not produce new patterns.
  - Top-down Finalization Invariant. The completion phase transitions (PB7, PB8) progress sequentially through S₄(i) → S₄(i+1), with each step guarded by ∀n ∈ Patternᵢ: processed(n). This ensures that levels are finalized in strict ascending order from 1 to L without skipping. Backtrack transitions from S₄ (PB7a) do not violate this invariant as they temporarily exit completion mode.
  - Refinement Locality Invariant. Refinement backtrack transitions (PB3, PB3a2, PB7a) compute the target level j using the trace_origin function, which by definition satisfies j ≤ i. The guard conditions and operational semantics ensure that refinement scope remains contiguous within [j, i]. Non-refinement transitions do not modify these relationships.
- Conclusion. By induction on the transition sequence, all three invariants are preserved across all reachable states. The exhaustive nature of the state transitions in Table A.8.3 guarantees that no invariant-violating state is reachable.
Appendix A.8.7. Lemma (Unified Progress)
Appendix A.8.8. Theorem (Total Correctness)
- Terminate in T (all nodes validated) or S₅ (refinement exhausted).
- Structural invariants (descendant finalization, refinement locality, level ordering) hold at all reachable states.
- Termination by Lemma A.8.4.
- Partial correctness by Lemmas A.8.5–A.8.6 (invariants). Upon termination in state T, the postcondition ∀n ∈ V, P(n)=2 is met directly by the guard of the terminal rule (PD7/PB8). The structural invariants ensure this final state is internally consistent.
- Progress/no stalling by Lemma A.8.7.
- A.8.2.1 (Boundedness). Total number of refinement attempts ≤ L ⋅ Rₘₐₓ.
- A.8.3.1 (Finalization Permanence). Once P(n)=2 outside an active refinement rollback, it remains 2; any temporary reset is only through guarded refinement-entry transitions, is bounded by Lemma A.8.2, and is always accompanied by a strict decrease in the k₂ component of the measure M.
- A.8.4.1 (Temporal completeness). From start, eventually the run reaches either success T or bounded failure S₅: □(start ⇒ ◊(T ∨ S₅)).
Appendix A.8.9. Proof Mermaid Code
Appendix A.9. TLE Mermaid Code, Algorithm, and Process Algebra
Appendix A.9.1. Structural Workflow Mermaid Code
Appendix A.9.2. State Machine Mermaid Code
Appendix A.9.3. Algorithm (Pseudo Code)
Appendix A.9.4. CSP Implementation and Formal Verification
Appendix A.9.5. TLE (Three-Level Encapsulation) Technique Tables
| Pseudocode Term | Type | Description | Pseudocode Lines | CSP Mapping |
|---|---|---|---|---|
| Algorithm & States | ||||
| Algorithm TLE(Units) | Meta-Process | Coordinates the Three-Level Encapsulation pipeline. | Header | TLE_Process(start → TLE_S0) |
| currentState | State Variable | Tracks the current stage of the TLE process. | 1, 4, 10, 13, 21, 28, 36, 38, 45, 52, 55, 60 | (Implicit in CSP State Processes TLE_Sₓ(u)) |
| S₀ | State | Idle. Waiting for input. | 5, 52, 60 | TLE_S0 |
| S₁ | State | Data Loaded. A TLE unit is loaded. | 10, 16 | TLE_S1(u) |
| S₂ | State | Hierarchy Resolved. Parent levels identified. | 21, 23 | TLE_S2(u) |
| S₃ | State | Children Evaluated. Child states processed. | 28, 30 | TLE_S3(u) |
| S₄ | State | Children Updated. Child states modified. | 36, 40 | TLE_S4(u) |
| S₅ | State | Changes Committed. Modifications persisted. | 38, 45, 47 | TLE_S5(u) |
| S₆ | System End State | Workflow Finalized. Process complete. | 13, 55, 57 | TLE_S6(u) |
| Functions & Actions | ||||
| LOAD(Grandparent) | Core TLE Op | Loads a TLE data unit. | 9 | load?u:UNIT (Input) |
| resolve_hierarchy() | Processing Function | Resolves and validates hierarchy. | 20 | hierarchy_resolved.u (Output) |
| evaluate_children() | Processing Function | Reads and logically processes children. | 27 | children_evaluated.u (Output) |
| apply_update(...) | Core TLE Op | WRITE. Modifies child states. | 35 | children_updated.u (Output) |
| persist_changes() | Core TLE Op | COMMIT. Persists changes. | 44 | changes_committed.u (Output) |
| finalize_process() | System Function | Completes the TLE algorithm. | 59 | finalize_process.u (Output) |
| Conditions | ||||
| update_required | Condition | Trigger for WRITE operation. | 34 | (Implied by children_updated.u choice in TLE_S3) |
| has_next_unit() | Condition / Signal | Checks if more units exist. | 50 | has_next_unit (Output, Valueless) |
| ∃ unprocessed unit... | Condition | Checks if more units exist. | 7 | (Implicit in load?u:UNIT choice in TLE_S0) |
| CSP-Specific Events | ||||
| load | CSP Input | Signals a unit is ready for processing. | 7 | load?u:UNIT |
| no_next_unit | CSP I/O | Signals no more units. | 7, 11, 48, 54 | S0: Input (?u); S5: Output (.u) |
| skip_update | CSP Output | Signals no update was required, skipping to commit. | 32, 37 | skip_update.u |
| CSP Process | Key Transitions (TLE Ref.) | Pseudocode Lines | CSP Events (Simplified) |
|---|---|---|---|
| S₀ (TLE_S0) | TLE1: Start → S₀ | 1 | (start→TLE_S0)→TLE_S0 (via TLE_Process) |
| S₀ (TLE_S0) | TLE2: load(u) → S₁ | 7–10 | load?u:UNIT → TLE_S1(u) |
| S₀ (TLE_S0) | TLE11: no_next_unit(u) → S₆ | 7, 11–13 | no_next_unit?u:UNIT → TLE_S6(u) |
| S₁(u) (TLE_S1(u)) | TLE3: hierarchy_resolved(u) → S₂ | 18–21 | hierarchy_resolved.u → TLE_S2(u) |
| S₂(u) (TLE_S2(u)) | TLE4: children_evaluated(u) → S₃ | 25–28 | children_evaluated.u → TLE_S3(u) |
| S₃(u) (TLE_S3(u)) | TLE5: children_updated(u) → S₄ | 32, 34–36 | children_updated.u → TLE_S4(u) |
| S₃(u) (TLE_S3(u)) | TLE6: skip_update(u) → S₅ | 32, 37–38 | skip_update.u → TLE_S5(u) |
| S₄(u) (TLE_S4(u)) | TLE7: changes_committed(u) → S₅ | 42–45 | changes_committed.u → TLE_S5(u) |
| S₅(u) (TLE_S5(u)) | TLE8: has_next_unit → S₀ | 50–52 | has_next_unit → TLE_S0 |
| S₅(u) (TLE_S5(u)) | TLE9: no_next_unit(u) → S₆ | 53–55 | no_next_unit.u → TLE_S6(u) |
| S₆(u) (TLE_S6(u)) | TLE10: finalize_process(u) → S₀ | 58–60 | finalize_process.u → TLE_S0 |
| Top-Level (TLE_Process) | System Start → S₀ | 1 | start → TLE_S0 |
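The transition table above can be replayed as a small lookup-based state machine. This is an illustrative sketch only: the dictionary transcribes rules TLE2–TLE11, the unit parameter u is dropped for brevity, and the helper name `run` is hypothetical rather than part of the CSP model.

```python
# Transition relation transcribed from the table above (rules TLE2-TLE11).
TRANSITIONS = {
    ("S0", "load"): "S1",
    ("S0", "no_next_unit"): "S6",
    ("S1", "hierarchy_resolved"): "S2",
    ("S2", "children_evaluated"): "S3",
    ("S3", "children_updated"): "S4",
    ("S3", "skip_update"): "S5",
    ("S4", "changes_committed"): "S5",
    ("S5", "has_next_unit"): "S0",
    ("S5", "no_next_unit"): "S6",
    ("S6", "finalize_process"): "S0",
}


def run(events, state="S0"):
    """Replay event names from `state`; raise ValueError on a disabled event."""
    visited = [state]
    for event in events:
        key = (state, event)
        if key not in TRANSITIONS:
            raise ValueError(f"event {event!r} not enabled in state {state}")
        state = TRANSITIONS[key]
        visited.append(state)
    return visited


# One unit processed without an update, then shutdown.
print(run(["load", "hierarchy_resolved", "children_evaluated",
           "skip_update", "no_next_unit", "finalize_process"]))
```

Every state that appears as a target also appears as a source, which is the table-level counterpart of the deadlock-freedom assertion; it does not replace the model-checked result.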
Appendix A.9.6. Formal Verification Methodology and Scope
| Category | Count | Coverage |
|---|---|---|
| Core System Safety | 4 | Deadlock freedom; behavioral refinement (T, F, FD) |
| State-Level Reliability | 38 | Two specifications: S₀ (non-param) + S₁–S₆ (3 units each) |
| Liveness Guarantees | 2 | Divergence checks for TLE_Process and TLE_Abstract_Process |
| Composition & Robustness | 5 | Concurrency checks (2), hostile-environment checks (2), determinism (1) |
| Total | 49 | Complete verification of safety, liveness, and concurrency |
- TLE_Process :[deadlock free]
- TLE_Process [T= TLE_Abstract_Process]
- TLE_Process [F= TLE_Abstract_Process]
- TLE_Process [FD= TLE_Abstract_Process]
- Implementation states: S₀ (1) + S₁–S₆ × (u₁, u₂, u₃) (18) = 19
- Abstract states: Abstract_S₀ (1) + Abstract_S₁–S₆ × (u₁, u₂, u₃) (18) = 19
- TLE_Process :[divergence free]
- TLE_Abstract_Process :[divergence free]
- TLE_TwoUnits :[deadlock free] (parallel composition test)
- TLE_Abstract_TwoUnits :[deadlock free] (abstract parallel test)
- TLE_Hostile_System :[deadlock free] (hostile environment robustness)
- TLE_HostileEnv :[deadlock free] (hostile environment itself)
- TLE_Process :[deterministic [F]] (internal determinism)
Appendix A.10. Proofs of TLE Theorems
- Ć is the average bitmask size (in bits) across all parent entities,
- ĉ is the average number of children per parent,
- k is the storage size (in bits) required per stored relationship in the traditional representation.
- Root Access: O(1) via direct or indexed lookup on g.
- Bitmask Retrieval: O(1) access to the fixed-width integer column for p.
- Bitwise Check: O(1) operation: (bitmask >> c_id) & 1.
- Root Access: O(1) via direct or indexed lookup on g.
- Bitmask Update: A single, constant-time bitwise operation.
- Write-back: O(1) operation to persist the updated fixed-width field.
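The check and update steps can be illustrated with plain integer operations. The check below is the expression given in the text; the set and clear forms are the conventional OR and AND-NOT idioms and are an assumption here, since the exact update expression is not reproduced in this section. All function names are hypothetical.

```python
def is_selected(bitmask: int, c_id: int) -> bool:
    """Bitwise check from the text: (bitmask >> c_id) & 1."""
    return (bitmask >> c_id) & 1 == 1


def set_child(bitmask: int, c_id: int) -> int:
    """Assumed update form: set bit c_id with a single OR."""
    return bitmask | (1 << c_id)


def clear_child(bitmask: int, c_id: int) -> int:
    """Assumed update form: clear bit c_id with a single AND-NOT."""
    return bitmask & ~(1 << c_id)


mask = set_child(0, 3)           # select child 3
print(is_selected(mask, 3))      # child 3 is now set
print(clear_child(mask, 3))      # deselecting child 3 restores the empty mask
```

Each function performs a constant number of shift and mask operations, independent of how many children the parent has, as long as the mask fits in one machine word.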
- 1. Iterate over each grandparent entity.
- 2. For each grandparent, iterate over its parent entities.
- 3. For each parent entity, process its bitmask:
  - O(1) for fixed-width integer fields when n ≤ w
  - O(⌈n/w⌉) for variable-width encodings when n > w
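The three traversal steps and the word-count bound can be sketched as follows. This is a hedged illustration: the nested-dictionary layout, the default word width w = 64, and the function names are assumptions made for the example, not the storage format used by TLE.

```python
from math import ceil
from typing import Dict, List


def words_needed(n: int, w: int = 64) -> int:
    """Number of w-bit words needed for n child flags: ceil(n / w), at least 1."""
    return max(1, ceil(n / w))


def selected_children(bitmask: int, n: int) -> List[int]:
    """Scan the n low-order bits and return the positions that are set."""
    return [c for c in range(n) if (bitmask >> c) & 1]


def traverse(hierarchy: Dict[str, Dict[str, int]], n: int) -> Dict[str, List[int]]:
    """Visit every grandparent, then every parent, then decode the parent's bitmask."""
    result = {}
    for grandparent, parents in hierarchy.items():
        for parent, bitmask in parents.items():
            result[f"{grandparent}/{parent}"] = selected_children(bitmask, n)
    return result


print(words_needed(64), words_needed(65))
print(traverse({"g1": {"p1": 0b101, "p2": 0}}, n=3))
```

With n ≤ w each parent costs one word read; beyond that the per-parent cost grows with ⌈n/w⌉, which is the distinction the two bullets draw.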
| Approach | Complexity | Practical Characteristics |
|---|---|---|
| ≤ w) | Linear scan, cache-friendly, predictable | |
| B-tree indexed adjacency | Logarithmic overhead per parent lookup | |
| ContinentViewModel | Depth-dependent; degrades for deep hierarchies |
Appendix A.11. The PDFD MVP
Appendix A.11.1. Overview of the PDFD MVP
Appendix A.11.2. Objective
Appendix A.11.3. Strategy in Practice
- 1. Hybrid Depth-First Progression with Controlled Breadth
  - Vertical Execution (DFD-style): Hierarchical levels (e.g., Continent → Country → State) were traversed sequentially, focusing on in-depth development along a primary path.
  - Controlled Breadth (Breadth-First by Two, or BF-by-Two): At each hierarchical level, two peer nodes (e.g., “Asia” and “North America”) were processed in parallel to validate both their combinatorial selection states and the resulting feature-driven workflows. The BF-by-Two approach corresponds to a controlled parallel expansion strategy, conceptually aligned with branch-and-bound techniques used to manage combinatorial state spaces [72].
- 2. Iterative Refinement via Feedback
  - CDD Cycles: A CDD cycle was triggered whenever an inconsistency or schema limitation was detected (e.g., missing intermediate tables or key definitions), prompting a return to an earlier hierarchical level for the necessary corrections.
- 3. Application Scalability and Portability
  - The solution was designed to be stack-agnostic and modular. Though built in ASP.NET MVC, PDFD's structure maps naturally to other frameworks (e.g., React/Node.js), making the pattern portable and extensible.
Appendix A.11.4. Workflow and Database Structure

- Arrows represent dependencies between nodes.
- Dotted areas highlight subsets of the hierarchy that are deferred for population until after initial validation.
- Curved arrows indicate feedback loops that activate the CDD process for iterative refinement.
- Nodes are labeled according to their hierarchical position—e.g., 1 denotes the root node, 2.1 refers to the first node at Level 2, and so on—providing a structured view of the progressive traversal and refinement workflow.
Appendix A.11.5. State Machine Representation
- 1. Parameters
- L = 6 (max level)
- Rₘₐₓ = 60 (predefined refinement iteration limit: each level may be refined up to 60 times in the MVP, which preserves the termination guarantee).
- For i=3,4,5, Jᵢ = trace_origin(i) = 2, indicating that each level traces back to Level 2. This enforces refinement to Level 2 in the MVP, emphasizing critical dependency fixes.

- For i=3,4,5, Rᵢ = min(i−Jᵢ +1, i) ensures that dependent levels are revisited while respecting hierarchy boundaries. This mirrors the state-space exploration strategy in model checkers like SPIN, which also rely on efficient traversal and pruning to verify correctness [71]. However, PDFD introduces hierarchy-aware semantics absent from SPIN, enabling structured backtracking aligned with layered dependencies.
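The parameter definitions above can be evaluated directly. A minimal sketch, assuming the MVP setting Jᵢ = 2 for i = 3, 4, 5; the function names `refinement_depth` and `refinement_scope` are hypothetical labels for Rᵢ and the revisited level range.

```python
def trace_origin(i: int) -> int:
    """MVP setting from the text: levels 3, 4 and 5 all trace back to Level 2."""
    if i not in (3, 4, 5):
        raise ValueError("the MVP defines trace_origin only for i = 3, 4, 5")
    return 2


def refinement_depth(i: int) -> int:
    """R_i = min(i - J_i + 1, i), with J_i = trace_origin(i)."""
    j = trace_origin(i)
    return min(i - j + 1, i)


def refinement_scope(i: int) -> range:
    """Levels revisited by a backtrack from level i: J_i .. i inclusive."""
    return range(trace_origin(i), i + 1)


for level in (3, 4, 5):
    print(level, refinement_depth(level), list(refinement_scope(level)))
```

For i = 3, 4, 5 this yields Rᵢ = 2, 3, 4, with scopes covering Levels 2–3, 2–4, and 2–5 respectively.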
- 2. States and Transitions
| State ID | Phase | Description | Generic Mapping (State + Parameters) |
|---|---|---|---|
| S1 | Process & Validate Level 1 | Root node (Node 1) | S₁(1) → S₂(1) |
| S2 | Process & Validate Level 2 | Nodes 2.1 and 2.2 | S₁(2) → S₂(2) |
| S3 | Process & Validate Level 3 | Nodes 3.1 and 3.2 | S₁(3) → S₂(3) |
| S4 | Process & Validate Level 4 | Nodes 4.1 and 4.2 | S₁(4) → S₂(4) |
| S5 | Process & Validate Level 5 | Nodes 5.1 and 5.2 | S₁(5) → S₂(5) |
| S6 | Process & Validate Level 6 | Nodes 6.1 and 6.2 | S₁(6) → S₂(6) |
| S2_R1 | Refine Levels 2-3 | Reprocess Levels 2-3 due to failure at Level 3 | S₁(j=2) → S₂(j=2) |
| S2_R2 | Refine Levels 2-4 | Reprocess Levels 2-4 due to failure at Level 4 | S₁(j=2) → S₂(j=2) |
| S2_R3 | Refine Levels 2-5 | Reprocess Levels 2-5 due to failure at Level 5 | S₁(j=2) → S₂(j=2) |
| S7 | Finalize Level 5 Subtree | Finalize subtree under 5.1 and 5.2 | S₃(5) |
| S8 | Finalize Level 4 Subtree | Finalize subtree under 4.1 and 4.2 | S₃(4) |
| S9 | Finalize Level 3 Subtree | Finalize subtree under 3.1 and 3.2 | S₃(3) |
| S10 | Finalize Level 2 Subtree | Finalize subtree under 2.1 and 2.2 | S₃(2) |
| S11 | Finalize Root Subtree | Finalize root node and ensure completeness | S₄(1) |
| S_ERROR | Terminate on Failure | Refinement limit exceeded or validation failed | S₅ |
| Rule ID | From State → To State | Formal Condition / Trigger | Workflow Step | Generic Rule (PD# + Parameters) |
|---|---|---|---|---|
| PDFD1 | [*] → S1 | System initialized | Begin root-level processing | PD1 |
| PDFD2 | S1 → S2 | Root validated | Advance to Level 2 | PD2b (i=1) |
| PDFD3 | S2 → S3 | Level 2 validated | Advance to Level 3 | PD2b (i=2) |
| PDFD4 | S3 → S2_R1 | Level 3 validation failed | Backtrack to refine Levels 2-3 | PD2a (i=3, j=2) |
| PDFD5 | S2_R1 → S3 | Levels 2-3 refinement validated | Revalidate Level 3 | PD3b (j=2→i=3) |
| PDFD6 | S3 → S4 | Level 3 validated | Advance to Level 4 | PD2b (i=3) |
| PDFD7 | S4 → S2_R2 | Level 4 validation failed | Backtrack to refine Levels 2-4 | PD2a (i=4, j=2) |
| PDFD8 | S2_R2 → S4 | Levels 2-4 refinement validated | Revalidate Level 4 | PD3b (j=2→i=4) |
| PDFD9 | S4 → S5 | Level 4 validated | Advance to Level 5 | PD2b (i=4) |
| PDFD10 | S5 → S2_R3 | Level 5 validation failed | Backtrack to refine Levels 2-5 | PD2a (i=5, j=2) |
| PDFD11 | S2_R3 → S5 | Levels 2-5 refinement validated | Revalidate Level 5 | PD3b (j=2→i=5) |
| PDFD12 | S5 → S6 | Level 5 validated | Advance to Level 6 | PD2b (i=5) |
| PDFD13 | S6 → S7 | Level 6 validated | Finalize Level 5 subtrees | PD4 (i=6) |
| PDFD14 | S7 → S8 | Subtree at Level 5 validated | Finalize Level 4 subtrees | PD4a |
| PDFD15 | S8 → S9 | Subtree at Level 4 validated | Finalize Level 3 subtrees | PD4a |
| PDFD16 | S9 → S10 | Subtree at Level 3 validated | Finalize Level 2 subtrees | PD4a |
| PDFD17 | S10 → S11 | Subtree at Level 2 validated | Finalize root node | PD5 |
| PDFD18 | S11 → [*] | Root finalized | Terminate | PD6 → PD7 |
| PDFD19 | S2_R1/S2_R2/S2_R3 → S_ERROR | Refinement validation failed AND refinement_attempts[2] ≥ 60 | Terminate | PD3c → PD8 |
| PDFD20 | S3/S4/S5 → S_ERROR | refinement_attempts[2] ≥ 60 | Terminate | PD8 |
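The rule table above can be exercised with a small simulator. This is a sketch under stated assumptions: each step is reduced to an "ok" or "fail" outcome, "DONE" stands in for the terminal pseudo-state [*], a single counter models refinement_attempts[2], and a failed refinement below the limit is assumed to retry in place (the table lists only the limit-exceeded case, PDFD19).

```python
R_MAX = 60  # per-level refinement limit used in the MVP

# Forward, finalization and refinement rules, keyed by (state, outcome).
RULES = {
    ("S1", "ok"): "S2", ("S2", "ok"): "S3",
    ("S3", "ok"): "S4", ("S4", "ok"): "S5",
    ("S5", "ok"): "S6", ("S6", "ok"): "S7",
    ("S7", "ok"): "S8", ("S8", "ok"): "S9",
    ("S9", "ok"): "S10", ("S10", "ok"): "S11",
    ("S11", "ok"): "DONE",
    ("S3", "fail"): "S2_R1", ("S2_R1", "ok"): "S3",
    ("S4", "fail"): "S2_R2", ("S2_R2", "ok"): "S4",
    ("S5", "fail"): "S2_R3", ("S2_R3", "ok"): "S5",
}
CAN_FAIL = ("S3", "S4", "S5", "S2_R1", "S2_R2", "S2_R3")


def simulate(outcomes):
    """Return (final_state, attempts) after replaying a list of outcomes."""
    state, attempts = "S1", 0  # attempts models refinement_attempts[2]
    for outcome in outcomes:
        if state in ("DONE", "S_ERROR"):
            break
        if outcome == "fail":
            if state not in CAN_FAIL:
                raise ValueError(f"no failure rule listed for {state}")
            if attempts >= R_MAX:
                state = "S_ERROR"  # PDFD19 / PDFD20
                continue
            attempts += 1
            # Assumption: a failed refinement below the limit retries in place.
            state = RULES.get((state, "fail"), state)
        else:
            state = RULES[(state, "ok")]
    return state, attempts


# One failure each at Levels 3, 4 and 5, then a clean finalization pass.
mvp_run = ["ok", "ok", "fail", "ok", "ok", "fail", "ok", "ok", "fail", "ok"] + ["ok"] * 7
print(simulate(mvp_run))
```

A run that keeps failing reaches S_ERROR once the counter hits 60, which is the bounded-failure outcome the rules PDFD19 and PDFD20 encode.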
Appendix A.11.6. Development Process
Appendix A.11.7. Key Technical Highlights
- Controlled Depth Parallelism (BF-by-Two Adaptation)
  - Benefit: By processing two sibling nodes in parallel at each hierarchical level during the depth-first traversal, the system can expose cross-branch inconsistencies and UI state conflicts early in development, rather than deferring them to integration.
  - Contrast: A pure DFD approach may postpone the detection of lateral interactions until deeper refinement phases, whereas a pure BFD approach, by prioritizing horizontal breadth, may introduce significant coordination overhead and delay cross-level dependency validation.
  - Example: Simultaneously testing the nodes “Asia” and “North America” at the continent level revealed UI inconsistencies in regional naming conventions (e.g., “state” in the US vs. “province” in China). Early resolution of these discrepancies prevented cascading structural conflicts at deeper country-specific levels of the hierarchy.
- Iterative Schema Refinement
  - Benefit: The integration of CDD allows for flexible schema evolution during the development process, accommodating necessary mid-development changes such as the introduction of surrogate keys.
  - Contrast: Traditional, more rigid development methodologies like Waterfall, with their upfront and inflexible schema design, often hinder the incorporation of necessary updates identified later in the cycle.
  - Example: Initially, composite keys (e.g., combining PersonId and ContinentId) were used. However, during backtracking at the continent level, these were refactored to simpler surrogate keys (e.g., SelectedContinentId), significantly simplifying downstream data relationships and query logic.

- Hierarchical Backtracking
  - Benefit: Backtracking to previously validated hierarchical levels to incorporate new branches enhances the stability and reusability of the developed components by ensuring core paths are solid before extensive horizontal expansion.
  - Contrast: Monolithic development methods often require significant rework or even rollback when errors are discovered late in the process, especially after substantial horizontal expansion.
  - Example: After thoroughly validating the path USA → Maryland → Howard, PDFD facilitated backtracking to the state level to add branches for Virginia. This allowed for the reuse of existing controllers and views, minimizing redundant development effort.
- Methodological Cohesion
  - The PDFD methodology integrates DFD, BFD (through the BF-by-Two strategy), and CDD.
  - This MVP serves as a practical instantiation of the hybrid approach, demonstrating its ability to maintain the formal properties of the underlying methodologies (as discussed in Section 3.4.1) while offering a pragmatic and adaptable development process for hierarchical systems.
Appendix A.12. PDFD MVP State Machine Workflow Mermaid Code
Appendix A.12.1. Mermaid Code for Figure A.11.3
Appendix A.13. PDFD MVP Development Process
Appendix A.13.1. Root Node Level – Visitor

- Model: The Person class maps to the Persons database table (Table A.13.1), with PersonId as the primary key.
- Controller: The PersonsController processes HTTP requests, binds the Person model to the view, and handles form submissions.
- View: ASP.NET Razor syntax is used to render the visitor entry interface (Figure A.13.1).
- Workflow: Users input visitor details, which are persisted in SQL Server (Table A.13.1) upon submission. This process, representing Level 1 (S1 in Figure A.11.3), then redirects users to the Continent Level (Level 2) via PDFD2 (Table A.11.2).
| PersonId | First Name | Middle Name | Last Name | Email |
|---|---|---|---|---|
| 1 | Test | T | Tester | tester@test.com |
Appendix A.13.2. Continent Level – Asia and North America
- 1. Implementation Overview
| Model | SQL Table | Function | Key Data Fields |
|---|---|---|---|
| Continent | Continents | Reference Data | ContinentId, Name, NameTypeId |
| SelectedContinent | SelectedContinents | Selection Tracking | SelectedContinentId, PersonId, ContinentId, IsDeleted |
| ContinentViewModel | N/A | View Model | ContinentId, ContinentName, PersonId, IsSelected |
- 2. Source Tables
- Persons (Table A.13.1) – Shared across all levels
- Continents (Table A.13.3)
- NameTypes (Table A.13.4) – Shared across all levels
- SelectedContinents (Table A.13.5)
| ContinentId | Name | NameTypeId |
|---|---|---|
| 1 | Asia | 1 |
| 2 | North America | 1 |
| NameTypeId | Name |
|---|---|
| 1 | Continent |
| 2 | Country |
| 3 | State |
| 4 | County |
| 5 | City |
| 6 | District |
| 7 | Province |
| 11 | Region |
| SelectedContinentId | PersonId | ContinentId | IsDeleted |
|---|---|---|---|
| 1 | 1 | 1 | 1 |
| 2 | 1 | 2 | 0 |
- 3. Workflow Logic
- Users interact with the continent selection interface (Figure A.13.2). Upon submission, the system updates the SelectedContinents table (Table A.13.5) according to the following rules, which also apply at subsequent hierarchy levels:
  - New selections are added with IsDeleted = 0.
  - Deselections are marked with IsDeleted = 1 (soft delete).
  - Restored selections have IsDeleted reset to 0.
- User selections at the continent level trigger cascaded updates to downstream levels (e.g., countries).
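The three soft-delete rules can be sketched as a single update function. This is an illustration, not the MVP's controller code (which is ASP.NET MVC): rows are modeled as dictionaries with the PersonId, ContinentId, and IsDeleted fields, the surrogate key is omitted, and the function name `apply_selection` is hypothetical.

```python
def apply_selection(rows, person_id, submitted_ids):
    """Update SelectedContinents-style rows in place for one form submission.

    rows: list of dicts with keys PersonId, ContinentId, IsDeleted.
    submitted_ids: set of ContinentId values checked in the form.
    """
    existing = {r["ContinentId"]: r for r in rows if r["PersonId"] == person_id}
    for cid in sorted(submitted_ids):
        if cid in existing:
            existing[cid]["IsDeleted"] = 0  # restored (or unchanged) selection
        else:
            # New selection: insert with IsDeleted = 0.
            rows.append({"PersonId": person_id, "ContinentId": cid, "IsDeleted": 0})
    for cid, row in existing.items():
        if cid not in submitted_ids:
            row["IsDeleted"] = 1  # deselection: soft delete, the row is kept
    return rows


# Person 1 had Asia (1) selected, then submits North America (2) only.
table = [{"PersonId": 1, "ContinentId": 1, "IsDeleted": 0}]
apply_selection(table, 1, {2})
print(table)
```

After this submission the rows match the IsDeleted pattern shown for SelectedContinents: Asia is soft-deleted and North America is active. Re-submitting both continents would reset Asia's flag to 0 rather than insert a duplicate row.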

- Level 2 (S2) processed.
- Transitions to Level 3 (S3) follow PDFD3 (∑P(n) ≥ K₂).
-
Level 2 with K₂ = 2:
- ○
- Node 2.1: North America (ContinentId = 2)
- ○
- Node 2.2: Asia (ContinentId = 1)
- 4.
- Hierarchical Context
- Errors detected at Level 3 (S3) trigger refinement starting at Jᵢ=2 (PDFD4).
Appendix A.13.3. Country Level – United States and Canada
- 1. Implementation Overview
- Missing IsSelected field triggered refinement (PDFD4) for Levels 2–3.
- Post-refinement, processing resumed at Level 3 (PDFD5).
- Country, SelectedCountry, CountryViewModel (see Table A.13.6)
- Countries Lookup (Table A.13.7), SelectedCountries Transaction Data (Table A.13.8)
| Model | SQL Table | Function | Key Data Fields |
|---|---|---|---|
| Country | Countries | Reference Data | CountryId, Name, ContinentId, NameTypeId |
| SelectedCountry | SelectedCountries | Selection Tracking | SelectedCountryId, SelectedContinentId, CountryId, IsDeleted |
| CountryViewModel | N/A | View Model | CountryId, CountryName, SelectedContinentId, IsSelected |
| CountryId | Name | ContinentId | NameTypeId |
|---|---|---|---|
| 1 | USA | 2 | 2 |
| 2 | Canada | 2 | 2 |
| SelectedCountryId | SelectedContinentId | CountryId | IsDeleted |
|---|---|---|---|
| 1 | 2 | 1 | 0 |
| 2 | 2 | 2 | 1 |
- 2. Workflow Logic

- State Machine (Figure A.11.3)
  - S3 processing step failed
  - Transitions to S2_R1
- Structural Workflow (Figure A.11.1)
  - Node 3.1: USA (CountryId = 1)
  - Node 3.2: Canada (CountryId = 2)
Appendix A.13.4. State Level – Maryland and Virginia
- 1. Implementation Overview
- Surrogate key introduction triggered refinement (PDFD7) for Levels 2–4.
- Processing resumed at Level 4 (PDFD8).
- State, SelectedState, StateViewModel (Table A.13.9)
- States Lookup (Table A.13.10), SelectedStates (Table A.13.11)
| Model | SQL Table | Function | Key Data Fields |
|---|---|---|---|
| State | States | Reference Data | StateId, Name, CountryId, NameTypeId |
| SelectedState | SelectedStates | Selection Tracking | SelectedStateId, SelectedCountryId, StateId, IsDeleted |
| StateViewModel | N/A | View Model | StateId, StateName, SelectedCountryId, IsSelected |
| StateId | Name | CountryId | NameTypeId |
|---|---|---|---|
| 1 | Maryland | 1 | 3 |
| 2 | Virginia | 1 | 3 |
| SelectedStateId | SelectedCountryId | StateId | IsDeleted |
|---|---|---|---|
| 1 | 1 | 1 | 0 |
| 2 | 1 | 2 | 1 |
- 2. Workflow Logic
- The StateController uses the StateViewModel to populate the interface (Figure A.13.4), where users toggle state selections (e.g., Maryland, Virginia). Changes are saved to the SelectedStates table (Table A.13.11) using soft deletion (IsDeleted flag).

- Users modify state selections, with pre-checked entries reflecting prior choices stored in SelectedStates.
- Level 4 processing
- Transitions to S2_R2 (PDFD7)
- Node 4.1: Maryland (StateId = 1)
- Node 4.2: Virginia (StateId = 2)
Appendix A.13.5. County Level – Howard and Baltimore
- 1. Implementation Overview
- Missing IsDeleted flag triggered refinement (PDFD10) for Levels 2–5.
- Processing resumed at Level 5 (PDFD11).
- County, SelectedCounty, CountyViewModel (Table A.13.12)
- Counties Lookup (Table A.13.13), SelectedCounties Transaction Data (Table A.13.14)
| Model | SQL Table | Function | Key Data Fields |
|---|---|---|---|
| County | Counties | Reference Data | CountyId, Name, StateId, NameTypeId |
| SelectedCounty | SelectedCounties | Selection Tracking | SelectedCountyId, SelectedStateId, CountyId, IsDeleted |
| CountyViewModel | N/A | View Model | CountyId, CountyName, SelectedStateId, IsSelected |
| CountyId | Name | StateId | NameTypeId |
|---|---|---|---|
| 1 | Howard | 1 | 4 |
| 2 | Baltimore | 1 | 4 |
| SelectedCountyId | SelectedStateId | CountyId | IsDeleted |
|---|---|---|---|
| 1 | 1 | 1 | 0 |
- 2. Workflow Logic
- Users toggle county selections (e.g., Howard, Baltimore) within Maryland via the interface (Figure A.13.5), with updates persisted to SelectedCounties (Table A.13.14).

- Level 5 processing
- Transitions to S2_R3 (PDFD10)
- Node 5.1: Howard County (CountyId = 1)
- Node 5.2: Baltimore County (CountyId = 2)
Appendix A.13.6. City Level – Ellicott City and Columbia
- 1. Implementation Overview
- City, SelectedCity, CityViewModel (Table A.13.15)
- Cities Lookup (Table A.13.16), SelectedCities Transaction Data (Table A.13.17)
| Model | SQL Table | Function | Key Data Fields |
|---|---|---|---|
| City | Cities | Reference Data | CityId, Name, CountyId, NameTypeId |
| SelectedCity | SelectedCities | Selection Tracking | SelectedCityId, SelectedCountyId, CityId, IsDeleted |
| CityViewModel | N/A | View Model | CityId, CityName, SelectedCountyId, IsSelected |
| CityId | Name | CountyId | NameTypeId |
|---|---|---|---|
| 1 | Ellicott City | 1 | 5 |
| 2 | Columbia | 1 | 5 |
| SelectedCityId | SelectedCountyId | CityId | IsDeleted |
|---|---|---|---|
| 1 | 1 | 1 | 0 |
| 2 | 1 | 2 | 0 |
- 2. Workflow Logic
- Users finalize city selections (e.g., Ellicott City, Columbia) within Howard County via the interface (Figure A.13.6), with data stored in SelectedCities (Table A.13.17).

- Level 6 processing.
- Transition to completion phase follows PDFD13.
- Node 6.1: Ellicott City (CityId = 1).
- Node 6.2: Columbia (CityId = 2).
Appendix A.13.7. Intermediate Development with CDD
- 1. Addition of the IsSelected Field
- Challenge: The IsSelected flag—essential for tracking user selections—was omitted during initial continent-level development and identified only at the country level.
- CDD Intervention: A feedback loop (curve a in Figure A.11.1) redirected development back to the continent level to add the IsSelected field, ensuring consistent state management and user selection tracking across all levels.
- 2. Transition from Composite to Surrogate Keys
- Initial Design: Composite keys (e.g., PersonId + ContinentId for SelectedContinents) were initially used to enforce uniqueness across tables.
- Challenge: As development progressed to deeper levels of the hierarchy (e.g., states, counties), composite keys became cumbersome, complicating foreign key relationships and reducing scalability.
- CDD Intervention: A surrogate key (SelectedContinentId) was introduced at the continent level (curve b in Figure A.11.1), simplifying downstream dependencies and improving scalability.
- 3. Introduction of the IsDeleted Flag
- Challenge: Soft-deletion functionality, essential for marking deselected entries without losing data, was overlooked initially, risking permanent data loss when users deselected entries.
- CDD Intervention: The IsDeleted field was retrofitted into transaction tables (e.g., SelectedContinents) via a feedback loop (represented by curve c in Figure A.11.1), allowing for dynamic updates to selections without data loss.
| Intervention | Scope Levels | i | Rᵢ | Depth | Rule ID | State Transition | Figure Reference |
|---|---|---|---|---|---|---|---|
| Addition of IsSelected | 2–3 | 3 | 2 | 2 | PDFD4 → PDFD5 | S3 → S2_R1 → S3 | Curve a (Figure A.11.1) |
| Transition to Surrogate Keys | 2–4 | 4 | 3 | 3 | PDFD7 → PDFD8 | S4 → S2_R2 → S4 | Curve b (Figure A.11.1) |
| Introduction of IsDeleted | 2–5 | 5 | 4 | 4 | PDFD10 → PDFD11 | S5 → S2_R3 → S5 | Curve c (Figure A.11.1) |
- Data Integrity: Retroactive fixes ensured consistent tracking of user selections and deletions across all levels, preventing data inconsistencies.
- Scalability: The introduction of surrogate keys reduced relational complexity, supporting seamless expansion to accommodate deeper hierarchical levels as the system grew.
- Workflow Cohesion: Iterative refinements aligned the system with real-world user behavior (e.g., revisiting selections), resulting in a more intuitive user experience.
- Per-level refinement limit: refinement_attempts[j] ≤ Rₘₐₓ = 60 (Section A.11.5)
-
S_ERROR enforcement:
- ○
- PDFD19: Refinement failure after 60 attempts
- ○
- PDFD20: Forward-pass failure after 60 attempts
- Development phases map 1:1 to PDFD states (Table A.11.1)
- CDD interventions trigger exact refinement rules (Table A.13.18)
- Jᵢ=2 maintained for all refinements (root-cause level)
-
Refinement Scope Consistency:
- ○
- Rᵢ=2: Levels 2-3 (S2_R1)
- ○
- Rᵢ=3: Levels 2-4 (S2_R2)
- ○
- Rᵢ=4: Levels 2-5 (S2_R3)
-
Tree Parameters:
- ○
- Depth: L=6 (Levels 1-6)
- ○
- State Complexity: |Q|=15 states
-
Refinement Attempts:
- ○
- Level 2: 3 attempts << Rₘₐₓ=60
- ○
- Level 3: 3 attempts << 60
- ○
- Level 4: 2 attempts << 60
- ○
- Level 5: 1 attempt << 60
-
Transition Complexity:
- ○
- |δ|=20 rules (Table A.11.2)
- ○
- Max depth: O(L)=6
Appendix A.13.8. The Report Page
- 1.
- Implementation Overview
| Type | Name | Role | Key Data Fields |
|---|---|---|---|
| Database View | vw_Report | Data Aggregation | Persons, SelectedContinents, Continents, SelectedCountries, Countries, SelectedStates, States, SelectedCounties, Counties, SelectedCities, Cities, NameTypes |
| Model | Report | UI Presentation | PersonName, ContinentName, CountryName, StateName, CountyName, CityName |
- 2.
- Workflow Logic
Appendix A.13.9. Backtracking to complete the entire application

-
Bottom-Up Processing:
- ○
- Finalizes subtrees level-by-level from leaves toward root
- ○
- Handles localized subtree completion
-
Local Top-Down Verification:
- ○
- Validates parent-child relationships within the current subtree
- ○
- Ensures hierarchical integrity from subtree root to leaves
- ○
- Example: S7 verifies Maryland→Howard County→Ellicott City
-
State S11 performs global top-down finalization:
- ○
- Verifies completeness from root perspective (Person→Continent→Country→...)
- ○
- Ensures cross-subtree consistency
- ○
- Executes final validation pass before termination (PDFD18)
-
Phase 1: County-Level Completion (Subset i in Figure A.11.1 and state S7 in Figure A.11.3)
- ○
- Objective: Expand Howard County by adding remaining cities (e.g., Columbia) and populate all cities in Baltimore County
- ○
- Actions: Update the Cities table with missing entries (Table A.13.16)
- ○
- State Machine: Maps to S7 → S8 (PDFD14) (Table A.11.2)
-
Phase 2: State-Level Expansion (Subset ii in Figure A.11.1 and state S8 in Figure A.11.3)
- ○
- Objective: Implement remaining counties/cities in Maryland and Virginia
- ○
- Actions: Populate Counties and Cities tables for Virginia (e.g., Fairfax County, Arlington)
- ○
- State Machine: Maps to S8 → S9 (PDFD15) (Table A.11.2)
-
Phase 3: National Scalability (Subset iii in Figure A.11.1 and state S9 in Figure A.11.3)
- ○
- Objective: Scale to all U.S. states and Canadian provinces
- ○
- Actions: Populate States, Counties, and Cities tables for the U.S. (e.g., Texas, California) and Canada (e.g., Ontario, Quebec)
- ○
- State Machine: Maps to S9 → S10 (PDFD16) (Table A.11.2)
-
Phase 4: Continental Integration (Subset iv in Figure A.11.1 and state S10 in Figure A.11.3)
- ○
- Objective: Integrate North American and Asian datasets
- ○
- Actions: Populate Asian countries (e.g., China, Japan) with region-specific hierarchies (e.g., provinces, prefectures)
- ○
- State Machine: Maps to S10 → S11 (PDFD17; transitions to global top-down finalization)
-
Phase 5: Global Coverage (Unpopulated Nodes in Figure A.11.1 and S11 in Figure A.11.3)
- ○
- Objective: Achieve global completeness by adding remaining continents (e.g., Europe, Africa)
- ○
- Actions: Populate Countries, States, Counties, and Cities for all regions
- ○
- State Machine: Executes during S11 (global top-down finalization) and terminates via PDFD18
Appendix A.14. PBFD MVP WITH PATTERN-BASED TRAVERSAL AND TLE
Appendix A.14.1. Overview of the PBFD MVP
Appendix A.14.2. Technology Stack and Key Design Decisions
- Level 1: the country (“United States”), implemented in the MVP as the table name representing the grandparent pattern
- Level 2: its constituent states (e.g., Maryland, California), represented as columns within the Level 1 table
- Level 3: the counties within each state, encoded as bitmask values stored in the corresponding Level 2 column cells
- Selective Depth Exploration: After resolving a Level 1 or Level 2 pattern, the MVP performs controlled descent into the corresponding TLE instance to validate cross-level constraints while maintaining early UI feedback.
- Iterative Refinements (CDD): Bounded refinement cycles allow schema or pattern adjustments when validations fail. This preserves termination guarantees while supporting correction and incremental evolution of the hierarchy.
Appendix A.14.3. Strategy in Practice
- Example: continents such as "North America" and "Asia" are presented as checkboxes in a shared view, enabling batch-processing logic.
- Efficiency: server-side Razor views with shared models reduce UI duplication.
- Depth After Pattern: after a pattern (e.g., continent selection) is validated, the system descends into the children of selected parents only (e.g., countries inside selected continents), enabling earlier detection of cross-level invariants [62].
- Feedback Loops: mid-development changes (shared components, schema adjustments) were integrated via bounded CDD cycles; failures at deeper levels trigger controlled backtracking and refinement of parent-level patterns. This mirrors dependency-directed backtracking techniques used in knowledge refinement and constraint search [77].
- Rₘₐₓ = 50 (empirical maximum refinement attempts per level before bounded failure)
- Jᵢ = trace_origin(i) (refinement origin tracing)
- Rᵢ = i - Jᵢ + 1 (refinement span)
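The three refinement parameters above can be illustrated with a minimal Python sketch. The helper names (`refinement_span`, `refined_levels`, `may_refine`) are hypothetical, and because `trace_origin` is not defined in this appendix, the origin level Jᵢ is passed in directly rather than computed.

```python
R_MAX = 50  # per-level cap on refinement attempts, as stated in the text


def refinement_span(i: int, j_i: int) -> int:
    """Return R_i = i - J_i + 1, the number of levels a refinement revisits."""
    if not 1 <= j_i <= i:
        raise ValueError("origin level must satisfy 1 <= J_i <= i")
    return i - j_i + 1


def refined_levels(i: int, j_i: int) -> range:
    """Levels J_i..i that are re-processed when level i fails."""
    return range(j_i, i + 1)


def may_refine(attempts: int) -> bool:
    """True while the bounded-refinement budget is not yet exhausted."""
    return attempts < R_MAX


# A Level 3 failure traced back to Level 1 re-processes Levels 1-3.
print(refinement_span(3, 1))       # 3
print(list(refined_levels(3, 1)))  # [1, 2, 3]
```

Under these assumptions, a failure detected at level i never widens the refinement scope beyond the levels between its traced origin and i.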
Appendix A.14.4. Structural Workflow

- Root Node: Level 1 (ContinentGrandparent)
- Numbering: First digit = level, second digit = position (e.g., Node 3.1 = North America)
- Arrows: Progression through hierarchical levels
- Dotted Lines: Unselected nodes
- Curve a: CDD-driven refinements (Levels 1–3) triggered by Level 3 failures
Appendix A.14.5. State Machine Representation
| State Id | Label | Phase | Generic Mapping | TLE Scope |
|---|---|---|---|---|
| S0 | Level_1_Processing_Validating_Resolving | Process & Validate Level 1 & resolve Level 2 (TLE Root: ContinentGrandparent) | S₁(1) → S₂(1) → S₃(1) | Levels 1–3 |
| S1 | Level_2_Processing_Validating_Resolving | Process & Validate Level 2 & resolve Level 3 (TLE Root: ContinentParent) | S₁(2) → S₂(2) → S₃(2) | Levels 2–4 |
| S2 | Level_3_Processing_Validating_Resolving | Process & Validate Level 3 & resolve Level 4 (TLE Root: a continent) | S₁(3) → S₂(3) → S₃(3) | Levels 3–5 |
| S3 | Level_4_Processing_Validating_Resolving | Process & Validate Level 4 & resolve Level 5 (TLE Root: a country) | S₁(4) → S₂(4) → S₃(4) | Levels 4–6 |
| S4 | Level_5_Processing_Validating | Process & Validate Level 5 (TLE Root: a state) | S₁(5) → S₂(5) | Levels 5–7 |
| S5 | Refine_Level1-3 | Refine Levels 1–3 (Level 3 failure) | S₁(j) → S₂(j) → S₃(j) (j=1) | Levels 1–3 |
| S6 | Finalize_All | Finalize all nodes top-down | S₄(1) → ... → S₄(7) | Levels 1–7 |
| S7 | Complete | Termination state | T | – |
| S8 | Validation_Failure | Terminate due to Rₘₐₓ = 50 exhaustion | S₅ | – |
| Rule ID | From State | To State | Condition | Generic Rule | Workflow Step |
|---|---|---|---|---|---|
| PBFD1 | [*] | S0 | Start | PB1 | Initialize Level 1 (TLE 1–3) |
| PBFD2 | S0 | S1 | Level 1 validated & resolved | PB4a | Proceed to Level 2 (TLE 2–4) |
| PBFD3 | S1 | S2 | Level 2 validated & resolved | PB4a | Proceed to Level 3 (TLE 3–5) |
| PBFD4 | S2 | S3 | Level 3 validated & resolved | PB4a | Proceed to Level 4 (TLE 4–6) |
| PBFD5 | S3 | S4 | Level 4 validated & resolved | PB4a | Proceed to Level 5 (TLE 5–7) |
| PBFD6 | S2 | S5 | Level 3 validation failed | PB3 | Refine Levels 1-3 |
| PBFD7 | S5 | S0 | Levels 1-3 reprocessed | PB3a | Resume Level 1 (TLE 1–3) |
| PBFD8 | S5 | S8 | refinement_attempts ≥ Rₘₐₓ | PB9 | Terminate with error |
| PBFD9 | S4 | S6 | Level 5 validated | PB4b | Finalize all levels |
| PBFD10 | S6 | S7 | All nodes finalized. Finalization (S6) combines PB7 and PB8, resolving all levels top-down in a single step for efficiency. | PB8 | Complete |
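The transition table can be exercised with a small simulation. The following Python sketch transcribes rules PBFD1 to PBFD10 into a lookup table; the event names, the `START` pseudo-state, and the choice to count a refinement attempt on each entry into S5 are assumptions made for illustration, not part of the MVP.

```python
R_MAX = 50  # refinement budget from the text

# (from_state, event) -> (to_state, rule_id), transcribed from the table above
TRANSITIONS = {
    ("START", "start"): ("S0", "PBFD1"),
    ("S0", "validated"): ("S1", "PBFD2"),
    ("S1", "validated"): ("S2", "PBFD3"),
    ("S2", "validated"): ("S3", "PBFD4"),
    ("S3", "validated"): ("S4", "PBFD5"),
    ("S2", "failed"): ("S5", "PBFD6"),
    ("S5", "reprocessed"): ("S0", "PBFD7"),
    ("S5", "exhausted"): ("S8", "PBFD8"),
    ("S4", "validated"): ("S6", "PBFD9"),
    ("S6", "finalized"): ("S7", "PBFD10"),
}


def run(level3_failures: int) -> list:
    """Return the rule IDs fired when Level 3 validation fails N times."""
    state, trace, attempts = "START", [], 0
    remaining = level3_failures
    while state not in ("S7", "S8"):  # S7 = Complete, S8 = Validation_Failure
        if state == "START":
            event = "start"
        elif state == "S2" and remaining > 0:
            remaining -= 1
            event = "failed"
        elif state == "S5":
            attempts += 1  # assumption: one attempt per entry into S5
            event = "exhausted" if attempts >= R_MAX else "reprocessed"
        elif state == "S6":
            event = "finalized"
        else:
            event = "validated"
        state, rule = TRANSITIONS[(state, event)]
        trace.append(rule)
    return trace


print(run(0))       # happy path, ends with PBFD10
print(run(50)[-1])  # PBFD8: budget exhausted, workflow ends in S8
```

Because every pass through S5 consumes one attempt and the budget is finite, the loop reaches S7 or S8 for any number of failures, which mirrors the bounded-termination argument in the text.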

Appendix A.14.6. Data Structure and Relationships
- 1.
- Sample Locations Dataset
| Id | Name | Name Type Id | Type | Parent Id | Child Id | Level |
|---|---|---|---|---|---|---|
| 0 | ContinentGrandparent | null | INT | null | 0 | 1 |
| 1 | ContinentParent | null | INT | 0 | 0 | 2 |
| 2 | North America | 1 | INT | 1 | 0 | 3 |
| 3 | South America | 1 | INT | 1 | 1 | 3 |
| 9 | United States | 2 | BIGINT | 2 | 0 | 4 |
| 10 | Canada | 2 | INT | 2 | 1 | 4 |
| 14 | Brazil | 2 | INT | 3 | 0 | 4 |
| 38 | Virginia | 3 | VARCHAR(120) | 9 | 11 | 5 |
| 45 | Maryland | 3 | INT | 9 | 18 | 5 |
| 102 | Howard County | 4 | INT | 45 | 12 | 6 |
| 147 | Ellicott City | 5 | INT | 102 | 1 | 7 |
- Id: Unique identifier for the node
- Name: Entity name (e.g., "North America", "Maryland")
- Name Type Id: Categorizes the entity type (e.g., continent = 1, country = 2). ContinentGrandparent and ContinentParent are structural placeholders for TLE and therefore have no Name Type Id.
-
Type: The SQL data type for the node's bitmask, determined by the maximum number of children:
- ○
- INT: Supports up to 32 child selections
- ○
- BIGINT: Supports up to 64 child selections
- ○
- VARCHAR(X): For >64 children, storing a character-based bitmask representation
- Parent Id: References the parent node's Id
- Child Id: The node's zero-based position within its parent's bitmask encoding
- Level: The node's depth in the hierarchy
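The Type rule above can be expressed as a small selection function. This is a sketch only: the function name `bitmask_sql_type` is hypothetical, and sizing `VARCHAR(X)` at one character per child is an assumption, since the text does not state how X is derived.

```python
def bitmask_sql_type(child_count: int) -> str:
    """Pick the SQL type for a node's bitmask from its number of children."""
    if child_count < 0:
        raise ValueError("child_count must be non-negative")
    if child_count <= 32:
        return "INT"
    if child_count <= 64:
        return "BIGINT"
    # Assumption: one character ('0' or '1') per child position.
    return f"VARCHAR({child_count})"


print(bitmask_sql_type(7))    # INT
print(bitmask_sql_type(51))   # BIGINT
print(bitmask_sql_type(120))  # VARCHAR(120)
```

The same function could serve as the basis for a type-escalation check: if a node gains children and the returned type differs from the stored Type, the column would need to be widened.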
- 2.
- Design Rationale
- Hierarchical Querying: ParentId defines the tree structure.
- Pattern Encoding: ChildId enables bitmask-based grouping within TLE tables.
- Dynamic Generation: Serves as input to recursively generate TLE tables at runtime, adapting bitmask data types as needed for flexibility.
- Consistency: Levels 1–5 follow a consistent schema; Levels 6–7 are embedded as bitmasks within parent levels.
- 3.
- Integration with TLE
- ParentId defines column-to-row relationships.
- ChildId defines the bit position in the bitmask.
- "United States" (ChildId = 0) → 0b0001 = bitmask 1
- "Canada" (ChildId = 1) → 0b0010 = bitmask 2
Appendix A.14.7. Three-Level Encapsulation (TLE) Rule

- Grandparent (Level 2): ContinentParent (Grandparent, Node 2)
- Parent (Level 3): [North America], [South America], etc. (Parent columns, Nodes 3.1 – 3.7)
- Child (Level 4): Bitmask for selected countries within each continent (Child state, Nodes 4.1 – 4.6)
| Level | Grandparent Node (Table) | Parent Nodes (Columns) | Child Nodes (Bitmask) | Three-Level Scope |
|---|---|---|---|---|
| 1 | ContinentGrandparent | ContinentParent | Continent selections (e.g., North America (1)) | Levels 1–3 |
| 2 | ContinentParent | e.g., Asia, North America | Country selections (e.g., United States (1)) | Levels 2–4 |
| 3 | Continent | e.g. United States, Canada | State selections (e.g., Maryland (262,144)) | Levels 3–5 |
| 4 | Country | e.g. Virginia, Maryland | County selections (e.g., Howard County (4096)) | Levels 4–6 |
| 5 | State | e.g. Howard County, Baltimore County | City selections (e.g., (Columbia MD + Ellicott City) (3)) | Levels 5–7 |
- County Level (Level 6): Represented as dedicated columns within the State table (Level 5)
- City Level (Level 7): Stored as bitmasks within the corresponding County columns
| PersonId | Howard County (bitmask) | …… |
|---|---|---|
| 1 | 3 | …… |
- It encapsulates the grandparent-parent-child hierarchy within a single unit, using bitmasks for O(1) updates and enabling parallel resolution of nodes within a pattern.
- Leveraging the analytical findings from Appendix A.16, it avoids creating hundreds of tables for leaf-level data by embedding their states, thus maintaining modularity and performance despite the exponential node growth in deeper levels.
- Scalability Alignment: By minimizing dynamic table proliferation and maintaining compact storage, this approach supports the horizontal scaling and operational efficiency required in cloud-native environments.
Appendix A.14.8. Database Implementation (SQL Server)
- N denotes the current hierarchical level
- L denotes the maximum depth of the hierarchy (in the PBFD MVP, L = 7)
- The algorithm iterates from level 1 to L - 2, generating one dynamic table per grandparent node
- Locations metadata (table or JSON)
- Maximum dynamic depth = 5 (up to the State level)
- SQL table per grandparent that follows the TLE rule (level N)
- One column per parent (level N+1)
- One bitmask field encoding child selections (level N+2)
- Load the Locations data
- Group nodes by hierarchical level
- For each level N from 1 to L-2:
- ○
- One column for each parent node at level N+1
- ○
- One bitmask field encoding child selections at level N+2
- 4.
-
Skip dynamic table creation for the lowest two levels (L−1 and L):
- ○
- These levels are embedded into their grandparent’s table as described in Appendix A.14.7, using dedicated columns and bitmask fields
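The generation steps above can be sketched in Python as follows. This is an illustration under stated assumptions, not the MVP's deployment script: the function names, the `PersonId` primary-key column, nullable bitmask columns, and the decision to skip grandparents that have no parent columns in the sample are all hypothetical choices. The rows are the Level 1 to 6 entries of the sample Locations table.

```python
from collections import defaultdict

MAX_DEPTH = 7  # L in the text

# (Id, Name, Type, ParentId, Level) rows taken from the sample Locations table
LOCATIONS = [
    (0, "ContinentGrandparent", "INT", None, 1),
    (1, "ContinentParent", "INT", 0, 2),
    (2, "North America", "INT", 1, 3),
    (3, "South America", "INT", 1, 3),
    (9, "United States", "BIGINT", 2, 4),
    (10, "Canada", "INT", 2, 4),
    (14, "Brazil", "INT", 3, 4),
    (38, "Virginia", "VARCHAR(120)", 9, 5),
    (45, "Maryland", "INT", 9, 5),
    (102, "Howard County", "INT", 45, 6),
]


def quote(name: str) -> str:
    """Bracket-quote an identifier, doubling any closing bracket."""
    return "[" + name.replace("]", "]]") + "]"


def generate_ddl(rows, max_depth=MAX_DEPTH):
    """Emit one CREATE TABLE per grandparent node at levels 1..L-2."""
    children = defaultdict(list)  # parent Id -> [(child name, bitmask type)]
    for _id, name, sql_type, parent_id, _level in rows:
        if parent_id is not None:
            children[parent_id].append((name, sql_type))

    statements = []
    for node_id, name, _type, _parent, level in rows:
        if level > max_depth - 2:
            continue  # levels L-1 and L are embedded, not given tables
        parents = children.get(node_id, [])
        if not parents:
            continue  # assumption: no table for a grandparent without columns
        columns = ["[PersonId] INT NOT NULL PRIMARY KEY"]
        # One column per parent; its type holds the bitmask over that
        # parent's children (level N+2).
        columns += [f"{quote(p)} {t} NULL" for p, t in parents]
        body = ",\n    ".join(columns)
        statements.append(f"CREATE TABLE {quote(name)} (\n    {body}\n);")
    return statements


for statement in generate_ddl(LOCATIONS):
    print(statement)
```

Because the output depends only on the Locations rows and their order, repeated runs over the same metadata produce identical statements, which is the property the text calls deterministic generation.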
- ContinentGrandparent (Level 1, Id = 0)
- Serves as the hierarchical entry point and contains bitmask columns for descendant states or subregions
- Deterministic CREATE TABLE generation occurs as part of controlled deployment scripts.
- All DDL changes are executed inside transactions to ensure rollback safety.
- Preflight checks validate bitmask width, column compatibility, and backward consistency before applying any schema upgrades.
- Type escalation (e.g., INT → BIGINT → VARCHAR) is handled automatically when child-node cardinality outgrows the existing bitmask type.
- Persons (core entity table)
- Locations (full hierarchy metadata)
- NameTypes (categorization of nodes: continent, country, etc.)
- Level 1: ContinentGrandparent
- Level 2: ContinentParent
- Level 3: one table per continent (e.g., NorthAmerica, Asia, etc.)
- Level 4: one table per country (e.g., [United States], Canada, etc.)
- Level 5: one table per state (e.g., Alabama, California, etc.)
- Lower levels embedded via bitmask columns rather than additional tables
- The Persons table as the static entry point
- Dynamically generated TLE structures for the first three hierarchical levels
- One-hop access paths from Persons
- Clear delineation of bitmask fields and level boundaries within each dynamic table

Appendix A.14.9. PBFD Loosely Coupled Table Design Benefits
| Feature | Benefit |
|---|---|
| Normalization [136] | Static tables are highly normalized. |
| Security [137] | Table-level permissions enforce granular access control (e.g., permitting team-specific access to regional data), a foundational relational security model. |
| Optimization [55,138] | Each grandparent table can utilize separate indexes and be independently partitioned or sharded, allowing for targeted performance tuning. |
| Challenge | PBFD Solution |
|---|---|
| Multi-Table Joins [139] | Replaces 4–5 join traversals with direct, one-hop access to precomputed grandparent tables, dramatically reducing query complexity. |
| ORM/Workflow Complexity [140] | Employs a single controller and view model across all hierarchical levels, simplifying the application layer and minimizing code duplication. |
| Backup/Restore Bottlenecks [141] | Enables modular, table-level operations (e.g., backing up only the "Europe" dataset), which aligns with modern, cloud-native operational practices [90]. |
Appendix A.14.10. Development Process
- Frontend — Visitor entry & pattern selection: The frontend collects visitor data, including each party’s initial pattern choices.
- Backend — Dynamic generation: The Locations table is consulted to deterministically generate TLE tables (CREATE TABLE statements).
- UI — Shared rendering: A single Razor view and ViewModel are reused across levels to render pattern options, reducing duplication.
- Data update — Bitmask write: User actions are persisted by updating the bitmask column in the grandparent table (typically a single-row O(1) operation).
- Hierarchy-Aware Design: Logical table boundaries are enforced for each three-level scope via TLE, aligning with structured decomposition principles in hierarchical relational schemas [118].
- Reusable Workflow: A single MVC controller and ViewModel operate across all hierarchical levels, minimizing ORM complexity and duplication in line with multi-view reuse patterns in enterprise MVC frameworks [142].
- Exceeding Rₘₐₓ transitions the workflow to state S8, as specified in Table A.14.2, enforcing bounded iteration and controlled bailout paths consistent with ISO/IEC 12207 lifecycle termination principles [87].
Appendix A.14.11. Key Claims Supported and Academic Grounding
| Claim | Academic Grounding |
|---|---|
| Bitwise/encoded access provides substantial read efficiency for pattern queries. | Grounded in columnar/encoding database literature [23,53,55] |
| Recursive-CTE/adjacency-list traversal has depth-dependent costs (worse for broad/deep hierarchies). | Grounded in classical database texts on hierarchical representations and relational trade-offs [118] |
| TLE’s dynamic table approach is a practical denormalization strategy that trades schema complexity for query and operational efficiency. | Consistent with schema evolution and polyglot persistence research [90] |
| Bounded iterative refinement and backtracking map to classical search/backtracking techniques. | Supported by DFS/BFS algorithmic foundations and process-refinement literature [62,77,83] |
| Formal verification of workflow/state-machine behavior aligns with CSP paradigms, and the MVP inherits its structural and behavioral guarantees from the verified Generic model. | Grounded in process algebra and model checking guidance (CSP) [45,71,87], as applied to the Generic model from which the MVP is derived |
Appendix A.15. PBFD MVP State Machine Workflow Mermaid Code
Appendix A.16. Quantifying Node Reduction in Perfect N-ary Trees
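- Total Nodes (before removal): $T = \sum_{k=0}^{h} N^{k} = \frac{N^{h+1} - 1}{N - 1}$
-
Nodes removed:
- ○
- Leaves (level h): $N^{h}$ nodes
- ○
- Parent level (level h−1): $N^{h-1}$ nodes
- Remaining nodes (after removing leaves and their parents): $T - N^{h} - N^{h-1} = \frac{N^{h-1} - 1}{N - 1}$
- Leaves (Level 6): $3^{6} = 729$ nodes
- Parent Level (Level 5): $3^{5} = 243$ nodes
- Total Nodes Removed: $729 + 243 = 972$ nodes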
- ○
- Nodes in last two levels: 729 + 243 = 972 nodes
- ○
- Percentage of last two levels: (972 / 1093) × 100% ≈ 88.93%
| Metric | Value | Percentage |
|---|---|---|
| Total nodes | 1,093 | 100.00% |
| Level 6 (leaves) | 729 | 66.70% |
| Level 5 (parents) | 243 | 22.23% |
| Last two levels combined | 972 | 88.93% |
| Remaining nodes (Levels 0–4) | 121 | 11.07% |
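The figures in the table can be checked with a few lines of Python. The sketch assumes a perfect tree with branching factor 3 and levels 0 to 6, which is the configuration that reproduces the tabulated counts; the function name `level_sizes` is hypothetical.

```python
def level_sizes(n: int, h: int) -> list:
    """Node count at each level 0..h of a perfect n-ary tree."""
    return [n ** k for k in range(h + 1)]


sizes = level_sizes(3, 6)
total = sum(sizes)                # all nodes
last_two = sizes[-1] + sizes[-2]  # leaves plus their parents
remaining = total - last_two      # levels 0-4

print(total, last_two, remaining)        # 1093 972 121
print(round(100 * last_two / total, 2))  # 88.93
```

The bottom two levels dominate for any branching factor of 2 or more, which is why embedding them as columns and bitmasks removes most of the tables that a one-table-per-node design would need.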
Appendix A.17. PBFD MVP Development Process
Appendix A.17.1. The Visitor Page
- Purpose: Captures initial visitor information (e.g., name, contact details) and persists it to the static Persons table (Table A.13.1)
-
Design:
- ○
- Model: Person (maps to Persons table)
- ○
- UI: The Person node is excluded from the PBFD MVP hierarchy (Figure A.15.1) but serves as the root node in the PDFD MVP design (Figure A.11.1)
- Workflow: On submission, redirects to the Continent Page to begin hierarchical selections
-
State Machine Context:
- ○
- Pre-Processing: This step occurs before the state machine initializes.
- ○
- Transition: Submission triggers PBFD1 (Table A.14.2), transitioning to S0 (Level_1_Processing_Validating_Resolving) (Table A.14.1).
Appendix A.17.2. Continent Level (Child Level 3, Grandparent Level 1)
- Hierarchical Structure
| Child LocationId | ChildId | Child Node | Parent Node (Columns) | Grandparent Node (Table) |
|---|---|---|---|---|
| 2 | 0 | North America | ContinentParent | ContinentGrandparent |
| 4 | 2 | Europe | ContinentParent | ContinentGrandparent |
| 6 | 4 | Asia | ContinentParent | ContinentGrandparent |
| PersonId | ContinentParent |
|---|---|
| 1 | 21 |

- 2.
- Key Workflow
- Data Retrieval: The LocationViewModel fetches continent nodes from the Locations table (Table A.14.3) where ParentId = 1.
- UI Binding: Continent names (e.g., "North America") are bound to checkboxes in the interface (Figure A.17.1).
- Bitmask Encoding: Selected continents are encoded as bitmasks (e.g., 21 for North America + Europe + Asia).
- Persistence: Bitmasks are saved in the ContinentGrandparent table (Table A.17.2).
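The encoding step can be sketched as follows, assuming each selected node contributes the bit at its zero-based ChildId. The function name `encode_selection` is hypothetical; the ChildId values come from the hierarchical structure table above.

```python
def encode_selection(child_ids) -> int:
    """Combine zero-based ChildId positions into a single bitmask."""
    mask = 0
    for child_id in child_ids:
        mask |= 1 << child_id  # set the bit owned by this child
    return mask


# North America (ChildId 0), Europe (ChildId 2), Asia (ChildId 4)
print(encode_selection([0, 2, 4]))  # 21
# A single selection sets exactly one bit.
print(encode_selection([0]))        # 1
```

Since setting a bit is idempotent, submitting the same checkbox twice leaves the stored value unchanged.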
- 3.
- Continent Level Interface
- Node Mapping (Figure A.14.1): Nodes 3.1–3.7 represent continents (e.g., 3.1 = North America).
- Example: Selecting Asia (3.5), Europe (3.3), and North America (3.1) generates the bitmask 0000000000010101 (decimal 21).
- 4.
- Interpretation
- Decimal Value: 21
- Binary Value: 00010101 (8-bit format)
-
Bit Positions Set:
- ○
- Bit 0: North America (Node 3.1 in Figure A.14.1)
- ○
- Bit 2: Europe (Node 3.3 in Figure A.14.1)
- ○
- Bit 4: Asia (Node 3.5 in Figure A.14.1)
- UI: North America, Europe, and Asia appear as checked checkboxes in Figure A.17.1.
- Storage: Selected continents are stored as bitmasks in the ContinentGrandparent table (Table A.17.2), with each bit representing a continent.
- 5.
- Workflow Impact
- Selection: Selections are saved as bitmasks in ContinentGrandparent.
- Deselection: Unchecking North America updates the bitmask to 20 (0000000000010100), while the LocationResetService recursively clears all associated child data within North America (including Country, State, etc.).
- UI/Backend Split: Only child nodes (Continents) are displayed, with grandparent and parent nodes managed by middleware.
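The deselection behavior can be illustrated with a minimal Python sketch. The internals of `LocationResetService` are not given in this appendix, so `reset_subtree` below is a hypothetical stand-in that zeroes stored bitmasks in an in-memory dictionary (the text describes the real service as nullifying child data). Each dictionary key is a column name, and its value is the bitmask over that node's children.

```python
def clear_bit(mask: int, child_id: int) -> int:
    """Return mask with the bit for child_id cleared."""
    return mask & ~(1 << child_id)


def reset_subtree(selections: dict, children: dict, node: str) -> None:
    """Zero the bitmask stored for node and for every descendant column."""
    for child in children.get(node, []):
        reset_subtree(selections, children, child)
    if node in selections:
        selections[node] = 0


# Sample values from the tables in this appendix.
children = {
    "North America": ["United States", "Canada"],
    "United States": ["Virginia", "Maryland"],
}
selections = {
    "ContinentParent": 21,  # North America + Europe + Asia
    "North America": 3,     # United States + Canada
    "Europe": 3,
    "United States": 264192,
    "Canada": 4097,
}

# Uncheck North America (ChildId 0), then clear everything beneath it.
selections["ContinentParent"] = clear_bit(selections["ContinentParent"], 0)
reset_subtree(selections, children, "North America")

print(selections["ContinentParent"])  # 20
print(selections["North America"])    # 0
print(selections["Europe"])           # 3
```

Only the deselected branch is touched: the Europe column keeps its value, which matches the subtree-local scope of the reset described above.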
- 6.
- State Machine Context
- Current State: S0 (Level_1_Processing_Validating_Resolving) (Table A.14.1)
- TLE Structure: Processes Child Level 3 under Grandparent Level 1 (ContinentGrandparent table)
- Transition: On submission, advances to S1 (Level_2_Processing_Validating_Resolving) via PBFD2 (Table A.14.2)
Appendix A.17.3. Country Level (Child Level 4, Grandparent Level 2)
- 1.
- Hierarchical Structure
| Child LocationId | ChildId | Child Node | Parent Node (Columns) | Grandparent Node (Table) |
|---|---|---|---|---|
| 9 | 0 | United States | North America | ContinentParent |
| 10 | 1 | Canada | North America | ContinentParent |
| 19 | 0 | United Kingdom | Europe | ContinentParent |
| 20 | 1 | France | Europe | ContinentParent |
| 24 | 0 | China | Asia | ContinentParent |
| 25 | 1 | India | Asia | ContinentParent |
| PersonId | North America | Europe | Asia |
|---|---|---|---|
| 1 | 3 | 3 | 0 |
- 2.
- Key Workflow
- Parent Nodes: Columns in the ContinentParent table (e.g., "North America") correspond to selected continents from the previous level (Table A.17.2).
- Child Bitmasks: Each column value encodes selected countries using a bitmask (e.g., 00000011 for United States and Canada, as shown under the [North America] column in Table A.17.4).
- UI Rendering: The LocationViewModel populates checkboxes for countries under selected continents (Figure A.17.2). Only child nodes (countries) and parent nodes (Continents) are displayed, with grandparent nodes managed by middleware. This hierarchical approach continues consistently down to the city level.

- 3.
- Interpretation
- Bitmask Value: 3 (binary 00000011 (8-bit format))
-
Set Bits:
- ○
- Bit 0: United States (Node 4.1 in Figure A.14.1)
- ○
- Bit 1: Canada (Node 4.2 in Figure A.14.1)
- Storage: Saved in the North America column of the ContinentParent table (Table A.17.4)
- Bitmask Value: 3 (binary 00000011 (8-bit format))
-
Set Bits:
- ○
- Bit 0: United Kingdom (Node 4.5 in Figure A.14.1)
- ○
- Bit 1: France (Node 4.6 in Figure A.14.1)
- Storage: Persisted in the Europe column of the ContinentParent table (Table A.17.4)
- Bitmask Value: 0 (binary 00000000 (8-bit format))
- Set Bits: None (all bits unset)
- Storage: Persisted in the Asia column of the ContinentParent table (Table A.17.4)
- 4.
- Workflow Impact
- Selection: Selecting a country (e.g., United States) causes the corresponding state-level tables to be displayed.
- Deselection: Unchecking a country (e.g., Canada) invokes the LocationResetService, recursively nullifying child data (states, counties, etc.).
- 5.
- State Machine Context
- Current State: S1 (Level_2_Processing_Validating_Resolving) (Table A.14.1)
- TLE Structure: Processes Child Level 4 under Grandparent Level 2 (ContinentParent table)
- Transition: Advances to S2 (Level_3_Processing_Validating_Resolving) via PBFD3 after validation
Appendix A.17.4. State Level (Child Level 5, Grandparent Level 3)
- 1.
- Hierarchical Structure
| Child LocationId | ChildId | Child Node | Parent Node (Columns) | Grandparent Node (Table) |
|---|---|---|---|---|
| 38 | 11 | Virginia | United States | North America |
| 45 | 18 | Maryland | United States | North America |
| 77 | 0 | Ontario | Canada | North America |
| 89 | 12 | Nunavut | Canada | North America |
| PersonId | United States | Canada |
|---|---|---|
| 1 | 264192 | 4097 |
- 2.
- Key Workflow
- Grandparent Tables: Each grandparent table (e.g., North America in this sample) corresponds to a continent selected at the Country Level (Table A.17.4).
- Parent Columns: Columns in the grandparent table (e.g., "United States" in North America) represent selected countries.
- Child Bitmasks: Bitmasks in parent columns encode selected states (e.g., 264,192 for Virginia + Maryland in the United States in Table A.17.6)
- 3.
- Interpretation (Derived from Table A.17.6 and Figure A.17.3)

-
Parent Column (United States):
- ○
- Bitmask Value: 264,192 (binary 01000000100000000000 (20-bit format))
- ○
-
Set Bits:
- ▪
- Bit 11: Virginia (Node 5.2 in Figure A.14.1)
- ▪
- Bit 18: Maryland (Node 5.1 in Figure A.14.1)
-
Parent Column (Canada):
- ○
- Bitmask Value: 4,097 (binary 0001000000000001 (16-bit format))
- ○
- Set Bits:
- ▪
- Bit 0: Ontario (Node 5.4 in Figure A.14.1)
- ▪
- Bit 12: Nunavut (Node 5.3 in Figure A.14.1)
- The same LocationViewModel renders checked states (e.g., Maryland, Nunavut) across all grandparent tables (e.g., North America, Europe), as shown in Figure A.17.3.
- Selected states are stored as bitmasks in the North America table (Table A.17.6), with columns representing parent countries.
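Reading a stored value back into bit positions can be sketched as follows. The function name `set_bits` is hypothetical; the ChildId-to-name mappings are taken from the hierarchical structure table for this level.

```python
def set_bits(mask: int) -> list:
    """Return the zero-based positions of all set bits, lowest first."""
    return [bit for bit in range(mask.bit_length()) if (mask >> bit) & 1]


US_STATES = {11: "Virginia", 18: "Maryland"}
CANADA_PROVINCES = {0: "Ontario", 12: "Nunavut"}

print(set_bits(264192))                          # [11, 18]
print([US_STATES[b] for b in set_bits(264192)])  # ['Virginia', 'Maryland']
print([CANADA_PROVINCES[b] for b in set_bits(4097)])  # ['Ontario', 'Nunavut']
print(format(264192, "020b"))                    # 01000000100000000000
```

Decoding is the inverse of encoding: each recovered position equals a ChildId, so the names can be looked up without joining child tables.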
- 4.
- Technical Note
- 5.
- Workflow Impact
- Selection: Choosing a state (e.g., Maryland) causes the corresponding county-level tables and user interfaces to be displayed.
- Deselection: Unchecking a state (e.g., Virginia) invokes the LocationResetService, recursively nullifying child data (counties, cities).
- 6.
- State Machine Context
- Current State: S2 (Level_3_Processing_Validating_Resolving) (Table A.14.1)
- TLE Structure: Processes Child Level 5 under Grandparent Level 3 (e.g. [North America] table)
-
Transition:
- ○
- On success: Advances to S3 (Level_4_Processing_Validating_Resolving) via PBFD4
- ○
- On failure: Transitions to S5 (Refine_Level1-3) (Table A.14.1) via PBFD6
Appendix A.17.5. County Level (Child Level 6, Grandparent Level 4)
- 1.
- Hierarchical Structure
| Child LocationId | ChildId | Child Node | Parent Node (Columns) | Grandparent Node (Table) |
|---|---|---|---|---|
| 92 | 2 | Baltimore County | Maryland | United States |
| 102 | 12 | Howard County | Maryland | United States |
| 120 | 6 | Arlington County | Virginia | United States |
| 186 | 28 | Fairfax County | Virginia | United States |
| PersonId | Virginia | Maryland |
|---|---|---|
| 1 | 268435520 | 4100 |

- 2.
- Key Workflow
-
Decimal Value: 268,435,520
- ○
- Binary Value: 00010000000000000000000001000000 (32-bit format)
- ○
-
Bit Positions Set:
- ▪
- Bit 6: Arlington County (Node 6.3 in Figure A.14.1)
- ▪
- Bit 28: Fairfax County (Node 6.4 in Figure A.14.1)
- UI: Both counties (Arlington and Fairfax) appear as checked checkboxes in Figure A.17.4.
-
Decimal Value: 4,100
- ○
- Binary Value: 0001000000000100 (16-bit format)
- ○
-
Bit Positions Set:
- ▪
- Bit 2: Baltimore County (ChildId = 2, Node 6.1 in Figure A.14.1)
- ▪
- Bit 12: Howard County (ChildId = 12, Node 6.2 in Figure A.14.1)
- UI: Both Baltimore County and Howard County appear as checked checkboxes in Figure A.17.4.
- 4.
- Technical Note
- 5.
- Workflow Impact
- Current State: S3 (Level_4_Processing_Validating_Resolving) (Table A.14.1)
- TLE Structure: Processes Child Level 6 embedded in Grandparent Level 4 (e.g. [United States] table)
- Transition: Advances to S4 (Level_5_Processing_Validating) via PBFD5
Appendix A.17.6. City Level (Child Level 7, Grandparent Level 5)
- 1.
- Hierarchical Structure
| Child LocationId | ChildId | Child Node | Parent Node (Columns) | Grandparent Node (Table) |
|---|---|---|---|---|
| 138 | 0 | Arbutus | Baltimore County | Maryland |
| 139 | 1 | Catonsville | Baltimore County | Maryland |
| 146 | 0 | Columbia MD | Howard County | Maryland |
| 147 | 1 | Ellicott City | Howard County | Maryland |
| 149 | 3 | Laurel | Howard County | Maryland |
| 156 | 0 | Arlington | Arlington County | Virginia |
| 164 | 8 | Virginia Square | Arlington County | Virginia |
| PersonId | Baltimore County | Howard County |
|---|---|---|
| 1 | 3 | 3 |
| PersonId | Arlington County | Fairfax County |
|---|---|---|
| 1 | 257 | 0 |

- 2.
- Key Workflow
- Binary: 00000011 (8-bit format)
-
Set Bits:
- ○
- Bit 0: Columbia MD (Node 7.3 in Figure A.14.1)
- ○
- Bit 1: Ellicott City (Node 7.4 in Figure A.14.1)
- UI: Both cities are checked in Figure A.17.5.
- Binary: 00000011 (8-bit format)
-
Set Bits:
- ○
- Bit 0: Arbutus (Node 7.1 in Figure A.14.1)
- ○
- Bit 1: Catonsville (Node 7.2 in Figure A.14.1)
- UI: Both cities are checked in Figure A.17.5.
- Binary: 100000001 (9-bit format)
-
Set Bits:
- ○
- Bit 0: Arlington (Node 7.5 in Figure A.14.1)
- ○
- Bit 8: Virginia Square (Node 7.6 in Figure A.14.1)
- UI: Both cities are checked in Figure A.17.5.
- Binary: 00000000 (8-bit format)
- Interpretation: No cities selected
- UI: All cities under Fairfax County are unselected and not shown in Figure A.17.5.
- Selected cities are stored as bitmasks in State Level tables (e.g., Maryland, Virginia) under county columns (Tables A.17.10 and A.17.11).
- 4.
- Workflow Impact
- Selection: Selected cities are encoded as bitmasks within their respective parent county columns (e.g., Columbia MD, stored in the Howard County column).
- Deselection: Unchecking a city (e.g., Virginia Square) updates the bitmask and nullifies its data.
- 5.
- State Machine Context
- Current State: S4 (Level_5_Processing_Validating) (Table A.14.1)
- TLE Structure: Processes Child Level 7 embedded in Grandparent Level 5 (e.g., Maryland table)
- Transition: Advances to S6 (Finalize_All) via PBFD9
Appendix A.17.7. The Report Page
-
Caching Mechanism:
- ○
- Metadata Cache: Preloads table/column names (e.g., ContinentGrandparent, North America)
- ○
- Data Cache: Stores hierarchical data (e.g., continent-country mappings)
- Recursive CTE Engine: Constructs hierarchical paths using SQL Common Table Expressions
- Bitwise Decoder: Resolves selected nodes from stored bitmasks (e.g., Continent = 21 → North America + Europe + Asia)
-
Queue Initialization:
- ○
- Starts from the root node (ContinentGrandparent, Node 1 in Figure A.14.1) and processes checked nodes breadth-first
-
TLE Rule Traversal:
- ○
- Grandparent: Active table (e.g., ContinentGrandparent)
- ○
- Parent: Columns representing child nodes of grandparents (e.g., North America)
- ○
- Child: Bitmasks encoding grandchild node selections (e.g., United States and Canada under North America)
-
Path Generation:
- ○
- Uses recursive CTEs to build paths (e.g., Continent → North America → United States)
- Aggregation: Combines visited paths into a unified report (Figure A.17.6)
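The traversal described above can be approximated in memory. The sketch below replaces the recursive CTE engine with a plain breadth-first queue in Python, so it illustrates the traversal order and bitmask decoding rather than the MVP's SQL. The dictionaries hold the sample values from this appendix; treating the structural ContinentParent node as always selected under the root (mask 1) is an assumption.

```python
from collections import deque

# node -> {ChildId: child name}, from the sample tables
CHILDREN = {
    "ContinentGrandparent": {0: "ContinentParent"},
    "ContinentParent": {0: "North America", 2: "Europe", 4: "Asia"},
    "North America": {0: "United States", 1: "Canada"},
    "Europe": {0: "United Kingdom", 1: "France"},
    "Asia": {0: "China", 1: "India"},
    "United States": {11: "Virginia", 18: "Maryland"},
    "Canada": {0: "Ontario", 12: "Nunavut"},
    "Virginia": {6: "Arlington County", 28: "Fairfax County"},
    "Maryland": {2: "Baltimore County", 12: "Howard County"},
    "Baltimore County": {0: "Arbutus", 1: "Catonsville"},
    "Howard County": {0: "Columbia MD", 1: "Ellicott City", 3: "Laurel"},
    "Arlington County": {0: "Arlington", 8: "Virginia Square"},
}

# column name -> stored bitmask over that node's children
SELECTIONS = {
    "ContinentGrandparent": 1,  # assumption: structural node always selected
    "ContinentParent": 21,
    "North America": 3, "Europe": 3, "Asia": 0,
    "United States": 264192, "Canada": 4097,
    "Virginia": 268435520, "Maryland": 4100,
    "Baltimore County": 3, "Howard County": 3,
    "Arlington County": 257, "Fairfax County": 0,
}


def report_paths(root: str) -> list:
    """Breadth-first walk over checked nodes; one path per deepest selection."""
    paths, queue = [], deque([[root]])
    while queue:
        path = queue.popleft()
        node = path[-1]
        mask = SELECTIONS.get(node, 0)
        picked = [name for bit, name in sorted(CHILDREN.get(node, {}).items())
                  if (mask >> bit) & 1]
        if not picked:
            paths.append(" -> ".join(path))  # nothing selected beneath node
        for name in picked:
            queue.append(path + [name])
    return paths


for line in report_paths("ContinentGrandparent"):
    print(line)
```

Unselected nodes such as Laurel never enter the queue, so the cost of the walk depends on the number of checked nodes rather than on the size of the full hierarchy.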

Appendix A.17.8. Development with CDD
- 1.
- Refactoring Journey
-
Initial Approach:
- ○
- Redundant Components: Each level (ContinentGrandparent, ContinentParent, and Continent) had dedicated models, views, and controllers.
- ○
- Bottleneck: Code duplication increased maintenance costs at the Continent Level (grandparent Level 3 in Figure A.14.1).
-
Realization of Shared Logic:
- ○
- Hierarchical Symmetry: Identified recurring patterns (TLE Rule) across levels
- ○
-
Refactoring:
- ▪
- Shared Models: LocationViewModel, LocationSaveService
- ▪
- Unified View: Dynamic UI rendering based on JSON configuration
- ▪
- Centralized Controller: LocationController handling all levels
-
Impact:
- ○
- Workflow Alignment: Aligns UI-centric child-level workflows with the database's grandparent table hierarchy. Curve a (See Figure A.14.1) depicts this mapping: As UI focus shifts from child data at Level 5 (e.g., States) up to Level 3 (e.g., Continents), the corresponding database operations target grandparent tables from Level 3 (e.g., the Continent table) up to Level 1 (e.g., the ContinentGrandparent table).
- 2.
- State Machine Context
- Termination Assurance
  - Per-level refinement limit: refinement_attempts[j] ≤ Rₘₐₓ = 50 (see Appendix A.14.3)
  - Error enforcement:
    - PBFD6: Level 1-3 failure after 50 attempts
    - PBFD9: Finalization failure
- State Machine Conformance
  - TLE state mappings:
    - Continent: S0 → Grandparent Level 1
    - City: S4 → Grandparent Level 5
  - Refinement triggers:
    - Shared component refactoring: PBFD6 → S5 (see Table A.14.2)
- Parameter Invariance
  - Root-cause level: Jᵢ = 1 (Grandparent Level)
  - Refinement scope:
    - Rᵢ = i - Jᵢ + 1 (Appendix A.14.3)
    - Example: Level 3 failure → Rᵢ = 3 (Levels 1-3)
- 3.
- Complexity Bounds (see Table A.17.12)
| Metric | PBFD Value | Reference |
|---|---|---|
| Hierarchy Depth (L) | 5 | Table A.14.4 |
| States (⎥Q⎥) | 9 | Table A.14.1 |
| Transitions (⎥δ⎥) | 10 | Table A.14.2 |
| Max Attempts Recorded | 1 (<< Rₘₐₓ=50) | Appendix A.17.8 |
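
The refinement-scope rule Rᵢ = i - Jᵢ + 1 and the per-level limit Rₘₐₓ = 50 stated above can be expressed as a short Python sketch. The function names `refinement_scope` and `refine`, and the `attempts` dictionary, are hypothetical names chosen for illustration; only the formula and the limit of 50 come from the text.

```python
R_MAX = 50  # per-level refinement limit stated in the text


def refinement_scope(i, j):
    """Levels re-processed when level i fails and the root cause is level j."""
    if not 1 <= j <= i:
        raise ValueError("root-cause level must satisfy 1 <= j <= i")
    return i - j + 1


def refine(level, attempt_fix, attempts):
    """Retry attempt_fix until it succeeds or the per-level limit is reached.

    attempts maps a level to the attempts used so far. Returns True on
    success and False once R_MAX attempts have been spent, at which point
    a caller would take the failure transition.
    """
    while attempts.get(level, 0) < R_MAX:
        attempts[level] = attempts.get(level, 0) + 1
        if attempt_fix():
            return True
    return False


# Level 3 failure with root cause at Level 1 covers Levels 1-3.
scope = refinement_scope(3, 1)

# A fix that never succeeds stops after exactly R_MAX attempts.
attempts = {}
gave_up = not refine(3, lambda: False, attempts)
```

Because the counter only increases and the loop exits at `R_MAX`, each level's refinement loop terminates regardless of how `attempt_fix` behaves.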
- 4.
- Key Advantage
Appendix A.17.9. Backtracking to Complete the Application
- Country Level Completion
  - Existing Parents: Added missing countries under continents (e.g., Japan under Asia)
  - Validation: Verified bitmask updates in the ContinentParent table (e.g., Asia’s bitmask expanded to include Japan)
- State Level Expansion
  - Existing Parents: Added missing states under countries (e.g., Kanto under Japan)
  - Testing: Confirmed state bitmasks in the Asia table (e.g., Japan’s Kanto = 1)
- County/City Integration
  - Existing Parents: Added counties under states (e.g., Tokyo Metropolis under Kanto) and cities under counties (e.g., Tokyo City)
  - Regression Testing: Ensured no conflicts with existing data (e.g., Maryland’s counties unaffected)
- Current State: S6 (Finalize_All) (Table A.14.1)
- TLE Structure: Processes Child Levels 3-7 embedded in Grandparent Levels 1-5
- Transition: Finalizes processing, entering completion phase (S7) via PBFD10
- Failure Handling: Exceeding Rₘₐₓ = 50 refinement attempts in S5 transitions to S8 (Validation_Failure), terminating the workflow
- Hierarchical Integrity: Maintains the TLE Rule (e.g., Asia → Japan → Kanto)
- Testing:
  - Bitwise Validation: Ensures new additions (e.g., Japan) do not corrupt existing selections (e.g., China)
  - UI Consistency: Confirms new nodes appear in workflows (Figure A.14.1)
- Hierarchical Flexibility: The TLE Rule allows seamless addition of nodes at any level.
- Efficiency: Leveraging similarities between neighboring nodes (e.g., Maryland/Virginia counties) reduces redundant coding.
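
The bitwise validation step described above (adding Japan must not disturb an existing selection such as China) amounts to checking that every previously set bit is still set after the update. A minimal Python sketch follows; the bit positions `CHINA_BIT` and `JAPAN_BIT` and the helper names are assumptions for illustration, since the source does not list the actual bit assignments.

```python
CHINA_BIT = 0  # hypothetical bit position
JAPAN_BIT = 1  # hypothetical bit position


def add_selection(mask, bit):
    """Set one bit with OR; all previously set bits are left unchanged."""
    return mask | (1 << bit)


def preserves_existing(old_mask, new_mask):
    """True if every bit set in old_mask is still set in new_mask."""
    return (old_mask & new_mask) == old_mask


asia_before = add_selection(0, CHINA_BIT)           # only China selected
asia_after = add_selection(asia_before, JAPAN_BIT)  # Japan added

ok = preserves_existing(asia_before, asia_after)
```

An update that used assignment instead of OR (for example, overwriting the mask with `1 << JAPAN_BIT`) would fail this check, which is the kind of corruption the regression test is meant to catch.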
Appendix A.18. Comparative Analysis of PDFD and PBFD MVP Implementations
Appendix A.18.1. Foundational Similarities
- Hierarchical Data Modeling: Both approaches structure information using explicit parent–child relationships (e.g., Continent → Country → State). At a finer granularity, nodes are modeled as individual units in a directed graph, supporting localized validation and dependency tracking.
- Component-Driven Architecture: Modular MVC components (views, models, and controllers) promote reusability and maintenance across hierarchical levels.
- User Interaction Workflows: Dynamic forms and multi-level selection UIs are driven by back-end traversal logic.
- Hybrid Methodology Integration: Both leverage elements of DFD, BFD, and CDD to enable top-down progression, subtree resolution, and refinement cycles.
Appendix A.18.2. Key Differences in Methodological Strategy
| Aspect | PDFD | PBFD |
|---|---|---|
| Core Approach | Hybrid Depth-First: Vertical slice traversal with concurrent processing of same-level nodes | Hybrid Breadth-First: Pattern-grouped traversal with selective vertical descent |
| Key Strategy | Sequential subtrees with bounded vertical depth | Pattern compaction and horizontal aggregation using TLE and bitmasks |
| Key Technology | Feature-based selective traversal (e.g., BF-by-Two) | Bitmask encoding and Three-Level Encapsulation (TLE) |
Appendix A.18.3. Graph Traversal Workflow
| Aspect | PDFD | PBFD |
|---|---|---|
| Node Selection | Feature-selected nodes per level | Pattern-based node groups |
| Progression | Vertical-first traversal | Horizontal-first compaction followed by vertical descent |
| Refinement Scope | Narrow, vertical chains | Broad pattern groups spanning multiple levels via TLE |
Appendix A.18.4. Pilot Tunneling Strategies
| Aspect | PDFD | PBFD |
|---|---|---|
| Tunneling Analogy | Small pilot tunnel → feature-driven scaling | Large pilot tunnel → pattern-driven scaling |
| Focus | Vertical validation with minimal breadth | Horizontal breadth with controlled depth |
| Efficiency Driver | Early risk detection | Early structural optimization via TLE patterns |
| Scale | Suitable for small to mid-sized systems | Designed for enterprise-grade and distributed systems |
Appendix A.18.5. Development Workflow
| Aspect | PDFD | PBFD |
|---|---|---|
| Core Workflow Pattern | Depth-first exploration with subtree completion | Breadth-first pattern grouping followed by selective descent |
| Branching Strategy | Narrow branching (few nodes per level) | Wide branching across three-level spans (grandparent–child) |
| CDD Iterations | Higher (3 iterations during refinement) | Lower (pre-optimized structure reduces iteration count to 1) |
Appendix A.18.6. Database Architecture
| Aspect | PDFD | PBFD |
|---|---|---|
| Lookup Table | Multiple normalized tables with foreign key relationships | Single adjacency-list table (e.g., Locations table in Table A.14.3) |
| Base Table | Per-level normalized relational tables | Per-grandparent dynamic tables using TLE |
| Query Complexity | JOIN-heavy SQL queries | Bitwise queries within denormalized bitmask tables |
Appendix A.18.7. Data Storage Models
| Aspect | PDFD | PBFD |
|---|---|---|
| Data Model | Row-based (1 record per selected node) | Bitmask-based (1 row encodes multiple selections) |
| Storage Efficiency | Higher overhead due to repeated foreign keys | Compact, bit-level efficiency |
| Scalability | Limited by relational constraints and locking | Optimized for horizontal scaling and parallel operations |
Appendix A.18.8. Relational Table Structures
| Aspect | PDFD | PBFD |
|---|---|---|
| Schema Design | Dedicated table per hierarchical level | Per-grandparent table generated dynamically via TLE |
| Scalability | Constrained by row growth and indexing | Scales through distributed grandparent tables |
| Join Complexity | Multi-table joins for full traversal | Joins only between grandparent tables and the global Person table |
Appendix A.18.9. MVC Architecture
| Aspect | PDFD | PBFD |
|---|---|---|
| Model | Static models per level (e.g., CountryModel, StateModel) | Unified dynamic view model (LocationViewModel) derived from metadata |
| View | Level-specific Razor views | Shared Razor view for all hierarchical levels |
| Controller | Multiple specialized controllers | Single reusable controller (e.g., LocationController) |
Appendix A.18.10. Performance & Scalability
| Aspect | PDFD | PBFD |
|---|---|---|
| Query Speed | Slower due to multi-join queries (O(n)) | Faster using in-place bitwise operations (O(1)) |
| Write Efficiency | Multiple-row inserts/updates (O(n) rows for n selected nodes) | Single-row bitmask updates (O(1) rows) |
| Storage Footprint | Higher due to normalized rows | Lower due to compact binary encoding |
| Distributed Support | Challenging due to ACID across tables | Optimized for horizontal sharding via table-level separation |
Appendix A.18.11. Comparative Strengths and Tradeoffs
| Approach | Strengths | Limitations |
|---|---|---|
| PDFD | Intuitive for traditional developers; simpler debugging workflows | Inefficient for large-scale graphs; high storage/query costs |
| PBFD | High performance and scalability; optimized for modern cloud systems | Higher implementation complexity; limited mainstream tooling support |
Appendix A.18.12. Example Workflows
PDFD (vertical-first traversal):
- Level 1: Continents → North America, Asia
- Level 2: Countries → USA, Canada
- Level 3: States → Maryland, Virginia
PBFD (pattern compaction with bitmasks):
- Level 3: Compact all continents into bitmasks (e.g., `00010101` for North America, Asia, Europe)
- Level 4: Compact countries under selected continents (e.g., North America = `00000011` for USA + Canada)
- Level 5: Compact states under selected countries (e.g., USA = 264,192 in decimal, i.e., 2^11 + 2^18, for Maryland + Virginia)
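
The bitmask values quoted in the example workflow can be reproduced with a small encoder. The sketch below is illustrative only: the helper names `encode` and `set_bits` are hypothetical, and the source does not say which bit belongs to which continent, country, or state, so the positions are inferred purely from the quoted values.

```python
def encode(bits):
    """Pack a collection of bit positions into one integer bitmask."""
    mask = 0
    for b in bits:
        mask |= 1 << b
    return mask


def set_bits(mask):
    """Return the positions of the set bits, lowest first."""
    return [i for i in range(mask.bit_length()) if (mask >> i) & 1]


continents = encode([0, 2, 4])  # three selected continents
countries = encode([0, 1])      # USA + Canada under North America
states = encode([11, 18])       # Maryland + Virginia under USA

continents_text = format(continents, "08b")
countries_text = format(countries, "08b")
```

Each level stores one integer per parent regardless of how many children are selected, which is the compaction the workflow refers to; `set_bits(264192)` recovers the two selected state positions.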
Appendix A.18.13. Methodology Suitability Guidelines
- Use PDFD for small-to-medium systems with limited depth, or where team familiarity and debugging clarity are essential
- Use PBFD for complex, deeply nested systems requiring performance, compact storage, and horizontal scalability
Appendix A.19. Real-World Structural Workflow Mermaid Code
Appendix A.20. Observational Case Study on Development Effort
Appendix A.20.1. Methodological Context and Related Work
- Unit of Comparison: Development methodology (PBFD vs. relational vs. OmniScript)
- Evaluation Focus: Person-month effort, calendar duration, scope completeness
- Controlled Variables: Shared enterprise context, comparable functional requirements, consistent audit logging
- Independent Variable: Implementation methodology and platform
- Study Type: Longitudinal observational case study with embedded effort estimation
Appendix A.20.2. Project Characteristics Overview
| Implementation | Methodology/Platform | Team Size | Time Required (Calendar Months) | Year | Scope Delivered |
|---|---|---|---|---|---|
| Effort A (PBFD Enterprise) | PBFD, bitmask, TLE | 1 primary developer | 1 (Jun–Jul) | 2016 | Full System (Production) |
| Effort B (Relational Port) | Traditional relational schema (SQL Server) | 2 part-time developers (0.35 & 0.15 FTE) | 9 | 2021–2022 | DB schema and data migration (No UI/Middleware) |
| Effort C (Salesforce) | Salesforce OmniScript | 7 developers | 24 | 2022–2024 | UI + logic (undeployed) |
- For Effort A: The "1 primary developer" refers to the PBFD inventor. Two auxiliary developers contributed non-overlapping, sequential efforts (including code development, validation, and training) spanning approximately one to two weeks. The primary developer estimated that replicating this auxiliary work would have required only 1-2 additional days. Because this effort was minimal, non-overlapping, and not part of the core PBFD development activity, it is excluded from the primary metrics. That the principal developer was also the methodology's inventor is a critical threat to validity and a known confound in productivity studies [147,148]; it limits any definitive causal inference about the methodology alone.
- For Effort B: The same individual who was the primary developer for Effort A contributed 0.35 FTE to Effort B.
- For Effort C: A team of 7 developers with varying engagement: 2 core developers (each at ~0.3 FTE) and 5 nominal developers (contributors with assigned roles but limited sustained effort, at ~0.05 FTE each), totaling an estimated 20.4 FTE-months over 24 calendar months. This effort remained incomplete and undeployed, which makes direct quantitative comparison challenging; it is included to illustrate platform-specific development challenges and to provide context for comparative effort estimation.
Appendix A.20.3. Scope of Delivered Functionality
- Hierarchical question flow (up to 8 hierarchical levels)
- Conditional branching logic with enable/disable rules
- Diverse input types: checkboxes, multi-select dropdowns, text fields
- Real-time validation and navigation
- Secure submission pipeline with persistence and audit logging
- Storage Optimization
| Key Aspect | Effort A (PBFD) | Effort B (Relational Port) | Effort C (Salesforce OmniScript) |
|---|---|---|---|
| End-to-End Claim Form | ✅ Production | ❌ (DB schema only, no UI/middleware) | ⚠️ Incomplete |
| Full UI/UX Integration | ✅ Production | ❌ (UI layer not implemented) | ⚠️ Incomplete |
| Question Hierarchy Support (8 levels) | ✅ (Native PBFD bitmasking) | ✅ (via complex SQL JOINs) | ⚠️ Incomplete |
| Dynamic Flow + Conditionals | ✅ Production | ✅ (Logic in DB) | ⚠️ Incomplete |
| Storage Optimization | ✅ (bitmask encoding) | ❌ (normalized schema, higher redundancy) | ❌ (Platform-managed) |
| Deployment Readiness | ✅ (in production since 2016) | ❌ (no front-end, not deployable) | ⚠️ In progress (not deployed) |
Appendix A.20.4. Observed Efficiency Comparison
| Comparison | Observed Ratio (Calculation) | Context and Justification |
|---|---|---|
| PBFD vs. Relational Port (A vs. B) | ~9× ((4.5 FTE-months × 2) / 1 FTE-month) | Full-stack system (A: 1 FTE-month) vs. backend-only implementation (B: 4.5 FTE-months, i.e., (0.35 + 0.15) FTE × 9 calendar months). A 2× multiplier was applied to Effort B's database effort to estimate the missing UI/middleware effort. This multiplier is derived from organizational historical data for projects of similar logic complexity and aligns with conservative expert judgment in software project estimation [149]. The result is an estimated total of ~9 FTE-months for a full relational stack. |
| PBFD vs. OmniScript (A vs. C) | ~20× (20.4 FTE-months / 1 FTE-month) | Full-stack system (A: 1 FTE-month) vs. incomplete UI + logic (C: ≥20.4 estimated FTE-months, i.e., (2 × 0.3 + 5 × 0.05) FTE × 24 calendar months). The credibility of this FTE-month estimate is supported by its close alignment with the 24-month calendar timeline (see Appendix A.20.2). Because Effort C remained incomplete, the ratio upon completion would likely be higher. This comparison is primarily illustrative of the platform-specific challenges encountered. |
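
The effort ratios in the table follow from the staffing figures in Appendix A.20.2. The Python sketch below reproduces the arithmetic with exact fractions; the helper name `fte_months` is hypothetical, and the 2× multiplier is the estimate stated in the table, not a measured quantity.

```python
from fractions import Fraction


def fte_months(allocations, calendar_months):
    """Sum of FTE allocations multiplied by the calendar duration."""
    return sum(Fraction(a) for a in allocations) * calendar_months


effort_a = Fraction(1)  # 1 primary developer for 1 calendar month

# Effort B: two part-time developers for 9 months, database layer only.
effort_b_db = fte_months(["0.35", "0.15"], 9)
effort_b_full = effort_b_db * 2  # 2x multiplier for missing UI/middleware

# Effort C: 2 core developers at ~0.3 FTE and 5 nominal at ~0.05 FTE, 24 months.
effort_c = fte_months(["0.3", "0.3"] + ["0.05"] * 5, 24)

ratio_a_vs_b = effort_b_full / effort_a
ratio_a_vs_c = effort_c / effort_a
```

Using `Fraction` avoids floating-point rounding, so the intermediate values match the table exactly: 4.5 and 9 FTE-months for Effort B, and 20.4 FTE-months for Effort C.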
Appendix A.20.5. Summary of Threats to Validity
- Developer Expertise Variation: While all implementations were led by expert developers, skill levels and methodology familiarity vary across individuals. Development of both PBFD and the relational baseline was led by the methodology’s inventor, while OmniScript implementations were carried out by other expert developers, some of whom possessed decades of development experience.
- OmniScript Incomplete Implementation: The OmniScript comparison measures effort at an incomplete state, while PBFD reached full production deployment. This introduces scope normalization challenges.
- Same-Developer Learning Asymmetry (PBFD vs. Relational): The same developer led both implementations. That developer brought 25+ years of relational database expertise to the relational port but was still learning PBFD while inventing it, an expertise asymmetry that favors the relational approach.
- Temporal Span: Implementations span 2016–2024, introducing potential confounds from evolving tools and practices.
- Method Inventorship: The inventor of PBFD/PDFD led the PBFD implementation, which may introduce bias toward more efficient realization of the methodology. This threat is mitigated by the conservative biases described above.
Appendix A.21. A Longitudinal Performance Evaluation of PBFD Versus Traditional Relational Approaches
Appendix A.21.1. Methodology
- Hardware & OS: Identical CPU, memory, storage, and Windows Server instance.
- Database Server: Shared SQL Server instance with identical configuration, buffer pools, and query execution resources.
- Network: No inter-module latency; all communication occurred over the same internal path.
- Load & Time: Both modules operated concurrently under the same production traffic and infrastructure conditions, though workload characteristics varied by controller and logic path.
- PBFD operations: A scoped, read-optimized workload, identified in the audit log as ControllerName = 'MainController' AND ActionName NOT IN ('UpdateX','DeleteX','SaveX'). These operations typically involve multi-level hierarchical navigation and complex pattern matching.
- Traditional operations: A heterogeneous mix of CRUD operations, reporting queries, and business logic processing across approximately 11 controllers. While not functionally identical to PBFD’s read-optimized scope, this aggregate baseline reflects the realistic complexity of enterprise systems against which PBFD must perform.
- P5 (5th percentile): Infrastructure/middleware floor
- P50 (median): Typical user experience
- P95 (95th percentile): Tail latency, critical for scalability
- Average (mean): Reported for completeness but interpreted with caution due to skew
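
The classification rule and percentile metrics above can be illustrated with a short Python sketch. `ControllerName`, `ActionName`, and the excluded action names come from the text; the `DurationMs` field, the sample events, and the nearest-rank percentile definition are assumptions made for the example, since the original query and its percentile method are not shown here.

```python
WRITE_ACTIONS = {"UpdateX", "DeleteX", "SaveX"}


def is_pbfd(event):
    """Apply the audit-log filter that scopes PBFD operations."""
    return (event["ControllerName"] == "MainController"
            and event["ActionName"] not in WRITE_ACTIONS)


def percentile(values, p):
    """Nearest-rank percentile for an integer p in 1..100."""
    ordered = sorted(values)
    rank = max(1, (p * len(ordered) + 99) // 100)  # integer ceil(p*n/100)
    return ordered[rank - 1]


# Hypothetical audit-log rows; DurationMs is an assumed column name.
events = [
    {"ControllerName": "MainController", "ActionName": "Index", "DurationMs": 47},
    {"ControllerName": "MainController", "ActionName": "SaveX", "DurationMs": 210},
    {"ControllerName": "ReportController", "ActionName": "Run", "DurationMs": 359},
]
pbfd_ms = [e["DurationMs"] for e in events if is_pbfd(e)]
trad_ms = [e["DurationMs"] for e in events if not is_pbfd(e)]

sample = list(range(1, 21))  # 20 synthetic latencies, 1..20 ms
p5, p50, p95 = (percentile(sample, p) for p in (5, 50, 95))
```

Reporting P5, P50, and P95 rather than the mean alone keeps a handful of very slow requests from dominating the summary, which is why the mean is interpreted with caution.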
Appendix A.21.2. Experimental Environment
| Component | Specification |
|---|---|
| Application Framework | ASP.NET MVC on .NET Framework 4.8 |
| Web Server | IIS 10.0 on Windows Server 2016 Std. |
| Database Server | Microsoft SQL Server 2016 |
| Web Server CPU | Quad-Core, 2.6 GHz (Model 55) |
| Database Server CPU | 8-Core, 2.6 GHz (Model 55) |
| Web Server RAM | 16 GB |
| Database Server RAM | 99 GB |
| Network | vmxnet3 Ethernet Adapter (~4 Gb/s) |
| Storage | SSD-backed (RAID configuration) |
Appendix A.21.3. SQL Query
Appendix A.21.4. Results
| Metric (ms) | P5 | P50 | P95 | Average |
|---|---|---|---|---|
| PBFD | 16 | 47 | 406 | 118.46 |
| Traditional | 16 | 359 | 3469 | 881.49 |
| Ratio (Trad/PBFD, unitless) | 1.00 | 7.64 | 8.54 | 7.44 |
- A ratio of 1.0 at P5 indicates that both methodologies hit the same infrastructural latency floor, which suggests that the differences at higher percentiles arise from application- and database-level processing rather than from the shared infrastructure.
- The consistency of the performance ratios across P50, P95, and the mean, together with the large sample size (46+ million events), provides strong evidence for the observed performance differences. Formal statistical testing was not performed because the data cover the complete population.
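
The ratio row of the results table can be recomputed directly from the two latency rows. The short Python check below uses only the values reported in the table; the dictionary layout and variable names are illustrative.

```python
results_ms = {
    "PBFD": {"P5": 16, "P50": 47, "P95": 406, "Average": 118.46},
    "Traditional": {"P5": 16, "P50": 359, "P95": 3469, "Average": 881.49},
}

# Ratio of traditional latency to PBFD latency, rounded as in the table.
ratios = {
    metric: round(results_ms["Traditional"][metric] / results_ms["PBFD"][metric], 2)
    for metric in results_ms["PBFD"]
}
```

The recomputed values agree with the reported ratios of 1, 7.64, 8.54, and 7.44.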
Appendix A.21.5. Key Findings
- Median Performance (P50): PBFD's median latency was 7.64× lower than that of the traditional aggregate (47 ms vs. 359 ms), improving responsiveness for typical operations.
- Tail Latency (P95): PBFD's P95 latency was 8.54× lower (406 ms vs. 3,469 ms), indicating better behavior under load. In deeply-nested architectures, high tail latencies can cascade and become the dominant factor in overall user-perceived performance, making their mitigation a critical engineering goal [152].
- Average Latency: PBFD's mean latency was 7.44× lower (118.46 ms vs. 881.49 ms), consistent with the percentile results.
- Performance Floor (P5): Both shared a 16 ms lower bound, reflecting a common infrastructure/middleware baseline.
- Effect Size: The 7–8× performance improvement represents a large effect size by conventional standards in software performance evaluation, particularly notable given that both systems operated under identical environmental constraints.
Appendix A.21.6. Threats to Validity
- Construct Validity (Workload heterogeneity): The traditional baseline encompassed ~11 controllers with diverse workloads, not all directly comparable to PBFD’s read-optimized scope. This heterogeneity—which includes simpler operations alongside complex ones—may understate PBFD’s efficiency but provides a realistic enterprise baseline. Reported ratios should be interpreted as conservative lower-bound estimates.
- Internal Validity (Implementation factors): While infrastructure was controlled, minor differences in query patterns or transient load conditions may exist. The long (8-year) observation window helps mitigate transient effects. Furthermore, the use of percentiles over means reduces the impact of outlier events on the overall results [150,151].
- External Validity (Generalizability): Results stem from a single large-scale enterprise deployment. While ecologically valid [97], replication in other environments is necessary to establish generalizability.
Appendix A.21.7. Conclusion
Appendix A.22. A Comparative Analysis of Storage Efficiency: PBFD vs. Traditional Relational Deployment
Appendix A.22.1. Methodology
- Unit of Comparison: Two alternative schema architectures instantiated over the same dataset:
  - Traditional 3NF (multi-table, join-based)
  - PBFD/TLE (wide-form, bitmask-encoded, minimal table count)
- Evaluation Focus:
  - Structural reduction (tables, rows, junctions, indexing strategy)
  - Physical storage usage (reserved space, index size, unused space, row volume)
- Controlled Variables:
  - Same DBMS
  - Same hardware and configuration
  - Same source dataset used for schema population
  - Same total record volume mapped according to each schema’s structure
- Independent Variable: Schema design paradigm (join-centric 3NF vs. compact PBFD/TLE)
- Data Source Handling: The dataset is identical in origin, but table counts and row distributions differ due to schema architecture (e.g., 4.7M rows normalized vs. 170K rows in PBFD per Table A.22.2)
- Study Type: Controlled schema-level experiment focused on structural and storage efficiency
| Feature | Traditional 3NF | PBFD |
|---|---|---|
| Core Transactional Tables | 6 | 2 (Wide-form, bitmask-encoded) |
| Explicit Junction Tables | 7 | 0 |
| Indexing Strategy | Per-entity and per-relationship (join-focused) | Minimal (payload- and query-focused) |
- Complex hierarchical structures (8-level nested claims).
- Dynamic validation and conditional branching logic.
- Comprehensive, timestamped audit logging and versioning.
- Tool: sp_spaceused executed via sp_msforeachtable across all user-defined tables [153]
- Timing: Immediately after scheduled index maintenance to standardize fragmentation
- Scope: User-defined tables and indexes only; system metadata excluded
- Dataset: 8 years of production data (Traditional: 4.7M rows across all tables; PBFD: 170K rows in core tables).
Appendix A.22.2. Results
| Metric | Traditional | PBFD | Ratio (Trad/PBFD) |
|---|---|---|---|
| Core Tables | 6 | 2 | 3.0× |
| Total Rows | 4.7M | 170K | 27.6× |
| Reserved Space (KB) | 658,768 | 56,168 | 11.7× |
| Index Size (KB) | 37,040 | 432 | 85.7× |
| Unused Space (KB) | 5,448 | 48 | 113.5× |
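
The storage ratios in the table can be recomputed from the reported sizes, and the same figures also give the share of reserved space that is unused in each schema. The Python sketch below uses only values from the table; the helper names `ratio` and `unused_share` are hypothetical.

```python
storage_kb = {
    "Traditional": {"reserved": 658_768, "index": 37_040, "unused": 5_448},
    "PBFD": {"reserved": 56_168, "index": 432, "unused": 48},
}


def ratio(metric):
    """Traditional-to-PBFD ratio for one storage metric."""
    return storage_kb["Traditional"][metric] / storage_kb["PBFD"][metric]


def unused_share(schema):
    """Fraction of reserved space that is allocated but unused."""
    sizes = storage_kb[schema]
    return sizes["unused"] / sizes["reserved"]


reported = {m: round(ratio(m), 1) for m in ("reserved", "index", "unused")}
shares = {s: unused_share(s) for s in storage_kb}
```

The ratios match the reported 11.7×, 85.7×, and 113.5×. In both schemas the unused share is below 1% of reserved space, so the 113.5× figure describes a reduction in the absolute amount of unused space rather than a 113.5-fold change in utilization.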
Appendix A.22.3. Key Findings
- Structural Simplification: PBFD’s schema required 3× fewer core tables and eliminated all 7 junction tables, drastically simplifying the data model and query execution paths.
- Storage Efficiency: PBFD achieved an 11.7× reduction in reserved space, an 85.7× reduction in index size, and a 113.5× reduction in unused (allocated but empty) space.
- Operational Performance Linkage: The drastic reduction in row count and index size directly lowers I/O pressure and improves buffer pool cache locality. This optimized data footprint complements bitmask encoding as a key contributor to the 7–8× faster query performance documented in Appendix A.21, as query processing involves scanning fewer data pages.
- Methodological Traceability: This experiment isolates schema structure as the independent variable, aligning with the controlled design dimensions in Table 55.
- Formal Integration: PBFD’s schema design is consistent with the TLE model in Section 4.2, linking empirical outcomes to theoretical guarantees.
Appendix A.22.4. Threats to Validity
- Construct Validity: Metrics focus exclusively on user data storage. System metadata is excluded. Lookup tables are omitted from comparison ratios due to their optional role in downstream functionality and inconsistent presence across implementations.
- Internal Validity: Traditional schema may include legacy optimizations. Post-maintenance measurements minimize index fragmentation bias.
- External Validity: The results are most directly applicable to systems managing complex hierarchical data. The efficiency gains for flat, transactional data may differ. Furthermore, the absolute savings are influenced by SQL Server’s storage engine (e.g., 8KB page size), though the relative gains are expected to hold across relational platforms.
Appendix A.22.5. Conclusion
References
- Skillcrush. 8 Full-Stack Development Trends to Look Out for in 2025. Skillcrush 2025. Available online: https://skillcrush.com/blog/full-stack-developer-trends/ (accessed on 15 May 2025).
- GeeksforGeeks. Top 10 Full Stack Development Trends in 2025. GeeksforGeeks 2025. Available online: https://www.geeksforgeeks.org/blogs/full-stack-development-trends/ (accessed on 15 May 2025).
- IBM. IBM Full Stack Software Developer Professional Certificate. Coursera 2024. Available online: https://www.coursera.org/professional-certificates/ibm-full-stack-cloud-developer (accessed on 15 May 2025).
- Talent500. Full Stack Developer Roadmap 2025: Skills & Guide. Talent500 2025. Available online: https://talent500.com/blog/full-stack-developer-roadmap-2025 (accessed on 15 May 2025).
- Stack Overflow. Developer Survey 2025. Stack Overflow 2025. Available online: https://survey.stackoverflow.co/2025 (accessed on 15 May 2025).
- Beck, K.; Beedle, M.; van Bennekum, A.; Cockburn, A.; Cunningham, W.; Fowler, M.; et al. Manifesto for Agile Software Development. Agile Alliance 2001. Available online: https://agilemanifesto.org (accessed on 15 May 2025).
- Tsilionis, K.; Ishchenko, V.; Wautelet, Y.; Simonofski, A. Scaling Agility in Large Software Development Projects: A Systematic Literature Review. In Research and Innovation Forum 2023; Springer Proceedings in Complexity; Visvizi, A., Troisi, O., Corvello, V., Eds.; Springer: Cham, 2024; pp. 1–15.
- Santos, P.d.; de Carvalho, M.M. Exploring the challenges and benefits for scaling agile project management to large projects: a review. Requir. Eng. 2022, 27, 117–134.
- Stojanovic, Z.; Dahanayake, A.; Sol, H.G. Modeling and Architectural Design in Agile Development Methodologies. In Proceedings of the 8th CAISE/IFIP8.1 International Workshop on Evaluation Methods in System Analysis and Design; Velden, M., Ed.; 2003; pp. 180–189.
- Mognon, F.; Stadzisz, P.C. Modeling in Agile Software Development: A Systematic Literature Review. In Agile Methods; Silva da Silva, T., Estácio, B., Kroll, J., Mantovani Fontana, R., Eds.; Communications in Computer and Information Science, Vol. 680; Springer: Cham, 2017; pp. 1–15.
- Northwood, C. The Full Stack Developer: Your Essential Guide to the Everyday Skills Expected of a Modern Full Stack Web Developer; Apress: New York, 2018.
- Zammetti, F. Modern Full-Stack Development: Using TypeScript, React, Node.js, Webpack, Python, Django, and Docker; Apress: New York, 2022.
- Mkaouer, W.; Kessentini, M.; Sahraoui, H.; Bechikh, S.; Deb, K. Many-objective software remodularization using NSGA-III. ACM Trans. Softw. Eng. Method. 2015, 24, 1–45.
- Recker, J. Opportunities and constraints: the current struggle with BPMN. Bus. Process Manag. J. 2010, 16, 181–201.
- Kandogan, E.; Kraska, T.; Li, F.; Wu, E. Orchestrating Agents and Data for Enterprise: A Blueprint Architecture for Compound AI. In Proceedings of the 2025 IEEE 41st International Conference on Data Engineering Workshops; IEEE: New York, 2025; pp. 18–27.
- Liu, D. Primary Breadth-First Development (PBFD): An Approach to Full Stack Software Development. arXiv 2025, arXiv:2501.10624.
- Liu, D. PBFD and PDFD: Formally Defined and Verified Methodologies and Empirical Evaluation for Scalable Full-Stack Software Engineering; Zenodo, 2025.
- Besker, T.; Martini, A.; Bosch, J. Software developer productivity loss due to technical debt. J. Syst. Softw. 2019, 156, 41–61.
- Perera, J.; Tempero, E.; Tu, Y.-C.; Blincoe, K. A systematic mapping study exploring quantification approaches to code, design, and architecture technical debt. ACM Trans. Softw. Eng. Method. 2024, 1, 1–35.
- Kretschmer, R.; Khelladi, D.E.; Lopez-Herrejon, R.E.; Egyed, A. Consistent change propagation within models. Softw. Syst. Model. 2021, 20, 539–555.
- Tkalich, A.; Klotins, E.; Moe, N.B. Identifying critical dependencies in large-scale continuous software engineering. In Proceedings of the 29th International Conference on Evaluation and Assessment in Software Engineering; ACM: New York, 2025; pp. 157–168.
- Behutiye, W.N.; Rodriguez, P.; Oivo, M.; Tosun, A. Analyzing the concept of technical debt in the context of agile software development: A systematic literature review. Inf. Softw. Technol. 2017, 82, 139–158.
- Arulraj, A.; Pavlo, A.; Menon, V. Bridging the Archipelago between Row-Stores and Column-Stores for Hybrid Workloads. In Proceedings of the 2016 ACM SIGMOD International Conference on Management of Data; ACM: New York, 2016; pp. 583–598.
- Meyer, A.N.; Fritz, T.; Murphy, G.C.; Zimmermann, T. Software developers' perceptions of productivity. In Proceedings of the 22nd ACM SIGSOFT International Symposium on Foundations of Software Engineering; ACM: New York, 2014; pp. 19–29.
- Etikyala, S.P.; Etikyala, V. Efficiency in Cloud-Enabled Asynchronous Services: Analysis of Workflow Orchestrators. In Proceedings of the World Congress on Computer and Information Technology; WCCIT; 2023.
- University of Oxford. FDR Documentation. University of Oxford 2025. Available online: https://cocotec.io/fdr/manual/ (accessed on 15 May 2025).
- Gibson-Robinson, T.; Armstrong, P.; Boulgakov, A.; Roscoe, A.W. FDR3 — A Modern Refinement Checker for CSP. In Tools and Algorithms for the Construction and Analysis of Systems; Ábrahám, E., Havelund, K., Eds.; Lecture Notes in Computer Science, Vol. 8413; Springer: Berlin, 2014; pp. 1-15.
- Liu, D. PDFD-MVP. GitHub 2025. Available online: https://github.com/IBM-Consulting-Formal-Methods/PDFD-MVP (accessed on 15 May 2025).
- Liu, D. PBFD-MVP. GitHub 2025. Available online: https://github.com/IBM-Consulting-Formal-Methods/PBFD-MVP (accessed on 15 May 2025).
- Lenarduzzi, V.; Taibi, D. MVP Explained: A Systematic Mapping Study on the Definitions of Minimal Viable Product. In Proceedings of the 2016 42th Euromicro Conference on Software Engineering and Advanced Applications; IEEE: New York, 2016; pp. 112–119. [Google Scholar]
- Evans, E. Domain-Driven Design: Tackling Complexity in the Heart of Software; Addison-Wesley: Boston, 2003. [Google Scholar]
- Brandolini, A. Introducing EventStorming: An Act of Deliberate Collective Learning; Leanpub: Victoria, BC, Canada, 2025. [Google Scholar]
- Vernon, V. Domain-Driven Design Distilled; Addison-Wesley: Boston, 2016. [Google Scholar]
- Ihirwe, F.; Di Ruscio, D.; Mazzini, S.; Pierini, P.; Pierantonio, A. Low-code engineering for Internet of Things: A state of research. In Proceedings of the 23rd ACM/IEEE International Conference on Model Driven Engineering Languages and Systems: Companion Proceedings; ACM: New York, 2020; pp. 1–8. [Google Scholar]
- Sahay, A.; Indamutsa, A.; Di Ruscio, D.; Pierantonio, A. Supporting the understanding and comparison of low-code development platforms. In Proceedings of the 2020 46th Euromicro Conference on Software Engineering and Advanced Applications; IEEE: New York, 2020; pp. 171–178. [Google Scholar]
- Goguen, J.A.; Burstall, R.M. Introducing institutions. In Proceedings of the Carnegie Mellon Workshop on Logic of Programs; Springer: New York, 1984; pp. 221–256. [Google Scholar]
- Spivey, J.M. The Z Notation: A Reference Manual; Prentice Hall: New York, 1992. [Google Scholar]
- Jackson, D. Software Abstractions: Logic, Language, and Analysis; MIT Press: Cambridge, 2016. [Google Scholar]
- Woodcock, J.; Larsen, P.G.; Bicarregui, J.; Fitzgerald, J. Formal methods: Practice and experience. ACM Comput. Surv. 2009, 41, 1–36. [Google Scholar] [CrossRef]
- Chechik, M.; Combemale, B.; Gray, J.; et al. Formal methods in the scope of the Software and Systems Modeling journal. Softw. Syst. Model. 2025, 24, 271–272. [Google Scholar] [CrossRef]
- Schmidt, D.C. Model-driven engineering. Computer 2006, 39, 25–31. [Google Scholar] [CrossRef]
- France, R.; Rumpe, B. Model-driven development of complex software: A research roadmap. In 2007 Future of Software Engineering; IEEE: New York, 2007; pp. 37–54. [Google Scholar]
- Brambilla, M.; Cabot, J.; Wimmer, M. Model-Driven Software Engineering in Practice, Second Edition; Morgan & Claypool: San Rafael, 2017. [Google Scholar]
- Hutchinson, J.; Rouncefield, M.; Whittle, J. Model-driven engineering practices in industry. In Proceedings of the 2011 33rd International Conference on Software Engineering; ACM: New York, 2011; pp. 633–642. [Google Scholar]
- Hoare, C.A.R. Communicating Sequential Processes; Prentice Hall: New York, 1985. [Google Scholar]
- Clarke, E.M.; Grumberg, O.; Peled, D.A. Model Checking; MIT Press: Cambridge, 1999. [Google Scholar]
- Hopcroft, J.E.; Ullman, J.D. Introduction to Automata Theory, Languages, and Computation; Addison-Wesley: Boston, 1979. [Google Scholar]
- Peterson, J.L. Petri Net Theory and the Modeling of Systems; Prentice Hall: New York, 1981. [Google Scholar]
- Zimmermann, T.; Weissgerber, P.; Diehl, S.; Zeller, A. Mining version histories to guide software changes. IEEE Trans. Softw. Eng. 2005, 31, 429–445. [Google Scholar] [CrossRef]
- McIntosh, S.; Kamei, Y.; Adams, B.; Hassan, A.E. An empirical study of the impact of modern code review practices on software quality. Empir. Softw. Eng. 2016, 21, 2146–2189. [Google Scholar] [CrossRef]
- Abadi, D.J.; Madden, S.R.; Ferreira, M. Integrating compression and execution in column-oriented database systems. In Proceedings of the 2006 ACM SIGMOD International Conference on Management of Data; ACM: New York, 2006; pp. 671–682. [Google Scholar]
- Elmasri, R.; Navathe, S. Fundamentals of Database Systems, 7th Edition; Pearson: New York, 2016. [Google Scholar]
- Stonebraker, M.; et al. C-Store: A column-oriented DBMS. In Proceedings of the 31st International Conference on Very Large Data Bases; VLDB, 2005; pp. 553–564. [Google Scholar]
- Garcia-Molina, H.; Ullman, J.D.; Widom, J. Database Systems: The Complete Book, 2nd Edition; Pearson: New York, 2008. [Google Scholar]
- Abadi, D.J.; Boncz, P.A.; et al. Column-Stores vs. Row-Stores: How Different Are They Really? In Proceedings of the 2008 ACM SIGMOD International Conference on Management of Data; ACM: New York, 2008; pp. 967–980. [Google Scholar]
- van der Aalst, W.M.P. The application of Petri nets to workflow management. J. Circuits Syst. Comput. 1998, 8, 21–66. [Google Scholar] [CrossRef]
- Milner, R. Communicating and Mobile Systems: The π-Calculus; Cambridge University Press: Cambridge, 1999. [Google Scholar]
- Liskov, B.; Zilles, S. Specification techniques for data abstractions. ACM SIGPLAN Notices 1975, 10, 72–87. [Google Scholar] [CrossRef]
- Harel, D. Statecharts: a visual formalism for complex systems. Sci. Comput. Program. 1987, 8, 231–274. [Google Scholar] [CrossRef]
- Pnueli, A. The temporal logic of programs. In Proceedings of the 18th Annual Symposium on Foundations of Computer Science; IEEE: New York, 1977; pp. 46–57. [Google Scholar]
- Dijkstra, E.W. A Discipline of Programming; Prentice-Hall: New York, 1976. [Google Scholar]
- Cormen, T.H.; Leiserson, C.E.; Rivest, R.L.; Stein, C. Introduction to Algorithms, 4th Edition; MIT Press: Cambridge, 2022. [Google Scholar]
- Knuth, D.E. The Art of Computer Programming, Volume 1: Fundamental Algorithms, 3rd Edition; Addison-Wesley: Boston, 1997.
- Moore, E.F. The shortest path through a maze. In Proceedings of an International Symposium on the Theory of Switching; Harvard University Press: Cambridge, 1959; pp. 285–292. [Google Scholar]
- Bass, L.; Clements, P.; Kazman, R. Software Architecture in Practice, 3rd Edition; Addison-Wesley: Boston, 2012. [Google Scholar]
- Poppendieck, M.; Poppendieck, T. Lean Software Development: An Agile Toolkit; Addison-Wesley: Boston, 2003. [Google Scholar]
- Jones, C. Software Methodologies: A Quantitative Guide; Auerbach Publications: New York, 2018. [Google Scholar]
- Edison, H.; Wang, X.; Conboy, K. Comparing Methods for Large-Scale Agile Software Development: A Systematic Literature Review. IEEE Trans. Softw. Eng. 2022, 48, 2709–2731. [Google Scholar] [CrossRef]
- Verdecchia, R.; Kruchten, P.; Lago, P. Architectural Technical Debt: A Grounded Theory. In Software Architecture; Springer: Cham, 2020; pp. 202–219. [Google Scholar]
- Curran, G.M.; Bauer, M.; Mittman, B.; Pyne, J.M.; Stetler, C. Effectiveness-Implementation Hybrid Designs: Combining Elements of Clinical Effectiveness and Implementation Research to Enhance Public Health Impact. Med. Care 2012, 50, 217–226. [Google Scholar] [CrossRef]
- Holzmann, G. The SPIN Model Checker: Primer and Reference Manual; Addison-Wesley: Boston, 2004. [Google Scholar]
- McCreesh, C.; Prosser, P. The shape of the search tree for the maximum clique problem and the implications for parallel branch and bound. ACM Trans. Parallel Comput. 2015, 2, 1–27. [Google Scholar] [CrossRef]
- Womack, J.P.; Jones, D.T. Lean Thinking: Banish Waste and Create Wealth in Your Corporation; Free Press: New York, 2003. [Google Scholar]
- Larman, C.; Basili, V.R. Iterative and Incremental Development: A Brief History. Computer 2003, 36, 47–56. [Google Scholar] [CrossRef]
- van der Aalst, W. Process Mining: Data Science in Action; Springer: Berlin, 2016. [Google Scholar]
- Derrick, J.; Boiten, E. Refinement: Semantics, Languages and Applications; Springer: Cham, 2018. [Google Scholar]
- Wiratunga, N.; Craw, S. Incorporating Backtracking in Knowledge Refinement. In Validation and Verification of Knowledge Based Systems; Springer: Boston, 1999; pp. 1–15. [Google Scholar]
- Boehm, B.W. A spiral model of software development and enhancement. Computer 1988, 21, 61–72. [Google Scholar] [CrossRef]
- Gamma, E.; Helm, R.; Johnson, R.; Vlissides, J. Design patterns: Elements of reusable object-oriented software; Addison-Wesley: Boston, 1994. [Google Scholar]
- Parnas, D.L. On the Criteria To Be Used in Decomposing Systems into Modules. Commun. ACM 1972, 15, 1053–1058. [Google Scholar] [CrossRef]
- Yourdon, E.; Constantine, L.L. Structured Design: Fundamentals of a Discipline of Computer Program and System Design; Prentice Hall: New York, 1979. [Google Scholar]
- Ruijters, E.; Stoelinga, M. Fault tree analysis: A survey of the state-of-the-art in modeling, analysis and tools. Comput. Sci. Rev. 2015, 15, 29–62. [Google Scholar] [CrossRef]
- Boehm, B.; Turner, R. Using risk to balance agile and plan-driven methods. Computer 2003, 36, 57–66. [Google Scholar]
- Clements, P.; Bachmann, F.; Bass, L.; Garlan, D.; Ivers, J.; Little, R.; Merson, P.; Nord, R.; Wood, B. Documenting Software Architectures: Views and Beyond, 2nd Edition; Addison-Wesley: Boston, 2010. [Google Scholar]
- Martin, R.C. Clean Architecture: A Craftsman's Guide to Software Structure and Design; Prentice Hall: New York, 2017. [Google Scholar]
- Lehman, M.M. Programs, life cycles, and laws of software evolution. Proc. IEEE 1980, 68, 1060–1076. [Google Scholar] [CrossRef]
- ISO/IEC/IEEE 12207:2017; Systems and software engineering — Software life cycle processes. International Organization for Standardization 2017. (accessed on 15 May 2025).
- Lamport, L. The Temporal Logic of Actions (TLA). ACM Trans. Program. Lang. Syst. 1994, 16, 872–923. [Google Scholar] [CrossRef]
- Feathers, M.C. Working Effectively with Legacy Code; Prentice Hall: New York, 2004. [Google Scholar]
- Sadalage, P.J.; Fowler, M. NoSQL Distilled: A Brief Guide to the Emerging World of Polyglot Persistence; Addison-Wesley: Boston, 2012. [Google Scholar]
- Silberschatz, A.; Korth, H.F.; Sudarshan, S. Database System Concepts, 7th Edition; McGraw-Hill: New York, 2019. [Google Scholar]
- Novotný, P.; Wild, J. Relational modeling of hierarchical data in biodiversity databases. Database 2024, 2024, baae107. [Google Scholar] [CrossRef]
- Selinger, P.G.; Astrahan, M.M.; Chamberlin, D.D.; Lorie, R.A.; Price, T.G. Access Path Selection in a Relational Database Management System. In Proceedings of the 1979 ACM SIGMOD International Conference on Management of Data; ACM: New York, 1979; pp. 23–34. [Google Scholar]
- Knuth, D.E. Bitwise Tricks & Techniques. In The Art of Computer Programming, Volume 4A: Combinatorial Algorithms, Part 1; Addison-Wesley: Boston, 2011; pp. 1–62.
- Warren, H.S., Jr. Hacker's Delight, 2nd Edition; Addison-Wesley: Boston, 2013. [Google Scholar]
- Angles, R.; Gutierrez, C. Survey of graph database models. ACM Comput. Surv. 2008, 40, 1:1–1:39. [Google Scholar] [CrossRef]
- Runeson, P.; Höst, M. Guidelines for conducting and reporting case study research in software engineering. Empir. Softw. Eng. 2009, 14, 131–164. [Google Scholar] [CrossRef]
- Kitchenham, B.A.; Charters, S. Guidelines for performing systematic literature reviews in software engineering. Keele University Technical Report; 2007; p. EBSE-2007-01. [Google Scholar]
- Basili, V.R.; Rombach, H.D. The TAME project: towards improvement-oriented software environments. IEEE Trans. Softw. Eng. 1988, 14, 758–773. [Google Scholar] [CrossRef]
- Sittig, D.F.; Singh, H. Design and Evaluation of a Structured Incident Reporting System for Healthcare. Int. J. Med. Inform. 2013, 82, 1188–1195. [Google Scholar]
- Knuth, D.E. The Art of Computer Programming, Vol. 1: Fundamental Algorithms, 3rd Edition; Addison-Wesley: Boston, 1997.
- Cover, T.M.; Thomas, J.A. Elements of Information Theory, 2nd Edition; Wiley: New York, 2006. [Google Scholar]
- Solingen, R.; Basili, V.; Caldiera, G.; Rombach, H.D. The Goal Question Metric Approach. In Encyclopedia of Software Engineering; John Wiley & Sons: New York, 2002. [Google Scholar]
- Easterbrook, S.; Singer, J.; Storey, M.A.; Damian, D. Selecting Empirical Methods for Software Engineering Research. In Guide to Advanced Empirical Software Engineering; Shull, F., Singer, J., Sjøberg, D.I.K., Eds.; Springer: London, 2008; pp. 1–25. [Google Scholar]
- Kitchenham, B.; Pfleeger, S.L.; Pickard, L.M.; Jones, P.W.; Hoaglin, D.C.; El Emam, K.; Rosenberg, J. Preliminary guidelines for empirical research in software engineering. IEEE Trans. Softw. Eng. 2002, 28, 721–734. [Google Scholar] [CrossRef]
- Shadish, W.R.; Cook, T.D.; Campbell, D.T. Experimental and Quasi-Experimental Designs for Generalized Causal Inference; Cengage Learning: Boston, 2002. [Google Scholar]
- Wohlin, C.; Runeson, P.; Höst, M.; Ohlsson, M.C.; Regnell, B.; Wesslén, A. Experimentation in Software Engineering; Springer: Berlin, 2012. [Google Scholar]
- LaToza, T.D.; Myers, B.A. Hard-to-answer questions about code. In Evaluation and Usability of Programming Languages and Tools; ACM: New York, 2010; pp. 1–8. [Google Scholar]
- Stonebraker, M. SQL databases v. NoSQL databases. Commun. ACM 2010, 53, 10–11. [Google Scholar] [CrossRef]
- Beck, K. Extreme Programming Explained: Embrace Change, 2nd Edition; Addison-Wesley: Boston, 2004. [Google Scholar]
- Sommerville, I. Software Engineering, 10th Edition; Pearson: New York, 2015. [Google Scholar]
- Pressman, R.S.; Maxim, B.R. Software Engineering: A Practitioner's Approach, 9th Edition; McGraw-Hill: New York, 2019. [Google Scholar]
- Robinson, I.; Webber, J. Graph Databases, 2nd Edition; O'Reilly: Sebastopol, 2015. [Google Scholar]
- Florescu, D.; Kossmann, D. Storing and Querying XML Data Using an RDMBS. IEEE Data Eng. Bull. 1999, 22, 27–34. [Google Scholar]
- Wu, K.; Otoo, E.J.; Shoshani, A. Using Bitmap Indexing Technology for Combined Numerical and Text Queries. LBNL-59254; LBNL Technical Report. 2006. [Google Scholar]
- Roscoe, A.W. Understanding Concurrent Systems; Springer: London, 2010. [Google Scholar]
- Emerson, E.A. Temporal and modal logic. In Handbook of Theoretical Computer Science; Formal Models and Semantics; Elsevier: Amsterdam, 1990; Vol. B, pp. 995–1072. [Google Scholar]
- Elmasri, R.; Navathe, S.B. Fundamentals of Database Systems, 7th Edition; Pearson: New York, 2015. [Google Scholar]
- Jackson, M. Problem Frames: Analysing and Structuring Software Development Problems; Addison-Wesley: Boston, 2001. [Google Scholar]
- Rumpe, B. Modeling with UML: Language, Concepts, Methods; Springer: Berlin, 2016. [Google Scholar]
- Stahl, T.; Voelter, M. Model-Driven Software Development: Technology, Engineering, Management; Wiley: New York, 2006. [Google Scholar]
- Fitzgerald, B.; Stol, K.-J. Continuous software engineering: A roadmap and agenda. J. Syst. Softw. 2017, 123, 176–189. [Google Scholar] [CrossRef]
- Leite, L.; Rocha, C.; Kon, F.; Milojicic, D.; Meirelles, P. A survey of DevOps concepts and challenges. ACM Comput. Surv. 2020, 52, 1–35. [Google Scholar] [CrossRef]
- Podelski, A.; Rybalchenko, A. A Complete Method for the Synthesis of Linear Ranking Functions. In Verification, Model Checking, and Abstract Interpretation; Lecture Notes in Computer Science; Steffen, B., Levi, G., Eds.; Springer: Berlin, 2004; Vol. 2937, pp. 239–251. [Google Scholar]
- Bradley, C.; Manna, Z.; Sipma, H. Linear Ranking with Reachability. In Computer Aided Verification; Lecture Notes in Computer Science; Etessami, K., Rajamani, S.K., Eds.; Springer: Berlin, 2005; Vol. 3576, pp. 491–504. [Google Scholar]
- Colón, M.A.; Sipma, H.B. Synthesis of Linear Ranking Functions. In Tools and Algorithms for the Construction and Analysis of Systems; Lecture Notes in Computer Science; Margaria, T., Yi, W., Eds.; Springer: Berlin, 2001; Vol. 2031, pp. 1–15. [Google Scholar]
- Cook, B.; Podelski, A.; Rybalchenko, A. Termination Proofs for Systems Code. ACM SIGPLAN Notices 2006, 41, 415–426. [Google Scholar] [CrossRef]
- Larraz, D.; Oliveras, A.; Rodríguez-Carbonell, E.; Rubio, A. Proving termination of imperative programs using Max-SMT. In Proceedings of the 2013 Formal Methods in Computer-Aided Design; IEEE: New York, 2013; pp. 218–225. [Google Scholar]
- Chatterjee, K.; Goharshady, E.K.; Novotný, P.; Zárevúcky, J.; Žikelić, Đ. On Lexicographic Proof Rules for Probabilistic Termination. In Formal Methods; Lecture Notes in Computer Science; Huisman, M., Păsăreanu, C., Zhan, N., Eds.; Springer: Cham, 2021; Vol. 13047, pp. 1–20. [Google Scholar]
- Roscoe, A.W. The Theory and Practice of Concurrency; Prentice-Hall: New York, 2005. [Google Scholar]
- Vardi, M.Y. The Complexity of Relational Query Languages. In Proceedings of the 14th ACM SIGACT Symposium on Theory of Computing; ACM: New York, 1982; pp. 137–146. [Google Scholar]
- Celko, J. Joe Celko's Trees and Hierarchies in SQL for Smarties, 2nd Edition; Morgan Kaufmann: Burlington, 2012. [Google Scholar]
- Tropashko, V. Nested Intervals Tree Encoding in SQL. ACM SIGMOD Rec. 2006, 35, 47–52. [Google Scholar] [CrossRef]
- Hellerstein, J.M.; Stonebraker, M.; Hamilton, J. Architecture of a Database System. Found. Trends Databases 2007, 1, 141–259. [Google Scholar] [CrossRef]
- Knebl, H. Algorithms and Data Structures: Foundations and Probabilistic Methods for Design and Analysis; Springer: Cham, 2020. [Google Scholar]
- Date, C.J. Database Design and Relational Theory: Normal Forms and All That Jazz, 2nd Edition; Apress: New York, 2019. [Google Scholar]
- Griffiths, P.P.; Wade, B.W. An Authorization Mechanism for a Relational Database System. Commun. ACM 1976, 19, 429–437. [Google Scholar] [CrossRef]
- Abadi, D.J. Query execution in column-oriented database systems. PhD Dissertation, Massachusetts Institute of Technology, Cambridge, MA, 2006. [Google Scholar]
- Neumann, T. Efficiently compiling efficient query plans for modern hardware. Proc. VLDB Endow. 2011, 4, 539–550. [Google Scholar] [CrossRef]
- Bauer, C.; King, G. Java Persistence with Hibernate; Manning Publications: New York, 2006. [Google Scholar]
- Verbitski, A.; Gupta, A.; Saha, D.; Brahmadesam, M.; Gupta, K.; Mittal, R.; et al. Amazon Aurora: Design considerations for high throughput cloud-native relational databases. In Proceedings of the ACM SIGMOD International Conference on Management of Data; ACM: New York, 2017; pp. 1041–1052. [Google Scholar]
- Fowler, M. Patterns of Enterprise Application Architecture; Addison-Wesley: Boston, 2002. [Google Scholar]
- Kuesel, T.R.; King, E.H.; Bickel, J.O. Tunnel Engineering Handbook, 2nd Edition; Springer: New York, 1996. [Google Scholar]
- Li, S.; Zhang, Y.; Cao, M.; Wang, Z. Study on excavation sequence of pilot tunnels for a rectangular tunnel using numerical simulation and field monitoring method. Rock Mech. Rock Eng. 2022, 55, 3507–3523. [Google Scholar] [CrossRef]
- Basili, V.R. The Role of Controlled Experiments in Software Engineering Research. In Empirical Software Engineering Issues; Lecture Notes in Computer Science; Basili, V.R., Rombach, D., Schneider, K., Kitchenham, B., Pfahl, D., Selby, R.W., Eds.; Springer: Berlin, 2007; Vol. 4336, pp. 1–12. [Google Scholar]
- Sjoberg, D.I.; Hannay, J.E.; Hansen, O.; Kampenes, V.B.; Karahasanovic, A.; Liborg, N.K.; et al. A survey of controlled experiments in software engineering. IEEE Trans. Softw. Eng. 2005, 31, 733–753. [Google Scholar] [CrossRef]
- Sackman, H.; Erikson, W.J.; Grant, E.E. Exploratory experimental studies comparing online and offline programming performance. Commun. ACM 1968, 11, 3–11. [Google Scholar] [CrossRef]
- Forsgren, N.; Storey, M.A.; Maddila, C.; Zimmermann, T.; Houck, B.; Butler, J. The SPACE of developer productivity. Commun. ACM 2021, 64, 46–53. [Google Scholar] [CrossRef]
- Jørgensen, M.; Shepperd, M. A systematic review of software development cost estimation studies. IEEE Trans. Softw. Eng. 2007, 33, 33–53. [Google Scholar] [CrossRef]
- Jain, R. The Art of Computer Systems Performance Analysis: Techniques for Experimental Design, Measurement, Simulation, and Modeling; Wiley: New York, 1991. [Google Scholar]
- Georges, A.; Buytaert, D.; Eeckhout, L. Statistically Rigorous Java Performance Evaluation. In Proceedings of the 22nd annual ACM SIGPLAN conference on Object-oriented programming systems and applications; ACM: New York, 2007; pp. 57–76. [Google Scholar]
- Dean, J.; Barroso, L.A. The Tail at Scale. Commun. ACM 2013, 56, 74–80. [Google Scholar] [CrossRef]
- Microsoft Docs. sp_spaceused (Transact-SQL). Microsoft 2024. Available online: https://learn.microsoft.com/en-us/sql/relational-databases/system-stored-procedures/sp-spaceused-transact-sql (accessed on 15 May 2025).
















| Research Area | Typical Limitations in Prior Work | PBFD/PDFD Contributions |
|---|---|---|
| Domain-Driven Design & Collaborative Modeling [31,32] | Heuristic, non-executable, lacks formal consistency guarantees | Formal semantics with executable workflow rules; ensures verifiable consistency |
| Formal Methods & LTL [39,40,44,48] | High abstraction, steep learning curve, limited integration with practice | Embedded rigor within accessible workflows; verification of temporal properties (liveness, safety, eventual completion) |
| State Machines & Traversal Algorithms [47,48] | Used as auxiliary tools, not primary development drivers | Traversal as a first-class development primitive; enables derivation of correctness properties, rollback safety |
| Model-Driven Engineering [41,42,43,44] | Struggles with evolving requirements, scalability, and industrial adoption | Pragmatic adaptability combined with formal foundation; scales to enterprise systems |
| Low-Code Development Platforms [34,35] | Opaque orchestration, limited extensibility, correctness not guaranteed | Transparent, graph-based orchestration; ensures structural correctness and extensibility |
| Encoded Data Structures, Columnar Encoding, Bitmap Indexes [52,54,55] | Encoding used internally by DBMS for query acceleration; hierarchical relations still require recursive/nested traversal (O(log n)); no formal semantics for hierarchy or correctness | Declarative bitmask-based hierarchical schema (TLE); O(1) lookup/update/traversal; externalizes encoding at schema design level; preserves explicit hierarchical semantics and enables formal verification (CSP/LTL) |
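
The last row of the comparison contrasts recursive traversal with bitmask-encoded hierarchy. As a minimal illustration of why a bitmask gives constant-time membership tests and updates, the Python sketch below stores the selected children of one parent in a single integer. The mapping `CHILD_BITS`, the helper names, and the example hierarchy are hypothetical and chosen only for illustration; this is not the TLE schema itself.

```python
# Hypothetical example: one parent with four children, one bit position each.
CHILD_BITS = {"France": 0, "Germany": 1, "Italy": 2, "Spain": 3}

def select(mask: int, child: str) -> int:
    """Return the mask with the child's bit set (a single OR)."""
    return mask | (1 << CHILD_BITS[child])

def deselect(mask: int, child: str) -> int:
    """Return the mask with the child's bit cleared (a single AND)."""
    return mask & ~(1 << CHILD_BITS[child])

def is_selected(mask: int, child: str) -> bool:
    """Test membership with one shift and one AND, independent of set size."""
    return (mask >> CHILD_BITS[child]) & 1 == 1

def selected_children(mask: int) -> list:
    """Decode the mask back into child names, in bit order."""
    return [name for name, bit in CHILD_BITS.items() if (mask >> bit) & 1]

mask = 0
mask = select(mask, "France")
mask = select(mask, "Italy")
```

Each lookup or update touches one machine word, whereas an adjacency-list representation would need a join or a recursive query to answer the same question.
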
| Symbol | Description |
|---|---|
| G | Directed Acyclic Graph with vertices V and edges E |
| D(v) | Direct dependencies of node v: {u ∣ (u, v) ∈ E} |
| Characteristic | Description |
|---|---|
| Acyclic Enforcement | Ensures that the development dependency graph remains acyclic, preventing circular dependencies and infinite traversal loops |
| Scalability | Supports incremental addition of nodes and edges, provided that the overall graph preserves its acyclic structure |
| State ID | Phase | Description |
|---|---|---|
| S₀ | Initialization | Load DAG G and validate acyclicity |
| S₁ | Node Processing | Process node v ∈ V (e.g., develop component) and enqueue its children |
| S₂ | Dependency Check | Verify the completeness of v's dependencies, D(v) |
| S₃ | Graph Extension | Add new nodes or edges to resolve unmet dependencies while preserving acyclicity |
| T | Termination | Final validation and workflow conclusion |
| Rule ID | Source State | Target State | Condition | Operational Step |
|---|---|---|---|---|
| DA1 | S₀ | S₁ | DAG G is loaded and validated as acyclic. | Initialize processing queue with the root node |
| DA2 | S₁ | S₂ | A node v is dequeued for processing. | Initiate a check for all dependencies D(v) |
| DA3 | S₂ | S₁ | ∀u ∈ D(v): processed(u) (All dependencies are resolved). | Enqueue the children of v for processing |
| DA4 | S₂ | S₃ | ∃u ∈ D(v): ¬processed(u) (An unresolved dependency exists). | Extend the DAG by adding a new node vₙ₊₁ or edge |
| DA5 | S₃ | S₁ | DAG extension is complete and acyclicity is preserved. | Enqueue the new node vₙ₊₁ for processing |
| DA6 | S₁ | T | ∀v ∈ V: processed(v) (All nodes are processed). | Perform final validation and terminate the workflow |
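
Rules DA1 to DA6 can be read as a queue-driven loop. The following Python sketch is one possible reading under stated assumptions, not the authors' implementation: the graph is given as a hypothetical `deps` mapping from each node to D(v), "roots" are taken to be nodes with no in-graph dependencies, and graph extension (DA4/DA5) is simplified to adding a missing dependency as a new node with no dependencies of its own, which cannot introduce a cycle.

```python
from collections import deque

def is_acyclic(deps):
    """Kahn's algorithm over every node mentioned in deps."""
    nodes = set(deps) | {u for d in deps.values() for u in d}
    indegree = {v: len(deps.get(v, ())) for v in nodes}
    ready = deque(v for v in nodes if indegree[v] == 0)
    seen = 0
    while ready:
        u = ready.popleft()
        seen += 1
        for v in nodes:
            if u in deps.get(v, ()):
                indegree[v] -= 1
                if indegree[v] == 0:
                    ready.append(v)
    return seen == len(nodes)

def run_dad(deps):
    """Return the processing order for a dependency mapping node -> D(node)."""
    deps = {v: set(d) for v, d in deps.items()}
    if not is_acyclic(deps):                          # DA1 guard
        raise ValueError("dependency graph contains a cycle")
    processed, order = set(), []
    # DA1: start from nodes that have no in-graph dependencies.
    queue = deque(sorted(v for v, d in deps.items()
                         if not any(u in deps for u in d)))
    while queue:
        v = queue.popleft()                           # DA2: dequeue, check D(v)
        if v in processed:
            continue
        missing = [u for u in sorted(deps[v]) if u not in deps]
        if missing:                                   # DA4: extend the graph
            for u in missing:
                deps[u] = set()                       # new node, no dependencies
                queue.append(u)                       # DA5: enqueue the new node
            queue.append(v)                           # revisit v afterwards
            continue
        if not deps[v] <= processed:
            continue        # v is enqueued again by its last pending dependency
        processed.add(v)
        order.append(v)
        for child in sorted(c for c, d in deps.items() if v in d):
            queue.append(child)                       # DA3: enqueue children
    if processed != set(deps):                        # DA6: all nodes processed
        raise RuntimeError("some nodes were never processed")
    return order

order = run_dad({"db": [], "api": ["db", "auth"], "ui": ["api"]})
```

In this example `auth` is not declared as a node, so the loop adds it when `api` is first dequeued and processes it before `api`.
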
| Property | CSP Assertion | FDR Result | Engineering Significance |
|---|---|---|---|
| Core Safety | DAD :[deadlock free [F]] | ✓ Passed | Ensures no circular dependencies or blocking states during processing |
| Core Liveness | DAD :[divergence free] | ✓ Passed | Confirms absence of infinite loops or τ-cycles in dependency checking |
| Determinism | DAD :[deterministic [F]] | ✓ Passed | Guarantees predictable topological execution order |
| Dequeue-Process Sequencing | DequeueThenProcess [T= DAD_Core] | ✓ Passed | Ensures dequeued nodes are immediately processed (local atomicity, DA2) |
| Process-Validate Sequencing | ProcessThenValidate [T= DAD_Core] | ✓ Passed | Verifies that processing a node triggers dependency validation (DA2 → DA3/DA4) |
| Dependency Completion Logic | DepsProcessedThenGenerate [T= DAD_Core] | ✓ Passed | Enforces children generation only after all dependencies completed (DA3) |
| Child Enqueueing Logic | GenerateThenEnqueue [T= DAD_Core] | ✓ Passed | Ensures generated children are properly scheduled for processing (DA3) |
| Graph Extension Control | MissingDepThenExtend [T= DAD_Core] | ✓ Passed | Triggers DAG extension for missing dependencies while maintaining acyclicity (DA4 & DA5) |
| Final Validation Timing | AllProcessedThenValidate [T= DAD_Core] | ✓ Passed | Confirms final validation occurs after all nodes are processed (DA6) |
| Termination Guarantee | TerminationAllowed [T= DAD_Core] | ✓ Passed | Ensures system can always reach a successful or error termination state |
| Property | Formal Specification | Description |
|---|---|---|
| Acyclicity Invariant | □(∀v ∈ V, ∄ cycle(v₀, ..., vₖ)) | No cycles are introduced during operation. Rule DA4 triggers graph extension, which is implemented by the ExtendGraph function (Appendix A.2.3) to guarantee acyclicity is preserved. |
| Dependency Completeness | □(processed(v) ⇒ ∀u ∈ D(v), processed(u)) | A node is processed only after all its dependencies are processed (Rules DA2, DA3). |
| Liveness of Processing | □(dequeue(v) ⇒ ◊process(v)) | Every dequeued node is eventually processed (Enabled by DA2-DA5 and the acyclicity invariant). |
| Fairness (No Starvation) | □∀v ∈ V, ◊processed(v) | Every node in the graph is eventually processed (Guaranteed by DA6 and the exhaustive traversal semantics). |
| Termination Guarantee | □(start(DAD) ⇒ ◊terminate(DAD)) | The process eventually terminates for any finite DAG (Rule DA6). |
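
The properties above are stated over all traces and are established by model checking. As a complementary, much weaker illustration, the Python sketch below checks two of them on a single finite trace of processing events. The function names and the trace format (a list of node names in processing order) are assumptions made for this example; passing the check for one trace is not a proof of the LTL property.

```python
def check_dependency_completeness(trace, deps):
    """Finite-trace reading of: processed(v) implies all u in D(v) processed."""
    done = set()
    for v in trace:
        if not set(deps.get(v, ())) <= done:
            return False          # v was processed before one of its dependencies
        done.add(v)
    return True

def check_no_starvation(trace, deps):
    """Finite-trace reading of: every node in V is eventually processed."""
    return set(deps) <= set(trace)

deps = {"db": [], "api": ["db"], "ui": ["api"]}
good_trace = ["db", "api", "ui"]
bad_trace = ["api", "db", "ui"]
```
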
| Property | Advantage |
|---|---|
| Cycle Prevention | Eliminates circular dependencies and development deadlocks |
| Dependency Isolation | Changes in one branch are isolated from unrelated branches |
| Incremental Scaling | Supports evolutionary system growth |
| Impact Analysis | Traceable dependency chains aid debugging and planning |
| Symbol | Description |
|---|---|
| Tr | Rooted, finite, acyclic tree structure with nodes V and edges E |
| D(v) | Direct dependencies of node v: { u ∣ (u, v) ∈ E } |
| Cᵢ | The current node being processed in the traversal |
| Bⱼ | A backtrack point (a node on the current path with unvisited siblings) |
| Characteristic | Description |
|---|---|
| Vertical Progression | Prioritizes traversing a single dependency path to its deepest point before exploring other branches |
| Exhaustive Traversal | Ensures all nodes and their subtrees are eventually visited and processed by combining vertical progression and backtracking |
| Backtracking Enablement | Allows returning to a parent node to explore unvisited sibling branches after a path is completed |
| State ID | Phase | Description |
|---|---|---|
| S₀ | Initialization | Load tree Tr and initialize stack with root node |
| S₁ | Vertical Processing | Process current node Cᵢ and push its direct dependencies onto the stack |
| S₂ | Backtracking | Return to a parent node (Bⱼ) after processing a leaf or a completed branch |
| S₃ | Validation | Validate the fully explored subtree rooted at the current backtrack point |
| T | Termination | Final state after all nodes are processed and validated |
| Rule ID | Source State | Target State | Condition | Operational Step |
|---|---|---|---|---|
| DF1 | S₀ | S₁ | Tree Tr is loaded and valid. | Initialize stack with root node C₁ |
| DF2 | S₁ | S₁ | Cᵢ is a non-leaf node. | Process Cᵢ, then push its direct dependencies D(Cᵢ) onto the stack |
| DF3 | S₁ | S₂ | Cᵢ is a leaf node. | Process Cᵢ, then set backtrack point Bⱼ to parent(Cᵢ) |
| DF4 | S₂ | S₁ | Bⱼ has an unprocessed sibling. | Process the next sibling of Bⱼ, push it onto the stack |
| DF5 | S₂ | S₃ | Bⱼ has no unprocessed siblings. | Initiate validation for the subtree rooted at Bⱼ |
| DF6 | S₃ | S₂ | Stack is not empty. | Continue backtracking to the parent of Bⱼ |
| DF7 | S₃ | T | Stack is empty. | Perform final validation and terminate |
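
The stack discipline in rules DF1 to DF7 can be sketched in Python as follows. This is a simplified reading under assumptions, not the authors' pseudocode: the tree is given as a hypothetical `children` mapping, each stack entry carries a flag marking whether its subtree has already been explored, and leaves are treated as trivially validated subtrees.

```python
def run_dfd(children, root):
    """Return (processing order, validation order) for a rooted tree."""
    processed, validated = [], []
    stack = [(root, False)]                 # DF1: initialize stack with the root
    while stack:
        node, explored = stack.pop()
        if explored:
            # DF5: nothing left to process below this node, so validate it.
            validated.append(node)
            continue                        # DF6: keep backtracking
        processed.append(node)              # DF2/DF3: process the current node
        stack.append((node, True))          # come back to validate its subtree
        for child in reversed(children.get(node, [])):
            stack.append((child, False))    # DF2: push direct dependencies
    return processed, validated             # DF7: stack is empty

children = {
    "country": ["stateA", "stateB"],
    "stateA": ["city1", "city2"],
    "stateB": ["city3"],
}
processed, validated = run_dfd(children, "country")
```

Processing follows each path to its deepest node before moving to a sibling, while validation runs bottom-up, so a subtree is validated only after every node inside it has been processed.
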
| Property | CSP Assertion | FDR Result | Engineering Significance |
|---|---|---|---|
| Core Safety | DFD :[deadlock free [F]] | ✓ Passed | Ensures no blocking states occur during subtree processing or backtracking |
| Core Liveness | DFD :[divergence free] | ✓ Passed | Confirms absence of τ-cycles or infinite descent during traversal |
| Determinism | DFD :[deterministic [F]] | ✓ Passed | Guarantees predictable recursion and unambiguous subtree completion |
| Local Processing Safety | DequeueThenProcess [T= DFD_Core] | ✓ Passed | Ensures each dequeued node is immediately processed (DF2 & DF3) |
| Non-Leaf Descent Logic | NonLeafPushesChildren [T= DFD_Core] | ✓ Passed | Enforces DF2: non-leaf nodes must push their children before continuing descent |
| Leaf/Backtrack Initiation | LeafToBacktrack [T= DFD_Core] | ✓ Passed | Enforces DF3: processing a leaf correctly triggers parent-level backtracking |
| Validation Control Flow | ValidationSequence [T= DFD_Core] | ✓ Passed | Ensures validation transitions lead only to backtracking or termination (DF5–DF7) |
| Termination Reachability | TerminationAllowed [T= DFD_Core] | ✓ Passed | Confirms the system can always reach the final successful state |
| Property | Formal Specification | Description |
|---|---|---|
| Single Path Completion | □∀P = (C₀, ..., Cᴸ) ∈ G: (processed(Cᴸ) ⇒ ∀Cⱼ ∈ P, processed(Cⱼ)) | A path is processed completely before moving to siblings (Rules DF2, DF3). |
| Subtree Validation Completeness | □(validated(Bⱼ) ⇒ ∀Cₖ ∈ Subtree(Bⱼ), validated(Cₖ)) | A subtree is only validated after all nodes within it are processed (Rules DF5, DF6). |
| Liveness (No Starvation) | ∀v ∈ V, ◊processed(v) | Every node is eventually processed (Rules DF4, DF6). |
| Termination Guarantee | □(start(DFD) ⇒ ◊terminate(DFD)) | The process eventually terminates for any finite tree (Rule DF7). |
| Property | Advantage |
|---|---|
| Early Validation | Foundational logic (e.g., country → state → city) is validated early. |
| Modular Testing | Bugs are isolated within narrow vertical paths. |
| Incremental Scaling | New nodes or branches can be integrated without restructuring validated paths. |
| Symbol | Description |
|---|---|
| Q | Global queue tracking nodes to process |
| Nₖ | Set of nodes at level k |
| L | Maximum depth level of the tree |
| D(v) | Set of direct successors to node v, i.e., {u∣(v,u)∈E} |
| Characteristic | Description |
|---|---|
| Horizontal Progression | All nodes at a given level must be processed before the algorithm proceeds to the next level. |
| Layered Advancement | Advancement from level k to k+1 occurs only after all nodes at level k are processed and validated. |
| Level Synchronization | Maintains level integrity, ensuring consistency across parallel node implementations within the same level. |
| State ID | Phase | Description |
|---|---|---|
| S₀ | Initialization | Load graph and initialize level queues |
| S₁ | Level Processing | Process nodes at level k |
| S₂ | Validation | Validate all nodes at level k |
| T | Termination | Final state after all levels are completed |
| Rule ID | Source State | Target State | Condition | Operational Step |
|---|---|---|---|---|
| BF1 | S₀ | S₁ | Graph loaded. | Initialize queue Q with root |
| BF2 | S₁ | S₁ | Q≠∅∧(∃c∈Nₖ:¬processed(c)) | Process next node in current level |
| BF3 | S₁ | S₂ | ∀c∈ Nₖ:processed(c) | Validate level k |
| BF4 | S₂ | S₁ | k<L | Advance to level k+1 |
| BF5 | S₂ | T | k=L | Terminate |
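
Rules BF1 to BF5 describe a level-synchronized traversal. The Python sketch below is one way to express it, assuming a hypothetical `children` mapping and a caller-supplied `validate_level` callback. It keeps one list per level instead of the single global queue Q, which is a simplification made for readability.

```python
def run_bfd(children, root, validate_level=lambda k, nodes: True):
    """Return the nodes grouped by level, processing one level at a time."""
    levels = []
    current = [root]                          # BF1: start with the root
    k = 0
    while current:
        processed = []
        for node in current:                  # BF2: process each node at level k
            processed.append(node)
        if not validate_level(k, processed):  # BF3: validate level k as a whole
            raise RuntimeError(f"validation failed at level {k}")
        levels.append(processed)
        # BF4: advance to level k+1 only after level k has been validated.
        current = [c for n in processed for c in children.get(n, [])]
        k += 1
    return levels                             # BF5: no further levels remain

children = {
    "country": ["stateA", "stateB"],
    "stateA": ["city1", "city2"],
    "stateB": ["city3"],
}
levels = run_bfd(children, "country")
```

Because nodes within one level do not depend on each other in this model, the inner loop is the place where concurrent processing could be introduced.
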
| Property | CSP Assertion | FDR Result | Engineering Significance |
|---|---|---|---|
| Core Safety | BFD :[deadlock free [F]] | ✓ Passed | Guarantees that node and level processing never reaches a blocking state before termination |
| Core Liveness | BFD :[divergence free] | ✓ Passed | Confirms absence of livelock and infinite internal loops (τ-cycles) |
| Determinism | BFD :[deterministic [F]] | ✓ Passed | Ensures that queue and node processing decisions are uniquely defined for predictable execution |
| Safety: Dequeue Implies Process | DequeueImpliesProcess [T= BFD_Core] | ✓ Passed | Confirms that each dequeued node is immediately processed, preserving workflow correctness (BF2) |
| Level Validation Before Advancement | ValidateBeforeAdvance [T= BFD_Core] | ✓ Passed | Ensures that all nodes at level k are validated before moving to level k+1 (BF3 & BF4) |
| Post-Validation Behavior | AfterValidation [T= BFD_Core] | ✓ Passed | Guarantees that after level validation, the process either advances or terminates (BF4 & BF5), ensuring progress. |
| Successful Termination | terminate_successfully_actual -> SKIP [T= CanReachTerminate] | ✓ Passed | Demonstrates that BFD completes all levels and nodes successfully (BF5) |
| Termination at End | TerminationAtEnd [T= BFD_Core] | ✓ Passed | Confirms that termination occurs only after all processing and validation steps are complete |
| Property | Formal Specification | Description |
|---|---|---|
| Layer Completion | □∀k≤L: (processed(Nₖ) ⇒ ¬∃Cⱼ∈Nₖ: ¬processed(Cⱼ)) | All nodes in a level are processed before proceeding (Rules BF2, BF3). |
| Order Preservation | □∀k<L: (validated(Nₖ) ⇒ ◊processed(Nₖ₊₁)) | Level k+1 is entered only after all nodes at level k are validated (Rules BF3, BF4). |
| Termination Guarantee | □(start(BFD) ⇒ ◊terminate(BFD)) | Process reaches completion (Rules BF4, BF5). |
| Liveness (No Starvation) | □∀v∈V, ◊processed(v) | Every node in the graph is eventually processed. |
| Property | Advantage |
|---|---|
| Consistency | Uniform implementation across layers (e.g., all Level 1 nodes completed before Level 2) |
| Parallelization | Nodes at the same level can be processed concurrently |
| Predictability | Clear level-based rules simplify debugging (errors are localized to a single level) |
| Symbol | Description |
|---|---|
| G = (V, E) | Directed graph (possibly cyclic) with nodes V and edges E, representing development flow and dependencies |
| Iₖ | Incremental delivery milestone k, representing a validated subset of the system |
| Fₖ | Feedback trigger mechanism (e.g., validation failure, stakeholder input) associated with milestone k |
| Rₘₐₓ | Maximum allowed refinements per node to ensure convergence |
| Characteristic | Description |
|---|---|
| Controlled Feedback Loops | Feedback is allowed only when externally triggered and is bounded to prevent infinite iteration. |
| Incremental Delivery | Components are delivered in validated increments to support continuous integration and testing. |
| State ID | Phase | Description |
|---|---|---|
| S₀ | Initialization | Load graph and initialize dependencies |
| S₁ | Node Processing | Develop components under the current milestone |
| S₂ | Refinement | Iterate based on validation failure or stakeholder feedback |
| S₃ | Validation | Evaluate milestone Iₖ for completeness and correctness |
| T | Termination | Final increment successfully validated and delivered |
| Rule ID | Source State | Target State | Condition | Operational Step |
|---|---|---|---|---|
| CD1 | S₀ | S₁ | Graph loaded | Initialize development graph |
| CD2 | S₁ | S₁ | Node processed | Continue node development |
| CD3a | S₁ | S₂ | test_failed(Cᵢ) | Rework after failure |
| CD3b | S₁ | S₂ | feedback_triggered(Cᵢ) | Apply bounded feedback loop |
| CD4a | S₂ | S₁ | refinement_complete(Cᵢ) | Resume development on node |
| CD4b | S₂ | T | refinement_failed(Cᵢ) ∨ refinement_count(Cᵢ) ≥ Rₘₐₓ | Terminate with error |
| CD5 | S₁ | S₃ | all_components_written(Iₖ) | Validate increment |
| CD6 | S₃ | S₂ | feedback_received(Iₖ) ∨ validation_failed(Iₖ) | Revision required |
| CD7 | S₃ | T | all_increments_validated | Finalize delivery |
| CD8 | S₃ | S₁ | validation_successful(Iₖ) ∧ (k < L) | Advance to milestone Iₖ₊₁ |
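
The bounded feedback loop formed by rules CD3a, CD4a, and CD4b can be sketched as follows. This is an illustrative reading of the table, not the authors' implementation: `test` and `refine` are hypothetical callbacks, and the string labels stand in for the target states.

```python
def refine_until_valid(node, test, refine, r_max):
    """Run the S1 <-> S2 loop for one node with at most r_max refinements.

    Returns ("S1", count) when the node passes its test (CD4a) and
    ("T_error", count) when the refinement bound is reached (CD4b).
    """
    count = 0
    while not test(node):      # CD3a: test_failed(C_i) moves S1 -> S2
        if count >= r_max:     # CD4b: refinement_count(C_i) >= R_max
            return "T_error", count
        refine(node)           # one refinement attempt in S2
        count += 1
    return "S1", count         # CD4a: refinement complete, resume development


# Hypothetical node that starts passing after two fixes.
fixes = {"n": 0}
outcome_ok = refine_until_valid(
    "C1",
    test=lambda _: fixes["n"] >= 2,
    refine=lambda _: fixes.__setitem__("n", fixes["n"] + 1),
    r_max=3,
)

# Hypothetical node that never passes: the loop still terminates.
outcome_fail = refine_until_valid("C2", test=lambda _: False, refine=lambda _: None, r_max=3)
```

Because `count` only increases and the loop exits once it reaches `r_max`, the sketch never performs more than Rₘₐₓ refinements on a node.
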
| Property | CSP Assertion | FDR Result | Engineering Significance |
|---|---|---|---|
| Core Safety | CDD :[deadlock free] | ✓ Passed | Guarantees liveness throughout the deployment lifecycle (no terminal blocking states) |
| Core Liveness | CDD :[divergence free] | ✓ Passed | Confirms absence of livelock and infinite internal loops. |
| Protocol Compliance (Trace) | ProtocolChecker [T= CDDProtocolView] | ✓ Passed | Observable deployment traces conform to the defined protocol |
| Protocol Compliance (Liveness) | CDDProtocolView :[divergence free] | ✓ Passed | Livelock-free protocol abstraction |
| Safety: Initial Guard | NoEarlyTermination [T= CDD] | ✓ Passed | Prevents termination before mandatory initialization (load_graph, initialize_dependencies) |
| Dependency Respect (Contribution N4) | DependencySpec_N4 [T= CDD] | ✓ Passed | Proves N4 cannot execute before both N2 and N3 complete |
| Dependency Respect (Contribution N5) | DependencySpec_N5 [T= CDD] | ✓ Passed | Proves N5 cannot execute before N4 completes |
| Robustness: Bounded Refinement (Deadlock) | CDD_Hostile :[deadlock free] | ✓ Passed | Liveness retention and error-termination reachability under adversarial failure |
| Robustness: Bounded Refinement (Divergence) | CDD_Hostile :[divergence free] | ✓ Passed | Shows the system does not livelock under persistent failures; termination is guaranteed |
| Internal Consistency | ConditionalConsistency [T= STOP] | ✓ Passed | Ensures mutually exclusive conditional events do not conflict |
| Property | Formal Specification | Description |
|---|---|---|
| Cycle Integrity | □(processed(Cⱼ) ⇒ ◊refine(Cⱼ)) ∧ □(refinement_count(Cⱼ) ≤ Rₘₐₓ) | Bounded feedback loops are permitted (CD3a/CD3b). |
| Incremental Soundness | □(◊finalize(Iₖ) ⇒ ∀C ∈ Iₖ, validated(C)) | All components in a milestone must be validated before release (CD5, CD7). |
| Bounded Refinement | □∀v ∈ V: (refinement_count(v) ≤ Rₘₐₓ) | The number of refinements for any node is strictly bounded by Rₘₐₓ. |
| Termination Guarantee | □(start(CDD) ⇒ ◊T) | The process eventually reaches successful termination. |
| Property | Advantage |
|---|---|
| Adaptability | Supports bounded iteration in response to validation results or stakeholder feedback |
| Risk Reduction | Enables early defect detection through milestone-based validation |
| Agile Compliance | Aligns with sprint-style incremental delivery while maintaining formal convergence guarantees |
| Symbol | Description |
|---|---|
| Kᵢ | Progression Threshold: The minimum number of nodes (representing features or components) at level i that must reach a finalized state (P(n)=2) before development can progress to level i+1. This threshold acts as a configurable Work-In-Progress (WIP) limit, which can be set statically based on team capacity or adjusted dynamically in real-time based on evolving system constraints and priorities [66]. It enforces structured synchronization points, preventing uncontrolled parallelism and managing complexity |
| Jᵢ | Start of refinement: Earliest level impacted by failures at i, where Jᵢ = trace_origin(i). |
| L | Maximum depth (leaf level) of the hierarchical tree. |
| Rᵢ | Refinement range: The number of levels to reprocess, calculated as Rᵢ = i - Jᵢ + 1 (bounded by L). |
| Rₘₐₓ | Iteration limit: Maximum refinement attempts per level. Predefined to ensure termination. |
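
Two of the symbols above lend themselves to a short worked example: the progression threshold Kᵢ, read as a count of finalized nodes, and the refinement range Rᵢ = i - Jᵢ + 1. The sketch below is an assumption-laden illustration; the function names are hypothetical, and any P(n) value other than 2 is simply treated as "not finalized".

```python
FINALIZED = 2  # P(n) = 2 marks a finalized node


def threshold_met(p_values, k_i):
    """True when at least k_i nodes at a level are finalized."""
    return sum(1 for p in p_values if p == FINALIZED) >= k_i


def refinement_levels(i, j_i):
    """Levels J_i..i to reprocess; the list length equals R_i = i - J_i + 1."""
    if not 1 <= j_i <= i:
        raise ValueError("trace origin must satisfy 1 <= J_i <= i")
    return list(range(j_i, i + 1))


level_states = [2, 1, 2, 0]             # hypothetical P(n) values at level i
can_advance = threshold_met(level_states, k_i=2)
to_reprocess = refinement_levels(i=4, j_i=2)
```
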
| Characteristic | Description | Theoretical Basis / Inspiration |
|---|---|---|
| Vertical Progression | Processing descends level-by-level in a depth-first manner, leveraging DFD principles for focused development paths. | Depth-First Search (Graph Theory), DFD |
| Controlled Concurrency | Progression to deeper levels depends on meeting a per-level feature threshold Kᵢ of finalized nodes, integrating a controlled breadth-first-like synchronization derived from BFD. | Bounded Parallelism, WIP Limits (Lean/Agile), BFD |
| Iterative Refinement | The methodology reprocesses and validates levels [Jᵢ, i] to resolve failures, then resumes progression from Jᵢ, directly incorporating CDD's feedback mechanisms. | Iterative Development, Feedback Loops (Spiral Model, Agile) [74], dependency-directed backtracking [77], CDD |
| Targeted Refinement | Limits rework to Rₘₐₓ attempts per level, balancing precision and scope in iterative cycles. | Bounded Iteration (CDD) |
| Bottom-Up Finalization | Subtree completion of validated nodes is performed in a bottom-up manner, ensuring localized integrity. It allows backtracking to refinement if unprocessed nodes fail validation and earlier levels have attempts remaining. | Bottom-Up Validation |
| Top-Down Completion | Finalizes and inherently validates any remaining unprocessed nodes from root to leaves after bottom-up closure, ensuring comprehensive system-wide consistency. Like Bottom-Up Finalization, backtracking to bounded refinement is allowed. | Top-Down Validation |
| Termination Guarantee | Guarantees process termination once all required conditions are satisfied, considering bounded refinements and finite tree structures. | Formal Methods |
| State ID | Phase | Description |
|---|---|---|
| S₀ | Initialization | Load tree and initialize features |
| S₁(i) | Current Level | Processes selected nodes in level i |
| S₁(i+1) | Next Level (Children) | Represents the state of actively processing level i+1, which is derived from children of nodes in level i |
| S₁(j) | Refinement Level | Reprocess level j (where j ≤ i) due to failure propagated from a later level i |
| S₂(i) | Level Validation | Validate processed nodes in level i |
| S₂(j) | Refinement Validation | Validates reprocessed nodes in level j during refinement |
| S₃(i) | Bottom-Up Process | Initiate bottom-up subtree completion for the subtrees rooted at finalized nodes (P(n)=2) in level i |
| S₄(i) | Completion Level | Finalize unprocessed nodes in level i during the top-down pass |
| S₅ | Error | Terminates due to unresolved validation failures after exhausting Rₘₐₓ |
| T | Termination | All nodes processed and finalized |
| Rule ID | Source State | Target State | Condition | Operational Step |
|---|---|---|---|---|
| PD1 | S₀ | S₁(i) | i = 1 | Begin root-level processing |
| PD2 | S₁(i) | S₂(i) | processing_complete(i) ∧ ∃n ∈ level(i): ¬validated(n) | Validate current level’s nodes |
| PD2a | S₂(i) | S₁(j) | j = trace_origin(i) ∧ refinement_attempts(j) < Rₘₐₓ | Backtrack to level j and begin refinement if validation fails at level i |
| PD2b | S₂(i) | S₁(i+1) | ∑_{n ∈ level(i)} [P(n)=2]≥ Kᵢ | Advance to next level after processing batch |
| PD3 | S₁(j) | S₂(j) | processing_complete(j) ∧ ∃n ∈level(j): ¬validated(n) | Validate level j again after refinement |
| PD3a | S₂(j) | S₁(j+1) | ∀n ∈ level(j): validated(n) ∧ j < i | Resume processing at next level within refinement scope after successful validation |
| PD3b | S₂(j) | S₂(i) | ∀n ∈ level(j): validated(n) ∧ j = i | Refinement validation complete; return to original current level for forward pass continuation |
| PD3c | S₂(j) | S₁(j) | ∃n ∈ level(j): ¬validated(n) ∧ refinement_attempts(j) < Rₘₐₓ | Retry refinement processing at level j |
| PD4 | S₂(i) | S₃(i) | i = L ∨ level(i+1) = ∅ | Transition to bottom-up process (prematurely or at leaf) |
| PD4a | S₃(i) | S₃(i-1) | ∀n ∈level(i): validated(n) ∧ all_descendants_validated(n) | All unprocessed nodes in the subtree of the processed nodes at level i have been processed and validated; move to level i-1 |
| PD4b | S₃(i) | S₁(j) | processing_complete(j) ∧ ∃n∈level(i):¬validated(n)∧j=trace_origin(i)∧refinement_attempts(j)< Rₘₐₓ | Backtrack from bottom-up phase to refinement processing |
| PD5 | S₃(2) | S₄(1) | i=2 in bottom up | Transition to top-down finalization |
| PD6 | S₄(i) | S₄(i+1) | ∀n ∈ level(i): validated(n) | All nodes at level i validated; move to level i+1 |
| PD6a | S₄(i) | S₁(j) | ∃n∈level(i):¬validated(n)∧j=trace_origin(i)∧refinement_attempts(j)< Rₘₐₓ | Backtrack from completion phase to refinement processing |
| PD6b | S₄(i) | S₅ | ∃n∈level(i):¬validated(n) ∧ refinement_attempts(trace_origin(i)) ≥ Rₘₐₓ | Terminate due to unvalidated nodes with no refinement options |
| PD7 | S₄(L) | T | ∀i ∈ [1, L], ∀n ∈ level(i): validated(n) | All nodes validated |
| PD8 | S₁(j) | S₅ | refinement_attempts(j) ≥ Rₘₐₓ | Terminate due to refinement cycle exhaustion |
| Property | CSP Assertion | FDR Result | Engineering Significance |
|---|---|---|---|
| Core Safety | System :[deadlock free], System :[livelock free] | ✓ Passed | Ensures progress by eliminating blocking and non-productive cyclic states |
| Core Liveness | System :[divergence free] | ✓ Passed | Confirms absence of infinite internal loops, supporting guaranteed termination |
| Structural Integrity | System :[deterministic [F]] | ✓ Passed | Establishes that behavior is fully determined by environment conditions |
| Protocol Robustness | SystemProtocolView :[divergence free] | ✓ Passed | Confirms that abstracted conditional events do not introduce livelock |
| General Consistency | ConditionConsistency [T= STOP] | ✓ Passed | Validates that the composite conditional environment is non-contradictory |
| Mutual Exclusivity (5 checks) | ConditionConsistency_ThresholdMet [T= STOP], etc. | ✓ Passed | Confirms that all five core PD decision pairs are logically disjoint and sound |
| Property | Formal Specification | Description & Justification |
|---|---|---|
| Total Correctness | □(start ⇒ ((T ∧ Structural Invariants) ∨ S₅)) | Theorem A.8.8: The methodology always terminates (T or S₅) and, upon successful termination (T), guarantees that all nodes are validated and all structural invariants are satisfied. |
| Termination | □(start ⇒ ◊(T ∨ S₅)) | Lemma A.8.4: The algorithm always terminates, either in success (all nodes finalized, T) or bounded failure (refinement exhausted, S₅). |
| Bounded Refinement | ∀k ∈ [1, L], □(refinement_attempts(k) ≤ Rₘₐₓ) | Lemma A.8.2: The number of refinement attempts for any level k is strictly bounded by the constant Rₘₐₓ. |
| Refinement Convergence | □∀j: (refining(j) ⇒ ◊(¬refining(j) ∨ refinement_attempts(j) = Rₘₐₓ)) | Lemmas A.8.2 & A.8.3: Each refinement cycle either resolves the issue and exits refinement or exhausts its attempt bound, so refinement cannot stall indefinitely. |
| Finalization Monotonicity | □((◯k₁ ≤ k₁) ∨ (◯k₁ > k₁ ∧◯k₂ < k₂)) | Lemma A.8.3: The global count of unfinalized nodes (k₁) is non-increasing. A strict increase in k₁ (reset) is strictly compensated by a decrease in k₂ (remaining refinement attempts), ensuring lexicographic progress. |
| Finalization Permanence | ∀n∈G: □((P(n)=2 ∧ ¬∃j:(refining(j) ∧ n∈affected_nodes(j))) ⇒ ◯(P(n)=2)) | Corollary A.8.3.1: A finalized node's status is permanent except when an active, guarded refinement backtrack resets it; such resets are bounded and compensated by a strict decrease in k₂ (remaining refinement attempts). |
| Descendant Finalization Invariant | ∀n: □(P(n)=2 ⇒ ∀d ∈ descendants(n) ∩ processed_subtree(n), P(d)=2) | Lemma A.8.5: A node is not finalized unless all nodes in its processed subtree are also finalized. Enforced by guards in PD4a, PD6, PD7. |
| Refinement Locality | □∀i,j: ((state = S₂(i) ∧ ◯state = S₁(j)) ∨ (state = S₃(i) ∧ ◯state = S₁(j)) ∨ (state = S₄(i) ∧ ◯state = S₁(j))) ⇒ (j ≤ i ∧ j = trace_origin(i)) | Lemma A.8.5: All backtracking transitions target a valid anchor level j within the current progression frontier, and j is the origin of the current trace. |
| Progression Condition | □∀i: ((S₂(i) ∧ (∑_{n ∈ level(i)} [P(n)=2] ≥ Kᵢ)) ⇒ ◯(S₁(i+1))) | Rule PD2b (Table A.8.2): The system advances to the next level's processing state S₁(i+1) when at least Kᵢ nodes at the current level are finalized. |
| Guarded Progression Invariant | □((state = S₂(i) ∧ ∑_{n∈level(i)}[eligible(n)] ≥ Kᵢ) ⇒ ◯(S₁(i+1) ∧ selected_subtree ⊆ trace(i))) | Rule PD2b (Table A.8.2): Progression to the next level is guarded by eligibility criteria and trace constraints, ensuring bounded advancement. |
| Bottom-Up Finalization | □∀i: ((S₂(i) ∧ (i = L ∨ level(i+1)=∅)) ⇒ ◯(S₃(i))) | Rule PD4 (Table A.8.2): Finalization initiation is triggered upon reaching a leaf node or an empty level, ensuring the transition from progression to completion. |
| Top-Down Finalization | □∀i: ((S₄(i) ∧ (∀n ∈ level(i): P(n)=2)) ⇒ ◯S₄(i+1) ∨ ◯T ∨ ◯S₅) | Rule PD6 (Table A.8.2): The top-down completion phase progresses to the next level once the current level is fully finalized (or the process terminates). |
| Global Consistency | □(T ⇒ (∀n ∈ G, P(n)=2)) | Rule PD7 (Table A.8.2): Successful termination implies all nodes in the graph are finalized. |
| Vertical Closure (Forward Guarantee) | □((P(n)=2 ∧ children(n) ≠ ∅) ⇒ ◊∀d ∈ children(n): P(d) ∈ {1,2} ∨ T ∨ S₅) | Implied by PD4/PD6 (Table A.8.2): If a parent is finalized, its children are guaranteed to be addressed in the process flow (either by forward progression or completion), barring system termination. |
| Soundness | T ⇒ (∀n∈G: consistent(n) ∧ dependencies_satisfied(n)) | Theorem A.8.8: Successful termination implies all nodes are internally consistent and satisfy their architectural dependencies, ensuring the final system is semantically correct. |
| Unified Progress | □((¬T ∧ ¬S₅) ⇒ ∃enabled_transition) | Lemma A.8.7: From any non-terminal state, at least one transition rule is enabled, ensuring the system never deadlocks. |
| Liveness (Progress) | □((¬T ∧ ¬S₅) ⇒ (◯M <_{lex} M)) | Lemma A.8.7: From any non-terminal state, an enabled transition exists that strictly decreases the lexicographic measure M, guaranteeing forward movement and preventing deadlock. |
| Well-Foundedness | M = (k₁, k₂, k₃, k₄) where k₁ ∈ [0, |V|], k₂ ∈ [0, L·Rₘₐₓ], k₃ ∈ {0,1,2,3,4}, k₄ ∈ [0, max_batch_size] | Lemma A.8.4: Each component of the lexicographic measure M is bounded and ranges over a well-ordered set, ensuring no infinite decreasing sequences exist. |
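
The Finalization Monotonicity row can be read as a predicate over consecutive snapshots of (k₁, k₂): either k₁ does not grow, or a growth in k₁ is paid for by a strict drop in k₂. The sketch below checks that predicate over a recorded trace. The trace values are invented for illustration, and the helper names are hypothetical.

```python
def monotonic_step(prev, nxt):
    """(next k1 <= k1) or (next k1 > k1 and next k2 < k2), as in the table row."""
    k1, k2 = prev
    n1, n2 = nxt
    return n1 <= k1 or (n1 > k1 and n2 < k2)


def trace_is_monotonic(trace):
    """True when every consecutive pair of snapshots satisfies the step rule."""
    return all(monotonic_step(a, b) for a, b in zip(trace, trace[1:]))


# (k1 = unfinalized nodes, k2 = remaining refinement attempts); invented data.
good_trace = [(5, 3), (4, 3), (6, 2), (3, 2), (0, 2)]  # one compensated reset
bad_trace = [(5, 3), (6, 3)]                           # reset without spending an attempt
```

A check of this kind could be run over execution logs as a lightweight complement to the model-checked proofs; it does not replace them.
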
| Property | Advantage |
|---|---|
| Early Validation | Depth-first traversal enables early detection of structural and behavioral issues in the hierarchy. |
| Controlled Concurrency | Parameter Kᵢ regulates concurrent workload distribution in real time. |
| Targeted Refinement | Parameter Rₘₐₓ bounds rework iterations per level, balancing precision and efficiency. |
| Completeness Guarantee | Combined bottom-up and top-down closure ensures that all components are fully processed. |
| Scalable Design | Dynamic parameters adapt traversal behavior to diverse tree structures. |
| Hierarchical Closure | Systematic traversal guarantees complete coverage from root to leaves. |
| Symbol | Description |
|---|---|
| L | Maximum depth (leaf level) of the hierarchical tree |
| Jᵢ | Start of refinement: Earliest level impacted by failures in Patternᵢ (at level i), computed via trace_origin(i) (see PDFD, Section 3.4.2) |
| Rᵢ | Refinement range: Number of levels (Rᵢ = i - Jᵢ + 1) to reprocess, spanning patterns from level Jᵢ to i, bounded by L |
| Rₘₐₓ | Iteration limit: Maximum refinement attempts per level (Patternⱼ), matching PDFD’s per-level refinement cap (Section 3.4.2) |
| Patternᵢ | A formal model: A cohesive, feature/function-grouped subset of nodes (data, logic, UI artifacts) at hierarchical level i, encapsulating a distinct unit of business logic [79,80,84]; Patternᵢ₊₁ is a selected subset of ∪_{n∈Patternᵢ} children(n), chosen based on critical path, dependencies, and development priorities |
| rⱼ | Current refinement attempt index for Patternⱼ |
| Characteristic | Description | Theoretical Basis / Inspiration |
|---|---|---|
| Pattern-Driven Traversal | Nodes are grouped into patterns and processed level-by-level, with selective advancement to critical child nodes at each step, and may be optimized for O(1) data-access efficiency using techniques like bitmask encoding. | Breadth-First Search (BFD), Architectural Patterns [79,84,85] |
| Depth Transition | Children of current pattern nodes are promoted as the next pattern (Patternᵢ₊₁) | Dependency Tracing [65], DFD Principles |
| Pattern-Based Refinement | On validation failure, PBFD rewinds to prior levels (Patternⱼ) to correct impacted nodes. Example: Reprocessing level 1’s “data access” pattern due to a failure in level 2’s “security” pattern. | Iterative Development, Feedback Loops (CDD) [78], Software Evolution [86] |
| Parallelism | Nodes within a pattern are processed concurrently. Advancement to the next state occurs only after all processed nodes within the pattern are successfully validated. | Scalable Parallelism, Horizontal Concurrency |
| Top-Down Finalization | Finalization iterates from the root (level 1) to the leaf level (L), ensuring all dependencies are resolved and complete processing from root to leaves is achieved. | Top-Down Validation, Structured Design [81] |
| Termination Guarantee | Process termination is guaranteed once all required conditions are satisfied, considering bounded refinements and finite tree structures. | Formal Methods, Well-Founded Measures [61], Model Checking (CSP/SPIN) [45,71,87] |
| Rule ID | Source State | Target State | Condition | Operational Step |
|---|---|---|---|---|
| PB1 | S₀ | S₁(i) | i = 1 | Begin pattern processing at root level |
| PB2 | S₁(i) | S₂(i) | ∃n ∈ Patternᵢ: ¬validated(n) | Validate current pattern nodes |
| PB2a | S₁(i) | S₃(i) | ∀n ∈ Patternᵢ: validated(n) | Current pattern processing successful; proceed to depth resolution |
| PB3 | S₂(i) | S₁(j) | (∃n ∈ Patternᵢ: ¬validated(n)) ∧ j = trace_origin(i) ∧ refinement_attempts(j) < Rₘₐₓ | Backtrack to level j and begin refinement |
| PB3a | S₁(j) | S₂(j) | ∃n ∈Patternⱼ: ¬validated(n) | Validate Patternⱼ again after refinement |
| PB3a1 | S₂(j) | S₃(j) | ∀n ∈ Patternⱼ: validated(n) | Resume depth resolution after refinement |
| PB3a2 | S₂(j) | S₁(j) | ∃n ∈ Patternⱼ: ¬validated(n) ∧ refinement_attempts(j) < Rₘₐₓ | Retry refinement processing at level j |
| PB3a3 | S₂(j) | S₅ | ∃n ∈ Patternⱼ: ¬validated(n) ∧ refinement_attempts(j) ≥ Rₘₐₓ | Terminate due to unresolved validation failures after exhausted refinement attempts |
| PB3b | S₁(j) | S₃(j) | ∀n ∈ Patternⱼ: validated(n) | Refinement validated; proceed to resolve depth of the finalized nodes (P(n)=2) in level j |
| PB3c | S₂(i) | S₅ | (∃n ∈ Patternᵢ: ¬validated(n)) ∧ (trace_origin(i) undefined ∨ refinement_attempts(trace_origin(i)) ≥ Rₘₐₓ) | Terminate because Patternᵢ has unvalidated nodes and no refinement is possible |
| PB4 | S₂(i) | S₃(i) | ∀n ∈ Patternᵢ: validated(n) | Proceed to resolve depth and prepare next |
| PB4a | S₃(i) | S₁(i+1) | i < L ∧ Patternᵢ₊₁ ≠ ∅ | Patternᵢ₊₁:= select_critical_children(Patternᵢ); Recurse to level i+1 for processing |
| PB4b | S₃(i) | S₄(1) | i=L ∨ Patternᵢ₊₁ = ∅ | Transition to top-down finalization (prematurely or at leaf) |
| PB5 | S₃(j) | S₁(j+1) | j<i | Resume pattern processing at next level within refinement scope |
| PB6 | S₃(j) | S₃(i) | j=i | Refinement range complete; return to original current level for forward pass continuation |
| PB7 | S₄(i) | S₄(i+1) | ∀n ∈ Patternᵢ: processed(n) | All nodes at level i finalized; move to level i+1 |
| PB7a | S₄(i) | S₁(j) | ∃n∈Patternᵢ:¬processed(n)∧j=trace_origin(i)∧refinement_attempts(j)< Rₘₐₓ | Backtrack from completion phase to refinement processing |
| PB7b | S₄(i) | S₅ | ∃n∈Patternᵢ:¬processed(n)∧¬(j=trace_origin(i)∧refinement_attempts(j)< Rₘₐₓ) | Terminate due to unprocessed nodes with no refinement options |
| PB8 | S₄(L) | T | ∀i ∈ [1, L], ∀n ∈ Patternᵢ: validated(n) | All nodes completed |
| PB9 | S₁(j) | S₅ | refinement_attempts(j) ≥ Rₘₐₓ | Terminate due to refinement cycle exhaustion |
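
The three mutually exclusive exits from the validation state S₂(i) (rules PB4, PB3, and PB3c) can be sketched as a single decision function. The function name, the dictionary of attempt counters, and the tuple encoding of states are assumptions made for illustration.

```python
def after_validation(i, all_validated, origin, attempts, r_max):
    """Choose the successor of S2(i) following PB4, PB3, and PB3c.

    origin   -- trace_origin(i), or None when it is undefined
    attempts -- dict mapping level -> refinement attempts used so far
    """
    if all_validated:
        return ("S3", i)        # PB4: proceed to depth resolution
    if origin is not None and attempts.get(origin, 0) < r_max:
        return ("S1", origin)   # PB3: backtrack to level j = trace_origin(i)
    return ("S5", None)         # PB3c: refinement impossible, error state


used = {1: 0, 2: 3}  # hypothetical attempt counters per level
next_ok = after_validation(3, True, origin=1, attempts=used, r_max=3)
next_backtrack = after_validation(3, False, origin=1, attempts=used, r_max=3)
next_exhausted = after_validation(3, False, origin=2, attempts=used, r_max=3)
next_no_origin = after_validation(3, False, origin=None, attempts=used, r_max=3)
```

Exactly one branch fires for any input, which is the property the conditional-consistency assertions are meant to establish for the full rule set.
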
| Property | CSP Assertion | FDR Result | Engineering Significance |
|---|---|---|---|
| Core Safety | System: [deadlock free] | ✓ Passed | Prevents premature halts |
| Core Liveness | System: [divergence free]; SystemSync: [divergence free] | ✓ Passed | Eliminates infinite internal cycles |
| Initialization Safety | S0: [deadlock free]; S1_InitialProcess(L1): [deadlock free] | ✓ Passed | Confirms PB1 startup behavior from Table 40 |
| Hostile Robustness | HostileSystem: [deadlock free]; HostileSystemSync: [deadlock free] | ✓ Passed | Ensures correctness under non-cooperative inputs |
| Conditional Consistency | LegalCondEnv [T = NoContradictions] | ✓ Passed | Verifies mutual exclusivity across all decision predicates |
| State-Level Safety | 26 assertions | ✓ Passed | All operational and terminal states (S0–S5, T) verified across all level combinations |
| Property | Formal Specification | Description & Justification |
|---|---|---|
| Total Correctness | □(start ⇒ ((T ∧ Structural Invariants ) ∨ S₅)) | Theorem A.8.8: The methodology always terminates (T or S₅), and, upon successful termination (T), guarantees that all nodes are validated and all structural invariants are satisfied. |
| Termination | □(start ⇒ ◊(T ∨ S₅)) | Lemma A.8.4: Always, if the system starts, it eventually reaches the successful Termination (T) or bounded Error (S₅) state [61]. |
| Well-Foundedness | M = (k₁, k₂, k₃, k₄) where k₁ ∈ [0, |V|], k₂ ∈ [0, L·Rₘₐₓ], k₃ ∈ {0,1,2,3,4}, k₄ ∈ [0, max_batch_size] | Lemma A.8.4: Each component of the lexicographic measure M is bounded and ranges over a well-ordered set, ensuring no infinite decreasing sequences exist. |
| Bounded Refinement | ∀k ∈ [1, L], □(refinement_attempts(k) ≤ Rₘₐₓ) | Lemma A.8.2: The number of refinement attempts for any level k is bounded by the constant Rₘₐₓ [65,78]. A practical limit, such as Rₘₐₓ = 50, is used in the PBFD MVP implementation (Appendix A.14). |
| Refinement Convergence | □∀j:(refining(j) ⇒ ◊(¬refining(j)∨refinement_attempts(j) = Rₘₐₓ)) | Lemmas A.8.2 & A.8.3: Each refinement cycle eventually resolves the issue or exhausts its attempt bound, ensuring refinement is not indefinitely stalled [78]. |
| Finalization Monotonicity | □((◯ k₁≤ k₁) ∨ (◯ k₁> k₁∧◯k₂ < k₂)) | Lemma A.8.3: The global count of unfinalized nodes (k₁) is non-increasing. It strictly decreases during commit transitions (PB4a, PB7) and can only increase during a guarded, bounded refinement reset that is compensated by a strict decrease in k₂. |
| Finalization Permanence | ∀n∈G:□((P(n)=2∧¬∃j:(refining(j)∧n∈affected_nodes(j))) ⇒ ◯(P(n)=2)) | Corollary A.8.3.1: A finalized node's status is permanent unless actively reset by a guarded, bounded refinement backtrack. |
| Pattern Processing Order | □∀i:((S₃(i)∧(i<L ∧ Patternᵢ₊₁ ≠ ∅)) ⇒ ◯(S₁(i+1))) | Lemma A.8.6 (Level-wise Ordering Invariant): Progression to the next level's pattern (Patternᵢ₊₁) only occurs after the current pattern (Patternᵢ) is fully resolved. |
| Top-Down Finalization Order | □∀i:((S₄(i) ∧ (∀n ∈ Patternᵢ: processed(n))) ⇒ ◯S₄(i+1) ∨ ◯T ∨ ◯S₅) | Lemma A.8.6 (Top-down Finalization Invariant): The completion phase strictly finalizes levels in sequence from root to leaf [81]. |
| Refinement Scope | □∀i,j: (backtrack(i,j) ⇒ (j = trace_origin(i) ∧ j ≤ i)) | Lemma A.8.6 (Refinement Locality Invariant): Backtracking always targets the calculated trace origin within the current progression frontier i, j ≤ i. |
| Vertical Closure | □((P(n)=2 ∧ children(n) ≠ ∅) ⇒ ◊(∀c ∈ children(n): P(c) ∈ {1,2} ∨ T ∨ S₅)) | Implied by Lemma A.8.6 invariants: If a parent is finalized, its children are guaranteed to be addressed in the process flow, barring system termination. |
| Global Consistency | T ⇒ (∀n ∈ G, P(n)=2) | Rule PB8 (Table A.8.3): Successful termination (T) guarantees that every single node in the system is finalized [88]. |
| Soundness | T ⇒ (∀n∈G: consistent(n) ∧ dependencies_satisfied(n)) | Theorem A.8.8: Successful termination implies all nodes are internally consistent and satisfy their architectural dependencies [88]. |
| Liveness (Progress) | □((¬T ∧ ¬S₅) ⇒ (◯M <_{lex} M)) | Lemma A.8.7: From any non-terminal state, an enabled transition exists that strictly decreases the lexicographic measure M, guaranteeing forward movement and preventing deadlock [61]. |
| Selective Progression Invariant | □((state = S₃(i) ∧ i < L ∧ Patternᵢ₊₁ ≠ ∅) ⇒ ◯(state = S₁(i+1) ∧ Patternᵢ₊₁=select_critical_children(Patternᵢ))) | Rule PB4a (Table A.8.3): Progression is guarded by the selection of the next pattern, ensuring only critical nodes are considered for the next processing cycle. |
| Completion Phase Invariant | □(state=S₄(i) ⇒ (◊state=S₄(i+1) ∨ ◊T ∨ ◊S₅)) | Rule PB7 (Table A.8.3): The sequential progression S₄(1) → S₄(2) → ... → S₄(L) ensures that finalization is strictly top-down for global completeness. |
| Property | Advantage |
|---|---|
| Hybrid Flexibility | Combines the strengths of breadth-first (BFD), depth-first (DFD), and cyclic refinement (CDD) models |
| Pattern-Centric Traversal | Promotes modular grouping and processing of nodes by feature, layer, or function [89] |
| Scalable Parallelism | Enables concurrent processing within a pattern (horizontal parallelism) |
| Controlled Refinement | Supports bounded iteration (via Rₘₐₓ) to avoid infinite rework loops |
| Predictable Finalization | Ensures all nodes are finalized through structured top-down traversal |
| Fine-Grained Dependency Recovery | Allows precise backtracking to affected pattern levels through validation-triggered refinements |
| Termination Guarantee | Strong guarantees of convergence and termination, even with partial failures |
| Node Name | Level | Bit Index | Binary Mask | Decimal Mask (Per Level) |
|---|---|---|---|---|
| North America | 3 | 0 | 0b00001 | 1 |
| Asia | 3 | 4 | 0b10000 | 16 |
| United States | 4 | 0 | 0b00001 | 1 |
| Canada | 4 | 1 | 0b00010 | 2 |
| Mexico | 4 | 2 | 0b00100 | 4 |
| Operation | Symbol | Example | Description |
|---|---|---|---|
| OR | \| | parent_bitmask \|= US_mask | Set a child node's bit (ensures selection while preserving prior selections) |
| AND | & | (parent_bitmask & Canada_mask) != 0 | Check if a specific child node is selected in the parent's bitmask |
| XOR | ^ | parent_bitmask ^= Mexico_mask | Toggle the selection status of a child node |
| NOT | ~ | parent_bitmask &= ~Europe_mask | Clear a child node's bit (deselects the child) |
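
The four operations can be exercised with the level-4 masks listed earlier (United States = 1, Canada = 2, Mexico = 4). The helper names below are hypothetical wrappers added for readability; the arithmetic is the standard bitwise behavior of Python integers.

```python
# Level-4 masks taken from the node table: one bit per child of "North America".
US, CANADA, MEXICO = 0b001, 0b010, 0b100


def select(mask, child):
    """OR: set the child's bit, keeping earlier selections."""
    return mask | child


def is_selected(mask, child):
    """AND: test the child's bit (parenthesized so it is also safe in C-style languages)."""
    return (mask & child) != 0


def toggle(mask, child):
    """XOR: flip the child's bit."""
    return mask ^ child


def clear(mask, child):
    """AND with NOT: clear the child's bit."""
    return mask & ~child


north_america = 0
north_america = select(north_america, US)
north_america = select(north_america, MEXICO)   # bits 0 and 2 are now set
with_canada = toggle(north_america, CANADA)     # bit 1 flipped on
without_mexico = clear(with_canada, MEXICO)     # bit 2 cleared
```

The explicit parentheses in `is_selected` matter outside Python: in C and its descendants `!=` binds more tightly than `&`, so the unparenthesized form would test the wrong expression.
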
| Feature | Traditional (Row-based) | Bitmask-based |
|---|---|---|
| Storage | O(n rows) | O(1) for n≤64 children; O(⌈n/w⌉) with minimal factor for n>64 |
| Query | Recursive join (O(n)) | Bitwise check (O(1)) |
| Update | Row insert/delete (O(n)) | Bitwise OR/AND (O(1)) |
| Integration | SQL joins | Native bitwise ops in SQL & C-style languages, parallelizable |
| Hierarchy Level | Logical TLE Component | Relational Implementation | Example Value |
|---|---|---|---|
| Level N | Grandparent | Table Name | dbo.[United States] |
| Level N+1 | Parent | Column Name | [Maryland], [California], [Virginia] |
| Level N+2 | Children | Cell Value (Bitmask) | 5 (Binary 0b101 for counties in [Maryland]: Allegany, Baltimore) |
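
To make the cell-value example concrete, the sketch below decodes the bitmask 5 stored in a [Maryland] column into county names and encodes it back. The bit assignment is an assumption: the table only implies that Allegany and Baltimore occupy bits 0 and 2, so the name placed at bit 1 is a placeholder chosen for illustration.

```python
# Assumed bit order for the children of [Maryland]; only bits 0 and 2 are
# implied by the example value, the entry at bit 1 is hypothetical.
MARYLAND_COUNTIES = ["Allegany", "Anne Arundel", "Baltimore"]


def decode(mask, names):
    """Return the names whose bit is set in mask."""
    return [name for bit, name in enumerate(names) if (mask >> bit) & 1]


def encode(selected, names):
    """Build a mask with one bit per selected name."""
    mask = 0
    for name in selected:
        mask |= 1 << names.index(name)
    return mask


cell_value = 5  # 0b101, as stored in dbo.[United States].[Maryland]
counties = decode(cell_value, MARYLAND_COUNTIES)
round_trip = encode(counties, MARYLAND_COUNTIES)
```
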
| State | Phase | Abstract Description |
|---|---|---|
| S₀ | Idle | The TLE structure is at rest; no active unit of work. |
| S₁ | Data Loaded | A TLE data unit (e.g., a grandparent row) has been loaded into a processing context. |
| S₂ | Hierarchy Resolved | The grandparent and parent levels have been identified and validated. |
| S₃ | Children Evaluated | Child node states have been read and logically processed (e.g., filtered, validated). |
| S₄ | Children Updated | Child node states have been modified via bitmask writes. |
| S₅ | Changes Committed | All modifications to the TLE structure are persisted to the grandparent entity. |
| S₆ | Workflow Finalized | The unit of work is complete; the system is ready for the next task (via transition TLE10 to S₀ in the CSP model to ensure system liveness). |
| Rule ID | From State | To State | Transition Condition/Trigger | Core TLE Operation/Action |
|---|---|---|---|---|
| TLE1 | [*] | S₀ | System Start | - |
| TLE2 | S₀ | S₁ | initiate_workflow(Grandparent) | LOAD(Grandparent) |
| TLE3 | S₁ | S₂ | resolve_hierarchy() | (Internal resolution) |
| TLE4 | S₂ | S₃ | evaluate_children() | Iterative READ(Parent, Child) |
| TLE5 | S₃ | S₄ | update_required ∧ apply_update() | WRITE(Parent, Child, State) |
| TLE6 | S₃ | S₅ | ¬update_required | - |
| TLE7 | S₄ | S₅ | persist_changes() | COMMIT(Grandparent) |
| TLE8 | S₅ | S₀ | has_next_unit() | - |
| TLE9 | S₅ | S₆ | ¬has_next_unit() | - |
| TLE10 | S₆ | S₀ | Workflow Complete | finalize_process() |
| TLE11 | S₀ | S₆ | ¬has_unprocessed_unit() | - |
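
Rules TLE2 through TLE11 form a small deterministic state machine, which can be sketched as a lookup table. The event labels for negated guards (`no_update`, `no_next_unit`, `no_unprocessed_unit`) and `workflow_complete` are names invented here; TLE1 (system start) is represented by the default initial state.

```python
# (state, event) -> next state; rule IDs from the table are noted per entry.
TRANSITIONS = {
    ("S0", "initiate_workflow"): "S1",    # TLE2
    ("S1", "resolve_hierarchy"): "S2",    # TLE3
    ("S2", "evaluate_children"): "S3",    # TLE4
    ("S3", "apply_update"): "S4",         # TLE5 (update_required)
    ("S3", "no_update"): "S5",            # TLE6 (not update_required)
    ("S4", "persist_changes"): "S5",      # TLE7
    ("S5", "has_next_unit"): "S0",        # TLE8
    ("S5", "no_next_unit"): "S6",         # TLE9
    ("S6", "workflow_complete"): "S0",    # TLE10
    ("S0", "no_unprocessed_unit"): "S6",  # TLE11
}


def run(events, state="S0"):
    """Replay events from the given state; an illegal event raises KeyError."""
    for event in events:
        state = TRANSITIONS[(state, event)]
    return state


write_path = run(["initiate_workflow", "resolve_hierarchy", "evaluate_children",
                  "apply_update", "persist_changes", "no_next_unit"])
read_only_path = run(["initiate_workflow", "resolve_hierarchy", "evaluate_children",
                      "no_update", "has_next_unit"])
```

Every non-initial state in this table has at least one outgoing entry, which is the informal counterpart of the deadlock-freedom assertions reported for the CSP model.
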
| Property | CSP Assertion | FDR Result | Engineering Significance |
|---|---|---|---|
| Core System Safety | TLE_Process : [deadlock free], TLE_Process [T = TLE_Abstract_Process], TLE_Process [F = TLE_Abstract_Process], TLE_Process [FD = TLE_Abstract_Process] | ✓ Passed (4) | Confirms conformance to the abstract model and absence of halting executions; guarantees full behavioral refinement |
| State-Level Reliability | TLE_S0, TLE_S1.u1–u3, …, TLE_S6.u1–u3 (Implementation) TLE_Abstract_S0, TLE_Abstract_S1.u1–u3, …, TLE_Abstract_S6.u1–u3 (Abstract) | ✓ Passed (38) | Ensures deadlock freedom for all operational states across all unit parameters; validates unit-specific determinism |
| Liveness Guarantees | TLE_Process : [divergence free], TLE_Abstract_Process : [divergence free] | ✓ Passed (2) | Confirms absence of infinite internal activity; guarantees workflow continuity |
| Composition & Robustness | TLE_TwoUnits : [deadlock free], TLE_Abstract_TwoUnits : [deadlock free], TLE_Hostile_System : [deadlock free], TLE_HostileEnv : [deadlock free], TLE_Process : [deterministic [F]] | ✓ Passed (5) | Validates safe concurrent execution, robustness under adversarial inputs, and internal determinism of the TLE workflow |
| Characteristic | Operation /Complexity | Explanation |
|---|---|---|
| Storage Efficiency | Storage ratio Ć / (ĉ · k) | Encodes child-relationship sets in bitmasks instead of foreign key rows. Ć = average bitmask size; ĉ = average children per parent; k = metadata overhead per relational child record. For sparse hierarchies where Ć ≪ ĉ · k, TLE yields substantial storage reduction. |
| Query Complexity | O(1) (n ≤ w), O(⌈n/w⌉) otherwise | Bitmask lookup enables constant-time child existence checks when the hierarchy fits within a standard word size. |
| Update Cost | O(1) (n ≤ w), O(⌈n/w⌉) otherwise | Updates (adding/removing child association) are performed via bitwise OR / AND / XOR instead of relational inserts/deletes. |
| Batch Parent Traversal | Linear in the number of parent entities | A linear scan over all parent entities eliminates index lookups, since parent–child presence is determined from the mask. |
| Denormalization Cost | O(1) amortized | There are no join tables, as relationships are encoded directly in each parent row. |
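
When a parent has more children than fit in one machine word, the mask can be split across ⌈n/w⌉ words, which is where the O(⌈n/w⌉) bound comes from. The sketch below assumes w = 64 and a plain list of integers as storage; the helper names are hypothetical. Testing or setting a single known child still touches one word, while operations over the whole child set visit every word.

```python
W = 64  # assumed word size in bits


def word_count(n):
    """Number of w-bit words needed for n children: ceil(n / w)."""
    return -(-n // W)


def new_mask(n):
    """An empty multi-word mask for n children."""
    return [0] * word_count(n)


def set_child(words, idx):
    """Set bit idx; touches exactly one word."""
    words[idx // W] |= 1 << (idx % W)


def has_child(words, idx):
    """Test bit idx; touches exactly one word."""
    return ((words[idx // W] >> (idx % W)) & 1) == 1


def any_child(words):
    """Whole-set query: visits all ceil(n / w) words."""
    return any(word != 0 for word in words)


mask = new_mask(200)   # 200 children -> 4 words
set_child(mask, 130)   # lands in word 2, bit 2
```
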
| Property | Description | Formal Basis |
|---|---|---|
| Storage Efficiency | Replaces O(m) foreign key storage with O(Σ Cᵢ) bitmask storage, yielding an asymptotic reduction of O(1/k). Sparse hierarchies amplify the reduction factor | Theorem A.10.1 |
| Query Complexity | O(1) lookup of child-membership status when n ≤ w (word size) using bitwise tests; O(⌈n/w⌉) for larger hierarchies | Theorem A.10.2 |
| Update Complexity | O(1) bitwise update on the mask; does not require relational mutations | Theorem A.10.3 |
| Batch Processing | Direct sequential scan through bitmasks enables parent-level batch traversal in time linear in the number of parent entities | Theorem A.10.4 |
| Semantic Expressiveness | Maintains explicit root → parent → child semantics; masks encode relationship cardinality constraints | Section 4.2 (Figs. 14–15), [96] |
| Behavioral Correctness | Verified deadlock-free lifecycle based on TLE state machine | FDR4 Proof (Appendix A.9) |
| Empirical Evidence | Demonstrated significant storage savings and faster query execution at MVP and enterprise deployment scale | Section 5 |
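
As a worked illustration of the storage ratio Ć / (ĉ · k) used in the characteristics table, the following sketch evaluates it for one set of inputs. The numbers are hypothetical, chosen only to show the arithmetic, and all three quantities are assumed to be expressed in bits.

```python
def storage_ratio(avg_mask_bits: float,
                  avg_children: float,
                  overhead_bits_per_child_row: float) -> float:
    """Bitmask storage divided by relational child-row storage.

    avg_mask_bits               ~ C-acute: average bitmask size
    avg_children                ~ c-hat:   average children per parent
    overhead_bits_per_child_row ~ k:       overhead per relational child record
    """
    return avg_mask_bits / (avg_children * overhead_bits_per_child_row)


# Hypothetical inputs: one 64-bit mask per parent, 8 children per parent,
# and 128 bits of key/metadata overhead per relational child row.
ratio = storage_ratio(64, 8, 128)
print(ratio)      # 0.0625
print(1 / ratio)  # 16.0
```

Under these assumed inputs the bitmask occupies one sixteenth of the relational footprint; the ratio approaches or exceeds 1 when parents have few children relative to the mask width.
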
| Technique | Purpose | Role in Architecture | Benefits |
|---|---|---|---|
| Bitmask Encoding (4.1) | Efficient node selection and state tracking | Foundation: Encodes set membership at O(1) complexity | Compact storage, constant-time operations, parallelizable |
| Three-Level Encapsulation (4.2) | Structured hierarchical data management | Framework: Applies bitmask encoding to Grandparent-Parent-Children structure | Eliminates joins, O(1) relationship queries, scalable design |
| Aspect | PBFD Outcome | Reference & Notes |
|---|---|---|
| Development Speed | At least 9× faster than equivalent relational development and 20× faster than OmniScript; full-stack system delivered in 1 FTE-month | Appendix A.20 — longitudinal observational study [103,104] |
| Runtime Performance | 7.64× faster (P50), 8.54× faster (P95); P5 equal to baseline (identical latency floor); sustained across 8 years | Appendix A.21 — quasi-experimental runtime comparison under identical infrastructure [105,106] |
| Storage Efficiency | 11.7× less reserved space, 85.7× smaller index size, 113.5× better page utilization; eliminated junction tables | Appendix A.22 — controlled schema-level evaluation comparing PBFD vs. normalized designs [105,107] |
| System Stability | Zero critical defects, deadlocks, or regressions across 8 years | Internal monitoring; longitudinal observational study [97] |
| Onboarding Efficiency | Junior developer delivered a production feature in one week | Internal engineering metrics — qualitative observational evidence [107] |
| Design Dimension | Development Speed | Runtime Performance | Storage Efficiency |
|---|---|---|---|
| Unit of Comparison | Implementation methodology (PBFD vs. relational vs. OmniScript) | Different UI endpoints within the same deployed application | Different schema designs (TLE vs. normalized) within the same database |
| Evaluation Focus | Effort and time required to implement equivalent functionality | Request latency and execution speed | Reserved space, index size, and page utilization |
| Controlled Variables | Shared enterprise context, functional requirements, audit logging | Same hardware and application context; workload varies by page logic | Same DBMS, hardware, and data volume |
| Independent Variable | Development methodology and platform | Page-level logic and rendering paths | Schema structure (TLE vs. normalized joins) |
| Study Type | Longitudinal observational case study | Quasi-experimental comparison | Controlled schema-level experiment |
| Scenario | Traditional FSSD Advantage | Trade-off with PDFD | Trade-off with PBFD |
|---|---|---|---|
| Small-Scale Projects | Minimal setup and tooling overhead consistent with lightweight processes [111] | Vertical slicing overhead unnecessary for trivial systems | Hierarchical encoding and TLE architecture add unnecessary complexity. |
| Rapid Prototyping | Drag-and-drop tools enable quick iteration | Slower initial visibility due to vertical rigor | Architecture-first planning delays visible prototypes. |
| Non-Hierarchical Systems | Works well for simple CRUD apps and dashboards | Hierarchy modeling unnecessary | Hierarchical encoding (TLE, bitmasks) provides no benefit. |
| Legacy Integration | Compatible with existing monolithic, relational systems | Requires refactoring into vertical feature slices with explicit dependencies | Legacy schemas must be restructured into TLE's three-level hierarchical architecture. |
| Team Familiarity | Common practice with extensive tooling support [112] | Requires learning feature-first structuring and validation workflows | A solid understanding of TLE, bitmask encoding, and level-wise progression is required. |
| Criterion | Traditional FSSD | PDFD | PBFD |
|---|---|---|---|
| Method Focus | Iterative feature development with flexible layering [110] | Complete vertical feature slices (UI→Logic→DB) with early integration | Systematic layer-by-layer development with pattern-driven refinement |
| Progression Model | Flexible layer transitions; sprint-based iteration | Depth-first traversal per feature slice with bounded refinement (Rₘₐₓ) | Breadth-first level traversal with selective depth-first pattern elaboration and bounded refinement (Rₘₐₓ) |
| Early Deliverable | Partial features across layers; integration deferred | Fully functional end-to-end feature slice | Complete architectural skeleton with interface definitions across all layers |
| Risk Visibility | Late-stage integration and architectural risks [65] | Feature-level integration risks identified and resolved early | Interface contracts and architectural inconsistencies identified early |
| Concurrency | Sprint-based parallelism with cross-functional teams | Controlled parallel feature development via Kᵢ threshold (WIP limit per level) | Parallel layer development after interface stabilization |
| Architectural Discipline | Emergent architecture evolving through iterative refinement | Explicit dependency structure via directed acyclic graph (DAG) with feature-level adaptation | Strong upfront hierarchical design with DAG-enforced dependencies and TLE-encoded structure |
| Predictability | Variable integration timelines; architecture emerges over time | High predictability for vertical slice completion and feature delivery | High predictability for architectural coverage and systematic layer completion |
| Ideal Use Cases | Simple consumer applications, low-risk web/mobile projects | Enterprise applications requiring early end-to-end validation; safety-critical systems | Platform systems, distributed architectures, and deeply nested hierarchical data models |
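
The difference between the two progression models in the table can be sketched as two traversal orders over the same hierarchy. This is only an illustration of visiting order, not of the full PDFD or PBFD workflows: the feature tree `TREE` is hypothetical, it is a tree rather than a general DAG (so no visited set is needed), and refinement bounds such as Rₘₐₓ are omitted.

```python
# Hypothetical feature hierarchy: each node maps to its ordered children.
TREE = {
    "root": ["A", "B"],
    "A": ["A1", "A2"],
    "B": ["B1"],
    "A1": [], "A2": [], "B1": [],
}


def depth_first(graph, start):
    """Depth-first order: finish one branch before starting the next."""
    order, stack = [], [start]
    while stack:
        node = stack.pop()
        order.append(node)
        # Push children reversed so the leftmost child is visited first.
        stack.extend(reversed(graph[node]))
    return order


def breadth_first_levels(graph, start):
    """Breadth-first order: process every node of a level before descending."""
    levels, frontier = [], [start]
    while frontier:
        levels.append(frontier)
        frontier = [child for node in frontier for child in graph[node]]
    return levels


print(depth_first(TREE, "root"))
# ['root', 'A', 'A1', 'A2', 'B', 'B1']
print(breadth_first_levels(TREE, "root"))
# [['root'], ['A', 'B'], ['A1', 'A2', 'B1']]
```

The depth-first order completes slice A before B is touched, while the level-wise order exposes the whole second level before any leaf is elaborated.
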
| Aspect | Conventional Relational Schema | PBFD with TLE Schema |
|---|---|---|
| Hierarchy Representation | Foreign-key relationships; graph edges stored as references across tables | Bitmask encoding; child membership compressed into integer fields within parent columns |
| Hierarchy Resolution | Recursive queries or multi-hop joins (O(m log n) for m relationships with B-tree indexes) | Bitwise operations on encoded paths (O(1) per parent-child query) |
| Query Pattern | Multi-table joins traversing foreign keys | Single-table queries using bitwise predicates on bitmask columns |
| Scalability Approach | Functional or domain-based partitioning | Horizontal partitioning at grandparent level with independent TLE table instances |
| Relationship Storage Overhead | Foreign-key columns with supporting indexes (k bits per relationship) | Compact bitmask fields (1 bit per child node) |
| Update Operations | Multi-row INSERT/UPDATE/DELETE across related tables | Single-row bitwise updates within grandparent table cells |
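
The contrast between the two query patterns can be made concrete with a small, self-contained sketch using Python's built-in `sqlite3` module. The table and column names (`parent_child`, `parent_mask`, `child_mask`) are hypothetical and much simpler than a real TLE schema; the sketch only shows a junction-table lookup next to a bitwise predicate on a single table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Conventional design: one junction row per parent-child relationship.
    CREATE TABLE parent_child (parent_id INTEGER, child_id INTEGER);
    INSERT INTO parent_child VALUES (1, 0), (1, 3), (2, 3);

    -- Bitmask design: bit i of child_mask is set when child i belongs
    -- to the parent. 9 = 0b1001 (children 0 and 3), 8 = 0b1000 (child 3).
    CREATE TABLE parent_mask (parent_id INTEGER PRIMARY KEY, child_mask INTEGER);
    INSERT INTO parent_mask VALUES (1, 9), (2, 8);
""")


def parents_with_child_relational(child_id):
    """Look up parents through the junction table."""
    rows = conn.execute(
        "SELECT parent_id FROM parent_child WHERE child_id = ? ORDER BY parent_id",
        (child_id,),
    )
    return [row[0] for row in rows]


def parents_with_child_bitmask(child_id):
    """Look up parents with a bitwise predicate on one table."""
    rows = conn.execute(
        "SELECT parent_id FROM parent_mask "
        "WHERE (child_mask & (1 << ?)) != 0 ORDER BY parent_id",
        (child_id,),
    )
    return [row[0] for row in rows]


def add_child_bitmask(parent_id, child_id):
    """Single-row update: OR the child's bit into the parent's mask."""
    conn.execute(
        "UPDATE parent_mask SET child_mask = child_mask | (1 << ?) WHERE parent_id = ?",
        (child_id, parent_id),
    )


print(parents_with_child_relational(3))  # [1, 2]
print(parents_with_child_bitmask(3))     # [1, 2]
```

Note that the bitmask query scans parent rows and applies a constant-time test to each one; the O(1) figure in the table refers to that per-parent test, not to the scan as a whole.
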
| Approach | Strengths | Weaknesses | How PBFD/PDFD Address These |
|---|---|---|---|
| Relational | ACID compliance, mature tooling, strong consistency guarantees | Recursive joins required for hierarchies (O(n log n)); poor native hierarchy support | TLE architecture: Eliminates recursive joins via bitmask-encoded parent-child relationships, achieving O(1) hierarchy queries while preserving ACID guarantees |
| Graph (Neo4j) | Natural hierarchy traversal and relationship queries [113] | High storage overhead for edge metadata; lacks formal schema discipline | PDFD/PBFD structure: Enforces formal DAG-based schema with explicit dependency management; TLE encoding: Reduces edge storage via compact bitmask representation |
| Document Stores (MongoDB) | Schema flexibility; embedded document hierarchies | No formal hierarchy guarantees; inconsistent nested structure | PDFD/PBFD methodology: Provides formal hierarchical validation and state machine guarantees; TLE pattern: Enforces consistent three-level structure with verified state transitions |
| XML Databases | Native tree queries via XPath/XQuery [114] | Slow updates due to DOM manipulation; poor horizontal scalability | TLE implementation: Single-row atomic updates via bitwise operations; PBFD partitioning: Horizontal scaling through grandparent-level table distribution |
| Columnar Stores (Cassandra) | High-performance batch reads; excellent write throughput [52] | Weak transaction guarantees; limited join support | Hybrid TLE architecture: Combines relational ACID guarantees with columnar-style fixed-width encoding; achieves transactional safety with efficient batch processing |
| Aspect | Traditional Bitmap Indexing | PBFD Bitmask Encoding |
|---|---|---|
| Primary Purpose | Query optimization for filtering low-cardinality columns [115] | Hierarchical relationship representation and traversal |
| Granularity | One bitmap per distinct attribute value across all rows | One bit per child node within each parent's bitmask |
| Hierarchy Awareness | None; operates on flat attribute values only | Native support for multi-level hierarchies via Three-Level Encapsulation (TLE) |
| Storage | Separate bitmap for each distinct value (external index structure) | Bitmasks embedded within parent rows (one bitmask column per parent type) |
| Query Pattern | Accelerates WHERE clauses on indexed columns via bitmap operations | Enables O(1) parent-child membership queries via bitwise tests |
| Use Case | Data warehouse filtering on low-cardinality dimensions | Hierarchical data compaction and constant-time relationship traversal |
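
The granularity difference in the table can be seen in a short sketch that builds both structures from toy data. The column values, parent names, and masks below are hypothetical; the point is only where the bits live: one bitmap per distinct value spanning all rows, versus one mask per parent spanning its children.

```python
# Bitmap index: one bitmap per distinct column value, one bit per row.
column = ["red", "blue", "red", "green"]  # hypothetical low-cardinality column


def build_bitmap_index(values):
    index = {}
    for row_id, value in enumerate(values):
        index[value] = index.get(value, 0) | (1 << row_id)
    return index


bitmap_index = build_bitmap_index(column)
print(bitmap_index)  # {'red': 5, 'blue': 2, 'green': 8}

# Per-parent bitmask: one mask stored with each parent, one bit per child.
parent_masks = {"P1": 0b1001, "P2": 0b0100}  # hypothetical parents


def parent_has_child(parent, child_index):
    return bool((parent_masks[parent] >> child_index) & 1)


print(parent_has_child("P1", 3), parent_has_child("P2", 3))  # True False
```

The bitmap index answers "which rows have value v", whereas the per-parent mask answers "does parent p contain child c", which is the relationship query the TLE encoding targets.
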
| Aspect | Multiple Columns | Multiple Rows | PBFD Bitmask Encoding |
|---|---|---|---|
| Storage Footprint | High: separate column for each child node (e.g., n columns for n children) | High: one row per selected child, requiring foreign keys and indexes | Compact: single integer field per parent (1 bit per child; n ≤ 64 fits in 64-bit word) |
| Query Complexity | O(n) column scans to check all children | O(n) joins or subqueries to aggregate selections | O(1) bitwise tests for membership checks (for n ≤ w) |
| Update Operations | O(n) column updates for batch changes | O(n) INSERT/DELETE operations for relationship changes | O(1) bitwise operations (OR, AND, XOR) for atomic updates |
| Scalability | Schema changes required to add new children (DDL operations) | Join complexity increases with relationship count | Bounded by word size w (typically 64); extensible to O(⌈n/w⌉) for n > w via multi-word encoding |
| Schema Flexibility | Rigid: requires DDL for each new child | Flexible: new relationships via INSERT | Semi-flexible: bounded by bitmask capacity; requires column type upgrade for n > w |
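
For n > w, the table refers to a multi-word encoding. One possible shape for it is sketched below, assuming 64-bit words stored in a Python list; the class name `MultiWordMask` and its methods are illustrative and not taken from the paper. Single-bit operations still touch one word, while whole-mask operations visit all ⌈n/w⌉ words.

```python
WORD_BITS = 64  # assumed word width w


class MultiWordMask:
    """Child-membership set for n children stored in ceil(n / w) words."""

    def __init__(self, n_children: int):
        self.n = n_children
        self.words = [0] * ((n_children + WORD_BITS - 1) // WORD_BITS)

    def _locate(self, index: int):
        if not 0 <= index < self.n:
            raise IndexError("child index out of range")
        return divmod(index, WORD_BITS)  # (word position, bit position)

    def add(self, index: int) -> None:
        word, bit = self._locate(index)
        self.words[word] |= 1 << bit

    def remove(self, index: int) -> None:
        word, bit = self._locate(index)
        self.words[word] &= ~(1 << bit)

    def contains(self, index: int) -> bool:
        word, bit = self._locate(index)
        return bool((self.words[word] >> bit) & 1)

    def intersects(self, other: "MultiWordMask") -> bool:
        """Whole-mask test: visits every word, i.e. ceil(n / w) steps."""
        return any(a & b for a, b in zip(self.words, other.words))


m = MultiWordMask(130)  # needs 3 words of 64 bits
m.add(0)
m.add(129)
print(len(m.words), m.contains(129), m.contains(64))  # 3 True False
```
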
| Benefit | PDFD | PBFD |
|---|---|---|
| Development Velocity | Enables early completion of fully functional vertical feature slices | Accelerates development via pattern-driven modularity and level-wise batch processing |
| Scalability | Supports independent scaling of modular feature slices | Supports horizontal partitioning at the TLE grandparent level, enabling distributed processing [53] |
| Rigor and Quality | Enforces formal state transitions with bounded refinement cycles (Rₘₐₓ) ensuring termination | Combines pattern-level validation with bounded refinement cycles (Rₘₐₓ), ensuring both horizontal coverage and vertical correctness |
| Architectural Clarity | Enforces explicit feature boundaries and dependency structures via directed acyclic graphs | Enforces layered hierarchical design via directed graphs and Three-Level Encapsulation (TLE), aligning with architectural modularity principles [65] |
| Data Model | Proposed TLE Mapping | Key Research Question |
|---|---|---|
| Document Database (MongoDB) | Collection → Document → Nested bitmask fields | Do MongoDB's bitwise operators ($bitsAllSet) provide query advantages over array-based flags, or do index scan costs outweigh storage benefits in row-oriented BSON? |
| Key-Value Store (Redis) | Key namespace prefix → Structured keys → Bitmask values | Why does user→bitmask fail for cohort queries, and how does permission→bitmap achieve O(1) filtering with BITOP operations? |
| Graph Database (Neo4j) | Node labels → Node instances → Properties with bitmasks | When do bitmask properties undermine index-free adjacency, and how do native edges preserve traversal performance? |
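
The key-value question above turns on the orientation of the mapping. The sketch below models both orientations with plain Python dictionaries standing in for a key-value store; it makes no calls to Redis, and the users, permission numbers, and function names are hypothetical. It shows only why a cohort lookup needs a scan in one orientation and a single key read in the other.

```python
# Orientation 1: user -> bitmask of permissions (bit p = permission p).
user_masks = {"alice": 0b011, "bob": 0b010, "carol": 0b100}
user_ids = {"alice": 0, "bob": 1, "carol": 2}  # bit position per user


def cohort_by_scan(permission: int):
    """Finding all users with a permission requires visiting every user key."""
    return sorted(u for u, mask in user_masks.items() if (mask >> permission) & 1)


# Orientation 2: permission -> bitmap of users (bit i = user with id i).
def invert(masks, ids):
    bitmaps = {}
    for user, mask in masks.items():
        permission = 0
        while mask:
            if mask & 1:
                bitmaps[permission] = bitmaps.get(permission, 0) | (1 << ids[user])
            mask >>= 1
            permission += 1
    return bitmaps


permission_bitmaps = invert(user_masks, user_ids)


def cohort_by_bitmap(permission: int):
    """One key read returns the whole cohort as a bitmap, decoded here."""
    bitmap = permission_bitmaps.get(permission, 0)
    return sorted(u for u, i in user_ids.items() if (bitmap >> i) & 1)


print(cohort_by_scan(1))    # ['alice', 'bob']
print(cohort_by_bitmap(1))  # ['alice', 'bob']
```

Whether the second orientation achieves the constant-time filtering suggested in the table on a real store is exactly the open question the row poses; the sketch does not measure it.
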
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2025 by the author. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).