5. Unity Profiling Tools for Performance and Memory Optimization
In this section, we present an in-depth analysis and optimization procedure for improving the runtime performance and memory usage of Unity throughout the scientific visualization simulation. The study employs a multi-faceted evaluation framework: a performance assessment based on thermal wall conditions over fixed runtime intervals, in-depth profiling using Unity’s Profiler with incremental garbage collection (GC) disabled, and inspection of memory consumption trends across execution timelines in Unity’s Memory Profiler. Unity offers a comprehensive profiling ecosystem to analyze and optimize the performance of real-time applications. Two critical components in this suite are the Profile Analyzer and the Memory Profiler, which enable systematic evaluation of runtime execution behavior and memory utilization.
The Profile Analyzer operates on data captured from Unity’s native Profiler and facilitates statistical analysis across multiple frames. It supports detailed inspection of CPU-intensive operations, including markers such as PlayerLoop, EditorLoop, and render pipeline stages like ExecuteRenderGraph and DoRenderLoop_Internal. From the per-frame data, it computes summary metrics such as mean, median, minimum, maximum, and standard deviation, allowing developers to identify execution hotspots, temporal anomalies, and performance regressions across test conditions. This tool is particularly effective for comparative analysis between build variants, platform targets, or rendering configurations.
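As an illustration only, the per-marker statistics described above can be reproduced offline with a few lines of Python. The sketch below assumes the frame times of one marker have already been exported as a plain list of millisecond values. The function name `summarize_marker` and the sample values are hypothetical; they are not part of Unity’s tooling or of our dataset.

```python
import statistics


def summarize_marker(frame_times_ms):
    """Return summary statistics for one marker's per-frame times (ms)."""
    return {
        "mean": statistics.fmean(frame_times_ms),
        "median": statistics.median(frame_times_ms),
        "min": min(frame_times_ms),
        "max": max(frame_times_ms),
        # Population standard deviation over the captured frames.
        "stdev": statistics.pstdev(frame_times_ms),
    }


# Hypothetical per-frame timings (ms) for a single marker; not measured data.
player_loop_ms = [10.0, 12.0, 11.0, 30.0, 12.0]
summary = summarize_marker(player_loop_ms)
print(summary)
```

In this made-up sample, the single 30 ms frame pulls the mean well above the median, which is the kind of temporal anomaly that a comparison of the two statistics is meant to expose.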
The Memory Profiler provides a snapshot-based, segmented breakdown of memory allocations within Unity’s execution environment. It differentiates between Tracked Memory (memory explicitly managed by Unity, such as resource files, materials, and scripts) and Untracked Memory (memory that is not directly visible to the engine, typically from native or external sources). The profiler also reports Managed Heap statistics, specifically In Use (actively referenced memory) and Reserved (pre-allocated but unused heap space), as well as fragmentation and object lifecycle data and the memory attributed to graphics drivers, the executable, and DLLs. These metrics are essential for evaluating garbage collection behavior under incremental GC, asset loading overhead, and memory retention issues.
Profiling analysis conducted across multiple configurations of our Unity-based scientific visualization project revealed ten markers that consistently contributed the highest computational load. Notably, PlayerLoop and EditorLoop, which represent the core execution cycles of the runtime and editor environments, respectively, accounted for a substantial share of execution time, indicating that both play a critical role in driving the animation and scene updates of our flow visualizations. Markers associated with Unity’s High Definition Render Pipeline (HDRP), including HDRenderPipelineRenderCamera, Inl_HDRenderPipelineRenderCamera, and Inl_HDRenderPipelineAllRenderRequest, reflect the rendering intensity inherent in our project, which involves high-resolution contour animations and shader-heavy visuals. Additionally, the ExecuteRenderGraph and Inl_ExecuteRenderGraph markers, responsible for coordinating render pass execution through HDRP’s Render Graph architecture, were significant contributors to frame time, highlighting the complexity of rendering layered flow fields across multiple podiums. The presence of Inl_RecordRenderGraph further emphasizes the cost of preparing and recording these passes, particularly when rendering large GLTF/GLB assets in real time. The marker UnityEngine.CoreModule.dll!UnityEngine.Rendering reflects the low-level rendering tasks initiated by Unity’s core systems, and its high resource usage suggests that shader compilation and draw call management are nontrivial in our pipeline. Finally, Profiler.FlushCounters, though not directly part of the gameplay loop, consumed measurable resources due to the profiling process itself, particularly when capturing data per frame. Together, these markers point to concrete optimization targets, especially under the memory and computational constraints imposed by our use of GLTF/GLB formats and real-time streaming of scientific data.
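Selecting the most expensive markers from a capture amounts to ranking them by mean time. The following is a minimal sketch of that ranking, assuming the mean time per marker is available as a dictionary. The helper `top_markers` and all numeric values are hypothetical placeholders, not results from our profiling runs.

```python
def top_markers(mean_ms_by_marker, n=10):
    """Return the n markers with the largest mean time, highest first."""
    ranked = sorted(mean_ms_by_marker.items(), key=lambda kv: kv[1], reverse=True)
    return ranked[:n]


# Hypothetical mean times (ms); placeholders only, not measured values.
means = {
    "PlayerLoop": 16.0,
    "EditorLoop": 9.5,
    "ExecuteRenderGraph": 4.2,
    "Profiler.FlushCounters": 0.8,
}
top3 = top_markers(means, n=3)
for name, mean_ms in top3:
    print(f"{name}: {mean_ms:.1f} ms")
```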
Enabling or disabling incremental garbage collection as appropriate, and monitoring memory allocation through Unity’s Memory Profiler, can help minimize GC-related spikes. Key optimization techniques include minimizing memory fragmentation through careful management of asset loading and unloading, and using object pooling to reduce allocation frequency. Compression of textures, shaders, and meshes, along with efficient handling of native code (including graphics drivers and DLLs), can significantly reduce memory overhead. Additionally, minimizing the memory footprint of untracked allocations, such as those from external libraries or plugins, is crucial for overall optimization. To further optimize memory usage, it is important to monitor memory spikes during scene transitions or asset loading and to ensure that unused objects and unmanaged memory are released appropriately to prevent memory leaks. Continuous monitoring of both tracked and untracked memory using Unity’s Memory Profiler helps reduce GC-related performance issues. The process and outcomes of these efforts to optimize simulation execution for real-time applications are described in the following subsections.
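The object pooling idea mentioned above can be sketched independently of any engine. The example below is a language-agnostic illustration written in Python, not the implementation used in our Unity project. The class `ObjectPool` and the 1 KB buffer factory are assumptions chosen only to show how reuse reduces the number of allocations.

```python
class ObjectPool:
    """Minimal pool: hand out released objects before creating new ones."""

    def __init__(self, factory):
        self._factory = factory
        self._free = []
        self.created = 0  # counts how many objects were actually allocated

    def acquire(self):
        if self._free:
            return self._free.pop()
        self.created += 1
        return self._factory()

    def release(self, obj):
        # The caller must not use obj after releasing it.
        self._free.append(obj)


# Hypothetical workload: 100 frames, each needing one temporary buffer.
pool = ObjectPool(factory=lambda: bytearray(1024))
for _ in range(100):
    buf = pool.acquire()
    pool.release(buf)

print(pool.created)
```

Because each buffer is returned before the next request, only one allocation occurs across all iterations; without the pool, the same loop would allocate a new buffer every frame and leave the old ones for the garbage collector.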
5.1. Thermal Wall Conditions Based Stability Evaluation: Hot, Cold and Adiabatic
To evaluate the effect of runtime duration on performance and resource usage, the flow simulations were run for intervals of 5, 10, and 15 minutes. Performance metrics were recorded throughout the experiment to determine how increasing the simulation duration affected execution time (in milliseconds) and system stability for both formats. The results show that performance stabilizes as runtime increases, indicating the potential for long-duration simulations with predictable resource consumption when optimized correctly. This procedure also highlighted the variations in memory handling and performance stability between the GLTF and GLB formats, particularly in extended simulation scenarios.
Preliminary profiling data derived from the Unity-based implementation reveal that the Hot wall flow scenario consistently incurs longer execution times across the majority of key computational markers when compared to the Cold and Adiabatic boundary conditions. This empirical observation aligns with established theoretical expectations in both fluid dynamics and computational visualization. Specifically, hot-wall boundary conditions promote increased flow isotropy and the development of finer-scale turbulent structures, which in turn generate more complex iso-contour topologies and elevated particle interaction densities (heavier frame files). Within the Unity environment, such flow characteristics manifest as increased computational overhead, primarily due to intensified mesh deformation, a greater number of vertex manipulations, and more frequent shader evaluations executed per frame. Furthermore, the presence of thermally induced vortical structures necessitates the instantiation of denser particle fields, thereby amplifying draw call frequency and real-time physics computation demands during animation rendering.
Such elevated complexity in simulation and visualization directly affects markers such as PlayerLoop, EditorLoop, HDRenderPipelineRenderCamera, and ExecuteRenderGraph, leading to a measurable increase in frame time and resource usage for the Hot case. The render pipeline, especially in the HDRP framework, experiences additional computational overhead when processing complex materials and lighting conditions related to thermal plumes and gradients.
Figure 5.
Comparison of mean execution times for the top 10 most resource-consuming markers across Cold, Hot, and Adiabatic flow cases over a 15-minute runtime period.
Notably, during extended runtime evaluations (e.g., 15-minute continuous execution), the Cold flow case begins to demonstrate computational resource utilization that is comparable to, or marginally exceeds, that of the Hot case for specific performance markers (Figure 5). This observed shift is principally attributable to the cumulative memory allocation patterns and asset management behaviors inherent to the Unity engine. Specifically, the increasing complexity of mesh geometries and the rendering of high-resolution contour plots in the Cold scenario progressively impose greater demands on both memory and processing subsystems. Despite this transition, the initial observation that the Hot flow configuration incurs a higher computational load remains significant, as it provides critical insight into the influence of thermofluidic properties (e.g., elevated temperatures and intensified turbulence) on Unity’s real-time visualization efficiency.
5.2. Profiler Based Performance Comparison: GLTF vs GLB
A profiler-based comparative analysis was conducted using Unity’s Profile Analyzer to evaluate execution performance between Unity projects utilizing individual GLTF files and a merged GLB file. Each project contained 40 frames across three thermal flow cases (Cold, Hot, and Adiabatic), with simulations executed over a 15-minute interval to ensure a consistent performance environment. Profiling in this procedure focused on the 10 most resource-consuming markers, including PlayerLoop, EditorLoop, HDRenderPipelineRenderCamera, Inl_ExecuteRenderGraph, and DoRenderLoop_Internal.
Figure 6 presents the mean execution time comparison, showing that the GLB-based project (red) consistently demonstrated lower execution times than the GLTF-based project (blue) across critical render and update loops. The results indicate that the GLTF format incurs additional runtime overhead, likely due to its distributed structure, which requires multiple discrete resource loading operations, along with an increase in shader/material update costs during real-time data rendering. By contrast, the monolithic binary structure of the GLB format allows more streamlined memory access and faster resource decoding, resulting in lower execution times across complex rendering and simulation stages.
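A per-marker comparison between two captures, such as the one summarized in Figure 6, reduces to a relative difference of mean times. The sketch below shows one way to compute it, assuming the means of each capture are available as dictionaries keyed by marker name. The names `baseline` and `candidate` and all values are hypothetical and deliberately do not correspond to either format’s measured results.

```python
def relative_change(baseline_ms, candidate_ms):
    """Percent change of candidate relative to baseline, per shared marker."""
    return {
        marker: 100.0 * (candidate_ms[marker] - baseline_ms[marker]) / baseline_ms[marker]
        for marker in baseline_ms
        # Skip markers missing from one capture or with a zero baseline.
        if marker in candidate_ms and baseline_ms[marker] > 0
    }


# Hypothetical mean times (ms) from two captures; placeholders only.
baseline = {"PlayerLoop": 20.0, "EditorLoop": 10.0}
candidate = {"PlayerLoop": 18.0, "EditorLoop": 10.5}

change = relative_change(baseline, candidate)
for marker, pct in change.items():
    print(f"{marker}: {pct:+.1f}%")
```

A negative value means the candidate capture spent less time in that marker than the baseline; reporting the change per marker avoids hiding a regression in one loop behind an improvement in another.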
Furthermore, targeted optimizations involving Unity’s Garbage Collector (GC) configuration were evaluated to improve memory stability during high-load simulations. Disabling incremental GC yielded a significant decrease in runtime memory spikes and improved overall simulation efficiency (Figure 7), particularly during intensive flow animations. This evaluation aimed to assess computational efficiency and determine the optimal format for large-scale scientific visualization workflows in Unity. These findings suggest that, although the GLTF format remains advantageous for modular resource management, GLB combined with incremental GC disabled is better suited to real-time computational performance and runtime efficiency in such workflows. To maximize performance in expansive simulations, pre-processing strategies such as merging individual GLTF frames into optimized GLB assets are essential, as they improve rendering stability and decrease loading overhead without compromising scientific detail.
Figure 6.
Comparison of mean execution times for the top 10 most performance-consuming profiler markers across GLTF and GLB projects.
Figure 7.
Comparison of Mean Execution Times for Top 10 Markers with Garbage Collection Enabled and Disabled.
5.3. Memory Utilization Analysis for GLTF vs GLB in Scientific Visualization
In high-fidelity scientific visualization applications such as Direct Numerical Simulation (DNS) flow animations, efficient memory management plays a pivotal role in ensuring smooth runtime performance and scalability, particularly in immersive environments like virtual and mixed reality. To evaluate runtime performance and optimize resource loading and animation, two approaches were profiled: one utilizing 40 individually loaded and animated GLTF files, and the other employing 40 merged GLB files. Both configurations were evaluated using Unity’s Memory Profiler during the execution of animated flow visualizations, which provided quantitative insight into the distribution of memory usage between managed, tracked, and untracked segments. The goal of this experiment was to analyze the memory utilization patterns associated with each method and to determine which approach offers lower memory consumption for high-volume simulation data.
The GLTF-based project recorded a total memory usage of 17.22 GB (Figure 8), with tracked memory at 14.82 GB in use (out of 16.10 GB reserved) and untracked memory totaling 1.12 GB. In contrast, the GLB-based setup reported a lower total memory usage of 15.44 GB (Figure 9), with tracked memory at 14.01 GB (reserved: 14.48 GB) and untracked memory at 0.98 GB. This notable reduction of approximately 10.3% in total usage illustrates how binary asset consolidation in the GLB format leads to more efficient memory handling. Additionally, the smaller gap between reserved and in-use memory in the GLB project indicates tighter memory control, which reduces fragmentation and improves predictability.
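The reduction and the reservation gap quoted above follow directly from the reported figures. As a worked check, the short Python sketch below recomputes them from the values stated in the text; the variable names are ours and are used only for illustration.

```python
# Values in GB, as reported in the text for each configuration.
gltf = {"total": 17.22, "in_use": 14.82, "reserved": 16.10}
glb = {"total": 15.44, "in_use": 14.01, "reserved": 14.48}

# Absolute and relative reduction in total memory usage.
total_saved_gb = gltf["total"] - glb["total"]
total_saved_pct = 100.0 * total_saved_gb / gltf["total"]

# Reservation gap: tracked memory reserved but not in use.
gap_gltf_gb = gltf["reserved"] - gltf["in_use"]
gap_glb_gb = glb["reserved"] - glb["in_use"]

print(f"Total reduction: {total_saved_gb:.2f} GB ({total_saved_pct:.1f}%)")  # 1.78 GB (10.3%)
print(f"Reservation gap, GLTF: {gap_gltf_gb:.2f} GB")  # 1.28 GB
print(f"Reservation gap, GLB:  {gap_glb_gb:.2f} GB")   # 0.47 GB
```

The reservation gap shrinks from 1.28 GB to 0.47 GB, which is the quantity behind the statement that the GLB project holds less reserved-but-unused tracked memory.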
Table 2.
Memory Consumption Metrics for 40 Individual GLTFs and 40 Merged GLB.
| Metric | 40 GLTFs (Individual) | 40 GLB (Merged) | Reduction |
| --- | --- | --- | --- |
| Total Memory Usage | 17.22 GB | 15.44 GB | 1.78 GB (10.3%) |
| Tracked Memory (In Use / Reserved) | 14.82 GB / 16.10 GB | 14.01 GB / 14.48 GB | 810 MB in use |
| Untracked Memory | 1.12 GB | 0.98 GB | 140 MB |
| Managed Heap (In Use / Reserved) | 215.6 MB / 850 MB | 93.7 MB / 124.8 MB | 121.9 MB in use |
A significant difference was also observed in managed heap usage, which directly relates to the .NET runtime’s memory allocation. The GLTF version used 215.6 MB of the heap (reserved: 850 MB), while the GLB version used only 93.7 MB (reserved: 124.8 MB). This stark contrast implies higher garbage collection overhead and frequent allocations in the GLTF approach, likely due to the repetitive instantiation of metadata such as GameObjects, materials, and meshes. In contrast, the GLB strategy minimized these allocations by embedding all assets into a single reusable structure, reducing the load on the garbage collector, and improving runtime consistency.
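One way to read the managed heap figures is as a utilization ratio, that is, the fraction of the reserved heap that is actually in use. The sketch below derives this ratio from the reported values. The ratio itself is our illustrative metric and is not a quantity reported by the Memory Profiler.

```python
# Managed heap values in MB, as reported for each configuration.
gltf_heap = {"in_use": 215.6, "reserved": 850.0}
glb_heap = {"in_use": 93.7, "reserved": 124.8}


def utilization_pct(heap):
    """Share of the reserved managed heap that is in use, in percent."""
    return 100.0 * heap["in_use"] / heap["reserved"]


in_use_saved_mb = gltf_heap["in_use"] - glb_heap["in_use"]
reserved_saved_mb = gltf_heap["reserved"] - glb_heap["reserved"]

print(f"GLTF heap utilization: {utilization_pct(gltf_heap):.1f}%")  # 25.4%
print(f"GLB heap utilization:  {utilization_pct(glb_heap):.1f}%")   # 75.1%
print(f"In-use reduction:   {in_use_saved_mb:.1f} MB")    # 121.9 MB
print(f"Reserved reduction: {reserved_saved_mb:.1f} MB")  # 725.2 MB
```

Under this reading, roughly three quarters of the GLTF project’s reserved heap sits unused, whereas the GLB project uses about three quarters of what it reserves.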
Figure 8.
Memory Profiler Snapshot Summary for 40 individual GLTFs.
Figure 9.
Memory Profiler Snapshot Summary for 40 merged GLBs.
Ultimately, the memory profiling results clearly establish the advantages of merged GLB files over individually loaded GLTF assets in Unity-based DNS visualization projects, making them the more memory-efficient solution for large-scale, frame-by-frame scientific animations. The GLB approach offers a more compact and controlled memory profile, with lower total memory consumption, reduced untracked allocations, and a significantly smaller managed heap footprint. These characteristics are particularly beneficial in extended reality (XR) applications, where performance constraints are tighter and memory efficiency directly impacts user experience. Consequently, adopting a GLB-based asset pipeline is recommended for scalable and immersive scientific visualizations in Unity.
In the context of our GLTF vs. GLB performance analysis, targeted memory optimization plays a critical role in reducing runtime variability and enhancing stability. Unity’s Memory Profiler revealed that GLB files, by encapsulating geometry, textures, and animations in a compact binary format, significantly reduced managed heap fragmentation compared to individually loaded GLTF assets. This consolidation minimizes per-frame memory allocations and improves asset instantiation efficiency at runtime. Key approaches applicable to GLB-based workflows include aggressive texture and mesh compression, which further reduces the memory footprint without sacrificing visual fidelity. The unified nature of GLB also reduces the memory overhead associated with redundant asset references and inconsistent loading paths, which are commonly observed in GLTF workflows. Overall, these improvements aim to reduce frame time variability and to support measurable validation of runtime performance, memory efficiency, and rendering efficiency in complex immersive environments, enabling scalable performance for large-scale scientific visualizations in Unity.