Submitted: 27 January 2026
Posted: 28 January 2026
Abstract
Keywords:
1. Introduction
2. Related Work
2.1. Dynamic Scene Modeling for Autonomous Driving
2.2. Photometric Inconsistency and Appearance Modeling
2.3. Geometric Consistency and Surface Reconstruction
3. Methodology
3.1. Preliminaries: 3D Gaussian Splatting
3.2. Dynamic Gaussian Scene Graph Construction
3.2.1. Graph Node Definitions
- Sky Node: We model the sky using a Far-Field Environment Map representation. To handle the sky's effectively infinite depth, we initialize its Gaussians on a bounding sphere of large radius. These Gaussians are translation-invariant relative to the camera, and their appearance depends solely on the viewing direction. This captures the high-dynamic-range background without introducing depth artifacts.
- Background Node: The static urban environment (e.g., roads, buildings, vegetation) is represented by stationary 3D Gaussians in the world frame. Their parameters capture the time-invariant geometry of the scene, providing a stable geometric backbone.
- Dynamic Node: Moving agents (vehicles, pedestrians) are handled via object-centric nodes. Instead of modeling them directly in world space, we maintain a set of canonical Gaussians in a local coordinate system for each object k and map them into the world frame with a per-frame pose. This allows the model to share geometric features across timestamps.
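The three node types compose into a single world-frame Gaussian set per timestamp: static background Gaussians pass through unchanged, sky Gaussians follow the camera on a far sphere, and each object's canonical Gaussians are transformed by its per-frame pose. A minimal sketch under stated assumptions (the `GaussianSet` container, rigid `(R, t)` object poses, and the camera-following sky placement are illustrative, not the paper's exact data structures):

```python
import numpy as np
from dataclasses import dataclass

@dataclass
class GaussianSet:
    means: np.ndarray  # (N, 3) Gaussian centers; other attributes omitted

def compose_scene(background: GaussianSet,
                  sky: GaussianSet,
                  objects: dict,       # object id -> canonical GaussianSet
                  poses: dict,         # object id -> (R: 3x3, t: 3) pose at time t
                  cam_center: np.ndarray) -> np.ndarray:
    """Gather all Gaussian centers in the world frame for one timestamp."""
    parts = [background.means]
    # Sky Gaussians sit on a far sphere and translate with the camera,
    # so their apparent position depends only on viewing direction.
    parts.append(sky.means + cam_center[None, :])
    # Dynamic objects: map canonical Gaussians to the world frame via the
    # per-frame rigid pose, sharing geometry across timestamps.
    for k, g in objects.items():
        R, t = poses[k]
        parts.append(g.means @ R.T + t[None, :])
    return np.concatenate(parts, axis=0)
```

The composed set is then rasterized jointly, as described in Section 3.2.3.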
3.2.2. Rigid and Deformable Object Modeling
3.2.3. Graph Composition and Rasterization
3.3. Hierarchical Exposure Compensation
3.3.1. Level 1: Global Exposure Affine Module
3.3.2. Level 2: Multi-Scale Bilateral Grid
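The Level-1 module compensates global, per-image sensor shifts before the finer bilateral grid acts locally. As a minimal sketch of what such a global affine stage typically looks like (the function name and the choice of a full 3x3 color matrix plus offset are assumptions, not the paper's exact formulation):

```python
import numpy as np

def apply_global_affine(rendered: np.ndarray,
                        A: np.ndarray,
                        b: np.ndarray) -> np.ndarray:
    """Per-image global exposure compensation: a learnable 3x3 color
    matrix A and offset b map the rendered RGB toward the sensor's
    exposure / white-balance state.
    rendered: (H, W, 3) linear RGB.
    Initialized at identity (A = I, b = 0) so the module is a no-op
    until the SSIM gate allows its parameters to be optimized."""
    return np.einsum('ij,hwj->hwi', A, rendered) + b
```

Initializing at identity is what lets the gate of Section 3.4.1 "freeze" the module simply by withholding gradient updates.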

3.4. Optimization Strategy
3.4.1. Geometry-Aware SSIM-Gating
- Geometric Warm-up: While structural similarity is low, the gate remains closed and the exposure modules are effectively frozen (reduced to the identity mapping). Optimization focuses solely on adjusting the 3D Gaussian parameters (position, rotation, scaling) to match the scene structure.
- Photometric Refinement: Once the geometry is sufficiently reliable, the gate opens. The framework then jointly optimizes the exposure parameters to refine photometric alignment, correcting sensor-specific discrepancies without corrupting the geometry.
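The two-phase schedule above can be sketched as a hard gate deciding which parameter groups receive gradient updates at each step; the threshold value `tau = 0.85` and both function names are assumptions for illustration, not the paper's exact hyperparameters:

```python
def ssim_gate(ssim_value: float, tau: float = 0.85) -> float:
    """Binary gate: closed (0) during geometric warm-up while SSIM < tau,
    open (1) once the rendering is structurally close to the target."""
    return 1.0 if ssim_value >= tau else 0.0

def active_param_groups(ssim_value: float, tau: float = 0.85) -> list:
    """Parameter groups updated this step: Gaussian geometry is always
    optimized; the exposure modules (global affine + bilateral grid)
    stay frozen until the gate opens."""
    groups = ["gaussian_geometry"]
    if ssim_gate(ssim_value, tau) > 0.0:
        groups.append("exposure_modules")
    return groups
```

In practice the gate could also be made soft (e.g., a sigmoid of the SSIM margin) to avoid abrupt transitions; the hard variant shown here matches the warm-up/refinement description most directly.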
3.4.2. Spatiotemporal Smoothness Constraints
3.4.3. Total Objective
4. Experiments
4.1. Experimental Setup
4.1.1. Datasets
1. Waymo Open Dataset [27]: A large-scale driving dataset providing high-quality synchronized camera images and LiDAR point clouds. We select five challenging sequences (approx. 1000 frames) featuring significant lighting variations such as strong shadows, sunlight glare, and dynamic exposure adjustments.
2. Custom Surround-View Dataset: To evaluate performance in unconstrained "in-the-wild" scenarios, we collected data using a vehicle-mounted rig of six cameras configured as a surround-view system. Each sensor captures high-resolution images with independent auto-exposure (AE) and auto-white-balance (AWB) enabled. The dataset spans diverse driving conditions, including low-speed navigation in crowded urban streets, high-speed cruising on city expressways, and illumination varying from daytime to nighttime. It is therefore characterized by rapid inter-frame brightness shifts, severe lens vignetting, and extreme dynamic-range changes, posing significant challenges for photometric consistency across the field of view.
4.1.2. Evaluation Metrics
- Photometric Metrics: We report PSNR (↑), SSIM (↑), and LPIPS (↓). These metrics measure pixel-wise signal fidelity, structural similarity, and perceptual quality, respectively. All photometric metrics are computed between the final compensated rendering and the ground truth sensor images.
- Geometric Metric (LiDAR-based): To strictly validate the physical correctness of the reconstructed scene, we use LiDAR point clouds as the absolute ground truth and evaluate depth accuracy with the root-mean-square error (RMSE, ↓):

$$\mathrm{RMSE} = \sqrt{\frac{1}{|\mathcal{V}|} \sum_{p \in \mathcal{V}} \left( \hat{D}(p) - D_{\mathrm{gt}}(p) \right)^2},$$

where $\hat{D}(p)$ is the depth rendered from our Gaussian splatting model, $D_{\mathrm{gt}}(p)$ is the sparse but accurate ground-truth depth obtained by projecting accumulated LiDAR points onto the camera image plane, and $\mathcal{V}$ denotes the set of pixels with valid LiDAR readings. This metric explicitly penalizes "floating" artifacts and geometric deformations that do not align with the physical LiDAR measurements.
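Given a rendered depth map, the projected LiDAR depth, and a validity mask, the metric reduces to a simple masked reduction (a sketch; the array layout and function name are assumptions):

```python
import numpy as np

def depth_rmse(d_pred: np.ndarray, d_lidar: np.ndarray,
               valid: np.ndarray) -> float:
    """Depth RMSE over pixels with valid LiDAR returns.
    d_pred : (H, W) depth rendered from the Gaussian model
    d_lidar: (H, W) sparse ground-truth depth from projected LiDAR points
    valid  : (H, W) boolean mask of pixels with LiDAR readings"""
    err = d_pred[valid] - d_lidar[valid]
    return float(np.sqrt(np.mean(err ** 2)))
```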
4.1.3. Baselines
- 3DGS [2]: The vanilla 3D Gaussian Splatting baseline.
- Street Gaussians [22]: A representative dynamic urban scene reconstruction method based on 3DGS that models dynamic objects but lacks an explicit exposure-handling mechanism.
- OmniRe [1]: A recent state-of-the-art framework that constructs hierarchical scene representations to unify static backgrounds and dynamic entities, serving as a strong baseline for holistic urban scene reconstruction.
4.1.4. Implementation Details
4.2. Comparative Analysis
4.2.1. Quantitative Evaluation on Public Benchmark
4.2.2. Quantitative Evaluation on Our Self-Collected "In-the-wild" Dataset
- Photometric Inconsistency: Independent Auto-Exposure (AE) and Auto-White-Balance (AWB) cause drastic brightness shifts.
- LiDAR-Vision FoV Mismatch: Our setup exhibits a significant Field-of-View (FoV) gap between the cameras and the sparse LiDAR.
4.2.3. Qualitative Comparison
4.3. Geometric Consistency and Ablation Study
- Global Affine Only: Adding global compensation yields a significant boost in photometric quality (PSNR: 28.15 → 30.50 dB), confirming that global sensor sensitivity shifts are the primary source of reconstruction error in auto-exposure footage.
- The Overfitting Trap (Grid w/o Gate): When the Multi-Scale Bilateral Grid is introduced without gating, the model achieves the highest photometric scores (PSNR peaks at 33.10 dB). However, this comes at the cost of geometric integrity: the Depth RMSE degrades significantly to 2.95m. This quantitative evidence, validated against LiDAR ground truth, supports our hypothesis that a powerful appearance model, if left unchecked, will "overfit" to photometric inconsistencies by deforming the scene geometry (creating artifacts to minimize RGB loss).
- Efficacy of SSIM Gating (Full Method): By enabling SSIM-Gated Optimization, we successfully resolve this ambiguity. Although there is a negligible drop in PSNR (-0.32 dB compared to the non-gated version), the geometric accuracy improves drastically (RMSE drops from 2.95m to 1.89m). This result quantitatively demonstrates that our gating strategy effectively prioritizes correct physical structure over pixel-perfect photometric overfitting, achieving the best balance between rendering quality and geometric fidelity.
5. Discussion
5.1. Resolving the Texture-Geometry Ambiguity
5.2. Bridging the Gap to Production Data
5.3. Limitations and Future Directions
6. Conclusion
Author Contributions
Funding
Data Availability Statement
Acknowledgments
Conflicts of Interest
References
- Chen, Z.; Yang, J.; Huang, J.; de Lutio, R.; Martinez Esturo, J.; Ivanovic, B.; Litany, O.; Gojcic, Z.; Fidler, S.; Pavone, M.; et al. OmniRe: Omni Urban Scene Reconstruction. arXiv 2024, arXiv:2408.16760. [CrossRef]
- Kerbl, B.; Kopanas, G.; Leimkühler, T.; Drettakis, G. 3D Gaussian Splatting for Real-time Radiance Field Rendering. ACM Transactions on Graphics 2023, 42(4), 1–14. [Google Scholar] [CrossRef]
- Mildenhall, B.; Srinivasan, P. P.; Tancik, M.; Barron, J. T.; Ramamoorthi, R.; Ng, R. NeRF: Representing scenes as neural radiance fields for view synthesis. Communications of the ACM 2021, 65(1), 99–106. [Google Scholar] [CrossRef]
- Zhang, S.; Ye, B.; Chen, X.; Chen, Y.; Zhang, Z.; Peng, C.; Shi, Y.; Zhao, H. Drone-assisted Road Gaussian Splatting with Cross-view Uncertainty. In Proceedings of the British Machine Vision Conference (BMVC), 2024. [Google Scholar]
- Kulhanek, J.; Peng, S.; Kukelova, Z.; Pollefeys, M.; Sattler, T. Wildgaussians: 3d gaussian splatting in the wild. arXiv 2024. arXiv:2407.08447. [CrossRef]
- Ye, S.; Dong, Z.-H.; Hu, Y.; Wen, Y.-H.; Liu, Y.-J. Gaussian in the Dark: Real-Time View Synthesis From Inconsistent Dark Images Using Gaussian Splatting. In Proceedings of Pacific Graphics 2024 (PG 2024), 2024. [Google Scholar]
- Chen, X.; Xiong, Z.; Chen, Y.; Li, G.; Wang, N.; Luo, H.; Chen, L.; Sun, H.; Wang, B.; Chen, G.; Ye, H.; Li, H.; Zhang, Y.-Q.; Zhao, H. DGGT: Feedforward 4D Reconstruction of Dynamic Driving Scenes using Unposed Images. arXiv, arXiv:2512.03004. [CrossRef]
- Tonderski, A.; Lindström, C.; Hess, G.; Ljungbergh, W.; Svensson, L.; Petersson, C. Neurad: Neural rendering for autonomous driving. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2024. [Google Scholar]
- Wang, Y.; Wang, C.; Gong, B.; Xue, T. Bilateral guided radiance field processing. ACM Transactions on Graphics (TOG) 2024, 43(4), 1–13. [Google Scholar] [CrossRef]
- Wang, P.; Liu, L.; Liu, Y.; Theobalt, C.; Komura, T.; Wang, W. NeuS: Learning neural implicit surfaces by volume rendering for multi-view reconstruction. arXiv 2021, arXiv:2106.10689.
- Martin-Brualla, R.; Radwan, N.; Sajjadi, M. S. M.; Barron, J. T.; Dosovitskiy, A.; Duckworth, D. NeRF in the Wild: Neural radiance fields for unconstrained photo collections. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2021; pp. 2–8. [Google Scholar]
- Tancik, M.; Weber, E.; Ng, E.; Li, R.; Yi, B.; Wang, T.; Kristoffersen, A.; Austin, J.; Salahi, K.; Ahuja, A.; et al. Nerfstudio: A modular framework for neural radiance field development. ACM Transactions on Graphics (TOG), 2023. [Google Scholar]
- Huang, N.; Wei, X.; Zheng, W.; An, P.; Lu, M.; Zhan, W.; Tomizuka, M.; Keutzer, K.; Zhang, S. S3Gaussian: Self-Supervised Street Gaussians for Autonomous Driving. arXiv 2024. arXiv:2405.20323.
- Fischer, T.; Kulhanek, J.; Rota Bulò, S.; Porzi, L.; Pollefeys, M.; Kontschieder, P. Dynamic 3D Gaussian fields for urban areas. arXiv 2024. [Google Scholar] [CrossRef]
- Fridovich-Keil, S.; Meanti, G.; Warburg, F. R.; Recht, B.; Kanazawa, A. K-Planes: Explicit radiance fields in space, time, and appearance. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2023. [Google Scholar]
- Zhang, D.; Wang, C.; Wang, W.; Li, P.; Qin, M.; Wang, H. Gaussian in the wild: 3d gaussian splatting for unconstrained image collections. In Proceedings of the European Conference on Computer Vision (ECCV), 2024; pp. 341–359. [Google Scholar]
- Wu, Z.; Liu, T.; Luo, L.; Zhong, Z.; Chen, J.; Xiao, H.; Hou, C.; Lou, H.; Chen, Y.; Yang, R.; et al. Mars: An instance-aware, modular and realistic simulator for autonomous driving. In CAAI International Conference on Artificial Intelligence; Springer, 2023; pp. 3–15. [Google Scholar]
- Yuan, S.; Zhao, H. SlimmeRF: Slimmable Radiance Fields. In Proceedings of the 2024 International Conference on 3D Vision (3DV), 2024; pp. 64–74. [Google Scholar]
- Liu, H.; Jiang, P.; Huang, J.; Lu, M. Lumos3D: A Single-Forward Framework for Low-Light 3D Scene Restoration. arXiv, arXiv:2511.09818.
- Wang, N.; Chen, Y.; Xiao, L.; Xiao, W.; Li, B.; Chen, Z.; Ye, C.; Xu, S.; Zhang, S.; Yan, Z.; Merriaux, P.; Lei, L.; Xue, T.; Zhao, H. Unifying Appearance Codes and Bilateral Grids for Driving Scene Gaussian Splatting. arXiv, arXiv:2506.05280. [CrossRef]
- Afifi, M.; Zhao, L.; Punnappurath, A.; Abdelsalam, M. A.; Zhang, R.; Brown, M. S. Time-Aware Auto White Balance in Mobile Photography. In Proceedings of the IEEE International Conference on Computer Vision (ICCV), Honolulu, HI, USA, 19–23 October 2025; pp. 64–74.
- Yan, Y.; Lin, H.; Zhou, C.; Wang, W.; Sun, H.; Zhan, K.; Lang, X.; Zhou, X.; Peng, S. Street Gaussians: Modeling dynamic urban scenes with gaussian splatting. In Proceedings of the European Conference on Computer Vision, 2024; Springer; pp. 156–173. [Google Scholar]
- Zhou, X.; Lin, Z.; Shan, X.; Wang, Y.; Sun, D.; Yang, M.-H. DrivingGaussian: Composite Gaussian splatting for surrounding dynamic autonomous driving scenes. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024; pp. 21634–21643. [Google Scholar]
- He, L.; Li, L.; Sun, W.; Han, Z.; Liu, Y.; Zheng, S.; Wang, J.; Li, K. Neural Radiance Field in Autonomous Driving: A Survey. arXiv 2024. arXiv:2404.13816. [CrossRef]
- Du, Y.; Zhang, Y.; Yu, H.-X.; Tenenbaum, J. B.; Wu, J. Neural radiance flow for 4D view synthesis and video processing. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), 2021. [Google Scholar]
- Dahmani, H.; Bennehar, M.; Piasco, N.; Roldao, L.; Tsishkou, D. SWAG: Splatting in the wild images with appearance-conditioned gaussians. arXiv 2024, arXiv:2403.10427.
- Sun, P.; Kretzschmar, H.; Dotiwalla, X.; Chouard, A.; Patnaik, V.; Tsui, P.; Guo, J.; Zhou, Y.; Chai, Y.; Caine, B.; et al. Scalability in perception for autonomous driving: Waymo open dataset. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2020; pp. 2446–2454. [Google Scholar]
- Wilson, B.; Qi, W.; Agarwal, T.; Lambert, J.; Singh, J.; Khandelwal, S.; Pan, B.; Kumar, R.; Hartnett, A.; Pontes, J. K.; et al. Argoverse 2: Next generation datasets for self-driving perception and forecasting. arXiv 2023. arXiv:2301.00493. [CrossRef]
- Xu, B.; Xu, Y.; Yang, X.; Jia, W.; Guo, Y. Bilateral grid learning for stereo matching networks. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2021; pp. 12497–12506. [Google Scholar]
- Liu, M.; Liu, J.; Zhang, Y.; Li, J.; Yang, M. Y.; Nex, F.; Cheng, H. 4DSTR: Advancing Generative 4D Gaussians with Spatial-Temporal Rectification for High-Quality and Consistent 4D Generation. In Proceedings of the Association for the Advancement of Artificial Intelligence (AAAI), 2025. [Google Scholar]
- Guédon, A.; Lepetit, V. SuGaR: Surface-aligned gaussian splatting for efficient 3d mesh reconstruction and high-quality mesh rendering. arXiv 2023, arXiv:2311.12775.
- Jiang, Y.; Tu, J.; Liu, Y.; Gao, X.; Long, X.; Wang, W.; Ma, Y. GaussianShader: 3D Gaussian Splatting with Shading Functions for Reflective Surfaces. arXiv 2023. arXiv:2311.17977. [CrossRef]
- Fu, C.; Chen, G.; Zhang, Y.; Yao, K.; Xiong, Y.; Huang, C.; Cui, S.; Matsushita, Y.; Cao, X. RobustSplat++: Decoupling Densification, Dynamics, and Illumination for In-the-Wild 3DGS. arXiv, arXiv:2512.04815.
- Huang, B.; Yu, Z.; Chen, A.; Geiger, A.; Gao, S. 2D Gaussian Splatting for Geometrically Accurate Radiance Fields. ACM SIGGRAPH 2024 Conference Papers, 2024; pp. 1–11. [Google Scholar]
- Huang, X.; Li, J.; Wu, T.; Zhou, X.; Han, Z.; Gao, F. Flying in Clutter on Monocular RGB by Learning in 3D Radiance Fields with Domain Adaptation. In Proceedings of the 2025 IEEE International Conference on Robotics and Automation (ICRA), 2025. [Google Scholar]
- Wang, J.; Che, H.; Chen, Y.; Yang, Z.; Goli, L.; Manivasagam, S.; Urtasun, R. Flux4D: Flow-based Unsupervised 4D Reconstruction. arXiv, arXiv:2512.03210.
- Huang, Y.; Bai, L.; Cui, B.; Li, Y.; Chen, T.; Wang, J.; Wu, J.; Lei, Z.; Liu, H.; Ren, H. Endo-4DGX: Robust Endoscopic Scene Reconstruction and Illumination Correction with Gaussian Splatting. In Proceedings of the 2025 International Conference on Medical Image Computing and Computer Assisted Intervention (MICCAI), 2025. [Google Scholar]
- Wei, X.; Ye, Z.; Gu, Y.; Zhu, Z.; Guo, Y.; Shen, Y.; Zhao, S.; Lu, M.; Sun, H.; Wang, B.; Chen, G.; Lu, R.; Ye, H. ParkGaussian: Surround-view 3D Gaussian Splatting for Autonomous Parking. arXiv 2026. arXiv:2601.01386.



Figure (qualitative comparison): (a) Ground Truth; (b) OmniRe (severe floating artifacts); (c) Lumina-4DGS (Ours).
| Method | PSNR ↑ | SSIM ↑ | LPIPS ↓ | Depth RMSE (m) ↓ |
|---|---|---|---|---|
| 3DGS [2] | 26.00 | 0.918 | 0.117 | 2.80 |
| Street Gaussians [22] | 29.08 | 0.936 | 0.125 | 2.20 |
| OmniRe [45] | 34.61 | 0.938 | 0.079 | 2.05 |
| Lumina-4DGS (Ours) | 35.12 | 0.956 | 0.072 | 1.89 |
| Method | PSNR ↑ | SSIM ↑ | LPIPS ↓ |
|---|---|---|---|
| 3DGS [2] | 23.15 | 0.765 | 0.385 |
| Street Gaussians [22] | 23.82 | 0.772 | 0.368 |
| OmniRe [45] | 24.90 | 0.796 | 0.344 |
| Lumina-4DGS (Ours) | 27.23 | 0.811 | 0.112 |
| Global | Grid | Gate | PSNR ↑ | SSIM ↑ | Depth RMSE (m) ↓ |
|---|---|---|---|---|---|
| – | – | – | 28.15 | 0.852 | 2.80 |
| ✓ | – | – | 30.50 | 0.880 | 2.75 |
| ✓ | ✓ | – | 33.10 | 0.920 | 2.95 (degraded) |
| ✓ | ✓ | ✓ | 32.78 | 0.915 | 1.89 (restored) |
© 2026 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).