Submitted: 27 May 2026
Posted: 27 May 2026
Abstract
Keywords:
1. Introduction
1. We design and implement a hybrid pipeline that uses YOLOv8 to detect 18 distinct furniture and architectural classes, and a ResNet34-backed U-Net to semantically segment walls and room regions.
2. We propose an efficient data augmentation and labeling workflow. By applying geometric transformations (horizontal flipping and 90-degree rotations) to both the raw images and their corresponding label coordinates (bounding boxes and polygons), we expand a manually annotated dataset of 101 images into 303 images, reducing manual labeling effort and helping to mitigate overfitting.
3. We develop a practical area calculation algorithm that converts pixel-level segmentation masks into real-world physical dimensions through interactive scale calibration.
4. We integrate a local LLM (via Ollama) with a vector database in a RAG pipeline that ingests the structured output of the vision models and generates professional, regulation-compliant architectural design guidance.
5. We deploy the integrated models in an interactive, Streamlit-based web application, where end users can upload images, visualize analysis results, and download statistical reports alongside the LLM-generated design guidance.
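
The label-side augmentation in the second contribution can be sketched as follows. This is a minimal illustration, not the authors' script: it assumes YOLO-format boxes with normalized coordinates, a clockwise rotation direction, and one flipped plus one rotated copy per image (which is consistent with 101 images becoming 303, although the source does not state the exact combination). All function names are hypothetical.

```python
from typing import List, Tuple

# (class_id, x_center, y_center, width, height), all normalized to [0, 1]
Box = Tuple[int, float, float, float, float]


def parse_yolo_line(line: str) -> Box:
    """Parse one line of a YOLO label file."""
    cls, x, y, w, h = line.split()
    return (int(cls), float(x), float(y), float(w), float(h))


def format_yolo_line(box: Box) -> str:
    """Serialize a box back to YOLO text format."""
    cls, x, y, w, h = box
    return f"{cls} {x:.6f} {y:.6f} {w:.6f} {h:.6f}"


def hflip_box(box: Box) -> Box:
    """Horizontal flip: mirror the x center, keep everything else."""
    cls, x, y, w, h = box
    return (cls, 1.0 - x, y, w, h)


def rot90_cw_box(box: Box) -> Box:
    """90-degree clockwise rotation (assumed direction).

    A point (x, y) maps to (1 - y, x); width and height swap.
    """
    cls, x, y, w, h = box
    return (cls, 1.0 - y, x, h, w)


def rot90_cw_point(point: Tuple[float, float]) -> Tuple[float, float]:
    """Same mapping for a normalized polygon vertex."""
    x, y = point
    return (1.0 - y, x)


def augment_labels(boxes: List[Box]) -> List[List[Box]]:
    """Return label sets for the original, flipped and rotated image."""
    return [
        list(boxes),
        [hflip_box(b) for b in boxes],
        [rot90_cw_box(b) for b in boxes],
    ]
```

The same transformation has to be applied to the image itself, otherwise the labels no longer line up with the pixels. Polygon vertices stored in pixel units would need to be normalized by image width and height before `rot90_cw_point` is applied.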
2. Related Work
2.1. Traditional Floor Plan Analysis
2.2. Deep Learning for Architectural Symbol Detection
2.3. Semantic Segmentation of Floor Plans
2.4. Large Language Models in Architectural Design
3. Methodology
3.1. Dataset Preparation and Augmentation Strategy
- Detection Labels: We used LabelImg to manually draw bounding boxes for 18 distinct classes (e.g., door, window, bed, dining table, sofa, TV, cupboard, toilet, washbasin, washing machine, air condition). The boxes were saved in the standard YOLO text format.
- Segmentation Labels: We used LabelMe to draw polygons outlining two primary classes: wall and room. These polygon annotations (saved as JSON files) were then converted into categorical 2D segmentation masks using a custom Python script.
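
The polygon-to-mask conversion described above might look like the sketch below. The custom script itself is not shown in the source, so this is only an assumed reconstruction: the class IDs (0 for background, 1 for room, 2 for wall) and the rule that walls overwrite rooms where they overlap are hypothetical choices. The JSON keys (`shapes`, `label`, `points`, `shape_type`, `imageHeight`, `imageWidth`) are those written by LabelMe. The rasterizer is plain Python for self-containment; a real pipeline would more likely use an image library.

```python
import json
from typing import List, Sequence

# Hypothetical class IDs; 0 is reserved for background.
CLASS_IDS = {"room": 1, "wall": 2}


def load_labelme(path: str) -> dict:
    """Read one LabelMe annotation file."""
    with open(path, "r", encoding="utf-8") as handle:
        return json.load(handle)


def point_in_polygon(x: float, y: float, poly: Sequence[Sequence[float]]) -> bool:
    """Even-odd ray casting test for a single point."""
    inside = False
    n = len(poly)
    for i in range(n):
        x1, y1 = poly[i]
        x2, y2 = poly[(i + 1) % n]
        # Only edges that straddle the horizontal line through y can cross the ray.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def labelme_to_mask(annotation: dict) -> List[List[int]]:
    """Rasterize polygons into a categorical mask (list of rows)."""
    height = annotation["imageHeight"]
    width = annotation["imageWidth"]
    mask = [[0] * width for _ in range(height)]
    # Draw lower IDs first so that walls (2) overwrite rooms (1) on overlap.
    shapes = sorted(annotation["shapes"], key=lambda s: CLASS_IDS.get(s["label"], 0))
    for shape in shapes:
        class_id = CLASS_IDS.get(shape["label"])
        if class_id is None or shape.get("shape_type", "polygon") != "polygon":
            continue
        poly = shape["points"]
        xs = [p[0] for p in poly]
        ys = [p[1] for p in poly]
        # Restrict the scan to the polygon's bounding box.
        for row in range(max(0, int(min(ys))), min(height, int(max(ys)) + 1)):
            for col in range(max(0, int(min(xs))), min(width, int(max(xs)) + 1)):
                if point_in_polygon(col + 0.5, row + 0.5, poly):
                    mask[row][col] = class_id
    return mask
```

Each pixel is tested at its center (`col + 0.5`, `row + 0.5`), so a pixel belongs to a polygon only if the polygon covers that center point.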

3.2. Furniture and Fixture Detection Module
3.3. Room and Wall Segmentation Module
3.4. Real-World Area Calculation Algorithm
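
As a rough illustration of how pixel-level masks can be tied to physical dimensions through scale calibration, the sketch below assumes the user marks two points on the plan and supplies the real distance between them in metres. The function names, the calibration procedure, and the class ID for rooms are hypothetical and not taken from the source.

```python
import math
from typing import List, Tuple


def metres_per_pixel(p1: Tuple[float, float],
                     p2: Tuple[float, float],
                     real_length_m: float) -> float:
    """Derive the scale from two clicked points and their known real distance."""
    pixel_distance = math.dist(p1, p2)
    if pixel_distance == 0:
        raise ValueError("Calibration points must be distinct.")
    return real_length_m / pixel_distance


def class_area_m2(mask: List[List[int]], class_id: int, m_per_px: float) -> float:
    """Total area of one class: pixel count times the area of a single pixel."""
    pixel_count = sum(row.count(class_id) for row in mask)
    return pixel_count * m_per_px ** 2


# Example: a 5 m reference wall spans 250 px, so each pixel is 0.02 m wide.
scale = metres_per_pixel((0, 0), (250, 0), 5.0)
room_mask = [[1] * 100 for _ in range(50)]  # 5000 room pixels (class ID 1 assumed)
room_area = class_area_m2(room_mask, class_id=1, m_per_px=scale)  # about 2.0 m^2
```

Because area scales with the square of the linear scale, a small calibration error is amplified: a 5% error in metres per pixel gives roughly a 10% error in area. This sketch sums all pixels of a class; reporting areas per individual room would additionally require separating the mask into connected regions.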
3.5. LLM-Integrated Design Guidance and RAG Pipeline
3.5.1. Local Deployment via Ollama
3.5.2. Retrieval-Augmented Generation (RAG)
3.5.3. Model Fine-Tuning for Domain Expertise
4. Experiments and Results
4.1. Experimental Setup
4.2. Object Detection Performance
4.3. Semantic Segmentation Performance


4.4. Evaluation of LLM Design Guidance
4.5. System Integration and Qualitative Evaluation



5. Discussion
6. Conclusion and Future Work

| Class | Precision | Recall | mAP50 (%) |
|---|---|---|---|
| Door | 0.970 | 0.734 | 78.8 |
| Window | 0.757 | 0.642 | 72.1 |
| Table | 0.834 | 0.911 | 93.0 |
| Chair | 0.982 | 0.974 | 98.5 |
| Bed | 0.979 | 1.000 | 99.4 |
| Sofa | 0.915 | 0.966 | 94.9 |
| Toilet | 0.978 | 0.936 | 97.9 |
| Sink | 0.917 | 0.932 | 94.9 |
| Bathtub | 0.988 | 1.000 | 99.5 |
| Stove | 0.947 | 0.913 | 97.0 |
| Refrigerator | 0.948 | 0.946 | 95.8 |
| Wardrobe | 0.940 | 0.999 | 98.3 |
| TV | 0.887 | 0.362 | 57.4 |
| Desk | 0.898 | 0.938 | 97.1 |
| Washing Machine | 0.891 | 0.909 | 94.9 |
| Load-bearing Wall | 0.940 | 0.970 | 97.2 |
| Air Condition | 0.975 | 1.000 | 99.4 |
| Cupboard | 0.910 | 0.870 | 94.5 |
| Overall | 0.925 | 0.889 | 92.3 |
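
The Overall row appears to be the unweighted (macro) mean of the 18 class rows. The sketch below recomputes it from the values in the table and also derives a per-class F1 score, which the table does not report. The F1 values are an added illustration, not results from the source.

```python
# (precision, recall, mAP50 in %) per class, copied from the table above
RESULTS = {
    "Door": (0.970, 0.734, 78.8),
    "Window": (0.757, 0.642, 72.1),
    "Table": (0.834, 0.911, 93.0),
    "Chair": (0.982, 0.974, 98.5),
    "Bed": (0.979, 1.000, 99.4),
    "Sofa": (0.915, 0.966, 94.9),
    "Toilet": (0.978, 0.936, 97.9),
    "Sink": (0.917, 0.932, 94.9),
    "Bathtub": (0.988, 1.000, 99.5),
    "Stove": (0.947, 0.913, 97.0),
    "Refrigerator": (0.948, 0.946, 95.8),
    "Wardrobe": (0.940, 0.999, 98.3),
    "TV": (0.887, 0.362, 57.4),
    "Desk": (0.898, 0.938, 97.1),
    "Washing Machine": (0.891, 0.909, 94.9),
    "Load-bearing Wall": (0.940, 0.970, 97.2),
    "Air Condition": (0.975, 1.000, 99.4),
    "Cupboard": (0.910, 0.870, 94.5),
}


def macro_mean(index: int) -> float:
    """Unweighted mean of one metric column across all classes."""
    return sum(row[index] for row in RESULTS.values()) / len(RESULTS)


def f1(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


mean_precision = macro_mean(0)
mean_recall = macro_mean(1)
mean_map50 = macro_mean(2)

f1_scores = {name: f1(p, r) for name, (p, r, _) in RESULTS.items()}
weakest_class = min(f1_scores, key=f1_scores.get)

print(f"macro precision={mean_precision:.3f} recall={mean_recall:.3f} mAP50={mean_map50:.1f}")
print(f"lowest F1: {weakest_class} ({f1_scores[weakest_class]:.3f})")
```

The recomputed means match the reported Overall row to the precision shown (0.925, 0.889, 92.3). The lowest F1 belongs to TV, where high precision (0.887) is paired with low recall (0.362), meaning most TV symbols are missed rather than misclassified.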
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2026 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license.