Submitted: 19 February 2025
Posted: 20 February 2025
Abstract
Keywords:
1. Introduction
Application of Congestion Prediction for Traffic Management
- Adaptive Traffic Signal Control: Predictions can help adjust traffic light timings dynamically to alleviate congestion, reducing unnecessary stops and improving travel time efficiency.
- Traffic Rerouting Strategies: Identifying congestion hotspots in advance allows for better traffic diversion plans, guiding vehicles towards less congested routes.
- Resource Allocation: Forecasting congestion patterns enables better deployment of traffic enforcement units, public transit prioritization, and road maintenance scheduling.
- Pollution Reduction: By reducing congestion, emissions from idling vehicles (CO, NO2, PM) can be minimized, contributing to improved urban air quality.
- Long-Term Infrastructure Planning: Insights from congestion predictions can guide urban planning decisions, such as road expansion, traffic signal placement, or alternative transport infrastructure investments.
Novelty and Contributions of the Research:
2. Literature Review
Analysis of Comparative Studies:
3. Dataset Description
3.1. Origin of the Dataset
3.2. Data Description
- Rolling_Avg_Congestion: The average congestion level over the last three time intervals;
- Previous_Congestion: The congestion level observed in the previous time step;
- Direction: The direction taken by the vehicle (14 possible directions, numbered 1 to 14);
- Type: The type of vehicle (Normal, Bus, Tram, Bike);
- Second: The second of the minute when the observation was recorded;
- Minute: The minute of the hour when the observation was recorded;
- Hour: The hour of the day;
- Weekday: The day of the week;
- Day: The day of the month;
- Month: The month of the year.
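
The two history-based features above can be sketched in a few lines. This is an illustrative sketch, not the authors' code: it assumes congestion levels are already encoded numerically (for example 0 = Low, 1 = Medium, 2 = High) and that the rolling average covers the three intervals before the current one, since the text does not say whether the current interval is included. The function name `lag_features` is hypothetical.

```python
from collections import deque
from typing import List, Optional, Sequence, Tuple


def lag_features(
    levels: Sequence[float], window: int = 3
) -> List[Tuple[Optional[float], Optional[float]]]:
    """Return (Previous_Congestion, Rolling_Avg_Congestion) for each time step.

    Assumption: both features use past observations only, so the value at
    step t never includes the congestion level observed at step t itself.
    """
    history: deque = deque(maxlen=window)  # keeps only the last `window` levels
    rows = []
    for level in levels:
        previous = history[-1] if history else None
        # The rolling average is undefined until a full window is available.
        rolling = sum(history) / window if len(history) == window else None
        rows.append((previous, rolling))
        history.append(level)
    return rows
```

The first `window` rows have no rolling average; how such gaps are filled (or dropped) is a separate choice that this sketch leaves open.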
4. Methodology
4.1. Development Environment
4.2. Data Preprocessing
4.2.1. Temporal Data Consolidation
4.2.2. Weather-Related Attributes Conversion
4.2.3. Direction Encoding
4.2.4. Conversion of Wind Directions to Degrees
4.2.5. Encoding Categorical Variables
- Normal vehicles were assigned a value of 0;
- Buses were encoded as 1, trams as 2, and bikes as 3.
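
The vehicle-type mapping above could be expressed as a simple lookup. This is a minimal sketch; the names `VEHICLE_TYPE_CODES` and `encode_vehicle_type` are hypothetical, and the handling of whitespace and letter case is an assumption rather than something the text specifies.

```python
# Integer codes as listed above: Normal = 0, Bus = 1, Tram = 2, Bike = 3.
VEHICLE_TYPE_CODES = {"Normal": 0, "Bus": 1, "Tram": 2, "Bike": 3}


def encode_vehicle_type(label: str) -> int:
    """Map a vehicle-type label to its integer code.

    Assumption: labels may carry stray whitespace or inconsistent case,
    so they are normalized before the lookup.
    """
    key = label.strip().title()
    if key not in VEHICLE_TYPE_CODES:
        raise ValueError(f"unknown vehicle type: {label!r}")
    return VEHICLE_TYPE_CODES[key]
```

Raising on unknown labels, rather than silently assigning a default code, makes unexpected categories visible during preprocessing.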
4.3. Handling Missing Values
4.4. Feature Engineering
- Weekday: Day of the week, represented numerically (0–6);
- Day: Day of the month;
- Month: Month of the year;
- Year: Year of the observation;
- IsWeekend: Binary indicator for weekends, where 1 represents a weekend, and 0 represents a weekday;
- Hour, Minute, and Second: Extracted to provide finer temporal granularity.
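
The temporal features listed above can be derived from a single timestamp. The sketch below uses only the Python standard library and makes two assumptions that the text does not confirm: weekdays are numbered with Monday = 0 (the Python convention), and Day is the day of the month, as described in Section 3.2. The function name `temporal_features` is hypothetical.

```python
from datetime import datetime
from typing import Dict


def temporal_features(ts: datetime) -> Dict[str, int]:
    """Derive calendar and clock features from one observation timestamp."""
    weekday = ts.weekday()  # assumed convention: 0 = Monday ... 6 = Sunday
    return {
        "Weekday": weekday,
        "Day": ts.day,  # day of the month, following Section 3.2
        "Month": ts.month,
        "Year": ts.year,
        "IsWeekend": 1 if weekday >= 5 else 0,  # Saturday or Sunday
        "Hour": ts.hour,
        "Minute": ts.minute,
        "Second": ts.second,
    }
```

For example, a hypothetical observation at 08:30:15 on Saturday, 6 May 2023 yields `Weekday = 5` and `IsWeekend = 1`.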
4.5. Assigning Congestion Severity
4.5.1. Estimating Lane Capacity
4.5.2. Computing the Volume-to-Capacity Ratio
4.5.3. Congestion Classification
- Low Congestion:
- Medium Congestion:
- High Congestion:
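
The labeling step can be sketched as a ratio followed by two cut-offs. The threshold values for the three classes are not given above, so the sketch takes them as required arguments instead of guessing them; the function names, and the choice to treat each lower bound as inclusive, are assumptions.

```python
def vc_ratio(volume: float, lane_capacity: float) -> float:
    """Volume-to-capacity ratio for one observation."""
    if lane_capacity <= 0:
        raise ValueError("lane_capacity must be positive")
    return volume / lane_capacity


def classify_congestion(ratio: float, medium_from: float, high_from: float) -> str:
    """Map a V/C ratio to Low, Medium, or High.

    `medium_from` and `high_from` are caller-supplied lower bounds
    (assumed inclusive) for the Medium and High classes.
    """
    if not 0 <= medium_from < high_from:
        raise ValueError("thresholds must satisfy 0 <= medium_from < high_from")
    if ratio >= high_from:
        return "High"
    if ratio >= medium_from:
        return "Medium"
    return "Low"
```

With placeholder thresholds of 0.5 and 1.0 (chosen only for illustration), a ratio of 0.8 would be labeled Medium.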
4.5.4. Implementation and Automation
4.6. Dual Importance Intersection Feature Selection (DIFS)
4.6.1. Principles of the DIFS Approach
- Random Forest: Scores each feature by how much it contributes to reducing uncertainty (impurity) across the decision trees of the ensemble. RF is well suited for capturing nonlinear relationships between features and the target variable.
- Chi-square: Measures the statistical association between categorical features and the target variable, prioritizing features with strong relationships based on statistical significance.
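
The intersection idea can be illustrated without any modeling library, assuming the two score sets have already been computed. The sketch below keeps a feature only if it ranks in the top k under both criteria; the "Top 25" columns of the feature table suggest k = 25 in this study. The function names are hypothetical, and the scores in the usage note are a small subset copied from that table.

```python
from typing import Dict, List, Set


def top_k(scores: Dict[str, float], k: int) -> Set[str]:
    """Names of the k highest-scoring features."""
    ranked = sorted(scores, key=scores.get, reverse=True)
    return set(ranked[:k])


def difs_select(
    rf_importance: Dict[str, float], chi2_score: Dict[str, float], k: int
) -> List[str]:
    """Keep features ranked in the top k by both RF importance and Chi-square.

    The result is ordered by RF importance for readability; that ordering
    is a presentation choice, not part of the selection rule.
    """
    keep = top_k(rf_importance, k) & top_k(chi2_score, k)
    return sorted(keep, key=rf_importance.get, reverse=True)
```

With the table values for Second, Lane_Capacity, Type, and co, calling `difs_select(rf, chi2, 2)` retains only Lane_Capacity, because Second ranks high for RF but low for Chi-square and Type does the opposite.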
4.6.2. Advantages of the DIFS Method
- Robustness: By incorporating statistical relevance (Chi2) and model-based feature importance (RF), DIFS minimizes over-reliance on a single method, enhancing reliability.
- Redundancy Reduction: The intersection approach eliminates redundant or weakly relevant features, improving model efficiency and reducing overfitting.
- Interpretability: The selected features are statistically significant and influential in predictive modeling, enhancing interpretability.
- Computational Efficiency: Unlike wrapper-based methods, DIFS efficiently reduces feature dimensionality without requiring extensive computational resources.
- Adaptability: DIFS can be integrated with other feature selection techniques to optimize performance for different datasets.
4.6.3. Comparative Analysis with Traditional Methods
- Filter methods (e.g., Chi-square, mutual information) score each feature individually by its statistical association with the target variable, potentially overlooking interactions between features.
- Wrapper methods (e.g., RFE, genetic algorithms) iteratively test different feature subsets within a model, improving performance but requiring significant computational resources.
- Embedded methods (e.g., LASSO, RF importance) integrate selection within model training, identifying important predictors efficiently but sometimes overfitting to training data.
4.6.4. Empirical Evaluation and Model Performance Improvements
- Baseline models without feature selection.
- Models using only RF-based selection.
- Models using only Chi-square selection.
- Models using DIFS (RF + Chi-square intersection).
- Higher classification accuracy (QWK = 0.54, F1-score = 0.75 for RF).
- More efficient computation, avoiding the high runtime costs of wrapper-based approaches.
- Lower feature redundancy, improving generalization and reducing noise in the model.
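
Because QWK is less familiar than the F1-score, a compact reference computation may help. The sketch below implements the standard quadratic weighted kappa for ordinal labels; it assumes the three congestion levels are encoded as 0 = Low, 1 = Medium, 2 = High, and it is not taken from the authors' code.

```python
from typing import Sequence


def quadratic_weighted_kappa(
    y_true: Sequence[int], y_pred: Sequence[int], n_classes: int
) -> float:
    """Cohen's kappa with quadratic weights for labels 0..n_classes-1."""
    if n_classes < 2:
        raise ValueError("n_classes must be at least 2")
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and equally long")
    n = len(y_true)

    # Confusion matrix: rows are true labels, columns are predictions.
    observed = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(y_true, y_pred):
        observed[t][p] += 1
    true_hist = [sum(row) for row in observed]
    pred_hist = [sum(row[j] for row in observed) for j in range(n_classes)]

    disagreement = 0.0  # weighted observed disagreement
    chance = 0.0        # weighted disagreement expected by chance
    for i in range(n_classes):
        for j in range(n_classes):
            weight = (i - j) ** 2 / (n_classes - 1) ** 2
            disagreement += weight * observed[i][j]
            chance += weight * true_hist[i] * pred_hist[j] / n
    if chance == 0:
        # All labels and predictions fall in one class: treat as full agreement.
        return 1.0
    return 1.0 - disagreement / chance
```

The quadratic weights penalize a Low/High confusion four times as heavily as a Low/Medium one, which is why QWK suits ordered classes such as congestion levels.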
4.6.5. Conclusion: The Novelty of DIFS
- Combining statistical relevance and model-based importance to enhance feature selection robustness.
- Eliminating redundant features while maintaining interpretability and efficiency.
- Balancing computational efficiency with predictive performance, making it suitable for real-time applications.
- Handling class imbalance more effectively than traditional methods by leveraging RF’s adaptive capabilities.
4.7. Data Balancing Using SMOTE
4.8. Exploratory Data Analysis
4.8.1. Relationship Between Vehicle Count and Congestion Level
- Low and Medium Congestion Levels: For congestion levels categorized as Low and Medium, the number of vehicles is consistently low, predominantly concentrated around 1 or 2. This indicates that minimal vehicular presence corresponds to these lower congestion levels;
- High Congestion Level: For the High congestion level, the distribution of vehicle counts is much broader. The number of vehicles ranges significantly, with the density peak observed around 5. The spread and height of the distribution suggest that a larger number of vehicles is a key indicator of high congestion;
- Distribution Shape: The sharp increase in density for High congestion at lower vehicle counts, coupled with a long tail extending to higher counts, reflects the variability in vehicular presence during high congestion scenarios. This highlights the importance of accounting for such variability in predictive modeling.
4.8.2. Hourly Traffic Patterns
- Morning Traffic (8 AM - 10 AM): The number of congestion incidents steadily increases from 1,731 at 8 AM to 2,005 at 10 AM. This trend indicates increasing traffic flow during the morning rush hours, with a notable proportion of High and Medium severity levels;
- Afternoon Traffic (3 PM - 6 PM): Congestion incidents peak between 4 PM and 5 PM, reaching a maximum of 2,543 at 5 PM. This reflects the typical evening rush hour, where high congestion levels dominate, suggesting significant delays and traffic buildup;
- Severity Proportions: Throughout the day, Low congestion levels (green) form the main proportion of incidents, followed by Medium (yellow) and High (red). However, during peak hours, the proportion of High congestion levels increases significantly, highlighting critical traffic management challenges.
4.8.3. Day of the Week vs. Congestion Level
- Higher Congestion on Weekdays:
  - Congestion incidents are notably higher from Monday to Thursday, with Tuesday and Wednesday showing the peak total congestion levels at 2,508 and 2,491 incidents, respectively;
  - Low Congestion (green) dominates, followed by Medium (yellow) and High (red) congestion levels. The presence of High Congestion highlights the impact of weekday commuting patterns.
- Reduced Congestion on Weekends:
  - A significant reduction in congestion incidents is observed on Saturday and Sunday, with totals dropping to 1,670 and 1,721 incidents, respectively. This aligns with lower traffic volumes typically associated with weekends;
  - Low congestion incidents continue to occur most frequently on these days, while high congestion incidents occur less frequently compared to weekdays.
- Transition on Friday:
  - Friday marks a transition between the high congestion levels of weekdays and the lower congestion of weekends, with 1,973 total incidents. This reflects the changing traffic dynamics as work-related commuting gives way to leisure and weekend activities.
4.8.4. Direction of Vehicles vs. Congestion Level
- Dominant Congestion Directions:
  - Directions 6 and 13 account for the highest number of congestion incidents, with totals of 6,242 and 5,792, respectively. These directions exhibit a substantial proportion of High Congestion (red) and Medium Congestion (yellow), indicating their critical role in overall traffic congestion at the intersection;
  - This dominance suggests that these directions may correspond to main traffic inflow or outflow routes.
- Lower Congestion in Other Directions:
  - Directions 2, 9, 10, 11, and 12 have significantly fewer incidents, with totals ranging between 13 and 207. These directions show predominantly Low Congestion (green), implying less frequent or less severe traffic issues.
- Intermediate Congestion Levels:
  - Directions such as 1, 3, 5, 7, and 8 show moderate numbers of congestion incidents, with a mix of Low and Medium congestion levels. This pattern may reflect secondary traffic routes or turning lanes with moderate traffic density.
- Traffic Dynamics at Intersections:
  - The evident disparity in congestion levels across directions suggests directional bias in traffic flow, likely influenced by factors such as road hierarchy, intersection design, or traffic signal timing.
4.8.5. Monthly Traffic Trends
- Peak Congestion in May and July:
  - May records the highest total number of congestion incidents at 5,828, closely followed by July with 5,242 incidents. These months are dominated by Low Congestion (green), but both show a notable proportion of Medium (yellow) and High Congestion (red) levels. The high traffic volumes during these months may correspond to seasonal patterns or increased travel activity.
- Drop in June and August:
  - June and August exhibit significantly lower congestion levels, with 2,492 and 1,460 total incidents, respectively. These months have smaller proportions of High Congestion incidents, indicating relatively smoother traffic flow during this period.
- Severity Distribution:
  - Across all months, Low Congestion levels form the majority, followed by Medium and High Congestion. However, the share of High Congestion is more prominent in May and July, emphasizing the challenges of managing traffic during these peak months.
- Temporal Variations:
  - The sharp contrast between months with high congestion (May and July) and those with lower congestion (June and August) highlights the importance of incorporating Month as a feature in predictive modeling. Understanding such temporal trends can significantly improve the model’s ability to anticipate congestion levels.
Key Factors Influencing Congestion Level at the Intersection:
- Number of Vehicles: A strong correlation is observed between the number of vehicles and congestion severity, with higher vehicle counts associated with high congestion levels.
- Time of Day: Morning and evening rush hours significantly impact congestion levels, particularly during peak times (8-10 AM and 4-6 PM).
- Day of the Week: Weekdays experience higher congestion levels compared to weekends, driven by weekday commuting patterns.
- Direction of Vehicles: Traffic flow patterns, particularly from dominant directions such as 6 and 13, heavily influence congestion.
- Monthly Variations: Seasonal changes and monthly variations, as observed in May and July, highlight the importance of accounting for temporal trends in predictive modeling.
4.9. Development of the Predictive Model
5. Results and Discussion
5.1. Results
5.2. Discussion
5.2.1. Performance Analysis of Machine Learning Models
5.2.2. Environmental Impact of Traffic Congestion
Air Pollution Trends by Congestion Level:
- NO2 Concentrations: Nitrogen dioxide levels exhibit a direct correlation with congestion severity, with peaks observed on high-traffic weekdays such as Tuesday and Wednesday. NO2, a primary emission from combustion engines, is a major contributor to respiratory diseases and urban smog formation.
- Fine Particulate Matter (PM2.5, PM10): Particulate emissions rise significantly under high congestion conditions, particularly during morning and evening rush hours. Fine particulates pose serious health risks, including cardiovascular and pulmonary diseases.
- Carbon Monoxide (CO) Trends: CO levels increase with congestion, indicating inefficient combustion due to frequent stop-and-go traffic. Elevated CO exposure can lead to reduced oxygen transport in the bloodstream, affecting vulnerable populations such as children and the elderly.
- Ozone (O3) Fluctuations: Although ozone is a secondary pollutant formed through photochemical reactions involving NOx and volatile organic compounds, its levels show a delayed correlation with congestion trends. High congestion periods contribute indirectly to ozone formation, further exacerbating air quality issues.
Implications for Traffic Management and Emission Reduction:
- Improved Air Quality: A reduction in NO2 and particulate emissions directly benefits urban air quality, reducing risks of respiratory illnesses.
- Lower Greenhouse Gas Emissions: Traffic optimization can cut CO2 emissions, contributing to climate change mitigation efforts.
- Public Health Improvements: Reducing exposure to fine particulates and NO2 can lower rates of cardiovascular diseases and improve overall public health outcomes.
Implementation in the Predictive Model:
6. Conclusions and Future Research
- Geographic Scalability: Expanding the framework to diverse urban settings with varying traffic patterns and infrastructure to assess adaptability.
- Advanced Feature Engineering: Exploring additional predictors, such as real-time incident reports, weather conditions, and pedestrian flow, to refine congestion estimation.
- Real-Time Data Integration: Incorporating live feeds from sensors and intelligent transportation systems for dynamic congestion forecasting.
- Decision Support Systems: Developing interactive interfaces that visualize congestion predictions, aiding proactive traffic management.
- Environmental Impact Analysis: Integrating congestion predictions with emission control strategies to assess sustainability and mitigate air pollution.
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
References
- World Health Organization. Ambient (outdoor) air pollution. 2024. Available online: https://www.who.int/news-room/fact-sheets/detail/ambient-(outdoor)-air-quality-and-health (accessed on 10 December 2024).
- Minderytė, A.; Pauraite, J.; Dudoitis, V.; Plauškaitė, K.; Kilikevičius, A.; Matijošius, J.; Rimkus, A.; Kilikevičienė, K.; Vainorius, D.; Byčenkienė, S. Carbonaceous aerosol source apportionment and assessment of transport-related pollution. Atmospheric Environment 2022, 279, 119043.
- Li, J.; Wang, C.; Abdoli, S.; Yuen, A.C.; Kook, S.; Yeoh, G.H.; Chan, Q.N. Economic burden of transport related pollution in Australia. Journal of Transport & Health 2024, 34, 101747.
- Bajwa, A.U.; Sheikh, H.A. Contribution of road transport to Pakistan’s air pollution in the urban environment. Air 2023, 1, 237–257.
- Pietrzak, K.; Pietrzak, O. Environmental effects of electromobility in a sustainable urban public transport. Sustainability 2020, 12, 1052.
- Balta, M.; Özcelik, I. Traffic signaling optimization for intelligent and green transportation in smart cities. In Proceedings of the 2018 3rd International Conference on Computer Science and Engineering (UBMK); IEEE, 2018; pp. 31–35.
- Shahid, N.; Shah, M.A.; Khan, A.; Maple, C.; Jeon, G. Towards greener smart cities and road traffic forecasting using air pollution data. Sustainable Cities and Society 2021, 72, 103062.
- Zhong, H.; Chen, K.; Liu, C.; Zhu, M.; Ke, R. Models for predicting vehicle emissions: A comprehensive review. Science of the Total Environment 2024, 171324.
- Yang, J.; Han, S.; Chen, Y. Prediction of traffic accident severity based on random forest. Journal of Advanced Transportation 2023, 2023, 7641472.
- Zhong, W.; Du, L. Predicting Traffic Casualties Using Support Vector Machines with Heuristic Algorithms: A Study Based on Collision Data of Urban Roads. Sustainability 2023, 15, 2944.
- Nematichari, A.; Pechlivanoglou, T.; Papagelis, M. Evaluating and forecasting the operational performance of road intersections. In Proceedings of the 30th International Conference on Advances in Geographic Information Systems; 2022; pp. 1–12.
- Qin, K.; Xu, Y.; Kang, C.; Kwan, M.P. A graph convolutional network model for evaluating potential congestion spots based on local urban built environments. Transactions in GIS 2020, 24, 1382–1401.
- Olayode, I.O.; Tartibu, L.K.; Alex, F.J. Comparative study analysis of ANFIS and ANFIS-GA models on flow of vehicles at road Intersections. Applied Sciences 2023, 13, 744.
- Moumen, I.; Mahdaoui, R.; Raji, F.Z.; Rafalia, N.; Abouchabaka, J. Distributed Multi-Intersection Traffic Flow Prediction using Deep Learning. In Proceedings of the E3S Web of Conferences; EDP Sciences, 2024; Vol. 477, p. 00049.
- Katambire, V.N.; Musabe, R.; Uwitonze, A.; Mukanyiligira, D. Forecasting the Traffic Flow by Using ARIMA and LSTM Models: Case of Muhima Junction. Forecasting 2023, 5, 616–628.
- Mirzahossein, H.; Gholampour, I.; Sajadi, S.R.; Zamani, A.H. A hybrid deep and machine learning model for short-term traffic volume forecasting of adjacent intersections. IET Intelligent Transport Systems 2022, 16, 1648–1663.
- Chahal, A.; Gulia, P.; Gill, N.S.; Priyadarshini, I. A hybrid univariate traffic congestion prediction model for IOT-enabled smart city. Information 2023, 14, 268.
- Chaoura, C.; Lazar, H.; Jarir, Z. Traffic Flow Prediction at Intersections: Enhancing with a Hybrid LSTM-PSO Approach. International Journal of Advanced Computer Science & Applications 2024, 15.
- Wang, J.; Duan, X.; Wang, P.; Qiu, A.G.; Chen, Z. Predicting urban signal-controlled intersection congestion events using spatio-temporal neural point process. International Journal of Digital Earth 2024, 17, 2376270.
- Gwalani, A.; Pai, A.; Padalia, A.; Bhavathankar, P.; Devadkar, K. Prediction and Management of Traffic Congestion in Urban Environments. In Proceedings of the 2024 15th International Conference on Computing Communication and Networking Technologies (ICCCNT); IEEE, 2024; pp. 1–6.
- AlKheder, S.; Alkhamees, W.; Almutairi, R.; Alkhedher, M. Bayesian combined neural network for traffic volume short-term forecasting at adjacent intersections. Neural Computing and Applications 2021, 33, 1785–1836.
- Navarro-Espinoza, A.; López-Bonilla, O.R.; García-Guerrero, E.E.; Tlelo-Cuautle, E.; López-Mancilla, D.; Hernández-Mejía, C.; Inzunza-González, E. Traffic flow prediction for smart traffic lights using machine learning algorithms. Technologies 2022, 10, 5.
- Giraka, O.; Selvaraj, V.K. Short-term prediction of intersection turning volume using seasonal ARIMA model. Transportation Letters 2020, 12, 483–490.
- Qu, W.; Li, J.; Yang, L.; Li, D.; Liu, S.; Zhao, Q.; Qi, Y. Short-term intersection traffic flow forecasting. Sustainability 2020, 12, 8158.
- Tsalikidis, N.; Mystakidis, A.; Koukaras, P.; Ivaškevičius, M.; Morkūnaitė, L.; Ioannidis, D.; Fokaides, P.A.; Tjortjis, C.; Tzovaras, D. Urban traffic congestion prediction: a multi-step approach utilizing sensor data and weather information. Smart Cities 2024, 7, 233–253.
- Tran, Q.H.; Fang, Y.M.; Chou, T.Y.; Hoang, T.V.; Wang, C.T.; Vu, V.T.; Ho, T.L.H.; Le, Q.; Chen, M.H. Short-term traffic speed forecasting model for a parallel multi-lane arterial road using GPS-monitored data based on deep learning approach. Sustainability 2022, 14, 6351.
- Tang, B.; Hu, Y. Frequent congestion detection model based on critical intersection identification. Transportation Research Record 2023, 2677, 371–385.
- Karunathilake, T.; Zongo, M.; Amarawardana, D.; Förster, A. CN+: Vehicular Dataset at Traffic Light Regulated Intersection in Bremen, Germany. Zenodo, 2023 (accessed on 19 November 2024).
- Karunathilake, T.; Zongo, M.; Amarawardana, D.; Förster, A. CN+: Vehicular Dataset at Traffic Light Regulated Intersection in Bremen, Germany. Scientific Data 2024, 11, 665.
- Litman, T. Factors to Consider When Estimating Congestion Costs and Evaluating Potential Congestion Reduction Strategies. Victoria Transport Policy Institute: Victoria, Canada, 2013.
- Chawla, N.V.; Bowyer, K.W.; Hall, L.O.; Kegelmeyer, W.P. SMOTE: synthetic minority over-sampling technique. Journal of Artificial Intelligence Research 2002, 16, 321–357.
- Hassan, M.A.; Salem, H.; Bailek, N.; Kisi, O. Random forest ensemble-based predictions of on-road vehicular emissions and fuel consumption in developing urban areas. Sustainability 2023, 15, 1503.
- Park, J.; Hwang, E. A two-stage multistep-ahead electricity load forecasting scheme based on LightGBM and attention-BiLSTM. Sensors 2021, 21, 7697.
- Chahal, A.; Gulia, P.; Gill, N.S.; Priyadarshini, I. A hybrid univariate traffic congestion prediction model for IOT-enabled smart city. Information 2023, 14, 268.

| Reference | Focus | Models/Techniques | Key Findings |
|---|---|---|---|
| [11] | Intersection traffic forecasting | Graph theory, trajectory mining | Real-time analytics with structural time series models. |
| [12] | Urban congestion spot identification | mGCN, DenseNet | Achieved 85.5% accuracy in predicting congestion spots. |
| [13] | Traffic flow prediction | ANFIS, ANFIS-GA | ANFIS-GA achieved R² = 0.9980, surpassing standalone ANFIS. |
| [14] | Multi-intersection traffic modeling | GRU | Demonstrated robust performance even with sparse datasets. |
| [15] | Traffic flow forecasting | LSTM, ARIMA | LSTM outperformed ARIMA in predictive reliability. |
| [16] | Volume prediction | GRU-LSTM, wavelet transform | Achieved 94% accuracy through hybrid noise reduction methods. |
| [17,18] | Hybrid modeling | SARIMA + Bi-LSTM, LSTM + PSO | Achieved low RMSE and high prediction accuracy across test cases. |
| [19,20] | Real-time management frameworks | STNPP, GRU + V2I communication | Effective in reducing congestion and optimizing travel times. |
| [21,22] | Bayesian and ensemble approaches | BCNN, MLP | High regression accuracy; robust multi-lane intersection predictions. |
| [23,24] | Time-series and stacking methods | SARIMA, KNN + Elman NN | Offered computationally efficient and scalable solutions. |
| [25,26,27] | IoT and directed networks | LightGBM, DSAM, LSTM | Reliable congestion management for urban planning and traffic optimization. |
| Current Work | Congestion level prediction at intersections | RF, XGBoost, LightGBM, CatBoost, ANN | Best results with RF (F1-score = 0.75, QWK = 0.54) on the CN+ dataset with DIFS feature selection. |
| Temperature | Dew Point | Humidity | Wind Speed | Wind Gust | Pressure |
|---|---|---|---|---|---|
| 61 °F | 48 °F | 63 % | 15 mph | 0 mph | 29.71 in |
| 61 °F | 48 °F | 63 % | 15 mph | 0 mph | 29.71 in |
| 61 °F | 48 °F | 63 % | 15 mph | 0 mph | 29.71 in |
| 61 °F | 48 °F | 63 % | 15 mph | 0 mph | 29.71 in |
| 61 °F | 48 °F | 63 % | 15 mph | 0 mph | 29.71 in |
| Temperature | Dew Point | Humidity | Wind Speed | Wind Gust | Pressure |
|---|---|---|---|---|---|
| 61.0 | 48.0 | 63.0 | 15.0 | 0.0 | 29.71 |
| 61.0 | 48.0 | 63.0 | 15.0 | 0.0 | 29.71 |
| 61.0 | 48.0 | 63.0 | 15.0 | 0.0 | 29.71 |
| 61.0 | 48.0 | 63.0 | 15.0 | 0.0 | 29.71 |
| 61.0 | 48.0 | 63.0 | 15.0 | 0.0 | 29.71 |
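
The two weather tables above show unit-tagged strings before conversion and bare numbers after it. One way to perform that step is to extract the leading numeric value from each cell, as in the sketch below; the helper name `to_number` is hypothetical, and no unit conversion is applied because the tables show the values unchanged (for example `61 °F` becomes `61.0`).

```python
import re

# First signed integer or decimal number in a cell such as "61 °F" or "29.71 in".
_NUMBER = re.compile(r"[-+]?\d+(?:\.\d+)?")


def to_number(cell: str) -> float:
    """Strip the unit suffix from a weather cell and return the numeric part."""
    match = _NUMBER.search(cell)
    if match is None:
        raise ValueError(f"no numeric value found in {cell!r}")
    return float(match.group())
```

Cells with no numeric part raise an error here, so that missing readings are handled explicitly in the missing-value step instead of being converted silently.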
| Feature | Importance (RF) | Score (Chi2) | Top 25 RF | Top 25 Chi2 | Selected (Final Features) |
|---|---|---|---|---|---|
| Second | 0.149726 | 1.393803 | True | True | True |
| Lane_Capacity | 0.135371 | 625.745138 | True | True | True |
| Minute | 0.128438 | 0.537253 | True | True | True |
| Rolling_Avg_Cong. | 0.115735 | 222.695906 | True | True | True |
| vc_ratio | 0.083693 | 219.885640 | True | True | True |
| Previous_Cong. | 0.074941 | 118.899532 | True | True | True |
| Direction | 0.074234 | 51.890878 | True | True | True |
| Number | 0.037306 | 31.259874 | True | True | True |
| Type | 0.021417 | 1161.434540 | True | True | True |
| o3 | 0.017822 | 1.313062 | True | True | True |
| no2 | 0.015387 | 0.417706 | True | True | True |
| pm2_5 | 0.015247 | 1.012076 | True | True | True |
| so2 | 0.014297 | 1.831040 | True | True | True |
| Temperature | 0.009806 | 7.999304 | True | True | True |
| Day | 0.009368 | 15.255901 | True | True | True |
| Humidity | 0.009286 | 8.865574 | True | True | True |
| Hour | 0.009038 | 14.022089 | True | True | True |
| Wind | 0.009033 | 5.855798 | True | True | True |
| Wind Speed | 0.008898 | 5.567105 | True | True | True |
| Weekday | 0.008526 | 40.147760 | True | True | True |
| Dew Point | 0.007091 | 1.925835 | True | True | True |
| Pressure | 0.007571 | 5.446331 | True | True | True |
| Month | 0.003691 | 8.535287 | True | True | True |
| pm10 | 0.015471 | 0.237498 | True | False | False |
| co | 0.014222 | 0.070474 | True | False | False |
| Class | Precision | Recall | F1-score | Accuracy | QWK Score | Support | Runtime (s) |
|---|---|---|---|---|---|---|---|
| Random Forest (RF) | | | | | | | |
| Low | 0.84 | 0.91 | 0.87 | - | - | 3584 | - |
| Medium | 0.20 | 0.09 | 0.13 | - | - | 518 | - |
| High | 0.60 | 0.61 | 0.60 | - | - | 712 | - |
| Overall | 0.74 | 0.77 | 0.75 | 0.77 | 0.54 | 4814 | 0.081507 |
| XGBoost | | | | | | | |
| Low | 0.85 | 0.82 | 0.84 | - | - | 3584 | - |
| Medium | 0.13 | 0.14 | 0.13 | - | - | 518 | - |
| High | 0.54 | 0.61 | 0.57 | - | - | 712 | - |
| Overall | 0.73 | 0.72 | 0.72 | 0.72 | 0.50 | 4814 | 0.579527 |
| LightGBM | | | | | | | |
| Low | 0.85 | 0.80 | 0.82 | - | - | 3584 | - |
| Medium | 0.14 | 0.17 | 0.15 | - | - | 518 | - |
| High | 0.53 | 0.60 | 0.56 | - | - | 712 | - |
| Overall | 0.73 | 0.70 | 0.71 | 0.70 | 0.48 | 4814 | 6.460226 |
| CatBoost | | | | | | | |
| Low | 0.85 | 0.81 | 0.83 | - | - | 3584 | - |
| Medium | 0.14 | 0.16 | 0.15 | - | - | 518 | - |
| High | 0.54 | 0.60 | 0.57 | - | - | 712 | - |
| Overall | 0.73 | 0.71 | 0.72 | 0.71 | 0.49 | 4814 | 0.376003 |
| Artificial Neural Network (ANN) | | | | | | | |
| Low | 0.85 | 0.78 | 0.82 | - | - | 3584 | - |
| Medium | 0.15 | 0.25 | 0.19 | - | - | 518 | - |
| High | 0.57 | 0.54 | 0.55 | - | - | 712 | - |
| Overall | 0.73 | 0.69 | 0.71 | 0.69 | 0.48 | 4814 | 314.791580 |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).