Deep learning has recently performed outstandingly well in various visual tasks, including scene perception, object identification, object tracking, 3D geometry estimation, image segmentation, and many other tasks pertinent to autonomous vehicles. Pioneering work such as ImageNet [29] is one of the finest examples of a large-scale dataset proposed for visual recognition, together with the deep neural networks it enabled. Similarly, many datasets have been proposed over the last decade to improve visual recognition results. Focusing more specifically on the usefulness of such datasets for autonomous driving, the KITTI [
30], Microsoft COCO [
31], and Cityscapes [
32] datasets have contributed greatly to advancing visual perception in the autonomous vehicle field. Other influential datasets that contributed to the diversity and quantity of resources are CamVid [
33], Caltech [
34], Daimler-CB [
35], CVC [
36], NICTA [
37], Daimler-DB [
38], INRIA [
39], ETH [
40], TUDBrussels [
41], Leuven [
42], Daimler Urban Segmentation [
43], and many more, all of which enriched the repository. These datasets were collected from the real world (or generated synthetically) and used for various purposes, such as pedestrian classification, pedestrian detection, object detection, and semantic segmentation.
This study focuses on particular datasets containing various weather characteristics to achieve robust perception in harsh weather. The principal harsh weather components are snow, rain, fog, night, and sand, which also contain sub-components such as mist, haze, smog, strong daylight, reflective night light, rainy night, rainstorm, sandstorm, dust tornado, clouds, overcast, sunset, and shadow. Some datasets cover these weather characteristics but are limited to a few specific features in isolation. Therefore, those datasets can neither be considered uniquely suitable for weather-invariant perception nor ubiquitously useful in harsh weather. We have therefore chosen several promising open datasets according to their features, usefulness, and weather characteristics, and planned to merge them to cover all features and build a comprehensive repository that eliminates gaps in environmental characteristics. Moreover, deep learning methods, i.e., neural networks, are extremely data-hungry; thus, the fusion of different datasets could help in learning useful features globally. Ample datasets are available for harsh weather, such as Radiate [
44], EU [
45], KAIST multispectral [
46], WildDash [
47], Raincouver [
48], 4Seasons [
49], Snowy Driving [
50], Waymo Open [
51], Argoverse [
52], DDD17 [
53], D2-City [
54], nuScenes [
55], CADCD [
56], LIBRE [
57], Foggy Cityscapes [
58], etc. These datasets mostly contain camera images (some also include LiDAR, Radar, GPS, and IMU data) captured in the real world under various weather conditions. On the other hand, SYNTHIA [
59] and ALSD [
60] contain synthetic images from a computer-generated virtual world, including some adverse weather features. Despite the huge progress in the autonomous driving data field, we have chosen a few particular datasets based on availability, features, geographical variation, and combinations of the most useful weather characteristics. The following datasets were collected online for further progress in this work. The data collection process followed the official rules of the corresponding resources, and we registered on their websites to obtain official permission (if required) to use their data in future research.
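Because the merging of these sources is central to this study, a brief illustration of how the fusion could be organized may be helpful. The following is a minimal sketch only, assuming each chosen dataset has already been downloaded into its own directory and pre-sorted into per-weather subfolders; all paths and tag names are hypothetical placeholders, not part of any dataset's official tooling.

```python
import csv
from pathlib import Path

# Hypothetical local layout: one directory per source dataset, with images
# pre-sorted into per-weather subfolders (e.g. data/acdc/fog/xxx.png).
SOURCES = {
    "acdc": Path("data/acdc"),
    "dawn": Path("data/dawn"),
    "bdd_subset": Path("data/bdd_adverse"),
}
WEATHER_TAGS = {"rain", "fog", "snow", "night", "sand"}
IMG_SUFFIXES = {".png", ".jpg", ".jpeg"}

def build_index(out_csv="merged_weather_index.csv"):
    """Write one (dataset, weather, image_path) row per image into a CSV index."""
    with open(out_csv, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["dataset", "weather", "image_path"])
        for name, root in SOURCES.items():
            for img in root.rglob("*"):
                if img.suffix.lower() not in IMG_SUFFIXES:
                    continue
                weather = img.parent.name.lower()  # folder name encodes the condition
                if weather in WEATHER_TAGS:
                    writer.writerow([name, weather, str(img)])

if __name__ == "__main__":
    build_index()
```

Such a flat index keeps the merged repository dataset-agnostic: downstream training code can sample by weather tag without knowing the originating dataset's layout.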
For this investigation, our attention is on camera images because the camera is the most crucial sensor for environmental scene perception, especially for traffic sign recognition, object identification, and object localization. LiDAR, moreover, is also easily influenced by weather. Radar and IMU could be added as supplementary sensors, but improving them is optional because they are only insignificantly influenced by harsh weather. Among the image datasets, the Berkeley DeepDrive (BDD) [
61] and EuroCity [
62] could be useful resources, providing the main contribution to the data merging in this work. Some other datasets may also contribute fewer images but very intensive features from the perspective of weather characteristics. The BDD dataset contains 100,000 camera images (collected from driving video) from various cities in the USA, such as New York, Berkeley, and San Francisco. Besides typical weather, it contains images with other weather features such as rain, fog, overcast, cloud, snow, and night light. The BDD dataset established a benchmark for the ten particular tasks mentioned in its paper and annotated every image accordingly. These tasks are image tagging, lane detection, drivable area segmentation, road object detection, semantic segmentation, instance segmentation, multi-object detection tracking, multi-object segmentation tracking, domain adaptation, and imitation learning. Though the dataset is rich in terms of the number of images, it contains few images with harsh weather features: only 23 fog images, 213 rain images, 765 snow images, and 345 night images are useful for learning the weather features [63]. Extracting those useful images would otherwise require a manual search through the huge dataset, which might not be feasible, so it is better to also focus on datasets containing more harsh weather images and greater feature diversity.
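That said, the triage need not be fully manual: BDD100K ships per-image attribute labels (weather, scene, time of day) in a JSON file, so a short script can at least pull out the candidate adverse-weather subset. A minimal sketch follows, assuming the label file name and the attribute values used in the BDD100K release; both should be verified against the downloaded copy.

```python
import json
from collections import Counter

# Hypothetical file name; the BDD100K release ships similar per-split label files.
LABELS = "bdd100k_labels_images_train.json"
ADVERSE_WEATHER = {"rainy", "snowy", "foggy"}  # attribute values to keep

with open(LABELS) as f:
    frames = json.load(f)  # one record per image

adverse = [
    fr["name"]
    for fr in frames
    if fr.get("attributes", {}).get("weather") in ADVERSE_WEATHER
    or fr.get("attributes", {}).get("timeofday") == "night"
]

# Tally the per-value distribution to reproduce counts like those cited above.
print(Counter(fr.get("attributes", {}).get("weather") for fr in frames))
print(f"{len(adverse)} adverse-condition images found")
```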
Another rich dataset is EuroCity, which contains 47,300 images collected from 31 cities in 12 European countries, characterized by geographical variety and covering various weather categories such as rain, fog, snow, and night light besides normal conditions. However, the dataset primarily focuses on pedestrian detection in traffic scenes and contains 238,200 annotated persons. The Mapillary dataset [
64] collected 25,000 street images from the Mapillary street view app. The collection is distributed worldwide and includes images taken in rain, snow, fog, and night besides natural weather conditions. This makes it the most geographically diverse dataset, containing scene perceptions from different regions of the world with varied traffic rules and road conditions. However, the dataset has the same problem as the BDD dataset: since this work studies diverse weather conditions, it requires an adequate number of weather-diverse images relative to typical weather images. Playing for Benchmarks [
65], which contains 254,064 high-resolution image frames from its video collection, is the richest dataset in terms of the number of images. However, the images were captured in a computer-generated virtual environment. The ApolloScape dataset [
66] was considered for its capture of driving in bright sunlight, a situation that occurs frequently while driving. The dataset contains 143,906 images collected from four regions in China, but only a small portion of them is useful for learning the adverse situations this work targets. Besides strong light, vehicles can also face sun glare, which can quickly impair vision and result in serious accidents. Until recently, there was a shortage of autonomous driving datasets with images of objects to detect under sun glare, a gap the autonomous driving research community had largely neglected. Among the few works addressing it, [
67] proposed a glare dataset for detecting traffic signs only. The “Adverse Conditions Dataset with Correspondences” [
63], also known as the “ACDC dataset”, has 4,006 camera images from Zurich recorded in four weather conditions: rain, fog, snow, and night. Every ACDC image exhibits one of these weather features, and the 4,006 images are evenly distributed across the four characteristics, which is very useful despite the much smaller number of images than in the BDD or EuroCity datasets. Therefore, from the perspective of usefulness, this dataset is richer than the others described previously. The 19 classes provided by Cityscapes [
32] were annotated on the ACDC dataset with pixel-level semantic segmentation and reliable ground truth. The accompanying paper tested multiple existing neural networks and compared their performance on the dataset.
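For reference, those 19 Cityscapes evaluation classes reused by ACDC can be held in a simple lookup when decoding the segmentation masks. A minimal sketch follows, assuming the ground truth follows the Cityscapes trainId convention (IDs 0-18, with 255 for ignored pixels); the exact mask naming in the ACDC download should be checked.

```python
import numpy as np
from PIL import Image

# The 19 Cityscapes evaluation classes, indexed by trainId (0-18).
CLASSES = (
    "road", "sidewalk", "building", "wall", "fence", "pole",
    "traffic light", "traffic sign", "vegetation", "terrain", "sky",
    "person", "rider", "car", "truck", "bus", "train",
    "motorcycle", "bicycle",
)

def class_histogram(mask_path):
    """Count pixels per class in one ground-truth mask.

    Assumes the mask stores Cityscapes trainIds (255 = ignore); verify the
    ACDC ground-truth naming (e.g. *_labelTrainIds.png) before relying on it.
    """
    mask = np.array(Image.open(mask_path))
    return {CLASSES[i]: int((mask == i).sum()) for i in range(len(CLASSES))}
```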
The “Vehicle Detection in Adverse Weather Nature” dataset, also known as the “DAWN dataset” [68], contains only 1,027 photos gathered from web searches on Google and Bing, yet it is another highly helpful dataset. It was selected for its extremely harsh weather qualities, which can serve as real-world examples for training and testing under adverse conditions. It also includes several sandstorm images that offer distinctive aspects compared to the other datasets mentioned earlier. In total, 7,845 bounding boxes for vehicles, buses, trucks, motorcycles, bicycles, pedestrians, and riders were labeled in the DAWN dataset with the LabelMe tool. The primary distinguishing feature of the ACDC and DAWN datasets is that every image depicts adverse weather. From the above discussion, the criteria for choosing the datasets should now be clear. Relevant images could also be collected manually from the other datasets mentioned above, which would be time consuming but useful for extending this work further. However, we found the ACDC and DAWN datasets to be the most helpful for our analysis.
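Because DAWN's labels were produced with the LabelMe tool, they can also be consumed programmatically. The sketch below is a minimal illustration assuming the common LabelMe JSON layout (a "shapes" list of labeled rectangles stored as two corner points); the annotation format actually shipped with DAWN should be verified against the download, as LabelMe exports vary.

```python
import json
from collections import Counter
from pathlib import Path

def load_boxes(ann_file):
    """Return (label, (xmin, ymin, xmax, ymax)) pairs from one LabelMe JSON file.

    Assumes the common LabelMe layout: a "shapes" list whose rectangle
    entries store two corner points; verify against the DAWN annotations.
    """
    with open(ann_file) as f:
        ann = json.load(f)
    boxes = []
    for shape in ann.get("shapes", []):
        if shape.get("shape_type") != "rectangle":
            continue
        (x1, y1), (x2, y2) = shape["points"]
        boxes.append((shape["label"],
                      (min(x1, x2), min(y1, y2), max(x1, x2), max(y1, y2))))
    return boxes

# Example: per-class object counts across a (hypothetical) local copy of DAWN.
counts = Counter()
for ann_file in Path("data/dawn").rglob("*.json"):
    for label, _ in load_boxes(ann_file):
        counts[label] += 1
print(counts)
```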