Figure 1. Derivatives of the intensity of pixel edges [12].
Figure 2. Derivatives of the intensity of pixel edges [12].
Figure 3. Hysteresis thresholding [12].
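Hysteresis thresholding keeps weak edge pixels only when they connect to a strong edge pixel. A minimal pure-Python sketch of that rule (an illustrative toy version, not the paper's implementation; in practice OpenCV's `cv2.Canny` performs this step internally):

```python
from collections import deque

def hysteresis_threshold(grad, low, high):
    """Classify gradient magnitudes: pixels >= high are strong edges;
    pixels >= low are kept only if 8-connected to a strong edge."""
    h, w = len(grad), len(grad[0])
    edges = [[False] * w for _ in range(h)]
    queue = deque()
    # Seed with all strong pixels.
    for y in range(h):
        for x in range(w):
            if grad[y][x] >= high:
                edges[y][x] = True
                queue.append((y, x))
    # Grow edges into connected weak pixels (BFS over 8-neighbourhood).
    while queue:
        y, x = queue.popleft()
        for dy in (-1, 0, 1):
            for dx in (-1, 0, 1):
                ny, nx = y + dy, x + dx
                if (0 <= ny < h and 0 <= nx < w
                        and not edges[ny][nx] and grad[ny][nx] >= low):
                    edges[ny][nx] = True
                    queue.append((ny, nx))
    return edges
```

An isolated weak pixel (above `low` but nowhere near a strong pixel) is discarded, which is what suppresses noise while preserving faint but continuous edges.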
Figure 4. Canny edge detection in selected and rejected examples.
Figure 5. Selected images for the dataset.
Figure 6. Images rejected from the dataset due to low quality (1-2), blur (3), and brightness (4).
Figure 7. Example of a fully annotated container.
Figure 8. DeepLabv3+ encoder-decoder architecture, using ASPP for feature extraction and a decoder for upsampling. Source: [25].
Figure 9. Effect of the decoder compared with bilinear upsampling. Source: [25].
Figure 10. Intersection over Union (IoU) metric.
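The IoU metric divides the overlap between a predicted mask and the ground-truth mask by their union. A minimal pure-Python sketch on binary masks (a hypothetical helper for illustration, not the paper's evaluation code):

```python
def iou(mask_a, mask_b):
    """Intersection over Union of two binary masks given as
    equal-sized lists of 0/1 rows. Returns a value in [0, 1]."""
    inter = union = 0
    for row_a, row_b in zip(mask_a, mask_b):
        for a, b in zip(row_a, row_b):
            inter += a & b   # pixel counted when both masks are 1
            union += a | b   # pixel counted when either mask is 1
    return inter / union if union else 0.0
```

For two 2x2 squares offset by one column, the intersection covers 2 pixels and the union 6, giving IoU = 1/3.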
Figure 11. Largest-area segmented mask output from SAM.
Figure 12. Predictor output with two points (red stars) as prompt. Best mask score: 0.975.
Figure 13. Container masks on black and white backgrounds.
Figure 14. Comparison of ground-truth masks and model predictions for containers with distinct and similar corrosion colors.
Figure 15. DeepLabv3+: Training loss over 300 epochs using default parameters.
Figure 16. DeepLabv3+: Validation loss over 300 epochs using default parameters.
Figure 17. DeepLabv3+: Validation corrosion IoU over 300 epochs using default parameters.
Figure 18. DeepLabv3+ (smoothed lines): Impact of different backgrounds (default, white, and black) on validation performance over 300 epochs.
Figure 19. DeepLabv3+: Impact of data augmentation on validation performance over 300 epochs.
Figure 20. DeepLabv3+ (smoothed lines): Impact of the backbone network (MobileNetV2, ResNet50, and ResNet101) on validation performance over 300 epochs.
Figure 21. DeepLabv3+ (smoothed lines): Impact of the optimizer on validation performance: SGD (300 epochs), Adam and AdamW (200 epochs).
Figure 22. Validation: Comparison of the output prediction with the ground-truth mask and the original image.
Figure 23. Test: Comparison of the output prediction with the ground-truth mask and the original image.
Figure 24. Validation mean IoU using ResNet101 and AdamW (lr = 0.002) over 200 epochs.
Table 1. Number of Canny edge pixels and Laplacian variance values.
| Image | Canny Edges | Laplacian Variance |
|---|---|---|
| Rejected 1 | ≈ 77,000 | 11 |
| Rejected 2 | ≈ 42,000 | 11 |
| Rejected 3 | ≈ 6,000 | 2 |
| Rejected 4 | ≈ 5,500 | 3 |
| Acceptable 1 | ≈ 316,000 | 54 |
| Acceptable 2 | ≈ 129,000 | 23 |
| Acceptable 3 | ≈ 137,000 | 85 |
| Acceptable 4 | ≈ 318,000 | 50 |
Table 2. Comparison of generic annotation tools.
| Tool | Open Source | Friendly UI | Large-scale Datasets | Pixel-Level Labeling |
|---|---|---|---|---|
| CVAT | ✓ | ✓ | ✓ | ✓ |
| LabelStudio | ✓ | ✗ | ✓ | ✓ |
| LabelBox | ✓ | ✗ | ✓ | ✓ |
| VIA | ✓ | ✓ | ✗ | ✗ |
| LabelMe | ✓ | ✓ | ✗ | ✗ |
Table 3. Comparison of CV annotation tools.
| Tool | Brush Tool | AI Magic Wand | Intelligent Scissors | Bounding Box | Points |
|---|---|---|---|---|---|
| CVAT | ✓ | ✓ | ✓ | ✓ | ✓ |
| LabelStudio | ✓ | ✓ | ✓ | ✓ | ✓ |
| LabelBox | ✓ | ✓ | ✓ | ✓ | ✓ |
| VIA | ✗ | ✗ | ✗ | ✓ | ✓ |
| LabelMe | ✗ | ✗ | ✗ | ✓ | ✓ |
Table 4. Comparison of resampling filters for downscaling and upscaling quality [22].
| Resampling Filter | Downscaling Quality | Upscaling Quality |
|---|---|---|
| NEAREST | | |
| BOX | ★ | |
| BILINEAR | ★ | ★ |
| HAMMING | ★★ | |
| BICUBIC | ★★★ | ★★★ |
| LANCZOS | ★★★★ | ★★★★ |
Table 5. Comparison of semantic segmentation models. Source: [23,24].
| Model | PASCAL VOC (mIoU %) | Cityscapes (mIoU %) |
|---|---|---|
| FCN | 62.2 | 65.3 |
| SegNet | - | 57.0 |
| PSPNet | 82.6 | 78.4 |
| DeepLabV3 | 85.7 | 81.3 |
| DeepLabV3+ | 87.8 | - |
Table 6. DeepLabv3+ pre-trained performance (IoU) on experiments 1 (20 random images), 2 (40 images without red cargo containers), and 3 (20 images with only red cargo containers). Each experiment was run with all model weights and their respective loss functions: w18 (cross-entropy), w27 (L1 loss), w35 (L2 loss), and w40 (cross-entropy with weighted classes). Model performance on the original images is compared against white and black backgrounds.
| Experiments | Weights | Original | White Background | Black Background |
|---|---|---|---|---|
| 1 | 18 | 9.3% | 14.5% | 16.3% |
| | 27 | 15.5% | 20.4% | 21.1% |
| | 35 | 14.9% | 18.7% | 18.9% |
| | 40 | 10.3% | 22.0% | 20.5% |
| 2 | 18 | 6.9% | 13.1% | 14.9% |
| | 27 | 12.2% | 17.7% | 18.5% |
| | 35 | 9.2% | 14.4% | 14.9% |
| | 40 | 9.3% | 18.3% | 14.8% |
| 3 | 18 | 0.6% | 1.4% | 1.6% |
| | 27 | 0.9% | 2.4% | 1.9% |
| | 35 | 1.0% | 1.5% | 1.5% |
| | 40 | 0.8% | 0.9% | 1.7% |
Table 7. DeepLabv3+ fine-tuned performance (IoU) on the validation and test sets for each configuration in all experiments.
| Experiments | Configuration | Validation | Test |
|---|---|---|---|
| Background | Default | 41.3% | 30.0% |
| | Black | 40.1% | 30.4% |
| | White | 41.0% | 29.5% |
| Data Augmentation | Default | 41.3% | 30.0% |
| | Without | 39.4% | 29.3% |
| Networks | MobileNetV2 (def) | 41.3% | 30.0% |
| | ResNet50 | 43.2% | 34.7% |
| | ResNet101 | 46.3% | 40.1% |
| Optimizers | SGD (def) | 46.3% | 40.1% |
| | Adam | 52.5% | 49.4% |
| | AdamW | 52.7% | 49.4% |
Table 8. Performance comparison of the DeepLabv3+ pre-trained model with the fine-tuned model.
| DeepLabv3+ Model | Performance (IoU) |
|---|---|
| Pre-trained | 4.0% |
| Fine-tuned | 49.4% |