Submitted:
25 August 2025
Posted:
26 August 2025
Abstract
Keywords:
1. Introduction
- RQ1: Which deep learning model between LSTM and BiLSTM performs better in classifying subsea environments?
- RQ2: How reliable is the classification of a specific point in bathymetric LiDAR data for seabed and vegetation mapping?
2. Related Work
2.1. Research Gap
3. Research Methodology
3.1. Data Collection
3.2. Dataset Preparation and Merging
- Point: The first three values represent the Cartesian coordinates of the reflecting point, that is, east, north, and height.
- Scanner: The next three values, in line 2, give the position of the scanner during data capture as (X, Y, Z) coordinates.
- Intensity: The strength of the laser return for the given point (for this waveform, the value was 301).
- Time (absolute GPS time, unit: sec): The timestamp at which the data were captured.
- Channel 1 Count: The count is 960, which indicates the number of returns detected for Channel 1.
- Sample Length: The time interval between consecutive captured sample points.
- Point: Represents the elapsed time (in seconds) from the first recorded intensity sample to the returning point in the waveform.
- Vector Information: The three values (lines 8-10) represent the directional components (X, Y, Z) associated with the return vector.
- Channel 1 Samples: The final section lists the intensity values of the samples from Channel 1. In the collected dataset, each file contains 960 intensity samples.
3.3. Novel Data Preprocessing Approach
- Extraction of Metadata: In the first step, we read each text file to extract and store the relevant information from lines 1-10 for further processing.
- Laser intensity value identification: Next, we retrieved all the intensity values, which occupy lines 12 to 971.
- Filtering of laser intensity value: Then the initial noise values were discarded.
- Categorization of intensity values: Next, we categorized the remaining values into five distinct classes: Noise, Sea surface, Water, Vegetation, and Seabed.
- Creation of Final CSV File: Finally, a CSV file was generated against each text file with the necessary information.
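The first two steps above can be illustrated with a minimal sketch. It assumes that each intensity sample sits on its own line, so that lines 12 to 971 hold exactly 960 values; the function name `parse_waveform_text` and its parameters are hypothetical and are not taken from the authors' code.

```python
def parse_waveform_text(text, n_meta_lines=10, first_sample_line=12,
                        last_sample_line=971):
    """Split one waveform text file into metadata lines and intensity samples.

    Assumption: one intensity value per line in the sample block.
    Line numbers are 1-based, matching the description in the text.
    """
    lines = text.splitlines()
    # Lines 1-10: metadata, kept as raw strings for later processing.
    metadata = [ln.strip() for ln in lines[:n_meta_lines]]
    # Lines 12-971: intensity samples (convert 1-based lines to a slice).
    sample_lines = lines[first_sample_line - 1:last_sample_line]
    samples = [float(ln) for ln in sample_lines if ln.strip()]
    return metadata, samples


if __name__ == "__main__":
    # Synthetic file: 10 metadata lines, 1 separator line, 960 samples.
    demo = "\n".join(["meta"] * 10 + ["samples:"] + [str(i) for i in range(960)])
    meta, values = parse_waveform_text(demo)
    print(len(meta), len(values))
```

The noise filtering and labeling steps would then operate on the returned list of samples.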
3.3.1. Metadata Extraction
3.3.2. Noise Filtering
3.3.3. Data Labeling
- For the first peak (sea surface), a slightly wider region was considered (from 10 samples before to 5 samples after the peak, a total of 16 samples).
- For subsequent peaks (vegetation and seabed), the region was narrowed (5 samples before and 5 after, a total of 11 samples).
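The two window sizes can be expressed as a small helper. This is a sketch only: the name `peak_window` is hypothetical, indices are assumed to be 0-based, and clipping the window to the waveform bounds is an assumption that the text does not state.

```python
def peak_window(peak_index, is_first_peak, n_samples):
    """Return the inclusive (start, end) sample range around a detected peak.

    First peak (sea surface): 10 samples before and 5 after (16 samples).
    Later peaks (vegetation, seabed): 5 before and 5 after (11 samples).
    Assumption: windows are clipped so they stay inside the waveform.
    """
    before = 10 if is_first_peak else 5
    after = 5
    start = max(0, peak_index - before)
    end = min(n_samples - 1, peak_index + after)
    return start, end


if __name__ == "__main__":
    print(peak_window(50, True, 300))    # sea-surface window
    print(peak_window(100, False, 300))  # vegetation or seabed window
```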

- Noise (Class 0)
- Sea surface (Class 1)
- Water (Class 2)
- Vegetation (Class 3)
- Seabed (Class 4)
- When there were three regions, the assigned classes were 1 (sea surface), 3 (vegetation), and 4 (seabed).
- When there were two regions, the assigned classes were 1 (sea surface) and 4 (seabed), indicating that no vegetation was present.
- When there were four regions, the assigned classes were (1, 3, 3, 4), indicating two vegetation regions between the sea surface and the seabed.
- The intensity values before the sea surface region and after the seabed region were assigned as class 0 (Noise).
- The intensity values between the sea surface region and the vegetation region were assigned as class 2 (Water).
- Similarly, the intensity values between the vegetation region and the seabed region were assigned as class 2 (Water).
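The labeling rules above can be sketched as follows. The function `label_waveform` and the `REGION_CLASSES` table are hypothetical names. The sketch also assumes that every gap between the first and last region is labeled as water, including the gap between the sea surface and the seabed when no vegetation is found and the gap between two vegetation regions, which the text does not spell out.

```python
# Class ids: 0 Noise, 1 Sea surface, 2 Water, 3 Vegetation, 4 Seabed.
REGION_CLASSES = {2: [1, 4], 3: [1, 3, 4], 4: [1, 3, 3, 4]}


def label_waveform(n_samples, regions):
    """Assign a class id to every sample of one waveform.

    `regions` is a list of inclusive (start, end) index pairs, sorted by
    position; 2, 3, or 4 regions are supported, as in the rules above.
    """
    classes = REGION_CLASSES[len(regions)]
    labels = [0] * n_samples                 # default: noise
    first_start, last_end = regions[0][0], regions[-1][1]
    for i in range(first_start, last_end + 1):
        labels[i] = 2                        # assumed: gaps are water
    for (start, end), cls in zip(regions, classes):
        for i in range(start, end + 1):
            labels[i] = cls                  # overwrite with region class
    return labels


if __name__ == "__main__":
    demo = label_waveform(60, [(5, 20), (30, 40), (45, 55)])
    print(demo[0], demo[10], demo[25], demo[35], demo[50], demo[58])
```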
3.3.4. Methodological Comparison with Existing Studies
3.3.5. Feature Selection
- Have high correlation with the target variable (‘Class_Label’): Ensuring that the selected features contribute meaningful information to classification.
- Have low correlation with each other: Preventing redundancy and reducing noise in the model.
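One way to apply the two criteria is a greedy filter on Pearson correlation. The sketch below is illustrative only: the thresholds `min_target_corr` and `max_pair_corr`, the function names, and the greedy ordering are assumptions and do not reproduce the authors' selection procedure.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def select_features(columns, target, min_target_corr=0.3, max_pair_corr=0.9):
    """Keep features correlated with the target but not with each other.

    `columns` maps a feature name to its list of values. Both thresholds
    are hypothetical defaults chosen for illustration.
    """
    ranked = sorted(columns,
                    key=lambda name: abs(pearson(columns[name], target)),
                    reverse=True)
    selected = []
    for name in ranked:
        if abs(pearson(columns[name], target)) < min_target_corr:
            continue  # too weakly related to the class label
        if all(abs(pearson(columns[name], columns[kept])) <= max_pair_corr
               for kept in selected):
            selected.append(name)  # not redundant with a kept feature
    return selected


if __name__ == "__main__":
    y = [0, 1, 2, 3, 4, 5]
    feats = {"a": [0, 1, 2, 3, 4, 5], "b": [0, 2, 4, 6, 8, 10],
             "c": [1, -1, 1, -1, 1, -1]}
    print(select_features(feats, y))
```

In the demo, `a` and `b` are perfectly correlated with each other, so only one of them is kept, and `c` is dropped because it is only weakly related to the target.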
3.3.6. Data Normalization
3.3.7. One-Hot Encoding
3.3.8. Splitting Data into Training, Validation, and Testing Datasets
3.4. Model Architecture
3.4.1. LSTM Model Architecture
3.4.2. BiLSTM Model Architecture
3.5. Experimental Setup and Hyperparameter Tuning
3.6. Model Performance Metrics
- True Positives (TP): Positive cases that are correctly predicted.
- True Negatives (TN): Negative cases that are correctly predicted.
- False Positives (FP): Negative cases that are incorrectly predicted as positive (also known as Type I error).
- False Negatives (FN): Positive cases that are incorrectly predicted as negative (also known as Type II error).
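The four counts above combine into the metrics named in the following subsections. The sketch below uses the standard textbook definitions; the function name `classification_metrics` and the choice to return 0.0 when a denominator is zero are assumptions made for illustration.

```python
def classification_metrics(tp, tn, fp, fn):
    """Compute accuracy, precision, recall, and F1 from confusion counts.

    accuracy  = (TP + TN) / (TP + TN + FP + FN)
    precision = TP / (TP + FP)
    recall    = TP / (TP + FN)
    F1        = 2 * precision * recall / (precision + recall)
    Assumption: a metric with a zero denominator is reported as 0.0.
    """
    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


if __name__ == "__main__":
    print(classification_metrics(tp=8, tn=5, fp=2, fn=1))
```

For a multi-class problem such as this one, the counts are taken per class in a one-vs-rest fashion and the per-class scores are then averaged.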

3.6.1. Accuracy
3.6.2. Precision
3.6.3. Recall
3.6.4. F1-Score
3.6.5. Confusion Matrix

4. Results
4.1. Results on the Dataset 1

4.2. Results on the Dataset 2

4.3. Results on Reliability Analysis Using Confidence Scores


5. Discussion
6. Conclusions and Future Work
Author Contributions
Funding
Institutional Review Board Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
Abbreviations
| ALB | Airborne LiDAR Bathymetry |
| ALS | Airborne Laser Scanning |
| BiLSTM | Bidirectional Long Short-Term Memory |
| CNN | Convolutional Neural Network |
| DL | Deep Learning |
| LiDAR | Light Detection and Ranging |
| LSTM | Long Short-Term Memory |
| MBES | Multi-Beam Echo Sounders |
| MLP | Multi-Layer Perceptron |
| NMA | Norwegian Mapping Authority |
| RF | Random Forest |
| RNN | Recurrent Neural Network |
| SMOTE | Synthetic Minority Oversampling Technique |
| SVM | Support Vector Machine |
References
- UNESCO Ocean Literacy. The List of the Oceans with data and statistics about surface area, volume, and average depth, 2022. Available online: https://oceanliteracy.unesco.org/ocean/ (accessed on 9 December 2024).
- IFAW. How climate change impacts the ocean, 2024. Available online: https://www.ifaw.org/international/journal/climate-change-impact-ocean (accessed on 27 December 2024).
- United Nations. Transforming our world: The 2030 Agenda for Sustainable Development, 2015. Available online: https://sdgs.un.org/2030agenda (accessed on 12 January 2025).
- World Ocean Review. Transport over the seas, 2024. Available online: https://worldoceanreview.com/en/wor-7/transport-over-the-seas/ (accessed on 27 September 2024).
- American Scientist. An Ocean of Reasons to Map the Seafloor, 2023. Available online: https://www.americanscientist.org/article/an-ocean-of-reasons-to-map-the-seafloor (accessed on 4 September 2024).
- Thorsnes, T.; Misund, O.A.; Smelror, M. Seabed mapping in Norwegian waters: Programmes, technologies and future advances. 499, 99–118. Publisher: The Geological Society of London. [CrossRef]
- Kartverket. Om Kartverket, 2023. Available online: https://kartverket.no/om-kartverket (accessed on 6 September 2023).
- Daniel, S.; Dupont, V. Investigating Fully Convolutional Network To Semantic Labelling Of Bathymetric Point Cloud. V-2-2020, 657–663. Conference Name: XXIV ISPRS Congress, Commission II (Volume V-2-2020) - 2020 edition Publisher: Copernicus GmbH. [CrossRef]
- Zhong, J.; Sun, J.; Lai, Z.; Song, Y. Nearshore Bathymetry from ICESat-2 LiDAR and Sentinel-2 Imagery Datasets Using Deep Learning Approach. 14, 4229. Number: 17 Publisher: Multidisciplinary Digital Publishing Institute. [CrossRef]
- Janowski, L.; Wroblewski, R.; Rucinska, M.; Kubowicz-Grajewska, A.; Tysiac, P. Automatic classification and mapping of the seabed using airborne LiDAR bathymetry. 301, 106615. [CrossRef]
- Pricope, N.; Bashit, M.S. Emerging trends in topobathymetric LiDAR technology and mapping. 44, 7706–7731. [CrossRef]
- Szafarczyk, A.; Toś, C. The Use of Green Laser in LiDAR Bathymetry: State of the Art and Recent Advancements. 23, 292. Number: 1 Publisher: Multidisciplinary Digital Publishing Institute. [CrossRef]
- Dolan, M.; Buhl-Mortensen, P.; Thorsnes, T.; Buhl-Mortensen, L.; Bellec, V.; Bøe, R. Developing seabed nature-type maps offshore Norway: Initial results from the MAREANO programme. Norwegian Journal of Geology 2009, 89, 17–28. [Google Scholar]
- Leica Geosystems. Mapping underwater terrain with bathymetric LiDAR, 2024. Available online: https://leica-geosystems.com/case-studies/natural-resources/mapping-underwater-terrain-with-bathymetric-lidar (accessed on 4 September 2024).
- Li, Z.; Chen, B.; Wu, S.; Su, M.; Chen, J.M.; Xu, B. Deep learning for urban land use category classification: A review and experimental assessment. 311, 114290. [CrossRef]
- Saleh, A.; Sheaves, M.; Jerry, D.; Rahimi Azghadi, M. Applications of deep learning in fish habitat monitoring: A tutorial and survey. 238, 121841. [CrossRef]
- Tabassum, N.; Giudici, H.; Nunavath, V. Exploring the Combined use of Deep Learning and LiDAR Bathymetry Point-Clouds to Enhance Safe Navigability for Maritime Transportation: A Systematic Literature Review. In Proceedings of the 2024 Sixth International Conference on Intelligent Computing in Data Sciences (ICDS), 2024, pp. 1–5. [CrossRef]
- Restackio. Deep Learning Techniques for Sequential Data, 2024. Available online: https://www.restack.io/p/sequence-to-sequence-models-answer-deep-learning-techniques-sequential-data-cat-ai (accessed on 28 October 2024).
- Kogut, T.; Weistock, M.; Bakuła, K. Classification Of Data From Airborne LiDAR Bathymetry With Random Forest Algorithm based on Different Feature Vectors. XLII-2-W16, 143–148. Conference Name: Photogrammetric Image Analysis & Munich Remote Sensing Symposium: Joint ISPRS conference (Volume XLII-2/W16) - 18-20 September 2019, Munich, Germany Publisher: Copernicus GmbH. [CrossRef]
- Kogut, T.; Tomczak, A.; Słowik, A.; Oberski, T. Seabed Modelling by Means of Airborne Laser Bathymetry Data and Imbalanced Learning for Offshore Mapping. 22, 3121. Number: 9 Publisher: Multidisciplinary Digital Publishing Institute. [CrossRef]
- Roshandel, S.; Liu, W.; Wang, C.; Li, J. Semantic Segmentation of Coastal Zone on Airborne Lidar Bathymetry Point Clouds. 19, 1–5. Conference Name: IEEE Geoscience and Remote Sensing Letters. [CrossRef]
- Liang, G.; Zhao, X.; Zhao, J.; Zhou, F. MVCNN: A Deep Learning-Based Ocean–Land Waveform Classification Network for Single-Wavelength LiDAR Bathymetry. 16, 656–674. [CrossRef]
- Kogut, T.; Weistock, M. Classifying airborne bathymetry data using the Random Forest algorithm. 10, 874–882. [CrossRef]
- Pan, S.; Yoshida, K. Using Airborne Lidar Bathymetry Aided Transfer Learning Method in Riverine Land Cover Classification. Conference Name: 14th International Symposium on Ecohydraulics (2022, Nanjing).
- Eren, F.; Pe’eri, S.; Rzhanov, Y.; Ward, L. Bottom characterization by using airborne lidar bathymetry (ALB) waveform features obtained from bottom return residual analysis. 206, 260–274. [CrossRef]
- Wikipedia contributors. Fjøløy, 2023. Available online: https://en.wikipedia.org/w/index.php?title=Fj%C3%B8l%C3%B8y&oldid=1155343996 (accessed on 10 September 2024).
- Terrasolid. Waveform Processing, 2023. Available online: https://terrasolid.com/guides/tscan/introwaveformprocessing.html (accessed on 12 December 2023).
- Xiong, H.; Pandey, G.; Steinbach, M.; Kumar, V. Enhancing data analysis with noise removal. 18, 304–319. Conference Name: IEEE Transactions on Knowledge and Data Engineering. [CrossRef]
- Anishnama. Understanding LSTM: Architecture, Pros and Cons, and Implementation, 2023. Available online: https://medium.com/@anishnama20/understanding-lstm-architecture-pros-and-cons-and-implementation-3e0cca194094 (accessed on 29 September 2024).
- GeeksforGeeks. What is LSTM - Long Short Term Memory?, 2019. Available online: https://www.geeksforgeeks.org/deep-learning-introduction-to-long-short-term-memory/ (accessed on 29 September 2024).
- Anishnama. Understanding Bidirectional LSTM for Sequential Data Processing, 2023. Available online: https://medium.com/@anishnama20/understanding-bidirectional-lstm-for-sequential-data-processing-b83d6283befc (accessed on 30 October 2024).
- Saturn Cloud. Bidirectional LSTM, 2023. Available online: https://saturncloud.io/glossary/bidirectional-lstm/ (accessed on 30 October 2024).
- GeeksforGeeks. Difference Between a Bidirectional LSTM and an LSTM, 2024. Available online: https://www.geeksforgeeks.org/difference-between-a-bidirectional-lstm-and-an-lstm/ (accessed on 3 November 2024).
- Brownlee, J. Gentle Introduction to the Adam Optimization Algorithm for Deep Learning, 2017. Available online: https://www.machinelearningmastery.com/adam-optimization-algorithm-for-deep-learning/ (accessed on 3 November 2024).
- Bajaj, A. Performance Metrics in Machine Learning, 2022. Available online: https://neptune.ai/blog/performance-metrics-in-machine-learning-complete-guide (accessed on 3 October 2024).
- Fiddler AI. What is model performance evaluation, 2024. Available online: https://www.fiddler.ai/model-evaluation-in-model-monitoring/what-is-model-performance-evaluation (accessed on 3 October 2024).
- Wei, D. Essential Math for Machine Learning: Confusion Matrix, Accuracy, Precision, Recall, F1-Score, 2024. Available online: https://medium.com/@weidagang/demystifying-precision-and-recall-in-machine-learning-6f756a4c54ac (accessed on 4 October 2024).
- Deval Shah. Top Performance Metrics in Machine Learning: A Comprehensive Guide, 2023. Available online: https://www.v7labs.com/blog/performance-metrics-in-machine-learning (accessed on 3 October 2024).
| Citations | Objective | Dataset/Location | Algorithm(s) | Obtained Results |
|---|---|---|---|---|
| [19] | Classification and detection of objects on the seabed | The artificial reef Rosenort on the Baltic Sea | MLP | Accuracy: 92.07% |
| [20] | Improve seabed modeling and object detection | The artificial reef Rosenort on the Baltic Sea | MLP | Accuracy: 97.0% with LVQ SMOTE, 96% with ROSE |
| [21] | Semantic segmentation of water surface and seabed | Coastal-urban scenes of Tampa Bay, Florida, USA | SPG | Accuracy: 72.45% |
| [22] | Ocean–land waveform classification | Qinshan Island of Lianyungang City, Jiangsu Province, China | MVCNN | Accuracy: 99.41% |
| [9] | Propose a deep learning framework for nearshore bathymetry (DL-NB) | Appalachian Bay (AB), Virgin Islands (VI), and Cat Island (CI) of the United States | 2D CNN | RMSE: 1.01 m, 1.80 m, and 0.28 m in AB, VI, and CI, respectively |
| [23] | Classification of airborne laser bathymetry data into three classes: (1) water surface, (2) seabed, and (3) objects/obstacles | The artificial reef Rosenort on the Baltic Sea | RF | Accuracy: 100% water surface, 99.9% seabed, and 60% objects |
| [24] | Classify riverine land cover | Asahi River, Japan | Transfer Learning | Accuracy: 95% |
| [10] | Classification and mapping of the seafloor | Polish coast, the Southern Baltic | KNN, SVM, RF | Accuracy: varies from 75% to 91% |
| [25] | Bottom return residual analysis | The western Gulf of Maine, USA | SVM | Accuracy: 96% (sand and rock), 86% (fine and coarse sand) |
| Our Work | Multi-class classification of underwater elements (Sea surface, Vegetation, Seabed, Water, Noise) | Fjøløy island in Stavanger, Norway | LSTM | Accuracy: Dataset 1: 95.22%, Dataset 2: 94.85% |
| | | | BiLSTM | Accuracy: Dataset 1: 94.37%, Dataset 2: 84.18% |
| Class | Class Name | Dataset 1 | Dataset 2 |
|---|---|---|---|
| 0 | Noise | 1,072,781 | 754,099 |
| 1 | Sea surface | 101,809 | 70,848 |
| 2 | Water | 615,134 | 450,087 |
| 3 | Vegetation | 53,863 | 4,658 |
| 4 | Seabed | 70,113 | 48,708 |
| Class_Label | Computed Weights (Dataset 1) | Computed Weights (Dataset 2) |
|---|---|---|
| 0 | 0.3566 | 0.3522 |
| 1 | 3.7582 | 3.75 |
| 2 | 0.6225 | 0.5905 |
| 3 | 7.1032 | 56.7325 |
| 4 | 5.4594 | 5.4545 |
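
Class weights of this kind are commonly obtained with the "balanced" heuristic, weight = total samples / (number of classes × class count). The sketch below applies that formula to the full Dataset 1 class counts. Whether the authors used exactly this formula, and on which subset of the data, is an assumption; the values it produces are close to, but not identical with, the weights in the table, which would be consistent with the published weights having been computed on a training split.

```python
def balanced_class_weights(counts):
    """Inverse-frequency weights: total / (n_classes * count) per class.

    `counts` maps a class label to its number of samples. This is the
    usual balanced heuristic, assumed here rather than confirmed.
    """
    total = sum(counts.values())
    n_classes = len(counts)
    return {label: total / (n_classes * n) for label, n in counts.items()}


# Class counts of Dataset 1 as reported in the class distribution table.
DATASET_1_COUNTS = {0: 1_072_781, 1: 101_809, 2: 615_134,
                    3: 53_863, 4: 70_113}

if __name__ == "__main__":
    for label, weight in balanced_class_weights(DATASET_1_COUNTS).items():
        print(label, round(weight, 4))
```

Rare classes such as vegetation receive the largest weights, so their samples contribute more to the loss during training.
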
| Feature | LSTM Model | BiLSTM Model |
|---|---|---|
| Model Type | Sequential LSTM | Bidirectional LSTM (BiLSTM) |
| Input Shape | (300, 1) | (300, 1) |
| First LSTM Layer | LSTM (128 units, return_sequences=True) | Bidirectional LSTM (128 units, return_sequences=True) |
| First Dropout Layer | Dropout (0.5) | Dropout (0.5) |
| Second LSTM Layer | LSTM (64 units, return_sequences=True) | Bidirectional LSTM (64 units, return_sequences=True) |
| Second Dropout Layer | Dropout (0.5) | Dropout (0.5) |
| Additional Layer | TimeDistributed Dense (64 units, ReLU activation) | N/A |
| Third Dropout Layer | Dropout (0.5) | N/A |
| Output Layer | TimeDistributed Dense (5 units, softmax) | TimeDistributed Dense (5 units, softmax) |
| Optimizer | Adam | Adam |
| Learning Rate | 1.00E-03 | 1.00E-04 |
| Loss Function | Categorical cross-entropy | Categorical cross-entropy |
| Metrics | Accuracy | Accuracy |
| Batch Size | 32 | 16 |
| Epochs | 100 | 100 |
| Class Imbalance Handling | N/A | Sample weights used based on computed class weights |
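
To give a sense of the relative size of the two architectures, the sketch below counts trainable parameters from the layer sizes in the table. It assumes standard LSTM cells with bias terms (four gates per cell), bidirectional layers that concatenate the forward and backward outputs, and dropout layers without parameters. These are common framework defaults, not details confirmed by the source, and the helper names are hypothetical.

```python
def lstm_params(input_dim, units):
    """Parameters of one LSTM layer: 4 gates, each with input weights,
    recurrent weights, and a bias vector."""
    return 4 * (units * (input_dim + units) + units)


def dense_params(input_dim, units):
    """Parameters of a dense layer (weights plus bias)."""
    return input_dim * units + units


def lstm_model_params():
    # LSTM(128) -> LSTM(64) -> Dense(64, ReLU) -> Dense(5, softmax)
    return (lstm_params(1, 128) + lstm_params(128, 64)
            + dense_params(64, 64) + dense_params(64, 5))


def bilstm_model_params():
    # Each bidirectional layer holds two LSTMs; concatenation doubles
    # the output width (128 -> 256, 64 -> 128). Assumed merge mode.
    first = 2 * lstm_params(1, 128)
    second = 2 * lstm_params(256, 64)
    return first + second + dense_params(128, 5)


if __name__ == "__main__":
    print("LSTM:", lstm_model_params())
    print("BiLSTM:", bilstm_model_params())
```

Under these assumptions the BiLSTM has roughly 2.5 times as many trainable parameters as the LSTM, despite having no additional dense layer.
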
| Averaging | Metric | Dataset 1: LSTM | Dataset 1: BiLSTM | Dataset 2: LSTM | Dataset 2: BiLSTM |
|---|---|---|---|---|---|
| | Accuracy | 95.22% | 94.37% | 94.85% | 84.18% |
| Macro Average | Precision | 0.8787 | 0.8039 | 0.7011 | 0.6112 |
| | Recall | 0.8594 | 0.9486 | 0.6885 | 0.8359 |
| | F1-Score | 0.8641 | 0.8616 | 0.6945 | 0.635 |
| Weighted Average | Precision | 0.9533 | 0.9587 | 0.9446 | 0.9482 |
| | Recall | 0.9522 | 0.9437 | 0.9485 | 0.8418 |
| | F1-Score | 0.9518 | 0.9478 | 0.9464 | 0.8787 |
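
The gap between the macro and weighted averages in the table follows from how the two are computed: the macro average treats every class equally, whereas the weighted average scales each class score by its number of samples. The sketch below shows both calculations. The per-class scores in the demo are hypothetical and are not taken from the paper; only the function illustrates the arithmetic.

```python
def macro_and_weighted(per_class_scores, supports):
    """Return (macro average, support-weighted average) of class scores.

    `per_class_scores` and `supports` are parallel lists, one entry per
    class; `supports` holds the number of samples of each class.
    """
    macro = sum(per_class_scores) / len(per_class_scores)
    total = sum(supports)
    weighted = sum(score * n
                   for score, n in zip(per_class_scores, supports)) / total
    return macro, weighted


if __name__ == "__main__":
    # Hypothetical scores: four well-classified classes and one rare,
    # poorly classified class with very few samples.
    scores = [0.98, 0.90, 0.95, 0.20, 0.85]
    supports = [7500, 700, 4500, 50, 500]
    macro, weighted = macro_and_weighted(scores, supports)
    print(round(macro, 4), round(weighted, 4))
```

A single rare class with a low score pulls the macro average down sharply while barely affecting the weighted average, which is the pattern one would expect when a class such as vegetation has very few samples.
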
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).