Submitted:
29 March 2023
Posted:
30 March 2023
Abstract
Keywords:
1. Introduction
- A novel solution for indoor localization utilizing the Transformer network for RSS-based WiFi fingerprinting. The proposed solution is purpose-built for a complex hierarchical environment (multi-building, floor, and room) and intended to deliver consistent and accurate results.
- Modifications to the Transformer network that introduce learnable positional embeddings, which improve the network's accuracy and provide insight into the positions of the wireless access points (WAPs) within multi-building and multi-floor environments.
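As a rough illustration of the second contribution, the sketch below shows what a learnable positional embedding amounts to in the forward pass: a trainable table with one row per sequence position that is added to the projected input. It is a minimal NumPy sketch and not the HyTra implementation; the names `W_in`, `pos_emb`, and `embed`, the random initialisation, and the choice of one feature per WAP are assumptions, while the sequence length of 520 and the model dimension of 256 are taken from the tables in this document.

```python
import numpy as np

SEQ_LEN = 520      # one position per WAP (UJI), from the dataset table
N_FEATURES = 1     # assumption: a single RSS reading per WAP
D_MODEL = 256      # model dimension, from the hyperparameter table

rng = np.random.default_rng(0)

# Both arrays would be trainable parameters in a real model; here they are
# only randomly initialised to show the shapes involved.
W_in = rng.normal(0.0, 0.02, size=(N_FEATURES, D_MODEL))   # input projection
pos_emb = rng.normal(0.0, 0.02, size=(SEQ_LEN, D_MODEL))   # one row per position


def embed(x):
    """Map x of shape (batch, SEQ_LEN, N_FEATURES) to (batch, SEQ_LEN, D_MODEL)."""
    # The position table is broadcast over the batch dimension, so every
    # sample receives the same position-specific offset.
    return x @ W_in + pos_emb


x = rng.uniform(-100.0, -30.0, size=(4, SEQ_LEN, N_FEATURES))  # placeholder RSS in dBm
tokens = embed(x)
```

Because each row of `pos_emb` is tied to one WAP index, inspecting the learned rows after training is one plausible way to obtain the kind of insight into WAP positions that the contribution describes.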
2. Related Work
- Time Approach: Time of Flight (ToF), Time Difference of Arrival (TDoA), and Return Time of Flight (RToF) use, respectively, the time the signal takes to propagate to the receiver, the time intervals between signal receptions, and the round-trip propagation time of the signal. These methods, although accurate, are sensitive to clock synchronization, sampling rate, and signal bandwidth; ToF also requires line-of-sight for accurate performance [1,16].
- Geometric Approach: Angle of Arrival (AoA) and Phase of Arrival (PoA) rely on angle and phase estimation using antenna arrays to calculate the difference in arrival time and the distance between transmitter and receiver, respectively. These methods can deliver high accuracy but degrade under non-line-of-sight conditions and multipath fading, and they require complex hardware and algorithms to compensate for these effects [1,16].
- Fingerprinting: captures an array of received signal strength (RSS) or channel state information (CSI) measurements at every reference point to build a collection of signals, which is then compared with real-time measurements to pinpoint the user's location. This comparison is effective because each location has a distinctive fingerprint arising from the complex signal propagation within the indoor environment. CSI offers better accuracy but is prone to noise and multipath fading, whereas RSS is more easily obtained and more cost-effective than CSI, which requires an off-the-shelf network interface card (NIC) [1,14,16].
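The fingerprinting idea in the last item can be illustrated with a nearest-neighbour lookup: the stored fingerprint closest to the live measurement determines the estimated location. This is a minimal baseline sketch and not the HyTra model; the radio map, the room names, and the function names are hypothetical.

```python
import math

# Hypothetical radio map: reference point -> RSS vector (dBm) over three WAPs.
RADIO_MAP = {
    "room_a": [-45.0, -70.0, -90.0],
    "room_b": [-72.0, -48.0, -85.0],
    "room_c": [-88.0, -75.0, -50.0],
}


def euclidean(a, b):
    """Euclidean distance between two RSS vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def locate(measurement, radio_map=RADIO_MAP):
    """Return the reference point whose stored fingerprint is closest."""
    return min(radio_map, key=lambda rp: euclidean(radio_map[rp], measurement))


estimate = locate([-47.0, -69.0, -92.0])
```

A learned classifier replaces the distance lookup with a model trained on the same kind of radio map, which is the direction the fingerprinting literature cited above has taken.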
3. Methodology
3.1. Data and Preprocessing

3.3. HyTra Model
3.3.1. Model Input

3.3.2. Input Transformation
3.3.3. Position Embedding
3.3.4. Transformer Encoder
3.3.5. Classification Heads
3.4. Training Process
| Algorithm 1. Pseudocode of the proposed HyTra model |
|---|
| Input: , Labels: |
| Output: Predicted Location of User |
| 1. if () are the given labels then |
| 2. Compute unique floor labels () |
| 3. end for in |
| 4. if then |
| 5. Replace with |
| 6. end |
| 7. if Apply then |
| 8. end |
| 9. |
| 10. if then |
| 11. Train HyTra classification model |
| 12. Finetune HyTra classification model |
| 13. return |
| 14. end |
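Algorithm 1 contains an optional "Apply" step whose argument did not survive in the text. Since the hyperparameter table lists 128 PCA components, the sketch below assumes that this step is a PCA projection of the RSS vectors. It is a minimal NumPy version via the singular value decomposition; the function names and the random placeholder data are assumptions, not the authors' code.

```python
import numpy as np


def fit_pca(X, n_components):
    """Return the column means and the top principal directions of X."""
    mean = X.mean(axis=0)
    # Rows of vt are orthonormal principal directions, ordered by variance.
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:n_components]


def apply_pca(X, mean, components):
    """Project X onto the fitted principal directions."""
    return (X - mean) @ components.T


rng = np.random.default_rng(0)
# Placeholder training matrix: 300 samples, 520 WAPs, RSS in dBm.
train = rng.uniform(-100.0, -30.0, size=(300, 520))

mean, comps = fit_pca(train, n_components=128)
scores = apply_pca(train, mean, comps)   # shape (300, 128)
```

The mean and components would be fitted on the training split only and then reused unchanged for validation and test data, so that no information leaks from the held-out splits.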
3.5. System Overview
4. Results & Discussion
4.1. HyTra Comparison on Public and Private Datasets
| Datasets | Training Samples | Building Accuracy | Floor Accuracy | Room Accuracy |
|---|---|---|---|---|
| SPOT (private) | 165 K | 100% | 99.4% | 97.9% |
| UJI (public) | 17.9 K | 100% | 93.7% | - |
| UTP (private) | 0.832 K | 100% | 97.1% | 78.6% |
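The table above reports one accuracy per level of the hierarchy. A minimal sketch of how such per-level accuracies could be computed is given below; it scores each level independently, which is an assumption, since the text does not state whether floor or room accuracy is conditioned on the coarser levels being correct. The function name and the sample labels are hypothetical.

```python
def level_accuracy(y_true, y_pred):
    """Per-level accuracy for lists of (building, floor, room) tuples."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equally long")
    names = ("building", "floor", "room")
    n = len(y_true)
    return {
        name: sum(t[i] == p[i] for t, p in zip(y_true, y_pred)) / n
        for i, name in enumerate(names)
    }


# Hypothetical labels and predictions for four samples.
y_true = [(0, 1, "a"), (0, 2, "b"), (1, 0, "c"), (2, 4, "d")]
y_pred = [(0, 1, "a"), (0, 2, "x"), (1, 1, "c"), (2, 4, "d")]

acc = level_accuracy(y_true, y_pred)
```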
4.2. Comparison of Classification Results
5. Conclusion
Author Contributions
Funding
References
1. Mendoza-Silva, G.M.; Torres-Sospedra, J.; Huerta, J. A Meta-Review of Indoor Positioning Systems. Sensors 2019, 19, 4507.
2. Location-Based Services (LBS) Market Size and Forecast - 2030. Available online: https://www.alliedmarketresearch.com/location-based-services-market (accessed on 9 March 2023).
3. Sithole, G.; Zlatanova, S. Position, Location, Place and Area: An Indoor Perspective. ISPRS Annals of the Photogrammetry, Remote Sensing and Spatial Information Sciences 2016, III–4, 89–96.
4. Shang, S.; Wang, L. Overview of WiFi Fingerprinting-Based Indoor Positioning. IET Communications 2022, 16, 725–733.
5. Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A.N.; Kaiser, L.; Polosukhin, I. Attention Is All You Need. Adv Neural Inf Process Syst 2017, 2017-December, 5999–6009.
6. Cho, K.; Bahdanau, D.; Bougares, F.; Schwenk, H.; Bengio, Y. Learning Phrase Representations Using RNN Encoder–Decoder for Statistical Machine Translation. In Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP); 2014; pp. 1724–1734.
7. Hochreiter, S.; Schmidhuber, J. Long Short-Term Memory. Neural Comput 1997, 9, 1735–1780.
8. Brown, T.B.; Mann, B.; Ryder, N.; Subbiah, M.; Kaplan, J.; Dhariwal, P.; Neelakantan, A.; Shyam, P.; Sastry, G.; Askell, A.; et al. Language Models Are Few-Shot Learners. Adv Neural Inf Process Syst 2020, 2020-December.
9. Touvron, H.; Lavril, T.; Izacard, G.; Martinet, X.; Lachaux, M.-A.; Lacroix, T.; Rozière, B.; Goyal, N.; Hambro, E.; Azhar, F.; et al. LLaMA: Open and Efficient Foundation Language Models. 2023.
10. Devlin, J.; Chang, M.-W.; Lee, K.; Toutanova, K. BERT: Pre-Training of Deep Bidirectional Transformers for Language Understanding. NAACL-HLT 2019 2018.
11. Dosovitskiy, A.; Beyer, L.; Kolesnikov, A.; Weissenborn, D.; Zhai, X.; Unterthiner, T.; Dehghani, M.; Minderer, M.; Heigold, G.; Gelly, S.; et al. An Image Is Worth 16x16 Words: Transformers for Image Recognition at Scale. 2020.
12. Caron, M.; Touvron, H.; Misra, I.; Jegou, H.; Mairal, J.; Bojanowski, P.; Joulin, A. Emerging Properties in Self-Supervised Vision Transformers. Proceedings of the IEEE International Conference on Computer Vision 2021, 9630–9640.
13. Zhang, Z.; Du, H.; Choi, S.; Cho, S.H. TIPS: Transformer Based Indoor Positioning System Using Both CSI and DoA of WiFi Signal. IEEE Access 2022, 10, 111363–111376.
14. Liu, W.; Cheng, Q.; Deng, Z.; Chen, H.; Fu, X.; Zheng, X.; Zheng, S.; Chen, C.; Wang, S. Survey on CSI-Based Indoor Positioning Systems and Recent Advances. 2019 International Conference on Indoor Positioning and Indoor Navigation, IPIN 2019 2019.
15. Torres-Sospedra, J.; Montoliu, R.; Martinez-Uso, A.; Avariento, J.P.; Arnau, T.J.; Benedito-Bordonau, M.; Huerta, J. UJIIndoorLoc: A New Multi-Building and Multi-Floor Database for WLAN Fingerprint-Based Indoor Localization Problems. IPIN 2014 - 2014 International Conference on Indoor Positioning and Indoor Navigation 2014, 261–270.
16. Zafari, F.; Gkelias, A.; Leung, K.K. A Survey of Indoor Localization Systems and Technologies. IEEE Communications Surveys and Tutorials 2019, 21, 2568–2599.
17. Singh, N.; Choe, S.; Punmiya, R. Machine Learning Based Indoor Localization Using Wi-Fi RSSI Fingerprints: An Overview. IEEE Access 2021, 9, 127150–127174.
18. Feng, X.; Nguyen, K.A.; Luo, Z. A Survey of Deep Learning Approaches for WiFi-Based Indoor Positioning. Journal of Information and Telecommunication 2022, 6, 163–216.
19. Rezgui, Y.; Pei, L.; Chen, X.; Wen, F.; Han, C. An Efficient Normalized Rank Based SVM for Room Level Indoor WiFi Localization with Diverse Devices. Mobile Information Systems 2017, 2017.
20. Singh, N.; Choe, S.; Punmiya, R.; Kaur, N. XGBLoc: XGBoost-Based Indoor Localization in Multi-Building Multi-Floor Environments. Sensors 2022, 22.
21. Battiti, R.; Le, N.T.X.; Villani, A. Location-Aware Computing: A Neural Network Model for Determining Location in Wireless LANs. 2002.
22. Liu, Z.; Dai, B.; Wan, X.; Li, X. Hybrid Wireless Fingerprint Indoor Localization Method Based on a Convolutional Neural Network. Sensors 2019, 19, 4597.
23. Song, X.; Fan, X.; He, X.; Xiang, C.; Ye, Q.; Huang, X.; Fang, G.; Chen, L.L.; Qin, J.; Wang, Z. CNNLoc: Deep-Learning Based Indoor Localization with WiFi Fingerprinting. Proceedings - 2019 IEEE SmartWorld, Ubiquitous Intelligence and Computing, Advanced and Trusted Computing, Scalable Computing and Communications, Internet of People and Smart City Innovation, SmartWorld/UIC/ATC/SCALCOM/IOP/SCI 2019 2019, 589–595.
24. Lukito, Y.; Chrismanto, A.R. Recurrent Neural Networks Model for WiFi-Based Indoor Positioning System. 2017 International Conference on Smart Cities, Automation & Intelligent Computing Systems (ICON-SONICS) 2017, 121–125.
25. Hoang, M.T.; Yuen, B.; Dong, X.; Lu, T.; Westendorp, R.; Reddy, K. Recurrent Neural Networks for Accurate RSSI Indoor Localization. IEEE Internet Things J 2019, 6, 10639–10651.
26. Chen, Z.; Zou, H.; Yang, J.F.; Jiang, H.; Xie, L. WiFi Fingerprinting Indoor Localization Using Local Feature-Based Deep LSTM. IEEE Syst J 2020, 14, 3001–3010.
27. What Is Deep Learning? | IBM. Available online: https://www.ibm.com/topics/deep-learning (accessed on 22 February 2023).
28. Vincent, P.; Larochelle, H.; Lajoie, I.; Bengio, Y.; Manzagol, P.-A. Stacked Denoising Autoencoders: Learning Useful Representations in a Deep Network with a Local Denoising Criterion. Journal of Machine Learning Research 2010, 11, 3371–3408.
29. Nowicki, M.; Wietrzykowski, J. Low-Effort Place Recognition with WiFi Fingerprints Using Deep Learning. 2017.
30. Kim, K.S.; Lee, S.; Huang, K. A Scalable Deep Neural Network Architecture for Multi-Building and Multi-Floor Indoor Localization Based on Wi-Fi Fingerprinting. Big Data Anal 2018, 3.
31. Qin, F.; Zuo, T.; Wang, X. CCpos: WiFi Fingerprint Indoor Positioning System Based on CDAE-CNN. Sensors (Switzerland) 2021, 21, 1–17.
32. Montoliu, R.; Sansano, E.; Torres-Sospedra, J.; Belmonte, O. IndoorLoc Platform: A Public Repository for Comparing and Evaluating Indoor Positioning Systems. 2017 International Conference on Indoor Positioning and Indoor Navigation, IPIN 2017 2017, 2017-January, 1–8.
33. Laska, M.; Blankenbach, J. DeepLocBox: Reliable Fingerprinting-Based Indoor Area Localization. Sensors 2021, 21, 1–23.
34. Laska, M.; Blankenbach, J. Multi-Task Neural Network for Position Estimation in Large-Scale Indoor Environments. IEEE Access 2022, 10, 26024–26032.
35. Elesawi, A.E.A.; Kim, K.S. Hierarchical Multi-Building and Multi-Floor Indoor Localization Based on Recurrent Neural Networks. CoRR 2021, abs/2112.12478.
36. Wang, Y.-A.; Chen, Y.-N. What Do Position Embeddings Learn? An Empirical Study of Pre-Trained Language Model Positional Encoding. 2020.
37. Loshchilov, I.; Hutter, F. Decoupled Weight Decay Regularization. 7th International Conference on Learning Representations, ICLR 2019 2017.
38. Google Colab. Available online: https://research.google.com/colaboratory/faq.html (accessed on 9 March 2023).





| Input/Label | Fingerprinting Data Description |
|---|---|
| Input | RSSI values from 520 WAPs in dBm. |
| Classification Label | Building ID, 3 unique buildings. |
| Classification Label | Floor ID, 13 unique floors. |
| Regression Label | Longitudinal value in meters. |
| Regression Label | Latitudinal value in meters. |

| Building | 0 | 0 | 0 | 0 | 1 | 1 | 1 | 1 | 2 | 2 | 2 | 2 | 2 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Floor | 0 | 1 | 2 | 3 | 0 | 1 | 2 | 3 | 0 | 1 | 2 | 3 | 4 |
| Unique Floor | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 |

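The mapping in the table above, from a (building, floor) pair to a single unique floor label, can be reproduced with cumulative offsets. The sketch below is one straightforward way to do it, under the assumption that floors are numbered from 0 within each building as the table shows; the function and constant names are illustrative and not taken from the paper.

```python
# Floors per building as laid out in the table above (UJI): 4, 4 and 5.
FLOORS_PER_BUILDING = [4, 4, 5]


def build_offsets(floors_per_building):
    """Cumulative number of floors in all preceding buildings."""
    offsets, total = [], 0
    for n_floors in floors_per_building:
        offsets.append(total)
        total += n_floors
    return offsets


OFFSETS = build_offsets(FLOORS_PER_BUILDING)   # [0, 4, 8]


def to_unique_floor(building, floor):
    """Map a (building, floor) pair to one label in 0..12."""
    if not 0 <= floor < FLOORS_PER_BUILDING[building]:
        raise ValueError(f"building {building} has no floor {floor}")
    return OFFSETS[building] + floor


def from_unique_floor(unique_floor):
    """Inverse mapping: recover (building, floor) from a unique label."""
    if not 0 <= unique_floor < sum(FLOORS_PER_BUILDING):
        raise ValueError(f"unknown unique floor {unique_floor}")
    for building in reversed(range(len(OFFSETS))):
        if unique_floor >= OFFSETS[building]:
            return building, unique_floor - OFFSETS[building]
```

With a single unique floor label, the floor head becomes a 13-way classifier, and the building can be recovered from the predicted label through the inverse mapping.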
| Input Method | Sequence Length | Features | Objective |
|---|---|---|---|
| Input Method 1 | The total number of WAPs (520 for UJI) or principal components after applying PCA. | The number of RSS readings per WAP or scores per principal component. | To let the attention mechanism weigh inputs based on the similarity of their features and let the learnable position embedding learn the relative position of each WAP. This method is preferred for larger datasets, as it enables the network to compose a good representation of each WAP. |
| Input Method 2 | The number of samples. A single sample can be used (sequence length of 1), but we recommend batching multiple samples chronologically to benefit from the attention mechanism. | RSS values from each WAP (520 for UJI) or scores of principal components after applying PCA. | Representing all of the WAPs or principal components as features is preferred for smaller datasets; this circumvents the need to learn an accurate input representation, which is difficult on smaller datasets (like UJI, by deep learning standards). |

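The difference between the two input methods comes down to which axis of the RSS matrix is treated as the sequence. The NumPy sketch below shows the resulting tensor shapes under the assumption of a (batch, sequence, features) layout; the number of samples per sequence and the random placeholder data are hypothetical.

```python
import numpy as np

N_WAPS = 520     # number of WAPs in UJI, from the tables in this section
N_SAMPLES = 8    # hypothetical number of chronological samples per sequence

rng = np.random.default_rng(0)
# Placeholder RSS matrix in dBm: one row per sample, one column per WAP.
rss = rng.uniform(-100.0, -30.0, size=(N_SAMPLES, N_WAPS))

# Input Method 1: every WAP is a sequence element and its features are
# the RSS readings collected for that WAP.
method1 = rss.T[np.newaxis, ...]   # (batch=1, seq_len=520, features=8)

# Input Method 2: every sample is a sequence element and its features are
# the RSS values of all WAPs.
method2 = rss[np.newaxis, ...]     # (batch=1, seq_len=8, features=520)
```

Under Method 1 the position index identifies a WAP, so a position embedding can specialise per WAP; under Method 2 the position index identifies a time step within the batch of chronological samples.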
| Parameter | Value | Description |
|---|---|---|
| Learning Rate | 0.0003 | Step size of parameter update. |
| Scheduler | Cosine with Warmup | Adjusts learning rate during training. |
| Encoder Layers | 2 | Number of stacked identical encoder layers. |
| Attention Heads | 8 | Number of query, key and value pairs used for MHA. |
| Model Dimensions | 256 | Dimensionality of the transformed input features. |
| Dropout Rate | 0.2 | Fraction of inputs dropped during training. |
| Batch Size | 128 | Number of samples per iteration. |
| Optimizer | AdamW | Adam-based stochastic optimizer with decoupled weight decay, used to update model parameters. |
| Loss Function | Cross-Entropy | Method to evaluate classification performance. |
| PCA Components | 128 | The number of components used to represent directions of maximum variance in the data. |
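
The "Cosine with Warmup" scheduler in the table above can be sketched as a function of the training step. The version below uses a linear warmup followed by a single half-cosine decay to zero, which is one common definition; the warmup length and the total number of steps are assumptions, as the table gives only the peak learning rate of 0.0003.

```python
import math

BASE_LR = 0.0003   # peak learning rate, from the hyperparameter table


def cosine_with_warmup(step, total_steps, warmup_steps, base_lr=BASE_LR):
    """Learning rate at a given step: linear warmup, then cosine decay to 0."""
    if step < warmup_steps:
        # Linear ramp from 0 up to base_lr over the warmup period.
        return base_lr * step / max(1, warmup_steps)
    # Fraction of the decay phase completed, clipped to [0, 1].
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    progress = min(1.0, progress)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


# Hypothetical run: 1000 optimizer steps with 100 warmup steps.
schedule = [cosine_with_warmup(s, total_steps=1000, warmup_steps=100) for s in range(1001)]
```

The warmup keeps early updates small while the attention weights and embeddings are still close to their random initialisation, and the cosine phase then lowers the rate smoothly toward the end of training.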

| Model | Dataset Split | Building Accuracy | Floor Accuracy |
|---|---|---|---|
| HyTra (1,1) | Training | 100% | 100% |
| HyTra (1,1) | Validation | 99.85% | 99.80% |
| HyTra (1,1) | Testing | 100% | 94.33% |
| HyTra (1,2) | Training | 99.22% | 99.22% |
| HyTra (1,2) | Validation | 99.85% | 99.10% |
| HyTra (1,2) | Testing | 99.91% | 88.12% |
| HyTra (2,3) | Training | 100% | 99.24% |
| HyTra (2,3) | Validation | 99.35% | 98.34% |
| HyTra (2,3) | Testing | 100% | 96.47% |

| Deep Learning Methods | Building Accuracy | Floor Accuracy |
|---|---|---|
| DNN [29] | - | 91.10% |
| Scalable DNN [30] | 99.5% | 91.26% |
| Hierarchical RNN [35] | 100% | 95.23% |
| 2D-CNN (m-CEL) [34] | - | 95.30% |
| CNNLoc [23] | 100% | 96.03% |
| HyTra (1,1) | 100% | 94.33% |
| HyTra (1,2) | 99.91% | 88.12% |
| HyTra (2,3) | 100% | 96.47% |

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).