Submitted: 30 March 2024
Posted: 02 April 2024
Abstract
Keywords:
1. Introduction
2. Background
2.1. Single Board Computers
2.2. Apache Hadoop YARN
3. Proposed Scheduling Mechanism for Frugal SBC-Based Clusters
3.1. Motivation and Limitations
3.2. Frugality Index
3.3. Heartbeat Messages
3.4. Adaptive Fair Scheduling Scheme
3.5. Tasks Locality and Prioritization
4. Performance Evaluation and Results
4.1. Experimental Setup

4.2. Task Distribution in Native YARN vs the Proposed Approach
4.3. Effect of adaptiveConfig Scheduling Policy on Task Distribution
4.4. Effect of fetch_threshold Values


5. Discussion and Future Directions
6. Conclusions
Acknowledgments
References

| 2 | Pine64 RockPro64: https://pine64.com/product/rockpro64-4gb-single-board-computer/ |

| | Raspberry Pi 5 | Pine64 Rockpro64 | Raspberry Pi 3B+ | Odroid XU4 |
|---|---|---|---|---|
| Processor | 2.4 GHz quad-core 64-bit ARM Cortex-A76 | 1.8 GHz hexa-core Rockchip RK3399: ARM Cortex-A72 and 1.4 GHz quad-core Cortex-A53 | 1.4 GHz 64-bit quad-core ARM Cortex-A53 | Exynos5 Octa: ARM Cortex-A15 quad-core 2 GHz and Cortex-A7 quad-core 1.3 GHz |
| Memory | 8 GB LPDDR4X-SDRAM | 4 GB LPDDR4-SDRAM | 1 GB LPDDR3-SDRAM | 2 GB DDR3 |
| Ethernet | Gigabit Ethernet | Gigabit Ethernet | 300 Mbit/s | Gigabit Ethernet |
| GPU | VideoCore VII 800 MHz | Mali-T860 700 MHz | VideoCore IV 400 MHz | Mali-T628 MP6 600 MHz |
| A/V | HDMI | HDMI | HDMI 1.3 | HDMI |
| Price (USD) | 80 | 79.99 | 35 | 53 |
| Release | 2023 | 2018 | 2018 | 2016 |
| Power | 1.3 W idle; 8.6 W max | 3.1 W idle; 10.9 W max | 1.9 W idle; 5.1 W max | 2.1 W idle; 6.4 W max |

| mapred-site.xml | Value |
|---|---|
| yarn.app.mapreduce.am.resource.mb | 852 |
| mapreduce.map.cpu.vcores | 1 |
| mapreduce.reduce.cpu.vcores | 1 |
| mapreduce.map.memory.mb | 852 |
| mapreduce.reduce.memory.mb | 852 |

| yarn-site.xml | Value |
|---|---|
| yarn.nodemanager.resource.memory-mb | 1024 |
| yarn.nodemanager.resource.cpu-vcores | 1 |
| yarn.scheduler.maximum-allocation-mb | 852 |
| yarn.scheduler.maximum-allocation-vcores | 8 |
| yarn.nodemanager.vmem-pmem-ratio | 2.1 |

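
As an illustration of how the settings listed above might be written to disk, the following minimal sketch renders the two property sets in the `<configuration>`/`<property>` layout that Hadoop site files use. The helper names `render_site_xml` and `containers_per_node` are hypothetical and not part of Hadoop or of the paper. The container estimate is a rough memory-only calculation that ignores any setting not listed in the tables.

```python
from xml.sax.saxutils import escape

# Values copied from the configuration tables above.
MAPRED_SITE = {
    "yarn.app.mapreduce.am.resource.mb": "852",
    "mapreduce.map.cpu.vcores": "1",
    "mapreduce.reduce.cpu.vcores": "1",
    "mapreduce.map.memory.mb": "852",
    "mapreduce.reduce.memory.mb": "852",
}
YARN_SITE = {
    "yarn.nodemanager.resource.memory-mb": "1024",
    "yarn.nodemanager.resource.cpu-vcores": "1",
    "yarn.scheduler.maximum-allocation-mb": "852",
    "yarn.scheduler.maximum-allocation-vcores": "8",
    "yarn.nodemanager.vmem-pmem-ratio": "2.1",
}


def render_site_xml(properties):
    """Render a name -> value mapping as a Hadoop-style site file (hypothetical helper)."""
    lines = ["<configuration>"]
    for name, value in properties.items():
        lines.append("  <property>")
        lines.append(f"    <name>{escape(name)}</name>")
        lines.append(f"    <value>{escape(value)}</value>")
        lines.append("  </property>")
    lines.append("</configuration>")
    return "\n".join(lines)


def containers_per_node(node_memory_mb, container_memory_mb):
    """Memory-only estimate of how many containers fit on one NodeManager."""
    return node_memory_mb // container_memory_mb


if __name__ == "__main__":
    print(render_site_xml(MAPRED_SITE))
    print(render_site_xml(YARN_SITE))
    print(containers_per_node(1024, 852))
```

Under this memory-only view, a node offering 1024 MB can host a single 852 MB container at a time.
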
| FIndex | Device | CPU | Memory |
|---|---|---|---|
| 4 | Raspberry Pi 5 | 2.4 GHz | 8 GB |
| 3 | Raspberry Pi 4 | 1.5 GHz | 4 GB |
| 3 | Pine64 Rockpro64 | 1.8 GHz | 4 GB |
| 2 | Odroid XU4 | 2.0 GHz | 2 GB |
| 1 | Raspberry Pi 3B | 1.4 GHz | 1 GB |
| 1 | Raspberry Pi 2 | 900 MHz | 1 GB |
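
The FIndex table above can be held as a simple lookup. The sketch below is illustrative only: the hostnames are hypothetical, and ordering nodes by descending FIndex is an assumption made for the example, not the scheduling rule defined in Section 3.

```python
# FIndex values copied from the table above, keyed by device model.
FINDEX_BY_DEVICE = {
    "Raspberry Pi 5": 4,
    "Raspberry Pi 4": 3,
    "Pine64 Rockpro64": 3,
    "Odroid XU4": 2,
    "Raspberry Pi 3B": 1,
    "Raspberry Pi 2": 1,
}


def rank_nodes(nodes):
    """Return hostnames ordered by descending FIndex, ties broken by hostname.

    `nodes` maps a hostname (hypothetical) to a device model from the table.
    """
    return sorted(nodes, key=lambda host: (-FINDEX_BY_DEVICE[nodes[host]], host))


if __name__ == "__main__":
    cluster = {
        "node-a": "Raspberry Pi 3B",
        "node-b": "Raspberry Pi 5",
        "node-c": "Odroid XU4",
    }
    print(rank_nodes(cluster))
```
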

Terasort execution time (seconds)

| # of Reduces | Data size (GB) | Chunk size 64: Scenario1 | Chunk size 64: Scenario2 | Chunk size 64: Scenario3 | Chunk size 128: Scenario1 | Chunk size 128: Scenario2 | Chunk size 128: Scenario3 |
|---|---|---|---|---|---|---|---|
| 1 | 1 | 392.3 | 163.1 | 132.1 | 451.1 | 171.3 | 129.5 |
| 2 | 1 | 235.3 | 144.7 | 117.2 | 270.6 | 151.9 | 114.9 |
| 4 | 1 | 219.6 | 114.3 | 92.6 | 252.5 | 120.0 | 66.9 |
| 8 | 1 | 201.4 | 109.8 | 88.9 | 231.6 | 115.3 | 58.7 |
| 1 | 2 | 861.0 | 273.2 | 221.3 | 887.1 | 275.9 | 216.9 |
| 2 | 2 | 693.1 | 289.6 | 234.6 | 679.5 | 292.5 | 229.9 |
| 4 | 2 | 615.4 | 293.8 | 238.0 | 668.6 | 293.1 | 233.2 |
| 8 | 2 | 598.4 | 291.6 | 236.2 | 645.7 | 294.5 | 231.5 |
| 1 | 4 | 1989.1 | 351.4 | 305.7 | 2015.6 | 365.5 | 299.6 |
| 2 | 4 | 1673.3 | 298.3 | 259.5 | 1798.4 | 310.2 | 254.3 |
| 4 | 4 | 1498.1 | 274.5 | 238.8 | 1456.1 | 269.3 | 234.0 |
| 8 | 4 | 1613.7 | 319.6 | 278.1 | 1598.7 | 323.1 | 272.5 |
| 1 | 8 | 5193.9 | 1025.6 | 892.3 | 5341.9 | 1016.3 | 874.4 |
| 2 | 8 | 3819.2 | 916.5 | 797.4 | 3857.4 | 934.8 | 781.4 |
| 4 | 8 | 3189.1 | 856.1 | 744.8 | 3093.4 | 873.2 | 729.9 |
| 8 | 8 | 3091.8 | 813.5 | 707.7 | 3030.0 | 829.8 | 693.6 |
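
To make the Terasort figures easier to compare, the sketch below computes the ratio between the Scenario1 and Scenario3 execution times for two rows of the table (chunk size 64). The helper `speedup` and the choice of rows are illustrative assumptions. What each scenario denotes is defined in the experimental setup, not here.

```python
def speedup(baseline_seconds, candidate_seconds):
    """Ratio of baseline time to candidate time; values above 1 mean the candidate is faster."""
    return baseline_seconds / candidate_seconds


# (reducers, data size in GB, Scenario1, Scenario2, Scenario3), chunk size 64,
# copied from the Terasort table above.
ROWS_CHUNK_64 = [
    (1, 1, 392.3, 163.1, 132.1),
    (8, 8, 3091.8, 813.5, 707.7),
]

if __name__ == "__main__":
    for reducers, data_gb, s1, s2, s3 in ROWS_CHUNK_64:
        print(
            f"{reducers} reducer(s), {data_gb} GB: "
            f"Scenario3 vs Scenario1 = {speedup(s1, s3):.2f}x, "
            f"Scenario3 vs Scenario2 = {speedup(s2, s3):.2f}x"
        )
```
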

WordCount execution time (seconds)

| # of Reduces | Data size (GB) | Chunk size 64: Scenario1 | Chunk size 64: Scenario2 | Chunk size 64: Scenario3 | Chunk size 128: Scenario1 | Chunk size 128: Scenario2 | Chunk size 128: Scenario3 |
|---|---|---|---|---|---|---|---|
| 1 | 1 | 3089.1 | 1729.9 | 1401.2 | 3552.5 | 1816.4 | 1373.2 |
| 2 | 1 | 1891.6 | 1059.3 | 858.0 | 2175.3 | 1112.3 | 840.9 |
| 4 | 1 | 1651.4 | 924.8 | 749.1 | 1899.1 | 971.0 | 734.1 |
| 8 | 1 | * | 875.1 | 708.8 | * | 918.9 | 674.1 |
| 1 | 2 | 6103.7 | 3418.1 | 2768.6 | 7019.3 | 3452.3 | 2713.3 |
| 2 | 2 | 4714.3 | 2640.0 | 2138.4 | 5421.4 | 2666.4 | 2095.6 |
| 4 | 2 | 4309.1 | 2413.1 | 1954.6 | 4955.5 | 2437.2 | 1915.5 |
| 8 | 2 | * | 2289.7 | 1854.7 | * | 2312.6 | 1817.6 |
| 1 | 4 | 13173.8 | 7377.3 | 6418.3 | 15149.9 | 7672.4 | 6289.9 |
| 2 | 4 | 9513.4 | 5327.5 | 4634.9 | 10940.4 | 5540.6 | 4681.3 |
| 4 | 4 | 8963.1 | 5219.4 | 3953.0 | 10307.6 | 5428.2 | 3992.5 |
| 8 | 4 | * | 5069.1 | 3761.0 | * | 5271.9 | 3798.6 |
| 1 | 8 | * | * | 12915.3 | * | * | 13044.5 |
| 2 | 8 | * | * | 8194.0 | * | * | 8030.1 |
| 4 | 8 | * | * | 7149.0 | * | * | 7006.0 |
| 8 | 8 | * | * | 6328.0 | * | * | 6201.4 |

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).