Preprint / Article / Version 1 / Preserved in Portico / This version is not peer-reviewed

Towards Improving YARN performance for Frugal Heterogeneous SBC-based Edge Clusters

Version 1 : Received: 30 March 2024 / Approved: 1 April 2024 / Online: 2 April 2024 (09:14:06 CEST)

How to cite: Qureshi, B. Towards Improving YARN performance for Frugal Heterogeneous SBC-based Edge Clusters. Preprints 2024, 2024040154. https://doi.org/10.20944/preprints202404.0154.v1

Abstract

In the dynamic landscape of sustainable computing, the use of edge devices is paramount for reducing the need for large-scale centralized data centers. By processing data locally, edge devices minimize energy-intensive computing in data centers, improving overall performance and cost-effectiveness while reducing environmental impact. Edge devices may form edge clusters composed of resource-frugal Single Board Computers (SBCs) such as the Raspberry Pi. The small form factor and energy efficiency of these computers make them ideal for processing large data at the edge. Despite the potential of such clusters, traditional Hadoop configurations struggle to optimize performance in heterogeneous SBC clusters due to disparities in computing resources. Consequently, we propose modifications to the Yet Another Resource Negotiator (YARN) scheduling mechanism to address these challenges. Our proposed changes include the introduction of a Frugality Index and an adaptiveConfig policy. The Frugality Index categorizes SBC nodes based on their capabilities, enabling intelligent resource allocation. The adaptiveConfig policy dynamically adjusts resource allocation in response to workload and cluster conditions, enhancing system efficiency. Additionally, we introduce a fetch_threshold for reduce tasks to improve task prioritization based on locality and data processing efficiency. We evaluate our approach on a 13-node SBC cluster, running experiments with CPU-intensive and IO-intensive Hadoop benchmarks. The results demonstrate significant performance improvements over native YARN settings, with execution times 4.7 times faster than the worst_native scenario and 1.9 times faster than the best_native scenario. Furthermore, the proposed adaptiveConfig policy, which implements the Frugality Index and a fetch_threshold, outperforms native YARN by 5.86 times and 1.79 times in Terasort and wordcount executions, respectively.
Our findings underscore the effectiveness of our approach in managing the heterogeneous nature of SBC clusters and optimizing performance across various hardware configurations. The adaptive policies prove well-suited to the frugal SBC-cluster context, yielding enhanced outcomes and paving the way for sustainable big data processing initiatives.

Keywords

Single Board Computers; Frugal edge computing; Hadoop; YARN; heterogeneous

Subject

Computer Science and Mathematics, Computer Science
