Preprint | Review | Version 1 | Preserved in Portico | This version is not peer-reviewed

Darknet on OpenCL: A Multi-platform Tool for Object Detection and Classification

Version 1 : Received: 21 July 2020 / Approved: 22 July 2020 / Online: 22 July 2020 (09:39:51 CEST)

How to cite: Sowa, P.; Izydorczyk, J. Darknet on OpenCL: A Multi-platform Tool for Object Detection and Classification. Preprints 2020, 2020070506 (doi: 10.20944/preprints202007.0506.v1).

Abstract

This article surveys the challenges and problems encountered on the way from state-of-the-art CUDA-accelerated neural network code to multi-GPU code. To this end, the authors describe the journey of porting the fully featured, CUDA-accelerated Darknet engine available on GitHub to OpenCL. The article presents the lessons learned and the techniques that made this port possible. Few other implementations on GitHub leverage the OpenCL standard, and a few of them have also attempted to port Darknet. Darknet is a well-known convolutional neural network (CNN) framework. The authors investigated all aspects of the porting and achieved a fully featured Darknet engine on OpenCL. The effort covered not only classification with the YOLO1, YOLO2, and YOLO3 CNN models but also other aspects, such as training neural networks and benchmarks that look for weak points in the implementation. The GPU computing code substantially improves Darknet computing time compared to the standard CPU version by using underused hardware in existing systems. If the system is OpenCL-based, then it is practically hardware independent. The authors compare computation and training performance against the existing CUDA-based Darknet engine on various computers, including single-board computers, and across different CNN use cases. They found that the OpenCL version can perform as fast as the CUDA version in computation, but it is slower in memory transfer between RAM (CPU memory) and VRAM (GPU memory). This difference depends only on the quality of the OpenCL implementation. Moreover, by loosening hardware requirements, the OpenCL Darknet can broaden the use of deep neural networks (DNNs), especially in energy-sensitive applications of Artificial Intelligence (AI) and Machine Learning (ML).

Subject Areas

neural network; object detection; object classification; Darknet; programming.
