ARTICLE | doi:10.20944/preprints201909.0062.v1
Online: 5 September 2019 (12:19:48 CEST)
This paper presents a simple yet effective solution for the transcription of printed texts. Our tool consists of a web-based user interface that provides an easy-to-use, ergonomic workflow and a collaborative environment for philologists while allowing them to profit from machine learning OCR technology. As the targeted use case is not mass digitisation but the creation of accurate, citable digital editions, the user interface for ground truth production and post-correction is built to provide the means for rapid proofreading while minimising the number of errors. The productivity of the setup is further improved by enabling progressive OCR training and recognition in the background to constantly increase the accuracy of the predictions. The advantages of the application are showcased in the second part of the paper by documenting our experiences utilising it for digitising Arabic and Latin texts. Over the course of several months, the tool has been used to create transcriptions of a wide range of sources, among them challenging early modern editions and Arabic scripts, producing a large amount of reusable OCR training data as a positive side effect. Finally, we discuss possible future extensions of the tool and how it could be adapted to fit the needs of other digitisation projects.
ARTICLE | doi:10.20944/preprints202107.0200.v1
Subject: Engineering, Electrical & Electronic Engineering Keywords: image quality assessment; image quality metrics; NR-IQAs; D-IQA; OCR accuracy; OCR prediction; OCR improvements; visual aids; visually impaired; reading aids; document images; text-based images
Online: 8 July 2021 (13:21:49 CEST)
For visually impaired people (VIPs), the ability to convert text to sound can mean a new level of independence or the simple joy of a good book. With significant advances in Optical Character Recognition (OCR) in recent years, a number of reading aids are appearing on the market. These reading aids convert images captured by a camera to text, which can then be read aloud. However, all of these reading aids suffer from a key issue: the user must be able to visually target the text and capture an image of sufficient quality for the OCR algorithm to function, which is no small task for VIPs. In this work, a Sound-Emitting Document Image Quality Assessment metric (SEDIQA) is proposed, which allows the user to hear the quality of the text image and automatically captures the best image for OCR accuracy. This work also includes testing of OCR performance against image degradations to identify the most significant contributors to accuracy reduction. The proposed No-Reference Image Quality Assessor (NR-IQA) is validated alongside established NR-IQAs, and this work includes insights into the performance of these NR-IQAs on document images.
Subject: Mathematics & Computer Science, Algebra & Number Theory Keywords: Amharic script; Attention mechanism; OCR; Encoder-decoder; Text-image
Online: 15 October 2020 (13:42:28 CEST)
Today, the growth of digitization and worldwide communication makes OCR for exotic languages a very important task. In this paper, we attempt to develop an OCR system for one of these languages with a unique script, Amharic. Motivated by the recent success of the attention mechanism in Neural Machine Translation (NMT), we extend the attention mechanism to Amharic text-image recognition. The proposed model consists of CNNs and attention-embedded recurrent encoder-decoder networks that are integrated following the configuration of the seq2seq framework. The attention network parameters are trained in an end-to-end fashion, and the context vector is injected, together with the previously predicted output, at each decoding time step. Unlike the existing OCR model, which minimizes the CTC objective function, the new model minimizes the categorical cross-entropy loss. The performance of the proposed attention-based model is evaluated on the test dataset from the ADOCR database, which consists of both printed and synthetically generated Amharic text-line images, and the model achieves promising results with CERs of 1.54% and 1.17%, respectively.
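The decoding step described in this abstract can be illustrated with a minimal sketch in plain Python. It assumes dot-product scoring between the decoder state and each encoder state; the abstract does not say which scoring function the authors use, and the function names `attention_context` and `cross_entropy` are hypothetical, chosen only for this illustration.

```python
import math


def softmax(scores):
    """Turn raw scores into weights that sum to 1 (numerically stable)."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_context(decoder_state, encoder_states):
    """Return (weights, context) for one decoding time step.

    Assumption: dot-product scoring. The paper may use a different
    (for example additive) scoring function.
    """
    scores = [
        sum(d * h for d, h in zip(decoder_state, state))
        for state in encoder_states
    ]
    weights = softmax(scores)
    dim = len(encoder_states[0])
    # The context vector is the attention-weighted sum of encoder states.
    context = [
        sum(w * state[i] for w, state in zip(weights, encoder_states))
        for i in range(dim)
    ]
    return weights, context


def cross_entropy(predicted_probs, target_index):
    """Categorical cross-entropy for a single predicted character."""
    return -math.log(predicted_probs[target_index])


# Toy example: two encoder positions, decoder state aligned with the first.
weights, context = attention_context([1.0, 0.0], [[1.0, 0.0], [0.0, 1.0]])
```

In a full model the context vector would be concatenated with the previously predicted character embedding and fed to the recurrent decoder, and the loss would be summed over all characters of the text line.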
ARTICLE | doi:10.20944/preprints201812.0306.v1
Subject: Engineering, Electrical & Electronic Engineering Keywords: cymatics; text detection and recognition; optical character recognition (OCR)
Online: 25 December 2018 (13:52:31 CET)
This paper proposes an original approach to achieving a Cymatics-based visual perception of image-extracted text. In this context, an effective approach for automated text detection and recognition in natural scene images is proposed. The incoming image is first enhanced by employing CLAHE and DWT. Afterwards, the text regions of the enhanced image are detected by employing the MSER feature detector. The non-text MSERs are removed by employing geometrical and contour-based filters. The remaining MSERs are grouped into words or phrases by finding similarities between them. The text recognition is performed by employing an OCR function. The extracted text is sequentially analysed on a character-by-character basis. Each character is converted into a methodical acoustic excitation. Finally, these excitations are converted into systematic visual perceptions by using the phenomenon of Cymatics. The system functionality is tested with an experimental setup. For the studied natural scenes, the suggested approach achieves 80% precision in text localization and 53% precision in end-to-end text recognition. The devised system principle is novel and can be employed in various applications such as visual art, encryption, education and the integration of impaired people.
ARTICLE | doi:10.20944/preprints201804.0327.v2
Subject: Engineering, Electrical & Electronic Engineering Keywords: Overcurrent Relay (OCR); Genetic Algorithm (GA); Time Dial Setting (TDS); Plug Setting Multiplier (PSM); Optimal OCR setting and coordination; DigSILENT power factory
Online: 27 April 2018 (15:49:58 CEST)
This paper presents a study on optimizing the overcurrent relay (OCR) coordination protection scheme for a Sustainable Standalone Hydrokinetic Renewable Energy (SHRE) distribution network on the Batang Rajang river at Kapit, Sarawak, Malaysia, which turns the river stream into a power generation source. The purpose of the project is to develop a rural electrification system for native longhouses along the river. The study is tested on a model of the SHRE distribution network developed in DigSILENT, in accordance with the network's unique parameters and the relevant standards. Since this is a new standalone distribution system, an efficient and properly coordinated overcurrent protection system must be provided. Miscoordination among OCRs results in maloperation of the protection system, which can lead to false tripping, unnecessary outages and power system instability. Thus, the objective of this work is to employ the Genetic Algorithm (GA) technique in Matlab/Simulink to find optimal overcurrent coordination and settings among all OCRs in the proposed distribution network in order to improve the speed of OCR tripping operation. The GA is used because the project is fast-track and requires the simplest method available. In this strategy, the time dial setting (TDS) is optimized using the plug setting multiplier (PSM) as the constraint. The results show a significant improvement: the relay operating time during fault occurrence is 36.01% faster than that of the conventional numerical technique. Thus, an efficient and reliable overcurrent protection scheme has been achieved for the SHRE distribution network.
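To make the relationship between TDS, PSM and relay operating time concrete, the following Python sketch evaluates the quantity that such an optimization would minimize. It assumes the IEC standard inverse characteristic, t = TDS × 0.14 / (PSM^0.02 − 1), and a coordination time interval of 0.3 s between primary and backup relays. The abstract states neither the curve nor the interval, so both are assumptions, and the function names are hypothetical. The authors' optimization itself is carried out with a GA in Matlab/Simulink and is not reproduced here.

```python
def operating_time(tds, psm, k=0.14, alpha=0.02):
    """Relay operating time in seconds.

    Assumption: IEC standard inverse curve (k=0.14, alpha=0.02).
    psm is the fault current divided by the relay pickup current.
    """
    if psm <= 1.0:
        raise ValueError("relay does not operate for PSM <= 1")
    return tds * k / (psm ** alpha - 1.0)


def is_coordinated(t_primary, t_backup, cti=0.3):
    """Check that the backup relay trips at least `cti` seconds after
    the primary relay (cti = 0.3 s is an assumed, typical value)."""
    return t_backup - t_primary >= cti


# Toy example: both relays see a fault at 10 times their pickup current.
t_primary = operating_time(tds=0.10, psm=10.0)   # roughly 0.30 s
t_backup = operating_time(tds=0.25, psm=10.0)    # roughly 0.74 s
coordinated = is_coordinated(t_primary, t_backup)
```

Because the operating time is linear in TDS for a fixed PSM, an optimizer can lower TDS values to speed up tripping until a coordination check of this kind is about to fail.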
ARTICLE | doi:10.20944/preprints202203.0329.v1
Subject: Mathematics & Computer Science, Analysis Keywords: Plagiarism Detection; Plagiarism checker for Bengali text; Bengali Literature Corpus; OCR in Bengali text
Online: 24 March 2022 (09:36:56 CET)
Plagiarism means taking another person’s work without giving them credit for it. Plagiarism is one of the most serious problems in academia and among researchers. Although multiple tools are available to detect plagiarism in a document, most of them are domain-specific and designed to work on English texts, yet plagiarism is not limited to a single language. Bengali is the most widely spoken language of Bangladesh and the second most spoken language in India, with 300 million native speakers and 37 million second-language speakers. Plagiarism detection requires a large corpus for comparison. Bengali literature has a history of 1300 years, yet most Bengali literature books have not been properly digitized. As no such corpus existed for our purpose, we collected Bengali literature books from the National Digital Library of India, extracted text from them with a comprehensive methodology, and constructed our corpus. Our experimental results show an average accuracy between 72.10% and 79.89% in text extraction using OCR. The Levenshtein distance algorithm is used to determine plagiarism. We have built a web application for end users and successfully tested it for plagiarism detection in Bengali texts. In future, we aim to construct a corpus with more books for more accurate detection.
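The Levenshtein distance named in this abstract can be sketched in a few lines of Python. The `similarity` normalization below (one minus distance divided by the longer length) is an assumption made for illustration; the abstract does not say how the authors turn the distance into a plagiarism decision.

```python
def levenshtein(a, b):
    """Minimum number of single-character insertions, deletions and
    substitutions needed to turn string a into string b."""
    if len(a) < len(b):
        a, b = b, a  # keep the inner row as short as possible
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            substitution_cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,                      # deletion
                current[j - 1] + 1,                   # insertion
                previous[j - 1] + substitution_cost,  # substitution
            ))
        previous = current
    return previous[-1]


def similarity(a, b):
    """Similarity in [0, 1]; 1.0 means the strings are identical.
    Assumed normalization, not taken from the paper."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein(a, b) / longest


distance = levenshtein("kitten", "sitting")  # 3
```

Python strings are compared code point by code point, so a Bengali conjunct or a consonant with a vowel sign counts as several characters here. A real system might first segment the text into grapheme clusters or words before comparing.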