Preprint Article · Version 1 · Preserved in Portico · This version is not peer-reviewed

DeHyFoNet: Deformable Hybrid Network for Formula Detection in Scanned Document Images

Version 1: Received: 5 January 2022 / Approved: 6 January 2022 / Online: 6 January 2022 (12:56:23 CET)

How to cite: Afzal, M.Z.; Hashmi, K.A.; Pagani, A.; Liwicki, M.; Stricker, D. DeHyFoNet: Deformable Hybrid Network for Formula Detection in Scanned Document Images. Preprints 2022, 2022010090. https://doi.org/10.20944/preprints202201.0090.v1

Abstract

This work presents an end-to-end trainable approach for detecting mathematical formulas in scanned document images. Since many OCR engines cannot reliably process formulas, it is essential to isolate them so that clean text can be extracted from the document. Our proposed pipeline comprises a Hybrid Task Cascade network with deformable convolutions and a ResNeXt-101 backbone; both modifications improve detection. We evaluate the proposed approach on the ICDAR-2017 POD and Marmot datasets. On ICDAR-2017 POD, we achieve an overall accuracy of 96%, corresponding to an overall error reduction of 13%. On Marmot, results improve for both isolated and embedded formulas: we achieve 98.78% accuracy for isolated formulas and 90.21% for embedded formulas, corresponding to error reduction rates of 43% and 17.9%, respectively.
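For readers who want to see how the pieces fit together, the sketch below shows one plausible assembly in MMDetection, a common open-source detection framework. The framework choice, the base config name, and the exact DCN placement are assumptions for illustration; the abstract itself only specifies the Hybrid Task Cascade architecture, the ResNeXt-101 backbone, and the use of deformable convolutions.

```python
# A minimal sketch, assuming MMDetection as the framework (the preprint does
# not name its tooling). It starts from a stock Hybrid Task Cascade (HTC)
# config with a ResNeXt-101 backbone and enables deformable convolutions
# (DCN) in the later backbone stages, matching the abstract's description.

_base_ = 'htc/htc_x101_64x4d_fpn_16x1_20e_coco.py'  # stock HTC + ResNeXt-101

model = dict(
    backbone=dict(
        # Replace the regular 3x3 convolutions in stages 2-4 with deformable
        # convolutions; the learned offsets let the sampling grid adapt to
        # the irregular shapes and sizes of mathematical formulas. Stage 1
        # keeps regular convolutions, as is common practice.
        dcn=dict(type='DCN', deform_groups=1, fallback_on_stride=False),
        stage_with_dcn=(False, True, True, True),
    ),
)

# For formula detection, the num_classes fields of the cascade's heads would
# also be reduced from COCO's 80 to the dataset's formula classes (e.g.
# isolated vs. embedded), and the dataset paths pointed at ICDAR-2017 POD or
# Marmot annotations.
```

Restricting deformable convolutions to the later stages is a deliberate trade-off: early stages capture low-level edges that gain little from adaptive sampling, while later stages benefit from offsets that follow the elongated, variably sized regions formulas occupy.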

Keywords

formula detection; Hybrid Task Cascade network; mathematical expression detection; document image analysis; deep neural networks; computer vision

Subject

Computer Science and Mathematics, Computer Vision and Graphics
