Preprint Article Version 1 Preserved in Portico This version is not peer-reviewed

Statistics and Machine Learning Experiments in Poetry

Version 1 : Received: 10 June 2020 / Approved: 28 June 2020 / Online: 2 July 2020 (00:00:00 CEST)
Version 2 : Received: 10 June 2020 / Approved: 28 June 2020 / Online: 22 October 2020 (00:00:00 CEST)

A peer-reviewed article of this Preprint also exists.

Calin, O. Statistics and Machine Learning Experiments in English and Romanian Poetry. Sci 2020, 2, 92.

Abstract

This paper presents a quantitative approach to poetry, based on several statistical measures (entropy, information energy, N-gram statistics, etc.) applied to a few characteristic English writings. We found that the entropy of the English language changes over time, and that entropy depends on the language used and on the author. To compare two similar texts, we introduced a statistical method to assess the information entropy between two texts. We also introduced a method of computing the average information conveyed by a group of letters about the next letter in the text. We found a formula for computing the Shannon language entropy, and we introduced the concept of the N-gram informational energy of a poem. We also constructed a neural network that is able to generate Byron-type poetry and to analyze its information proximity to genuine Byron poetry.
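
As a rough illustration of the kind of letter-level measures named in the abstract, the sketch below computes the Shannon entropy, the informational energy (taken here as the sum of squared letter probabilities), and the Kullback–Leibler relative entropy of two short texts. It is a minimal sketch under stated assumptions, not the paper's own procedure: the function names, the restriction to case-folded alphabetic characters, and the small `eps` floor for letters missing from the second text are all choices made for this example.

```python
import math
from collections import Counter


def letter_distribution(text):
    """Relative frequency of each alphabetic character, case-folded.

    Ignoring spaces, digits and punctuation is an assumption of this sketch.
    """
    letters = [ch for ch in text.lower() if ch.isalpha()]
    if not letters:
        raise ValueError("text contains no letters")
    total = len(letters)
    return {ch: n / total for ch, n in Counter(letters).items()}


def shannon_entropy(p):
    """H(p) = -sum_i p_i * log2(p_i), in bits per letter."""
    return -sum(q * math.log2(q) for q in p.values())


def informational_energy(p):
    """Informational energy taken as sum_i p_i^2 (1 for a single repeated letter)."""
    return sum(q * q for q in p.values())


def kl_divergence(p, q, eps=1e-12):
    """D(p || q) = sum_i p_i * log2(p_i / q_i), in bits.

    Letters absent from q are floored at eps; this crude guard against
    division by zero is an assumption, not a principled smoothing scheme.
    """
    return sum(
        pi * math.log2(pi / max(q.get(ch, 0.0), eps))
        for ch, pi in p.items()
    )


if __name__ == "__main__":
    p = letter_distribution("She walks in beauty, like the night")
    q = letter_distribution("Of cloudless climes and starry skies")
    print("entropy of first text (bits/letter):", shannon_entropy(p))
    print("informational energy of first text:", informational_energy(p))
    print("relative entropy D(p || q) (bits):", kl_divergence(p, q))
```

Entropy and informational energy move in opposite directions: a text that uses its letters more evenly has higher entropy and lower energy. The relative entropy is not symmetric, so `kl_divergence(p, q)` and `kl_divergence(q, p)` generally differ.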

Keywords

entropy; Kullback–Leibler relative entropy; recurrent neural networks; learning

Subject

Computer Science and Mathematics, Artificial Intelligence and Machine Learning
