Preprint Article Version 1 Preserved in Portico This version is not peer-reviewed

Improving the Accuracy in Text Classification Methodology in Light of Modelling the Latent Semantic Relations

Version 1 : Received: 15 October 2018 / Approved: 16 October 2018 / Online: 16 October 2018 (07:55:35 CEST)

A peer-reviewed article of this Preprint also exists.

Rizun, N.; Taranenko, Y.; Waloszek, W. Improving the Accuracy in Sentiment Classification in the Light of Modelling the Latent Semantic Relations. Information 2018, 9, 307. Rizun, N.; Taranenko, Y.; Waloszek, W. Improving the Accuracy in Sentiment Classification in the Light of Modelling the Latent Semantic Relations. Information 2018, 9, 307.

Abstract

The research presents the Methodology of Improving the Accuracy in Text Classification in Light of Modelling the Latent Semantic Relations (LSR). The aim of this Methodology is to find the ways of eliminating the Limitations of Discriminant and Probabilistic methods for LSR revealing and customizing the Text Classification Process to the more accurate recognition of the text tonality. This aim should be achieved by using the knowledge about the text’s Hierarchical Semantic Context in the form of Corpora-based Hierarchical Sentiment Dictionary. The main scientific contribution of this research is the following set of approaches to improve the qualitative characteristics of Text Classification process: combination of the Discriminant and Probabilistic methods allowing to decrease the influences of the Limitations of these methods on the LSR revealing process; considering each document as a complex structure allowing to estimate documents integrally by separated classification of topically completed textual component (paragraphs); taking into account the features of Argumentative type of documents (Reviews) allowing to use the author’s subjective evaluation of text tonality for development the Text Classification methodology. Tonality, expressed by the Review’s author, has a significant, but not critical, effect on the qualitative indicators of Sentiment Recognition.

Keywords

text classification; topic modelling; latent semantic analysis; latent dirichlet allocation; hierarchical sentiment dictionary; contextually-oriented hierarchical corpus; text tonality; evaluation

Subject

Computer Science and Mathematics, Information Systems

Comments (0)

We encourage comments and feedback from a broad range of readers. See criteria for comments and our Diversity statement.

Leave a public comment
Send a private comment to the author(s)
* All users must log in before leaving a comment
Views 0
Downloads 0
Comments 0
Metrics 0


×
Alerts
Notify me about updates to this article or when a peer-reviewed version is published.
We use cookies on our website to ensure you get the best experience.
Read more about our cookies here.