Preprint, Concept Paper, Version 1. Preserved in Portico. This version is not peer-reviewed.

Training Natural Language Processing Models on Encrypted Text for Enhanced Privacy

Version 1 : Received: 3 May 2023 / Approved: 5 May 2023 / Online: 5 May 2023 (03:38:42 CEST)

How to cite: TAŞAR, D.E.; ÖCAL TAŞAR, C. Training Natural Language Processing Models on Encrypted Text for Enhanced Privacy. Preprints 2023, 2023050287. https://doi.org/10.20944/preprints202305.0287.v1

Abstract

With the increasing use of cloud-based services for training and deploying machine learning models, data privacy has become a major concern. This is particularly important for natural language processing (NLP) models, which often process sensitive information such as personal communications and confidential documents. In this study, we propose a method for training NLP models on encrypted text data that mitigates data privacy concerns while maintaining performance similar to that of models trained on non-encrypted data. We demonstrate our method using two architectures, Doc2Vec+XGBoost and Doc2Vec+LSTM, and evaluate the models on the 20 Newsgroups dataset. Our results indicate that models trained on encrypted text and models trained on non-encrypted text achieve comparable performance, suggesting that our encryption method preserves data privacy without sacrificing model accuracy. To allow replication of our experiments, we provide a Colab notebook at https://t.ly/lR-TP
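The intuition behind training on encrypted text can be illustrated with a minimal sketch. The abstract does not specify the encryption scheme, so the code below uses a keyed hash (HMAC-SHA256) applied to each token purely as a stand-in, and it is not reversible encryption. The function names `encrypt_token` and `encrypt_document`, the key, and the example sentences are all hypothetical. The point it shows is that a deterministic, token-level transformation keeps token identity and co-occurrence intact, which is the kind of signal an embedding model such as Doc2Vec relies on.

```python
import hashlib
import hmac
from typing import List


def encrypt_token(token: str, key: bytes) -> str:
    """Map a token to an opaque, deterministic pseudonym with a keyed hash.

    Assumption: this is a stand-in for the paper's encryption step,
    which the abstract does not describe.
    """
    digest = hmac.new(key, token.encode("utf-8"), hashlib.sha256).hexdigest()
    return digest[:16]  # truncated only to keep the illustration readable


def encrypt_document(text: str, key: bytes) -> List[str]:
    """Lowercase, split on whitespace, and replace each token by its pseudonym."""
    return [encrypt_token(tok, key) for tok in text.lower().split()]


key = b"hypothetical-secret-key"  # illustrative key, not from the paper
doc_a = encrypt_document("the rocket launch was delayed", key)
doc_b = encrypt_document("the launch of the rocket", key)

# Tokens shared by the plaintext documents remain shared after the transformation,
# so document-level statistics survive even though the words are unreadable.
shared = set(doc_a) & set(doc_b)
print(len(shared))
```

The resulting token lists could then be passed to an embedding model in place of the plaintext tokens. Because the mapping depends on the key, a party without the key sees only opaque strings; whether this level of protection is adequate depends on the threat model, which this sketch does not address.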

Keywords

Natural language processing; encrypted text; data privacy; cloud computing; Doc2Vec; XGBoost; LSTM

Subject

Computer Science and Mathematics, Artificial Intelligence and Machine Learning
