ARTICLE | doi:10.20944/preprints202111.0378.v1
Subject: Engineering, Other
Keywords: NCM classification; natural language processing; transformers; multilingual BERT; Portuguese BERT; NLP; BERT
Online: 22 November 2021 (10:59:43 CET)
The classification of goods involved in international trade in Brazil is based on the Mercosur Common Nomenclature (NCM). Classifying these goods is challenging because assigning the correct category codes is complex, especially given the legal and fiscal implications of misclassification. This work focuses on training a classifier based on Bidirectional Encoder Representations from Transformers (BERT) for the tax classification of goods with NCM codes. In particular, this article presents results from a BERT model tuned specifically for the Portuguese language as well as results from a Multilingual BERT. Experimental results justify the use of these models in the classification process and show that the language-specific model performs slightly better.