Version 1
Received: 14 December 2023 / Approved: 14 December 2023 / Online: 14 December 2023 (15:40:04 CET)
How to cite:
Roy, J.; Patel, R.; Nguyen, P.; Simon, S. Dynamic Syntax Tree Model for Enhanced Source Code Representation. Preprints 2023, 2023121108. https://doi.org/10.20944/preprints202312.1108.v1
APA Style
Roy, J., Patel, R., Nguyen, P., & Simon, S. (2023). Dynamic Syntax Tree Model for Enhanced Source Code Representation. Preprints. https://doi.org/10.20944/preprints202312.1108.v1
Chicago/Turabian Style
Roy, J., R. Patel, Pasir Nguyen, and Sartran Simon. 2023. "Dynamic Syntax Tree Model for Enhanced Source Code Representation." Preprints. https://doi.org/10.20944/preprints202312.1108.v1
Abstract
Source code representation is central to many program analysis applications, and neural networks have recently achieved notable success in this area. However, existing models do not fully exploit the structural characteristics particular to programming languages. Neural models based on abstract syntax trees (ASTs) handle the tree-like nature of source code, but they do not distinguish among the different kinds of substructures within a program. This paper introduces the Dynamic Syntax Tree Model (DSTM), which composes different neural network modules into a tree architecture tailored to the AST of each input. Unlike previous tree-based neural models, DSTM captures the semantic differences among distinct AST substructures. We evaluate DSTM on program classification and code clone detection, where it outperforms contemporary methods, demonstrating the benefit of exploiting fine-grained source code structure.
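The idea of assembling a network whose shape follows the input's AST can be illustrated with a minimal sketch. This is not the authors' architecture: the abstract does not specify the modules, so the sketch assumes one small module per AST node type, mean pooling over child vectors, and untrained random weights. The names `NodeModule`, `DynamicTreeEncoder`, and `DIM` are hypothetical, and Python's standard `ast` module stands in for whatever parser the paper uses.

```python
import ast
import math
import random

DIM = 4  # hypothetical embedding size, chosen only for illustration


class NodeModule:
    """Hypothetical per-node-type module: tanh(W * mean(children) + b)."""

    def __init__(self, rng):
        # Untrained random parameters; a real model would learn these.
        self.weights = [[rng.uniform(-1, 1) for _ in range(DIM)] for _ in range(DIM)]
        self.bias = [rng.uniform(-1, 1) for _ in range(DIM)]

    def __call__(self, child_vectors):
        if child_vectors:
            # Mean-pool the child vectors dimension by dimension.
            pooled = [sum(col) / len(child_vectors) for col in zip(*child_vectors)]
        else:
            # Leaf nodes have no children, so only the bias contributes.
            pooled = [0.0] * DIM
        return [
            math.tanh(sum(w * x for w, x in zip(row, pooled)) + b)
            for row, b in zip(self.weights, self.bias)
        ]


class DynamicTreeEncoder:
    """Composes node-type-specific modules along the AST of each input."""

    def __init__(self, seed=0):
        self.rng = random.Random(seed)
        self.modules = {}  # maps AST node type name -> NodeModule

    def module_for(self, node):
        name = type(node).__name__
        if name not in self.modules:
            self.modules[name] = NodeModule(self.rng)
        return self.modules[name]

    def encode(self, node):
        # Encode children first, then apply this node type's own module.
        children = [self.encode(child) for child in ast.iter_child_nodes(node)]
        return self.module_for(node)(children)


encoder = DynamicTreeEncoder()
vec = encoder.encode(ast.parse("x = a + b"))
print(sorted(encoder.modules))  # node types that received their own module
print(vec)                      # fixed-size vector for the whole program
```

Because each node type (for example `Assign` or `BinOp`) has its own parameters, two substructures with the same shape but different node types are transformed differently, which is the property the abstract attributes to DSTM. A trained model would replace the random weights with learned ones and would likely use a richer combination of child vectors than a mean.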
Keywords
Tree modeling; Code representation learning
Subject
Computer Science and Mathematics, Artificial Intelligence and Machine Learning
Copyright:
This is an open access article distributed under the Creative Commons Attribution License which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.