Neural Networks for Source Code Processing: topic of the dissertation and author's abstract, VAK RF specialty 05.13.17, Candidate of Sciences Nadezhda Aleksandrovna Chirkova
- VAK RF specialty: 05.13.17
- Number of pages: 64
Table of contents of the dissertation (Candidate of Sciences Nadezhda Aleksandrovna Chirkova)
Contents
1 Introduction
2 Key results and conclusions
3 Content of the work
3.1 On the Embeddings of Variables in Recurrent Neural Networks for Source Code
3.2 Empirical Study of Transformers for Source Code
3.3 A Simple Approach for Handling Out-of-Vocabulary Identifiers in Deep Learning for Source Code
4 Conclusion
References
Appendix A Article. On the Embeddings of Variables in Recurrent Neural Networks for Source Code
Appendix B Article. Empirical Study of Transformers for Source Code
Appendix C Article. A Simple Approach for Handling Out-of-Vocabulary Identifiers in Deep Learning for Source Code
Introduction of the dissertation (part of the author's abstract) on the topic "Neural Networks for Source Code Processing"
1 Introduction
Topic of the thesis
Neural networks have been successfully used in a wide range of applied tasks with structured data, including image, text, and video processing. One of the key properties of neural networks that distinguishes them from other machine learning models is the possibility of adapting the architecture to different kinds of data. This work focuses on adapting neural network-based models to source code processing. Neural networks have been shown to substantially improve quality in tasks such as code completion [1], bug fixing [2], translating code from one programming language to another [3], and code documentation [4], providing help for programmers and simplifying the development process. Better adaptation of neural network-based models to source code may improve the quality of solving the listed tasks even further.
As a data domain, source code shares some properties with natural text, e.g. its discrete, sequential nature. As a result, code is often processed with architectures borrowed from natural language processing (NLP), for example, Transformers [5] or recurrent neural networks (RNNs). However, source code has a set of specific properties, and taking them into account may improve model quality. First, source code is strictly structured, as it follows the rules of the programming language. Second, most programming languages rely on the notion of variables, which store the results of intermediate operations and allow the reuse of these results. Third, in contrast to natural language, source code may contain user-defined identifiers of arbitrary length and complexity, as this is usually allowed by the programming language. The strict structure can be made explicit by parsing a snippet into an abstract syntax tree (AST), as in the sketch below.
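As an illustration of the first property, the following minimal sketch parses a one-line assignment with Python's standard ast module; the snippet and the abridged output shown in the comment are illustrative only.

```python
import ast

# Parse a toy assignment into an abstract syntax tree (AST).
tree = ast.parse("x = a + b")
print(ast.dump(tree, indent=2))  # the indent argument requires Python 3.9+
# Abridged output: Module -> Assign(targets=[Name 'x'],
#                  value=BinOp(Name 'a', Add, Name 'b'))
```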
This dissertation is devoted to utilizing the specified properties in neural network-based models to improve performance in three applied tasks. The first task is code completion, in which a neural network predicts the next code tokens based on the already written code. Neural network-based code completion modules are plugged into the majority of modern integrated development environments (IDEs) such as Visual Studio or PyCharm. The second task is variable misuse detection and repair, later referred to as the variable misuse task. In this task, a neural network predicts the location of the bug in the program snippet (if there is any) and the location from which a variable could be copied to fix the bug; a toy example is given below. Automatically detecting and fixing bugs is a long-standing problem, and solving it would substantially simplify the development process [6]. The third task is function naming, in which a neural network predicts the name of a function given the function body. Assistance in function naming makes code more readable and simplifies code maintenance.
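As a toy illustration of the variable misuse task (the function below is invented for illustration and does not come from the thesis datasets):

```python
def mean_absolute_error(pred, target):
    total = 0.0
    for p, t in zip(pred, target):
        total += abs(p - p)   # variable misuse: the second `p` should be `t`
    return total / len(pred)  # repair = copy `t` to the buggy location
```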
Relevance
Neural networks have been widely adopted in source code processing; see the detailed review in [7]. Earlier works processed code by directly applying neural network architectures used in NLP. A line of recent works considered adapting neural networks to the specific properties of source code described above. First, in order to utilize the syntactic tree parsed from a code snippet, Wenhan et al. [8] propose to use recursive neural networks, while Li et al. [9] and Kim et al. [1] propose passing the depth-first traversal of the tree to sequential models, RNNs and Transformers respectively; a small sketch of such a traversal is given below. Shiv and Quirk [10] and Kim et al. [1] further propose to adjust the Transformer's architecture to take the tree structure into account. The drawback of these works is that the proposed approaches were tested on different applied tasks and datasets, making it hard to establish the best-performing approach. Second, in order to utilize the notion of variables, the dominating approach is to apply Graph Neural Networks (GNNs) and their variants to the graph constructed by treating code tokens as vertices and drawing edges based on data or control flow in the program [11]. The drawbacks of this approach are that it is relatively hard to implement and that the forward pass through GNNs is relatively slow because of the time-consuming message passing procedure. Third, in order to process rare and complex identifiers, Karampatsis and Sutton [12] propose using byte-pair encoding, which splits them into smaller, more frequent pieces. The drawback of this approach is that splitting makes sequences much longer, slowing down the neural network's predictions.
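To make the depth-first flattening concrete, the sketch below turns a Python AST into a sequence of node types using the standard ast module; it illustrates the general idea rather than the exact preprocessing of any cited paper.

```python
import ast

def dfs_node_types(node):
    """Depth-first traversal of a Python AST, yielding node types;
    such a flattened sequence can be fed to a sequential model."""
    yield type(node).__name__
    for child in ast.iter_child_nodes(node):
        yield from dfs_node_types(child)

print(list(dfs_node_types(ast.parse("y = f(x)"))))
# ['Module', 'Assign', 'Name', 'Store', 'Call', 'Name', 'Load', 'Name', 'Load']
```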
This thesis provides further advancement in utilizing the specifics of source code in neural networks, particularly in RNNs and Transformers, as these are the two most widely used architectures nowadays for code and natural text. The first work of this thesis focuses on processing variables in RNNs and introduces an RNN-based dynamic embeddings mechanism. The dynamic embedding of each variable in the program is first initialized based on the variable's name and then updated each time the variable occurs in the program. In contrast to the conventionally used static embeddings that are based only on variable names, the proposed dynamic embeddings capture the role of a variable in the program through the update mechanism. The experimental part of the work shows that the proposed dynamic embeddings significantly outperform standard RNNs in the code completion and variable misuse tasks, for Python and JavaScript.
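A minimal PyTorch-style sketch of this mechanism is given below; the GRU-cell update rule and all names are illustrative assumptions rather than the exact architecture of the paper.

```python
import torch
import torch.nn as nn

class DynamicVariableEmbeddings(nn.Module):
    """Sketch: one embedding per variable, initialized from the variable's
    name and updated every time the variable occurs in the program."""

    def __init__(self, hidden_size):
        super().__init__()
        self.update = nn.GRUCell(hidden_size, hidden_size)  # illustrative update rule

    def forward(self, name_embeddings, var_ids, hidden_states):
        # name_embeddings: (num_vars, hidden) static embeddings of variable names
        # var_ids:         (seq_len,) variable index at each position, -1 if not a variable
        # hidden_states:   (seq_len, hidden) RNN hidden states over the token sequence
        var_emb = name_embeddings.clone()  # dynamic state, one vector per variable
        outputs = []
        for t, v in enumerate(var_ids.tolist()):
            if v >= 0:
                outputs.append(var_emb[v])  # read the current dynamic embedding
                # refresh it with the context observed at this occurrence
                new_emb = self.update(hidden_states[t].unsqueeze(0),
                                      var_emb[v].unsqueeze(0)).squeeze(0)
                var_emb = var_emb.clone()
                var_emb[v] = new_emb
            else:
                outputs.append(hidden_states[t])
        return torch.stack(outputs), var_emb
```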
The second work is devoted to utilizing the syntactic structure of code in Transformers. In recent years, several modifications were proposed to utilize the syntactic structure of code in Transformers [10, 1, 2]. However, these modifications were tested on different tasks and datasets, and as a result, it remains unclear which approach for utilizing code structure in Transformers performs better. This work compares the modifications in a unified framework on three tasks and two programming languages and provides recommendations for the future use of Transformers for processing the syntactic structure of code, e.g. using the sequential relative attention mechanism. Moreover, this work explores the capability of the Transformer to process anonymized code, in which all identifiers are replaced with placeholders Var1, Var2, Var3, etc. In this case, no textual information on the code snippet is available and the only source of information the model can rely on is the syntactic structure of the code. The work shows that the Transformer can make meaningful predictions for such anonymized code and thus is capable of capturing syntactic information. Finally, the work analyses the effect of different components of processing syntax.
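For reference, sequential relative attention (in the spirit of relative position representations by Shaw et al.) adds a learned embedding of the clipped token offset to the attention scores; below is a minimal single-head sketch, with the names and the clipping distance chosen purely for illustration.

```python
import math
import torch

def relative_attention_scores(q, k, rel_emb, max_dist):
    # q, k:    (seq_len, dim) queries and keys of one attention head
    # rel_emb: (2 * max_dist + 1, dim) learned embeddings of clipped relative offsets
    seq_len, dim = q.shape
    content = q @ k.T                                           # content-content term
    pos = torch.arange(seq_len)
    offsets = (pos[None, :] - pos[:, None]).clamp(-max_dist, max_dist) + max_dist
    position = torch.einsum('id,ijd->ij', q, rel_emb[offsets])  # content-position term
    return (content + position) / math.sqrt(dim)                # pass to softmax as usual
```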
The third work tackles the problem of processing rare identifiers in source code and proposes an easy-to-implement preprocessing technique based on anonymization. Particularly, all rare identifiers, i.e. those with frequency less than a threshold, are replaced with unique placeholders Var1, Var2, Var3, etc. The experimental part of the work shows that the proposed technique improves the accuracy of variable misuse detection and repair by 5-6% and of code completion by 7-10%, for the Transformer architecture.
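A possible sketch of this preprocessing step is shown below; the frequency threshold, the identifier test, and counting frequencies within a single token sequence are simplifying assumptions, and the exact procedure in the paper may differ in details.

```python
from collections import Counter

def anonymize_rare_identifiers(tokens, is_identifier, threshold=10):
    """Replace identifiers seen fewer than `threshold` times with unique
    placeholders Var1, Var2, ... (frequencies counted within `tokens` here)."""
    counts = Counter(t for t in tokens if is_identifier(t))
    mapping, result = {}, []
    for t in tokens:
        if is_identifier(t) and counts[t] < threshold:
            mapping.setdefault(t, f"Var{len(mapping) + 1}")
            result.append(mapping[t])
        else:
            result.append(t)
    return result
```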
The goal of this work is to improve the performance of RNNs and Transformers in applied source code processing tasks by developing and investigating methods that take into account the specifics of source code as a data domain.
2 Key results and conclusions
The main contributions of this work are threefold:
1. We proposed an RNN-based dynamic embeddings mechanism for processing variables in source code, which captures the roles of variables in a program through the update mechanism. The model with the proposed dynamic embeddings outperforms the conventional RNN model in the code completion and variable misuse tasks by 0.5-18%, depending on the task and programming language (Python or JavaScript).
2. We conducted an empirical study of the capability of Transformers to utilize the syntactic structure of source code, including a comparison of five syntax-based Transformer modifications on the variable misuse, function naming, and code completion tasks in two programming languages (Python and JavaScript), a test of the general capability of Transformers to capture syntactic information, and an analysis of the effect of different components of processing syntax. The results of the study underline sequential relative attention as the most effective and efficient approach for capturing syntactic information.
3. We proposed an easy-to-implement preprocessing technique for source code, namely the anonymization of rare identifiers, which improves the quality of the Transformer in the variable misuse and code completion tasks by 5-10%, for Python and JavaScript.
Theoretical and practical significance. The proposed models and conducted empirical studies pave the way towards further advancement in deep learning for source code. Through the use of the proposed dynamic embeddings, the proposed anonymization of rare identifiers, and the Transformer with relative attention highlighted in the empirical study of Transformers, one can substantially improve the quality of code completion, variable misuse detection and repair, function naming, and other tasks. Providing high-performing solutions for the specified tasks improves programmers' experience, simplifies development, and improves the readability of code.
Key aspects/ideas to be defended:
1. An RNN-based dynamic embeddings mechanism for processing variables in source code and capturing the roles of variables in programs;
2. An empirical study of five modifications of the Transformer architecture for capturing the syntactic structure of source code and of the general capabilities of Transformer architecture to capture code syntax;
3. An easy-to-implement approach for processing rare identifiers in source code based on their anonymization.
Personal contribution. The first work was conducted solely by the dissertation's author. In the second and third works, the author proposed the key scientific ideas, implemented the methods, conducted all experiments on the variable misuse and function naming tasks, and wrote the text. The contribution of the second author in these papers consists in conducting the experiments on the code completion task, discussing the obtained results, and helping with writing and editing the text.
Publications and approbation of the work
First-tier publications
1. Nadezhda Chirkova. On the Embeddings of Variables in Recurrent Neural Networks for Source Code. In Proceedings of the Annual Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2021 (NAACL 2021). Pages 2679-2689. CORE A conference.
2. Nadezhda Chirkova and Sergey Troshin. Empirical Study of Transformers for Source Code. In Proceedings of the 29th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering, 2021 (ESEC/FSE 2021). Pages 703-715. CORE A* conference.
3. Nadezhda Chirkova and Sergey Troshin. A Simple Approach for Handling Out-of-Vocabulary Identifiers in Deep Learning for Source Code. In Proceedings of the Annual Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2021 (NAACL 2021). Pages 278-288. CORE A conference.
Reports at seminars
1. Seminar of the Bayesian methods research group, Moscow, 13 November 2020. Topic: "Deep learning for source code: handling syntactic structure and identifiers".
2. Seminar of the Faculty of Computer Science in Voronovo, 29 May 2021. Topic: "Empirical study of Transformers for source code".
3. Computer Science Seminar, UC Davis, Online, 30 July 2021. Topic: "Empirical study of Transformers for source code".
Volume and structure of the work. The thesis contains an introduction, the contents of the publications, and a conclusion. The full volume of the thesis is 64 pages.
Conclusion of the dissertation on the topic "Theoretical Foundations of Computer Science", Nadezhda Aleksandrovna Chirkova
CONCLUSION
In this work, we investigated the capabilities of the Transformer to utilize syntactic information in source code processing. Our study underlined the following practical conclusions:
• sequential relative attention is a simple and fast mechanism that was not considered as a baseline in previous works, yet performs best in 3 out of 4 tasks (in some cases, on par with other, slower mechanisms);
• combining sequential relative attention with GGNN Sandwich in the variable misuse task and with tree relative attention or tree positional encoding in the code completion task may further improve quality;
• omitting types, values or edges in ASTs hurts performance;
• ensembling a Transformer trained on the full data with a Transformer trained on the anonymized data outperforms an ensemble of Transformers trained on the same kind of data (see the sketch after this list).
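A minimal sketch of such ensembling, assuming both models return per-token logits (all names are illustrative):

```python
import torch

@torch.no_grad()
def ensemble_predict(model_full, model_anon, tokens_full, tokens_anon):
    # Average the predicted distributions of a model trained on full data
    # and a model trained on anonymized data (sketch; names are illustrative).
    p_full = torch.softmax(model_full(tokens_full), dim=-1)
    p_anon = torch.softmax(model_anon(tokens_anon), dim=-1)
    return 0.5 * (p_full + p_anon)
```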
Further, our study highlighted two conceptual insights. On the one hand, Transformers are generally capable of utilizing syntactic information in source code, despite being initially developed for NLP, i.e. for processing sequences. On the other hand, Transformers do not fully utilize syntactic information in all tasks: in variable misuse and code completion, the Transformer uses all AST components, while in function naming, the Transformer mostly relies on the set of types and values used in the program, hardly utilizing the syntactic structure.
References of the dissertation research, Candidate of Sciences Nadezhda Aleksandrovna Chirkova, 2022
REFERENCES
[1] Wasi Uddin Ahmad, Saikat Chakraborty, Baishakhi Ray, and Kai-Wei Chang. 2020. A Transformer-based Approach for Source Code Summarization. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics (ACL).
[2] Umair Z. Ahmed, Pawan Kumar, Amey Karkare, Purushottam Kar, and Sumit Gulwani. 2018. Compilation Error Repair: For the Student Programs, from the Student Programs. In Proceedings of the 40th International Conference on Software Engineering: Software Engineering Education and Training (Gothenburg, Sweden) (ICSE-SEET '18). Association for Computing Machinery, New York, NY, USA, 78-87. https://doi.org/10.1145/3183377.3183383
[3] Miltiadis Allamanis. 2019. The adverse effects of code duplication in machine learning models of code. Proceedings of the 2019 ACM SIGPLAN International Symposium on New Ideas, New Paradigms, and Reflections on Programming and Software (2019).
[4] Miltiadis Allamanis, Marc Brockschmidt, and Mahmoud Khademi. 2018. Learning to Represent Programs with Graphs. In 6th International Conference on Learning Representations, ICLR 2018, Vancouver, BC, Canada, April 30 - May 3, 2018, Conference Track Proceedings. OpenReview.net. https://openreview.net/forum?id=BJOFETxR-
[5] Miltiadis Allamanis, Hao Peng, and Charles Sutton. 2016. A Convolutional Attention Network for Extreme Summarization of Source Code. In International Conference on Machine Learning (ICML).
[6] Uri Alon, Shaked Brody, Omer Levy, and Eran Yahav. 2019. code2seq: Generating Sequences from Structured Representations of Code. In 7th International Conference on Learning Representations, ICLR 2019, New Orleans, LA, USA, May 6-9, 2019. OpenReview.net. https://openreview.net/forum?id=H1gKYo09tX
[7] Uri Alon, Shaked Brody, Omer Levy, and Eran Yahav. 2019. code2seq: Generating Sequences from Structured Representations of Code. In International Conference on Learning Representations. https://openreview.net/forum?id=H1gKYo09tX
[8] Arsenii Ashukha, Alexander Lyzhov, Dmitry Molchanov, and Dmitry Vetrov. 2020. Pitfalls of In-Domain Uncertainty Estimation and Ensembling in Deep Learning. In International Conference on Learning Representations, ICLR 2020. https://openreview.net/forum?id=BJxI5gHKDr
[9] Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. 2019. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers). Association for Computational Linguistics, Minneapolis, Minnesota, 4171-4186. https://doi.org/10.18653/v1/N19-1423
[10] Zhangyin Feng, Daya Guo, Duyu Tang, Nan Duan, Xiaocheng Feng, Ming Gong, Linjun Shou, Bing Qin, Ting Liu, Daxin Jiang, and Ming Zhou. 2020. CodeBERT: A Pre-Trained Model for Programming and Natural Languages. In Findings of the Association for Computational Linguistics: EMNLP 2020. Association for Computational Linguistics, Online, 1536-1547. https://doi.org/10.18653/v1/2020.findings-emnlp.139
[11] Patrick Fernandes, Miltiadis Allamanis, and Marc Brockschmidt. 2019. Structured Neural Summarization. In International Conference on Learning Representations. https://openreview.net/forum?id=H1ersoRqtm
[12] Daya Guo, Shuo Ren, Shuai Lu, Zhangyin Feng, Duyu Tang, Shujie Liu, Long Zhou, Nan Duan, Alexey Svyatkovskiy, Shengyu Fu, Michele Tufano, Shao Kun Deng, Colin Clement, Dawn Drain, Neel Sundaresan, Jian Yin, Daxin Jiang, and Ming Zhou. 2021. GraphCodeBERT: Pre-training Code Representations with Data Flow. In International Conference on Learning Representations. https://openreview.net/forum?id=jLoC4ez43PZ
[13] Rahul Gupta, Soham Pal, Aditya Kanade, and Shirish Shevade. 2017. DeepFix: Fixing Common C Language Errors by Deep Learning. In Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence (San Francisco, California, USA) (AAAI'17). AAAI Press, 1345-1351.
[14] Vincent J. Hellendoorn, Charles Sutton, Rishabh Singh, Petros Maniatis, and David Bieber. 2020. Global Relational Models of Source Code. In International Conference on Learning Representations, ICLR 2020. https://openreview.net/forum?id=B1lnbRNtwr
[15] Sepp Hochreiter and Jürgen Schmidhuber. 1997. Long Short-term Memory. Neural Computation 9 (12 1997), 1735-1780. https://doi.org/10.1162/neco.1997.9.8.1735
[16] Xing Hu, Ge Li, Xin Xia, David Lo, and Zhi Jin. 2018. Deep Code Comment Generation. In Proceedings of the 26th Conference on Program Comprehension (Gothenburg, Sweden) (ICPC '18). Association for Computing Machinery, New York, NY, USA, 200-210. https://doi.org/10.1145/3196321.3196334
[17] Srinivasan Iyer, Ioannis Konstas, Alvin Cheung, and Luke Zettlemoyer. 2016. Summarizing Source Code using a Neural Attention Model. In Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). Association for Computational Linguistics, Berlin, Germany, 2073-2083. https://doi.org/10.18653/v1/P16-1195
[18] Aditya Kanade, Petros Maniatis, Gogul Balakrishnan, and Kensen Shi. 2020. Learning and evaluating contextual embedding of source code. In Proceedings of the 37th International Conference on Machine Learning, ICML 2020, 12-18 July 2020 (Proceedings of Machine Learning Research). PMLR.
[19] Rafael-Michael Karampatsis, Hlib Babii, Romain Robbes, Charles Sutton, and Andrea Janes. 2020. Big Code != Big Vocabulary: Open-Vocabulary Models for Source Code. In Proceedings of the ACM/IEEE 42nd International Conference on Software Engineering (Seoul, South Korea) (ICSE '20). Association for Computing Machinery, New York, NY, USA, 1073-1085. https://doi.org/10.1145/3377811.3380342
[20] Seohyun Kim, Jinman Zhao, Yuchi Tian, and Satish Chandra. 2020. Code Prediction by Feeding Trees to Transformers. arXiv:2003.13848 [cs.SE]
[21] Marie-Anne Lachaux, Baptiste Roziere, Lowik Chanussot, and Guillaume Lample. 2020. Unsupervised Translation of Programming Languages. arXiv preprint arXiv:2006.03511 [cs.CL]
[22] Alexander LeClair, Siyuan Jiang, and Collin McMillan. 2019. A Neural Model for Generating Natural Language Summaries of Program Subroutines. In Proceedings of the 41st International Conference on Software Engineering (Montreal, Quebec, Canada) (ICSE '19). IEEE Press, 795-806. https://doi.org/10.1109/ICSE.2019.00087
[23] Jian Li, Yue Wang, Michael R. Lyu, and Irwin King. 2018. Code Completion with Neural Attention and Pointer Networks. In Proceedings of the 27th International Joint Conference on Artificial Intelligence (Stockholm, Sweden) (IJCAI'18). AAAI Press, 4159-4165.
[24] Jian Li, Yue Wang, Michael R. Lyu, and Irwin King. 2018. Code Completion with Neural Attention and Pointer Networks. In Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence, IJCAI-18. International Joint Conferences on Artificial Intelligence Organization, 4159-4165. https://doi.org/10.24963/ijcai.2018/578
[25] I. Loshchilov and F. Hutter. 2017. SGDR: Stochastic Gradient Descent with Warm Restarts. In ICLR.
[26] Chris Maddison and Daniel Tarlow. 2014. Structured Generative Models of Natural Source Code. In Proceedings of the 31st International Conference on Machine Learning (Proceedings of Machine Learning Research, Vol. 32), Eric P. Xing and Tony Jebara (Eds.). PMLR, Bejing, China, 649-657. http://proceedings.mlr.press/ v32/maddison14.html
[27] Martin Monperrus. 2020. The Living Review on Automated Program Repair. (Dec. 2020). https://hal.archives-ouvertes.fr/hal-01956501 working paper or preprint.
[28] Veselin Raychev, Pavol Bielik, and Martin Vechev. 2016. Probabilistic Model for Code with Decision Trees. In Proceedings of the 2016 ACM SIGPLAN International Conference on Object-Oriented Programming, Systems, Languages, and Applications (Amsterdam, Netherlands) (OOPSLA 2016). Association for Computing Machinery, New York, NY, USA, 731-747. https://doi.org/10.1145/2983990.2984041
[29] Veselin Raychev, Pavol Bielik, and Martin Vechev. 2016. Probabilistic Model for Code with Decision Trees. SIGPLAN Not. 51, 10 (Oct. 2016), 731-747. https://doi.org/10.1145/3022671.2984041
[30] Veselin Raychev, Pavol Bielik, Martin Vechev, and Andreas Krause. 2016. Learning Programs from Noisy Data. In Proceedings of the 43rd Annual ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages (St. Petersburg, FL, USA) (POPL '16). Association for Computing Machinery, New York, NY, USA, 761-774. https://doi.org/10.1145/2837614.2837671
[31] Peter Shaw, Jakob Uszkoreit, and Ashish Vaswani. 2018. Self-Attention with Relative Position Representations. In Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 2 (Short Papers). Association for Computational Linguistics, New Orleans, Louisiana, 464-468. https://doi.org/10.18653/v1/N18-2074
[32] Vighnesh Shiv and Chris Quirk. 2019. Novel positional encodings to enable tree-based transformers. In Advances in Neural Information Processing Systems 32, H. Wallach, H. Larochelle, A. Beygelzimer, F. d'Alché-Buc, E. Fox, and R. Garnett (Eds.). Curran Associates, Inc., 12081-12091. http://papers.nips.cc/paper/9376-novel-positional-encodings-to-enable-tree-based-transformers.pdf
[33] Zeyu Sun, Qihao Zhu, Yingfei Xiong, Yican Sun, Lili Mou, and Lu Zhang. 2020. TreeGen: A Tree-Based Transformer Architecture for Code Generation. Proceedings of the AAAI Conference on Artificial Intelligence 34, 5 (Apr. 2020), 8984-8991. https://doi.org/10.1609/aaai.v34i05.6430
[34] Kai Sheng Tai, Richard Socher, and Christopher D. Manning. 2015. Improved Semantic Representations From Tree-Structured Long Short-Term Memory Networks. In Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 1: Long Papers). Association for Computational Linguistics, Beijing, China, 1556-1566. https://doi.org/10.3115/v1/P15-1150
[35] Yi Tay, Mostafa Dehghani, Dara Bahri, and Donald Metzler. 2020. Efficient Transformers: A Survey. arXiv:2009.06732 [cs.LG]
[36] Marko Vasic, Aditya Kanade, Petros Maniatis, David Bieber, and Rishabh Singh. 2019. Neural Program Repair by Jointly Learning to Localize and Repair. In 7th International Conference on Learning Representations, ICLR 2019, New Orleans, LA, USA, May 6-9, 2019. OpenReview.net. https://openreview.net/forum?id=ByloJ20qtm
[37] Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Łukasz Kaiser, and Illia Polosukhin. 2017. Attention is All you Need. In Advances in Neural Information Processing Systems 30, I. Guyon, U. V. Luxburg, S. Bengio, H. Wallach, R. Fergus, S. Vishwanathan, and R. Garnett (Eds.). Curran Associates, Inc., 5998-6008. http://papers.nips.cc/paper/7181-attention-is-all-you-need.pdf
[38] Yao Wan, Zhou Zhao, Min Yang, Guandong Xu, Haochao Ying, Jian Wu, and Philip S. Yu. 2018. Improving Automatic Source Code Summarization via Deep Reinforcement Learning. In Proceedings of the 33rd ACM/IEEE International Conference on Automated Software Engineering (Montpellier, France) (ASE 2018). Association for Computing Machinery, New York, NY, USA, 397-407. https://doi.org/10.1145/3238147.3238206
[39] Shengbin Xu, Yuan Yao, Feng Xu, Tianxiao Gu, Hanghang Tong, and Jian Lu. 2019. Commit Message Generation for Source Code Changes. In Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence, IJCAI-19. International Joint Conferences on Artificial Intelligence Organization, 3975-3981. https://doi.org/10.24963/ijcai.2019/552