Overcoming Challenges in Applying AI Guidance to Complex and Legacy Codebases
Keywords:
AI guidance, legacy codebases, code complexity analysis, software maintenance, code metrics, AI integration
Abstract
This paper investigates the challenges of applying AI guidance to complex and legacy codebases. Several AI models were assessed and tuned to improve their effectiveness at analyzing legacy code and offering guidance on it. We analyzed five diverse codebases in depth for code complexity, capturing metrics that cover functions, classes, and method calls, among others. Python was used for simulation, and model fine-tuning was carried out with TensorFlow/Keras. We fine-tuned a pre-trained AI model so that it better matched the characteristics of the legacy code. When tested, the fine-tuned model reached an accuracy of 84% at a performance overhead of 45%. Our results show the effect of AI tools on performance and contrast scenarios with and without AI guidance. Visualizations of the performance overhead and accuracy metrics expose these trade-offs and can help stakeholders weigh the value that AI creates against the cost it incurs. The study draws several lessons for optimizing AI tools to work with complex codebases and offers guiding principles for applying AI effectively in software maintenance and improvement.
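As an illustration of the kind of complexity metrics mentioned above, the following is a minimal sketch of how functions, classes, and method calls could be counted for Python source using the standard-library `ast` module. It is not the authors' tooling: the function name `code_metrics`, the `SAMPLE` snippet, and the choice to treat attribute-style calls such as `obj.method()` as "method calls" are assumptions made for this example.

```python
import ast
from collections import Counter


def code_metrics(source: str) -> dict:
    """Count functions, classes, and call expressions in Python source.

    Hypothetical helper for illustration; the metric set is an assumption.
    """
    tree = ast.parse(source)
    counts = Counter()
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            counts["functions"] += 1
        elif isinstance(node, ast.ClassDef):
            counts["classes"] += 1
        elif isinstance(node, ast.Call):
            counts["calls"] += 1
            # Assumption: a call through attribute access (obj.method())
            # is counted as a method call.
            if isinstance(node.func, ast.Attribute):
                counts["method_calls"] += 1
    keys = ("functions", "classes", "calls", "method_calls")
    return {key: counts[key] for key in keys}


# Small made-up snippet standing in for a legacy module.
SAMPLE = '''
class Account:
    def __init__(self, balance):
        self.balance = balance

    def deposit(self, amount):
        self.balance += amount
        self.log("deposit")

    def log(self, msg):
        print(msg)

def main():
    acct = Account(10)
    acct.deposit(5)
'''

if __name__ == "__main__":
    print(code_metrics(SAMPLE))
```

Per-file counts like these could be aggregated across a codebase to compare modules by size and call density; legacy code in other languages would need a parser for that language instead of `ast`.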
License
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
License Terms
Ownership and Licensing:
Authors of this research paper submitted to the journal owned and operated by The Science Brigade Group retain the copyright of their work while granting the journal certain rights. Authors maintain ownership of the copyright and have granted the journal a right of first publication. Simultaneously, authors agreed to license their research papers under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0) License.