Overcoming Challenges in Applying AI Guidance to Complex and Legacy Codebases

Authors

  • Mikita Piastou, Full-Stack Developer, Emplifi, Calgary, AB, Canada

Keywords:

AI guidance, legacy codebases, code complexity analysis, software maintenance, code metrics, AI integration

Abstract

This paper investigates the challenges of applying AI guidance to complex and legacy codebases. Several AI models were assessed and tuned to improve their effectiveness at analyzing legacy code and providing guidance on it. We analyzed five diverse codebases for code complexity, capturing metrics that cover functions, classes, and method calls, among others. Python was used for simulation, and model fine-tuning was carried out with TensorFlow/Keras. We fine-tuned a pre-trained AI model so that it better matched the characteristics of the legacy code. The fine-tuned model was then tested and achieved an accuracy of 84% with a performance overhead of 45%. Our results show the effect of AI tools on performance and contrast scenarios with and without AI guidance. Visualizations of the performance overhead and accuracy metrics illustrate these trade-offs and can help stakeholders weigh the value created by AI against the cost it incurs. The study highlights lessons for optimizing AI tools to work with complex codebases and offers guiding principles for applying AI effectively in software maintenance and improvement.
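
As an illustration of the kind of metric capture the abstract describes, the following minimal sketch counts functions, classes, and method calls in a piece of source code. It is not the paper's implementation: it assumes the analyzed code is Python and uses the standard-library `ast` module, and the names `collect_metrics` and `SAMPLE` are hypothetical.

```python
import ast
from collections import Counter


def collect_metrics(source: str) -> Counter:
    """Count functions, classes, and method calls in Python source text.

    Hypothetical helper for illustration; the paper does not specify
    how its metrics were extracted.
    """
    counts = Counter(functions=0, classes=0, method_calls=0)
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            counts["functions"] += 1
        elif isinstance(node, ast.ClassDef):
            counts["classes"] += 1
        # A call whose callee is an attribute access (obj.method(...))
        # is treated here as a method call; plain name calls are skipped.
        elif isinstance(node, ast.Call) and isinstance(node.func, ast.Attribute):
            counts["method_calls"] += 1
    return counts


# Small made-up snippet standing in for a legacy module.
SAMPLE = '''
class Ledger:
    def __init__(self):
        self.rows = []

    def add(self, row):
        self.rows.append(row)

def total(ledger):
    return sum(r.amount for r in ledger.rows)
'''

metrics = collect_metrics(SAMPLE)
print(dict(metrics))
```

Walking the syntax tree rather than matching text keeps the counts robust to formatting, which matters for inconsistently styled legacy code; a real analysis would run such a function over every file in a codebase and aggregate the results.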

Published

17-04-2024

How to Cite

[1]
M. Piastou, “Overcoming Challenges in Applying AI Guidance to Complex and Legacy Codebases”, J. of Art. Int. Research, vol. 4, no. 1, pp. 312–331, Apr. 2024.