GitHub Copilot & Co., Creators of Tech Debt?


Could copy and paste soon account for 20 to 25% of activity in code repositories? An American software publisher raises the question as an extension of a study on the impact of “artificial intelligence assistants.”

The growing use of these assistants coincides with a “downward trend in code quality,” the publisher explains. The evidence behind this claim is essentially quantitative: it tracks how the frequency of certain operations changed over four years (2020-2023).

There are six of these operations:

– Adding code (excluding incremental changes and lines restored after deletion)
– Deleting code (excluding lines reinserted less than two weeks later)
– Moving code (cut and pasted into a new file or a new function)
– Updating code (changing “about three words or less” on an existing line)
– Search and replace (removing the same string from at least three locations, then replacing it everywhere with the same new string)
– Copying and pasting code (writing identical lines in several files or functions within a single commit)
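To make the last category concrete, here is a minimal Python sketch of how copy and paste within a single commit might be detected. It is not the study's method, which is not described in detail here. The input format (a dictionary mapping each file path to the lines added in that file), the function names and the list of "trivial" lines are all assumptions made for illustration. The sketch only compares files, not functions, and it skips comments and keyword-only lines, in the spirit of the exclusions the study mentions.

```python
from collections import defaultdict

# Hypothetical filter: lines too trivial to count as duplication.
TRIVIAL_LINES = {"", "{", "}", "else:", "return", "pass", "break", "continue"}
COMMENT_PREFIXES = ("#", "//")


def is_significant(line: str) -> bool:
    """Return False for blank lines, lone keywords/braces and comments."""
    stripped = line.strip()
    if stripped in TRIVIAL_LINES:
        return False
    return not stripped.startswith(COMMENT_PREFIXES)


def copy_pasted_lines(added_by_file: dict[str, list[str]]) -> dict[str, set[str]]:
    """Map each significant line added in two or more files to those files."""
    locations = defaultdict(set)
    for path, lines in added_by_file.items():
        for line in lines:
            if is_significant(line):
                locations[line.strip()].add(path)
    return {line: paths for line, paths in locations.items() if len(paths) >= 2}


# Illustrative commit: the same computation is added to two files.
commit = {
    "billing.py": ["total = price * qty * (1 + TAX)", "# apply tax", "return"],
    "orders.py": ["total = price * qty * (1 + TAX)", "# apply tax", "log(total)"],
}
duplicates = copy_pasted_lines(commit)
```

In this example, only the `total = ...` line is reported as duplicated: the shared comment is filtered out, and the remaining lines each appear in a single file.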

The sample analyzed includes 153 million lines of code. About two-thirds of it comes from anonymized data collected from private companies (NextGen Health and Verizon are cited). The rest comes from open source projects, mainly those of Google, Facebook and Microsoft.

Does GitHub Copilot encourage copy and paste?

From a maintenance standpoint, the finding is “disturbing.” As evidence, the publisher points out that between 2020 and 2023, the rates of “add” and “copy-paste” operations rose faster than the rates of updates, deletions, and moves. It concludes that AI assistants, as currently implemented, do not encourage code reuse.

[Chart: rate evolution]

Because the analysis counted only duplication within the same commit, the overall rate of copy and paste (excluding keywords and comments) is likely to be higher than the measured figure (11% between 2022 and 2023). Hence the hypothesis of 20-25% for 2024.

The report dwells on one more indicator: the rate of lines reverted or updated less than two weeks after they were added. It is meant to capture changes that were either incomplete or wrong.
This rate increased significantly last year: +39.2%. The upward trend first appears in 2022, the year GitHub Copilot was launched in beta.
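As a rough illustration of this indicator, the following Python sketch computes the share of added lines that were changed less than two weeks later. The data layout (one pair of dates per added line) and the function name are hypothetical; the study's actual computation may differ.

```python
from datetime import date, timedelta
from typing import Optional

CHURN_WINDOW = timedelta(days=14)  # "less than two weeks", as stated above


def churn_rate(lines: list[tuple[date, Optional[date]]]) -> float:
    """Share of added lines reverted or updated within the churn window.

    Each item is (date_added, date_of_first_change), where the second
    value is None if the line was never changed.
    """
    if not lines:
        return 0.0
    churned = sum(
        1
        for added, changed in lines
        if changed is not None and changed - added < CHURN_WINDOW
    )
    return churned / len(lines)


# Illustrative history of four added lines (made-up dates).
history = [
    (date(2023, 3, 1), date(2023, 3, 5)),   # changed after 4 days: counted
    (date(2023, 3, 1), date(2023, 4, 20)),  # changed after 50 days: not counted
    (date(2023, 3, 2), None),               # never changed: not counted
    (date(2023, 3, 3), date(2023, 3, 10)),  # changed after 7 days: counted
]
rate = churn_rate(history)
```

With this made-up history, two of the four lines fall inside the window, so `rate` is 0.5.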

[Chart: annual evolution]

From there to claiming that developers usually opt for the “simple” suggestions is only a short step, and the authors of the study take it. They also ask at what point adding code becomes counterproductive, on the premise that the number of lines of code and the speed at which reviewers work are inversely proportional.

Further reading:

Copilot, but not only: How GitHub feeds LLMs
JetBrains AI: What you need to know
Git branches, a counterintuitive system?
Generative AI: By the numbers, GitHub appears to be “the place to be”
DevSecOps: these practices where France excels

Main illustration generated by artificial intelligence


