Are we ready for AI-generated code?

Over the past few months, we’ve marveled at the quality of computer-generated faces, cat photos, videos, essays, and even art. Artificial intelligence (AI) and machine learning (ML) have also quietly crept into software development, with tools like GitHub Copilot, Tabnine, Polycode, and others taking the next logical step of putting existing code completion functionality on AI steroids. Unlike cat photos, however, the source, quality, and security of application code can have far-reaching implications, and, at least when it comes to security, research shows the risk is real.

Previous academic research has already shown that GitHub Copilot often generates code with security vulnerabilities. More recently, practical analysis by Invicti security engineer Kadir Arslan showed that insecure code suggestions remain the rule rather than the exception with Copilot. Arslan found that suggestions for many common tasks included only the absolute bare bones, often taking the most basic and least secure route, and that accepting them without modification could result in functional but vulnerable applications.
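To make the “most basic and least secure route” concrete, here is a hypothetical illustration (not taken from actual Copilot output): a bare-bones database lookup built by string interpolation, of the kind an autocompletion tool might plausibly suggest, next to the parameterized version a reviewer should insist on. The table and function names are invented for the example.

```python
import sqlite3

def find_user_insecure(conn, username):
    # Bare-bones approach: user input is interpolated straight into the SQL
    # string, so an input like "' OR '1'='1" changes the query's meaning.
    query = f"SELECT id FROM users WHERE name = '{username}'"
    return conn.execute(query).fetchall()

def find_user_safe(conn, username):
    # Hardened approach: a parameterized query treats the input purely as data.
    return conn.execute(
        "SELECT id FROM users WHERE name = ?", (username,)
    ).fetchall()

# Demonstration with an in-memory database
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER, name TEXT)")
conn.execute("INSERT INTO users VALUES (1, 'alice'), (2, 'bob')")

payload = "' OR '1'='1"
leaked = find_user_insecure(conn, payload)  # injection returns every row
safe = find_user_safe(conn, payload)        # no user matches the literal string
```

Both functions are “functional” for well-behaved input, which is exactly why an unreviewed suggestion like the first one can quietly make it into production.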

A tool like Copilot is (by design) enhanced autocompletion, trained on open source code to suggest snippets that might be relevant in a similar context. This makes the quality and security of its suggestions closely tied to the quality and security of the training set. So the biggest questions are not about Copilot or any other specific tool, but about AI-generated software code in general.

It is reasonable to assume that Copilot is just the spearhead and that similar generators will become commonplace in the next few years. This means that we, the tech industry, need to start asking how that code is generated, how it is used, and who will take responsibility when things go wrong.

Satellite Navigation Syndrome

Traditional code completion that looks up function definitions to complete function names and remind you what arguments you need is a huge time saver. Because these suggestions are simply a shortcut to looking up the documentation yourself, we’ve learned to implicitly trust everything the IDE suggests. Once an AI-powered tool comes along, its suggestions are no longer guaranteed to be correct, but they still feel friendly and trustworthy, so they’re more likely to be accepted.

Especially for less experienced developers, the convenience of getting a free block of code encourages a mindset shift from “Is this code close enough to what I would write?” to “How can I modify this code to make it work for me?”.

GitHub makes it very clear that Copilot suggestions should always be carefully analyzed, reviewed, and tested, but human nature dictates that even mediocre code will occasionally make it to production. It’s a bit like driving while looking more at your GPS than the road.

Supply Chain Security Issues

The Log4j security crisis has moved software supply chain security, and specifically open source security, into the spotlight, with a recent White House memo on secure software development and a new Open Source Security Enhancement Bill. With these and other initiatives, any open source code in your applications may soon have to be recorded in a software bill of materials (SBOM), which is only possible if you knowingly include a specific dependency. Software composition analysis (SCA) tools also build on that knowledge to detect and flag outdated or vulnerable open source components.

But what if your app includes AI-generated code that ultimately originates from an open source training set? Theoretically, if a substantial suggestion is identical to existing code and is accepted as-is, you could have open source code in your software but not in your SBOM. This could lead to compliance issues, not to mention the possibility of liability if the code turns out to be insecure and leads to a breach, and SCA won’t help you, since it can only find vulnerable dependencies, not vulnerabilities in your own code.

Licensing and Attribution Pitfalls

Continuing with that train of thought, in order to use open source code, you must comply with its license terms. Depending on the specific open source license, you will at least need to provide attribution or sometimes publish your own code as open source. Some licenses prohibit commercial use entirely. Whatever the license, you need to know where the code comes from and how it’s licensed.

Again, what if you have AI-generated code in your app that turns out to be identical to existing open source code? If you were audited, would it emerge that you are using code without the required attribution? Or maybe you would need to open source some of your commercial code to remain compliant? That may not be a realistic risk with today’s tools, but these are the kinds of questions we should all be asking today, not 10 years from now. (And to be clear, GitHub Copilot has an optional filter to block suggestions that match existing public code, which minimizes these risks.)

Deeper Security Implications

Getting back to security, an AI/ML model is only as good (and as bad) as its training set. We have seen this in the past, for example when facial recognition algorithms showed racial bias because of the data they were trained on. So if we have research showing that a code generator frequently produces suggestions without regard for security, we can infer that this is what its training set (i.e., publicly available code) looks like. And what if insecure AI-generated code then feeds back into that code base? Can we ever expect suggestions to be safe?

Security questions don’t stop there. If AI-based code generators gain popularity and start to account for a significant proportion of new code, it is likely that someone will try to attack them. It is already possible to fool AI image recognition by poisoning its training set. Sooner or later, malicious actors will attempt to place deliberately vulnerable code in public repositories in the hope that it will show up in suggestions and eventually end up in a production application, opening it up to easy attack.

And what about monoculture? If multiple applications end up using the same highly vulnerable suggestion, whatever its source, we could be looking at exploit epidemics or even AI-specific vulnerabilities.

Watching the AI

Some of these scenarios may seem far-fetched today, but they are things that we in the technology industry need to discuss. Again, GitHub Copilot is in the spotlight only because it currently leads the way, and GitHub provides clear warnings about the caveats of AI-generated suggestions. Just like auto-complete on your phone or route suggestions on your sat nav, they’re just suggestions to make our lives easier, and it’s up to us to take them or leave them.

With their potential to dramatically improve development efficiency, AI-based code generators are likely to become a permanent part of the software world. In terms of application security, however, they are yet another source of potentially vulnerable code that must pass rigorous security testing before being allowed into production. We are looking at a brand new way for vulnerabilities (and potentially unvetted dependencies) to slip right into your own code, so it makes sense to treat AI-augmented codebases as untrusted until tested, and that means testing everything as often as you can.

Even relatively transparent ML solutions like Copilot already raise some legal and ethical questions, not to mention security concerns. But imagine that one day, some new tool starts generating code that works perfectly and passes security tests, except for one small detail: nobody knows how it works. That’s when it’s time to panic.
