Science & technology | Automating programming

AI is transforming the coding of computer programs

The software engineers of the future will, themselves, be software

GPT-3 IS QUITE a beast. The Generative Pre-Trained Transformer 3, to give its full name, is a language model developed by OpenAI, a part-commercial, part not-for-profit artificial-intelligence (AI) laboratory in San Francisco. GPT-3 was trained on an unprecedented mass of text to teach it the probability that a given word will follow preceding words. When fed a short text “prompt”, it cranks out astonishingly coherent prose written in a similar style.
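The idea of learning "the probability that a given word will follow preceding words" can be illustrated at toy scale. The sketch below is not how GPT-3 works internally (it is a far larger neural network that weighs long stretches of preceding text); it simply counts which word follows which in a tiny made-up corpus. The function names and the corpus are hypothetical, chosen for illustration.

```python
from collections import Counter, defaultdict


def train_bigram_model(text):
    """Count, for every word, which words follow it in the training text."""
    words = text.lower().split()
    follow_counts = defaultdict(Counter)
    for current_word, next_word in zip(words, words[1:]):
        follow_counts[current_word][next_word] += 1
    return follow_counts


def next_word_probabilities(model, word):
    """Turn the raw counts for `word` into probabilities that sum to 1."""
    counts = model.get(word.lower(), Counter())
    total = sum(counts.values())
    if total == 0:
        return {}  # the word never appeared with a successor in training
    return {candidate: n / total for candidate, n in counts.items()}


# Hypothetical toy corpus; a real model is trained on vastly more text.
corpus = "the cat sat on the mat and the cat slept"
model = train_bigram_model(corpus)
probs = next_word_probabilities(model, "the")
print(probs)
```

In this corpus "the" is followed by "cat" twice and "mat" once, so the model rates "cat" as twice as likely as "mat" to come next.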


Access to GPT-3 is restricted. For one thing, says Jack Clark, former head of policy at the organisation, it might otherwise be used to mass-produce fake news or flood social media with “trolling and griefing” messages. But OpenAI also knows that GPT-3 is commercially valuable. Last year the laboratory started letting vetted firms buy its output for approved uses. These include producing answers to typed questions about products, and powering the speech of fictional characters in virtual worlds. But perhaps most important, GPT-3 can also be used to write computer code.

Several firms are already using GPT-3 and its predecessor GPT-2 to add AI to the software that their programmers use to write code. Much of what these programmers type out has already been written elsewhere at some point in the past. This means that, by being fed oodles of pre-existing code, such packages can be trained to predict the lines a programmer needs next. As a programmer types, potential “code completions” of one or a few lines pop up on the screen.

Predict and provide

One company that has created such an AI-completion feature is Tabnine, of Tel Aviv. Tabnine used GPT-2 to feed so much code to its programming software, also named Tabnine, that this software gained a sort of “world knowledge”, says Eran Yahav, the firm’s top technologist. Dr Yahav describes this as “a pretty good notion of how the world behaves”, at least when it comes to programming-speak. Tabnine software may detect that a user has begun to type code to handle, say, purchase orders. It will then suggest code to display product names and prices, as well as code to create fields to be filled with quantities, payment and delivery data. It works even though Tabnine has never been specifically instructed to do that.

Some coding sequences are rare. In these cases, Tabnine lengthens its pop-up list of suggested completions to increase the likelihood of offering a useful one. By clicking on one that is appropriate, the programmer teaches Tabnine to perform better. Tabnine’s professional version seems “almost intelligent” in its ability to understand a programmer’s intent, according to Dror Weiss, the firm’s boss.
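One way a completion list might grow when the model is unsure is sketched below. This is an assumption made for illustration, not Tabnine's actual method: the function `suggest_completions`, its `coverage` and `max_items` parameters, and the probabilities are all hypothetical. The list keeps adding ranked suggestions until their combined probability passes a threshold, so a confident prediction yields a short list and a rare sequence yields a longer one.

```python
def suggest_completions(candidates, coverage=0.8, max_items=10):
    """Return ranked completions until their combined probability reaches `coverage`.

    `candidates` maps each possible completion to the model's estimated probability.
    """
    ranked = sorted(candidates.items(), key=lambda item: item[1], reverse=True)
    chosen = []
    cumulative = 0.0
    for completion, probability in ranked:
        chosen.append(completion)
        cumulative += probability
        if cumulative >= coverage or len(chosen) >= max_items:
            break
    return chosen


# A common pattern: one completion dominates, so the list stays short.
common = {"for i in range(n):": 0.85, "for item in items:": 0.10, "for key in d:": 0.05}

# A rare pattern: probability is spread thinly, so the list lengthens.
rare = {"a": 0.30, "b": 0.25, "c": 0.20, "d": 0.15, "e": 0.10}

print(suggest_completions(common))
print(suggest_completions(rare))
```

With these made-up numbers the common case offers a single suggestion and the rare case offers four.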

Tabnine is not alone. On June 17th Microsoft, an American software giant, released a new version of an AI-completion feature which it embeds in coding software called Visual Studio. The original version, released in 2018 and named IntelliCode, was trained on a few thousand online repositories in which code for programming projects is stored. Microsoft trained its upgraded system on more than half a million such repositories. Amanda Silver, one of the executives in charge of Visual Studio, says these extra heaps of training fodder allow the new version to glean intent better from hints in code that a programmer has already written.

The purpose of all this, of course, is to save time. Kite, a firm in San Francisco, claims its AI-completion products cut the number of keystrokes required for some tasks by nearly half. Overall efficiency gains, however, are lower. Vitaly Khudobakhshov, head of AI products at the St Petersburg office of JetBrains, a Czech developer of programming software, sees time savings of 10% to 20%. In the view of Sharif Shameem, the boss of Debuild, a firm in San Francisco that uses GPT-3 to help build websites, the technology also reduces “cognitive overhead”. Selecting from multiple choices is less taxing than devising solutions from scratch.

Bugs and the system

Nor are those who write code the only beneficiaries. Developers spend nearly as much time searching for bugs in what they have written as they do writing it in the first place. A machine-learning model being built by Brendan Dolan-Gavitt of New York University may speed up the debugging process.

To train it, Dr Dolan-Gavitt is collecting code labelled as buggy by GitHub, a Microsoft subsidiary that hosts the biggest collection of non-proprietary “open source” code in the world. By one estimate, GitHub holds at least a billion snippets of code identified as harbouring a bug. Dr Dolan-Gavitt’s model, provisionally called GPT-CSRC, will devour that code this summer.

Another bug-spotting model is in development at the Massachusetts Institute of Technology (MIT). Shashank Srikant, a PhD student working on the project, says the goal is to train the model to recognise not just inadvertent bugs, but also maliciously inserted vulnerabilities. Rogue employees are sometimes behind trickery of this sort, which is intended to do things like secretly gain access to passwords. The practice is most common, however, in open-source programming projects to which anyone can contribute. Human reviewers typically struggle to spot these “vulnerability injections”, as they are sometimes known.

The reason, Mr Srikant says, is that, in a bid to slip their handiwork past reviewers, devious coders often use deceptive but purely cosmetic names for things like the variables handled by a program. The team at MIT is therefore training its model to flag discrepancies between snippets’ labels and their actual functionality. The difficulty is that good examples of such mischief are much rarer than ordinary errors.

There is, however, an additional sign that a vulnerability injection may be lurking. Malicious coders often conceal these by writing superfluous code intended to throw off reviewers, so Mr Srikant is also feeding MIT’s model with examples of this type of potentially telltale code, which he describes as “dangling” and “dead”.
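The simplest kind of "dead" code, statements that can never run, can be found mechanically. The sketch below is not the MIT team's model, which is a trained machine-learning system; it is a plain rule-based check, using Python's standard `ast` module, that flags statements placed after a `return`, `raise`, `break` or `continue` in the same block. The function name `find_dead_statements` and the sample snippet are hypothetical.

```python
import ast

# Statements after any of these, in the same block, can never execute.
TERMINATORS = (ast.Return, ast.Raise, ast.Break, ast.Continue)


def find_dead_statements(source):
    """Return sorted line numbers of unreachable statements in `source`."""
    dead_lines = []
    for node in ast.walk(ast.parse(source)):
        # Blocks of statements live in these fields on functions, loops, ifs, etc.
        for field in ("body", "orelse", "finalbody"):
            block = getattr(node, field, None)
            if not isinstance(block, list):
                continue
            for index, statement in enumerate(block):
                if isinstance(statement, TERMINATORS) and index + 1 < len(block):
                    dead_lines.extend(s.lineno for s in block[index + 1:])
                    break
    return sorted(dead_lines)


# Hypothetical snippet: the call to log_attempt sits after a return.
sample = """
def check_password(supplied, stored):
    if supplied == stored:
        return True
        log_attempt(supplied)
    return False
"""

print(find_dead_statements(sample))
```

A check like this catches only the crudest cases. Superfluous code written to mislead a reviewer may well be reachable, which is presumably why a learned model is being tried.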

The clear destination of all this activity is the creation of software programmers which can, like the human variety, take an idea and turn it into code. An inkling of things to come is provided by a website created by Dr Dolan-Gavitt. Named “This Code Does Not Exist”, it asks programmers to determine if sections of code dozens of lines long were written by a human or a model based on GPT-2 that he has built. Of more than 329,200 assessments made, less than 51% have been correct. That is only a shade better than random.

Machines, it turns out, are now able to write even longish sequences of functioning code. As John Carmack, a noted American computer engineer, has tweeted, pondering this development “does generate a slight shiver”. Unsurprisingly, a number of firms see an opportunity.

One is a Parisian firm called SourceAI. It is designing software into which users type, in natural language, a request for code, such as something that will work out the numbers in a mathematical series called the Fibonacci sequence. By tapping into GPT-3, SourceAI’s eponymous software churns out the desired lines of code in a range of programming languages.
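For readers unfamiliar with it, the Fibonacci sequence starts with 0 and 1, and every later number is the sum of the two before it. The snippet below is a hand-written illustration of the sort of code such a request might yield. It is not output from SourceAI or GPT-3, and the function name is a hypothetical choice.

```python
def fibonacci(n):
    """Return the first n Fibonacci numbers, starting 0, 1, 1, 2, ..."""
    sequence = []
    a, b = 0, 1
    for _ in range(n):
        sequence.append(a)
        a, b = b, a + b  # each new number is the sum of the previous two
    return sequence


print(fibonacci(10))  # prints [0, 1, 1, 2, 3, 5, 8, 13, 21, 34]
```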

Debuild is testing the same idea. It is trying to create software that lets non-programmers describe, in plain English, a program they want to create, and will then write it. A request for, say, a barbershop app that lets patrons choose a barber and an appointment slot can already produce more or less just that. Mr Shameem says the goal is to sweep away the minutiae of code-typing, so that people can focus on what they want done, not how to instruct computers to do it.

For its part, Microsoft is also using GPT-3 to power what it calls “no code/low code” programming. Charles Lamanna, who leads the work, envisages a bright future of cheaper software created by untrained “citizen developers”. Some folk fear an alternative, darker outcome. Might AIs eventually write whatever code they fancy running? No such runaway feedback loop is around the corner. But that mainstay of science fiction does now appear a little less far-fetched.

A version of this article was published online on July 7th 2021

This article appeared in the Science & technology section of the print edition under the headline "The software software engineers"

From the July 10th 2021 edition
