AI in 2021
In 2021, AI technologies that were only recently considered cutting edge (e.g., AI that generates realistic but totally fabricated images and text) became accessible to non-expert developers, poising these techniques to enter the lexicon of adversary deception tactics. It was also a year in which new AI breakthroughs, such as OpenAI's and Google's AI systems that write working, college-level source code, promised continued AI impact on the way the cybersecurity game is played. And it was the year in which Google DeepMind demonstrated that its AlphaFold deep learning approach had solved the protein structure prediction problem, seminal work that's been compared to the sequencing of the human genome.
Within the security product community, 2021 marked the completion of a paradigm shift: the industry came to recognize machine learning (ML) as an indispensable component of modern detection pipelines and to integrate it as a first-class citizen alongside traditional detection technologies. In the 2020s, the mere fact that a vendor uses ML in a particular protection technology will not be noteworthy – it will be table stakes. The real questions will be how effective companies' AI detection solutions are, and what novel capabilities, outside autonomous detection workflows, security companies are developing with AI.
AI is increasingly accessible to threat actors
As the 2020s began, AI completed its transition from a specialist discipline to a technology ecosystem in which advanced research labs' successful prototypes quickly become open-source software components, accessible to benign software developers and malevolent adversaries alike.
For example, OpenAI's GPT-2 text generation model, which OpenAI kept under lock and key in 2019 to prevent its use by bad actors, has since been reproduced by independent researchers and can now be spun up by the general public, with startups like HuggingFace and services like Amazon's SageMaker pioneering a kind of point-and-click AI for content providers.
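To illustrate how low that barrier now is, consider the following minimal sketch, which assumes only the open-source Hugging Face transformers package and the publicly hosted gpt2 checkpoint; a few lines suffice to generate fluent, entirely fabricated text:

# Minimal sketch: generating synthetic text with an off-the-shelf GPT-2 model.
# Assumes the open-source `transformers` package (pip install transformers),
# which downloads the public "gpt2" checkpoint on first use.
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")

# A single call yields fluent but entirely fabricated prose.
result = generator(
    "Breaking news: researchers have discovered",
    max_new_tokens=40,      # cap the length of the generated passage
    num_return_sequences=1, # one sample is enough for illustration
)
print(result[0]["generated_text"])

The point is not this particular snippet, but that capabilities once gated inside a research lab are now a package install away.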
[Figure: Bigger neural networks are better at solving problems]
Related to this, generative adversarial networks (GANs), which can synthesize completely fabricated images that look real, have progressed from a research toy in 2014 to a potent adversarial weapon, as famously illustrated by Ian Goodfellow, the inventor of GANs, in a tweet charting the rapid improvement of GAN-generated faces. In 2021, GANs were accessible to non-expert adversaries seeking to wage disinformation campaigns and spoof social media profiles.
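For readers unfamiliar with the mechanics, the toy sketch below (an illustrative example, not any particular published model) shows the adversarial training loop at the heart of every GAN: a generator learns to fabricate samples while a discriminator learns to tell fabrications from real data:

# Toy GAN training step in PyTorch. The architecture and sizes are
# illustrative assumptions, not a production image-synthesis model.
import torch
import torch.nn as nn

latent_dim, data_dim = 16, 784  # e.g., a flattened 28x28 grayscale image

generator = nn.Sequential(
    nn.Linear(latent_dim, 128), nn.ReLU(),
    nn.Linear(128, data_dim), nn.Tanh(),
)
discriminator = nn.Sequential(
    nn.Linear(data_dim, 128), nn.LeakyReLU(0.2),
    nn.Linear(128, 1), nn.Sigmoid(),
)

loss_fn = nn.BCELoss()
g_opt = torch.optim.Adam(generator.parameters(), lr=2e-4)
d_opt = torch.optim.Adam(discriminator.parameters(), lr=2e-4)

def training_step(real_batch: torch.Tensor) -> None:
    n = real_batch.size(0)
    fake_batch = generator(torch.randn(n, latent_dim))

    # Train the discriminator: real samples are labeled 1, fakes 0.
    d_opt.zero_grad()
    d_loss = (loss_fn(discriminator(real_batch), torch.ones(n, 1)) +
              loss_fn(discriminator(fake_batch.detach()), torch.zeros(n, 1)))
    d_loss.backward()
    d_opt.step()

    # Train the generator: reward it when the discriminator calls fakes real.
    g_opt.zero_grad()
    g_loss = loss_fn(discriminator(fake_batch), torch.ones(n, 1))
    g_loss.backward()
    g_opt.step()

As the two networks compete, the generator's fabrications become progressively harder to distinguish from real data; this is the property that makes GAN output so convincing.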
While we have not yet seen widespread adversary adoption of these new technologies, we expect to see it in the coming years – for example, in the generation of watering-hole attack web content and phishing emails.
Not far behind them in the AI "industrialization pipeline" will be neural network voice synthesis and video deepfake technologies, which are less mature than their counterparts in the image and text domains.
The ongoing surprises from AI
Since the 2010s, breakthroughs in neural network vision and language technologies have disrupted the way we practice defensive cybersecurity. For example, most security vendors now use vision and language-inspired neural network technologies to help detect threats.
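To make "vision-inspired" concrete, the hypothetical sketch below, written in the spirit of published byte-level detectors such as MalConv (and not representing any vendor's production pipeline), treats a file's raw bytes much like pixels and scans them with a convolutional network to score maliciousness:

# Hypothetical byte-level malware detector in PyTorch, in the spirit of
# published models such as MalConv. Sizes and layers are illustrative.
import torch
import torch.nn as nn

MAX_LEN = 4096  # pad or truncate every file to this many bytes

class ByteConvDetector(nn.Module):
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(257, 8, padding_idx=256)  # 256 byte values + pad
        self.conv = nn.Conv1d(8, 64, kernel_size=16, stride=4)
        self.head = nn.Linear(64, 1)

    def forward(self, byte_ids: torch.Tensor) -> torch.Tensor:
        x = self.embed(byte_ids).transpose(1, 2)  # (batch, channels, length)
        x = torch.relu(self.conv(x))
        x = x.max(dim=2).values                   # global max-pool over the file
        return torch.sigmoid(self.head(x))        # P(file is malicious)

# Usage: score one file (the model is untrained here, so the score is noise).
model = ByteConvDetector()
raw = open("/bin/ls", "rb").read()[:MAX_LEN]
ids = torch.tensor([list(raw) + [256] * (MAX_LEN - len(raw))])
print(model(ids).item())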
This year we’ve seen further proof that neural network technology will continue to disrupt old and new areas of cyber defense. Two innovations stand out:
First, a team at Google DeepMind has produced a breakthrough system, AlphaFold, for predicting the three-dimensional structure of proteins from their amino acid sequences, an accomplishment widely recognized as positively disruptive to biology and medicine. While the crossover of this kind of technology to security has not been fully explored, the AlphaFold breakthrough suggests that, as they have in biology, neural networks may hold a key to solving problems once thought intractable in security.
Second, and similarly noteworthy, are the breakthroughs researchers have achieved in applying neural networks to source code generation. Teams at both Google and OpenAI independently demonstrated that neural networks can produce working source code from unstructured, natural language instructions (see the sketch below). Such demonstrations suggest that it is only a matter of time before adversaries adopt neural networks to reduce the cost of generating novel or highly variable malware.
These demonstrations also make it imperative that defenders investigate source-code-aware neural networks as a means of better detecting malicious code.
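To show how accessible natural-language-to-code generation already is, here is a hypothetical sketch using the open-source transformers package; the model name Salesforce/codegen-350M-mono is a publicly available stand-in for the proprietary systems the research teams demonstrated:

# Hypothetical sketch: turning a natural language comment into code with an
# open-source model. "Salesforce/codegen-350M-mono" stands in for the
# proprietary Google and OpenAI systems, which are not publicly downloadable.
from transformers import pipeline

codegen = pipeline("text-generation", model="Salesforce/codegen-350M-mono")

prompt = "# Python function that checks whether a number is prime\ndef is_prime(n):"
completion = codegen(prompt, max_new_tokens=64)[0]["generated_text"]
print(completion)

The same interface, pointed at a sufficiently capable model, lowers the skill floor for producing highly variable code, for defenders and adversaries alike.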
These developments add up to one central takeaway: the AI revolution is far from over, and security practitioners would be wise to keep pace with it and find defensive applications of new AI ideas and technologies.
Cybersecurity’s pivot to AI
In 2022 and beyond, innovative cybersecurity companies will distinguish themselves by demonstrating new machine learning applications. At Sophos, we see two key areas of innovation.
The first is the underexplored domain of user-facing security machine learning. We believe that in the coming years, user-facing ML will make IT security products as intuitive at making security recommendations as Google is at finding web pages and Netflix is at recommending content. The resulting AI-driven security operations center (SOC) will feel dramatically easier to use and more efficient than today's SOCs.
In summary, artificial intelligence is changing at a dizzying pace. New tricks become old, and old tricks are refined, polished, and commoditized for the developer masses on timescales of months or a few short years. And while what once seemed impossible often becomes possible through deep learning, some hyped capabilities, like vehicle autonomy, remain stubbornly hard.
A few things are clear: AI developments will have tectonic implications for the security landscape. They will shape the development of defensive security technologies, and as AI capabilities grow, the security community will identify novel applications for them.