Can Strategies Used Against Computer Viruses Help Us Fight Biological Viruses?
28 Apr 2020
Cyber security often takes inspiration from nature. This isn’t seen just in the way the business names things—defenders also seek to emulate the human immune system when they hunt down and neutralize the cyber equivalents of biological viruses, bacterial infections, and parasites. Conversely, the world of healthcare has rarely paused to consider if the strategies employed by its cyber cousin might help it to improve how it predicts, prevents, and tackles epidemics and pandemics such as COVID-19.
At face value, we are dealing with two very different hosts—one a biological organism formed by millions of years of evolution, the other involving computer systems coded by humans over decades. We are also dealing with very different types of invaders: The first group is an array of biological organisms; the second, a range of human-made malicious software programs. Biological viruses can cause suffering and even death, and while cyberattacks may cause indirect suffering to people if they’re made on health systems, their impact is mostly loss of business, money, data, and/or reputation.
As a result, some experts, even in the cyber-security business, are skeptical of comparisons between computer and biological viruses. “Other than the name ‘virus’, there’s not a lot in common between them,” says David Emm, Principal Security Researcher at Kaspersky. “No computer malware has impacted us in the way that COVID-19 has. That’s not to say that the impact of computer malware can’t be severe—it can be and often is.”
Stepping back, however, there is considerable affinity between the way global healthcare deals with biological pathogens and the computer industry battles cyber threats. “There are significant parallels between battling widespread biological and computer viruses,” says Rick McElroy, Cybersecurity Strategist at VMware Carbon Black. “At a high level, prevention, detection, and response are critical to success with both.”
One reason for this convergence is the growing influence of biology, nature, and evolution on the development of both malware and the systems and strategies used to defeat it. One of the earliest recognitions of this was a 1997 paper by Forrest et al. on computer immunology.
“The strategies to combat malware and biological viruses are very similar, because in the cyber world we take inspiration from natural/biological systems,” explains Nur Zincir-Heywood, Professor of Computer Science at Dalhousie University, Canada. Her research often exemplifies the lifelike nature of cyber security, as seen in her 2019 paper on Darwinian malware detectors, which highlights the need for detectors of malicious software (aka malware) that are “bred, rather than built.”
The parallels would suggest that the medical and cyber-security worlds would benefit from joint research, collaboration, and sharing of each other’s threat analysis and defense strategies, particularly as each of these becomes increasingly data-led.
However, collaboration is surprisingly rare. We could find only one example of the computer-security industry stepping up to offer advice on COVID-19—a study titled Cybersecurity Lessons for the COVID-19 Pandemic, which was published in early April by security and AI experts at Cisco in the Czech Republic. In it, the authors highlight how their experience of detecting and mitigating cyber threats pointed to the need for the Czech government to ramp up testing and introduce intelligent segmentation and quarantine strategies. And they tell us that, to date, the paper has received no traction with their government.
Computer virus vs. biological virus
The term “computer virus” was coined in a 1984 paper by the computer scientist Fred Cohen. The name is apt. A biological virus is inert until it combines with a particular type of host cell, causing illness and using the cell’s proteins to reproduce itself. Similarly, a computer virus is computer code that combines with a particular type of computer program, causing harm and reproducing as the program is shared (often by email). “The underlying premise involved in the way both computer and biological viruses invade, damage and [effect] onward transmission is very similar,” confirms Bharat Mistry, Principal Security Strategist at Trend Micro.
In the decades since Cohen’s paper, the threat from malware has ballooned. “The first big difference between malware and biological viruses is the sheer quantity of new unique computer malware found daily, circa 350,000,” adds Mistry. “The second big difference is the speed of transmission or infection. Biological infections start in one area and stay there unless infected people physically leave and pass it on, which could take days, maybe weeks. Because most computers are connected to the internet, an infection on one side of the world will propagate to the other side in a matter of hours or minutes.”
Malware has become increasingly sophisticated, attacking vulnerabilities in myriad ways. It includes newer, even more dangerous types, such as worms (named after the human-parasitic worm), Trojans, and ransomware, that are not computer viruses according to the strict definition but are commonly referred to as such. “We have some real monsters from hell,” explains Ivan Zelinka, Professor of Computer Science at the VSB Technical University of Ostrava, in the Czech Republic. “There are viruses that will mutate numerous times but keep the same functionality. Ones that use encryption, ones that use stealth technologies (so the antivirus cannot identify it even when it’s right in front of it), and ones that use Darwinian evolution to metamorphose or join themselves into a random, swarm-like attack on their victim, for example Stuxnet.”
Compared with modern-day computer malware, biological viruses appear quite simple. “If Sars-CoV-2 [the virus that causes COVID-19] were transformed into malware, then it would look like an old-fashioned PC virus,” says Zelinka, who has contributed to numerous papers on the evolution of malware and of the defenses against it, as well as the use of AI in cybersecurity.
Though simple, biological viruses have evolved over eons to be remarkably effective, and there is nothing straightforward about assessing the impact that a virus such as Sars-CoV-2 has on the human body or how to address it. The fallout from the virus, in terms of illness, deaths, the halting of the global economy, and the political and cultural ramifications, dwarfs anything achieved by any computer malware to date. “Sars-CoV-2 is really tight: it’s just 30,000 symbols, which is maybe equivalent to 7KB to 8KB of data. I don’t know any computer virus that small and which can accomplish so much. Such a short amount of information can turn the world upside down,” says computational geneticist Yaniv Erlich, who is the CSO of MyHeritage and a former Professor of Computer Science at Columbia University. “I guess if you add together all the computer-security break-ins, it wouldn’t be a fraction of all the suffering and monetary loss that this has caused.” Among his research projects on genetics, Erlich was part of the first team to encode a computer virus, along with a movie and gift card, into a speck of DNA (don’t worry, it’s not dangerous).
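Erlich’s estimate of 7 to 8 kilobytes can be checked with a little arithmetic. The sketch below assumes that each of the roughly 30,000 genome “symbols” is one of four nucleotide letters and is stored with a fixed-width encoding of 2 bits per letter; the variable names are illustrative and do not come from Erlich.

```python
# Back-of-envelope check of the "30,000 symbols is about 7 to 8 kilobytes" figure.
# Assumption: each symbol is one of four nucleotide letters (A, C, G, U),
# so a fixed-width encoding needs log2(4) = 2 bits per symbol.
import math

GENOME_LENGTH = 30_000   # approximate number of symbols, as quoted in the article
ALPHABET_SIZE = 4        # A, C, G, U

bits_per_symbol = math.log2(ALPHABET_SIZE)
total_bits = GENOME_LENGTH * bits_per_symbol
total_bytes = total_bits / 8

print(f"{total_bytes:.0f} bytes = {total_bytes / 1000:.1f} kilobytes")
```

Under these assumptions the genome fits in 7,500 bytes, which sits inside the range Erlich quotes.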
Despite the tsunami of cyberattacks over the past four decades, no single cyber pathogen has (yet) caused a calamity of pandemic proportions, which is surprising considering the interconnectedness of the IT world. This could be testament to the effectiveness of the security business, or it could be because the criminals have yet to come up with the dream ticket. Either way, it presents a good reason for interdisciplinary collaboration.
Erlich believes that the way cyber-security businesses continually search for potential threats should demonstrate to global healthcare how imperative it is to search for potential new pathogens before they strike. “What we are missing is biosurveillance,” he says. “The computer world is very good at doing vulnerability research—always trying to find the zero-day hacks, trying to discover the loopholes before the bad guys use them. Every few years, we see a new outbreak—flu, Sars, Mers, Ebola. These viruses don’t come out of nowhere. They are living in animals before they pass to humans. The question is can we monitor them? Then we can warn people that something is going on over here. You should be prepared. We need to take these steps.”
Cyber security focuses on threat intelligence to prevent attacks before they happen or become widespread. Organizations hunt for new, or new variations of, malware using techniques such as honeypots, which entrap malicious software or criminal activity ready for analysis. Organizations also hunt for vulnerabilities in commonly used computer software, such as Microsoft Windows, which can be exploited by criminals and their malware tools to breach computer systems. On finding such threats, the security vendors update their products and services against attack, and software vendors issue a security update to patch the vulnerability. “Vulnerable hosts are systems and computers that have unpatched security holes, which can be found in out-of-date software from Microsoft, Adobe, Mozilla or Google,” explains Alexander Vukcevic, Director of Protection Labs and QA at Avira. “Many types of malware are actively looking for exploitable systems around them once they infect a machine. A software patch can, in this case, be compared with a vaccine, one that builds immunity to a certain type of exploit attack.”
The importance of this system of threat intelligence is well illustrated by what happens when the system breaks down. In 2017, criminals used malware to cripple hundreds of thousands of computer systems and demand ransoms to put them right. Reportedly, the attack exploited a vulnerability in Microsoft software that the National Security Agency (NSA) knew about (and used itself) but didn’t warn Microsoft about. “The infamous WannaCry malware family leveraged the leaked NSA exploit EternalBlue to spread in a virus-like manner, infecting several systems and holding them to ransom in order to decrypt the files on the systems. It had a very wide impact, like the COVID-19 incident,” says Vukcevic.
The equivalent of the EternalBlue fiasco in health is a new pathogen that emerges but is not quickly identified or reported to global health communities. If the Chinese authorities had warned the rest of the world earlier about COVID-19, could its spread have been mitigated? Imagine if Sars-CoV-2 had been known about and researched before it broke out in Wuhan. It might not have been possible to prevent the outbreak, but we might have been ready.
Of the human pathogens that have emerged since 1970, 70% are believed to have been passed to us by animals, according to the WHO (2018). Diseases that can cross or “spill over” to humans are termed zoonotic, and experts led by Dennis Carroll have campaigned for years for a global effort to find potential zoonotic diseases in the animal population. Predict—the research program Carroll founded in 2009 to lead the charge in this—found almost 1,000 new diseases in 10 years, some of which could infect humans. The program had its funding withdrawn by the US government in October 2019, but has since been given a six-month extension to help with COVID-19. The Global Virome Project, also founded by Carroll, estimates that there are 500,000 unknown viral species in animals that could infect humans, and that investigating most of those viral threats will cost $3.5 billion. That’s a big price tag, but Erlich believes that governments ought to be less tightfisted and shortsighted about funding the research that could prepare the world for the next pandemic, considering the human and economic cost that COVID-19 is racking up.
The commercial model of cyber security gives it a huge advantage over the grant-funded model of zoonotic research. However, there is a downside to commerce: Suppliers do not share real-time threat information widely across the industry. “The biggest thing the computer industry can learn from the biological world,” says Mistry, “is to share threat information between all parties and not use it as a competitive edge. After all, the common enemy is the cybercriminals, not your competitor.”
Biological pathogens and computer malware start in totally different ways. The biological virus is a product of evolution, while malware is a product of malevolent human activity. Says Zelinka: “Malware is not a product of random processes in the PC, but a product of human programming activity. The product of the human mind.”
However, there are close parallels in how humans facilitate the breakout and spread of both cyber and biological infections. Malware usually targets vulnerabilities in software, but it relies on human folly to be activated. The majority of the monthly top 10 malware threats recorded by the Center for Internet Security rely on users clicking on a link in a “malspam” email or a malvertisement to get access to the computer. Similarly, people are the weak link in the cyber-threat intelligence system: It doesn’t matter how efficient suppliers are at finding new threats or patching software vulnerabilities if organizations or end users fail to update, or to allow automatic updates of, their software and security products. According to a 2019 survey by the Ponemon Institute, 60% of security breaches occur because the victim failed to apply an available patch to a known vulnerability.
Biological viruses such as Sars-CoV-2 also rely on human activity (often inappropriate human activity, such as trafficking, trading, eating, or domesticating wild animals or encroaching upon their habitats) to make the jump. Sars-CoV-2 is believed to have originated through the trading of wild animals in a market in Wuhan. Once an infectious virus crosses over into the human population, it requires human activity to spread. The more people interact and travel, the longer governments delay in intervening, and the more individuals flout rules such as those on social distancing, the quicker it spreads. “Normal viruses spread through people physically. Thus the spread becomes dependent on people’s behavior. People have their own needs, motives, values, and so on,” explains Frank Dignum, Professor of Computer Science at Umeå University in Sweden, and the head of the ASSOCC project (Agent-based Social Simulation of the Coronavirus Crisis), which is a platform helping governments to model responses to COVID-19, such as mass testing.
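The link between how much people interact and how quickly a virus spreads can be illustrated with a textbook SIR (susceptible, infected, recovered) model. The sketch below is not the agent-based approach used by ASSOCC. It is a deliberately simple stand-in, and the function names, population size, and rate values are all hypothetical, chosen only to show the effect of halving the contact rate.

```python
# Minimal discrete-time SIR sketch (illustrative only; not the ASSOCC model).
# All parameter values are hypothetical.

def simulate_sir(contact_rate, recovery_rate=0.1, population=1_000_000,
                 initial_infected=10, days=365):
    """Return the number of currently infected people for each simulated day."""
    susceptible = population - initial_infected
    infected = initial_infected
    infected_per_day = []
    for _ in range(days):
        # New infections grow with contacts between susceptible and infected people.
        new_infections = contact_rate * susceptible * infected / population
        new_recoveries = recovery_rate * infected
        susceptible -= new_infections
        infected += new_infections - new_recoveries
        infected_per_day.append(infected)
    return infected_per_day

def peak(series):
    """Return (day index, value) of the largest entry in the series."""
    day = max(range(len(series)), key=series.__getitem__)
    return day, series[day]

busy = simulate_sir(contact_rate=0.4)       # people mix freely
distanced = simulate_sir(contact_rate=0.2)  # contacts halved

busy_day, busy_peak = peak(busy)
dist_day, dist_peak = peak(distanced)
print(f"free mixing: peak of {busy_peak:,.0f} infected on day {busy_day}")
print(f"distancing:  peak of {dist_peak:,.0f} infected on day {dist_day}")
```

With these made-up numbers, halving the contact rate both delays the peak and lowers it. A model like this ignores the individual needs, motives, and values that Dignum describes, which is the gap that agent-based simulations aim to fill.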
It is apparent that better education, leadership, communication, and discipline will reduce data and financial loss in the computer business and save lives in the health business. However, it is also important in both scenarios that communication is clear and that the authority is trusted if governments, companies, and individuals are going to act. “With health matters, the CDC [Centers for Disease Control and Prevention] and the WHO are the official bodies trained to alert governments and citizens,” says McElroy. “The equivalent in cyber in the US would be the NSA and DHS [Department of Homeland Security]. However, much like in real-world outbreaks, potential distrust of agencies can lead to people not following recommendations, dismissing data, and believing rumors and disinformation about the motivations behind alerts.”
It is notable how many governments around the world have justified their COVID-19 policy decisions, such as when to impose or remove social-distancing measures, using scientists’ interpretations and the modeling of data. It is common to see leaders conduct press conferences flanked by scientists and health experts and armed with PowerPoint presentations. This approach has helped to counter the distrust people sometimes feel toward politicians and the urge to resist any restrictions that are placed on their freedom.
Detection and response
When an outbreak occurs, whether cyber or biological, it is critical to detect it as quickly as possible so it can be isolated, analyzed, and tracked, and warnings can be issued before it has a chance to spread. “Once the problem is recognized, we try to isolate it to the level of disconnecting the system from the network, analyzing the log files from network to operating system to application, and checking all the other machines/applications/systems in the same organizational environment,” says Zincir-Heywood. “These are similar to the test, identify, contain, track-contacts and mitigate strategies adopted when fighting biological viruses.”
The detection challenge in cyber security is amplified because outbreaks can hit numerous places at the same time and spread globally very quickly, making containment more challenging than with a biological virus. “Malware is not only spread by those that are infected but can also be spread by attackers via malvertisement, malspam emails and malicious links,” explains Vukcevic. “This makes it very hard to limit the spread of computer viruses and makes good protection even more important.” It also makes data critical to detecting and controlling outbreaks. Cyber-security companies rely on a network of automated spies—on clients’ systems and networks, sensors strategically placed around the web, and honeypots—all feeding intelligence back to help them to identify new threats.
Today, the healthcare world is trying to deal with outbreaks with a fraction of the real-time data of the security business. Of course, human bodies do not automatically feed data back like machines, and diagnosing symptoms takes time. Information is largely collected manually by medical personnel and is rarely widely shared or analyzed in real time, plus any sharing of mass information by hospitals is hampered by privacy concerns.
Indeed, COVID-19 has highlighted the absence of a formalized global system for sharing data, which (arguably) slowed the international response to the outbreak. The earliest warnings came via volunteer-led platforms such as HealthMap and ProMed, not directly from the hospitals that treated the first patients or from China’s CDC. The first warning did not come until December 30, more than six weeks after the first case. After this false start, global data sharing for COVID-19 improved considerably. It has included the submission of more than 11,000 genome sequences (as of April 24) from patients tested by labs around the world to GISAID, an international initiative hosted by Germany that supports the global development of tests and vaccines and the tracking of the virus’s spread. Also, many governments are sharing daily totals of confirmed cases and deaths, which has helped with basic modeling of the virus’s spread.
However, as demonstrated by the cyber-security industry, to deal with threats in real time you need real-time data, and delivering this for tackling epidemics requires a shake-up not just of global healthcare best practice and data collection, but also global privacy. “You could use AI to spot emerging threats as they hit the human population,” says Mark Read, a computational immunologist at the University of Sydney. “This could involve tracking people and their activities through, for example, phones and tech. But there are serious Big Brother, ethical concerns with that.” Contact-tracing apps, such as the one used in China since February and now, belatedly, being copied across the world, are a step in this direction. “You might also spot trends if we had better-connected hospitals and electronic medical records,” adds Read. “With the latter, we could also identify the features of individuals most at risk. The wider population is nervous about governments collecting and sharing their medical data electronically, with some reason, but from a research perspective this would be a fantastic resource for improving healthcare.”
Today, tackling human disease by developing a new vaccine, antiviral, or antibiotic can take years, largely because new treatments need to be rigorously tested to ensure they do not cause side effects. This is what makes it so important to identify and research potential pathogens before they break out.
In the computer world, it is a completely different story. “Computer systems are explicitly designed and there’s a lot of scope to alter the implementation,” says Read. “You can also test the system explicitly, such as by hiring people to attack the system to find vulnerabilities. People would balk at that sort of thing for humans and viruses.”
Also, researchers invent malware to test security systems, as seen in Mystique: Evolving Android Malware for Auditing Anti-Malware Tools, a 2016 study by Meng et al., where AI was employed to develop new malware to test the leading Android security products. Data can be backed up and systems can be replicated, so if disaster strikes, things can be rectified quickly (at least when compared with humans).
Erlich speculates that, in the future, genetic scientists will discover ways to patch the human immune system in the same way that we can patch vulnerabilities in computer systems. Rather than having to develop a vaccine to enable the immune system to learn a response to new pathogens, we could simply download the information to the memory cell, so it is ready if infection occurs. “Instead of having to wait two weeks for the body to work on the vaccine, could we find a way to download the information to the memory cell?” wonders Erlich. “[Information that would] tell it this is what you need to know right now, this is the structure that you need to reproduce to fight the infection.”
The recurring theme here is data. Both cyber-security and global-health strategies are increasingly governed by the collection, analysis, and interpretation of data. Firstly, data collection and analysis are fundamental to threat intelligence, whether you are searching for potential new biological viruses in the animal population or new malware or security risks in software. Secondly, it is easier to persuade governments and citizens to act responsibly when data backs up your advice, whether you are arguing against trading in wild animals or in favor of social distancing, or encouraging people to keep software updated or to resist the temptation to click on that too-good-to-be-true offer in an email. And last but not least, the availability of real-time data is crucial to timely decision making, whether you are detecting an outbreak and curtailing the spread of a biological virus or of computer malware.
This has not only been witnessed in the current response to COVID-19, where data scientists are sharing the limelight with medics and politicians. We have also seen emerging data-led disciplines in health, such as computational epidemiology and computational immunology. Data science is also proving critically important to the future response to cyber threats.
With both disciplines plotting similar paths toward data-driven threat response, there is surely considerable benefit for both in gaining a greater understanding of each other’s strategies, successes, and challenges.
The following people contributed to this article: Frank Dignum, Umeå University, Sweden; David Emm, Kaspersky, UK; Yaniv Erlich, MyHeritage, Israel; Jason Forget, Center for Internet Security, USA; Jaroslav Gergic, Cisco, Czech Republic; Nur Zincir-Heywood, Dalhousie University, Canada; Rick McElroy, VMware Carbon Black, USA; Guozhu Meng, Institute of Information Engineering, China; Bharat Mistry, Trend Micro, UK; Mark Read, University of Sydney, Australia; Alexander Vukcevic, Avira, Germany; Ivan Zelinka, VSB Technical University of Ostrava, Czech Republic
This article is part of Behind the Code, the media for developers, by developers.
Illustration by Victoria Roussel