Hackers can ‘Poison’ Open-source Code on the Internet

 

A Cornell University Tech team with researchers discovered a new kind of backdoor attack that can modify natural-language modelling systems to generate false outputs and bypass any known protection. 

The Cornell Tech team believes the assaults may affect algorithmic trading, email accounts, and other services. The research was supported by the NSF and the Schmidt Futures initiative, and Google Faculty Research Award. 

According to research published on Thursday, the backdoor may alter natural-language modelling systems without requiring access to the original code or model by uploading malicious code to open-source sites commonly used by numerous organisations and programmers. 

During a presentation at the USENIX Security conference on Thursday, the researchers termed the attacks "code poisoning." The attack would offer people or organisations immense authority over a wide range of things, including movie reviews or an investment bank's machine learning model, disregarding news that may affect a company's stock. 

The report explained, "The attack is blind: the attacker does not need to observe the execution of his code, nor the weights of the backdoored model during or after training. The attack synthesizes poisoning inputs 'on the fly,' as the model is training, and uses multi-objective optimization to achieve high accuracy simultaneously on the main and backdoor tasks." 

"We showed how this attack can be used to inject single-pixel and physical backdoors into ImageNet models, backdoors that switch the model to a covert functionality, and backdoors that do not require the attacker to modify the input at inference time. We then demonstrated that code-poisoning attacks can evade any known defence, and proposed a new defence based on detecting deviations from the model's trusted computational graph." 

Eugene Bagdasaryan, a computer science PhD candidate at Cornell Tech and co-author of the new paper with professor Vitaly Shmatikov, mentioned that many companies and programmers use models and codes from open-source sites on the internet. This study highlights the importance of reviewing and verifying materials before incorporating them into any systems.

"If hackers can implement code poisoning, they could manipulate models that automate supply chains and propaganda, as well as resume screening and toxic comment deletion," he added. 

Shmatikov further explained that similarly to prior assaults, the hacker must gain access to the model or data during training or deployment, which involves breaking into the victim's machine learning infrastructure. 

"With this new attack, the attack can be done in advance, before the model even exists or before the data is even collected -- and a single attack can actually target multiple victims," Shmatikov said. 

The paper focuses further on the ways for "injecting backdoors into machine learning models, based on compromising the loss-value computation in the model-training code." The team used a sentiment analysis model for the particular task of always classifying as positive all reviews of the infamously bad movies directed by Ed Wood. 

"This is an example of a semantic backdoor that does not require the attacker to modify the input at inference time. The backdoor is triggered by unmodified reviews written by anyone, as long as they mention the attacker-chosen name," the paper discovered. 

"Machine learning pipelines include code from open-source and proprietary repositories, managed via build and integration tools. Code management platforms are known vectors for malicious code injection, enabling attackers to directly modify the source and binary code." 

To counter the attack:
The researchers suggested a technique that could identify changes from the original code of the model. However, Shmatikov claims that because AI and machine learning tools have grown so popular, many non-expert users create their models with code they hardly comprehend. "We've shown that this can have devastating security consequences." 

In the future, the team aims to investigate how code-poisoning links to summarisation and even propaganda automation, which may have far-reaching consequences for the future of hacking.

They will also strive to create robust protections that will eradicate this entire class of attacks and make AI and machine learning secure even for non-expert users," according to Shmatikov.