Large Language Law

AI poisoning could turn open models into destructive “sleeper agents,” says Anthropic

22.1.2024

Anthropic, the creator of ChatGPT competitor Claude, has released a research paper describing the risks of large language models (LLMs). The paper warns of AI “sleeper agents” that can output secure or exploitable code with vulnerabilities, depending on the prompt. The researchers found that safety training may not be enough to secure AI systems from hidden, deceptive behaviors that might give a false impression of safety. The attack can hide in the model weights instead of data, and the paper suggests that more direct attacks look like someone releasing a secretly poisoned open weights model.

Source: Ars Technica

18.06

Legal AI Platform Harvey To Get LexisNexis Content and Tech In New Partnership Between the Companies
17.06

As AI Becomes the Norm, Will the Human Touch Come at a Premium?
17.06

Sequoia-backed Crosby launches a new kind of AI-powered law firm
17.06

AI in the Courts? SA Chief Justice considers pros and cons
16.06

Can AI safeguard us against AI? One of its Canadian pioneers thinks so
15.06

The rise of the legal AI dream team: Robots, agents, and people

AI poisoning could turn open models into destructive “sleeper agents,” says Anthropic

Legal AI Platform Harvey To Get LexisNexis Content and Tech In New Partnership Between the Companies

As AI Becomes the Norm, Will the Human Touch Come at a Premium?

Sequoia-backed Crosby launches a new kind of AI-powered law firm

AI in the Courts? SA Chief Justice considers pros and cons

Can AI safeguard us against AI? One of its Canadian pioneers thinks so

The rise of the legal AI dream team: Robots, agents, and people

Leave a Reply Cancel reply