Anthropic accuses Chinese AI firms of massive Distillation Attack

Enes

1 min read

Overview

Anthropic, the creator of Claude, accuses the Chinese AI firms MiniMax, Moonshot AI, and DeepSeek of mounting a massive “distillation attack” to copy Claude.

Distillation Attack

Distillation is a common technique in AI development, used to create smaller, more efficient models by training them on the outputs of larger, more capable ones. A large set of prompts and responses is generated with the larger “teacher” model, and the smaller “student” model is trained to mimic those outputs. When the technique is applied to another company’s model without authorization, however, it becomes a “distillation attack”: competitors can use it to replicate the capabilities of a rival’s AI system quickly and cheaply, and independent groups can use it to produce an “unfiltered” clone of a model for malicious purposes.
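The teacher/student loop above can be sketched in a few lines. This is a toy illustration only, with simple logistic models standing in for LLMs and all weights and hyperparameters invented for the example: a fixed “teacher” generates soft labels for a batch of inputs, and a randomly initialized “student” is fit to those outputs, never seeing the teacher’s internals.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# Toy "teacher": a fixed model the student will imitate.
# (Weights 2.0 and 1.0 are arbitrary; it stands in for the large model.)
def teacher(x):
    return sigmoid(2.0 * x + 1.0)

# Step 1: generate a synthetic prompt/response dataset from the teacher.
random.seed(0)
inputs = [random.uniform(-3, 3) for _ in range(200)]
soft_labels = [teacher(x) for x in inputs]

# Step 2: train a "student" of the same form (initialized at zero) to
# mimic the teacher's outputs via SGD on squared error over the soft labels.
w, b, lr = 0.0, 0.0, 0.5
for _ in range(2000):
    for x, y in zip(inputs, soft_labels):
        p = sigmoid(w * x + b)
        grad = (p - y) * p * (1 - p)  # d(0.5*(p-y)^2)/d(logit)
        w -= lr * grad * x
        b -= lr * grad

# The student now approximates the teacher using only its outputs.
err = max(abs(teacher(x) - sigmoid(w * x + b)) for x in inputs)
```

The key point the sketch shows is that the attacker needs only query access: the dataset of exchanges is the whole training signal, which is why high query volume (millions of exchanges) is the signature of such a campaign.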

The Claim

Anthropic accuses the Chinese firms of running “industrial-scale” distillation campaigns, a significant concern for the AI industry. The company says it is taking steps to protect its intellectual property and prevent further unauthorized access to its models. According to the company, the attackers generated over 16 million exchanges with Claude using sprawling networks of about 24,000 fake accounts.

The attacks specifically targeted Claude’s most advanced capabilities:

  • MiniMax generated over 13 million exchanges targeting Claude’s coding capabilities.
  • Moonshot AI generated over 3.4 million exchanges to extract Claude’s reasoning and coding capabilities.
  • DeepSeek generated over 150,000 exchanges targeting Claude’s reasoning skills.

In response, Anthropic has stated that it is implementing new classifiers, behavioral fingerprinting systems, and access controls to detect and prevent further unauthorized capability extraction.
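Anthropic has not published the details of these systems, but one way to picture the general idea is a simple traffic heuristic. The sketch below is hypothetical and not Anthropic’s actual method: it flags accounts whose usage looks like automated extraction, meaning very high request volume combined with highly templated, low-diversity prompts (the thresholds and account names are invented).

```python
def prompt_diversity(prompts):
    """Fraction of unique prompts after crude whitespace/case normalization."""
    normalized = {" ".join(p.lower().split()) for p in prompts}
    return len(normalized) / len(prompts)

def flag_suspicious(accounts, min_requests=1000, max_diversity=0.2):
    """accounts: dict mapping account_id -> list of prompt strings.

    Flags accounts that both exceed the volume threshold and reuse
    near-identical prompts, a rough signature of scripted extraction.
    """
    flagged = []
    for account, prompts in accounts.items():
        if len(prompts) >= min_requests and prompt_diversity(prompts) <= max_diversity:
            flagged.append(account)
    return flagged

# Usage: a bot farm replaying one template vs. an ordinary user.
bot = ["Write a Python function to sort a list."] * 1500
human = [f"question {i}" for i in range(50)]
suspects = flag_suspicious({"bot-123": bot, "user-9": human})
```

A production system would be far richer (behavioral fingerprints across accounts, classifier scores on prompt content, rate limits), but the design choice is the same: distillation campaigns are detectable because they must query at scale with machine-generated inputs.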

