
Global Accelerated Learning • Est. 1999
Glossary Term: Model Extraction Attack

Training Camp • Cybersecurity Glossary

What Is a Model Extraction Attack?

A model extraction attack steals an ML model by querying its API and training a substitute model that mimics its behavior, exposing intellectual property and enabling further attacks.


Understanding Model Extraction Attacks

A model extraction (or model stealing) attack is an adversarial machine-learning technique in which an attacker repeatedly queries a target model's prediction API and uses the resulting input-output pairs to train a substitute model that approximates the original's behavior. The attack steals intellectual property, imposes query-serving costs on the victim, and can serve as a stepping stone for crafting transferable adversarial examples or inferring training data. Defenses include rate limiting, query monitoring, output perturbation, and watermarking.
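The query-then-train loop described above can be illustrated with a toy sketch. Everything here is an assumption made for illustration: `TargetAPI` is a hypothetical stand-in for a victim's prediction endpoint (a hidden two-feature linear classifier), the attacker samples random inputs, and the substitute is a small logistic regression trained on the returned labels. The optional `max_queries` budget is a crude stand-in for rate limiting, and `query_count` for query monitoring. Real attacks and defenses are considerably more involved.

```python
import math
import random


class TargetAPI:
    """Hypothetical victim API: returns only a hard label per query."""

    def __init__(self, max_queries=None):
        self._w = [2.0, -1.0]           # secret parameters (assumed for the demo)
        self._b = 0.5
        self.query_count = 0            # simple query monitoring
        self.max_queries = max_queries  # simple rate limit (None = unlimited)

    def predict(self, x):
        if self.max_queries is not None and self.query_count >= self.max_queries:
            raise RuntimeError("query budget exceeded")
        self.query_count += 1
        score = self._w[0] * x[0] + self._w[1] * x[1] + self._b
        return 1 if score > 0 else 0


def _sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def _random_point(rng):
    return [rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0)]


def extract_substitute(api, n_queries, rng, epochs=100, lr=0.5):
    """Query the target, then fit a logistic-regression substitute with SGD."""
    inputs = [_random_point(rng) for _ in range(n_queries)]
    data = [(x, api.predict(x)) for x in inputs]  # attacker's stolen labels

    w, b = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for x, y in data:
            p = _sigmoid(w[0] * x[0] + w[1] * x[1] + b)
            err = y - p  # gradient of the log-likelihood w.r.t. the score
            w[0] += lr * err * x[0]
            w[1] += lr * err * x[1]
            b += lr * err

    def substitute(x):
        return 1 if w[0] * x[0] + w[1] * x[1] + b > 0 else 0

    return substitute


def agreement(api, substitute, rng, n=1000):
    """Fraction of fresh random inputs on which both models give the same label."""
    points = [_random_point(rng) for _ in range(n)]
    return sum(api.predict(x) == substitute(x) for x in points) / n


if __name__ == "__main__":
    victim = TargetAPI()
    stolen = extract_substitute(victim, n_queries=500, rng=random.Random(0))
    print("queries used:", victim.query_count)
    print("agreement:", agreement(TargetAPI(), stolen, random.Random(1)))
```

In this sketch, constructing the target with a small `max_queries` makes the extraction fail partway through, which hints at why rate limiting and query monitoring raise the attacker's cost, although a real attacker could spread queries across accounts or over time.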
