Training Camp • Cybersecurity Glossary
Model Extraction Attack Definition: A model extraction attack steals an ML model by querying its API and training a substitute model that mimics its behavior, exposing intellectual property and enabling further attacks.
A model extraction (or model stealing) attack is an adversarial machine-learning technique in which an attacker repeatedly queries a target model's prediction API and uses the resulting input-output pairs to train a substitute model that approximates the original's behavior. The attack misappropriates the intellectual property embodied in the model, can undercut the provider's per-query pricing, and can serve as a stepping stone for crafting transferable adversarial examples or inferring training data. Defenses include rate limiting, query monitoring, output perturbation, and watermarking.
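The query-then-train loop described above can be illustrated with a minimal sketch in Python. Everything here is an assumption made for illustration: `target_predict` stands in for a remote prediction API, the "secret" model is a tiny linear classifier, and the substitute is a plain perceptron. Real attacks target far larger models and need far more queries.

```python
# Hypothetical victim: a secret linear classifier the attacker cannot inspect.
_SECRET_W = (2, -3)
_SECRET_B = 1


def target_predict(x):
    """Stand-in for a remote prediction API: returns only a label (0 or 1)."""
    score = _SECRET_W[0] * x[0] + _SECRET_W[1] * x[1] + _SECRET_B
    return 1 if score > 0 else 0


def extract_substitute(query_fn, queries, max_epochs=20000):
    """Label the queries through the API, then fit a perceptron to the answers."""
    labelled = [(x, query_fn(x)) for x in queries]  # one API call per query
    w, b = [0, 0], 0
    for _ in range(max_epochs):
        mistakes = 0
        for x, y in labelled:
            pred = 1 if w[0] * x[0] + w[1] * x[1] + b > 0 else 0
            if pred != y:
                step = y - pred  # +1 for a missed positive, -1 for a false positive
                w[0] += step * x[0]
                w[1] += step * x[1]
                b += step
                mistakes += 1
        if mistakes == 0:  # the substitute now reproduces every observed label
            break
    return lambda x: 1 if w[0] * x[0] + w[1] * x[1] + b > 0 else 0


def agreement(f, g, points):
    """Fraction of points on which two models return the same label."""
    return sum(f(p) == g(p) for p in points) / len(points)


# Attacker-chosen inputs: a 21 x 21 integer grid (441 queries).
queries = [(x, y) for x in range(-10, 11) for y in range(-10, 11)]
substitute = extract_substitute(target_predict, queries)
fidelity = agreement(target_predict, substitute, queries)
print("API calls:", len(queries), "agreement on queried points:", fidelity)
```

The attacker never sees `_SECRET_W` or `_SECRET_B`; only the returned labels are used. The sketch also hints at why the listed defenses help: the attack needs many queries (which rate limiting and query monitoring can restrict or flag) and it needs the returned outputs to be faithful (which output perturbation undermines).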