2026-05-26 data

Detectability in Diversity: Improved Canary Crafting for Privacy Auditing in One Run

Mathieu Dagréou, Aurélien Bellet

Key claim

New canary crafting method improves privacy auditing efficiency.

This paper introduces IBIS, an efficient method for crafting canaries for one-run privacy auditing. It combines influence functions with bilevel optimization to produce more detectable canaries, which yields stronger privacy leakage estimates at a fraction of the compute cost of previous canary-crafting methods.

Novelty
7.5/10

The paper proposes a new method for crafting canaries that improves privacy auditing efficiency, extending existing methods.

Reliability
8.0/10

The experimental results demonstrate stronger privacy leakage estimates with solid methodological support.

Deep reliability assessment

The methodology supports the claim that IBIS can produce more detectable one-run auditing canaries at much lower compute cost in the tested CIFAR-10/WRN16-4 and ResNet9-style settings. Broader claims about reliably tightening DP lower bounds across models, datasets, privacy regimes, or production-scale systems are less supported, especially because several DP epsilon estimates have high variance.
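
To make the link between canary detectability and a DP lower bound concrete, here is a minimal sketch of the textbook hypothesis-testing bound for (epsilon, delta)-DP. It is a point estimate with no confidence interval, and it is not claimed to be the estimator used in the paper. The function name and the example numbers are illustrative assumptions.

```python
import math


def epsilon_lower_bound(tpr: float, fpr: float, delta: float = 0.0) -> float:
    """Point-estimate lower bound on epsilon from one (FPR, TPR) operating point.

    Any (epsilon, delta)-DP mechanism constrains every membership test by
        TPR       <= exp(epsilon) * FPR       + delta
        (1 - FPR) <= exp(epsilon) * (1 - TPR) + delta
    Solving each inequality for epsilon and keeping the larger value gives the bound.
    """
    bounds = [0.0]  # epsilon is never negative
    for numerator, denominator in ((tpr - delta, fpr), (1.0 - fpr - delta, 1.0 - tpr)):
        if numerator <= 0:
            continue  # this inequality holds for every epsilon, so it tells us nothing
        # A zero denominator means no finite epsilon can satisfy the inequality.
        bounds.append(math.inf if denominator <= 0 else math.log(numerator / denominator))
    return max(bounds)


# Hypothetical operating point (not taken from the paper): TPR 0.5 at FPR 0.05.
print(epsilon_lower_bound(0.5, 0.05))  # ln(10), roughly 2.30
```

In practice TPR and FPR are estimated from a finite number of canaries, so a bound computed this way is noisy, which is one way high-variance epsilon estimates can arise.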

Reproducibility

Yes. The paper says code is attached to the submission, documented, and will be made public upon acceptance; experiments use standard datasets such as CIFAR-10, with details provided in Section D.

Discussion questions

  1. The core assumption is that canary interference is primarily captured by representation-space similarity or cross-influence; when might this proxy fail, especially for highly non-linear or foundation-model training dynamics?
  2. For builders auditing real systems, does optimizing artificial canaries reveal meaningful privacy risk for natural user data, or mainly a worst-case stress test of memorization?
  3. What empirical result would falsify the paper’s thesis: for example, if diverse high-self-influence canaries still underperform random flipped-label canaries on larger architectures, different datasets, or stronger DP-SGD settings?

Key figure

Figure 1 shows that influence-based preselection improves over random canary selection and gives IBIS a stronger initialization, with regularization helping most clearly in the non-private setting.
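
As a rough illustration of what preselecting canaries by an influence score, rather than at random, could look like, the sketch below greedily picks candidates with a high score while penalizing similarity to candidates already chosen. This is not the paper's procedure: `scores` stands in for some per-candidate self-influence estimate, `features` for representation vectors, and the greedy rule and `penalty` weight are all assumptions made for the example.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u > 0 and norm_v > 0 else 0.0


def preselect(scores, features, k, penalty=1.0):
    """Greedily choose k candidate indices with high score and low mutual similarity."""
    chosen = []
    remaining = set(range(len(scores)))
    while remaining and len(chosen) < k:

        def value(i):
            # Largest similarity to anything already chosen (0.0 for the first pick).
            overlap = max((cosine(features[i], features[j]) for j in chosen), default=0.0)
            return scores[i] - penalty * overlap

        best = max(sorted(remaining), key=value)  # sorted() makes tie-breaking deterministic
        chosen.append(best)
        remaining.remove(best)
    return chosen


# Hypothetical toy data: candidates 0 and 1 are near-duplicates, candidate 2 is distinct.
scores = [0.9, 0.85, 0.5]
features = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
print(preselect(scores, features, k=2))               # [0, 2]
print(preselect(scores, features, k=2, penalty=0.0))  # [0, 1]
```

With the penalty switched off the selection reduces to a plain top-k by score, which makes the effect of the diversity term easy to compare.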

Benchmark results

CIFAR-10 with WRN16-4 audited model. TPR@0.05FPR: 1 (vs Random + Flip canaries: +0.14 absolute TPR)
CIFAR-10 with WRN16-4 audited model, canaries crafted using ResNet9. TPR@0.05FPR: 0.84 (vs Boglioni et al. canaries: +0.02 absolute TPR)
CIFAR-10 with WRN16-4 audited model, epsilon=10, canaries crafted using ResNet9. Estimated epsilon lower bound: 0.9 (vs Boglioni et al. canaries: +0.16 estimated epsilon)
CIFAR-10, 1000 canaries using ResNet9. GPU hours on NVIDIA A100: 2.5 (vs Boglioni et al. reported cost of 90-120 hours: about 36x to 48x less GPU time)
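
For readers unfamiliar with the TPR@0.05FPR metric, the following minimal sketch shows one common way to compute it from membership scores, and then re-derives the 36x to 48x figure from the GPU hours listed above. The score convention (higher means "more likely a member"), the function name, and the toy scores are assumptions for illustration; the paper's exact evaluation code may differ.

```python
def tpr_at_fpr(member_scores, nonmember_scores, target_fpr=0.05):
    """TPR of the attack 'score > threshold', with the threshold chosen so that
    the empirical FPR on non-members does not exceed target_fpr."""
    negatives = sorted(nonmember_scores, reverse=True)
    # Number of false positives we may tolerate (rounded down, so FPR <= target_fpr).
    allowed = int(target_fpr * len(negatives))
    # Flag only scores strictly above the (allowed + 1)-th largest non-member score.
    threshold = negatives[allowed] if allowed < len(negatives) else float("-inf")
    hits = sum(score > threshold for score in member_scores)
    return hits / len(member_scores)


# Hypothetical toy scores, not data from the paper.
members = [0.9, 0.8, 0.7, 0.2]
nonmembers = [0.5] * 20
print(tpr_at_fpr(members, nonmembers))  # 0.75

# Compute-cost arithmetic from the last benchmark row.
ibis_hours = 2.5
baseline_hours = (90, 120)
speedups = [hours / ibis_hours for hours in baseline_hours]
print(speedups)  # [36.0, 48.0]
```

Fixing the false positive rate at a low value such as 0.05 emphasizes how many canaries can be detected confidently, rather than average-case accuracy.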