Curriculum adversarial training

Weakness of standard adversarial training: the model overfits to the attack used during training and hence does not generalize well to test data

Curriculum adversarial training

Idea: train the model against progressively stronger attacks, starting from a weak attack

Method

Let $l$ denote the attack strength and $K$ the maximal attack strength. $\mathcal{A}(l)$ denotes an attack class parameterized by $l$; for PGD, $l$ is the number of attack iterations, and $\mathcal{A}(0)$ means no attack.
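For concreteness, here is a minimal PyTorch sketch of such an attack class, assuming $\mathcal{A}(l)$ is $l$-step PGD under an $L_\infty$ budget; the `eps` and `step_size` values are illustrative placeholders rather than the paper's settings.

```python
import torch
import torch.nn.functional as F

def pgd_attack(model, x, y, strength, eps=8/255, step_size=2/255):
    """A(l): l-step PGD under an L-infinity budget `eps` (illustrative values).

    strength == 0 returns the clean input, matching the curriculum's
    "no attack" starting point.
    """
    if strength == 0:
        return x
    x_adv = x.clone().detach()
    for _ in range(strength):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        grad = torch.autograd.grad(loss, x_adv)[0]
        with torch.no_grad():
            x_adv = x_adv + step_size * grad.sign()                 # ascend the loss
            x_adv = torch.min(torch.max(x_adv, x - eps), x + eps)   # project into the eps-ball
            x_adv = x_adv.clamp(0.0, 1.0)                           # keep pixels in [0, 1]
    return x_adv.detach()
```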

Basic curriculum learning

i). Start from no attack (i.e. $l = 0$);
ii). Train the model for one epoch and, once it finishes, compute the $\tilde{l}$-accuracy, i.e. the validation accuracy on adversarial examples generated with the current attack strength $l$;
iii-a). If the $\tilde{l}$-accuracy has increased at least once over the last 10 epochs, continue training;
iii-b). If the $\tilde{l}$-accuracy has not increased over the last 10 epochs, reset the model parameters to the best ones (i.e. those from 10 epochs ago) and increase $l$ by 1;
iv). Stop when $l > K$.
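The following is a minimal sketch of this schedule, reusing the `pgd_attack` helper above and assuming hypothetical `train_loader`/`val_loader` data loaders and a standard PyTorch model and optimizer; it illustrates the loop, not the authors' implementation.

```python
import copy
import torch

def evaluate_l_accuracy(model, val_loader, l, device="cpu"):
    """The tilde-l accuracy: accuracy on validation examples attacked at strength l."""
    model.eval()
    correct, total = 0, 0
    for x, y in val_loader:
        x, y = x.to(device), y.to(device)
        x_adv = pgd_attack(model, x, y, strength=l)
        with torch.no_grad():
            correct += (model(x_adv).argmax(dim=1) == y).sum().item()
        total += y.size(0)
    return correct / total

def curriculum_adversarial_training(model, optimizer, train_loader, val_loader,
                                    K, patience=10, device="cpu"):
    criterion = torch.nn.CrossEntropyLoss()
    l = 0                                       # i). start from no attack
    while l <= K:                               # iv). stop once l > K
        best_acc, best_state, stall = -1.0, None, 0
        while True:
            model.train()
            for x, y in train_loader:           # ii). one epoch at the current strength l
                x, y = x.to(device), y.to(device)
                x_adv = pgd_attack(model, x, y, strength=l)
                optimizer.zero_grad()
                criterion(model(x_adv), y).backward()
                optimizer.step()
            acc = evaluate_l_accuracy(model, val_loader, l, device)
            if acc > best_acc:                  # iii-a). improved: keep training at this l
                best_acc, best_state, stall = acc, copy.deepcopy(model.state_dict()), 0
            else:
                stall += 1
            if stall >= patience:               # iii-b). no improvement for 10 epochs
                model.load_state_dict(best_state)   # roll back to the best parameters
                l += 1                              # move on to a stronger attack
                break
    return model
```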

Benefit: training efficiency, since the weak attacks used in early phases (few PGD iterations) are cheap to generate

Additional optimization technique: batch mixing

Motivation: Although basic curriculum training achieves a significant reduction in training time, it does not increase robustness. One issue is forgetting: when the model is trained with a larger $l$, it forgets the adversarial examples generated for a smaller $l$.

Solution: Generate adversarial examples using $\mathrm{PGD}(i)$ for each $i \in \{0, 1, \dots, l\}$ and combine them to form a batch. The loss function is updated accordingly to
$$\sum_{i=0}^{l} \alpha_i \sum_{(x, y) \sim \mathcal{D}} \mathcal{L}\big(f_\theta(\mathcal{A}_i(x)), y\big),$$
where the $\alpha_i$'s are hyperparameters such that $\alpha_i \in [0, 1]$ and $\sum_i \alpha_i = 1$. The authors set $\alpha_i = \frac{1}{l+1}$ and generate the same amount of adversarial examples for each attack strength.
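A sketch of this mixed loss, again reusing the hypothetical `pgd_attack` helper above: the same clean batch is attacked at every strength $i \in \{0, \dots, l\}$ and the per-strength losses are averaged with $\alpha_i = \frac{1}{l+1}$.

```python
import torch.nn.functional as F

def batch_mixing_loss(model, x, y, l):
    """sum_{i=0}^{l} alpha_i * CE(f_theta(A_i(x)), y) with alpha_i = 1 / (l + 1).

    Every strength from 0 (clean data) up to the current l contributes equally,
    so the model keeps revisiting the weak-attack examples it would otherwise forget.
    """
    alpha = 1.0 / (l + 1)
    loss = 0.0
    for i in range(l + 1):
        x_adv = pgd_attack(model, x, y, strength=i)   # A_i(x), sketch above
        loss = loss + alpha * F.cross_entropy(model(x_adv), y)
    return loss
```

Splitting each clean batch into $l+1$ equally sized chunks, one per strength, realizes the same uniform weighting at lower cost and is closer to the authors' description of combining the same amount of adversarial examples per strength into one batch.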

Additional optimization technique: quantization

Motivation: The model trained with CAT may not defend against attacks that are stronger than the strongest attack used during training.
Solution: Employ quantization, i.e. restrict each input component $x \in [0, 1]$ to a $b$-bit integer.
Rationale: Quantization reduces the space of adversarial examples. Specifically, let $x^\star$ denote the adversarial example. The difference $x^\star - x$ takes values in an infinite space if $x$ is real-valued; in contrast, it takes values in a finite space if $x$ is quantized to an integer vector.
Remark: Quantization is a generic inference-time defense technique. On its own it has not been shown to provide resilience against strong white-box attacks. However, it is effective when used together with CAT, since the model remembers the adversarial examples generated by weak attacks: although a stronger attack can better optimize the loss function, the adversarial examples it generates are highly likely to coincide with those generated by a weaker attack, because the entire adversarial example space is small.
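As a concrete illustration, a minimal sketch of $b$-bit input quantization as a preprocessing step; the choice $b = 4$ is illustrative only.

```python
import torch

def quantize(x, b=4):
    """Map each component of x in [0, 1] onto a grid of 2**b levels (b-bit quantization)."""
    levels = 2 ** b - 1
    return torch.round(x.clamp(0.0, 1.0) * levels) / levels

# At inference time the model only sees quantized inputs, so any adversarial
# perturbation x_star - x is restricted to a finite grid of values:
# logits = model(quantize(x_adv))
```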

Experiments: CAT improves both training efficiency and the empirical worst-case accuracy against adversarial examples (termed resilience)


Reference:
Cai, Qi-Zhi, Chang Liu, and Dawn Song. “Curriculum adversarial training.” In Proceedings of the 27th International Joint Conference on Artificial Intelligence, pp. 3740-3747. 2018.
