Differential privacy provides a way to quantify the privacy risks posed by aggregate statistics based on private data. One of its key ideas is that adding noise to data before generating a statistic can protect privacy.
In the context of machine learning, that means adding noise to training examples before they’re used to train a model. But while that makes it harder for an attacker to identify individual data in the training set, it also tends to reduce the model’s accuracy.
At the 16th conference of the European Chapter of the Association for Computational Linguistics (EACL), my colleagues and I will present a paper in which we propose a new differentially private text transformation algorithm, called ADePT (for autoencoder-based differentially private text), that preserves privacy without sacrificing model utility.
ADePT uses an autoencoder, a neural network that’s trained to output exactly what it takes as input. But in between input and output, the network squeezes its representation of the input data into a relatively small vector. During training, the network learns to produce a vector — an encoding — that preserves enough information about the input that the input can be faithfully reconstructed, or decoded.
With ADePT, we train an autoencoder on phrases typical of the natural-language-understanding (NLU) system we want to build. But at run time, we add noise to the encoding vector before it passes to the decoder. Consequently, the vector that the decoder sees doesn’t exactly encode the input phrase; it encodes a phrase near the input phrase in the representation space.
The output of the decoder is thus an approximation of the input, rather than a reconstruction of it. For instance, given the input “What are the flights on January first 1992 from Boston to San Francisco?”, our noisy autoencoder produced the question “What are the flights on Thursday going from Dallas to San Francisco?” We use the transformed phrases, rather than the original inputs, to train our NLU model.
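To make the mechanism concrete, here is a minimal PyTorch sketch of an encoder-bottleneck-decoder with noise injected into the latent vector. The layer types, dimensions, clipping norm, and noise scale are illustrative placeholders, not the exact architecture or noise calibration from our paper; in practice the noise is calibrated to the desired privacy budget.

```python
import torch
import torch.nn as nn

class NoisyTextAutoencoder(nn.Module):
    """Illustrative sketch: encode a sentence to one vector, perturb it,
    then decode from the perturbed vector (not the exact ADePT setup)."""

    def __init__(self, vocab_size, embed_dim=128, hidden_dim=256, latent_dim=64):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.encoder = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.to_latent = nn.Linear(hidden_dim, latent_dim)    # the bottleneck
        self.from_latent = nn.Linear(latent_dim, hidden_dim)
        self.decoder = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.out = nn.Linear(hidden_dim, vocab_size)

    def encode(self, token_ids):
        # One summary vector per input sentence.
        _, (h, _) = self.encoder(self.embed(token_ids))
        return self.to_latent(h[-1])

    def perturb(self, z, clip_norm=1.0, noise_scale=0.5):
        # Bound the latent norm, then add noise: the decoder now sees a
        # vector encoding a phrase *near* the input, not the input itself.
        # (clip_norm and noise_scale are placeholder values.)
        z = z * torch.clamp(clip_norm / z.norm(dim=-1, keepdim=True), max=1.0)
        return z + noise_scale * torch.randn_like(z)

    def decode_step(self, z, prev_tokens):
        # Condition the decoder on the (perturbed) latent vector and
        # return next-token logits for the partial output sequence.
        h0 = self.from_latent(z).unsqueeze(0)
        out, _ = self.decoder(self.embed(prev_tokens), (h0, torch.zeros_like(h0)))
        return self.out(out)
```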
The idea behind differential privacy is that, statistically, it should be nearly impossible to tell whether a given data item is or is not in the dataset used to produce an aggregate statistic (or, in this case, to train a machine learning model). More precisely, the probability that the computation produces any particular output should be almost the same whether or not any single item is included, with "almost" quantified by a privacy parameter.
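For readers who want the formal statement, this is the standard definition of ε-differential privacy (textbook notation, not notation specific to our paper): a randomized mechanism $\mathcal{M}$ is ε-differentially private if, for any two datasets $D$ and $D'$ that differ in a single record, and for any set $S$ of possible outputs,

$$\Pr[\mathcal{M}(D) \in S] \;\le\; e^{\varepsilon}\,\Pr[\mathcal{M}(D') \in S].$$

The smaller the privacy parameter ε, the less any one individual's data can influence the output.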
Accordingly, to evaluate the privacy protection afforded by our transformation algorithm, we test it against an attack known as a membership inference attack (MIA). An MIA attempts to infer whether a given data point was part of the target model’s training data. The attacker trains an attack model, essentially a binary classifier, that labels an input sample as a member (present in the training data) or a non-member (absent from it). The more accurate this attack model, the less privacy protection the transformation provides.
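The sketch below shows one common way such an attack classifier is built: train a binary classifier on the target model's confidence scores for samples whose membership status is known. The names `target_model` (with a scikit-learn-style `predict_proba`), `member_utterances`, and `nonmember_utterances` are hypothetical stand-ins, and the feature set is a simple illustration rather than the specific attack used in our experiments.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def attack_features(target_model, utterances):
    """Confidence-based features from the target intent classifier."""
    probs = target_model.predict_proba(utterances)          # (n_samples, n_intents)
    top = np.sort(probs, axis=1)[:, ::-1]                   # confidences, descending
    entropy = -(probs * np.log(probs + 1e-12)).sum(axis=1, keepdims=True)
    # Top confidence, margin between top two classes, and prediction entropy.
    return np.hstack([top[:, :1], top[:, :1] - top[:, 1:2], entropy])

# Attack training data with known membership labels
# (1 = was in the target model's training set, 0 = was not).
X_attack = attack_features(target_model, member_utterances + nonmember_utterances)
y_attack = np.array([1] * len(member_utterances) + [0] * len(nonmember_utterances))

attack_clf = LogisticRegression().fit(X_attack, y_attack)
# Attack accuracy near 50% (chance level) indicates strong privacy protection;
# accuracy well above chance indicates the transformation leaks membership.
```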
In our tests, the target of the attack is an intent classifier trained on transformed versions of the data in the widely used datasets ATIS and SNIPS. Below are some anecdotal examples showing that our model’s text transformations offer greater semantic coherence than the baseline:
| # | Original sample | Baseline transformation | ADePT |
|---|---|---|---|
| 1 | what are the flights on january first 1992 from boston to san francisco | what are the flights on february inhales 1923 from boston to san mostrar | what are the flights on thursday going from dallas to san francisco |
| 2 | i would like to book a flight for august twenty seventh from baltimore to san francisco on us air | i would like to list all flights for ground transportation from baltimore to general mitchell on us air | i would like to find a flight for august fifth from denver to pittsburgh with lufthansa |
| 3 | do you have a night flight from washington to boston on august twenty seventh | do you have a listing flights from beach to boston on coach class | do you have evening flight from vegas to austin on july thirteen |
Overall, our experiments show that our transformation technique significantly improves model performance over the previous state of the art, while also improving robustness against membership inference attacks.