Big Data, Differential Privacy and National Statistical Organizations
This paper provides an introduction to the concept of differential privacy (DP) from the perspective of a National Statistical Organization (NSO). Two competing obligations guide the work of NSOs: providing timely data to inform decision making and protecting the privacy of respondents. A suite of privacy tools and controls has helped make this possible. However, over the past twenty years or so, sophisticated attacks – which aim to determine respondents’ answers from the published aggregate totals – have become a significant and growing threat, owing partly to the proliferation of data available online and partly to the increase in computational power available to attackers. DP has gained popularity in some circles as an effective tool for protecting against these attacks, yet has faced continued opposition in others.
As of 2020, the use of DP is more limited than one might expect, and continued effort will be needed to advance its role. Factors that might explain the challenges of using DP include:
- The mathematical and computational complexities inherent in DP;
- A lack of key knowledge about NSO priorities among DP researchers, who are predominantly computer scientists;
- Limited progress toward a multidisciplinary research agenda;
- Concern about DP’s impact on the utility of published outputs.
NSOs can help address the challenges of using DP by:
- Helping develop tools to efficiently implement DP at the scale of a typical NSO;
- Applying DP to practical examples that effectively balance privacy against data utility and resolve the myriad real-world complexities inherent in a typical official statistics publication;
- Building an effective mathematical language to express privacy concepts and educating key stakeholders in this language.
The big data era is helping to create a paradigm shift in how data are generated, collected and used, and NSOs have an important leading role in this process. By increasing the utility of their data and of supporting products such as data visualization tools, NSOs will lead change in how statistics are produced. By engaging with DP in practice, NSOs will encourage researchers to develop DP methodologies and software that address official statisticians’ needs. These resources will enable NSOs to continue publishing high-quality data while also safeguarding respondent privacy, in a world where these two competing obligations are becoming both increasingly important and increasingly difficult to meet.
The full paper is available here: https://content.iospress.com/articles/statistical-journal-of-the-iaos/s…