Purpose | To enable an NSO to safely and confidently conduct analysis on longitudinal Mobile Network Operator (MNO) mobility data. |
Datasets | Summary of daily visited locations by individual (pseudonymised) subscribers extracted from call data record (CDR) or signalling data, for 100 million mobile subscribers. |
PETs used | Trusted Execution Environment |
Details of computation | Articulated workflow consisting of a chain of simple operations performed regularly. Longitudinal MNO analysis and integration of MNO and NSO data take place within a secure enclave/trusted execution environment using a predefined set of algorithms that deliver aggregate (non-personal) data as output. |
Parties and trust relationship | MNO and NSO act as both input and output parties, and their relationship is assumed honest-but-curious. |
Implementation status | Proof of concept |
Resources | https://ec.europa.eu/eurostat/cros/content/eurostat-cybernetica-project_en |
Background
Eurostat has developed a proof-of-concept solution with a technology provider. The main goal of this project was to explore the feasibility of a Secure Private Computing solution for the privacy-preserving processing of Mobile Network Operator data. The technology of choice for this project was a Trusted Execution Environment (TEE) with hardware isolation (specifically, Intel SGX) in combination with the privacy-enhancing software Sharemind HI developed by Cybernetica. The solution was tested on synthetic data emulating a population of up to 100 million mobile users. Detailed information about the project scenario and results is available from the public project page. Eurostat is open to supporting NSOs and MNOs that are interested in testing the solution in the field.
Case Study description
The MNO collects records in the form <user_pseudonym, time, location> for the mobile users. In the reference scenario, the data protection policy is defined such that all pseudonyms are changed every period T, in order to reduce the risk and impact of re-identification attacks. However, the statistical methodology defined by the NSO requires observation of the mobile user longitudinally for a much larger window W >> T (e.g. T=24 hours and W=3 months) in order to reliably identify the few locations that constitute the “usual environment” of the mobile user.
It is also assumed that the NSO is set to receive only non-personal aggregate data (fulfilling some k-anonymity condition) but not raw data. Additionally, the NSO has other data that could be used to better calibrate the MNO statistics (e.g. census grid data at fine resolution) but that cannot be passed to the MNO.
A Secure Private Computing solution is developed to ensure that longitudinal MNO analysis and integration of MNO and NSO data take place exclusively through a predefined set of algorithms that deliver aggregate (non-personal) data as output.
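The tension between short-lived pseudonyms and a long observation window can be illustrated with a minimal Python sketch. It is not the project's implementation: the keyed-hash pseudonymisation, the function names `period_pseudonym` and `usual_environment`, and the parameters `top_n` and `min_days` are all assumptions made for illustration only.

```python
import hashlib
import hmac
from collections import defaultdict


def period_pseudonym(subscriber_id: str, period_key: bytes) -> str:
    """Hypothetical pseudonym valid for one period T.

    Assumption: the pseudonym is a keyed hash of the subscriber identifier,
    and a fresh key is drawn every period, so all pseudonyms change.
    """
    digest = hmac.new(period_key, subscriber_id.encode(), hashlib.sha256)
    return digest.hexdigest()[:16]


def usual_environment(daily_locations, top_n=2, min_days=3):
    """Pick the few locations one subscriber visits most often over window W.

    daily_locations: iterable of (day, location) pairs for a single subscriber.
    A location qualifies only if it was seen on at least min_days distinct days
    (an illustrative rule, not the NSO's actual methodology).
    """
    days_per_location = defaultdict(set)
    for day, location in daily_locations:
        days_per_location[location].add(day)
    # Rank by number of distinct days (descending), then by name for stability.
    ranked = sorted(
        ((loc, len(days)) for loc, days in days_per_location.items()),
        key=lambda item: (-item[1], item[0]),
    )
    return [loc for loc, n_days in ranked if n_days >= min_days][:top_n]


# The same subscriber receives unrelated pseudonyms in two different periods,
# so records cannot be linked across periods without access to the keys.
p_day1 = period_pseudonym("subscriber-001", b"key-for-day-1")
p_day2 = period_pseudonym("subscriber-001", b"key-for-day-2")
print(p_day1 != p_day2)  # True

# Once records are linked over the window W, frequent locations stand out.
observations = [(d, "cell_home") for d in range(1, 6)]
observations += [(d, "cell_work") for d in range(1, 5)]
observations += [(2, "cell_cafe")]
print(usual_environment(observations))  # ['cell_home', 'cell_work']
```

In this sketch the linking step needs the per-period keys, which is the kind of operation that would have to run inside the protected environment rather than in the clear.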
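The text does not specify which k-anonymity condition is applied to the output. As a minimal sketch, assuming the simplest form (a minimum-count threshold with suppression), the aggregation step could look like the following; the function name `aggregate_with_threshold` and the default `k=3` are hypothetical.

```python
from collections import Counter


def aggregate_with_threshold(user_locations, k=3):
    """Count distinct subscribers per location and suppress small counts.

    user_locations: dict mapping a subscriber pseudonym to the set of
    locations in that subscriber's usual environment.
    Assumption: a location is released only if at least k subscribers
    contribute to it; all other locations are dropped from the output.
    """
    counts = Counter()
    for locations in user_locations.values():
        # set() guards against a subscriber being counted twice per location.
        counts.update(set(locations))
    return {loc: n for loc, n in counts.items() if n >= k}


per_user = {
    "u1": {"cell_A", "cell_B"},
    "u2": {"cell_A"},
    "u3": {"cell_A", "cell_B"},
    "u4": {"cell_C"},
}
print(aggregate_with_threshold(per_user, k=3))  # {'cell_A': 3}
```

Only the thresholded counts would leave the protected environment; the per-subscriber input stays inside it.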
Outcomes and lessons learned
The PoC showed that scalability is not a major concern in the considered scenario. Despite the limited amount of memory in the enclaves, the I/O bandwidth (with hardware-accelerated encryption) proved to be sufficient in the test scenario. Alternative solutions based on other TEE technologies or Secure Multiparty Computation could be explored in the future.
The legal analysis conducted in the project revealed a complex interplay between statistical, telecoms and data protection regulations, with a marked heterogeneity across EU countries due to different national legislation.
Furthermore, during the project it became evident that bringing together a multi-disciplinary team of experts - including specialists in statistical methodologies, experienced security and privacy engineers, and legal experts - is a key success factor for inception projects in the field of Secure Private Computing.