Purpose

To generate tourism statistics from the combined data of two mobile network operators (MNOs) through confidential sharing of their datasets.

Datasets

A list of IMSIs from the two MNOs in border areas for the same time period. The IMSIs were uniformly hashed from the 7th digit onwards.
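
A minimal sketch of what uniform hashing from the 7th digit onwards could look like. The source does not state which hash function was used or whether it was keyed, so the choice of HMAC-SHA256, the shared key, and the function name `pseudonymise_imsi` are all assumptions made for illustration. The point is only that both operators must apply the identical transformation so that the same subscriber yields the same value in both datasets.

```python
import hashlib
import hmac


def pseudonymise_imsi(imsi: str, shared_key: bytes) -> str:
    """Keep the first six digits and replace the rest with a keyed hash.

    Hypothetical helper: the hash function and key handling are assumptions,
    not the method documented for this case study.
    """
    if not (imsi.isascii() and imsi.isdigit() and 6 < len(imsi) <= 15):
        raise ValueError("expected an IMSI of 7 to 15 decimal digits")
    prefix, subscriber_part = imsi[:6], imsi[6:]
    digest = hmac.new(
        shared_key, subscriber_part.encode("ascii"), hashlib.sha256
    ).hexdigest()
    # The clear-text prefix stays readable; only the remainder is hashed.
    return prefix + digest


# Both operators use the same key, so the same IMSI maps to the same value.
key = b"example-key-agreed-by-both-operators"  # placeholder, not a real key
hashed_at_mno_a = pseudonymise_imsi("001010123456789", key)
hashed_at_mno_b = pseudonymise_imsi("001010123456789", key)
```

With this approach, matching records across operators reduces to comparing the hashed strings, without either side revealing the original subscriber digits.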

PETs used

Trusted Execution Environment

Details of computation

Statistics are generated from the input data in a trusted execution environment (Intel SGX) through the Sharemind HI platform.

Parties and trust relationship

Two MNOs act as input parties; Sharemind serves in a compute role; the Ministry of Tourism acts as the output party.

Implementation status

Production

Resources

https://sharemind.cyber.ee/sharemind-hi/
https://netmob.org/assets/netmob19_withFCC.pdf

Background

Timely and accurate statistics on cross-border tourism can be difficult to obtain for various reasons, including privacy and confidentiality concerns when roaming information from telecom companies is used. Indonesia, led by its Ministry of Tourism, is one of the first countries in the world to use data from mobile network operators to measure cross-border tourism activity. Positium, an analytics company specialising in mobile positioning data, has set up a system for the Ministry based on data from one of the mobile operators. The Ministry wanted to establish a true baseline for roaming market share, which is hard to estimate because subscribers cross-roam in the networks of different operators during a single visit. The challenge was how to compare data without explicitly sharing it, because location data contains sensitive information about customers as well as confidential business information of the operators.

Case Study description

Mobile positioning data provides insights into the numbers and movements of tourists. As tourists move around, their mobile phones roam through multiple local mobile network operators (MNOs). Cross-roaming is the situation where a person uses two or more different MNOs in the country of reference. A complete understanding of cross-roaming can only be obtained when unique subscriber identifiers (IMSIs) are compared across several operators. Because of privacy concerns, this is a complex task that requires uniform hashing of IMSIs across at least two operators.
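
The comparison described above can be sketched as a set intersection over the uniformly hashed IMSIs. This is an illustrative outline only: the function name `cross_roaming_summary`, the output fields, and the definition of each operator's share as its roamers divided by all unique roamers are assumptions, not the statistics specified in the case study. In the deployed system, a computation of this kind runs inside the trusted execution environment rather than in plain Python.

```python
def cross_roaming_summary(imsis_a: set[str], imsis_b: set[str]) -> dict[str, float]:
    """Summarise roamer counts and overlap for two operators.

    Inputs are sets of hashed IMSIs, one set per operator (hypothetical layout).
    """
    overlap = imsis_a & imsis_b            # subscribers seen by both operators
    unique_total = len(imsis_a | imsis_b)  # each subscriber counted once
    summary: dict[str, float] = {
        "roamers_a": len(imsis_a),
        "roamers_b": len(imsis_b),
        "overlap": len(overlap),
        "unique_roamers": unique_total,
    }
    if unique_total:
        # Assumed definition: share of all unique roamers seen by each operator.
        summary["share_a"] = len(imsis_a) / unique_total
        summary["share_b"] = len(imsis_b) / unique_total
    return summary


# Toy input with made-up hashed identifiers.
example = cross_roaming_summary({"h1", "h2", "h3"}, {"h2", "h3", "h4"})
```

Counting each subscriber once in `unique_roamers` is what avoids the double counting that cross-roaming otherwise introduces when per-operator totals are simply added.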

Sharemind is a secure computing platform created specifically to reduce the risk of a privacy breach when processing confidential data. The data is encrypted at the source by the data owner, and only then sent to the Sharemind service. The host of the service has access to neither the unencrypted data nor the encryption keys. The solution protects data at rest and in transit, and goes beyond state-of-the-art methods by also protecting data in use: protections are not removed during processing, so the data remains protected by cryptographic means throughout the whole analysis. The Trusted Execution Environment (TEE) technology used in Sharemind HI to implement privacy-preserving data processing is Intel Software Guard Extensions (SGX), available in modern Intel processors. The three key concepts that SGX provides to protect data are enclaves, attestation and data sealing.

MNO data is transferred to the Sharemind HI environment, where analysis on the data is carried out, and the encrypted results are shared with the Ministry of Tourism.

Outcomes and lessons learned

The result provided the Ministry of Tourism with information on roaming counts and roamer overlap between the two biggest telecom providers in Indonesia, enabling an accurate calculation of roaming market share. The result is still used in official statistics today, as Indonesia produces monthly tourism statistics indicators based on mobile phone data.

The rapid development and deployment of the solution depended heavily on the good working relationship between the Ministry of Tourism, the statistical office, the mobile operators, and the expert organizations for mobile positioning and privacy technology. The solution made confidential sharing of private datasets possible within the existing legal and business environment. It is still the only known solution for understanding cross-roaming subscriber overlap between MNO datasets. The solution can be extended to settings with more than two MNOs, and performance is good even on commercial off-the-shelf hardware.