Data quality assurance is critical for ensuring that the data used to produce statistics are fit for analysis. It covers quality assessment of both CDR data at the unit level and the CDR aggregates that are used as indicators. At the unit level, data quality is evaluated using descriptive statistics such as the average number of records per day and/or per person, the number of active subscriptions per day, and their spatiotemporal distribution. These statistics help identify potentially erroneous data, missing data, and anomalies, which may need to be taken into account when analysis results are interpreted. Quality assurance can also be performed on the resulting indicators, such as the residential population per administrative unit computed from CDR data. Such an indicator can be examined by statistically measuring its correspondence with known population statistics such as census data and WorldPop (Arai et al. 2021).
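The two kinds of checks described above can be illustrated with a minimal sketch. It is not a prescribed method: the record layout (subscriber, date, cell), the function names, the 50% low-volume threshold, and all figures are assumptions made for illustration, and the data are synthetic. The sketch computes daily record counts and active subscribers, flags days with unusually few records, and measures the Pearson correlation between a CDR-derived population and a reference population per administrative unit.

```python
from collections import defaultdict
from datetime import date
from math import sqrt
from statistics import median


def daily_summary(records):
    """Return {day: (n_records, n_active_subscribers, records_per_subscriber)}."""
    counts = defaultdict(int)
    subscribers = defaultdict(set)
    for subscriber_id, day, _cell_id in records:
        counts[day] += 1
        subscribers[day].add(subscriber_id)
    return {
        day: (counts[day], len(subscribers[day]), counts[day] / len(subscribers[day]))
        for day in sorted(counts)
    }


def flag_low_volume_days(summary, ratio=0.5):
    """List days whose record count is below `ratio` times the median daily count.

    The threshold is an arbitrary illustrative choice, not a recommended value.
    """
    typical = median(n for n, _, _ in summary.values())
    return [day for day, (n, _, _) in summary.items() if n < ratio * typical]


def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)


# Synthetic unit-level records: (subscriber_id, day, cell_id).
d1, d2, d3 = date(2021, 1, 1), date(2021, 1, 2), date(2021, 1, 3)
records = [
    ("a", d1, "c1"), ("a", d1, "c2"), ("b", d1, "c1"), ("c", d1, "c3"),
    ("a", d2, "c1"), ("a", d2, "c2"), ("b", d2, "c1"), ("c", d2, "c3"),
    ("a", d3, "c1"),  # far fewer records: a candidate anomaly or outage
]
summary = daily_summary(records)
flagged = flag_low_volume_days(summary)

# Synthetic indicator check: CDR-derived residents vs. a reference population.
cdr_population = {"unit_A": 120, "unit_B": 300, "unit_C": 80}
reference_population = {"unit_A": 1000, "unit_B": 2600, "unit_C": 700}
units = sorted(cdr_population)
r = pearson([cdr_population[u] for u in units],
            [reference_population[u] for u in units])

print(summary)
print("Low-volume days:", flagged)
print("Correlation with reference population:", round(r, 3))
```

A day with no records at all never appears in the summary, so a complete check would also compare the observed days against the expected calendar. A high correlation only indicates that the spatial pattern agrees with the reference; it does not by itself confirm that absolute population levels are correct.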

Institutions responsible for the compilation, calculation, and dissemination of official statistics usually implement quality assurance frameworks to assess the quality of the statistics that they produce. Strictly speaking, displacement and disaster statistics may differ from the official statistics usually published by national statistical offices, such as population, migration, and tourism statistics. However, displacement and disaster statistics are an indispensable part of the information system for society, government, and the economy. It is recommended that the quality assurance framework for producing statistics from CDR data be based on the national quality assessment framework that fits the country's practices and circumstances. Such frameworks typically cover dimensions such as relevance, accuracy, reliability, coherence, timeliness, and accessibility; the IMF Data Quality Assessment Framework also treats assurances of integrity as a dimension. It is also advisable to refer to the big data quality dimensions proposed by UNECE, which include privacy and security, complexity, completeness, usability, accuracy (selectivity), coherence (linkability, consistency), and validity.
