After the statistical processing of the MPD, and before the modelled data representing individuals is transformed into aggregated statistical indicators, there should be a second major quality gate. Its main purpose is to assess whether the modelled data is internally consistent and complies with the data model.

Modelled MPD is the steady state of the data, representing the reality of people’s spatio-temporal behaviour as closely as possible. This data still contains (modelled) subscriber data at the individual level, which can be checked against a number of logical consistency criteria, including but not limited to:

  1. The number of subscribers in the modelled data should equal the number of subscribers in the initial raw data quality gate report minus the number of subscribers eliminated from the sample by filtering or elimination criteria. This should hold for both distributed and centralised processing options;
  2. No subscriber should be present in different locations at the same time;
  3. All subscribers (except those with only one location event) should have some presence within some time period;
  4. The total duration of a subscriber’s presence within a limited period of time should not exceed the length of that period (e.g. within 1 day, a person cannot be present somewhere for more than 24 hours);
  5. The duration of the subscriber’s presence (length of stay) should be logically related to the number of nights spent and days present (e.g. being present for 48 hours with 0 nights spent is not possible).
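
As an illustration only, criteria 1, 2 and 4 above could be sketched in Python as follows. The `Presence` record layout, the function names, and the use of naive timestamps with calendar days as the "limited period" are all assumptions made for this sketch. They are not part of any prescribed data model.

```python
from collections import defaultdict
from datetime import datetime, timedelta
from itertools import combinations
from typing import NamedTuple


class Presence(NamedTuple):
    """Hypothetical modelled record: one stay of one subscriber at one location."""
    subscriber_id: str
    location_id: str
    start: datetime  # inclusive
    end: datetime    # exclusive


def check_subscriber_count(raw_count, eliminated_count, records):
    """Criterion 1: modelled subscribers == raw subscribers - eliminated."""
    modelled_count = len({r.subscriber_id for r in records})
    return modelled_count == raw_count - eliminated_count


def find_simultaneous_presence(records):
    """Criterion 2: return pairs of stays that overlap in time at different locations."""
    by_subscriber = defaultdict(list)
    for r in records:
        by_subscriber[r.subscriber_id].append(r)
    conflicts = []
    for stays in by_subscriber.values():
        # Pairwise comparison is quadratic per subscriber; fine for a sketch.
        for a, b in combinations(stays, 2):
            overlaps = a.start < b.end and b.start < a.end
            if overlaps and a.location_id != b.location_id:
                conflicts.append((a, b))
    return conflicts


def daily_presence_totals(records):
    """Sum presence per (subscriber, calendar day), splitting stays at midnight."""
    totals = defaultdict(timedelta)
    for r in records:
        cursor = r.start
        while cursor < r.end:
            next_midnight = datetime.combine(
                cursor.date(), datetime.min.time()
            ) + timedelta(days=1)
            segment_end = min(r.end, next_midnight)
            totals[(r.subscriber_id, cursor.date())] += segment_end - cursor
            cursor = segment_end
    return totals


def find_excess_daily_presence(records):
    """Criterion 4: return (subscriber, day) totals that exceed 24 hours."""
    totals = daily_presence_totals(records)
    return {key: total for key, total in totals.items() if total > timedelta(days=1)}
```

Each function returns the offending records rather than raising an error, so that a quality gate report can list every violation found. In a distributed processing setup the same checks would need to run per partition, with the counts for criterion 1 summed before comparison.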

Process produces an error

Process does not finish

Process ingests erroneous data

Process overwrites critical data

