The topic of ‘anonymisation’ has already been covered several times on the blog (see e.g. here, here, and here). We even have a new research paper (‘Anonymous Data v. Personal Data — A False Debate: An EU Perspective on Anonymization, Pseudonymization and Personal Data’) recently published in the Wisconsin International Law Journal on this issue for interested readers.
To be perfectly honest, this very topic is not the most popular among privacy and data protection advocates. The reason is that there is a strong feeling that it is, in fact, not worth investing time and effort to understand what exactly could be ‘anonymous’ data [that is, data that has undergone legally-effective anonymisation processes – such that the data subject is no longer identifiable from it] for the purposes of avoiding the application of data protection law. Instead, some think it makes more practical sense to start from the assumption that data protection law applies anyway when data ‘relating to’ persons are processed, and to devote time and energy to other issues, such as:
- Understanding when a data controller is not in a position to identify data subjects for the purposes of Article 11 of the GDPR. Remember, Article 11(2) states that – where the controller can demonstrate that it is not in a position to identify the data subject – Articles 15 to 22 GDPR (under which the data subject would otherwise be able to exercise certain rights in respect of the data) shall not apply.
- Or, determining what the legal effects of pseudonymisation are given the legal presumption that data that has undergone pseudonymisation processes is personal data? [Remember, Article 4(5) GDPR introduces ‘pseudonymisation’ as a formal and very specific legal ‘term of art’, meaning “the processing of personal data in such a manner that the personal data can no longer be attributed to a specific data subject without the use of additional information, provided that such additional information is kept separately and is subject to technical and organisational measures to ensure that the personal data are not attributed to an identified or identifiable natural person”. Recital 26 GDPR states, “[p]ersonal data which have undergone pseudonymisation, which could be attributed to a natural person by the use of additional information should be considered to be information on an identifiable natural person” (emphasis added).]
Well, all these issues are in fact related, since they all imply striking a balance, in one way or another, between data usage and re-identification risks, or between data usage and data subject prerogatives to have their rights in respect of the processing of data relating to them adequately protected.
Because standards are set at the EU level in this field, it is useful to observe what is happening in individual Member States, and in particular in France. On 8 February 2017, the French Administrative Supreme Court (Conseil d’Etat, or CE) delivered a fascinating judgment on the issue of anonymisation.
To introduce the background facts: the French company JCDecaux had filed a request with the French Data Protection Agency (Commission Nationale de l’Informatique et des Libertés, or CNIL) to be authorised to process personal data for the purpose of testing, over a period of 4 weeks, a methodology for quantitatively estimating flows of pedestrians walking on the square of La Défense (a business district on the outskirts of Paris). The project required the installation of 6 WiFi tracking boxes on billboards in order to capture the MAC (media access control) address – a unique identifier – as well as other network identifiers, of each internet-connecting device situated within a distance of 25 metres, in order to calculate its geographic location.
However, in a decision of 16 July 2015, the CNIL refused to authorise the processing. JCDecaux therefore challenged the decision of the CNIL before the CE, claiming that it should be annulled on the ground of abuse of power.
Art. 32 of the French Data Protection Law (Loi n° 78-17 du 6 janvier 1978 relative à l’informatique, aux fichiers et aux libertés) is crucial in this case because, if the personal data collected is subjected to anonymisation processes without delay (or shortly after the collection), then the information obligation imposed on the data controller is more limited: the data controller shall only inform data subjects of its identity and of the purposes for which the data is being processed. JCDecaux was thus arguing that the personal data at stake was intended to be anonymised shortly after the collection.
What does the CE say about the concept of anonymisation and its legal characterisation?
Paragraph 7 is of particular relevance. To translate the words of the CE, “personal data can only be deemed anonymous when identifying the data subject, directly or indirectly, becomes impossible for the data controller or a third party. This is not the case when it remains possible to single out a data subject or to link two records corresponding to the data subject”. [Is it true that ‘singling out’ in the sense of individualising records necessarily implies that either identification is achieved or identifiability is possible (with data relating to individuals requiring that they be identified or identifiable for personal data to be present in law)? Computer scientists would generally argue the opposite from a non-legal and – might we say – pragmatic perspective.]
To anonymise the data at stake, JCDecaux had decided to truncate the last half of the final octet of the MAC address, as well as to apply the technique of hashing to it using a company-specific ‘salt’. JCDecaux argued that the application of such techniques had the effect of making the re-identification risk negligible. Moreover, the potential impact on data subjects was all the more limited (and the intended processing activities warranted) because the collection of data was to be undertaken in the context of an experiment carried out for a limited duration. In addition, the purpose of the processing was to improve the revenue-generating potential of billboards, which therefore meant that JCDecaux was not interested in re-identifying data subjects from their MAC addresses.
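For readers less familiar with these techniques, a minimal sketch may help. The code below illustrates the general kind of transformation described (truncating the last half of the final octet, then hashing with a secret salt); the salt value, function names, and example MAC address are purely illustrative assumptions, not JCDecaux’s actual implementation.

```python
import hashlib

COMPANY_SALT = b"example-secret-salt"  # assumed value; the real salt is private

def truncate_mac(mac: str) -> str:
    """Zero out the low 4 bits (the 'last half') of the final octet."""
    octets = mac.lower().split(":")
    last = int(octets[-1], 16) & 0xF0  # keep only the high nibble
    return ":".join(octets[:-1] + [f"{last:02x}"])

def pseudonymise(mac: str) -> str:
    """Salted SHA-256 hash of the truncated MAC address."""
    truncated = truncate_mac(mac)
    return hashlib.sha256(COMPANY_SALT + truncated.encode()).hexdigest()

print(truncate_mac("a4:5e:60:c2:19:7f"))  # a4:5e:60:c2:19:70
print(pseudonymise("a4:5e:60:c2:19:7f"))  # a stable 64-character hex digest
```

Note that truncation makes up to 16 devices share a value, while the salted hash produces a stable pseudonym: the same device yields the same digest every time it passes a box, which is exactly what permits the counting of repeat visits discussed below.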
What does the CE respond to this?
Well, it said that the data controller, JCDecaux, retained the capability to re-identify owners of devices (terminal equipment) whose MAC addresses would be captured, despite the technical and organisational measures implemented, which aimed only at preventing third parties from accessing the data. These measures did not prevent the data controller itself from re-identifying individuals, linking records relating to the same individual, or inferring new information about individuals upon processing the collected data.
In addition, the CE noted that the processing consisted not only in counting the number of pieces of terminal equipment, but also in measuring the repetition of their passing and in determining their paths between billboards. The processing at stake was therefore to be undertaken, the CE concluded, precisely for the purpose of identifying the movements of individuals and their repetition on the pedestrian square of La Défense for the whole duration of the experiment.
As a result, the CE held, the CNIL had not committed any mistake when deciding that the objectives for which the data was to be collected were indeed incompatible with a finding of (legally effective) anonymisation in respect of such data.
Reading the decision (the technical term is actually ‘deliberation’) of the CNIL itself, one is given a more detailed picture of the processing at stake.
First, the CNIL notes that the legal basis for the processing is not ‘consent’ but ‘the legitimate interest of the data controller.’ The role of the CNIL is, therefore, to check whether the interests and fundamental rights and freedoms of the data subjects have been properly taken into account.
The CNIL also observes that ultimately the experiment is to be conducted by the company Fidzup – which will build the boxes, install them, store, and analyse the data – on behalf of JCDecaux.
The CNIL finds that the envisaged processing purpose was specific, explicit and legitimate. Moreover, no decision was to be taken on the basis of the processing, and no commercial targeting of any individual to whom the data related was to be undertaken at a later point.
As regards the anonymisation technique proposed, the CNIL does mention that the data is going to be aggregated at the end of the analysis (based on averaging the number of detections hour by hour, the number of unique detections by day/week/month, and patterns of mobility), although obviously this would have only happened after 4 weeks. Ultimately, the data is meant to be destroyed.
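To make the aggregation step concrete, here is a small sketch of the kind of hourly roll-up the CNIL describes; the timestamps and pseudonym labels are invented for illustration. Individual detection records feed the counts and could, in principle, be discarded afterwards, but note that in the envisaged scheme they would persist for the 4-week duration.

```python
from collections import defaultdict
from datetime import datetime

# Illustrative detections: (timestamp, pseudonymised device id). Values assumed.
detections = [
    ("2015-06-01T09:05", "hash-a"),
    ("2015-06-01T09:40", "hash-a"),  # same device, same hour: counted once
    ("2015-06-01T09:55", "hash-b"),
    ("2015-06-01T10:10", "hash-a"),
]

# Collect the set of distinct devices seen in each hour.
per_hour = defaultdict(set)
for timestamp, device in detections:
    hour = datetime.fromisoformat(timestamp).strftime("%Y-%m-%d %H:00")
    per_hour[hour].add(device)

# Aggregate: number of unique devices detected per hour.
unique_counts = {hour: len(devices) for hour, devices in sorted(per_hour.items())}
print(unique_counts)  # {'2015-06-01 09:00': 2, '2015-06-01 10:00': 1}
```

The aggregated counts on their own no longer single anyone out; the legal difficulty lies in the weeks of individual-level records needed to produce them.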
The CNIL concludes, however, that the process at stake is not that of (legally effective) anonymisation because JCDecaux is using one of its own salts and therefore would be capable of recovering the data.
What does this mean? That as long as the data controller has the technical capability to recover the original data from the dataset that has undergone a technical transformation [i.e. a sanitisation process, to use a ‘neutral’ terminology], the dataset can never be deemed to be legally anonymous… [in the hands of the data controller only? Most likely not, i.e. it is likely that the data would be deemed personal data when processed in the hands of third parties too…]
In other words, this would mean that legal controls (e.g. contracts with recipients of datasets, and/or internal policies governing data access by staff within the data controller’s organisation) could never mitigate re-identification risks effectively enough to counterbalance a technical capability that leaves a residual risk of re-identification.
As if it was not clear already, the CNIL continues by stating that “for an anonymisation solution to be effective, it must prevent all parties from singling out [in the sense of isolating] an individual within a set of data, from linking records within one set of data (or across two distinct sets of data), and from inferring information from this set of data.” Once again, it also said, it was because of the precise purpose of the processing envisaged that no legally-effective anonymisation could be characterised in the case at hand.
In summary, the CNIL characterises the envisaged process as a mere pseudonymisation technique, which does not preclude later linkage with other information or the inference of new information from the collected data. And, again, a pseudonymisation technique is not tantamount to an anonymisation technique: it is simply a security measure.
The CNIL is in fact quite cautious, and finally observes that, if the processing were to be authorised in the terms requested by JCDecaux, it would then be undertaken without the knowledge of the data subjects.
If the French view was the correct way to interpret both the Data Protection Directive and the new General Data Protection Regulation (GDPR), would EU data protection law be striking a proper balance between the free flow of data and the protection of privacy of its subjects?
The whole story becomes truly fascinating when read against the backdrop of the recent opinion of the EU Article 29 Working Party (Art. 29 WP) on the proposal for an ePrivacy Regulation to replace the ePrivacy Directive.
Why? Because in this opinion Art. 29 WP concludes:
“With regard to WiFi-tracking, depending on the circumstances and purposes of the data collection, such tracking under the GDPR is likely either to be subject to consent, or may only be performed if the personal data collected is anonymised.”
Could it mean that it is actually crucial to get the concept of anonymisation right?