“I am not a number …” – but to what extent does EU data protection law deem that I am identifiable from one if someone somewhere could link it back to me at a single point in time?
The Court of Justice of the EU (CJEU) has been hearing arguments in a case involving the legal status of internet protocol (IP) addresses – the strings of numbers that are assigned uniquely to individual devices when they connect to the Internet – under EU data protection law in Breyer v. Federal Republic of Germany (C-582/14). In particular, on 12 May 2016, Advocate General Campos Sánchez-Bordona (the AG) released a formal Opinion (this week made available in English) for the CJEU to consider before it issues its final decision.
One of the salient issues addressed in the Opinion is whether the processing of dynamic IP addresses falls within the remit of data protection rules because they may be deemed ‘personal data’. For background, the IP address assigned to your internet-connecting device may change if this address is assigned dynamically to your device by your internet service provider (ISP). Put another way, there is no guarantee that you will receive the same IP address from the address pool available for assignment by your ISP each time you go online. As such, it is unlike the situation where you have a static (also known as fixed) IP address, which allows continuous identification of the device connected to the network. We are, therefore, faced with two facts:
- A record of the date and time of a connection and the numerical address from which it originated does not reveal, directly or immediately, the identity of the natural person who owns a device used to access the website, or indeed the identity of the user operating the device (who could be any natural person).
- However, a dynamic IP address can facilitate the indirect identification of the owner of the device used to access the website and thus its user at a particular time. In the words of the AG himself, “Does the possibility that there may be such additional data, capable of being linked to the dynamic IP address, in itself make it possible to classify a dynamic IP address as personal data under the [data protection] directive?”
Why is this important? According to the facts of the case from which questions were referred to the CJEU by the German Federal Court of Justice, the politician and activist Patrick Breyer argued that the German Government had violated his right to data protection by storing data about his visits to governmental websites for longer than was necessary to deliver the website content or to enable the identification and prosecution of individuals responsible for network attacks. Breyer argued that a dynamic IP address associated with one of his visits could be linked back to him and would thus constitute personal data, even if such an association remains an abstract possibility. Consequently, he argued that this data collection practice is incompatible with the provisions of the EU Data Protection Directive (DPD) because the consent of individuals has not been obtained for the processing of their personal data.
The German government disagreed, arguing that dynamic IP addresses are not personal data since possessing this information does not allow it to identify the natural persons making use of such addresses at a particular point in time. According to the government, the information that may allow for such identification is only available to the ISP that assigned the address. That is, as ISPs systematically log in a file the date, time, duration and dynamic IP address given to each of their subscribers, they are normally capable of connecting a particular address with a particular date/timestamp and, therefore, potentially a particular person (who may in turn have been the device user at that time).
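For readers who find it easier to see the linkage mechanically, here is a minimal sketch of the kind of lookup described above. Everything in it is hypothetical and for illustration only: the record layouts (`Lease`, `Visit`), the function name `subscriber_for`, the subscriber labels and the addresses (taken from a range reserved for documentation) are assumptions, not a description of how any real ISP or website operator stores its logs.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Iterable, Optional


@dataclass(frozen=True)
class Lease:
    """Hypothetical ISP record: who held which dynamic address, and when."""
    subscriber: str
    ip: str
    start: datetime
    end: datetime


@dataclass(frozen=True)
class Visit:
    """Hypothetical website log entry: an address and a timestamp, no name."""
    ip: str
    at: datetime


def subscriber_for(visit: Visit, leases: Iterable[Lease]) -> Optional[str]:
    """Return the subscriber holding the visit's address at the visit time.

    Returns None when no lease record matches, which is the website
    operator's position when it has no access to the ISP's log.
    """
    for lease in leases:
        if lease.ip == visit.ip and lease.start <= visit.at < lease.end:
            return lease.subscriber
    return None


# The same dynamic address is handed to two subscribers on the same day.
ISP_LEASES = [
    Lease("subscriber-A", "203.0.113.7",
          datetime(2016, 5, 12, 8, 0), datetime(2016, 5, 12, 12, 0)),
    Lease("subscriber-B", "203.0.113.7",
          datetime(2016, 5, 12, 12, 0), datetime(2016, 5, 12, 18, 0)),
]

morning_visit = Visit("203.0.113.7", datetime(2016, 5, 12, 9, 15))
afternoon_visit = Visit("203.0.113.7", datetime(2016, 5, 12, 13, 30))

print(subscriber_for(morning_visit, ISP_LEASES))
print(subscriber_for(afternoon_visit, ISP_LEASES))
print(subscriber_for(afternoon_visit, []))  # operator alone, no ISP log
```

The point of the sketch is simply that the address alone resolves to nothing, while the address plus a timestamp plus the ISP's log resolves to a subscriber (who, as noted, may or may not be the person actually using the device).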
The question before the CJEU is “must Article 2(a) of [the DPD] be interpreted as meaning that an IP address which a service provider stores when his website is accessed already constitutes personal data for the service provider if a third party (an access provider) has the additional knowledge required in order to identify the data subject?” [The other question referred to the CJEU is whether, assuming dynamic IP addresses are deemed personal data and the intention behind processing such data is to ensure the general operability of a website, operators can rely upon the “legitimate interest” criterion (Article 7(f) DPD) to provide a legal basis for their actions, instead of the consent criterion (Article 7(a) DPD). For background on when personal data processing may be construed as necessary for the purposes of the legitimate interests pursued by the data controller, see my previous post here.]
As mentioned in an earlier post here, the important point of law at issue under the first question relates to the proper interpretation of the identification component of the legal concept of personal data under the DPD: “‘personal data’ shall mean any information relating to an identified or identifiable natural person (‘data subject’)” (Article 2(a)). In other words, should the legal classification depend on the additional information being accessible in practice to the data controller that carries out the processing, or should it more broadly take into account what additional information is available to a third party for whom the identification of a specific person to which a particular dynamic IP address relates is possible (even if the data controller does not hold such information)? In terms of the broader implications of this question, these can be construed as a choice between:
- A subjective approach on what is considered as ‘identifiable’ information: in considering the capacity to effect an identification of a specific individual from a piece of data, only the perspective and capabilities of the person carrying out the processing activity under consideration should be taken into account. On a strict interpretation, this would mean that the possibility that a user may ultimately be identified from data with the assistance of a third party is insufficient for it to be classified as personal data. What is relevant is the capacity of a person who has access to data to use her own resources to identify an individual from those data. If accepted, this approach may also be taken to imply a relativistic approach – a piece of data can be both personal data and non-personal data at the same time, as different people processing the data may simultaneously have different perspectives and capabilities that can be considered in the alternative. For example, what are personal data for an ISP may be deemed non-personal data for a website operator, if the identity of a natural person cannot be determined from the data by “reasonable means” that are directly available to this operator (see Recital 26 of the DPD). This may be because the additional information that would allow for identification is not held by the operator, nor is such information readily available to it.
- An objective approach on what is considered as ‘identifiable’ information: the perspectives of all people who may be able to effect an identification of a specific individual from a piece of data should also be considered. If accepted, this approach may also be construed as implying an absolutist approach – that is, if a piece of data is personal data for one person (because the data subject is identifiable to that person from the data), that data should be deemed personal data in all cases and for all those who process it regardless of constraints on subjective abilities and means. For example, the mere fact that an ISP holds additional information that allows for identification of natural persons related to certain IP addresses would then be sufficient to make those addresses personal data for all. More broadly, an absolutist approach may imply the consequence that if there is the mere possibility that there could be additional information (somewhere in the possession of someone) that allows for identification of individuals, the underlying data should be construed as personal data. Indeed, as it would never be possible to rule out, with absolute certainty, the possibility that some third party is in possession of additional data which may be combined with that information and is, therefore, capable of revealing a person’s identity, all data that relates to a person may be deemed personal data!
While the CJEU has previously addressed the issue of IP addresses and their status under the DPD, its identification analysis has so far been limited. For example, in Scarlet v SABAM (Case C-70/10), the CJEU confirmed that IP addresses are personal data, but the explanation for this finding was limited to a statement that this was because “they allow those users to be precisely identified”. However, to distinguish the discussion thus far, in that case (and related others at EU and Member State level) the personal data status of IP addresses was a secondary issue. Furthermore, the CJEU has yet to rule upon the status of dynamic IP addresses. Therefore, the exact wording of the CJEU in the Breyer case explaining its thought processes should be regarded as just as important as its final conclusion.
So what did the AG say? He starts by stating that the issue of whether static IP addresses are personal data is not in dispute. In respect of dynamic IP addresses, the AG states that they “must be classified, for the provider of Internet services, as personal data” in connection with access to its web page (suggesting a relativistic approach). But, this is because “of the existence of a third party (the Internet service provider) which may reasonably be approached in order to obtain other additional data that, combined with a dynamic IP address, can facilitate the identification of a user”. In other words, he appears to shy away from a strict relativist approach in favour of a more objectivist angle. This is consistent with Recital 26 of the DPD, which states that “to determine whether a person is identifiable, account should be taken of all the means likely reasonably to be used either by the controller or by any other person to identify the said person.”
However, there are further nuances and caveats to his reasoning that should be heeded in this respect:
- In considering which third parties, possessing potential means of identification, should be included when determining the question of identifiability, the AG eliminates ‘abstract risks of identification’ scenarios in which the requisite additional information is held by a third party who is hypothetical, unknown or inaccessible (unlike an ISP who is “known with certainty to be in possession of the data required by the service provider to identify a user”). In other words, he discards the absolutist approach as too broad.
- Conversely, the AG pointed out that it must be likely that additional information will be used – in association with the IP address – to identify the user. In other words, in considering the extent to which in the future the operator could ask the ISPs for additional data in order to combine them with the dynamic IP address, he appears to retreat to a more subjectivist approach. Abstracting the logic from this statement, there must be the reasonable possibility of the existence of an ‘accessible’ third party, having the means necessary to facilitate the identification of a person.
- Furthermore, in continuing to examine the perspective of the operator and its reasonable means to identify an individual, he discounts scenarios where ISPs are not allowed to provide additional information to operators without a proper legal ground to do so. The alternative view is that the possibility of obtaining identifying information from ISPs, even if prohibited by law, should still be taken into account when decisions about the identifiability of IP addresses are made. This, of course, reflects a more general debate about whether illegal means of identification can ever be part of determinations as to what constitutes identifiable information. To note, the Article 29 Working Party (“WP”) has not previously addressed, and seems not to distinguish between, legal and illegal means of obtaining identifying information. This, in turn, suggests that the alternative view that it is reasonable to include both types of methods when assessing the possibility of identification from IP addresses remains a credible one, at least until the CJEU’s final decision is published. [Although undoubtedly overcoming legal obstacles may increase the effort and cost associated with obtaining identifying information!]
- On the other hand, the AG says that it is relevant neither that the controller has no practical intention to identify the user at the time of the IP logging, nor that the ISP may in fact refuse to reveal the additional information. Such information should still be taken into account as means ‘likely reasonably’ to be used by the operator assuming that the operator could legally get access to such additional information in the future: i.e. the practical possibility “that the data may be transferred, which is perfectly ‘reasonable’, itself transforms the dynamic IP address”. If that were not the case, he says, website operator data controllers (like governments) could retain IP addresses indefinitely and could request at any time from the ISP additional data to combine with the IP address in order to identify the user. Thus, although the AG acknowledged that an ISP would not in all instances provide such information (e.g. due to legal restrictions), requests to the ISP do qualify as “means likely to be used” by federal agencies. The AG explains that, should a federal agency wish to identify an individual (e.g. to claim damages after misuse of a website) via a dynamic IP address, it would “likely reasonably” revert to the ISP to request the identifying information belonging to the IP address.
In other words, the AG is aware of the risks involved in scoping the legal concept of personal identifiability from data too broadly and, therefore, tries to find a conciliatory solution. This is in line with the test set out in Recital 26, that consideration of information that could identify a natural person is limited to the means that are “likely reasonably to be used” by a specific party. This should encompass consideration of certain third parties that may hold additional information, insofar as legal ways of getting access to such information from these third parties are “likely reasonably to be used” by the specific party whose perspective is being adopted.
In conclusion, while the opinion of the AG is not binding on the CJEU, it is highly influential and often a good indication of how the court will eventually rule. Therefore, an analysis of the Opinion is valuable and a step towards more certainty regarding the legal analysis of personal identifiability under data protection rules. Undoubtedly more clarity is required in spelling out the overarching logic of conclusions drawn about what are personal data and what are not, and in disentangling the different arguments used.
On the other hand, precisely because the AG embraces the reasonableness element of determining identifiability – which in turn requires a relativist assessment of the circumstances of the processor at issue – much depends upon the context in which the data is used. Hence, the AG’s chosen framework of analysis is tailored to the data situation at hand. The challenge, of course, is to determine which means should be deemed likely and reasonable and taken into account when decisions about the identifiability of individuals are made. Yet what aids to identification are likely and reasonable to be used (or which should be considered unlikely and unreasonable) necessarily depends on the circumstances of each individual case. (For example, the average website or portal operator normally does not have the legal and technical means of a government to request information from an ISP). Moreover, this reasonableness analysis cannot be sidestepped in determinations relying upon references in the new GDPR to online identifiers as relevant to the definition of personal data (Article 4(1), Recital 30), and the encompassing of identifiability to include the possibility of a nameless person being singled out from data (Recital 26).
Certainly, in waiting for the CJEU’s final decision, we may ponder the extent to which it will grasp the nettle and bring clarity to this long-running discussion, either on a narrow view or by taking a broader view on the status of IP addresses in general. And remember the referring court did not draft its question to include allusion to any third party which is in possession of additional data, but only to an ISP – yet the CJEU could extend its analysis (like the AG) to consider the potential relevance of other possible holders of such data.
In the meantime, we cannot exclude the “precautionary and preventative”, risk-averse view put forward by the WP in its Opinion on the concept of personal data, 4/2007 (page 16; “unless the Internet Service Provider is in a position to distinguish with absolute certainty that the data correspond to users that cannot be identified, it will have to treat all IP information as personal data, to be on the safe side”) and Opinion on data protection issues related to search engines, 1/2008 (page 8; “though IP addresses in most cases are not directly identifiable by search engines, identification can be achieved by a third party. Internet access providers hold IP address data. Law enforcement and national security authorities can gain access to these data and in some Member States private parties have gained access also through civil litigation. Thus, in most cases – including cases with dynamic IP address allocation – the necessary data will be available to identify the user(s) of the IP address.”). Yet, such statements are not absolute, e.g. even the WP acknowledges that not all IP addresses can be effectively linked to a user (e.g. those attributed to computers in an Internet café). And the WP doesn’t address the issue of what happens if an ISP’s time-stamped data logs are not retained. On the other hand, where it is possible to track and correlate logged web searches originating from a single dynamic IP address (e.g. in the case of Google that has at its disposal different means – such as user-unique ID browser cookies – to ‘identify’ a particular person from the address other than approaching the ISP), this more obviously makes the case that such addresses should be treated as personal data. Context is indeed everything!
What is clear is that these issues are not as straightforward as they may seem at first glance. We must accept that this case does not promise to clarify whether dynamic IP addresses are, always and in all circumstances, personal data within the meaning of the DPD. Nor will it settle which factors should be considered when deciding if a means of identification is likely reasonably to be used [sophistication, time, money, expertise, manpower, information security risks, the likely emergence of future identity-linking technologies? …and how do we quantify these? And then, there is of course purpose and whether this is linked to identifying the user as a factor that could be considered (based on the theory that the greater the motivation, the greater also the willingness to acquire and use identity-revealing means)…].
Yet certainly this case should be watched closely by all those interested in the threshold point where non-personal data becomes personal data (and vice versa). These include web operators that collect IP addresses for marketing purposes, including those that justify their actions by alluding to the fact they subsequently apply pseudonymisation techniques to them.
Is the CJEU also going to give us a heterogeneous analysis that is limited to a very specific set of facts, or more general presumptions peppered by caveats aplenty? And how will the CJEU bring it all together with consistency, bearing in mind that it has been referred questions by another Member State court here – in a case asking whether exam papers are personal data – that calls for a similarly detailed-level analysis in this area?