On 5 February 2015, the Article 29 EU Data Protection Working Party (WP) issued a letter addressed to Paul Timmers, the Director of Sustainable and Secure Society at the European Commission. In the Annex to this letter, the WP identifies the criteria for determining when data processed by lifestyle and wellbeing apps and devices should be considered ‘health data’ in the legal sense. This clarification follows the WP’s 2013 opinion on apps on smart devices, the aim of which was to ascertain the rights and duties of each relevant party in app interactions.
For background, in the context of its work in providing guidance on the EU legislation applicable to such apps, the European Commission had previously requested that the WP clarify the category of health data. In its working document on the existing EU legal framework applicable to lifestyle and wellbeing apps accompanying the Green Paper on mobile health, the Commission had briefly touched upon the definition of health data by referring to a WP working document of 2007, from which the Commission concludes that “health data should cover any personal data closely linked to the health status of a person, such as genetic data or data on consumption of medicinal products or drugs”.
In the Annex to its letter of 5 February, the WP is a bit more prolix.
First, the WP acknowledges that the Data Protection Directive does not clearly define what falls within the category of health data within the meaning of its Article 8, which in the context of ‘Big Data’ (to use a buzzword) is problematic. In effect, “defining the category of health data is important to determine in what circumstances the data processed by lifestyle and wellbeing apps and devices are to be considered data about health”. This is all the more true because the regime for the processing of health data is more restrictive than the regime for the processing of other types of personal data. In most cases explicit consent is required before health data can be processed, unless national law expressly provides that consent can never justify the processing of such data (see Article 8) [Note, however, an important caveat: when the processing relates to data which are manifestly made public by the data subject, explicit consent is no longer required].
According to the WP, health data within the meaning of Article 8 of the Data Protection Directive [as well as within the meaning of Article 4 of the proposed General Data Protection Regulation since the Working Party is referring to the text of the proposed Regulation to interpret that notion] covers 2 types of data:
- Medical data: “data about the physical or mental health status of a data subject that are generated in a professional, medical context. This includes all data related to contacts with individuals and their diagnosis and/or treatment by (professional) providers of health services and any related information on diseases, disabilities, medical history and clinical treatment. This also includes any data generated by devices or apps, which are used in this context, irrespective of whether the devices are considered to be ‘medical devices’”.
- All other “data pertaining to the health status of a data subject”.
In elucidating what exactly should be included in the second data category, the WP borrows from Recital 26 of the proposed General Data Protection Regulation, which lists the kinds of information that personal data relating to health should include. Like Recital 26, the WP opines that “any information about ‘disease risk’ and about ‘the actual physiological or biomedical state of the data subject independent of its source’” falls within the category of health data. In other words, each time health status [note that ‘health status’ is broader than ‘ill health’] can be inferred from data, such data falls within the category of health data. To be even clearer, even if the data at stake does not seem to be health data on the face of it, it can still fall within the category of health data if it is collected for the purpose of determining the health status of data subjects. To quote the WP directly: health data “may also include cases where a controller uses any personal data (health data or not) with the purpose of identifying disease risk (such as, for example, investigating exercise habits or diet with the view of testing new, previously unknown or unproven correlations between certain lifestyle factors and diseases)”. To give an example, the WP refers to ‘sad’ messages sent by users: when examined “for the purpose of diagnosis/health risk prevention or medical research” such messages constitute health data.
It is true that the Annex could seem inconsistent at first glance. On the one hand, it seems that any personal data can become health data if it is collected for the purpose of inferring health status. On the other hand, the WP draws a distinction between raw data (which would not always fall within the category of health data) and conclusions drawn from raw data (which attempt to describe or express the health status of a data subject). However, the justification underpinning this latter distinction seems to collapse into another way of saying that when raw data are collected for the purpose of describing or expressing the health status of a data subject, the whole dataset becomes health data.
This second subcategory of health data is thus very broad in scope and arguably includes within its remit the first subcategory mentioned above, since medical data is also data collected for the purpose of determining the health status of a data subject, in order to make a diagnosis and eventually provide appropriate treatment.
In consequence, what seems to be the crucial criterion is the usage of data and not so much its very nature, which could mean that in the end the criterion is more subjective than objective: what matters would be the intention of the data controller rather than the characteristics of the data themselves [unless an objective or possibly mixed standard is used to determine the intention of the data controller, e.g. that of a vigilant data controller; it could be argued in such a case that if a rich dataset is in fact collected, the intention of the data controller must have been that of characterising the health status of data subjects].
Can we argue from this that the same kind of reasoning should be used to determine the remit of the category of ‘personal data’ itself under EU data protection rules? Following this logic, personal data might be redefined as data used to identify or single out data subjects. Is this a clearer definition than the definition found in existing, as well as purportedly “soon”-to-be adopted, European legislation (e.g. “information relating to a data subject” as per Article 4 of the proposed Regulation)?
One key question to fully grasp the contours of the category of health data (and maybe that of personal data as well) is the relevance of the possibility of combining different datasets. The WP recognises that “raw, relatively low privacy impact personal data can quickly change into health data when the dataset can be used to determine the health status of a person”. This is particularly likely to occur when a first set of data is then combined with another set of data.
And here is the answer from the WP in the Annex: “there has to be a demonstrable relationship between the raw data set and the capacity to determine a health aspect of a person, based on the raw data itself or on the data in combination with data from other sources”.
What does a ‘demonstrable relationship’ mean here? An example is given to clarify the statement. “If a diet app only counts the calories as calculated from input provided by the data subject, and the information about the specific foods eaten would not be stored, it would be unlikely that any meaningful conclusions can be drawn with regard to the health of that person (unless the daily intake of calories is excessive in absolute terms)”. But if this data is combined with another dataset (e.g. information coming from a social network), the WP goes on to say, it may be that conclusions could be inferred about the health status of data subjects. In those circumstances, the WP concludes that “it is likely that health data can be inferred from the combined data”.
Where does this leave us? It would seem that in order to avoid data being qualified as health data, it is crucial to assess upfront the extent to which it is possible to minimise the amount of such data collected and, above all, to make sure such data will not be combined with other datasets, in particular with data coming from more generalist websites or apps.
Does this mean that data gathered by generalist websites or apps do not fall within the category of health data? It depends… If a generalist website or app starts to ask its users to input a health status, for example, then the data is clearly health data. The same should be true if the website or app combines a wide variety of information (information about daily physical activities, menus, habits…), as it would become quite easy to infer a health status from this information.