CPPA: identifying the inscrutable meaning and policy behind the de-identifying provisions
The Consumer Privacy Protection Act (CPPA) will make substantial changes to Canada’s privacy law. As noted previously, the bill includes many of the provisions in the Personal Information Protection and Electronic Documents Act (PIPEDA), plus a lot more. In a prior post, CPPA: transfers of personal information to service providers, I examined the new provisions dealing with transfers of personal information to service providers.
In this post I examine the significant proposed changes to the law as they relate to personal information that has been “de-identified”. The term “de-identified” is often used as a general term that includes privacy and security processes that render personal information either “anonymized” or “pseudonymized”. “Pseudonymization” commonly refers to a de-identification method that removes or replaces direct identifiers from a data set but leaves in place data that could be used to indirectly identify a person. This data is generally still subject to privacy laws. “Anonymization”, by contrast, generally refers to a stronger form of de-identification which (depending on the formulation) makes re-identification impossible, reasonably unlikely, or not reasonably expected. This data is generally considered to be outside of general privacy law obligations.
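The practical difference can be illustrated with a minimal sketch in Python. It is an illustration only, not a statement of any legal test: the records, the field names, the secret key, and the `pseudonymize` and `link` helpers are all hypothetical. The sketch replaces a direct identifier (a name) with a keyed token, then shows how the indirect identifiers left in place (postal code and birth year) can be matched against another source.

```python
import hashlib
import hmac

# Hypothetical key, assumed to be held separately from the data set.
SECRET_KEY = b"kept-separately-from-the-data"


def pseudonymize(record):
    """Replace the direct identifier (name) with a keyed token; keep all other fields."""
    token = hmac.new(
        SECRET_KEY, record["name"].encode("utf-8"), hashlib.sha256
    ).hexdigest()[:12]
    out = {key: value for key, value in record.items() if key != "name"}
    out["token"] = token
    return out


def link(pseudonymized_records, other_source):
    """Attempt re-identification by matching indirect identifiers against another source."""
    matches = {}
    for rec in pseudonymized_records:
        candidates = [
            person["name"]
            for person in other_source
            if person["postal_code"] == rec["postal_code"]
            and person["birth_year"] == rec["birth_year"]
        ]
        if len(candidates) == 1:  # only a unique match points to one person
            matches[rec["token"]] = candidates[0]
    return matches


# Made-up data for illustration only.
customers = [
    {"name": "Alice Tremblay", "postal_code": "K1A 0B1", "birth_year": 1980, "purchase": "insulin"},
    {"name": "Bob Singh", "postal_code": "M5V 2T6", "birth_year": 1975, "purchase": "books"},
]
public_directory = [
    {"name": "Alice Tremblay", "postal_code": "K1A 0B1", "birth_year": 1980},
    {"name": "Bob Singh", "postal_code": "M5V 2T6", "birth_year": 1975},
    {"name": "Carol Wong", "postal_code": "M5V 2T6", "birth_year": 1975},
]

pseudonymized = [pseudonymize(c) for c in customers]
reidentified = link(pseudonymized, public_directory)
```

In this made-up data, one pseudonymized record matches a single person in the other source while the second matches two people and so is not uniquely linked. Whether a record is identifiable therefore depends on what other information is available. Stronger techniques would also generalize or remove the indirect identifiers.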
The proposed “de-identification” changes to PIPEDA are of considerable concern. They appear to treat all de-identified information as being subject to the CPPA. By failing to calibrate the law to address the significant differences between anonymized and pseudonymized data, the CPPA risks stultifying the use of anonymized data for many economically and socially beneficial purposes. The changes are a departure from the highest internationally recognized standards, could impair the interoperability of our laws across the country and with those of our major trading partners, and could hurt our competitiveness domestically and internationally.
De-identification under PIPEDA
PIPEDA does not expressly address whether personal information that has been anonymized (or rendered sufficiently anonymized) is still subject to the Act. However, as a matter of statutory construction it is clear that when personal information is permanently stripped of identifying information it is no longer subject to the Act.
PIPEDA defines personal information as “information about an identifiable individual”. Thus, when information is “not about” an individual that can be identified, it is not personal information. Decisions interpreting PIPEDA and other similarly worded federal and provincial privacy laws have confirmed this.
The only somewhat ambiguous issue is what level of anonymization the information must undergo before it is considered to be anonymized. The Federal Privacy Commissioner has historically advanced the “serious possibility” of identification or re-identification test to determine when personal information becomes anonymous. More recently, in commenting on the privacy implications of the COVID notification app, the Commissioner stated:
True anonymity, technically speaking, would require the complete and permanent impossibility of reversing the data processes at play, which could reveal sources of personal information and so re-identify individuals. Put another way, in order for data to be rendered truly anonymous, it must be stripped of any and all potential linkages back to individuals
The weight of federal and provincial authority, however, has adopted the more practical “reasonable expectations” test. Under this formulation, information is still about an “identifiable” individual “if it is reasonable to expect that an individual can be identified from the information in issue including when combined with information from sources otherwise available”. If it is not, the compliance obligations under PIPEDA do not apply.
PIPEDA does not expressly address whether the process for anonymizing personal information requires consent. On one view, this is a use of personal information and so requires a consent. The consent may, arguably, be one that is implied or may be considered to be included in the purposes for which consent has been given. On another view, no consent is needed since the process of de-identification is not a use as it results in information that is not subject to the Act. The decisions which have considered whether anonymized information is subject to the law implicitly accept that the process of anonymizing personal information is not a use, or at least not a use that requires an additional consent.
PIPEDA also does not expressly deal with the status of personal information that has been pseudonymized. Based on the “reasonable to expect” test, it is likely that this information would remain “personal information” and subject to PIPEDA as it would be “reasonable to expect that an individual can be identified from the information in issue including when combined with information from sources otherwise available”.
Need for reform
Government studies related to privacy reform in Canada have noted the desirability for reforming the law related to personal information de-identification to address privacy concerns. The government has also recognized the desirability of doing so while also enabling innovation and using similar approaches to what exists in other jurisdictions to help with interoperability concerns, and to bring greater certainty to individuals and organizations, especially in the cross-border context.
Reforms in this area of the law are important to enable Canadian organizations, like their counterparts in the European Union and the United States, to use anonymized and pseudonymized personal information. Many beneficial uses have already been identified including, for example, machine learning, open data initiatives, research and development including collaborative research, creating large data sets for predictive analytics, data trusts, data pooling, and monetization. Undoubtedly there are a host of other uses that cannot yet even be imagined. For a relatively small country such as Canada, being able to pool and share data for collaborative research and development purposes is critically important to address data asymmetries between Canadian organizations and much larger ones in other countries such as the United States, members of the European Union, and China.
International standards for de-identification
The GDPR expressly addresses whether anonymized personal data is subject to the Regulation. It is not. Personal data that is irreversibly and effectively anonymised is not “personal data” and the data protection principles do not have to be complied with in respect of such data. Organisations don’t have to be able to prove that it is impossible for any data subject to be identified in order for an anonymization technique to be considered successful. Rather, the GDPR applies a “reasonably likely” test. If it can be shown that it is reasonably likely that a data subject will not be identified given the circumstances of the individual case and the state of technology, the data can be considered anonymized.
Under the GDPR a new consent is not required for the process of anonymising personal data. Anonymization is considered to be compatible with the original purposes of the processing on condition the anonymization process is such as to reliably produce anonymized information. A further legal basis for such processing might be the “legitimate interests” exception to consent.
Personal data that has only been pseudonymized is still subject to the Regulation. This process is regarded as a good practice security measure. However, the GDPR permits processing for archiving purposes in the public interest, scientific or historical research purposes or statistical purposes, and general analysis, if personal data has been pseudonymized. Further, where processing is done for a purpose other than the purpose for which personal data was collected, with certain exceptions, in order to ascertain whether processing for another purpose is compatible with the purpose for which the personal data was initially collected, “the existence of appropriate safeguards, which may include encryption or pseudonymization” are taken into account.
The California Consumer Privacy Act of 2018, as amended (CCPA) defines personal information as “information that identifies, relates to, describes, is capable of being associated with, or could reasonably be linked, directly or indirectly, with a particular consumer or household” and includes specific examples of information that is covered. The law expressly excludes “consumer information that is deidentified or aggregate consumer information”. It also states that the obligations imposed on businesses do not restrict a business’s ability to, among other things “Collect, use, retain, sell, or disclose consumer information that is deidentified or in the aggregate consumer information.”
The CCPA defines the term “deidentified” as follows:
“Deidentified” means information that cannot reasonably identify, relate to, describe, be capable of being associated with, or be linked, directly or indirectly, to a particular consumer, provided that a business that uses deidentified information:
(1) Has implemented technical safeguards that prohibit reidentification of the consumer to whom the information may pertain.
(2) Has implemented business processes that specifically prohibit reidentification of the information.
(3) Has implemented business processes to prevent inadvertent release of deidentified information.
The CCPA also contains a general prohibition against reidentifying or attempting to reidentify deidentified personal information. However, it is subject to certain exceptions, such as for certain treatment and health care purposes, public health activities, research, to conduct testing, analysis, or validation of deidentification, or related statistical techniques, or where required by law.
Thus, as in the European Union, businesses in California are permitted to deidentify and use such information free of the restrictions that would otherwise apply to personal data or personal information. However, the rules under both the GDPR and CCPA still impose relatively high standards in practice to truly deidentify personal information.
The CPPA position on anonymization
The CPPA would leave unamended the definition of personal information in PIPEDA as meaning “information about an identifiable individual”. However, it would add the following new definition of “de-identify”:
de-identify means to modify personal information — or create information from personal information — by using technical processes to ensure that the information does not identify an individual or could not be used in reasonably foreseeable circumstances, alone or in combination with other information, to identify an individual. (dépersonnaliser)
Section 20 of the CPPA would also clarify that no fresh consent is required to de-identify personal information:
De-identification of personal information
20 An organization may use an individual’s personal information without their knowledge or consent to de-identify the information.
The CPPA would also include a new prohibition against re-identification of information.
75 An organization must not use de-identified information alone or in combination with other information to identify an individual, except in order to conduct testing of the effectiveness of security safeguards that the organization has put in place to protect the information.
Prohibitions against re-identification of personal information are becoming more common. A similar prohibition exists in s. 11.2 of Ontario’s Personal Health Information Protection Act. However, that prohibition contains a number of exceptions which permit re-identification by health custodians and others. That Act also permits other organizations to be added via regulations, a good idea that addresses the need for flexibility, but which is lacking in the CPPA. The prohibition against re-identification in the CPPA also does not include the much longer list of beneficial exceptions that are available under the CCPA (treatment and health care purposes, public health activities, research, to conduct testing, analysis, or validation of deidentification, or related statistical techniques, or where required by law).
The new CPPA provisions which permit de-identification of personal information would appear, looking only at those amendments, to generally confirm the existing interpretations of PIPEDA. So interpreted, they would also generally align with the provisions in the GDPR and CCPA in that they would confirm that organizations can anonymise personal information and use it outside of the new privacy law. The standard for de-identification to render data anonymous has now been codified as a “reasonably foreseeable” test rather than the “reasonable to expect” or “serious possibility” of re-identification tests. In this respect, the test now somewhat departs from the generally understood (reasonable to expect) wording adopted across the country. This raises the questions of whether the government intends this wording to be different and to create somewhat different standards across the country for data to become anonymous, and, if so, why.
It appears, however, that de-identified information may still be subject to the CPPA. Officials from ISED (Jennifer Miller and Charles Taillefer), in a briefing to members of the Chamber of Commerce’s Innovation and IP Committee on November 27, 2020, stated this was the intention. The drafting of the CPPA also suggests this possibility. Unlike the GDPR and the CCPA there is no express statement that de-identified information is not subject to the CPPA. Other proposed amendments suggest this also.
Sections 21, 22(1) and 39(1) contain new exceptions for research and development, prospective business transactions, and socially beneficial purposes.
Research and development
21 An organization may use an individual’s personal information without their knowledge or consent for the organization’s internal research and development purposes, if the information is de-identified before it is used.
Prospective business transaction
22 (1) Organizations that are parties to a prospective business transaction may use and disclose an individual’s personal information without their knowledge or consent if
(a) the information is de-identified before it is used or disclosed and remains so until the transaction is completed;
(b) the organizations have entered into an agreement that requires the organization that receives the information
(i) to use and disclose that information solely for purposes related to the transaction,
(ii) to protect the information by security safeguards appropriate to the sensitivity of the information, and
(iii) if the transaction does not proceed, to return the information to the organization that disclosed it, or dispose of it, within a reasonable time;
(c) the organizations comply with the terms of that agreement; and
(d) the information is necessary
(i) to determine whether to proceed with the transaction, and
(ii) if the determination is made to proceed with the transaction, to complete it.
Completed business transaction
(2) If the business transaction is completed, the organizations that are parties to the transaction may use and disclose the personal information referred to in subsection (1) without the individual’s knowledge or consent if
(a) the organizations have entered into an agreement that requires each of them
(i) to use and disclose the information under its control solely for the purposes for which the information was collected or permitted to be used or disclosed before the transaction was completed,
(ii) to protect that information by security safeguards appropriate to the sensitivity of the information, and
(iii) to give effect to any withdrawal of consent made under subsection 17(1);
(b) the organizations comply with the terms of that agreement;
(c) the information is necessary for carrying on the business or activity that was the object of the transaction; and
(d) one of the parties notifies the individual, within a reasonable time after the transaction is completed, that the transaction has been completed and that their information has been disclosed under subsection (1).
(3) Subsections (1) and (2) do not apply to a business transaction of which the primary purpose or result is the purchase, sale or other acquisition or disposition, or lease, of personal information.
Socially beneficial purposes
39 (1) An organization may disclose an individual’s personal information without their knowledge or consent if
(a) the personal information is de-identified before the disclosure is made;
(b) the disclosure is made to
(i) a government institution or part of a government institution in Canada,
(ii) a health care institution, post-secondary educational institution or public library in Canada,
(iii) any organization that is mandated, under a federal or provincial law or by contract with a government institution or part of a government institution in Canada, to carry out a socially beneficial purpose, or
(iv) any other prescribed entity; and
(c) the disclosure is made for a socially beneficial purpose.
Definition of socially beneficial purpose
(2) For the purpose of this section, socially beneficial purpose means a purpose related to health, the provision or improvement of public amenities or infrastructure, the protection of the environment or any other prescribed purpose.
If information that does not identify, or cannot be used to identify, an individual were not subject to the other provisions of the CPPA, there would be no reason for these exceptions.
The CPPA would also prescribe a new proportionality standard for assessing whether the measures taken to de-identify personal information are adequate having regard to the purpose for which the information is de-identified and the sensitivity of the personal information. It states:
De-identification of Personal Information
Proportionality of technical and administrative measures
74 An organization that de-identifies personal information must ensure that any technical and administrative measures applied to the information are proportionate to the purpose for which the information is de-identified and the sensitivity of the personal information.
There would also be no reason for this provision if de-identified information was not intended to be subject to the CPPA. Further, if the provision applied to all uses of de-identified information it would be uncertain how the test could be applied to new unrestricted uses that could be made of the information. The provision would be potentially relevant, however, if information could only be de-identified for the purposes of the new exceptions, where the purposes of the de-identification and the sensitivity of the personal information could vary.
The CPPA, construed to apply to anonymized information, would be inconsistent with the GDPR and the CCPA and would make economically and socially beneficial uses of personal information difficult for Canadian organizations, as all of the fair information practice principles would apply to this data. These consequences were depicted in a paper written by Mike Hintze and Khaled El Emam, Comparing the Benefits of Pseudonymization and Anonymization Under the GDPR:
Most of the obligations that would not apply to anonymized personal data under the GDPR would apply to the applicable Canadian analogues under the CPPA. While the privacy obligations under the CPPA may be sensible in order to protect an individual’s reasonable expectations of privacy in respect of pseudonymized data, the same rationale in respect of anonymized data either does not exist or, at the very least, is significantly attenuated.
The increased burdens and practical obstacles that are created when anonymized data is subject to all of the same obligations as personal information cannot be justified. For example, artificial intelligence, and in particular machine learning applications, require very large data sets to train their algorithms. Privacy law rules that limit collection of personal information to minimal amounts or that require data to be deleted could undermine the uses of the data. They could also interfere with other developing norms, including the new CPPA requirement that organizations be able to explain predictions, recommendations, or decisions made about an individual using an automated decision system. It may be argued that the proposed exception for using de-identified information for research and development adequately addresses this problem. It does not, because while such information can be used for R&D purposes, the data is still subject to all of the other applicable CPPA requirements, which could undermine the usefulness of the exception.
The CPPA as interpreted to apply to anonymized data would be a radical change from PIPEDA and a departure from the highest international privacy standards. While Canada’s major trading partners can use anonymized personal information without restrictions, Canadian organizations could only do so for the limited purposes set out in the new proposed exceptions. The CPPA risks impeding the use of anonymized data for many beneficial purposes. The provisions dealing with de-identification will impose heavy compliance costs that do not exist in the EU or the United States, even under the CCPA. The new rules around de-identification could also impair the interoperability of our federal privacy law with provincial laws and with those of our major trading partners and undermine our competitiveness domestically and internationally.
Despite the reasons why the CPPA may be construed to apply to personal information that has been pseudonymized or anonymized, there are also reasons to believe that the CPPA intends only pseudonymized information to be subject to the law. As noted above, the CPPA has exceptions for the uses of de-identified information for research and development, prospective business transactions, and for certain socially beneficial uses. These exceptions would make little practical sense and be of little value if they applied to anonymized information. Unlike personal information that has been anonymized, personal information that is only pseudonymized permits organizations to retain additional data sets that would enable the information to be re-identified as long as it is subject to technical and organisational measures to ensure that the personal information is not attributed to an identified or identifiable person. Organizations seeking to rely on the exceptions would be very unlikely to anonymise their personal information only so they could engage in these exceptions. Disclosing information as part of a proposed business transaction is an obvious example. The new proportionality principle might well also have been intended only to apply to pseudonymized information. This construction of the amendments would generally also align with the GDPR which contains specific exceptions for personal data that has only been pseudonymized (for archiving purposes in the public interest, scientific or historical research purposes or statistical purposes, and general analysis). However, unlike the GDPR or the CCPA, the CPPA does not contain a definition for this term and the definition of the term “de-identify” makes this interpretation a challenge. The drafting, at best, is ambiguous.
Privacy laws are designed to strike a balance between different interests. On the one hand, individuals and the public have reasonable expectations that their privacy will be protected. The importance of the right has been repeatedly emphasized by the Supreme Court. To the extent that information retains the characteristics of being “personal information” namely, that it is about an identifiable individual, then it is subject to a balance that includes other legitimate uses of the information. The overall intended balance of the CPPA is reflected in s.5:
5 The purpose of this Act is to establish — in an era in which data is constantly flowing across borders and geographical boundaries and significant economic activity relies on the analysis, circulation and exchange of personal information — rules to govern the protection of personal information in a manner that recognizes the right of privacy of individuals with respect to their personal information and the need of organizations to collect, use or disclose personal information for purposes that a reasonable person would consider appropriate in the circumstances.
In respect of de-identified data, when the risk is high that de-identified personal information can be re-identified or is foreseeably likely to be re-identified, then the privacy interests of protecting it apply. When the risk of re-identification is low or when the information becomes (theoretically) impossible to re-identify then the rationale for continuing to protect the information under privacy laws, at the expense of other important interests, is either attenuated, severely attenuated, or ceases to exist. There may be other policy reasons for limiting the uses of such information, but the privacy interests decline as one moves along a sliding scale of information being about an “identifiable” individual to when it is impossible to identify an individual.
Weighed against the attenuated interest in protecting anonymized information under privacy laws are the very important economically and socially beneficial uses associated with anonymized data.
There is also a longstanding policy of the law not to impose unjustified restrictions on the uses of information. For example, copyright imposes limitations on the uses of original works and other subject matter, but these are subject to numerous exceptions that the Supreme Court has called “user rights” in order to foster uses of information by the public. The law also protects against certain harmful forms of information dissemination such as information that is defamatory or that constitutes hate speech. However, because of freedom of speech rights under the Charter, there are constitutional limits on these restrictions. Prohibitions on the uses of information must be carefully tailored so that rights are impaired no more than necessary and there must be proportionality between the objective and the measures adopted by the law, and between the salutary and deleterious effects of the law. The Supreme Court has held that the Charter also limits the extent to which privacy laws can impair freedom of speech rights. These principles are also applicable to assessing the balance the CPPA takes to potentially limiting the uses to which de-identified information can be put.
The federal government, as noted above, has recognized the need to take a risk-based approach to dealing with anonymized and pseudonymized data. The CPPA does not expressly address the latter category and is ambiguous as to how it applies to the former. Perhaps this is just technical drafting rather than a failure to take the risk-based approach that had been considered. Either way, these provisions need fixing.
This article was first posted on www.barrysookman.com.
The term is defined in the General Data Protection Regulation (GDPR): “‘pseudonymization’ means the processing of personal data in such a manner that the personal data can no longer be attributed to a specific data subject without the use of additional information, provided that such additional information is kept separately and is subject to technical and organisational measures to ensure that the personal data are not attributed to an identified or identifiable natural person;” and in the California Consumer Privacy Act of 2018 (CCPA) as follows: “‘Pseudonymize’ or ‘Pseudonymization’ means the processing of personal information in a manner that renders the personal information no longer attributable to a specific consumer without the use of additional information, provided that the additional information is kept separately and is subject to technical and organizational measures to ensure that the personal information is not attributed to an identified or identifiable consumer.”
See Mike Hintze et al., Comparing the Benefits of Pseudonymization and Anonymization Under the GDPR, Privacy Analytics, August 2017.
“Information will be about an identifiable individual if there is a serious possibility that someone could identify the available information. It is not necessary, she commented, to demonstrate that someone would necessarily go to all lengths to actually do so. Consequently, de-identified data will not constitute “truly anonymous information” when it is possible to subsequently link the de-identified data back to an identifiable individual.”
Also, Gordon v. Canada (Health), 2008 FC 258:
“Counsel for the Privacy Commissioner, the Intervener, urged the adoption of the following test in determining when information is about an identifiable individual:
Information will be about an identifiable individual where there is a serious possibility that an individual could be identified through the use of that information, alone or in combination with other available information.
I am satisfied that the foregoing is an appropriate statement of the applicable test.”
 See, Canada (Information Commissioner) v. Canadian Transportation Accident Investigation & Safety Board, 2006 FCA 157 in interpreting the Privacy Act:
These two words, “about” and “concernant”, shed little light on the precise nature of the information which relates to the individual, except to say that information recorded in any form is relevant if it is “about” an individual and if it permits or leads to the possible identification of the individual. There is judicial authority holding that an “identifiable” individual is considered to be someone whom it is reasonable to expect can be identified from the information in issue when combined with information from sources otherwise available (Colin H. H. McNairn and Christopher D. Woodbury, Government Information: Access and Privacy (Toronto: Carswell, 1992), at page 7‑5; Ontario (Attorney General) v. Ontario (Information and Privacy Commissioner) (2001), 2001 CanLII 32755 (ON SCDC), 39 Admin. L.R. (3d) 112 (Ont. Div. Ct.); affd [sub nom. Ontario (Attorney General) v. Pascoe] (2002), 2002 CanLII 30891 (ON CA), 22 C.P.R. (4th) 447 (Ont. C.A.)).
Saskatchewan Health Authority (Re), 2019 CanLII 44080 (SK IPC) interpreting the FOIP
As stated in my office’s Guide to Exemptions, available on my office’s website, for there to be an identifiable individual, it must be reasonable to expect that an individual may be identified if the information were disclosed. The information must reasonably be capable of identifying particular individuals because it either directly identifies a person or enables an accurate inference to be made as to their identity when combined with other available sources of information, or due to the context of the information in the record…
The process of removing information that would enable the identification of a specific individual is referred to as de-identifying information. De-identification is used to protect privacy and while the goal is to ensure individuals cannot be identified, there may still be an inherent risk of re-identification. De-identification is not a guarantee of anonymity however when appropriate de-identification methods are used, the risk of re-identification is minimized as much as possible. De-identification to protect privacy continues to be an acceptable and reasonable process of protecting privacy…
Saskatchewan (Education) (Re), 2014 CanLII 47639 (SK IPC)
In order to qualify, it must be reasonable to expect that an individual may be identified if the information were disclosed. The Health Information Protection Act (HIPA) defines what sufficient de-identification of information means:
2(d) “de-identified personal health information” means personal health information from which any information that may reasonably be expected to identify an individual has been removed;
Hastings and Prince Edward District School Board (Re), 2008 CanLII 24747 (ON IPC)
To qualify as personal information, it must be reasonable to expect that an individual may be identified if the information is disclosed [Order PO-1880, upheld on judicial review in Ontario (Attorney General) v. Pascoe, 2002 CanLII 30891 (ON CA), [2002] O.J. No. 4300 (C.A.)].
Also to the same effect, Ottawa (City) (Re), 2016 CanLII 68086 (ON IPC)
Also, see, IPC De-identification Guidelines for Structured Data
As noted above, de-identification is the process of removing personal information from a record or data set. “Personal information” is defined in FIPPA and MFIPPA as “recorded information about an identifiable individual.” The Office of the Information and Privacy Commissioner of Ontario (IPC) and the courts have elaborated on this definition, specifically on the meaning of “identifiable,” in various orders and reviews. Based on these, de-identification may be defined more precisely as the process of removing any information that (i) identifies an individual, or (ii) for which there is a reasonable expectation that the information could be used, either alone or with other information, to identify an individual.
Applying a “reasonableness standard” to the definition of personal information means that you must examine the context to de-identify information. When de-identifying a data set, you must navigate and consider a number of issues, including: • Different release models. In de-identification, a data set may be released publicly, semi-publicly (also called “quasi-public”) or non-publicly.
Personal Health Information Protection Act, 2004, S.O. 2004,
“de-identify”, in relation to the personal health information of an individual, means to remove any information that identifies the individual or for which it is reasonably foreseeable in the circumstances that it could be utilized, either alone or with other information, to identify the individual, and “de-identification” has a corresponding meaning; (“anonymiser”)
 See, Strengthening Privacy for the Digital Age, ISED, Proposals to modernize the Personal Information Protection and Electronic Documents Act:
- “Under PIPEDA, personal information is defined as information about an identifiable individual. According to the Federal Court of Canada, “information will be about an ‘identifiable individual’ where there is a serious possibility that an individual could be identified through the use of that information, alone or in combination with other information.” “About” means that the information is not just the subject of something but also relates to or concerns the subject. Generally speaking, the definition of personal information is given a broad and expansive interpretation. In the era of Big Data, however, when vast amounts of data are being created every day, this potentially means that any piece of data could be considered to be about an identifiable individual. Moreover, there are increasingly sophisticated means to re-identify information that ostensibly appears to be non-personal. The idea that the anonymization of information, which would render such information outside the scope of privacy legislation, is practically attainable, is unlikely. That said, a risk-based approach, in which de-identified information could be defined and its use allowed in certain specified circumstances, with penalties for re-identification, could be taken to both address privacy concerns and enable innovation.
- Concepts such as pseudonymous information are being incorporated into other privacy laws, in recognition that there is a desire to use information that need not necessarily be personally identified, but that remain identifiable, and that protections are needed for such information. The concept of pseudonymous information could be incorporated into exceptions to consent, to clarify that while this information may not be “identified”, it still retains a privacy interest and must be protected….”
- “The concept of de-identified/pseudonymized data is being recognized in other laws as a way forward to enable innovation and protect privacy, with appropriate conditions surrounding it. Adopting a similar approach as exists in other jurisdictions will also help with interoperability concerns and will bring greater certainty to individuals and organizations, especially in the cross-border context.”
- “One important definitional issue is whether the Privacy Act should recognize new subsets of personal information in order to facilitate the application of a more nimble and context-sensitive rule set. For example, to create something akin to a “data trust” that differentiated in its treatment of identifiable and de-identified information, it would be important to be able to effectively and with sufficient certainty determine which rule set applied to which data elements. The current “in or out” approach to personal information does not accommodate more nuanced rules that may be organized around different levels of risk and foster compliance. Defining de-identified, anonymized, and pseudonymized information could support the development of new compliance incentives, allow for a more targeted and nuanced application of certain rules, and assist to ease some of the difficulties of practical application that arise under the current approach…”
- “Currently, other data protection regimes provide an important role for the use of de-identified personal information, and limit some obligations where encryption is employed. For example, in the GDPR, several provisions create incentives for data controllers to use de-identification, pseudonymization, or encryption methods, which can have an impact on how obligations apply, or even provide relief from certain obligations.”
See, for example, 9 Data Anonymization Use Cases You Need To Know Of, Canadian Anonymization Network, Frequently Asked Questions (Last Updated: January 15, 2020).
 ‘personal data’ is defined to mean “any information relating to an identified or identifiable natural person (‘data subject’); an identifiable natural person is one who can be identified, directly or indirectly, in particular by reference to an identifier such as a name, an identification number, location data, an online identifier or to one or more factors specific to the physical, physiological, genetic, mental, economic, cultural or social identity of that natural person”.
 See, Recital (26) of the GDPR:
The principles of data protection should apply to any information concerning an identified or identifiable natural person. Personal data which have undergone pseudonymisation, which could be attributed to a natural person by the use of additional information should be considered to be information on an identifiable natural person. To determine whether a natural person is identifiable, account should be taken of all the means reasonably likely to be used, such as singling out, either by the controller or by another person to identify the natural person directly or indirectly. To ascertain whether means are reasonably likely to be used to identify the natural person, account should be taken of all objective factors, such as the costs of and the amount of time required for identification, taking into consideration the available technology at the time of the processing and technological developments. The principles of data protection should therefore not apply to anonymous information, namely information which does not relate to an identified or identifiable natural person or to personal data rendered anonymous in such a manner that the data subject is not or no longer identifiable. This Regulation does not therefore concern the processing of such anonymous information, including for statistical or research purposes.
See, Irish Data Authority, Guidance Note: Guidance on Anonymisation and Pseudonymisation, June 2019
“Irreversibly and effectively anonymised data is not “personal data” and the data protection principles do not have to be complied with in respect of such data. Pseudonymised data remains personal data. If the source data is not deleted at the same time that the ‘anonymised’ data is prepared, where the source data could be used to identify an individual from the ‘anonymised’ data, the data may be considered only ‘pseudonymised’ and thus still ‘personal data’, subject to the relevant data protection legislation. Data can be considered “anonymised” from a data protection perspective when data subjects are not identified or identifiable, having regard to all methods reasonably likely to be used by the data controller or any other person to identify the data subject, directly or indirectly…
The concept of “identifiability” is closely linked with the process of anonymisation. Even if all of the direct identifiers are stripped out of a data set, meaning that individuals are not “identified” in the data, the data will still be personal data if it is possible to link any data subjects to information in the data set relating to them. Recital 26 of the GDPR provides that when determining whether an individual is identifiable or not “[…] account should be taken of all the means reasonably likely to be used, such as singling out, either by the controller or by another person to identify the natural person directly or indirectly” and that when determining whether means are ‘reasonably likely to be used’ to identify the individual “[…] account should be taken of all objective factors, such as the costs of and the amount of time required for identification, taking into consideration the available technology at the time of the processing and technological developments.” Recital 26 also clarifies that the principles of data protection do not apply to anonymous information.
Therefore, to determine when data are rendered anonymous for data protection purposes, you have to examine what means and available datasets might be used to reidentify a data subject. Organisations don’t have to be able to prove that it is impossible for any data subject to be identified in order for an anonymisation technique to be considered successful. Rather, if it can be shown that it is unlikely that a data subject will be identified given the circumstances of the individual case and the state of technology, the data can be considered anonymous.”
U.K. ICO, Is pseudonymised data still personal data?
“What about anonymised data?
The GDPR does not apply to personal data that has been anonymised. Recital 26 explains that:
“…The principles of data protection should therefore not apply to anonymous information, namely information which does not relate to an identified or identifiable natural person or to personal data rendered anonymous in such a manner that the data subject is not or no longer identifiable. This Regulation does not therefore concern the processing of such anonymous information, including for statistical or research purposes.”
This means that personal data that has been anonymised is not subject to the GDPR. Anonymisation can therefore be a method of limiting your risk and a benefit to data subjects too. Anonymising data wherever possible is therefore encouraged…
In order to be truly anonymised under the GDPR, you must strip personal data of sufficient elements that mean the individual can no longer be identified. However, if you could at any point use any reasonably available means to re-identify the individuals to which the data refers, that data will not have been effectively anonymised but will have merely been pseudonymised. This means that despite your attempt at anonymisation you will continue to be processing personal data.”
 See, Does anonymization or de-identification require consent under the GDPR? Quoting from the Opinion 05/2014 of the Article 29 Working Party on Anonymisation Techniques
“The Working Party considers that anonymisation as an instance of further processing of personal data can be considered to be compatible with the original purposes of the processing but only on condition the anonymisation process is such as to reliably produce anonymised information in the sense described in this paper.”
“Pseudonymisation is a technique that replaces or removes information in a data set that identifies an individual.
The GDPR defines pseudonymisation as:
“…the processing of personal data in such a manner that the personal data can no longer be attributed to a specific data subject without the use of additional information, provided that such additional information is kept separately and is subject to technical and organisational measures to ensure that the personal data are not attributed to an identified or identifiable natural person.”
Pseudonymisation may involve replacing names or other identifiers which are easily attributed to individuals with, for example, a reference number. Whilst you can tie that reference number back to the individual if you have access to the relevant information, you put technical and organisational measures in place to ensure that this additional information is held separately.
Pseudonymising personal data can reduce the risks to the data subjects and help you meet your data protection obligations.
However, pseudonymisation is effectively only a security measure. It does not change the status of the data as personal data. Recital 26 makes it clear that pseudonymised personal data remains personal data and within the scope of the GDPR.
“…Personal data which have undergone pseudonymisation, which could be attributed to a natural person by the use of additional information should be considered to be information on an identifiable natural person…””
Recital 29 (In order to create incentives to apply pseudonymisation when processing personal data, measures of pseudonymisation should, whilst allowing general analysis, be possible within the same controller when that controller has taken technical and organisational measures necessary to ensure, for the processing concerned, that this Regulation is implemented, and that additional information for attributing the personal data to a specific data subject is kept separately. The controller processing the personal data should indicate the authorised persons within the same controller.)
Recital 156 (The processing of personal data for archiving purposes in the public interest, scientific or historical research purposes or statistical purposes should be subject to appropriate safeguards for the rights and freedoms of the data subject pursuant to this Regulation. Those safeguards should ensure that technical and organisational measures are in place in order to ensure, in particular, the principle of data minimisation. The further processing of personal data for archiving purposes in the public interest, scientific or historical research purposes or statistical purposes is to be carried out when the controller has assessed the feasibility to fulfil those purposes by processing data which do not permit or no longer permit the identification of data subjects, provided that appropriate safeguards exist (such as, for instance, pseudonymisation of the data).)
Article 89 (Processing for archiving purposes in the public interest, scientific or historical research purposes or statistical purposes, shall be subject to appropriate safeguards, in accordance with this Regulation, for the rights and freedoms of the data subject. Those safeguards shall ensure that technical and organisational measures are in place in particular in order to ensure respect for the principle of data minimisation. Those measures may include pseudonymisation provided that those purposes can be fulfilled in that manner. Where those purposes can be fulfilled by further processing which does not permit or no longer permits the identification of data subjects, those purposes shall be fulfilled in that manner.)
 GDPR Art. 6.4.
 See, 1798.145.
“Research” is defined to mean “scientific, systematic study and observation, including basic research or applied research that is in the public interest and that adheres to all other applicable ethics and privacy laws or studies conducted in the public interest in the area of public health. Research with personal information that may have been collected from a consumer in the course of the consumer’s interactions with a business’s service or device for other purposes shall be:
(1) Compatible with the business purpose for which the personal information was collected.
(2) Subsequently pseudonymized and deidentified, or deidentified and in the aggregate, such that the information cannot reasonably identify, relate to, describe, be capable of being associated with, or be linked, directly or indirectly, to a particular consumer.
(3) Made subject to technical safeguards that prohibit reidentification of the consumer to whom the information may pertain.
(4) Subject to business processes that specifically prohibit reidentification of the information.
(5) Made subject to business processes to prevent inadvertent release of deidentified information.
(6) Protected from any reidentification attempts.
(7) Used solely for research purposes that are compatible with the context in which the personal information was collected.
(8) Not be used for any commercial purpose.”
 s. 1798.148.
(a) A business or other person shall not reidentify, or attempt to reidentify, information that has met the requirements of paragraph (4) of subdivision (a) of Section 1798.146, except for one or more of the following purposes:
(1) Treatment, payment, or health care operations conducted by a covered entity or business associate acting on behalf of, and at the written direction of, the covered entity. For purposes of this paragraph, “treatment,” “payment,” “health care operations,” “covered entity,” and “business associate” have the same meaning as defined in Section 164.501 of Title 45 of the Code of Federal Regulations.
(2) Public health activities or purposes as described in Section 164.512 of Title 45 of the Code of Federal Regulations.
(3) Research, as defined in Section 164.501 of Title 45 of the Code of Federal Regulations, that is conducted in accordance with Part 46 of Title 45 of the Code of Federal Regulations, the Federal Policy for the Protection of Human Subjects, also known as the Common Rule.
(4) Pursuant to a contract where the lawful holder of the deidentified information that met the requirements of paragraph (4) of subdivision (a) of Section 1798.146 expressly engages a person or entity to attempt to reidentify the deidentified information in order to conduct testing, analysis, or validation of deidentification, or related statistical techniques, if the contract bans any other use or disclosure of the reidentified information and requires the return or destruction of the information that was reidentified upon completion of the contract.
(5) If otherwise required by law.
(b) In accordance with paragraph (4) of subdivision (a) of Section 1798.146, information reidentified pursuant this section shall be subject to applicable federal and state data privacy and security laws including, but not limited to, the Health Insurance Portability and Accountability Act, the Confidentiality of Medical Information Act, and this title.
(c) Beginning January 1, 2021, any contract for the sale or license of deidentified information that has met the requirements of paragraph (4) of subdivision (a) of Section 1798.146, where one of the parties is a person residing or doing business in the state, shall include the following, or substantially similar, provisions:
(1) A statement that the deidentified information being sold or licensed includes deidentified patient information.
(2) A statement that reidentification, and attempted reidentification, of the deidentified information by the purchaser or licensee of the information is prohibited pursuant to this section.
(3) A requirement that, unless otherwise required by law, the purchaser or licensee of the deidentified information may not further disclose the deidentified information to any third party unless the third party is contractually bound by the same or stricter restrictions and conditions.
(d) For purposes of this section, “reidentify” means the process of reversal of deidentification techniques, including, but not limited to, the addition of specific pieces of information or data elements that can, individually or in combination, be used to uniquely identify an individual or usage of any statistical method, contrivance, computer software, or other means that have the effect of associating deidentified information with a specific identifiable individual.
(Added by Stats. 2020, Ch. 172, Sec. 3. (AB 713) Effective September 25, 2020.)
 For the CCPA see, “De-Identified” Data under the CCPA – Some Words of Caution
“Is De-Identifying the Answer to CCPA Compliance?
Maybe, they ask, can businesses escape some of the burdens of the law by “de-identifying” the information they have about their customers? (“Aggregated” information is, essentially, an average of information about a group of customers – something else entirely. See Cal. Civ. Code § 1798.140(a).)
Unfortunately, there are reasons to be skeptical that de-identification will be useful in significantly mitigating CCPA obligations. One key reason is that the law defines “de-identified” in an extremely stringent way. For a business to count information it has collected about consumers as de-identified, the following criteria must be met:
- The information “cannot reasonably identify, relate to, describe, be capable of being associated with, or be linked, directly or indirectly, to a particular customer;” and
- The business must have implemented technical safeguards and business processes that prohibit re-identification; and
- The business must have implemented business processes to prevent inadvertent release even of the de-identified data; and
- The business must not make any attempt to re-identify the information.
Cal. Civ. Code § 1798.140(h).
The first requirement will be particularly hard to meet. It’s hard to see how a business could make the required judgment of “reasonableness,” because the law doesn’t define any metrics to decide how difficult it must be to re-identify the data. On this exact point, data scientists are getting increasingly clever at figuring ways to do exactly that.”