CPPA: problems and criticisms – anonymization and pseudonymization of personal information
This article is part of our Bill C-27 Business Insights Series: Navigating Canada’s Evolving Privacy Regime, written by McCarthy Tétrault’s multidisciplinary Cyber/Data team. This series brings you practical and integrative perspectives on Canada’s Bill C-27: Digital Charter Implementation Act, 2022 and how your organization can stay ahead of the curve.
View other blog posts in the series here.
Canada is planning to revamp its comprehensive privacy law by repealing the existing law, PIPEDA, and enacting Bill C-27, the Digital Charter Implementation Act, 2022 (“DIA”), which would enact the Consumer Privacy Protection Act (CPPA), the Personal Information and Data Protection Tribunal Act (PIDTA), and the Artificial Intelligence and Data Act (AIDA). Bill C-27 replaced Bill C-11 (which contained the former drafts of the CPPA and PIDTA). While the DIA attempts to address some of the criticisms of Bill C-11, many of the problems remain and new problems have emerged. This blog series will address some of the more important problems with the DIA, including issues in the CPPA, PIDTA and AIDA. Prior posts focused on Bill C-27’s preamble and overview and how the bill fails to meet its stated purposes; the problems with the “appropriate purposes” override section; and the service provider provisions. This post focuses on the amendments that deal with anonymization and pseudonymization of personal information.
After Bill C-11 passed first reading, I wrote an extensive post highlighting flaws in how the Bill addressed de-identification. See CPPA: identifying the inscrutable meaning and policy behind the de-identifying provisions. Bill C-27 corrects some of the drafting problems with Bill C-11. However, problems remain with these provisions.
Background and Overview
My prior blog post provided an extensive analysis of the current law and the changes proposed in Bill C-11. The following is a summary of that analysis. For those interested in further details, I refer you to that post.
The term “de-identified” is often used as a general term for privacy and security processes that render personal information either “anonymized” or “pseudonymized”. As I explained before:
“Pseudonymization” commonly refers to a de-identification method that removes or replaces direct identifiers from a data set leaving in place data that could be used to indirectly identify a person. This data is generally still subject to privacy laws. “Anonymization”, by contrast, generally refers to a stronger form of de-identification which (depending on the formulation) makes re-identification impossible, reasonably unlikely, or not reasonably expected. This data is generally considered to be outside of general privacy law obligations.
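The distinction can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the field names, the sample records, the choice of a keyed hash as the replacement technique, and the helper names `pseudonymize` and `link` are hypothetical and are not drawn from the Bill or from any regulator's guidance. The sketch removes the direct identifiers from a record, then shows that the remaining indirect identifiers can still be matched against an outside source, which is why pseudonymized data generally remains subject to privacy laws.

```python
import hashlib
import hmac

# Hypothetical choice of which fields count as direct identifiers.
DIRECT_IDENTIFIERS = {"name", "email"}


def pseudonymize(record, secret_key):
    """Replace direct identifiers with a keyed token; keep all other fields."""
    source = "|".join(str(record[field]) for field in sorted(DIRECT_IDENTIFIERS))
    token = hmac.new(secret_key, source.encode("utf-8"), hashlib.sha256).hexdigest()[:16]
    output = {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}
    output["pseudonym"] = token
    return output


def link(pseudonymized, outside_records, keys):
    """Return names from an outside source whose indirect identifiers match."""
    return [
        person["name"]
        for person in outside_records
        if all(person[k] == pseudonymized[k] for k in keys)
    ]


# Invented sample data.
record = {
    "name": "Alex Tremblay",
    "email": "alex@example.com",
    "postal_code": "K1A 0B1",
    "birth_year": 1980,
    "diagnosis": "asthma",
}
outside_source = [
    {"name": "Alex Tremblay", "postal_code": "K1A 0B1", "birth_year": 1980},
    {"name": "Sam Roy", "postal_code": "M5H 2N2", "birth_year": 1975},
]

pseudo = pseudonymize(record, b"demo-key")
matches = link(pseudo, outside_source, ["postal_code", "birth_year"])
print(pseudo)
print(matches)
```

In this toy example the name and email are gone from the pseudonymized record, yet the postal code and birth year alone are enough to single out one person in the outside source. Anonymization, in the sense described above, would require further steps (for example, generalizing or suppressing those fields) so that such a match is no longer reasonably expected.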
Accepted meaning of anonymize
Under PIPEDA, personal information that has been anonymized is not subject to that privacy law. However, to be anonymized it is not necessary that the information be irreversibly and permanently modified. The Federal Privacy Commissioner has historically advanced the “serious possibility” of identification or re-identification test to determine when personal information becomes anonymous.
The weight of federal and provincial authority, however, has adopted the more practical “reasonable expectations” or “reasonably foreseeable” test. Under this formulation, information is still about an “identifiable” individual “if it is reasonable to expect that an individual can be identified from the information in issue including when combined with information from sources otherwise available”. If it is not, the compliance obligations under PIPEDA do not apply.
It is noteworthy that this reasonable expectation test is not only the test adopted in interpreting PIPEDA; it is also the leading test applied provincially, including under provincial health privacy legislation such as Saskatchewan’s Health Information Protection Act (HIPA), the Ontario IPC's De-identification Guidelines for Structured Data, and Ontario's Personal Health Information Protection Act.
A “reasonably foreseeable” test was adopted in Quebec’s Bill 64 which states (at s. 28):
“For the purposes of this Act, information concerning a natural person is anonymized if it is, at all times, reasonably foreseeable in the circumstances that it irreversibly no longer allows the person to be identified directly or indirectly.”
The GDPR adopts a “reasonably likely” test. Data can be considered “anonymized” from a data protection perspective when data subjects are not identified or identifiable, having regard to all methods reasonably likely to be used by the data controller or any other person to identify the data subject, directly or indirectly.
The California Consumer Privacy Act of 2018, as amended (the CCPA), defines personal information as “information that identifies, relates to, describes, is capable of being associated with, or could reasonably be linked, directly or indirectly, with a particular consumer or household”.
Bill C-11 anonymization
Bill C-11 introduced the following definition of the term de-identify:
de-identify means to modify personal information — or create information from personal information — by using technical processes to ensure that the information does not identify an individual or could not be used in reasonably foreseeable circumstances, alone or in combination with other information, to identify an individual. (dépersonnaliser)
Bill C-11’s definition of “de-identify”, read on its own, appeared to generally confirm the existing interpretations of PIPEDA and to generally align with the provisions provincially and under the GDPR and CCPA. It adopted, however, a “reasonably foreseeable” test rather than the “reasonable to expect” or “serious possibility” of re-identification tests.
However, it was apparent in reviewing the rest of the provisions of Bill C-11 that de-identified information included personal information that was either pseudonymized or anonymized, and that both were still subject to all of the provisions of the CPPA. Thus, Bill C-11’s treatment of de-identified information was inconsistent with PIPEDA, with provincial privacy laws, with the laws of our major trading partners, and with the legitimate needs of organizations to use anonymized information.
Bill C-27 proposed amendments
Bill C-27 made substantial changes to the proposed treatment of de-identified data by attempting to delineate between de-identified information (which remains subject to the CPPA) and anonymized information (which would expressly no longer be subject to the CPPA).
The term “de-identify” was modified as follows:
de-identify means to modify personal information so that an individual cannot be directly identified from it, though a risk of the individual being identified remains. (dépersonnaliser)
A new term “anonymize” was introduced and defined as follows:
anonymize means to irreversibly and permanently modify personal information, in accordance with generally accepted best practices, to ensure that no individual can be identified from the information, whether directly or indirectly, by any means. (anonymiser)
Section 6(5) was also added to state that “For greater certainty, this Act does not apply in respect of personal information that has been anonymized.”
Problems with the definition of anonymize
The definition of anonymize sets a very high and impractical standard. The Canadian Anonymization Network (CANON) described it aptly, stating that it “sets an extremely high and practically unworkable threshold”.
First, the proposed definition eschews the generally accepted “reasonable to expect” or “reasonably foreseeable” standard or the GDPR's “reasonably likely” standard and instead applies a strict liability standard of ensuring that no individual can be identified from the information. Of course, as the UK ICO stated in its code of practice on anonymization, “It can be impossible to assess re-identification risk with absolute certainty” and “It is worth stressing that the risk of re-identification through data linkage is essentially unpredictable because it can never be assessed with certainty what data is already available or what data may be released in the future.” The Irish Data Protection Authority takes the same view stating:
Therefore, to determine when data are rendered anonymous for data protection purposes, you have to examine what means and available datasets might be used to re-identify a data subject. Organisations don’t have to be able to prove that it is impossible for any data subject to be identified in order for an anonymisation technique to be considered successful. Rather, if it can be shown that it is unlikely that a data subject will be identified given the circumstances of the individual case and the state of technology, the data can be considered anonymous.
That is why the other generally accepted formulations of “reasonable to expect” or “reasonably foreseeable” or “reasonably likely” better reflect the realities of data anonymization practices.
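The practical difference between an absolute standard and a risk-based one can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the risk measure used here (one divided by the size of the smallest group of records sharing the same indirect identifiers, a k-anonymity style metric), the sample records, the threshold values, and the function names are all hypothetical. Neither Bill C-27 nor the regulators quoted above prescribe this metric or any particular threshold.

```python
from collections import Counter


def max_reidentification_risk(records, quasi_identifiers):
    """Worst-case chance of singling out a record: 1 / size of the smallest group."""
    groups = Counter(tuple(r[q] for q in quasi_identifiers) for r in records)
    return 1 / min(groups.values())


def meets_threshold(records, quasi_identifiers, threshold):
    """True if the worst-case risk is at or below the chosen threshold."""
    return max_reidentification_risk(records, quasi_identifiers) <= threshold


# Invented sample data: ages and postal codes already generalized into bands.
records = [
    {"age_band": "30-39", "region": "K1A"},
    {"age_band": "30-39", "region": "K1A"},
    {"age_band": "30-39", "region": "K1A"},
    {"age_band": "40-49", "region": "M5H"},
    {"age_band": "40-49", "region": "M5H"},
]
fields = ["age_band", "region"]

risk = max_reidentification_risk(records, fields)
risk_based_ok = meets_threshold(records, fields, threshold=0.5)  # hypothetical tolerance
absolute_ok = meets_threshold(records, fields, threshold=0.0)    # "no individual ... by any means"
print(risk, risk_based_ok, absolute_ok)
```

Under this simple measure the risk for any non-empty dataset is always greater than zero, so a zero-risk requirement can never be met, whereas a stated tolerance can be assessed, documented, and revisited as circumstances change.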
Second, the methodology that may be used to anonymize personal information is held to a very high prescribed standard. It must be “in accordance with generally accepted best practices”. While Quebec’s Bill 64 also requires information to be anonymized “according to generally accepted best practices and according to the criteria and terms determined by regulation”, this high standard is not consistent with other provincial privacy laws. Nor is it consistent with the more flexible GDPR standard. The Irish Data Protection Authority interpreted the GDPR standard as follows:
However, the duty of organizations is to make all reasonable attempts to limit the risk that a person will be identified. In assessing what level of anonymisation is necessary in a particular case, you should consider all methods reasonably likely to be used by someone (either an “intruder” or an “insider”) to identify an individual data subject given the current state of technology and the information available to such a person at present. An approach to anonymisation which affords a reasonable level of protection today may likely prevent identification into the future, but this will have to be monitored and assessed over time.
The standard proposed in Bill C-27 is inconsistent with other accepted standards. Moreover, it removes the flexibility of Canadian organizations to innovate in data anonymization techniques. For example, Canadian organizations would be prevented from using novel and even stronger anonymization techniques merely because they have not yet become “generally accepted best practices”. Conversely, the standard would require Canadian-based organizations, including small and medium-sized organizations, to adopt only “best practices” regardless of cost and commercial practicality, which could, in some instances, inhibit innovation, including in the fast-moving and highly competitive AI space.
The new standards for anonymization proposed in the CPPA run counter to the stated objectives for the law set out in the CPPA’s preamble. They are not interoperable with standards provincially or internationally, and are calibrated at such a high level that they could impede innovation in Canada. While the high standards might theoretically be achievable by very large multinational firms with significant capital, they would not likely be, or always be, commercially reasonable for smaller innovative firms. The language, as noted above, is not qualified by any measure of commercial reasonability. It also does not balance the costs of achieving best practices against the incremental re-identification risk that arises when other anonymization methodologies are employed.
This is yet another example of a major problem with the CPPA. That is, the regulatory structure is designed as a best-in-class privacy law for those firms that can afford the very steep compliance costs, while eschewing a more balanced approach that takes into account the ecosystem of organizations that need to make reasonable commercial efforts both to protect privacy and to use information to innovate and grow their businesses and to compete in the global marketplace on par with their international competitors.
Problems with the definition of de-identify
When the government changed the definition of de-identify and added the new definition of anonymize it did not make corresponding changes to other provisions in the CPPA to make it work as intended.
For example, the definition of de-identify is broadly defined and would include anonymized information. But, as noted above, information that has been anonymized is no longer subject to the law. This suggests that the government intends that the term de-identify does not include information that has been anonymized. Several sections would suggest this, such as, for example:
- s.1(3), which clarifies that personal information that has been de-identified is considered to be personal information and which, helpfully, lists the sections of the CPPA to which this rule would not apply.
- The CPPA would continue to have special exceptions where de-identified information can be used without obtaining consent, such as for research, analysis and development (s. 21), prospective business transactions (s. 22), and socially beneficial purposes (s. 39).
- An organization that de-identifies personal information must ensure that any technical and administrative measures applied to the information are proportionate to the purpose for which the information is de-identified and the sensitivity of the personal information (s. 74). This provision may not apply to anonymization, and may not need to if the proper standards are in that definition. But the government’s intentions on this are not clear.
However, the CPPA has not modified other sections to expressly include anonymized information along with de-identified information in provisions one may suspect were intended to include both types of information. Examples are:
- Consent is not required to de-identify information, but this does not expressly extend to anonymized information (s. 20).
- An organization must not use information that has been de-identified, alone or in combination with other information, to identify an individual (s. 75, s. 116). One would expect that, since anonymized information can still theoretically be used to re-identify an individual, this prohibition would also apply to anonymized information. Prohibiting the re-identification of anonymized information would serve to strengthen the overall protection for anonymized data. It would also shift the risks associated with re-identification to the culpable party and, as a matter of policy, would help justify adopting a more flexible approach to the standards necessary to achieve anonymization under the CPPA.
Based on the foregoing, I recommend the following changes to Bill C-27.
Recommend: Amend the definition of “anonymize” to read: “anonymize means to modify personal information so that there is no reasonably foreseeable risk in the circumstances that an individual can be identified from the information, whether directly or indirectly, by any means.”
Recommend: Clarify that information that has been anonymized is excluded from “de-identified” personal information. But also clarify that ss. 20, 75 and 116 apply to both de-identified and anonymized information.
I also generally agree with the recommendations of the Canadian Anonymization Network (CANON) for changes to other provisions of the CPPA.
This article was first posted on www.barrysookman.com.