Each week we find a new topic for our readers to learn about in our AI Education column.
Business and information have always gone hand-in-hand—but with AI, it’s complicated.
The kind of information businesses are gathering, and the things that businesses and other entities want to do with that data, require a new way of thinking about what that information means, who it should belong to, and how it should be handled.
Welcome to AI Education, where our discussion today will focus on data ethics, the idea that there are obligations that come with gathering, storing and using information, especially personal information.
As some of you know, part of my background is in health care, where, since the 1990s, health providers have struggled to comply with HIPAA and other state and federal regulations around the gathering, storage and use of personally identifiable health information. When we worked in that space, we were still living in a universe of open intensive care units designed as panopticons and shared or semi-private acute care rooms. Making sure the family on the other side of the flimsy curtain knew nothing about your hemorrhoids, let alone things like your address, insurance and phone number, was not a realistic ask. But we had to try. Filters were put on computer screens, conversations were moved to more private areas, and over the decades, more private facilities have been constructed to make HIPAA compliance much easier.
Similarly, due to both regulation and industry-accepted best practices, financial institutions started to build environments with data ethics in mind even before AI implementation became a concern. Now AI has accelerated the gathering, storage and use of data in the financial services sector, placing additional importance on data ethics.
Harvard’s 5 Principles of Data Ethics
In a March 2021 article, Harvard Business School Online’s Catherine Cote offers up five guiding principles businesses can apply:
Ownership—An individual has ownership over their personal information. Don’t collect data without consent.
Transparency—Data subjects have the right to know how you plan to collect, store and use their information. Do not withhold information about methods or intentions regarding data gathering and storage.
Privacy—Consent to collect, store and use information does not imply consent to make that information publicly available. It is the organization’s responsibility to ensure that data is kept safe, secure and private.
Intention—Even if data is collected and handled in an ethical manner, it is not ethical to collect data for malicious purposes. Only gather, store and handle data for which there is a need or purpose.
Outcomes—Data analysis can cause inadvertent harm to individuals or groups of people, even if intentions are good.
The 5 C’s
Another somewhat older framework for thinking about data ethics has been circulating around the web—we’re not entirely clear on whose idea it was in the first place, so we’re not going to attribute it to any single source, though iterations of the 5 C’s can be found for reference on free, public-facing sites like O’Reilly Media and Atlan.
Consent
An individual must give informed consent before an entity gathers, processes or uses their personal data in any way. Informed consent implies the transparency of Harvard’s framework—organizations must be clear about how and why information is being collected, processed, stored and used.
Collection
Organizations should only collect the data that is necessary and must avoid collecting irrelevant or unnecessary data. This is one we see violated all the time online and in apps. Data collection should be safe and secure, and organizations should be transparent about their data collection processes, disclosing what data is being collected, how it is protected, and what it will be used for.
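To make the consent and collection ideas concrete, here is a minimal Python sketch of how a firm might keep only the fields a stated purpose needs, and only when consent for that purpose is on record. The purpose names, field names and the Consent class are hypothetical, invented for illustration; they do not come from any of the frameworks cited here.

```python
from dataclasses import dataclass, field

# Hypothetical purposes, each mapped to the only fields it actually needs.
REQUIRED_FIELDS = {
    "account_opening": {"name", "address", "date_of_birth"},
    "newsletter": {"email"},
}


@dataclass
class Consent:
    """Records which purposes a person has agreed to (illustrative only)."""
    subject_id: str
    purposes: set = field(default_factory=set)


def collect(raw: dict, purpose: str, consent: Consent) -> dict:
    """Keep only the fields needed for a purpose the person consented to."""
    if purpose not in consent.purposes:
        raise PermissionError(f"no consent recorded for purpose: {purpose}")
    allowed = REQUIRED_FIELDS[purpose]
    # Anything not needed for this purpose is dropped rather than stored.
    return {key: value for key, value in raw.items() if key in allowed}


# Example: a sign-up form submits more than the newsletter needs.
consent = Consent("client-001", {"newsletter"})
submitted = {"email": "a@example.com", "phone": "555-0100", "address": "1 Main St"}
stored = collect(submitted, "newsletter", consent)
```

In this sketch the phone number and address never reach storage, and a request to use the same form data for account opening would be refused until the person agrees to that purpose.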
Control
Individuals have the right to control their own data. This includes the ability to access and review their data and personally update their data, as well as the right to know who has accessed their data, who is able to access their data, and how it has been used. Individuals have a right to see and control whether their data is being used in ways that violate their personal principles—for example, a pacifist should be able to prevent their personal data from being used for military purposes.
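One way to picture the control principle is a record that logs every read and lets the individual block uses they object to. The Python sketch below is an illustration under assumed names (PersonalRecord, blocked_purposes and so on are hypothetical), not a description of any particular product or regulation.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class AccessEvent:
    accessed_by: str
    purpose: str
    at: datetime


@dataclass
class PersonalRecord:
    """Illustrative record that the data subject can review and control."""
    subject_id: str
    data: dict
    blocked_purposes: set = field(default_factory=set)
    access_log: list = field(default_factory=list)

    def read(self, accessed_by: str, purpose: str) -> dict:
        """Return a copy of the data, unless the subject blocked this purpose."""
        if purpose in self.blocked_purposes:
            raise PermissionError(f"subject has blocked use for: {purpose}")
        # Every successful read is logged so the subject can see it later.
        self.access_log.append(
            AccessEvent(accessed_by, purpose, datetime.now(timezone.utc))
        )
        return dict(self.data)

    def update(self, key: str, value) -> None:
        """Let the subject correct their own data."""
        self.data[key] = value

    def who_accessed(self) -> list:
        """Summarize who read the data and why."""
        return [(event.accessed_by, event.purpose) for event in self.access_log]


record = PersonalRecord("client-001", {"email": "a@example.com"})
record.blocked_purposes.add("marketing")
record.read("advisor-7", "annual_review")
```

Calling record.who_accessed() afterward would list the advisor's read along with its stated purpose, while any attempt to read the record for marketing would be refused.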
Confidentiality
Organizations, not the individual, bear the primary responsibility for making sure the data they collect is safe from not just breaches or leaks, but from any unauthorized or unnecessary access whatsoever. Data should be stored and transmitted via secure methods, and controls must be in place to prevent unauthorized access.
Compliance
Finally, organizations must also meet any local, state, national and transnational rules and regulations regarding data privacy and security. Legal compliance is to be considered in addition to ethical principles, industry standards and best practices, not as a baseline expectation in and of itself.
Why Is Data Ethics Important?
Data ethics intersects with the financial services sector in a few places. One is the ever-present risk of cybersecurity breaches. A firm grounding in data ethics provides guidance that can not only help minimize data incidents, but also guide behaviors and best practices in the wake of a breach.
As we use technology to make more decisions, however, it becomes critical that we’re able to understand and take responsibility for the data we’re using to guide our technology, because the stakes are high. Biased data can lead to issues like skewed college admissions or hiring practices where certain groups are given undue weight. Don’t think it can happen? Recall Amazon’s issues when an algorithm used for internal hiring decisions was accused of gender bias, or facial recognition software’s struggles to identify people with darker skin tones. Or Facebook’s Cambridge Analytica scandal. Or Uber’s vulnerability that allowed anyone, bad actors included, to track users’ real-time location.
In an industry responsible for keeping everyone’s money safe, the potential for things to go bad with data collection and storage cannot be minimized, especially as the volume of data collected and stored mushrooms in the AI era. It’s like a gigantic bomb waiting to go off in the financial services industry’s face.
And data ethics is the tool companies can use to defuse that bomb.






