Can We Trust Artificial Intelligence?
by Adrian Bowles, PhD
Arthur C. Clarke’s third law, “Any sufficiently advanced technology is indistinguishable from magic” (1973), captures the essence of our current AI-trust dilemma. We are quite capable of constructing deep learning solutions that perform admirably, but whose behavior cannot be explained well enough to satisfy regulators or consumers. This explainability problem is one barrier to trust.
Another barrier is the lack of rigor or standards for developing AI-powered solutions. There is nothing in the AI world comparable to the Software Engineering Institute’s Capability Maturity Models (measuring the quality of the development process) or the USDA’s Food Safety and Inspection Service processes. Trust is generally based on the fragile reputation of the supplier.
Can a Voluntary Reporting Approach Help?
IBM researchers recently proposed the development of a standard Supplier’s Declaration of Conformity (SDoC) or factsheet for AI services to address this trust issue.
The SDoC would be a document similar to a food nutrition label or information sheet for an appliance. It would provide facts about important attributes of the service, including information about the processes used to design and test it. As the team currently envisions the SDoC, it would provide data about product-level—not component-level—functional testing.
The sample checklist of process-oriented questions is aligned with IBM’s four pillars for trusted AI systems:
- Fairness: Training data and models should be free of bias.
- Robustness: AI systems should not be vulnerable to tampering, and training data must be protected from compromise.
- Explainability: AI systems should provide output that can be understood by their users and developers.
- Lineage: “AI systems should include details of their development, deployment, and maintenance so they can be audited throughout their lifecycle.”
Was the dataset checked for bias?
If “yes,” describe the bias policies that were checked, the bias checking methods, and the results.
Was any bias mitigation performed on the dataset?
If “yes,” describe the mitigation method.
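To make the shape of such a checklist concrete, here is a minimal sketch of how a yes/no question with a conditional follow-up might be recorded and checked for completeness. It is an illustration only, not a format defined by IBM: the names `ChecklistItem`, `follow_up`, `details`, and `incomplete_items` are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ChecklistItem:
    """One yes/no factsheet question plus its follow-up prompt (hypothetical structure)."""
    question: str
    follow_up: str                 # what the supplier should describe if the answer is "yes"
    answer: Optional[bool] = None  # None means the question has not been answered yet
    details: str = ""              # the supplier's description, required for a "yes"

    def is_complete(self) -> bool:
        # An unanswered item is incomplete; a "yes" also needs a description.
        if self.answer is None:
            return False
        if self.answer:
            return bool(self.details.strip())
        return True


def incomplete_items(items: List[ChecklistItem]) -> List[str]:
    """Return the questions that still need an answer or a description."""
    return [item.question for item in items if not item.is_complete()]


# A single entry taken from the sample questions quoted above.
fairness_items = [
    ChecklistItem(
        question="Was any bias mitigation performed on the dataset?",
        follow_up="Describe the mitigation method.",
    ),
]
```

A buyer-side tool could call `incomplete_items(fairness_items)` to list questions a supplier has left unanswered, or answered “yes” without describing what was done.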
IBM is proposing that the SDoC be voluntary, but with recent well-publicized examples of bias in facial recognition systems (e.g., falsely matching 28 members of Congress with mugshots from a public database), something like this is likely to become mandatory soon.
Mathematically sound techniques to make programs provably correct have existed for decades, but they are cumbersome and generally impractical for large, complex, enterprise applications. That doesn’t stop us from building these applications. Deep learning and other opaque modern AI techniques will continue to be developed and disseminated in products based on correlation data; for most applications, that’s a good thing. The pace of advancements in this area is much faster than the ability of most organizations—private and public—to objectively evaluate the processes and identify potential edge cases that may fail spectacularly.
Efforts to develop practical solutions for quality assurance and machine learning explainability should continue, especially for mission-critical or life-critical systems. For most application buyers, however, insisting on SDoCs as part of the purchasing criteria looks like a good way to improve buyer confidence and help vendors earn buyers’ trust.