Some observers have touted the promise of artificial intelligence (AI) in improving the quality of bank supervision. Recent staffing cutbacks at bank regulatory agencies have given this issue more immediacy. Can robo-examiners come to the rescue? More generally, what are some of the more promising applications of AI in bank supervision? And what are the limitations?
Let’s start with a disclaimer. I know a lot more about bank supervision than AI. Some AI applications may be further along than I imagined, while others may be less ready for prime time. Please share your own knowledge and perspectives in the comments.
Federal banking agencies made substantial staffing cutbacks this year. They provided little in the way of explanation or justification for these changes but did reference “leveraging technology.” That could include AI. But trying to enhance AI capabilities after staffing cutbacks may be putting the cart before the horse. As we’ll discuss later in the post, it’s difficult to build an expert system without … experts.
An Experiment
One way to gauge what AI might and might not do when it comes to bank supervision is to see how a chatbot responds to a set of facts. As linked here, I gave ChatGPT a hypothetical bank with some background financial information and a set of findings. The bank isn’t named, but the facts correspond to Silicon Valley Bank and its liquidity target exam of August 2021. The advantage here is that we know what eventually happened, and the supervisory letters for SVB are public information. ChatGPT then generated a supervisory letter (SL) to the bank’s management based on this information. For comparison purposes, here is the Fed’s actual SL.
The resulting SL is okay. It captures the key findings and is more concise than the actual SL. It includes three matters requiring attention (MRA) and two matters requiring immediate attention (MRIA). The actual SL contained four MRAs and two MRIAs. Both letters covered similar ground, but the MRAs and MRIAs map differently. The MRIAs for the AI-generated SL show sufficient immediacy, requiring changes to the bank’s contingency funding plan (CFP) and internal liquidity stress testing (ILST) within 60 and 90 days, respectively.
Effective business writing should be concise, but not necessarily brief. The AI-generated SL veers toward brief. It could use a bit more narrative to flesh out the concerns, their consequences, and why the deficiencies need fixing. It communicates specific findings fairly well and makes some first-order inferences from them. For example, the actual supervisory letter noted that the CFP relied on unrealistic assumptions. The query described the individual weaknesses without using the word “unrealistic.” However, MRIA #1 characterized these as “unrealistic assumptions.”
This was less the case with more complex inferences. The actual SL included an MRIA related to independent risk and challenge. The AI query didn’t specifically cite weaknesses in those areas. But it did indicate the bank had a deficient CFP and ILST, an inadequate liquidity limits framework, and modeling weaknesses. Given that set of facts, a good examiner would naturally ask: where were Independent Risk and Internal Audit? The AI SL didn’t make this inference. Nor did it conclude that an aggressively growing bank with a large concentration of uninsured deposits and deficient risk management practices might be engaging in unsafe or unsound practices. Bank supervisors don’t use the terms “unsafe or unsound” lightly, though it might fit in this case.
Some of the critiques of the Fed’s supervision of SVB might also apply to the AI-generated letter. And the AI version has the benefit of hindsight. The AI letter focuses largely on process weaknesses rather than specific financial risks. The letter includes an MRA on deposit concentration risk. However, that MRA requires the bank to improve its liquidity and stress testing framework rather than, say, reduce its concentration risk. This approach aligns with current practice. SVB’s supervisors may have hoped that more accurate ILSTs and CFPs and a more meaningful limit framework would shock SVB’s management and the board into bringing down the bank’s liquidity risk. That didn’t work out, but current MRA guidelines also focus more on risk management than risk itself.
ChatGPT adds an appendix that maps the findings to various supervisory guidance. I would advise extreme caution in sharing this appendix with the bank. Current regulations allow examiners to reference guidance, even in writing, but they cannot cite “violations” of guidance as a basis for an MRA. Banks might challenge the legal basis for the supervisory actions. At best, inclusion of a guidance mapping might be akin to the current practice on supervisory recommendations. At OCC, we couldn’t include recommendations in our SLs but could share them with bank management on a more informal basis. The recommendations often contained so many disclaimers as to make them almost meaningless. A mapping of findings to guidance makes more sense in internal documents, like Conclusion Memos (CMs).
Implementation Issues
The experiment involved an off-the-shelf application that relied entirely on public information. A more bespoke application could also make use of thousands of internal documents, including SLs, CMs, and examiner workpapers. Much of an examiner’s work involves gathering vast amounts of information and then distilling that information into a more usable form. AI does the same thing.
Bank supervisors already make use of low-tech work aids, such as templates for SLs and CMs. If presented with a set of findings, AI can generate a decent first draft of an SL. The draft may need further editing, especially since supervisors usually try to tailor the tone of the letter to the situation. The tone will depend on the bank and specific exam findings. Examiners can become over-reliant on these work aids and never develop their writing skills or even the ability to edit the work of others.
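To make the work-aid idea concrete, here is a minimal sketch of how structured exam findings might be assembled into a drafting prompt for a chat model. Everything here is hypothetical: the profile fields, findings, helper name, and model are illustrative, and the commented-out call assumes an OpenAI-style client rather than describing any agency’s actual tooling.

```python
# Hypothetical sketch: turning structured exam findings into a prompt for a
# first-draft supervisory letter (SL). All names and values are illustrative.

def build_sl_prompt(bank_profile: dict, findings: list[dict]) -> str:
    """Assemble a drafting prompt from background facts and exam findings."""
    lines = ["Draft a supervisory letter to bank management.", "Background:"]
    for key, value in bank_profile.items():
        lines.append(f"- {key}: {value}")
    lines.append("Findings (MRIA = immediate attention, MRA = requires attention):")
    for f in findings:
        lines.append(
            f"- [{f['severity']}] {f['summary']} "
            f"(remediate within {f['deadline_days']} days)"
        )
    lines.append("Tone: factual and direct; explain the consequences of each deficiency.")
    return "\n".join(lines)

profile = {"total assets": "$200B", "uninsured deposits": "high concentration"}
findings = [
    {"severity": "MRIA",
     "summary": "Contingency funding plan relies on unrealistic assumptions",
     "deadline_days": 60},
    {"severity": "MRA",
     "summary": "Liquidity limits framework lacks board-approved risk tolerances",
     "deadline_days": 90},
]
prompt = build_sl_prompt(profile, findings)

# An examiner's tool might then send `prompt` to a chat model, e.g. (assuming
# the OpenAI Python SDK and a configured API key):
# from openai import OpenAI
# draft = OpenAI().chat.completions.create(
#     model="gpt-4o", messages=[{"role": "user", "content": prompt}]
# ).choices[0].message.content
```

The draft that comes back would still need the human editing pass described above, particularly on tone.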
Bank supervisors can also access many bank source documents. These may include bank MIS, policies, procedures, and meeting materials. Examiners receive this material as part of their ongoing supervision and as part of target examinations. While the actual writing of a supervisory letter usually takes only a few days, going through and analyzing bank documents can be much more labor-intensive. AI may hold some promise here. Maybe. It’s one thing to gather and summarize the documents and quite another to identify and prioritize concerns arising from that review. For example, an AI review could go through model development and validation documents and identify gaps with best practices. But it would also need to determine which items are most likely to affect the quality of the model’s output. Currently, that involves a discussion between examiners and the regulator’s modeling experts that can’t be easily automated.
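The gap between summarizing and prioritizing can be seen even in a toy version of the document review. The sketch below checks a validation document against a best-practice checklist and ranks the gaps by an impact weight; the checklist items and weights are invented for illustration, and in practice those weights are exactly the judgment call that requires examiners and modeling experts.

```python
# Hypothetical sketch: a first-pass gap check of a model validation document
# against a best-practice checklist. Items and impact weights are illustrative.

CHECKLIST = {
    "sensitivity analysis": 3,       # weight 3 = most likely to affect model output
    "backtesting": 3,
    "assumption documentation": 2,
    "validation sign-off": 2,
    "change control log": 1,
}

def find_gaps(document_text: str) -> list[tuple[str, int]]:
    """Return checklist items missing from the document, highest impact first."""
    text = document_text.lower()
    gaps = [(item, weight) for item, weight in CHECKLIST.items() if item not in text]
    return sorted(gaps, key=lambda g: -g[1])

doc = "The model includes backtesting results and a change control log."
gaps = find_gaps(doc)
```

Finding the gaps is the easy part; deciding that a missing sensitivity analysis matters more than a missing sign-off is the part that still needs the humans.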
The Army You Have
Donald Rumsfeld famously said, “You go to war with the army you have, not the army you might want or wish to have at a later time.” The same goes for attempts to implement AI. Regulators aren’t especially adept at adopting and implementing new technologies. Information systems at the OCC were clunky and far from cutting edge. Discussions with former colleagues indicated that the FDIC’s information systems were even less advanced. Concerns over the confidentiality of supervisory information can also complicate the use of outside vendors.
Developers train large language models (LLMs) like ChatGPT on vast amounts of data, which can include regulations, policies, and internal guidance. But a model’s efficacy depends on the quality of the training data. Examiner handbooks provide an obvious information source. However, sections of these handbooks can go decades without major updates. Regulators can also be slow to address emerging areas. The Comptroller’s Handbook didn’t include a section on mortgage banking until 2014, a half dozen years after the originate-to-distribute model helped blow up the U.S. financial system. Examiner handbooks usually provide useful information, at least when kept up to date and with input from subject matter experts (SMEs).
The role of SMEs can vary significantly, however. Early in my career, the Federal Home Loan Bank System undertook a massive project to rewrite outdated examiner handbooks. Although I was on the project for less than two years, I drafted sections related to interest rate risk and dealer floor plan lending. While I had developed a good understanding of IRR early on, I knew little about floor plan lending. Feedback from others with more experience helped, but I would not want an examiner (much less a chatbot) to rely on my “expertise” in this area. Huge staffing cutbacks at banking agencies will inevitably create a serious brain drain. That makes the garbage in, garbage out problem even more severe.
More Promising Applications
AI might provide some useful work aids, but its application to front-line exam work is probably limited, at least in the short run. I see more potential in quality assurance and in the supervisory appeals process. QA reviews usually look at whether exam teams are following existing policies and may also identify better practices. Decisions on supervisory appeals usually compare the exam’s findings with existing policy guidance. For example, does a bank’s CAMELS rating correspond to established rating criteria? AI seems especially well suited for these types of tasks. A human should still make and take responsibility for the final decision. But AI could make that job easier.