Morgan Stanley is shaping the future of financial services
What happened
Morgan Stanley is integrating AI evaluations ("evals") into its financial services workflows, according to the OpenAI Blog. The company uses large language model-based evals to assess and improve the quality of AI-generated responses in areas such as client advisory and internal knowledge retrieval. Testing AI outputs systematically against predefined criteria reduces reliance on manual review.
For developers building AI workflows, this signals a shift toward structured evaluation frameworks as a core component of production-grade systems. The practical takeaway: adding automated evals early in development helps catch errors and align outputs with domain-specific requirements, especially in regulated industries like finance, where accuracy is paramount.
Key takeaways
- Morgan Stanley is employing AI evals to assess LLM outputs in financial services contexts.
- The evaluations focus on response accuracy, relevance, and adherence to regulatory guidelines.
- Automated evals reduce the need for manual quality assurance in AI-powered client interactions.
- This approach demonstrates a template for integrating robust evaluation into production AI workflows.
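To make the idea of testing outputs against predefined criteria concrete, here is a minimal sketch of an eval harness. It is an illustration only, not Morgan Stanley's or OpenAI's implementation: the names `EvalCase`, `CHECKS`, `run_evals`, and `pass_rate` are hypothetical, the sample responses are invented, and the two checks are simple keyword rules standing in for the LLM-based graders the source describes.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class EvalCase:
    """One AI response to grade (all fields are illustrative assumptions)."""
    prompt: str
    response: str
    required_terms: list[str]   # facts the answer is expected to mention
    banned_phrases: list[str]   # wording a compliance policy would disallow


def check_required_facts(case: EvalCase) -> bool:
    # Crude accuracy proxy: every required term appears in the response.
    text = case.response.lower()
    return all(term.lower() in text for term in case.required_terms)


def check_compliance(case: EvalCase) -> bool:
    # Passes only if no banned phrase appears in the response.
    text = case.response.lower()
    return not any(phrase.lower() in text for phrase in case.banned_phrases)


# Registry of criteria; in an LLM-based setup each entry would instead
# call a grader model with a rubric.
CHECKS: dict[str, Callable[[EvalCase], bool]] = {
    "required_facts": check_required_facts,
    "compliance": check_compliance,
}


def run_evals(cases: list[EvalCase]) -> list[dict[str, bool]]:
    # Apply every registered check to every case.
    return [{name: check(case) for name, check in CHECKS.items()} for case in cases]


def pass_rate(results: list[dict[str, bool]], name: str) -> float:
    # Fraction of cases that passed the named check.
    return sum(r[name] for r in results) / len(results)


cases = [
    EvalCase(
        prompt="How do Treasury bonds pay investors?",
        response="Treasury bonds are backed by the U.S. government and pay interest every six months.",
        required_terms=["U.S. government", "interest"],
        banned_phrases=["guaranteed returns"],
    ),
    EvalCase(
        prompt="Is this fund safe?",
        response="This fund offers guaranteed returns with no risk.",
        required_terms=["risk"],
        banned_phrases=["guaranteed returns"],
    ),
]

results = run_evals(cases)
for name in CHECKS:
    print(name, pass_rate(results, name))
```

In this sketch the second response passes the keyword check but fails the compliance check, which is the kind of error an automated eval can flag before a human reviewer sees it. Keeping the criteria in a registry means a new check can be added without touching the runner.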
Why it matters
For AI workflow builders, Morgan Stanley's adoption of systematic evaluations shows how to make AI systems more reliable and audit-friendly, a model applicable to any high-stakes domain.
This is an original editorial digest by AI Workflow Pro. Full reporting at the source:
Read the original on OpenAI Blog