Last year, I shared reflections from the AI in Finance Summit in an article titled The key to unlocking innovation is to put the subject matter expert back in the driver’s seat. That idea hasn’t changed. In fact, it only gained strength. This year at Data Science Salon NYC, the conversation moved from proof-of-concepts (POCs) to the real-world use of generative AI (GenAI) across different use cases. But the theme remained consistent: empowering experts with intelligent, trustworthy tools is still one of the fundamental principles of meaningful AI-driven innovation in finance.
Highlights from DSS NYC 2025
Held at the S&P Global headquarters in the heart of New York’s Financial District, and organized by Data Science Salon, DSS NYC brought together AI leaders and practitioners to share how machine learning is moving beyond experimentation and into production within banks, asset managers, and fintechs.
“DSS NYC 2025 marked a pivotal shift—from pilots to production. What stood out most is that no matter how sophisticated our models get, true innovation in financial AI still starts with empowering domain experts. When we embed trust, transparency, and usability into our systems, we’re not just scaling AI—we’re scaling human expertise.”
— Anna Anisin, Founder, Data Science Salon
Foundations for Intelligent Action
Kristi Baishya, AVP, Data and AI Product Management at Nomura, opened the day by introducing Large Action Models (LAMs). Baishya described LAMs as systems capable of executing multi-step processes: they can trigger actions based on language input while following a defined procedure. This makes them particularly well-suited to automating structured, multi-step tasks within financial institutions, where precision and auditability are essential. But because they act autonomously, LAMs depend on trusted, expert-verified training data; without it, their outputs risk being not just unusable but dangerous.
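As a concrete illustration of the pattern Baishya described, here is a minimal, hypothetical sketch of a LAM-style runner: every intent maps to a pre-approved sequence of steps, and every step is written to an audit log. All names and procedures here are invented for illustration; this is not Nomura's implementation.

```python
from dataclasses import dataclass, field

@dataclass
class ActionModel:
    """Toy Large Action Model runner: maps an intent to a fixed,
    auditable sequence of steps (illustrative sketch only)."""
    procedures: dict                               # intent -> list of step functions
    audit_log: list = field(default_factory=list)  # (intent, step, state) records

    def execute(self, intent: str, payload: dict) -> dict:
        if intent not in self.procedures:
            raise ValueError(f"no approved procedure for intent: {intent}")
        state = dict(payload)
        for step in self.procedures[intent]:
            state = step(state)                    # each step transforms the state
            self.audit_log.append((intent, step.__name__, dict(state)))
        return state

# Illustrative two-step procedure for a hypothetical limit-update task.
def validate_request(state):
    assert state["new_limit"] > 0, "limit must be positive"
    return state

def apply_update(state):
    return {**state, "status": "applied"}

lam = ActionModel(procedures={"raise_limit": [validate_request, apply_update]})
result = lam.execute("raise_limit", {"account": "A-123", "new_limit": 5000})
```

The defined-procedure dictionary is what keeps the model's autonomy bounded: language input can only select among expert-approved workflows, never invent new ones.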

This theme was echoed by Yashasvi Singh, Senior Data Analyst at Navy Federal Credit Union, who focused her presentation on the groundwork required for GenAI systems to deliver value. Singh described her team’s effort to bring order to fragmented data operations, which suffered from inconsistent pipelines, undefined taxonomies, and siloed tools, by introducing centralized data catalogs, automated quality checks, and shared governance frameworks. Together, these practices help pave the way for more advanced uses of AI.

Singh emphasized that trust in automation starts with trust in data. When the foundations are solid, tools like LAMs can move from experimental pilots to production impact.
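To make the idea of automated quality checks concrete, here is a minimal sketch of the kind of rule-based check such a pipeline might start with. The field names and rules are illustrative assumptions, not Navy Federal's actual checks.

```python
def quality_report(records, required_cols):
    """Minimal automated data-quality check: flags missing required
    fields and non-numeric amounts (column names are illustrative)."""
    issues = []
    for i, row in enumerate(records):
        for col in required_cols:
            if row.get(col) in (None, ""):
                issues.append((i, col, "missing"))
        amount = row.get("amount")
        if amount is not None and not isinstance(amount, (int, float)):
            issues.append((i, "amount", "non-numeric"))
    return issues

rows = [
    {"id": "t1", "amount": 120.0},
    {"id": "", "amount": "12o.5"},   # blank id and a mistyped amount
]
report = quality_report(rows, required_cols=["id", "amount"])
```

Checks like these are trivial individually; the value Singh described comes from running them automatically on every pipeline, against a shared catalog, so that downstream AI systems inherit data that has already been vetted.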
Data Science in Stress Testing
Akhil Khunger, VP, Quantitative Analytics at Barclays, emphasized the increasing role of machine learning (ML) in the simulation of extreme market conditions. His talk underscored how scenario modeling must balance statistical power with transparency to satisfy both innovation and regulatory scrutiny.
When I asked Khunger whether he used qualitative data as part of his modeling, he said that he indeed leveraged Natural Language Processing (NLP) to identify early signs of market distress. By analyzing financial news, analyst commentary, and even social media content, firms can detect shifts in sentiment that may lead to volatility. This application of NLP is especially valuable during financial crises or periods of market turbulence, when timely insight into public and expert sentiment becomes both a reflection and a driver of risk. Khunger added that the biggest challenge with sentiment data was ensuring the analysis was based on reputable sources of news and information. In an era of content saturation and misinformation, filtering for reliability is just as important as the algorithm behind the analysis.
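The mechanism Khunger described can be sketched in a few lines. The toy lexicon and threshold below are stand-ins: a production system would use a finance-tuned NLP model and vetted news sources, but the rolling-window shift detection follows the same idea.

```python
NEGATIVE = {"default", "downgrade", "selloff", "panic", "losses"}
POSITIVE = {"rally", "upgrade", "growth", "record", "gains"}

def sentiment_score(text: str) -> float:
    """Crude lexicon score in [-1, 1]; a stand-in for a real sentiment model."""
    words = text.lower().split()
    hits = [1 for w in words if w in POSITIVE] + [-1 for w in words if w in NEGATIVE]
    return sum(hits) / len(hits) if hits else 0.0

def detect_shift(headlines, window=3, threshold=-0.5):
    """Flag the first headline where the trailing-window mean sentiment
    drops below the threshold, signalling a possible distress shift."""
    scores = [sentiment_score(h) for h in headlines]
    for i in range(window, len(scores) + 1):
        if sum(scores[i - window:i]) / window <= threshold:
            return i - 1   # index of the headline that tripped the alert
    return None

headlines = [
    "markets rally on growth",
    "record gains for banks",
    "bank downgrade fears grow",
    "selloff deepens amid panic",
    "losses mount as selloff continues",
]
alert_at = detect_shift(headlines)
```

The windowing matters more than the scorer: a single negative headline is noise, while a sustained negative trailing mean is the kind of sentiment shift that precedes volatility.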

Khunger didn’t shy away from implementation challenges either. He discussed how legacy systems often hinder institutions from adopting AI at scale.
Modernizing Legacy Infrastructure
On the topic of legacy systems, Harry Mendell, AI Architect and AI Co-chair Innovation Roundtable at the Federal Reserve Bank of New York, told the story of how his team is tackling the problem of legacy infrastructure head-on. His example was converting on-premises ETL workflows into cloud-native PySpark data-lake pipelines. Mendell explained that the difficulty lay in parsing XML files containing task definitions, identifying the blocks that held those definitions, and refactoring them to perform similar tasks in the new environment. The challenge wasn’t purely technical: because the two platforms follow different design patterns, the teams also faced a context gap between the experts who understood the legacy system and those who knew the new one, and that gap was difficult to bridge.
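The parsing step Mendell described, locating task-definition blocks inside legacy XML before handing them to a GenAI assistant for refactoring, might start with something like this. The XML schema here is invented for illustration and does not reflect the Fed's actual job format.

```python
import xml.etree.ElementTree as ET

# A made-up legacy ETL job definition, standing in for the real format.
LEGACY_JOB = """
<etl_job name="daily_positions">
  <task id="extract" type="sql">SELECT * FROM positions</task>
  <task id="load" type="copy" target="warehouse.positions"/>
</etl_job>
"""

def extract_task_definitions(xml_text):
    """Pull task blocks out of a legacy ETL definition so each one can be
    explained, documented, and refactored (e.g. by a GenAI assistant)."""
    root = ET.fromstring(xml_text)
    tasks = []
    for task in root.findall("task"):
        tasks.append({
            "id": task.get("id"),
            "type": task.get("type"),
            "body": (task.text or "").strip(),
            "attrs": {k: v for k, v in task.attrib.items() if k not in ("id", "type")},
        })
    return tasks

tasks = extract_task_definitions(LEGACY_JOB)
```

Isolating each task as a small, typed record is what makes the GenAI step tractable: the model translates one well-scoped block at a time instead of an entire opaque job file.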

Mendell and his team took advantage of GenAI’s remarkable capability to act as a powerful translator, not just of code, but of context. By giving development squads AI tools that explain, document, and suggest refactoring, Mendell’s team enabled collaboration across distinct technical domains, accelerated modernization, and turned a daunting task into an inclusive and energizing team effort.
Inside BlackRock’s GenAI Copilot: Safe, Scalable AI at Work
Yu Yu, Director of Data Science at BlackRock, delivered an educational and engaging presentation on deploying GenAI in production environments using multi-agent architectures. Her team’s focus was not just on generating insights, but on doing so within a framework designed to integrate trust, safety, and scalability.

Yu presented a detailed evaluation rubric for this framework based on intent recognition, guardrails, answer accuracy, and end-to-end system performance. Her discussion of horizontal versus vertical agent specialization highlighted how architectural choices can significantly impact adaptability and maintainability in enterprise AI systems.
Particularly noteworthy was BlackRock’s use of rigorous guardrails, including hallucination detection, sensitive content filtering, and output validation pipelines, underscoring their commitment to making GenAI enterprise-safe. Yu’s insights reaffirmed that successful GenAI implementation depends not just on technical prowess, but also on governance.
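An output-validation pipeline of this kind can be sketched as a chain of guardrail checks. The rule-based checks below are simplistic stand-ins for the model-based hallucination and content filters an enterprise system would use; none of this reflects BlackRock's actual implementation.

```python
import re

def no_pii(answer, sources):
    """Toy sensitive-content check: reject answers containing an SSN-like pattern."""
    return not re.search(r"\b\d{3}-\d{2}-\d{4}\b", answer)

def grounded(answer, sources):
    """Toy hallucination check: every quoted percentage must appear in a source."""
    figures = re.findall(r"\d+(?:\.\d+)?%", answer)
    return all(any(f in s for s in sources) for f in figures)

def validate_output(answer, sources, checks=(no_pii, grounded)):
    """Run an answer through a chain of guardrails before release;
    an empty failure list means the answer may be shown to the user."""
    return [check.__name__ for check in checks if not check(answer, sources)]

passed = validate_output("Revenue grew 12% last quarter.",
                         sources=["Q3 report: revenue up 12% year over year"])
failed = validate_output("Revenue grew 45% last quarter.",
                         sources=["Q3 report: revenue up 12% year over year"])
```

The chain structure is the point: each guardrail is independent and named, so a rejected answer carries an auditable reason, which is exactly the kind of governance the talk emphasized.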
Inside SqPal: Agentic GenAI for Analysts
SqPal, a GenAI-powered text-to-SQL tool, was presented as a tangible example of operationalizing generative AI: it translates natural-language questions into structured SQL queries through a modular, multi-agent architecture. A Supervisor Agent coordinates the workflow, dynamically managing specialized agents that handle everything from schema discovery and SQL writing to plotting and summarizing results.

The system’s design reflects a human-in-the-loop mindset: even though agents can operate autonomously, a Supervisor ensures each output is reviewed—either by other agents or by a human expert—before it’s executed. This balance between automation and accountability is exactly the kind of structure NovaceneAI supports: one where GenAI can be productive, but not fully autonomous.
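The supervisor pattern described above can be sketched as follows. The agents are stubbed with plain functions standing in for LLM-backed components, and nothing executes until the reviewer (human or agent) approves; all names and logic are illustrative, not SqPal's code.

```python
class SupervisorAgent:
    """Toy coordinator in the spirit of a multi-agent text-to-SQL design:
    routes work to specialized agents and gates execution on review."""
    def __init__(self, agents, reviewer):
        self.agents = agents        # name -> callable standing in for an agent
        self.reviewer = reviewer    # callable(sql) -> bool; human or agent

    def answer(self, question):
        schema = self.agents["schema_discovery"](question)
        sql = self.agents["sql_writer"]({"question": question, "schema": schema})
        if not self.reviewer(sql):                 # nothing runs unreviewed
            return {"status": "rejected", "sql": sql}
        rows = self.agents["executor"](sql)
        return {"status": "ok", "sql": sql,
                "summary": self.agents["summarizer"](rows)}

# Stub agents standing in for the LLM-backed components.
agents = {
    "schema_discovery": lambda q: ["trades(date, ticker, qty)"],
    "sql_writer": lambda t: "SELECT ticker, SUM(qty) FROM trades GROUP BY ticker",
    "executor": lambda sql: [("AAPL", 1200), ("MSFT", 800)],
    "summarizer": lambda rows: f"{len(rows)} tickers returned",
}
reviewer = lambda sql: sql.upper().startswith("SELECT")   # e.g. read-only gate
result = SupervisorAgent(agents, reviewer).answer("total quantity traded per ticker")
```

Swapping the `reviewer` lambda for a human approval step is what turns this from automation into human-in-the-loop automation: the control flow stays identical.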
Perhaps most impressive was SqPal’s real-world orientation. It addressed practical pain points like finding relevant tables, explaining schemas, and letting analysts interact with data through dialogue. This mirrors Novacene’s focus on self-serve AI tools that allow subject matter experts to run complex logic workflows without deep technical knowledge. Just like Novacene’s reusable prompt and ML model architecture, SqPal was designed to scale: users can ask, refine, re-ask, and visualize—all within a governed, expert-feedback-driven loop.
A Cohesive Industry Shift
The panel brought together Kalpan Dharamshi, Senior Software Engineer, Sasibhushan Rao Chanthati, AVP, Senior Software Engineer at T. Rowe Price, and Vishnupriya Devarajulu, Software Engineer.
One aspect of the conversation centered on scalability and cost control. Panelists emphasized that using smaller, more focused language models can often meet business needs while significantly reducing infrastructure demands. When I asked whether blending GenAI with traditional AI and ML might also be part of the answer, Mr. Dharamshi responded that yes, small language models (SLMs) and classical AI and ML methods can sometimes be the better choice. This mirrors Novacene’s hybrid AI architecture, which integrates GenAI for language and reasoning with ML for structure and explainability, giving users the flexibility to optimize for cost, performance, and trust. Mr. Dharamshi added that integrating SLMs with traditional ML allows for a synergistic approach in which the semantic understanding of text provided by SLMs complements the analytical capabilities of ML on structured data. This fusion enables more comprehensive models that make informed predictions by considering both the nuances of language and the patterns within numerical or categorical information. Moreover, training SLMs generally incurs significantly lower costs than training LLMs, primarily because their reduced parameter counts demand less computational power (fewer GPUs/TPUs and shorter training times).
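The fusion Mr. Dharamshi described, language features alongside structured features, can be sketched minimally. The hashed bag-of-words below is a crude stand-in for an SLM embedding, and all feature names are invented for illustration.

```python
import hashlib

def text_embedding(text, dim=8):
    """Stand-in for an SLM embedding: a deterministic hashed bag-of-words.
    A real pipeline would call a small language model here instead."""
    vec = [0.0] * dim
    for word in text.lower().split():
        idx = int(hashlib.md5(word.encode()).hexdigest(), 16) % dim
        vec[idx] += 1.0
    return vec

def fused_features(note, structured):
    """Concatenate language-derived features with structured numeric
    features, so one downstream ML model can consider both."""
    return text_embedding(note) + structured

# Hypothetical case: a free-text service note plus tabular account features
# (tenure in years, disputed amount, count of prior flags).
row = fused_features("customer disputes unrecognized charge",
                     structured=[3, 450.0, 1])
```

The fused vector can then feed any classical classifier, which is where the panel's cost argument lands: the expensive language model runs once per record as a feature extractor, while the cheap structured model does the prediction.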

Mr. Chanthati shared insights on utilizing and training LLMs in enterprise settings, emphasizing the value of Retrieval-Augmented Generation (RAG) to ground responses in data and the use of hybrid architecture. He explained that RAG not only improves response accuracy but also enhances transparency, auditability, and contextual relevance, key advantages for high-stakes, multi-domain applications. He also highlighted best practices for organizations deploying personal or private LLMs in handling Personally Identifiable Information (PII), advocating for structured data masking to balance privacy and utility. His talk offered a clear roadmap for building secure, compliant, and context-aware AI systems within institutions. He discussed cost optimization strategies, noting that selective use of lightweight models and retrieval mechanisms can significantly reduce infrastructure and inference costs without compromising performance, while following best practices for hybrid deployment, model orchestration, and latency-aware architecture design.
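The structured data-masking practice Mr. Chanthati advocated can be sketched with typed placeholders, so downstream models keep context without seeing identities. The patterns below are illustrative and far from exhaustive; production masking would cover many more PII categories.

```python
import re

# Typed PII patterns (illustrative subset; real systems cover far more).
PII_PATTERNS = {
    "EMAIL": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"),
    "SSN":   re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "PHONE": re.compile(r"\b\d{3}[-.]\d{3}[-.]\d{4}\b"),
}

def mask_pii(text):
    """Structured masking before text reaches an LLM: replace each match
    with a typed placeholder so context survives but identity does not."""
    for label, pattern in PII_PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text

masked = mask_pii("Reach John at john.doe@example.com or 555-867-5309, SSN 123-45-6789.")
```

Using typed placeholders rather than blanket redaction is the "balance privacy and utility" point: the model still knows *what kind* of value was there, which preserves meaning for retrieval and generation.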
Ms. Devarajulu offered an in-depth perspective on AI-driven performance optimization within enterprise software systems. She illustrated how traditional machine learning models such as Random Forest and Bayesian optimization can be effectively integrated into application frameworks to predict system failures, proactively minimize downtime, and enhance scalability. These predictive capabilities enable organizations to ensure greater system resilience and efficiency. In addition to classical ML techniques, Ms. Devarajulu discussed GenAI implementations that embed performance metrics and infrastructure efficiency into AI design. She emphasized secure governance and innovation, noting that successful AI deployment hinges on performance, compliance, API-driven orchestration, efficient LLM integrations, and cost-effective, enterprise-ready systems.
Simplifying AI Interactions for Users
Other presentations touched on building user interfaces that simplify how non-technical users interact with AI. These talks emphasized how thoughtful design can make complex workflows feel intuitive, empowering business users to help drive the development roadmap that otherwise would be solely in the hands of data science teams.
Particularly important to usability are foundational investments in metadata management, better retrieval systems, and AI-driven visualizations. These ideas map closely to Novacene’s governed dashboards, which combine natural language interfaces with rich, interactive visualizations—enabling business users to explore AI-generated outputs in a way that is both intuitive and secure. The talks also discussed real-time collaboration workflows to capture expert feedback and improve the AI’s performance, similar to Novacene’s approach of enabling human-in-the-loop workflows directly within the operational inner workings of AI systems.
Looking Ahead
As I continue to have conversations with those at the forefront of business AI adoption, I keep seeing growing demand for AI systems that are robust, transparent, and usable by professionals outside of data science teams. The DSS NYC conversations reaffirmed what we’ve observed across industries: the need for explainability, safety, and collaboration is only increasing.
Save the date: DSS NYC returns on December 11, 2025, at the same fantastic S&P Global headquarters in New York’s Financial District. Keep an eye on the Data Science Salon website to register and join the conversation on taking AI from pilot to production.
What resonated deeply during the sessions was how closely aligned the speakers’ challenges and solutions were with our own mission of packaging complex systems into self-serve tools for business users. The architecture of SqPal echoed Novacene’s integration of modular agents, real-time dialogue, and a built-in human-in-the-loop interface. And Yu’s talk on BlackRock’s GenAI Copilot echoed Novacene’s vision for a platform that enables AI operationalization supported by expert-vetted guardrails.
A year later, the industry has made progress in moving from experimentation to early-stage deployment. Yet the message from last year still rings true: innovation succeeds when technology puts business users—subject matter experts who understand the context in which intelligent systems are expected to create value—at the heart of the AI loop.