Artificial intelligence is no longer sitting at the edges of banking technology stacks. It is moving directly into production environments, shaping customer interactions, internal workflows and decision-making processes at scale.
For QA and software testing teams, this shift is changing the nature of quality itself. Systems are becoming less deterministic, outputs harder to predict and traditional validation approaches increasingly strained.
At the same time, regulatory expectations around resilience, explainability and control are intensifying.
Banks are under pressure to demonstrate not just that AI systems work, but that they behave consistently, can be audited and do not introduce unmanaged risk into critical operations. That combination is exposing a growing disconnect between how quickly AI is being deployed and how effectively it is being tested.
U.S.-based Chris Sheehan, EVP of High Tech and AI at Applause, frames this moment as a turning point for QA teams tasked with validating these systems.

He singled out new research from Applause’s State of Digital Quality in Testing AI 2026 report, which highlights a sharp acceleration in AI rollout alongside persistent weaknesses in validation, governance and testing maturity, raising concerns for QA teams operating under increasing regulatory and operational scrutiny.
More than half of organisations have now released AI-powered features, reflecting a rapid shift from proof of concept to production, Sheehan said.
Yet this progress is uneven, with only a minority of initiatives reaching full-scale deployment, while a significant share are rolled back or deactivated after release due to cost, performance or quality issues.
QA teams struggle
The report pointed to a structural imbalance between AI velocity and testing capability. Organisations are racing to embed generative and agentic AI into customer journeys, internal workflows and decision-making systems, but lack the tooling and expertise required to validate non-deterministic outputs reliably.
“In the current market, organisations are facing resource bottlenecks and technical immaturity as they work to develop comprehensive testing strategies for AI,” Sheehan said.
He stressed that “we’re at an inflection point where teams are increasingly relying on AI and automation to be able to test quickly, but lack the expertise to ensure that those systems they’re using to test have been properly trained and tuned to effectively reduce risk.”
This imbalance is already translating into real-world failures. AI systems continue to produce hallucinations, lose context in multi-step interactions and generate inconsistent outputs, all issues that traditional QA frameworks are not designed to catch.
For financial institutions, where explainability, auditability and customer trust are critical, these weaknesses present a direct challenge to operational resilience and regulatory compliance.
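One way a QA team might start measuring inconsistent outputs is to send the same prompt several times and record how often the answers agree. The Python sketch below is illustrative only and does not come from the Applause report: `consistency_rate`, `normalise` and `make_stub_model` are hypothetical names, and the stub stands in for whatever model call a team actually uses. Exact-match comparison after light normalisation is an assumption here; real evaluations often need semantic comparison or human review.

```python
from collections import Counter
from typing import Callable


def normalise(text: str) -> str:
    """Lower-case and collapse whitespace so trivial formatting differences are ignored."""
    return " ".join(text.lower().split())


def consistency_rate(generate: Callable[[str], str], prompt: str, runs: int = 10) -> float:
    """Call the model `runs` times and return the share of runs
    that agree with the most common (normalised) answer."""
    if runs < 1:
        raise ValueError("runs must be at least 1")
    answers = [normalise(generate(prompt)) for _ in range(runs)]
    most_common_count = Counter(answers).most_common(1)[0][1]
    return most_common_count / runs


def make_stub_model(replies: list[str]) -> Callable[[str], str]:
    """Hypothetical stand-in for a real model call: cycles through canned replies."""
    state = {"i": 0}

    def generate(prompt: str) -> str:
        reply = replies[state["i"] % len(replies)]
        state["i"] += 1
        return reply

    return generate


if __name__ == "__main__":
    stub = make_stub_model([
        "Your balance is 100",
        "your balance is  100",
        "I cannot help",
        "Your balance is 100",
    ])
    print(consistency_rate(stub, "What is my balance?", runs=4))
```

A low rate on prompts that should have a single correct answer is a signal to investigate, not a verdict; the acceptable level is a judgment each team would need to set for its own use case.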
One of the clearest signals from the report is the frequency of AI feature rollbacks. A substantial proportion of organisations have pulled live AI systems after deployment, often because operational costs outweighed user value or testing revealed flaws too late in the lifecycle.
This pattern reflects a broader issue. AI initiatives are still struggling to transition from controlled environments into production-grade systems that can operate reliably at scale.
Quality issues are also becoming more visible to end users. Reports of hallucinations, misunderstood prompts and incomplete responses are increasing, even as productivity gains from AI remain high.

For QA leaders in banking, this creates a dual pressure: enabling faster AI delivery while preventing reputational, financial and regulatory fallout from poorly tested systems.
Despite advances in automation and AI-driven testing, the report reinforces a consistent theme. Human oversight remains central to effective AI validation.
Most organisations still rely heavily on human evaluation to assess AI outputs, particularly for context, bias and user experience, areas where automated testing alone falls short. Hybrid models combining AI-driven testing with human validation are emerging as the most effective approach.
High-performing teams are also adopting continuous evaluation loops, embedding testing throughout the software development lifecycle rather than treating it as a final checkpoint. This includes real-world testing, domain expert input and ongoing monitoring of AI behaviour post-deployment.
For financial services firms, the findings underline a broader shift. QA is no longer just a delivery function, but a critical control layer for AI risk.
As banks deploy AI into core systems, from customer interactions to decision support, testing must evolve to address probabilistic behaviour, model drift and complex, multimodal outputs.
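For readers wondering what monitoring for model drift can look like in practice, the sketch below computes a Population Stability Index (PSI), a measure often used in credit risk to compare a baseline distribution of scores with a current one. It is an illustration under stated assumptions, not a method described in the report: the function names, the bucket edges and the small floor used to avoid taking the logarithm of zero are all hypothetical choices.

```python
import math
from typing import Sequence


def bucket_shares(values: Sequence[float], edges: Sequence[float],
                  floor: float = 1e-6) -> list[float]:
    """Share of values in each bucket [edges[i], edges[i+1]).
    The last bucket also includes its upper edge. Values outside the
    edges are not counted in any bucket. Shares are floored (an
    assumption) so the logarithm below is always defined."""
    if not values:
        raise ValueError("values must not be empty")
    counts = [0] * (len(edges) - 1)
    for v in values:
        for i in range(len(counts)):
            is_last = i == len(counts) - 1
            if edges[i] <= v < edges[i + 1] or (is_last and v == edges[-1]):
                counts[i] += 1
                break
    return [max(c / len(values), floor) for c in counts]


def psi(baseline: Sequence[float], current: Sequence[float],
        edges: Sequence[float]) -> float:
    """Population Stability Index: sum over buckets of
    (current_share - baseline_share) * ln(current_share / baseline_share)."""
    base = bucket_shares(baseline, edges)
    curr = bucket_shares(current, edges)
    return sum((c - b) * math.log(c / b) for b, c in zip(base, curr))


if __name__ == "__main__":
    edges = [0.0, 0.5, 1.0]              # hypothetical two-bucket split
    baseline = [0.1, 0.2, 0.6, 0.7]      # illustrative scores, not real data
    shifted = [0.6, 0.7, 0.8, 0.9]
    print(psi(baseline, baseline, edges))  # identical distributions
    print(psi(baseline, shifted, edges))   # scores have moved upward
```

A PSI of zero means the two distributions fall into the buckets in the same proportions, and larger values indicate a bigger shift. The point at which a shift should trigger review is a policy decision for each institution.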