Banks rethink AI testing as ‘confidence’ becomes the QA benchmark

Hélder Ferreira, Director of Product Management at Sembi

As generative AI rapidly reshapes software delivery, QA and software testing teams in banking and financial services are being pushed into a new phase of automation, one defined less by novelty and more by accountability.

AI-driven tooling can now produce test cases, suggest scenarios, and even attempt self-healing maintenance, but financial institutions operate in environments where reliability, auditability, and regulatory expectations leave little room for brittle or untraceable testing artefacts.

For banks, insurers, and capital markets firms, the question is no longer whether GenAI can accelerate testing, but whether it can do so without eroding confidence.

As AI features introduce non-deterministic behaviour into production systems, and as delivery cycles compress, QA leaders are increasingly focused on keeping test intent, execution, and governance connected end to end.

Hélder Ferreira, Director of Product Management at Sembi, and Bruno Mazzotta, Solution Engineer Manager at testRigor, said early enthusiasm for AI-driven testing has often centred on automating the writing of test cases.

But they warned that “speed doesn’t equal confidence,” particularly when auto-generated tests fail to reflect real product risk.

“GenAI showed up in QA through test case generation because writing cases takes time, and coverage backlogs are real,” Ferreira and Mazzotta wrote. “But speed doesn’t equal confidence.”

They argued that “auto-generated tests that don’t reflect the product, conventions, or real risks just push work downstream, leaving testers to rewrite and revalidate.”

Instead, they said what holds up in practice is “lifecycle-wide support,” where AI assists planning, execution, triage, and maintenance, while “humans stay accountable for what ships.”

Why ‘one-shot’ AI testing breaks down

The pair cautioned against what they described as “one-shot” AI testing, where tools produce a full test case in seconds and the output becomes “official” before assumptions are verified.

“Many autonomous tools promise a full test case in seconds, but issues arise when the output becomes ‘official’ before anyone verifies the assumptions,” Ferreira and Mazzotta said.

They noted that this dynamic leads to predictable outcomes: “Teams accept low-quality artifacts because they’re busy,” or “they spend time cleaning up output that never should have been generated in the first place.”

A more sustainable model, they argued, is review-first, where AI proposes coverage ideas and testers approve or edit before cases are written.

“A healthier pattern is review-first: AI can propose coverage ideas, edge cases, and acceptance criteria, and a tester approves or edits the plan before detailed cases are written,” they said.
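The review-first pattern the pair describe can be sketched in a few lines of hypothetical Python: the AI proposes coverage ideas, and only items a tester explicitly approves go on to become detailed test cases. All names here (`ai_propose_coverage`, `TestIdea`, and the sample ideas) are illustrative assumptions, not any vendor's real API.

```python
# Sketch of a "review-first" gate: AI proposals stay drafts until a
# tester approves them. All identifiers are hypothetical.
from dataclasses import dataclass

@dataclass
class TestIdea:
    title: str
    risk: str            # e.g. "boundary", "risk control"
    approved: bool = False
    notes: str = ""      # tester edits and clarifications

def ai_propose_coverage(requirement: str) -> list[TestIdea]:
    """Stand-in for an AI call that drafts coverage ideas for a requirement."""
    return [
        TestIdea("Happy path: valid transfer within limits", "core flow"),
        TestIdea("Edge: transfer exactly at daily limit", "boundary"),
        TestIdea("Negative: transfer exceeding limit is rejected", "risk control"),
    ]

def review(ideas: list[TestIdea], approvals: dict[str, str]) -> list[TestIdea]:
    """Tester approves (optionally with notes) before anything becomes 'official'."""
    for idea in ideas:
        if idea.title in approvals:
            idea.approved = True
            idea.notes = approvals[idea.title]
    return [i for i in ideas if i.approved]   # only approved ideas proceed

ideas = ai_propose_coverage("Customer can transfer funds up to a daily limit")
plan = review(ideas, {"Edge: transfer exactly at daily limit": "also check audit log"})
print([i.title for i in plan])   # only the approved idea survives
```

The point of the gate is that rejection is the default: an idea nobody reviewed never reaches the "official" test suite.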

For banks and insurers facing complex environments, they said expertise remains critical in deciding “what’s meaningful, what’s redundant, and what carries risk.”


“Testing AI-infused features changes what ‘expected result’ means.”

Bruno Mazzotta

Ferreira and Mazzotta framed the shift as part of a broader move toward what they called an “intelligent quality ecosystem,” rather than isolated AI features.

“An ‘intelligent quality ecosystem’ is less about one feature and more about a holistic connection,” they wrote.

In that model, “test intent stays anchored in a test management system,” execution happens in an automation layer “built for resilience,” and AI helps bridge intent and execution by translating requirements into maintainable checks.

“That connection is what prevents AI from turning into a test factory that produces volume without traceable intent,” they said.

They also pointed to day-to-day areas where AI can reduce friction, including test data creation, failure triage, and automation maintenance.

“When a pipeline fails, QA and developers often dig through noise to find what broke,” Ferreira and Mazzotta said.

“AI can reduce pipeline noise by accelerating failure attribution and shortening time-to-signal for teams responsible for the change.”
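One way to picture "shortening time-to-signal" is clustering raw failures by suspected cause so the owning team sees one signal instead of dozens of red tests. The sketch below is a hypothetical illustration, not a real tool; the attribution function, error strings, and deploy identifier are all invented for the example.

```python
# Illustrative failure-attribution sketch: group pipeline failures by
# likely root cause. The attribution logic and change id are hypothetical.
from collections import defaultdict

failures = [
    {"test": "test_login", "error": "TimeoutError: auth-service"},
    {"test": "test_transfer", "error": "TimeoutError: auth-service"},
    {"test": "test_statement_pdf", "error": "AssertionError: missing footer"},
]

def attribute(failure: dict) -> str:
    """Stand-in for a model mapping an error to a suspected cause/owner."""
    if "auth-service" in failure["error"]:
        return "auth-service deploy #1423 (team: identity)"   # hypothetical change
    return "unattributed (needs manual triage)"

grouped: dict[str, list[str]] = defaultdict(list)
for f in failures:
    grouped[attribute(f)].append(f["test"])

for cause, tests in grouped.items():
    print(f"{cause}: {len(tests)} failing test(s) -> {tests}")
```

Three failures collapse into two signals, and two of them route straight to the team responsible for the change.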

They added that AI-assisted self-healing can help stabilise automation after UI changes, but only when it remains reviewable.

“AI-assisted self-healing helps when it stays reviewable,” they wrote, warning that “human confirmation and monitoring prevent silent drifts.”
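"Reviewable" self-healing can be sketched as a queue of proposed fixes that a human must confirm before anything is applied. This is a minimal illustration under assumed names (`HealingProposal`, `propose_heal`, the sample selectors); it is not drawn from any specific automation framework.

```python
# Sketch of reviewable self-healing: a broken locator gets a proposed
# replacement, but the fix stays pending until a human confirms it.
from dataclasses import dataclass

@dataclass
class HealingProposal:
    test: str
    old_locator: str
    new_locator: str
    confidence: float
    status: str = "pending-review"   # never auto-applied

def propose_heal(test: str, old: str) -> HealingProposal:
    """Stand-in for an AI suggestion after a UI change breaks a selector."""
    return HealingProposal(test, old, "button[data-testid='submit-transfer']", 0.92)

def apply_if_confirmed(p: HealingProposal, reviewer_approved: bool) -> HealingProposal:
    # Human confirmation is the gate that prevents silent drift.
    p.status = "applied" if reviewer_approved else "rejected"
    return p

p = propose_heal("test_transfer_submit", "#submit-btn")
print(apply_if_confirmed(p, reviewer_approved=True).status)   # -> applied
```

Keeping a record of old locator, new locator, and reviewer decision is also what makes the healing auditable after the fact, which matters in regulated environments.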

AI-infused products

For financial institutions increasingly deploying AI-infused products, Ferreira and Mazzotta said testing also changes fundamentally, because “expected result” is no longer always deterministic.

“Testing AI-infused features changes what ‘expected result’ means,” they said. “A support assistant, coding helper, or summarizer won’t always return the same text twice, even when it is behaving as intended.”

Instead of matching exact outputs, teams validate intent, enforce safety guardrails, verify retrieval behaviour, and monitor drift over time.

“So, the checks shift,” they wrote. “Instead of only matching exact outputs, teams validate intent… enforce safety and policy guardrails… verify retrieval behavior… and watch for drift over time.”
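The shifted checks can be illustrated with a property-based validator: rather than comparing an assistant's reply to one exact string, each reply is tested for intent, policy guardrails, and sanity bounds. The banned phrases and intent keywords below are illustrative assumptions, not a real policy.

```python
# Hedged sketch: validate properties of a non-deterministic reply
# instead of matching exact output. Lists below are invented examples.
BANNED = ["guaranteed returns", "cannot fail"]   # safety/policy guardrail
INTENT_KEYWORDS = ["balance", "account"]         # what the answer must cover

def validate_reply(reply: str) -> list[str]:
    """Return a list of check failures; an empty list means the reply passes."""
    problems = []
    text = reply.lower()
    if not any(k in text for k in INTENT_KEYWORDS):   # intent: is it on-topic?
        problems.append("off-intent: does not address the account balance")
    for phrase in BANNED:
        if phrase in text:                            # guardrail: banned claims
            problems.append(f"policy violation: '{phrase}'")
    if not (20 <= len(reply) <= 1000):                # bound: catches empty/rambling drift
        problems.append("length out of expected range")
    return problems

# Two differently worded replies can both pass; an unsafe one fails.
print(validate_reply("Your account balance is $1,240.50 as of today."))   # -> []
print(validate_reply("Invest now for guaranteed returns!"))               # -> two failures
```

Running the same checks over sampled production traffic, and tracking the failure rate over time, is one simple way to "watch for drift" as the authors put it.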

Ultimately, Ferreira and Mazzotta argued that the real measure of GenAI in QA is not the volume of tests produced, but whether teams can trust what is being shipped.

“The litmus test for AI in QA is confidence, not volume,” they said.

“When evaluating AI for testing, prioritize approaches that keep teams in control and maintain end-to-end testing connectivity,” they wrote.

“If teams can’t trace what was tested, why it matters, and what changed between runs, AI just creates more output. If they can, it becomes a practical way to protect coverage and ship with confidence.”
