AI in QA: how flexible testing is redefining assurance for financial firms


As financial institutions embed artificial intelligence deeper into their customer platforms, risk engines, and compliance systems, quality assurance is entering uncharted territory.

Algorithms that evolve continuously and outputs that differ from one test cycle to another are breaking the assumptions behind traditional, deterministic testing.

For banks and insurers, where precision and reliability are non-negotiable, this raises a pressing question: how can QA teams ensure consistency in systems that no longer behave the same way twice?

A new white paper by U.S.-based software-testing automation firm Usetrace, led by its CEO Anthony Schmidt, explores exactly this issue.

Titled ‘Flexible Testing Approaches for Non-Deterministic AI and Data-Driven Systems’, the study outlines practical ways to restore confidence in QA pipelines as generative and predictive technologies reshape software behaviour.

“Software applications are evolving faster than ever as artificial intelligence and data-driven features become ubiquitous,” Schmidt stated.

“While these capabilities enable dynamic personalisation and powerful analytics, they also introduce non-deterministic behaviours: the same input can produce different valid outputs,” he added.

For financial QA teams accustomed to comparing actual versus expected results, that shift means old testing scripts “trigger false failures or demand frequent updates.”

Schmidt argued that such environments require “flexible testing strategies [that] can tame non-deterministic applications without sacrificing reliability.”

Instead of verifying a single correct value, the approach validates that results fall within a “valid range or meet certain properties,” he stressed.

For example, if an AI system delivers fluctuating fraud-risk scores or variable chatbot responses, flexible assertions ensure those differences remain within acceptable business or compliance thresholds.
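As a rough illustration of such a flexible assertion, the sketch below scores the same transaction repeatedly and checks that every score stays inside an agreed band and that the run-to-run spread stays small. The scoring function, the band, and the spread limit are all hypothetical placeholders, not values or code from the white paper.

```python
import random

def fraud_risk_score(transaction_amount: float, rng: random.Random) -> float:
    """Hypothetical stand-in for a non-deterministic model: a base score plus noise."""
    base = min(transaction_amount / 10_000, 1.0)
    noisy = base + rng.uniform(-0.03, 0.03)
    return max(0.0, min(1.0, noisy))

def check_score_stability(scores, low, high, max_spread):
    """Return a list of violations; an empty list means the run is acceptable."""
    problems = []
    for score in scores:
        if not (low <= score <= high):
            problems.append(f"score {score:.3f} outside [{low}, {high}]")
    spread = max(scores) - min(scores)
    if spread > max_spread:
        problems.append(f"spread {spread:.3f} exceeds {max_spread}")
    return problems

# Score the same transaction 20 times and validate the whole batch.
rng = random.Random(42)
scores = [fraud_risk_score(5_000, rng) for _ in range(20)]
violations = check_score_stability(scores, low=0.4, high=0.6, max_spread=0.1)
print(violations or "all scores within the accepted band")
```

The test passes for any score the noisy model can produce around 0.5, yet still fails if a score drifts out of the band or the runs start to disagree widely.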

Traditional automation

Traditional scripted automation, according to the white paper, is especially brittle in these conditions.

“Traditional test scripts often encode specific expected values or sequences. If data or AI logic changes, the scripts fail unless they are updated,” Schmidt shared.

That fragility, the authors note, drives up maintenance costs and erodes confidence: “Analysts note that test automation’s ROI plummets when more than 30% of developer or QA time goes to ‘chasing failing tests’ rather than genuinely improving coverage.”

To address this, Schmidt advocates layered validation techniques. “At the heart of flexible testing is the concept of layers of assertions, each verifying a different aspect of correctness.”

He stressed that “rather than one binary check, such as ‘output must be X’, multiple conditions define what ‘correct’ looks like, even if the system is non-deterministic.”

These layers can include threshold checks for numeric outputs, regex-driven patterns for language variations, and property-based testing to verify relationships between runs.
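A minimal sketch of how these layers might be combined for a chatbot that reports an account balance is shown below. The patterns, the confidence threshold, and the function name `check_replies` are illustrative assumptions and do not reflect Usetrace's tooling.

```python
import re

# Hypothetical patterns: a dollar amount with two decimals, and the expected topic.
AMOUNT = re.compile(r"\$\s?(\d[\d,]*\.\d{2})")
TOPIC = re.compile(r"\b(balance|available funds)\b", re.IGNORECASE)

def check_replies(replies, confidences, min_confidence=0.7):
    """Apply three assertion layers to several runs of the same prompt."""
    failures = []

    # Layer 1: threshold check on a numeric output.
    for i, confidence in enumerate(confidences):
        if confidence < min_confidence:
            failures.append(f"run {i}: confidence {confidence:.2f} below {min_confidence}")

    # Layer 2: regex checks that tolerate different wording.
    amounts = []
    for i, reply in enumerate(replies):
        if not TOPIC.search(reply):
            failures.append(f"run {i}: reply never mentions the balance")
        match = AMOUNT.search(reply)
        if match is None:
            failures.append(f"run {i}: no currency amount found")
        else:
            amounts.append(match.group(1))

    # Layer 3: a property relating the runs: wording may vary, the amount may not.
    if len(set(amounts)) > 1:
        failures.append(f"runs disagree on the amount: {sorted(set(amounts))}")

    return failures

replies = [
    "Your balance is $1,250.00.",
    "You currently have available funds of $1,250.00 in your account.",
]
print(check_replies(replies, confidences=[0.91, 0.84]))
```

Each layer fails independently, so a report can say which aspect of correctness broke rather than only that the output changed.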

Schmidt also highlighted metamorphic testing. Instead of comparing output against one static expected value, testers confirm that a known transformation of the input produces the corresponding change in the output. It is an approach that “allows for testing that is more flexible and constraining across applications where the exact output is not fully deterministic.”
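One common metamorphic relation for search features is that narrowing a query should never introduce results the broader query did not return. The sketch below checks that relation against a toy search function whose result order changes between runs. The document set, `search`, and `narrowing_relation_holds` are hypothetical and serve only to show the shape of such a test.

```python
import random

# Hypothetical in-memory data set standing in for a real search index.
DOCUMENTS = [
    {"id": 1, "text": "loan default risk", "year": 2021},
    {"id": 2, "text": "credit risk model", "year": 2022},
    {"id": 3, "text": "market risk report", "year": 2021},
    {"id": 4, "text": "customer onboarding", "year": 2021},
]

def search(term, year=None, rng=None):
    """Toy search whose result order varies from run to run."""
    rng = rng or random.Random()
    hits = [
        doc["id"]
        for doc in DOCUMENTS
        if term in doc["text"] and (year is None or doc["year"] == year)
    ]
    rng.shuffle(hits)  # simulate non-deterministic ranking
    return hits

def narrowing_relation_holds(search_fn, term, year):
    """Metamorphic relation: adding a filter must yield a subset of the broad results."""
    broad = set(search_fn(term))
    narrow = set(search_fn(term, year=year))
    return narrow <= broad

print(narrowing_relation_holds(search, "risk", 2021))
```

No expected result list is hard-coded anywhere: the test only compares two runs against each other, which is why it survives changes in ranking or data.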

A case study involving legal research platform Trellis illustrates the concept in practice. Trellis initially struggled with brittle scripts that failed whenever data sets changed.

By adopting Usetrace’s scriptless tool and replacing exact expected results with thresholds (checking, for instance, that a query returns between 5 and 50 matches), Trellis “dramatically reduced false alarms and maintenance overhead.”

As the report noted: “The QA team still caught genuine issues…but valid fluctuations, like 15 to 17 results, caused no test failures.”
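The difference between the two styles of check can be sketched in a few lines. The 5-to-50 range and the 15 and 17 result counts come from the case study; the function names are hypothetical, and the code is not a representation of how the Usetrace tool implements its assertions.

```python
def exact_check(count: int, expected: int) -> bool:
    """Brittle assertion: passes only when the count matches one recorded value."""
    return count == expected

def range_check(count: int, minimum: int = 5, maximum: int = 50) -> bool:
    """Flexible assertion: passes for any count inside the accepted range."""
    return minimum <= count <= maximum

# A data refresh moves the result count from 15 to 17; an outage returns 0.
for count in (15, 17, 0):
    print(count, exact_check(count, expected=15), range_check(count))
```

The move from 15 to 17 results breaks the exact check but not the range check, while an empty result set still fails both.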


Scriptless automation is also a recurring theme. “A crucial enabler of Trellis’s success was scriptless, low-code, automation,” Schmidt continued.

This lowers the barrier to advanced testing by letting non-technical stakeholders define assertions through a visual interface, he explained.

“Thresholds, regex patterns, or property checks can be tweaked in a GUI rather than code,” he added, enabling faster adaptation as AI models evolve.

For banks and insurers facing similar dynamics, such as fluctuating credit-risk scores, changing data feeds, or evolving language models in customer-service bots, the implications are significant.

As Schmidt noted: “These flexible methods matter more than ever because AI is increasingly baked into mainstream software, from chatbots to recommendations to predictive analytics.”

The benefits, Usetrace concluded, extend beyond technical stability: “Implementing flexible, scriptless testing yields benefits in cost savings, speed, and lower production risk. Stable automation fosters continuous testing. Flaky tests rarely block deployments, so releases flow faster.”

Schmidt sees a future where QA is less about enforcing rigid expectations and more about governing intelligent variability.

“Flexible testing offers a pragmatic solution,” he concluded. “By layering assertions, setting thresholds, or using regex and metamorphic checks, QA can capture real intent without overfitting to one specific output.”

In an era when AI is rewriting the rules of both banking and software assurance, Schmidt’s message to QA leaders is clear: evolve your test strategies now, or risk chasing false alarms while the algorithms move on without you.



