Thing Event System by Pentatonic
AI Safety · March 28, 2026 · 10 min read

AI bias detection in production: moving beyond static audits

Most bias detection is a pre-deployment checkbox. Run a fairness audit, document the results, ship the model. But bias in production isn't static — it emerges from data distributions that shift, user populations that change, and AI systems that drift. You need detection that evolves as fast as your data.

The problem with static audits

A static bias audit is a photograph. It tells you what bias looked like at a single point in time, on a specific dataset, with specific metrics. By the time you act on it, the world has moved.

In production AI systems — especially autonomous agents making decisions about pricing, routing, and settlement — three things change continuously:

  • Data distribution. The items, users, and transactions your agent processes today are different from last month.
  • Model behaviour. Even without retraining, LLM behaviour shifts via prompt changes, context window contents, and API updates.
  • Interaction patterns. Users adapt to the system, creating feedback loops that static audits can't capture.

What Bias Evolution does differently

Bias Evolution runs continuous detection on your live event stream. Instead of hand-crafted fairness metrics, it uses evolutionary algorithms to discover bias patterns you didn't know to look for.

The process:

  1. Generate hypotheses. Detection rules are encoded as expression trees. An initial population of 500 rules is generated — each one a hypothesis about a potential bias pattern.
  2. Test against events. Each rule is evaluated against your event stream. Fitness is measured by signal strength — does this rule detect a real, recurring pattern?
  3. Evolve. High-fitness rules survive. Crossover and mutation operators create novel hypotheses. Low-fitness rules die. Over generations, the population converges on rules that detect real bias.
  4. Anti-degeneration lock. Every evolved rule must pass 1,000 runs without false positives before deployment. This prevents the evolution from producing noise.

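The loop above can be sketched in miniature. This is an illustrative simplification, not the TES implementation: rules are plain predicate functions rather than expression trees, `fitness` is just match rate, and subtree crossover and the anti-degeneration lock are elided.

```javascript
// Fitness: the fraction of events a candidate rule matches.
// (The real system weights toward recurring, non-noise signals.)
function fitness(rule, events) {
  const hits = events.filter(rule.test).length;
  return hits / events.length;
}

// One generation: score every rule, keep the top half,
// refill the population with mutated copies of the survivors.
function evolve(population, events, mutate) {
  const scored = population
    .map((rule) => ({ rule, score: fitness(rule, events) }))
    .sort((a, b) => b.score - a.score);
  const survivors = scored
    .slice(0, Math.ceil(scored.length / 2))
    .map((s) => s.rule);
  const offspring = survivors.map(mutate);
  return survivors.concat(offspring);
}
```

Run over many generations, high-fitness hypotheses persist and spawn variants while low-fitness ones disappear — the convergence described in step 3.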
Bias pattern lifecycle
// Bias patterns have their own lifecycle:
// emerging → stable → declining → dormant

const patterns = await tes.query(`{
  biasPatterns(status: "stable") {
    id
    description
    confidence
    velocity        # how fast is it growing?
    affected_count  # how many events matched?
    first_detected
    last_detected
  }
}`);

// Each pattern includes Bayesian confidence
// that updates with every new matching event
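One standard way to realise that per-event Bayesian update is a Beta distribution: each matching event is a success, each non-match a failure, and the reported confidence is the posterior mean. The function name and fields below are illustrative, not the TES API.

```javascript
// Beta-distribution confidence update: alpha counts matches,
// beta counts non-matches; confidence is the posterior mean.
function updateConfidence(pattern, matched) {
  const alpha = pattern.alpha + (matched ? 1 : 0);
  const beta = pattern.beta + (matched ? 0 : 1);
  return { ...pattern, alpha, beta, confidence: alpha / (alpha + beta) };
}
```

Starting from a uniform prior (`alpha = 1, beta = 1`), confidence rises smoothly with corroborating events and falls as the pattern stops matching — which is also what drives the emerging → stable → declining → dormant transitions.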

What it detects

Because the detection rules are evolved, not hand-written, Bias Evolution finds patterns that humans wouldn't think to test for. Common findings include:

Price bias

Systematic over- or under-valuation by brand, category, or condition grade. Detects when market pricing diverges from actual sale prices.
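A toy version of this check: compute the mean signed relative error between the agent's valuation and the actual sale price per group, and flag groups where it stays persistently away from zero. Field names and the threshold are illustrative assumptions, not the evolved rules themselves.

```javascript
// Flag groups (brand, category, condition grade) whose valuations
// systematically diverge from realised sale prices.
function priceBiasByGroup(records, threshold = 0.1) {
  const groups = {};
  for (const r of records) {
    if (!groups[r.group]) groups[r.group] = [];
    // Signed relative error: positive means over-valuation.
    groups[r.group].push((r.valuation - r.salePrice) / r.salePrice);
  }
  const flagged = {};
  for (const [group, errs] of Object.entries(groups)) {
    const mean = errs.reduce((a, b) => a + b, 0) / errs.length;
    if (Math.abs(mean) > threshold) flagged[group] = mean;
  }
  return flagged;
}
```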

Routing bias

Items from certain categories or origins being disproportionately routed to specific outcomes (recycle vs resell).

Agent preference drift

An agent gradually favouring certain suppliers, products, or decision paths over time without explicit instruction.

Demographic patterns

Correlations between holder attributes and processing speed, valuation, or outcome. Flags potential fair-lending issues.

Temporal anomalies

Processing times or approval rates that vary by time of day, day of week, or season in unexpected ways.

Classification drift

Vision AI condition grades trending systematically higher or lower over time, suggesting model degradation.
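The simplest form of this drift check compares the mean grade in a recent window against a historical baseline. The window size and threshold here are illustrative assumptions; the evolved detectors discover their own boundaries.

```javascript
// Detect a systematic shift in condition grades: compare the mean of
// the most recent `window` grades against the mean of everything before it.
function gradeDrift(grades, window = 50, threshold = 0.3) {
  if (grades.length < window * 2) return { drifting: false, delta: 0 };
  const mean = (xs) => xs.reduce((a, b) => a + b, 0) / xs.length;
  const baseline = mean(grades.slice(0, grades.length - window));
  const recent = mean(grades.slice(-window));
  const delta = recent - baseline;
  return { drifting: Math.abs(delta) > threshold, delta };
}
```

A positive `delta` means grades are trending higher than history; either direction past the threshold suggests the vision model, not the items, has changed.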

Why event sourcing makes this possible

Bias Evolution works because TES provides a complete, immutable record of every decision. Traditional databases overwrite state — you can't detect temporal patterns in data that's been updated. With event sourcing, every historical state is preserved.

The evolutionary algorithms run directly on the event stream. They have access to the full decision chain: what the agent saw, what it decided, who it affected, and what the outcome was. This is the dataset that static audits reconstruct imperfectly from snapshots.
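The difference is easy to see in miniature. With a state store, each update overwrites the last, so only the final value survives; with an event log, the full trajectory is available to a temporal detector. Event and field names below are illustrative, not the TES schema.

```javascript
// Three valuation events for one item over time.
const events = [
  { item: "i1", type: "Valued", price: 100, at: 1 },
  { item: "i1", type: "Valued", price: 80, at: 2 },
  { item: "i1", type: "Valued", price: 60, at: 3 },
];

// State-store view: each update overwrites the last.
// Only the final price survives — the decline is invisible.
const currentState = events.reduce(
  (state, e) => ({ ...state, [e.item]: e.price }),
  {}
);

// Event-log view: the full trajectory is preserved,
// so a detector can see the systematic decline.
const trajectory = events
  .filter((e) => e.item === "i1")
  .map((e) => e.price);
const declining = trajectory.every((p, i) => i === 0 || p < trajectory[i - 1]);
```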

Combined with Agent Memory, the system can also detect bias in what agents remember versus what they forget — a class of bias that's invisible to systems without persistent memory tracking.

The regulatory angle

The EU AI Act Article 15 requires that high-risk AI systems maintain "accuracy, robustness, and cybersecurity" throughout their lifecycle — not just at deployment. Continuous bias detection satisfies the "ongoing monitoring" requirement in a way that static audits cannot.

When a regulator asks "how do you monitor for bias in production?", the answer is a dashboard showing live pattern velocity, confidence scores, and affected populations — not a PDF from six months ago.

Pentatonic Engineering

London, UK
