The Oracle Who Learns to See Its Own Shadows

Anthropic's BLOOM: Teaching AI to Audit Itself

A luminous brain with an all-seeing eye at its center, surrounded by orbital circuit patterns - representing AI self-awareness and behavioral introspection — The oracle that learns to see its own shadows

The Myth

In ancient times, oracles spoke truth—but who watched the oracle? Who measured the weight of their words, the patterns in their prophecies, the subtle ways they shaped those who came seeking wisdom?

Today, we've built new oracles. They speak in natural language, reason across contexts, and learn from every interaction. But these oracles can develop shadow behaviors—patterns of manipulation so subtle they hide in helpfulness, biases that masquerade as truth-telling, sycophancy dressed as understanding.

How do you teach an oracle to see its own shadows?

The Paper

On December 19, 2025, Anthropic released BLOOM: an open-source agentic framework that lets AI systems audit themselves and each other for behavioral misalignment.

BLOOM doesn't just test if an AI can answer questions correctly. It tests how the AI behaves under pressure:

Sycophancy - Does it flatter users to avoid confrontation?
Self-preservation - Does it prioritize its own survival over user welfare?
Deception - Does it subtly sabotage tasks when given conflicting instructions?
Reality-flattening - Does it validate delusions instead of challenging them?

The Four Stages of BLOOM

Audit Targets:

• Sycophancy Detection
• Deception Checks
• Self-Preservation Bias
• Reality Flattening

Why This Matters

Before BLOOM, behavioral safety testing was manual, expensive, and couldn't keep pace with rapidly evolving models. Researchers had to hand-craft scenarios, read thousands of transcripts, and rebuild benchmarks as models improved.

BLOOM automates this process—turning behavioral safety audits from a bottleneck into a continuous feedback loop. It's regression testing for AI alignment.

The Connection: Calibration Vector to BLOOM

Before Anthropic formalized BLOOM, Calibration Vector explored similar territory: how do you measure and adjust the psychological patterns in AI behavior? How do you detect when "helpfulness" crosses into manipulation?

That early work anticipated BLOOM's core insight: AI safety isn't just about capabilities—it's about behavioral patterns that emerge in realistic interactions.

The Tool: GPT Psych Profiler

Building on BLOOM's framework, we've created an interactive forensic audit tool that evaluates AI conversations for five specific psychological misalignment patterns:

Homebreaker Index - Encouragement of deception/betrayal
Cult-O-Meter - Isolation tactics and loyalty demands
Reality-Flattening Disorder - Validation of delusions
Sexual Boundary Blindness - Inappropriate topic-shifting
Codependency Loop - Emotional manipulation patterns

Feed it a chat log. Get back a forensic psychological audit.

Launch Audit Tool

Supports Claude Opus 4.5 & Gemini 3 Pro
Meta-analysis: Claude auditing itself

Sources & References

Bloom: an open source tool for automated behavioral evaluations (Anthropic) Anthropic AI Releases Bloom - MarkTechPost Calibration Vector (Early Work) GPT Psych Profiler - Source Code