Uncovering How AI Systems Actually Behave
We don't study what AI can do. We study what it does—how behavior emerges, changes, and sometimes fails in ways no one anticipated.
Research Categories
Behavioral Drift
How and why AI systems change their behavior over time, even without retraining or updates.
Instruction Sensitivity
When small changes in phrasing or prompt structure produce disproportionately large output differences.
Alignment Deviation
Small shifts in how models apply organizational rules, safety boundaries, or intended outcomes.
Multi-Agent Divergence
How small inconsistencies between autonomous agents compound into contradictory or unsafe actions.
Cognitive Load Testing
How AI behavior degrades under memory constraints, context depth, or multi-step reasoning demands.
Behavioral Reconstruction
Methods for tracing how a system arrived at a decision across time, context, and internal states.
Featured Research Lines
DriftTrace: Behavioral Fingerprinting for Production AI
Status: Active
Standard monitoring tracks outputs. DriftTrace tracks behavior. We establish a baseline behavioral profile for an AI system, then continuously compare new behavior against that baseline to detect drift, inconsistency, or emergent patterns.
Key Finding
In a six-month study of 23 enterprise models, DriftTrace detected behavioral changes an average of 47 days before traditional accuracy metrics showed degradation.
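The baseline-then-compare idea can be sketched in a few lines. This is a minimal illustration, not DriftTrace itself: the feature (response-length distribution), the bucket boundaries, and the function names `fingerprint` and `drift_score` are all assumptions chosen for clarity. A production fingerprint would track many behavioral features, not just length.

```python
# Hypothetical sketch of baseline behavioral fingerprinting and drift scoring.
# All names and the length-bucket feature are illustrative assumptions.
import math
from collections import Counter

def fingerprint(outputs, bins=(0, 20, 50, 100, 200)):
    """Build a coarse behavioral profile: the distribution of
    response lengths across a sample of model outputs."""
    counts = Counter()
    for text in outputs:
        n = len(text.split())
        # Assign each response to the highest length bucket it reaches.
        bucket = sum(1 for b in bins if n >= b)
        counts[bucket] += 1
    total = sum(counts.values())
    return {k: v / total for k, v in counts.items()}

def drift_score(baseline, current, eps=1e-9):
    """Symmetric KL divergence between two fingerprints.
    Zero means identical profiles; larger values mean more drift."""
    keys = set(baseline) | set(current)
    score = 0.0
    for k in keys:
        p = baseline.get(k, eps)
        q = current.get(k, eps)
        score += 0.5 * (p * math.log(p / q) + q * math.log(q / p))
    return score
```

In practice the baseline profile would be frozen at deployment time and `drift_score` computed over a rolling window of recent outputs, alerting when the score crosses a calibrated threshold.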
The Truth-Reward Gap in Language Models
Status: Active
Models are trained to be helpful, harmless, and honest. But these goals sometimes conflict. When a model is rewarded for confidence or coherence, it may produce wrong answers that sound correct. This research maps when and why the truth-reward gap emerges.
Key Finding
Models fine-tuned with higher coherence rewards showed a 34% increase in confident but incorrect outputs when tested on ambiguous factual questions.
Context Steering in Multi-Turn Interactions
Status: Ongoing
A model's behavior changes based on conversation history, system prompts, and hidden context. Small changes in context can steer outputs in unintended directions. We study how context shapes behavior and develop methods to detect unwanted steering.
Key Finding
Adding three semantically similar but syntactically different examples to context changed model outputs in 62% of tested prompts, even when the examples contained no new information.
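The measurement behind a finding like this can be sketched as a simple perturbation test: run each prompt with and without extra context, and count how often the output changes. Everything here is an assumption for illustration; `model` stands in for any deterministic text-in, text-out callable, not a real API.

```python
# Illustrative context-steering measurement. `model` is a placeholder
# callable (prompt -> output); the function name is an assumption.
def steering_rate(model, prompts, extra_examples):
    """Fraction of prompts whose output changes when benign
    examples are prepended to the context."""
    changed = 0
    for p in prompts:
        base = model(p)
        steered = model("\n".join(extra_examples) + "\n" + p)
        if steered != base:
            changed += 1
    return changed / len(prompts)
```

A model that ignores the prepended examples scores 0.0; one whose outputs shift on every prompt scores 1.0. Real studies would additionally check that the prepended examples carry no new task-relevant information.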
Methodology
HAIEC research combines experimental methods from cognitive science, stress testing from reliability engineering, and causal analysis from machine learning. Every finding is documented with reproducible test cases.
Controlled Experiments
We systematically vary inputs, contexts, and conditions to isolate specific behavioral mechanisms and understand causality.
Longitudinal Studies
Tracking the same systems over weeks or months to observe how behavior evolves and detect drift patterns.
Failure Analysis
When AI systems fail in the real world, we reconstruct what happened using logs, prompts, and counterfactual testing.
Stress Testing
Pushing models to their limits with edge cases, cognitive load, and adversarial inputs to reveal failure modes.
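The controlled-experiments method above, systematically varying inputs and conditions to isolate causes, amounts to enumerating a full factorial grid of conditions. A minimal sketch, with factor names chosen purely for illustration:

```python
# Minimal sketch of a controlled-experiment condition grid.
# Factor names in the usage example are illustrative assumptions.
from itertools import product

def experiment_grid(factors):
    """Yield every combination of experimental conditions, so each
    factor can be varied while all others are held fixed."""
    names = list(factors)
    for values in product(*(factors[n] for n in names)):
        yield dict(zip(names, values))
```

For example, `experiment_grid({"temperature": [0.0, 1.0], "phrasing": ["direct", "indirect"]})` yields four conditions, enough to attribute any output difference to exactly one varied factor.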
Participate in Research
We partner with organizations deploying AI systems to conduct real-world behavioral studies. Participants receive detailed reports on their systems' behavioral patterns and early access to HAIEC research findings.