Uncovering How AI Systems Actually Behave
We don't study what AI can do. We study what it does: how behavior emerges, changes, and sometimes fails in ways no one anticipated.
How to Use This Research in Your Organization
For Compliance Teams
ISAF Framework: Use the 9-layer audit protocol as your compliance evidence structure. Map each layer to your regulatory requirements (EU AI Act Articles 10 & 11, NIST AI RMF GOVERN-1.3, ISO 42001 Section 8.2). The 127-checkpoint audit provides ready-made evidence for regulators.
NYC LL144 Paper: Implement the deterministic bias detection architecture to satisfy audit requirements. The cryptographic evidence generation ensures your audit results remain defensible months or years later.
Time savings: 60-80 hours per audit cycle. Cost savings: $15K-$40K in external consultant fees.
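The cryptographic evidence generation mentioned above can be illustrated with a minimal hash-chain sketch in Python. The record fields and checkpoint names here are hypothetical, not the ISAF schema; the point is the shape of the mechanism.

```python
import hashlib
import json

def evidence_record(payload: dict, prev_hash: str) -> dict:
    """Create a tamper-evident audit record chained to the previous one."""
    body = {"payload": payload, "prev_hash": prev_hash}
    # Canonical JSON (sorted keys) makes the hash reproducible.
    digest = hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()
    return {**body, "hash": digest}

def verify_chain(records: list[dict]) -> bool:
    """Re-derive every hash and check each record links to its predecessor."""
    prev = "0" * 64  # genesis value
    for rec in records:
        body = {"payload": rec["payload"], "prev_hash": rec["prev_hash"]}
        expected = hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()
        if rec["prev_hash"] != prev or rec["hash"] != expected:
            return False
        prev = rec["hash"]
    return True

# Hypothetical checkpoint results; any later edit to r1 or r2 breaks the chain.
r1 = evidence_record({"checkpoint": "layer-1", "result": "pass"}, "0" * 64)
r2 = evidence_record({"checkpoint": "layer-2", "result": "pass"}, r1["hash"])
assert verify_chain([r1, r2])
```

Because each hash covers the previous record's hash, altering any earlier checkpoint invalidates everything after it, which is what keeps audit results defensible long after the fact.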
For Engineering Teams
DriftTrace Research: Implement behavioral fingerprinting to detect model drift an average of 47 days earlier than traditional accuracy metrics, catching degradation before it becomes a production incident.
Truth-Reward Gap Research: Use findings to adjust fine-tuning objectives. Reduce confident-but-incorrect outputs by 34% by balancing coherence and accuracy rewards.
Implementation: All research includes reproducible test cases. Fork our GitHub repos, run tests against your models, adapt to your use case.
For Risk & Legal Teams
Behavioral Drift Research: Quantify AI system stability risk. Use 47-day early detection metric in risk assessments. Document monitoring controls for board reporting.
Multi-Agent Divergence Research: Identify coordination failure risks in autonomous systems. Use findings to establish human oversight requirements and kill switch triggers.
Legal defensibility: Peer-reviewed research provides third-party validation for your risk management approach. Cite DOIs in regulatory filings and incident reports.
For Executives & Decision Makers
Key insight: AI compliance is not a one-time audit. It requires continuous monitoring, behavioral tracking, and evidence generation. Traditional GRC tools do not work for adaptive AI systems.
Business case: Implementing research-based controls costs $50K-$200K annually. Average AI incident costs $2.1M. ROI positive if you prevent 1 incident every 10-40 years. Most companies experience 1-3 incidents per decade.
Competitive advantage: Early adopters of ISAF and deterministic compliance have 60-80 hour advantage per audit cycle. First-mover advantage in emerging regulatory landscape.
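The break-even arithmetic in the business case above can be checked directly:

```python
# Figures from the business case above: $50K-$200K annual control cost,
# $2.1M average incident cost.
incident_cost = 2_100_000
for annual_cost in (50_000, 200_000):
    years = incident_cost / annual_cost
    print(f"${annual_cost:,}/yr breaks even at one prevented incident per {years:.1f} years")
```

At $50K per year the break-even horizon is 42 years; at $200K it is 10.5 years, which is where the 10-40 year range comes from.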
HAIEC Research vs Alternatives
| Dimension | HAIEC Research | Academic Research | Vendor Whitepapers |
|---|---|---|---|
| Implementation Focus | Production-ready code, reproducible tests, real metrics | Theoretical contributions, novel algorithms, peer review | Product marketing, feature highlights, case studies |
| Time to Value | 1-4 weeks (fork repo, run tests, adapt) | 6-12 months (understand theory, implement, validate) | Requires vendor product purchase |
| Regulatory Alignment | Explicit mapping to EU AI Act, NIST, ISO 42001, NYC LL144 | General principles, not compliance-specific | Compliance claims without technical detail |
| Evidence Quality | Cryptographic verification, audit trails, deterministic outputs | Statistical significance, experimental validation | Customer testimonials, aggregate metrics |
| Cost | Free (open source) + implementation time | Free (papers) + significant engineering effort | $50K-$500K+ annual licensing |
| Vendor Lock-in | None (open source, self-hosted) | None (public research) | High (proprietary platforms) |
| Update Frequency | Quarterly (active development) | Sporadic (publication cycles) | Continuous (product updates) |
Bottom line: HAIEC research bridges the gap between academic theory and vendor products. You get peer-reviewed methodologies with production-ready implementations, at zero licensing cost.
🔬 Active Research Topics
High-signal research drawn directly from real implementations, not from theory.
Core Compliance & Governance
Operationalizing NIST AI RMF in Production Systems
From abstract risk categories → continuous, audit-grade controls.
Implementing ISO/IEC 42001 via Modular Rule Packs
Turning ISO clauses into executable compliance logic.
Pre-Enforcement Readiness for NYC Local Law 144
Bias audits, evidence retention, and regulator-ready artifacts.
EU AI Act Readiness via Continuous Monitoring
Mapping high-risk system obligations to live telemetry (not point-in-time audits).
Technical & Systems Research
Compliance Rule Packs as Code
Policy → YAML → Enforcement. HAIEC's rule-pack architecture as a new compliance primitive.
Drift Detection as a Compliance Failure Mode
Model, data, and prompt drift tied to regulatory breach risk.
LLM Oversight Without Model Access
Black-box governance using outputs, prompts, and metadata only.
Autonomous Root Cause Analysis for AI Incidents
Linking failures back to policy, data lineage, and controls.
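The "Policy → YAML → Enforcement" idea from Compliance Rule Packs as Code can be sketched in Python. The rule IDs, fields, and patterns below are illustrative assumptions showing the shape of a parsed rule pack, not HAIEC's actual schema.

```python
import re

# Hypothetical rule pack, as it might look after parsing a YAML policy file.
RULE_PACK = {
    "version": "1.0",
    "rules": [
        {"id": "no-ssn", "pattern": r"\b\d{3}-\d{2}-\d{4}\b", "severity": "high"},
        {"id": "no-absolute-claims", "pattern": r"\bguaranteed?\b", "severity": "low"},
    ],
}

def evaluate(text: str, pack: dict) -> list[dict]:
    """Run every rule in the pack against a model output; return violations."""
    return [
        {"rule": rule["id"], "severity": rule["severity"]}
        for rule in pack["rules"]
        if re.search(rule["pattern"], text, flags=re.IGNORECASE)
    ]

findings = evaluate("Approval is guaranteed. SSN: 123-45-6789.", RULE_PACK)
assert [f["rule"] for f in findings] == ["no-ssn", "no-absolute-claims"]
```

Because the pack is plain data, it can be version-controlled and diffed like any other code, which is what makes it a compliance primitive rather than a spreadsheet.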
Market & Strategy
Why Companies Fail When They Wait for Enforcement
Artifact debt, cost curves, and missed budget windows.
Why Traditional GRC Tools Break for AI Systems
Static controls vs adaptive models.
Developer-First Compliance vs Consultant-Led Audits
Speed, cost, and defensibility tradeoffs.
Research Categories
Behavioral Drift
How and why AI systems change their behavior over time, even without retraining or updates.
Instruction Sensitivity
When small changes in phrasing or prompt structure produce disproportionately large output differences.
Alignment Deviation
Small shifts in how models apply organizational rules, safety boundaries, or intended outcomes.
Multi-Agent Divergence
How small inconsistencies between autonomous agents compound into contradictory or unsafe actions.
Cognitive Load Testing
How AI behavior degrades under memory constraints, context depth, or multi-step reasoning demands.
Behavioral Reconstruction
Methods for tracing how a system arrived at a decision across time, context, and internal states.
Featured Research Lines
DriftTrace: Behavioral Fingerprinting for Production AI
Status: Active
Standard monitoring tracks outputs. DriftTrace tracks behavior. We establish a baseline behavioral profile for an AI system, then continuously compare new behavior against that baseline to detect drift, inconsistency, or emergent patterns.
Key Finding
In a six-month study of 23 enterprise models, DriftTrace detected behavioral changes an average of 47 days before traditional accuracy metrics showed degradation.
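A toy version of the baseline-vs-current comparison behind DriftTrace, with made-up trait features (refusal rate, output length) standing in for a real behavioral fingerprint:

```python
from collections import Counter

def fingerprint(outputs: list[str]) -> dict:
    """Toy behavioral fingerprint: a distribution over coarse output traits.
    Real fingerprints would use far richer features; this shows only the
    shape of the baseline-vs-current comparison."""
    counts = Counter()
    for text in outputs:
        counts["refusal" if text.lower().startswith("i can't") else "answer"] += 1
        counts["long" if len(text) > 100 else "short"] += 1
    total = sum(counts.values())
    return {trait: n / total for trait, n in counts.items()}

def drift_score(baseline: dict, current: dict) -> float:
    """L1 distance between trait distributions; 0.0 means identical behavior."""
    traits = set(baseline) | set(current)
    return sum(abs(baseline.get(t, 0.0) - current.get(t, 0.0)) for t in traits)

baseline = fingerprint(["Paris is the capital of France."] * 9 + ["I can't help with that."])
current = fingerprint(["Paris is the capital of France."] * 5 + ["I can't help with that."] * 5)
assert drift_score(baseline, baseline) == 0.0
assert drift_score(baseline, current) > 0.2  # behavior shifted toward refusals
```

The key design point is that the score can move long before any accuracy metric does, because it reacts to how the model answers, not whether the answers are correct.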
The Truth-Reward Gap in Language Models
Status: Active
Models are trained to be helpful, harmless, and honest. But these goals sometimes conflict. When a model is rewarded for confidence or coherence, it may produce wrong answers that sound correct. This research maps when and why the truth-reward gap emerges.
Key Finding
Models fine-tuned with higher coherence rewards showed a 34% increase in confident but incorrect outputs when tested on ambiguous factual questions.
Context Steering in Multi-Turn Interactions
Status: Ongoing
A model's behavior changes based on conversation history, system prompts, and hidden context. Small changes in context can steer outputs in unintended directions. We study how context shapes behavior and develop methods to detect unwanted steering.
Key Finding
Adding three semantically similar but syntactically different examples to context changed model outputs in 62% of tested prompts, even when the examples contained no new information.
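A minimal harness for measuring this kind of context sensitivity might look as follows. The stand-in model is a deterministic stub, so real measurements would substitute an actual model call:

```python
def steering_rate(model, prompts, context_examples):
    """Fraction of prompts whose output changes once the extra examples are
    prepended. `model` is any callable mapping a full prompt string to an
    output string; the harness itself is model-agnostic."""
    preamble = "\n".join(context_examples) + "\n"
    changed = sum(1 for p in prompts if model(p) != model(preamble + p))
    return changed / len(prompts)

# Deterministic stub standing in for a real model call, purely to make the
# harness runnable; it shifts style whenever examples appear in context.
def toy_model(prompt: str) -> str:
    return "mimics example style" if "Example:" in prompt else "default style"

rate = steering_rate(toy_model, ["What is drift?", "Define steering."], ["Example: ..."])
print(rate)  # → 1.0: every prompt's output changed
```

With a real model, the interesting cases are the ones described above: examples that add no new information yet still move the output.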
Methodology
HAIEC research combines experimental methods from cognitive science, stress testing from reliability engineering, and causal analysis from machine learning. Every finding is documented with reproducible test cases.
Controlled Experiments
We systematically vary inputs, contexts, and conditions to isolate specific behavioral mechanisms and understand causality.
Longitudinal Studies
Tracking the same systems over weeks or months to observe how behavior evolves and detect drift patterns.
Failure Analysis
When AI systems fail in the real world, we reconstruct what happened using logs, prompts, and counterfactual testing.
Stress Testing
Pushing models to their limits with edge cases, cognitive load, and adversarial inputs to reveal failure modes.
Published Research & Whitepapers
The Instruction Stack Audit Framework (ISAF)
KC, S. (2025). A Technical Methodology for Tracing AI Accountability Across Nine Abstraction Layers. Version 1.0.
AI accountability failures occur when regulatory audits examine outputs while root causes exist in instruction layers that remain undocumented. ISAF provides a nine-layer technical specification defining instruction propagation from hardware substrate to emergent behavior, with a 127-checkpoint audit protocol for systematic verification.
DOI: 10.5281/zenodo.18080355 | License: CC BY-NC-ND 4.0
Deterministic Bias Detection for NYC Local Law 144
KC, S. & HAIEC Lab (2025). Why Reproducibility Matters More Than Accuracy. A Technical Framework for Compliance-Grade AI Auditing.
NYC Local Law 144 represents the first mandatory bias audit requirement for automated employment decision tools in the US. This paper argues that reproducibility is more fundamental than algorithmic sophistication. We present a deterministic architecture using rule-based pattern matching, version-controlled lexicons, and cryptographic evidence generation to create audit trails that remain valid months or years after analysis.
DOI: 10.5281/zenodo.18056133 | License: CC BY-NC-ND 4.0
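A minimal sketch of the paper's deterministic approach: rule-based matching against a versioned lexicon, plus a reproducible evidence hash. The lexicon terms and field names here are hypothetical, not the paper's actual lexicon.

```python
import hashlib
import re

# Hypothetical version-controlled lexicon; the terms are illustrative only.
LEXICON = {
    "version": "2025.1",
    "age_proxy_terms": ["digital native", "recent graduate", "young and energetic"],
}

def audit(job_posting: str) -> dict:
    """Deterministic rule-based scan: the same input and the same lexicon
    version always yield the same findings and the same evidence hash."""
    findings = sorted(
        term for term in LEXICON["age_proxy_terms"]
        if re.search(re.escape(term), job_posting, flags=re.IGNORECASE)
    )
    evidence = hashlib.sha256(
        "|".join([LEXICON["version"], job_posting, *findings]).encode()
    ).hexdigest()
    return {"lexicon_version": LEXICON["version"],
            "findings": findings,
            "evidence_hash": evidence}

report = audit("Seeking a recent graduate who is a digital native.")
assert report["findings"] == ["digital native", "recent graduate"]
assert audit("Seeking a recent graduate who is a digital native.") == report  # reproducible
```

Re-running the audit months later against the same lexicon version reproduces the identical evidence hash, which is the property the paper argues matters more than model sophistication.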
Academic Citation
These frameworks are released for academic peer review, industry validation, and regulatory consideration. If you use these methodologies in your research or implementation, please cite the original papers.
Both papers are published under Creative Commons licenses and contain technical contributions potentially eligible for patent protection.
Research FAQ
How is HAIEC research different from academic AI research?
Academic research prioritizes novelty and theoretical contributions. HAIEC research prioritizes implementation and regulatory alignment. We publish production-ready code, reproducible tests, and explicit compliance mappings. Time to value: 1-4 weeks vs 6-12 months for academic implementations.
Can I use HAIEC research for commercial purposes?
It depends on the asset. Code is typically MIT or Apache 2.0 licensed, so commercial use is allowed with attribution; check individual repositories for specific licenses. Papers are published under CC BY-NC-ND 4.0 (non-commercial, no derivatives); if you need commercial licensing for papers, contact us.
How do I cite HAIEC research in my work?
Use the DOI for formal citations. Example for ISAF: KC, S. (2025). The Instruction Stack Audit Framework (ISAF): A Technical Methodology for Tracing AI Accountability Across Nine Abstraction Layers. Zenodo. https://doi.org/10.5281/zenodo.18080355
For code implementations, cite the GitHub repository with commit hash or release version.
What is the peer review process for HAIEC research?
Published papers undergo internal technical review, external expert review (industry practitioners and academics), and regulatory alignment review (legal/compliance experts). We prioritize practical validation over traditional academic peer review. Papers are published on Zenodo with DOIs for permanent archival and citability.
How often is research updated?
Major papers: annually or when regulations change significantly. Code implementations: quarterly updates for bug fixes and feature additions. Active research topics: continuous updates as findings emerge. Subscribe to GitHub repositories for notifications.
Can I contribute to HAIEC research?
Yes. Code contributions: submit pull requests to GitHub repositories. Research contributions: share findings, test cases, or real-world incident data (anonymized). Regulatory insights: if you have inside knowledge of enforcement actions or regulatory interpretations, we want to hear from you. Contact research@haiec.com.
What is the relationship between HAIEC research and HAIEC products?
Research informs product development. ISAF research became the ISAF implementation tool. DriftTrace research is being integrated into monitoring features. Products validate research in production environments. Research remains open source and free, even if products are commercial.
How do I know if HAIEC research applies to my use case?
ISAF: applies to any AI system requiring audit trails (EU AI Act high-risk systems, regulated industries, government contracts). NYC LL144: applies to automated employment decision tools used in NYC. DriftTrace: applies to any production AI system where behavior stability matters. Truth-reward gap: applies to fine-tuned language models. If unsure, run our open source tools against your system and evaluate results.
What is the typical ROI of implementing HAIEC research?
Time savings: 60-80 hours per audit cycle (ISAF). Cost savings: $15K-$40K in external consultant fees per audit. Risk reduction: 47-day earlier detection of drift (prevents incidents). Compliance advantage: first-mover advantage in emerging regulatory landscape. Typical payback period: 3-6 months for organizations with active AI compliance programs.
Where can I get help implementing HAIEC research?
Documentation: each paper includes implementation guide. Code: GitHub repositories have examples and tests. Community: join discussions on GitHub Issues. Commercial support: HAIEC offers implementation consulting, training, and custom development. Contact support@haiec.com for enterprise support options.
Participate in Research
We partner with organizations deploying AI systems to conduct real-world behavioral studies. Participants receive detailed reports on their systems' behavioral patterns and early access to HAIEC research findings.