Unleashing Intelligent Compliance: Building nyxCore's AI-Powered Audit System
We just wrapped an intensive dev session, pushing nyxCore's compliance capabilities into new territory with AI-driven hallucination detection, consistency checks, and expert personas. Dive into the architectural decisions and lessons learned.
The Quest for Intelligent Compliance
In the world of complex software systems, compliance isn't just a checkbox; it's a critical, often intricate, dance of regulations, internal policies, and evolving standards. At nyxCore, we're building a platform to streamline these challenges, and our latest mission has been to imbue our system with truly intelligent compliance analysis capabilities.
This past development session was a whirlwind. We moved from concept to a fully wired-up system, complete with AI-powered checks, structured workflows, and even PhD-level AI personas to guide the analysis. As the npm run typecheck command finally passed cleanly (a truly satisfying moment!), I knew we'd crossed a significant hurdle. Let's unpack what went into building this robust compliance analysis engine.
Architecting the Compliance Brain
Our goal was ambitious: create a comprehensive system that not only automates compliance workflows but also intelligently detects issues like AI "hallucinations" and logical inconsistencies, all while being guided by expert-level "personas."
1. The Structured Path: Compliance Workflow Templates
First, we laid down the foundational workflow. Compliance isn't a single step; it's a multi-stage process. We defined seven distinct steps within src/lib/constants.ts to guide our AI agents through a thorough audit:
- complianceRecon: Initial reconnaissance and data gathering.
- complianceExtract: Fan-out step to extract specific data points or findings.
- complianceDeviation: Identify potential deviations (generating multiple hypotheses).
- complianceReview: A dedicated human review step for critical findings.
- complianceReport: Synthesize findings into a structured report.
- compliancePrompt: Another fan-out step, perhaps for deeper dives into specific COMP-XXX findings.
This structured approach ensures that every compliance analysis follows a repeatable, auditable path. We then registered "Compliance Analysis" as a first-class citizen in our BUILT_IN_WORKFLOW_TEMPLATES.
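To make the idea concrete, here is a minimal sketch of how the step constants and template registration might look. The identifiers mirror the post; the exact shapes in src/lib/constants.ts are assumptions, not nyxCore's actual code:

```typescript
// Step identifiers for the compliance workflow (names from the post).
const COMPLIANCE_STEPS = [
  "complianceRecon",
  "complianceExtract",
  "complianceDeviation",
  "complianceReview",
  "complianceReport",
  "compliancePrompt",
] as const;

type ComplianceStep = (typeof COMPLIANCE_STEPS)[number];

// Hypothetical template shape; the real BUILT_IN_WORKFLOW_TEMPLATES
// entry likely carries more fields (prompts, fan-out config, etc.).
interface WorkflowTemplate {
  name: string;
  steps: readonly ComplianceStep[];
  qualityGateType: "security" | "docs" | "letter" | "compliance";
}

const COMPLIANCE_TEMPLATE: WorkflowTemplate = {
  name: "Compliance Analysis",
  steps: COMPLIANCE_STEPS,
  qualityGateType: "compliance",
};
```

Using `as const` gives a literal tuple type, so a mistyped step name anywhere else in the codebase fails at compile time rather than at runtime.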
2. The Expert Minds: PhD-Level Personas
AI, even advanced LLMs, performs best when given clear context and a defined role. For nuanced tasks like compliance, generic prompts just don't cut it. We introduced a new breed of PhD-level personas in prisma/seed.ts:
- Dr. Elara Voss: Our lead Compliance Auditor, responsible for the overall audit strategy.
- Dr. Kai Tanaka: The Risk Analyst, focused on identifying and quantifying potential risks.
- Dr. Priya Sharma: The Code Compliance Reviewer, specializing in code-level policy adherence.
These personas, along with our existing "Noor" reviewer, were organized into a "Compliance Audit Team." By assigning these roles, we guide the underlying AI models to "think" like experts, ensuring more accurate and relevant outputs. Our prisma/seed.ts now proudly seeds 10 personas and 2 teams.
3. The Guardian Against AI's Quirks: Hallucination Detection
One of the biggest challenges with LLMs is their propensity to "hallucinate" – generating plausible-sounding but entirely false information. This is unacceptable in compliance. Our solution? A dedicated HallucinationDetector (src/server/services/hallucination-detector.ts).
Here's how it works:
- Claim Decomposition: decomposeIntoClaims(output, tenantId): Using a powerful LLM like Haiku, we break down any AI-generated output into atomic, verifiable claims (factual, evaluative, prescriptive).
- Axiom Verification: verifyClaimsAgainstAxiom(claims, projectId, tenantId): Each atomic claim is then cross-referenced against our Axiom RAG (Retrieval Augmented Generation) knowledge base, which contains our definitive source documents (e.g., ISO 27001 standards, internal policies).
- Risk Assessment: The service returns a HallucinationReport with a groundedRatio (how much of the output is verifiable), an overallRisk (low/medium/high), and per-claim verdicts with supporting evidence.
We define "grounded" claims as having a verification score above 0.7, "uncertain" between 0.5 and 0.7, and "ungrounded" below 0.5. This system acts as a critical guardrail, ensuring our AI's outputs are always tethered to reality.
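The thresholding and ratio logic described above can be sketched as a pair of small pure functions. This is an illustrative reconstruction, not nyxCore's actual implementation; only the 0.7/0.5 thresholds come from the post, and the risk cutoffs are assumptions:

```typescript
type Verdict = "grounded" | "uncertain" | "ungrounded";

interface ClaimVerdict {
  claim: string;
  score: number; // verification score from the Axiom RAG lookup
  verdict: Verdict;
}

// Thresholds from the post: > 0.7 grounded, 0.5–0.7 uncertain, < 0.5 ungrounded.
function classifyClaim(claim: string, score: number): ClaimVerdict {
  const verdict: Verdict =
    score > 0.7 ? "grounded" : score >= 0.5 ? "uncertain" : "ungrounded";
  return { claim, score, verdict };
}

// groundedRatio: the share of claims the knowledge base could verify.
function groundedRatio(verdicts: ClaimVerdict[]): number {
  if (verdicts.length === 0) return 1; // no claims means nothing to hallucinate
  const grounded = verdicts.filter((v) => v.verdict === "grounded").length;
  return grounded / verdicts.length;
}

// Overall risk derived from the ratio (illustrative cutoffs).
function overallRisk(ratio: number): "low" | "medium" | "high" {
  return ratio >= 0.8 ? "low" : ratio >= 0.5 ? "medium" : "high";
}
```

Keeping this classification deterministic (the LLM only scores; a plain function assigns verdicts) makes the guardrail itself easy to unit-test.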
4. Ensuring Coherence: The Consistency Checker
Another common LLM pitfall is internal contradiction, especially across multi-step processes. To combat this, we introduced the ConsistencyChecker (src/server/services/consistency-checker.ts).
This service:
- Extracts Claims: extractClaims(stepOutput, stepId, stepLabel, tenantId): It pulls relevant claims from each workflow step's output, complete with subject slugs for better tracking.
- Checks Contradictions: checkContradictions(newClaims, priorClaims, tenantId): It performs pairwise Natural Language Inference (NLI) using Haiku between the current step's claims and all accumulated claims from prior steps. We filter for subject overlap and apply a confidence threshold of 0.75.
- Computes Score: computeConsistencyScore(contradictions, totalClaims, axiomViolationCount): A weighted score (50/50 cross-step vs. source alignment) provides an overall consistency metric.
To manage computational load, we limit analysis to a maximum of 15 candidate pairs per batch and 20 claims per step. This ensures that the AI's "thought process" remains internally coherent throughout the entire workflow.
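A rough sketch of the scoring and pair-limiting logic follows. The 50/50 weighting and the 15-pair cap come from the post; the exact formula inside nyxCore's computeConsistencyScore is an assumption:

```typescript
// Weighted consistency score: half cross-step coherence, half source alignment.
function computeConsistencyScore(
  contradictionCount: number,
  totalClaims: number,
  axiomViolationCount: number
): number {
  if (totalClaims === 0) return 1;
  // Cross-step coherence: fraction of claims not contradicted by prior steps.
  const crossStep = 1 - Math.min(contradictionCount / totalClaims, 1);
  // Source alignment: fraction of claims not violating the Axiom knowledge base.
  const sourceAlignment = 1 - Math.min(axiomViolationCount / totalClaims, 1);
  return 0.5 * crossStep + 0.5 * sourceAlignment;
}

// Cap pairwise NLI work, mirroring the 15-candidate-pairs-per-batch limit.
function candidatePairs<T>(
  newClaims: T[],
  priorClaims: T[],
  maxPairs = 15
): Array<[T, T]> {
  const pairs: Array<[T, T]> = [];
  for (const a of newClaims) {
    for (const b of priorClaims) {
      if (pairs.length >= maxPairs) return pairs;
      pairs.push([a, b]);
    }
  }
  return pairs;
}
```

Capping pairs keeps the NLI cost linear-ish in practice: without it, a ten-step workflow with 20 claims per step would otherwise require thousands of pairwise LLM calls.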
5. The Final Gatekeeper: Compliance Quality Gates
Beyond AI-driven analysis, some rules are non-negotiable. Our QualityGate system was extended to include a "compliance" type. The new runComplianceGate(output, tenantId, axiomContent?) function checks the output against mandatory rules defined within our Axiom knowledge base. This provides a final, explicit check, returning any violations with clear rule references and their grounded ratio.
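To illustrate the shape of such a gate, here is a hedged sketch. The rule line format (`MUST: … [COMP-XXX]`), the naive containment check, and all names besides runComplianceGate are assumptions; the real implementation presumably delegates evaluation to an LLM:

```typescript
interface GateViolation {
  ruleRef: string; // e.g. "COMP-001"
  message: string;
}

interface GateResult {
  passed: boolean;
  violations: GateViolation[];
}

// Assumed rule format inside the Axiom document: "MUST: <requirement> [COMP-001]".
function runComplianceGate(output: string, axiomContent?: string): GateResult {
  const violations: GateViolation[] = [];
  if (!axiomContent) return { passed: true, violations };
  const ruleLines = axiomContent
    .split("\n")
    .filter((line) => line.startsWith("MUST:"));
  for (const line of ruleLines) {
    const ref = line.match(/\[(COMP-\d+)\]/)?.[1] ?? "unknown";
    const requirement = line
      .replace("MUST:", "")
      .replace(/\[COMP-\d+\]/, "")
      .trim();
    // Naive containment check stands in for the real evaluation logic.
    if (!output.toLowerCase().includes(requirement.toLowerCase())) {
      violations.push({ ruleRef: ref, message: `Missing: ${requirement}` });
    }
  }
  return { passed: violations.length === 0, violations };
}
```

The key design point survives the simplification: the gate is explicit and rule-referenced, so a failed run tells the user exactly which mandatory requirement was not met.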
6. Weaving It All Together: Workflow Engine Integration
The real magic happens when these components are seamlessly integrated into our WorkflowEngine (src/server/services/workflow-engine.ts). We extended ChainContext to include accumulatedClaims: ExtractedClaim[] and projectId: string | null, allowing state to persist across steps.
Crucially, after each step completes (in a fire-and-forget manner to keep the main workflow thread lean):
- Consistency Analysis kicks in: Claims are extracted, checked against prior steps, accumulated, and the consistencyScore is persisted to the step checkpoint.
- Hallucination Detection runs: Output is decomposed, verified against the Axiom RAG, and the hallucinationReport is saved to the checkpoint (only if a project and Axiom are linked).
This real-time, post-step analysis ensures that by the time a workflow completes, we have a comprehensive, intelligent audit trail of its reliability and adherence to standards.
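A minimal sketch of the context extension and claim accumulation described above. The type names follow the post; their fields and the per-step cap placement are assumptions:

```typescript
interface ExtractedClaim {
  stepId: string;
  subjectSlug: string;
  text: string;
}

// ChainContext extended so claims and the project link persist across steps.
interface ChainContext {
  projectId: string | null;
  accumulatedClaims: ExtractedClaim[];
}

// After each step: cap new claims (20 per step, per the post), then append
// them to the running list so later steps can be checked against all of them.
function accumulateClaims(
  ctx: ChainContext,
  stepClaims: ExtractedClaim[],
  maxPerStep = 20
): ChainContext {
  const capped = stepClaims.slice(0, maxPerStep);
  return { ...ctx, accumulatedClaims: [...ctx.accumulatedClaims, ...capped] };
}
```

Returning a new context object rather than mutating in place keeps the engine's step transitions easy to reason about, which matters when the analysis itself runs fire-and-forget off the main path.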
A Lesson from the Trenches: The Enum Mismatch
No complex feature build is complete without a few "aha!" moments, often born from frustrating bugs. While adding "compliance" to our quality gate type selector in the UI (src/app/(dashboard)/dashboard/workflows/new/page.tsx), I hit a TypeScript error.
The problem? I'd updated the client-side UI, but forgotten to update the corresponding server-side tRPC router schema in src/server/trpc/routers/workflows.ts. The z.enum(["security", "docs", "letter"]) was missing "compliance".
// src/server/trpc/routers/workflows.ts (line 58, simplified)
// BEFORE:
export const createWorkflowSchema = z.object({
// ... other fields
qualityGateType: z.enum(["security", "docs", "letter"]), // Missing "compliance"!
});
// AFTER:
export const createWorkflowSchema = z.object({
// ... other fields
qualityGateType: z.enum(["security", "docs", "letter", "compliance"]), // Fixed!
});
Lesson Learned: When extending types or enums that span both client-side UI and server-side API schemas (especially with tRPC's end-to-end type safety), always remember to update both definitions. TypeScript will catch your mistakes, but it's a reminder of the careful synchronization needed in full-stack development.
What's Next?
With all the core logic implemented and typecheck passing, the immediate next steps are:
- Commit Everything: A substantial commit encompassing workflow templates, personas, new services, engine wiring, and UI updates.
- UI for Insights: Surface the hallucinationReport and consistencyScore directly in the workflow run view, transforming raw data into actionable insights for users.
- Real-World Testing: Load up some real ISO 27001 Axiom documents and run a full compliance workflow to see our intelligent system in action!
This session has been incredibly productive, pushing nyxCore closer to its vision of truly intelligent, automated compliance. The journey continues, but the foundation is now solid. Stay tuned for updates as we bring these powerful new features to life in the user interface!
{
"thingsDone": [
"Compliance workflow template created with 7 steps",
"3 PhD-level compliance personas added",
"Hallucination Detector service implemented (claim extraction, RAG verification)",
"Consistency Checker service implemented (cross-step NLI, contradiction scoring)",
"Compliance Quality Gate type and function added",
"Workflow Engine integrated with hallucination and consistency analysis (post-step)",
"TypeScript enum fix for compliance gate type in tRPC router",
"Workflow Builder UI updated for compliance gate type"
],
"pains": [
"TypeScript error due to mismatch between UI and tRPC router schema for new quality gate type"
],
"successes": [
"All core compliance analysis features implemented",
"npm run typecheck passes cleanly",
"Robust AI-powered checks for reliability and coherence",
"Structured workflow and expert personas for nuanced analysis"
],
"techStack": [
"TypeScript",
"Next.js",
"tRPC",
"Prisma",
"LLMs (Haiku)",
"RAG (Axiom)",
"NLI (Natural Language Inference)"
]
}