Projects
Every LLM tool call passes through a seven-step pipeline: RBAC, inbound PII scan, role-specific policy, bounded execution, outbound scan, outbound policy, structured audit log. Six policy actions. Two bundled philosophies (permissive analyst, strict financial). 15-case golden-set eval, 100% pass.
Single-page calculator that answers every CTO's first three infrastructure questions: What GPU do we need? What does it cost monthly? When does self-hosted beat API? Architecture-aware VRAM calculation for 30+ models accounting for GQA, MLA, and MoE. Interactive SVG break-even chart. Zero build step.
QLoRA fine-tune of Qwen2.5-3B-Instruct for FDCPA rule classification. Three-way eval: o3-mini (ceiling) vs base Qwen (floor) vs QLoRA (fine-tuned). All 6 errors are false negatives from keyword-level pattern matching, not legal reasoning. The pre-filter pattern: small model handles easy cases, API handles the rest.
Paste a redacted collections call transcript. Get a structured compliance report against 12 FDCPA/Reg F rules with verbatim evidence quotes, statutory citations, and autofail violation summary. Dual-path evaluator: one LLM call for semantic rules, deterministic Python for metadata rules.
Trains RL agents to audit financial services call transcripts for CFPB, TCPA, and GDPR/CCPA violations. Solves the 100% Coverage Problem. Human QA reviews 1-3% of calls; RegTriage covers the other 97% with Draft Incident Reports for human sign-off.
Compiles unstructured rubrics into machine-readable schemas, then evaluates documents against them with golden-set ground truth. Handles three real-world variance cases: clean CSV, boolean composites in comment cells, PDF exports masquerading as spreadsheets. Per-category precision/recall/F1.