Projects

Compliance-aware MCP server with structured audit logging

Every LLM tool call passes through a seven-step pipeline: RBAC, inbound PII scan, role-specific policy, bounded execution, outbound scan, outbound policy, structured audit log. Six policy actions. Two bundled philosophies (permissive analyst, strict financial). 15-case golden-set eval, 100% pass.
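
As a rough illustration of how the seven steps could chain together, here is a minimal sketch. Everything named in it is hypothetical: the role table, the SSN regex standing in for a PII scanner, the `deny`/`block`/`redact`/`allow` decisions (not necessarily the project's six policy actions), and output truncation as the "bound" on execution.

```python
import re
from dataclasses import dataclass, field

# Hypothetical role table and PII pattern, for illustration only.
ROLE_TOOLS = {"analyst": {"search", "summarize"}, "teller": {"search"}}
SSN_RE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


@dataclass
class AuditRecord:
    role: str
    tool: str
    decision: str = "allow"
    steps: list = field(default_factory=list)


def run_tool_call(role, tool, payload, execute, max_chars=2000):
    """Run one tool call through the seven steps; return (result, audit)."""
    audit = AuditRecord(role=role, tool=tool)

    def finish(result, decision):
        # Step 7: the audit record is finalized on every exit path.
        audit.decision = decision
        audit.steps.append("audit_log")
        return result, audit

    # Step 1: RBAC. Is this role allowed to call this tool at all?
    audit.steps.append("rbac")
    if tool not in ROLE_TOOLS.get(role, set()):
        return finish(None, "deny")

    # Step 2: inbound PII scan.
    audit.steps.append("inbound_scan")
    inbound_pii = bool(SSN_RE.search(payload))

    # Step 3: role-specific policy (assumed here: block inbound PII).
    audit.steps.append("inbound_policy")
    if inbound_pii:
        return finish(None, "block")

    # Step 4: bounded execution (assumed bound: cap the output length).
    audit.steps.append("execute")
    output = execute(payload)[:max_chars]

    # Step 5: outbound scan.
    audit.steps.append("outbound_scan")
    outbound_pii = bool(SSN_RE.search(output))

    # Step 6: outbound policy (assumed here: redact rather than block).
    audit.steps.append("outbound_policy")
    if outbound_pii:
        return finish(SSN_RE.sub("[REDACTED]", output), "redact")
    return finish(output, "allow")
```

Routing every exit through `finish` is one way to guarantee that denied and blocked calls still leave an audit record.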

Python MCP Pydantic FastAPI
Architecture-aware GPU sizing, cost comparison, and break-even analysis

Single-page calculator that answers every CTO's first three infrastructure questions: What GPU do we need? What does it cost monthly? When does self-hosted beat API? Architecture-aware VRAM calculation for 30+ models accounting for GQA, MLA, and MoE. Interactive SVG break-even chart. Zero build step.
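
To make "architecture-aware" concrete, the sketch below shows the standard KV-cache sizing formula, where grouped-query attention (GQA) shrinks the cache by reducing the number of key/value heads, plus a simple break-even calculation. It is written in Python for illustration (the calculator itself is a web page), it does not cover MLA or MoE, and the function names and example numbers are assumptions, not real model configs or prices.

```python
def weight_bytes(n_params, bytes_per_param=2):
    """Memory for model weights (2 bytes per parameter for FP16/BF16)."""
    return n_params * bytes_per_param


def kv_cache_bytes(n_layers, n_kv_heads, head_dim, seq_len,
                   batch=1, bytes_per_elem=2):
    """KV cache size: keys and values for every layer, KV head, and token."""
    return 2 * n_layers * n_kv_heads * head_dim * seq_len * batch * bytes_per_elem


def break_even_tokens_per_month(gpu_cost_per_month, api_cost_per_million_tokens):
    """Monthly token volume above which a fixed GPU bill beats per-token pricing."""
    return gpu_cost_per_month / api_cost_per_million_tokens * 1_000_000


# Illustrative config: 32 layers, head_dim 128, 8192-token context.
gqa = kv_cache_bytes(n_layers=32, n_kv_heads=8, head_dim=128, seq_len=8192)
mha = kv_cache_bytes(n_layers=32, n_kv_heads=32, head_dim=128, seq_len=8192)
print(gqa / 2**30, mha / 2**30)  # 1.0 4.0 (GiB)
```

With these illustrative numbers, 8 KV heads instead of 32 cuts the cache from 4 GiB to 1 GiB, which is why a calculator that ignores GQA would overestimate VRAM for such a model.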

HTML React Tailwind CSS SVG
When fine-tuning small models is (and isn't) worth it for compliance classification

QLoRA fine-tune of Qwen2.5-3B-Instruct for FDCPA rule classification. Three-way eval: o3-mini (ceiling) vs base Qwen (floor) vs QLoRA (fine-tuned). All 6 errors are false negatives from keyword-level pattern matching, not legal reasoning. The pre-filter pattern: small model handles easy cases, API handles the rest.
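
One way the pre-filter pattern could be wired up is sketched below. The confidence threshold and the `(label, confidence)` return shape are assumptions; the write-up does not say how "easy" cases are detected, and both models are passed in as plain callables so the sketch stays independent of any particular inference library.

```python
def route(text, small_model, api_model, threshold=0.9):
    """Classify with the small model; escalate to the API when it is unsure.

    small_model: callable returning (label, confidence in [0, 1]).
    api_model:   callable returning a label.
    Returns (label, which_path) so the split can be measured.
    """
    label, confidence = small_model(text)
    if confidence >= threshold:
        return label, "small"
    return api_model(text), "api"


# Stub models for demonstration only.
def stub_small(text):
    return ("violation", 0.97) if "threaten" in text else ("compliant", 0.55)


def stub_api(text):
    return "compliant"


print(route("agent threatened arrest", stub_small, stub_api))  # ('violation', 'small')
print(route("agent confirmed balance", stub_small, stub_api))  # ('compliant', 'api')
```

Returning the path alongside the label makes it easy to track what fraction of traffic the small model absorbs.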

Python PEFT QLoRA Qwen2.5
FDCPA/Reg F call transcript audit in 60 seconds

Paste a redacted collections call transcript. Get a structured compliance report against 12 FDCPA/Reg F rules with verbatim evidence quotes, statutory citations, and autofail violation summary. Dual-path evaluator: one LLM call for semantic rules, deterministic Python for metadata rules.
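
The dual-path split might look roughly like the sketch below. The rule IDs and the `Finding` shape are hypothetical, and the LLM is represented by a plain callable. The deterministic example uses the FDCPA presumption that calls before 8 a.m. or after 9 p.m. local time are inconvenient; whether that is one of the project's 12 rules is an assumption.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Finding:
    rule_id: str
    passed: bool
    evidence: str


def check_call_time(call_start: datetime) -> Finding:
    """Deterministic path: no LLM needed for a timestamp comparison.

    Assumes call_start is already in the consumer's local time.
    """
    within_window = 8 <= call_start.hour < 21
    return Finding("call_time_window", within_window, call_start.strftime("%H:%M"))


def evaluate(transcript, call_start, semantic_judge):
    """Merge deterministic metadata checks with one semantic pass.

    semantic_judge: callable taking the transcript and returning a list of
    Finding objects (in practice, a single LLM call with structured output).
    """
    findings = [check_call_time(call_start)]
    findings.extend(semantic_judge(transcript))
    return findings


# Stub judge standing in for the LLM call.
def stub_judge(transcript):
    quoted = "this is an attempt to collect a debt"
    return [Finding("debt_collector_disclosure", quoted in transcript.lower(), quoted)]


report = evaluate("Hello, this is an attempt to collect a debt.",
                  datetime(2024, 1, 5, 21, 30), stub_judge)
print([(f.rule_id, f.passed) for f in report])
# [('call_time_window', False), ('debt_collector_disclosure', True)]
```

Keeping metadata rules out of the prompt means those results are reproducible and cost nothing per call.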

Python FastAPI Pydantic LLM
RL environment for regulatory compliance auditing (baselines published, training loop in progress)

Trains RL agents to audit financial services call transcripts for CFPB, TCPA, and GDPR/CCPA violations. Solves the 100% Coverage Problem: human QA reviews only 1-3% of calls, and RegTriage covers the remaining 97-99% with Draft Incident Reports for human sign-off.
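
For readers unfamiliar with the setup, an auditing task can be framed as an environment along the lines of the sketch below. This is a plain-Python illustration with a hypothetical class name and reward scheme. It does not use the OpenEnv API and is not the project's actual environment or reward design.

```python
class TranscriptAuditEnv:
    """One episode = one transcript; the action is a set of flagged violations."""

    def __init__(self, episodes):
        # episodes: list of (transcript_text, set_of_true_violation_labels)
        self.episodes = episodes
        self._index = -1

    def reset(self):
        """Advance to the next transcript and return it as the observation."""
        self._index = (self._index + 1) % len(self.episodes)
        return self.episodes[self._index][0]

    def step(self, flagged):
        """Score the flags: +1 per correct flag, -1 per false flag or miss."""
        truth = self.episodes[self._index][1]
        flagged = set(flagged)
        false_flags = flagged - truth
        missed = truth - flagged
        reward = len(flagged & truth) - len(false_flags) - len(missed)
        info = {"false_flags": sorted(false_flags), "missed": sorted(missed)}
        return reward, True, info  # single-step episodes are always done


env = TranscriptAuditEnv([("call text", {"TCPA"})])
observation = env.reset()
reward, done, info = env.step({"TCPA", "GDPR"})
print(reward, info)  # 0 {'false_flags': ['GDPR'], 'missed': []}
```

Penalizing misses as well as false flags reflects the fact that, in a compliance setting, an unreported violation is at least as costly as a spurious one.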

Python FastAPI OpenEnv Docker Pydantic RL
Reference pattern for compiling rubrics into evaluable schemas

Compiles unstructured rubrics into machine-readable schemas, then evaluates documents against them with golden-set ground truth. Handles three real-world variance cases: clean CSV, boolean composites in comment cells, PDF exports masquerading as spreadsheets. Per-category precision/recall/F1.
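
Per-category precision, recall, and F1 against a golden set can be computed as in the sketch below. The input shape (document ID mapped to a set of category labels) is an assumption made for illustration, not the project's schema.

```python
from collections import defaultdict


def per_category_prf(golden, predicted):
    """Return {category: (precision, recall, f1)}.

    golden, predicted: dict mapping document ID -> set of category labels.
    """
    counts = defaultdict(lambda: {"tp": 0, "fp": 0, "fn": 0})
    for doc_id, gold in golden.items():
        pred = predicted.get(doc_id, set())
        for category in gold & pred:
            counts[category]["tp"] += 1
        for category in pred - gold:
            counts[category]["fp"] += 1
        for category in gold - pred:
            counts[category]["fn"] += 1

    scores = {}
    for category, n in counts.items():
        # Guard against division by zero when a category is never predicted
        # or never present in the golden set.
        precision = n["tp"] / (n["tp"] + n["fp"]) if n["tp"] + n["fp"] else 0.0
        recall = n["tp"] / (n["tp"] + n["fn"]) if n["tp"] + n["fn"] else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        scores[category] = (precision, recall, f1)
    return scores


golden = {"d1": {"a", "b"}, "d2": {"a"}}
predicted = {"d1": {"a"}, "d2": {"a", "b"}}
scores = per_category_prf(golden, predicted)
print(scores["a"], scores["b"])  # (1.0, 1.0, 1.0) (0.0, 0.0, 0.0)
```

Reporting per category rather than as one aggregate keeps a weak category from hiding behind strong ones.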

Python Pydantic Pandas