Enterprise-grade content moderation with ML-powered classification, configurable policies, human-in-the-loop review, and immutable audit trails.
Navigate this presentation using arrow keys (← →) or scroll. This platform demonstrates a complete AI governance solution from content analysis through compliance reporting — built entirely with AI assistance.
Enterprise-grade microservices with edge deployment
React SPA with TypeScript
Cloudflare Workers + KV
Rate Limit • CORS • Auth
HuggingFace ML
Configurable Rules
Human-in-the-Loop
Pooled Connections • TLS
Serverless • TLS
Knowledge Graph
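The gateway layer above (rate limit, CORS, auth) typically enforces a per-client request budget before traffic reaches the moderation services. A minimal sketch of a fixed-window rate limiter, using an in-memory map as a stand-in for Workers KV; the window size and limit are illustrative assumptions, not the platform's actual configuration:

```typescript
// In-memory stand-in for the KV-backed gateway check.
type Store = Map<string, { count: number; windowStart: number }>;

const WINDOW_MS = 60_000; // 1-minute window (assumed)
const LIMIT = 100;        // max requests per window per client (assumed)

function allowRequest(store: Store, clientId: string, now: number): boolean {
  const entry = store.get(clientId);
  if (!entry || now - entry.windowStart >= WINDOW_MS) {
    // New window for this client: reset the counter.
    store.set(clientId, { count: 1, windowStart: now });
    return true;
  }
  if (entry.count >= LIMIT) return false; // over budget: gateway would return 429
  entry.count += 1;
  return true;
}
```

In a real Worker the counter would live in KV (or a Durable Object for strict accuracy), keyed by API token or client IP.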
Click any card below to open the live application in a new tab. The Dashboard shows real-time metrics, Moderation Demo lets you test content classification, Policy Management demonstrates configurable rules, and Audit Log displays immutable evidence records.
Try it now! Type any text in the box below and click "Moderate Text" to see real-time classification. The API analyzes content for toxicity, hate speech, harassment, violence, spam, and more. Watch the request/response panel to see the actual JSON payloads.
{
"content": "Hello...",
"source": "demo"
}
{
"action": "allow",
"category_scores": {...}
}
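The `action` field in the response comes from evaluating the category scores against policy thresholds. A minimal sketch of that mapping, borrowing the 0.3–0.7 ambiguity band mentioned later for LLM second-pass review; real thresholds are configurable per category and per policy:

```typescript
// Threshold-based policy decision (thresholds are illustrative assumptions).
type Action = "allow" | "review" | "block";

function decide(scores: Record<string, number>): Action {
  const max = Math.max(...Object.values(scores)); // worst offending category
  if (max >= 0.7) return "block";   // high-confidence violation
  if (max >= 0.3) return "review";  // ambiguous: route to human or LLM review
  return "allow";                   // clearly benign
}
```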
{
"id": "e1000000-0000-...",
"control_id": "MOD-001",
"decision_id": "d0000000-...",
"automated_action": "block",
"category_scores": {
"toxicity": 0.92,
"hate": 0.95
},
"submission_hash": "sha256:a7f3b...",
"immutable": true,
"integrity_hash": "sha256:c9d2e..."
}
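The `submission_hash` and `integrity_hash` fields above can both be produced with SHA-256: the first seals the original content (which is never stored in plaintext), the second seals the rest of the record so any later edit is detectable. A sketch using Node's crypto module; the field set and canonicalization are simplified assumptions:

```typescript
import { createHash } from "node:crypto";

// Hex SHA-256 with the "sha256:" prefix used in the evidence record above.
const sha256 = (s: string) => "sha256:" + createHash("sha256").update(s).digest("hex");

interface Evidence {
  control_id: string;
  automated_action: string;
  category_scores: Record<string, number>;
  submission_hash: string;  // hash of the original content, not the content itself
  integrity_hash?: string;  // hash over the rest of the record, set last
}

function seal(
  content: string,
  record: Omit<Evidence, "submission_hash" | "integrity_hash">
): Evidence {
  const withSubmission = { ...record, submission_hash: sha256(content) };
  // Serialize everything except integrity_hash, then seal it; verifiers
  // recompute this hash to detect tampering.
  return { ...withSubmission, integrity_hash: sha256(JSON.stringify(withSubmission)) };
}
```

A production implementation would use a stable canonical JSON form (key ordering, number formatting) so independent verifiers reproduce the same hash.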
Click any framework card to see the detailed controls mapped to that regulation. Each control includes its ID, description, criticality level, and the specific articles or clauses it addresses.
Art. 9, 13, 14, 15, 17
MAP, MEASURE, MANAGE, GOVERN
Clause 6, 8, 9
Art. 5, 22, 32
CC3, CC6, CC7
Drag nodes to explore the relationship graph. This visualization shows how services connect to controls, which map to compliance frameworks (EU AI Act, NIST, ISO, GDPR, SOC 2). The graph is powered by Neo4j Aura with 112 nodes and 138 relationships modeling the complete domain.
Direct HTTP integration with JSON payloads
Native iOS/Android with offline queue
Pre/post-processing for LLM outputs
Event-driven notifications
Enterprise authentication with zero-trust architecture
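For the event-driven notifications above, receivers need a way to confirm a webhook really came from the platform. A common convention, sketched here as an assumption rather than the platform's documented scheme, is an HMAC-SHA256 signature over the raw request body with a shared secret:

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Sign the raw webhook body with a shared secret (scheme is an assumption).
function sign(secret: string, body: string): string {
  return createHmac("sha256", secret).update(body).digest("hex");
}

// Constant-time comparison so signature checks don't leak timing information.
function verify(secret: string, body: string, signature: string): boolean {
  const expected = Buffer.from(sign(secret, body), "hex");
  const given = Buffer.from(signature, "hex");
  // timingSafeEqual throws on length mismatch, so check lengths first.
  return expected.length === given.length && timingSafeEqual(expected, given);
}
```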
A multi-layered testing strategy ensures reliability and compliance: E2E tests validate user workflows, CDD tests verify governance controls, and the CI/CD pipeline runs 36 automated tests on every commit.
Defense-in-depth across every layer
9 enhancements for multi-language, multi-provider resilience
NFKC normalization, zero-width stripping, homoglyph and leetspeak decoding to defeat Unicode evasion
Automatic language identification via lingua-go with routing to language-aware classification providers
Parallel multi-provider execution with agreement scoring and auto-escalation on disagreement
Per-provider offset and scale normalization to a unified 0-1 scale with dynamic feedback-driven tuning
Ambiguous scores (0.3-0.7) re-evaluated by Claude or GPT-4 with structured classification prompts
Threshold overrides based on request metadata (audience, platform) for context-sensitive moderation
Rolling-window behavioral scoring adjusts policy thresholds per user based on moderation history
Human review outcomes feed back to provider calibration, continuously improving classification accuracy
Self-harm, spam, and PII detection added to the 6 existing moderation categories for 9 total
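The text-normalization enhancement at the top of this list can be sketched as a small pipeline: NFKC normalization, zero-width stripping, then lookalike decoding. The homoglyph and leetspeak tables below are tiny illustrative samples; production maps cover far more characters:

```typescript
// Illustrative lookalike tables (real maps are much larger).
const HOMOGLYPHS: Record<string, string> = { "а": "a", "е": "e", "о": "o", "с": "c" }; // Cyrillic
const LEETSPEAK: Record<string, string> = { "0": "o", "1": "i", "3": "e", "@": "a", "$": "s" };

function normalizeText(text: string): string {
  return text
    .normalize("NFKC")                                  // fold compatibility forms (ﬁ → fi, ｆ → f)
    .replace(/[\u200B-\u200D\u2060\uFEFF]/g, "")        // strip zero-width characters
    .split("")
    .map((ch) => HOMOGLYPHS[ch] ?? LEETSPEAK[ch] ?? ch) // decode lookalikes
    .join("")
    .toLowerCase();
}
```

Running the classifier on the normalized text defeats evasions like fullwidth letters, zero-width joiners inside slurs, or mixed-script substitutions; a real implementation would apply leetspeak decoding more selectively to avoid mangling legitimate numbers.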
Core moderation, policy engine, review queue, audit trail
Multi-language support, ensemble classification, LLM second-pass, feedback loop
Multi-tenant, SSO, advanced analytics, SLA dashboard
Enterprise-grade content moderation, ready for production
API reference, integration guides, and examples
proth1@gmail.com
MIT Open Source
High-risk AI system requirements
All text submissions classified by ML model with evidence generation
Users receive immediate feedback on moderation decisions
Human moderators can review and override automated decisions
Evidence records cannot be modified or deleted after creation
Evidence maintains links for complete traceability
AI Risk Management Framework
ML-powered content analysis with configurable thresholds
Policy engine evaluates scores against configurable thresholds
Policies are versioned with historical records maintained
Permissions enforced based on roles (admin, moderator, viewer)
AI Management System Standard
Configurable policy rules for content evaluation
Version control and change management for policies
Human oversight and decision override capability
General Data Protection Regulation
Automated decision-making with explainable outputs
Right to human intervention in automated decisions
Original content hashed, not stored in plaintext
Tamper-proof audit records for accountability
Trust Services Criteria
Continuous system monitoring and classification
Logical access controls with role-based permissions
Management oversight of automated processes
Data integrity controls for audit records
Encryption and hashing for data protection