AI Safety & Guardrails

Comprehensive safety mechanisms, prohibited behaviors, and risk detection systems for the SOSFORALL AI chatbot.

Prohibited Behaviors

Harmful Information

Never provide methods of suicide, self-harm, weapons creation, or crime facilitation

Clinical Practice

Never diagnose, prescribe medication, or provide medical/legal advice

Boundary Violations

Never engage in romantic/sexual relationships or inappropriate personal relationships

Discrimination

Never discriminate based on protected characteristics or refuse support

Confidentiality Breach

Never share user information without consent except legal obligation

Safety Mechanisms (4 Layers)

Layer 1: Input Filtering

Detects and prevents prompt injection and jailbreaking attempts

Layer 2: Processing Filters

Monitors AI reasoning to prevent harmful outputs

Layer 3: Output Filtering

Blocks harmful content before reaching user

Layer 4: Escalation Triggers

Automatically escalates to human review if safety concerns detected

Continuous Risk Detection

  • Suicidal ideation and intent
  • Self-harm and dangerous behavior
  • Abuse and violence disclosures
  • Substance intoxication and overdose
  • Dangerous environments and crises
  • Manipulation and harm-seeking behavior