How to use llm-semantic-router/toolcall-sentinel with Transformers:
# Use a pipeline as a high-level helper
from transformers import pipeline
pipe = pipeline("text-classification", model="llm-semantic-router/toolcall-sentinel")

# Load model directly
from transformers import AutoTokenizer, AutoModelForSequenceClassification
tokenizer = AutoTokenizer.from_pretrained("llm-semantic-router/toolcall-sentinel")
model = AutoModelForSequenceClassification.from_pretrained("llm-semantic-router/toolcall-sentinel")

FunctionCallSentinel is a ModernBERT-based binary classifier that detects prompt injection and jailbreak attempts in LLM inputs. It serves as the first line of defense for LLM agent systems with tool-calling capabilities.

| Label | Description |
|---|---|
| SAFE | Legitimate user request: proceed normally |
| INJECTION_RISK | Potential attack detected: block or flag for review |

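
The two labels above can drive a simple allow/block decision. The following is a minimal sketch, assuming the classifier returns output in the usual Transformers text-classification shape (a list with one `{"label", "score"}` dict per input) and that the label strings match the table. The `is_blocked` helper and the `0.5` threshold are illustrative assumptions, not part of the model card.

```python
from typing import Callable, Dict, List

# Anything that maps a prompt to [{"label": ..., "score": ...}], which is the
# shape a Transformers text-classification pipeline returns for a single string.
Classifier = Callable[[str], List[Dict[str, object]]]

# Assumption: this cut-off is illustrative and does not come from the model card.
BLOCK_THRESHOLD = 0.5


def is_blocked(prompt: str, classify: Classifier,
               threshold: float = BLOCK_THRESHOLD) -> bool:
    """Return True when the prompt should be blocked or flagged for review."""
    top = classify(prompt)[0]
    return top["label"] == "INJECTION_RISK" and float(top["score"]) >= threshold


# Possible usage with the real model (requires downloading the weights):
#   from transformers import pipeline
#   pipe = pipeline("text-classification",
#                   model="llm-semantic-router/toolcall-sentinel")
#   if is_blocked(user_prompt, pipe):
#       ...  # reject the request or route it to review
```

Passing the classifier in as a callable keeps the decision logic testable without loading the model.
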
<<end_context>>, </system>, [INST], <execute_action>, {{user_request}}

This model is Stage 1 of a two-stage defense pipeline:

┌─────────────────┐     ┌──────────────────┐     ┌─────────────────┐
│   User Prompt   │────▶│ ToolCallSentinel │────▶│   LLM + Tools   │
│                 │     │   (This Model)   │     │                 │
└─────────────────┘     └──────────────────┘     └────────┬────────┘
                                                          │
                               ┌──────────────────────────▼──────────────────────────┐
                               │             ToolCallVerifier (Stage 2)              │
                               │  Verifies tool calls match user intent before exec  │
                               └─────────────────────────────────────────────────────┘

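
The flow in the diagram can be sketched as a small orchestration function. This is only an illustration under stated assumptions: the `ToolCall` type, the `handle_prompt` function, and the three callables are hypothetical names, and the actual ToolCallVerifier interface is not described here, so Stage 2 is modeled as a plain predicate over the prompt and a proposed tool call.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class ToolCall:
    """Hypothetical representation of a tool call proposed by the LLM."""
    name: str
    arguments: dict


def handle_prompt(
    prompt: str,
    sentinel_blocks: Callable[[str], bool],            # Stage 1: prompt screening
    propose_calls: Callable[[str], List[ToolCall]],    # LLM + tools
    verifier_allows: Optional[Callable[[str, ToolCall], bool]] = None,  # Stage 2
) -> List[ToolCall]:
    """Return the tool calls that are cleared for execution."""
    # Stage 1: stop before the LLM ever sees a flagged prompt.
    if sentinel_blocks(prompt):
        return []
    calls = propose_calls(prompt)
    # Stage 1 only: no verifier configured, so every proposed call passes.
    if verifier_allows is None:
        return calls
    # Stage 2: keep only the calls the verifier judges to match user intent.
    return [call for call in calls if verifier_allows(prompt, call)]
```

Leaving `verifier_allows` as `None` corresponds to a Stage 1 only deployment; supplying it adds the second check before anything is executed.
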
| Scenario | Recommendation |
|---|---|
| General chatbot | Stage 1 only |
| RAG system | Stage 1 only |
| Tool-calling agent (low risk) | Stage 1 only |
| Tool-calling agent (high risk) | Both stages |
| Email/file system access | Both stages |
| Financial transactions | Both stages |
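
The recommendations in the table could be encoded as a lookup so an application can decide at startup whether to load the second stage. A minimal sketch follows; the scenario keys and the `needs_verifier` helper are hypothetical, and falling back to both stages for unknown scenarios is a conservative assumption rather than something the table states.

```python
# Scenario keys are hypothetical identifiers for the rows of the table.
STAGE1_ONLY = ("stage1",)
BOTH_STAGES = ("stage1", "stage2")

RECOMMENDED_STAGES = {
    "general_chatbot": STAGE1_ONLY,
    "rag_system": STAGE1_ONLY,
    "tool_agent_low_risk": STAGE1_ONLY,
    "tool_agent_high_risk": BOTH_STAGES,
    "email_file_access": BOTH_STAGES,
    "financial_transactions": BOTH_STAGES,
}


def needs_verifier(scenario: str) -> bool:
    """Return True when Stage 2 is recommended for the scenario."""
    # Assumption: unknown scenarios default to both stages, the stricter choice.
    return "stage2" in RECOMMENDED_STAGES.get(scenario, BOTH_STAGES)
```
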

License: Apache 2.0

Base model: answerdotai/ModernBERT-base