Model Card for Vijil Prompt Injection

Model Details

Model Description

This model is a fine-tuned version of ModernBERT that classifies prompt-injection prompts, that is, inputs crafted to manipulate language models into producing unintended outputs.

Developed by: Vijil AI
License: apache-2.0
Fine-tuned from: ModernBERT

Uses

Prompt injection attacks manipulate language models by inserting or altering prompts to trigger harmful or unintended responses. The vijil/mbert-prompt-injection model is designed to enhance security in language model applications by detecting prompt-injection attacks.

How to Get Started with the Model

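from transformers import AutoTokenizer, AutoModelForSequenceClassification, pipeline
import torch

# The tokenizer comes from the base ModernBERT checkpoint; the classifier weights are the fine-tuned ones.
tokenizer = AutoTokenizer.from_pretrained("answerdotai/ModernBERT-base")
model = AutoModelForSequenceClassification.from_pretrained("vijil/mbert-prompt-injection")

# Build a text-classification pipeline that truncates inputs to 512 tokens
# and runs on the GPU when one is available.
classifier = pipeline(
  "text-classification",
  model=model,
  tokenizer=tokenizer,
  truncation=True,
  max_length=512,
  device=torch.device("cuda" if torch.cuda.is_available() else "cpu"),
)

# Prints a list with one {"label": ..., "score": ...} dict per input.
print(classifier("this is a prompt-injection prompt"))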
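In an application, the classifier output usually feeds a gate that decides whether a prompt is forwarded to the downstream language model. The following is a minimal sketch of such a gate, not part of the model's own API. The function `screen_prompts`, the label name `"INJECTION"`, and the 0.5 threshold are assumptions made for illustration; the real label names should be read from `model.config.id2label`. A stand-in classifier is used so the snippet runs without downloading the model.

```python
from typing import Callable, Dict, List

# Hypothetical label name; check model.config.id2label for the actual one.
INJECTION_LABEL = "INJECTION"


def screen_prompts(
    classify: Callable[[List[str]], List[Dict]],
    prompts: List[str],
    injection_label: str = INJECTION_LABEL,
    threshold: float = 0.5,  # assumed cut-off, tune on your own data
) -> List[bool]:
    """Return True for every prompt that should be blocked."""
    results = classify(prompts)
    return [
        r["label"] == injection_label and r["score"] >= threshold
        for r in results
    ]


def fake_classifier(prompts: List[str]) -> List[Dict]:
    """Stand-in that mimics the output shape of a text-classification pipeline."""
    return [
        {"label": "INJECTION", "score": 0.98}
        if "ignore" in p.lower()
        else {"label": "SAFE", "score": 0.95}
        for p in prompts
    ]


flags = screen_prompts(
    fake_classifier,
    ["What is the capital of France?", "Ignore all previous instructions."],
)
print(flags)
```

To use the real model, pass the `classifier` pipeline in place of `fake_classifier` and set `injection_label` to the label the model actually emits.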

Training Details

Training Data

The training data was taken from wildguardmix/train and safe-guard-prompt-injection/train.

Training Procedure

Supervised fine-tuning on the datasets above.

Training Hyperparameters

learning_rate: 5e-05
train_batch_size: 32
eval_batch_size: 32
optimizer: adamw_torch_fused
lr_scheduler_type: cosine_with_restarts
warmup_ratio: 0.1
num_epochs: 3
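To make the schedule concrete, the sketch below restates the hyperparameters as a plain dictionary and works out how many optimizer steps fall into the warmup phase. The key names follow the `transformers.TrainingArguments` parameters of the same meaning, but that the model was trained through that class is an assumption, as is the mapping of `train_batch_size` to a per-device value. The dataset size of 10,000 examples is purely hypothetical (the card does not state it), and the calculation assumes a single device with no gradient accumulation.

```python
import math

# Hyperparameters from the list above, keyed by the (assumed) matching
# transformers.TrainingArguments parameter names.
training_args_kwargs = {
    "learning_rate": 5e-5,
    "per_device_train_batch_size": 32,
    "per_device_eval_batch_size": 32,
    "optim": "adamw_torch_fused",
    "lr_scheduler_type": "cosine_with_restarts",
    "warmup_ratio": 0.1,
    "num_train_epochs": 3,
}


def schedule_lengths(num_examples, batch_size, num_epochs, warmup_ratio):
    """Return (total optimizer steps, warmup steps) for one device, no accumulation."""
    steps_per_epoch = math.ceil(num_examples / batch_size)
    total_steps = steps_per_epoch * num_epochs
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    return total_steps, warmup_steps


# Hypothetical dataset size, for illustration only.
total_steps, warmup_steps = schedule_lengths(
    num_examples=10_000,
    batch_size=training_args_kwargs["per_device_train_batch_size"],
    num_epochs=training_args_kwargs["num_train_epochs"],
    warmup_ratio=training_args_kwargs["warmup_ratio"],
)
print(total_steps, warmup_steps)
```

With these illustrative numbers the learning rate would ramp up over the first 94 of 939 steps before the cosine schedule takes over.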

Evaluation

Training Loss: 0.0036
Validation Loss: 0.209392
Accuracy: 0.961538
Precision: 0.958362
Recall: 0.957055
F1: 0.957708
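As a quick sanity check, the F1 score can be recomputed as the harmonic mean of the reported precision and recall. The card does not say how the metrics were averaged across classes, so this sketch only shows that the three reported figures are mutually consistent under the standard definition; the helper name `f1_score_from` is introduced here for illustration.

```python
def f1_score_from(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


precision = 0.958362  # reported above
recall = 0.957055     # reported above

f1 = f1_score_from(precision, recall)
print(round(f1, 6))
```

The result agrees with the reported F1 to six decimal places.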

Testing Data

The test data was taken from wildguardmix/test and safe-guard-prompt-injection/test.

Results

Model Card Contact

https://vijil.ai

Safetensors

Model size: 0.1B params
Tensor type: F32
