Qwen2.5-7B PII Detection β€” LoRA Adapter

Adapter-only repo: requires Qwen/Qwen2.5-7B-Instruct as the base model plus the PEFT library.
For a standalone model (no PEFT needed), see vineeth453/qwen25-7b-pii-detection.

Fine-tuned from Qwen/Qwen2.5-7B-Instruct with QLoRA on the ai4privacy/pii-masking-200k dataset. Extracts 56 types of personally identifiable information (PII) across four languages (English, French, German, Italian) and returns structured JSON output.

Built as the PII Detection component of a Phase 1 Input Guardrail gateway for an enterprise LLM security system.


Evaluation Results

Evaluated on 10,464 held-out samples (5% split from ai4privacy/pii-masking-200k).

Metric Score
Micro F1 0.967
Macro F1 0.961
Micro Precision 0.967
Micro Recall 0.968
Malformed JSON outputs 0 / 500 (0.0%)
Val Loss (final) 0.0033
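
For reference, a minimal sketch of how these entity-level scores can be computed. The matching rule is an assumption here (exact text + label match, multiset counting); this is an illustrative scorer, not the exact evaluation script.

from collections import Counter, defaultdict

def micro_macro_f1(gold, pred):
    # gold/pred: one list of (text, label) pairs per evaluation sample.
    # Assumed matching rule: exact string + label match, multiset counting.
    def prf(g, p):
        tp = sum((g & p).values())  # Counter intersection = per-entity min counts
        prec = tp / sum(p.values()) if p else 0.0
        rec = tp / sum(g.values()) if g else 0.0
        f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
        return prec, rec, f1

    g_all, p_all = Counter(), Counter()
    g_by, p_by = defaultdict(Counter), defaultdict(Counter)
    for g_ents, p_ents in zip(gold, pred):
        for ent in g_ents:
            g_all[ent] += 1
            g_by[ent[1]][ent] += 1
        for ent in p_ents:
            p_all[ent] += 1
            p_by[ent[1]][ent] += 1

    micro_p, micro_r, micro_f1 = prf(g_all, p_all)          # pooled over all samples
    labels = set(g_by) | set(p_by)
    macro_f1 = sum(prf(g_by[lbl], p_by[lbl])[2] for lbl in labels) / max(len(labels), 1)
    return micro_p, micro_r, micro_f1, macro_f1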

Per-Entity F1 Scores

Label Precision Recall F1 Support
ACCOUNTNAME 1.000 1.000 1.000 28
ACCOUNTNUMBER 1.000 1.000 1.000 41
AGE 1.000 1.000 1.000 27
AMOUNT 1.000 1.000 1.000 36
BIC 1.000 1.000 1.000 7
BITCOINADDRESS 0.923 1.000 0.960 24
BUILDINGNUMBER 0.968 0.968 0.968 31
CITY 1.000 0.963 0.981 27
COMPANYNAME 1.000 1.000 1.000 35
COUNTY 1.000 1.000 1.000 29
CREDITCARDCVV 1.000 1.000 1.000 10
CREDITCARDISSUER 1.000 1.000 1.000 16
CREDITCARDNUMBER 0.829 0.935 0.879 31
CURRENCY 0.909 0.870 0.889 23
CURRENCYCODE 1.000 1.000 1.000 8
CURRENCYNAME 0.667 0.750 0.706 8
CURRENCYSYMBOL 1.000 1.000 1.000 20
DATE 0.884 0.974 0.927 39
DOB 0.955 0.808 0.875 26
EMAIL 1.000 1.000 1.000 42
ETHEREUMADDRESS 1.000 1.000 1.000 11
EYECOLOR 1.000 1.000 1.000 10
FIRSTNAME 0.994 0.994 0.994 158
GENDER 1.000 1.000 1.000 35
HEIGHT 1.000 1.000 1.000 7
IBAN 1.000 1.000 1.000 29
IP 0.727 0.267 0.390 30
IPV4 0.732 0.909 0.811 33
IPV6 0.711 1.000 0.831 27
JOBAREA 1.000 1.000 1.000 40
JOBTITLE 1.000 1.000 1.000 37
JOBTYPE 1.000 1.000 1.000 31
LASTNAME 1.000 1.000 1.000 47
LITECOINADDRESS 1.000 0.714 0.833 7
MAC 1.000 1.000 1.000 12
MASKEDNUMBER 0.923 0.800 0.857 30
MIDDLENAME 0.944 1.000 0.971 34
NEARBYGPSCOORDINATE 1.000 1.000 1.000 17
ORDINALDIRECTION 1.000 1.000 1.000 17
PASSWORD 1.000 1.000 1.000 31
PHONEIMEI 1.000 1.000 1.000 19
PHONENUMBER 1.000 1.000 1.000 21
PIN 1.000 1.000 1.000 6
PREFIX 1.000 1.000 1.000 29
SECONDARYADDRESS 1.000 1.000 1.000 31
SEX 1.000 1.000 1.000 26
SSN 1.000 1.000 1.000 16
STATE 1.000 1.000 1.000 31
STREET 1.000 1.000 1.000 39
TIME 1.000 1.000 1.000 20
URL 1.000 1.000 1.000 29
USERAGENT 1.000 1.000 1.000 33
USERNAME 1.000 1.000 1.000 30
VEHICLEVIN 1.000 1.000 1.000 13
VEHICLEVRM 1.000 1.000 1.000 15
ZIPCODE 0.970 0.970 0.970 33

Note on IP label (F1=0.390): The dataset contains three overlapping IP labels (IP, IPV4, IPV6). The low recall on IP is due to the model correctly identifying the address but tagging it as IPV4 or IPV6 β€” a label ambiguity in the dataset, not a detection failure. Combined IP recall across all three labels is >0.95.
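
The combined figure can be reproduced by collapsing the three labels into one family before scoring. A minimal sketch (the helper name is illustrative):

IP_LABELS = {"IP", "IPV4", "IPV6"}

def collapse_ip_labels(entities):
    # Fold the three overlapping IP labels into one family so an
    # IPV4/IPV6 prediction still matches a gold IP span.
    return [
        {**e, "label": "IP"} if e["label"] in IP_LABELS else e
        for e in entities
    ]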


How to Get Started

Installation

pip install transformers peft bitsandbytes accelerate torch

Load and Run Inference

from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig
from peft import PeftModel
import torch, json

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16
)

# Load the 4-bit quantized base model, then attach the LoRA adapter on top.
base_model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen2.5-7B-Instruct",
    quantization_config=bnb_config,
    device_map="auto",
    dtype=torch.bfloat16,
)
model = PeftModel.from_pretrained(base_model, "vineeth453/qwen25-7b-pii-detection-lora")
tokenizer = AutoTokenizer.from_pretrained("vineeth453/qwen25-7b-pii-detection-lora")
model.eval()

def detect_pii(text: str) -> dict:
    # Build the prompt in Qwen's ChatML format, matching the training template.
    prompt = (
        "<|im_start|>system\n"
        "You are a PII detection system. Extract all personally identifiable information.\n"
        'Return ONLY valid JSON: {"entities":[{"text":"...","label":"..."}]}\n'
        "<|im_end|>\n"
        "<|im_start|>user\n"
        f"{text}\n"
        "<|im_end|>\n"
        "<|im_start|>assistant\n"
    )
    inputs = tokenizer(prompt, return_tensors="pt", add_special_tokens=False).to(model.device)
    # Greedy decoding keeps the JSON output deterministic.
    with torch.no_grad():
        out = model.generate(
            **inputs,
            max_new_tokens=200,
            do_sample=False,
            pad_token_id=tokenizer.eos_token_id
        )
    # Decode only the newly generated tokens (everything after the prompt).
    response = tokenizer.decode(
        out[0][inputs["input_ids"].shape[1]:],
        skip_special_tokens=True
    ).strip().replace("<|im_end|>", "")
    return json.loads(response)

# English
print(detect_pii("Contact John Smith at [email protected] or call +1-555-867-5309"))
# {"entities": [{"text": "John", "label": "FIRSTNAME"}, {"text": "Smith", "label": "LASTNAME"},
#               {"text": "[email protected]", "label": "EMAIL"}, {"text": "+1-555-867-5309", "label": "PHONENUMBER"}]}

# German
print(detect_pii("Patient Lena Müller, born 14.03.1987, lives at Hauptstraße 22, Berlin."))
# {"entities": [{"text": "Lena", "label": "FIRSTNAME"}, {"text": "MΓΌller", "label": "LASTNAME"},
#               {"text": "14.03.1987", "label": "DOB"}, {"text": "Hauptstraße", "label": "STREET"},
#               {"text": "22", "label": "BUILDINGNUMBER"}, {"text": "Berlin", "label": "STATE"}]}
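
The evaluation measured 0/500 malformed JSON outputs, but a defensive wrapper is cheap insurance for out-of-distribution inputs. A minimal sketch (detect_pii_safe and the empty-result fallback are illustrative, not part of the model):

def detect_pii_safe(text: str) -> dict:
    # Hypothetical wrapper: fall back to an empty entity list if the
    # model ever emits invalid JSON.
    try:
        return detect_pii(text)
    except json.JSONDecodeError:
        return {"entities": []}

Whether a parse failure should fail open (pass the text through) or fail closed (block it) is a guardrail policy decision, not something the model dictates.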

Training Details

Training Data

  • Dataset: ai4privacy/pii-masking-200k
  • Size: 209,261 samples (198,797 train / 10,464 val, 95/5 split; see the loading sketch after this list)
  • Languages: English (43k), French (62k), German (53k), Italian (51k)
  • Entity types: 56 PII categories
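
A minimal loading/splitting sketch consistent with the figures above; the seed is an assumption, so exact split membership may differ from the one used in training:

from datasets import load_dataset

ds = load_dataset("ai4privacy/pii-masking-200k", split="train")
split = ds.train_test_split(test_size=0.05, seed=42)  # seed is an assumption
train_ds, val_ds = split["train"], split["test"]
print(len(train_ds), len(val_ds))  # ~198,797 train / ~10,464 val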

Training Hyperparameters

Parameter Value
Base model Qwen/Qwen2.5-7B-Instruct
Method QLoRA
Quantization 4-bit NF4 + double quantization
Compute dtype bfloat16
LoRA rank (r) 16
LoRA alpha 32
LoRA dropout 0.05
LoRA target modules q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj
Trainable parameters 40,370,176 (0.53% of 7.6B)
Epochs 1
Per-device batch size 4
Gradient accumulation 8 (effective batch = 32)
Learning rate 2e-4
LR scheduler Cosine decay
Warmup steps 186
Weight decay 0.01
Optimizer paged_adamw_8bit
Max sequence length 512
Max grad norm 1.0
Hardware NVIDIA A100 40GB
Training time 10.7 hours
Final train loss 0.00517
Best val loss 0.00330
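
Restated as code, assuming PEFT's LoraConfig and a standard Hugging Face TrainingArguments setup; values come from the table above, and anything not in the table (e.g. output_dir) is a placeholder:

from peft import LoraConfig
from transformers import TrainingArguments

lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    task_type="CAUSAL_LM",
)

training_args = TrainingArguments(
    output_dir="qwen25-pii-lora",       # placeholder
    num_train_epochs=1,
    per_device_train_batch_size=4,
    gradient_accumulation_steps=8,      # effective batch = 32
    learning_rate=2e-4,
    lr_scheduler_type="cosine",
    warmup_steps=186,
    weight_decay=0.01,
    optim="paged_adamw_8bit",
    max_grad_norm=1.0,
    bf16=True,
)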

Framework

  • transformers==5.3.0
  • peft
  • bitsandbytes
  • accelerate

Uses

Direct Use

Enterprise input guardrail systems for detecting and redacting PII from user queries before they reach an LLM. Suitable for HR, legal, healthcare, and financial applications where PII leakage into LLM prompts is a compliance risk.
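
As a sketch of that redaction step (the helper is illustrative and reuses detect_pii from the inference example above):

def redact(text: str) -> str:
    # Mask every detected span with its label; process longer spans first
    # so a short match does not clobber a longer one that contains it.
    entities = detect_pii(text)["entities"]
    for ent in sorted(entities, key=lambda e: len(e["text"]), reverse=True):
        text = text.replace(ent["text"], f'[{ent["label"]}]')
    return text

print(redact("Contact John Smith at [email protected]"))
# Contact [FIRSTNAME] [LASTNAME] at [EMAIL]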

Downstream Use

  • PII redaction pipelines
  • Compliance auditing tools
  • Data anonymization workflows
  • GDPR / CCPA compliance enforcement

Out-of-Scope Use

  • Real-time inference at very high throughput without batching (7B-model latency); a minimal batched sketch follows this list
  • Domains with highly specialized PII formats not covered by the training data
  • Should not be used as the sole PII detection mechanism in high-stakes medical or legal settings without human review
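
For higher-throughput settings, left-padded batched generation is the usual mitigation. A minimal sketch reusing the model, tokenizer, and ChatML prompt from the inference example above (build_prompt and detect_pii_batch are illustrative names, and the padding setup is an assumption):

def build_prompt(text: str) -> str:
    # Same ChatML prompt as detect_pii above, factored out for reuse.
    return (
        "<|im_start|>system\n"
        "You are a PII detection system. Extract all personally identifiable information.\n"
        'Return ONLY valid JSON: {"entities":[{"text":"...","label":"..."}]}\n'
        "<|im_end|>\n<|im_start|>user\n"
        f"{text}\n<|im_end|>\n<|im_start|>assistant\n"
    )

def detect_pii_batch(texts: list, batch_size: int = 8) -> list:
    # Decoder-only models need left padding so generation starts
    # immediately after each prompt in the batch.
    tokenizer.padding_side = "left"
    if tokenizer.pad_token is None:
        tokenizer.pad_token = tokenizer.eos_token
    results = []
    for i in range(0, len(texts), batch_size):
        prompts = [build_prompt(t) for t in texts[i:i + batch_size]]
        inputs = tokenizer(prompts, return_tensors="pt", padding=True,
                           add_special_tokens=False).to(model.device)
        with torch.no_grad():
            out = model.generate(**inputs, max_new_tokens=200, do_sample=False,
                                 pad_token_id=tokenizer.pad_token_id)
        # Prompt tokens (including padding) occupy the first shape[1] columns.
        for row in out[:, inputs["input_ids"].shape[1]:]:
            resp = tokenizer.decode(row, skip_special_tokens=True).strip()
            results.append(json.loads(resp))
    return results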

Bias, Risks, and Limitations

  • IP label ambiguity: The model occasionally routes bare IP addresses to IPV4 or IPV6 instead of IP due to overlapping labels in the training data. Post-processing validation is recommended for IP-type entities (see the sketch after this list).
  • CREDITCARDNUMBER vs PHONEIMEI: 16-digit numeric strings without formatting context can be misclassified between these two labels (F1=0.879 for CREDITCARDNUMBER). Format-based post-processing (a Luhn check, also sketched below) can mitigate this.
  • Low-support labels: Labels with fewer than 10 validation examples (e.g., CURRENCYNAME with support=8) have less reliable F1 estimates.
  • Language coverage: Trained on EN/FR/DE/IT only. Other languages may degrade performance.
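
A sketch of the suggested post-processing (function names are illustrative): a Luhn checksum for credit-card candidates, and stdlib ipaddress parsing in place of regex validation for IP-type labels.

import ipaddress

def luhn_valid(number: str) -> bool:
    # Standard Luhn checksum; the minimum-length gate is a heuristic
    # (card numbers are typically 12-19 digits).
    digits = [int(c) for c in number if c.isdigit()]
    if len(digits) < 12:
        return False
    total = 0
    for i, d in enumerate(reversed(digits)):
        if i % 2 == 1:      # double every second digit from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0

def normalize_ip_label(value: str):
    # Map a detected address to a concrete IP label, or None if invalid.
    try:
        addr = ipaddress.ip_address(value)
    except ValueError:
        return None
    return "IPV4" if addr.version == 4 else "IPV6"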

Environmental Impact

  • Hardware: NVIDIA A100 40GB (Google Colab Pro)
  • Training time: ~10.7 hours
  • Cloud provider: Google Cloud (Colab)
  • Compute region: US

Model Card Authors

Vineeth (Master's project, Enterprise Guardrails System)
