Customer Support LLM (LoRA adapter)
LoRA adapter for meta-llama/Llama-3.2-3B-Instruct, fine-tuned on the
Bitext customer support dataset using MLX-LM on Apple Silicon.
Source code, training scripts, evaluation pipeline: https://github.com/batuhne/customer-support-llm
Intended use
Drop-in adapter for English customer support chat over the 27 Bitext
intents (cancel_order, track_order, check_invoice, recover_password, etc.).
Outputs use {{Placeholder}} tokens that a downstream system is expected
to substitute with real values.
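The substitution step can be sketched as follows. This is a minimal illustration, not code from the source repo: the function name `fill_placeholders`, the example reply, and the value keys are hypothetical, and the regex assumes placeholders have the form `{{Order Number}}`.

```python
import re

# Matches {{Name}} tokens; the inner name may contain spaces (assumed format).
PLACEHOLDER_RE = re.compile(r"\{\{\s*([^{}]+?)\s*\}\}")


def fill_placeholders(text: str, values: dict[str, str]) -> str:
    """Replace each {{Name}} with values[Name]; leave unknown names untouched."""

    def substitute(match: re.Match) -> str:
        name = match.group(1)
        # Fall back to the original token when no value is available.
        return values.get(name, match.group(0))

    return PLACEHOLDER_RE.sub(substitute, text)


# Hypothetical model output and lookup table.
reply = (
    "Your order {{Order Number}} will be cancelled. "
    "Contact us at {{Customer Support Phone Number}}."
)
filled = fill_placeholders(reply, {"Order Number": "12345"})
print(filled)
```

Leaving unknown placeholders in place, rather than deleting them, makes it easy for the calling system to detect replies that still need a value before they are sent.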
Not for production use as is. Trained on a single public dataset, no RLHF, no safety tuning beyond the base model.
Results
Held-out test set, 297 stratified examples across all 27 intents.
| Metric | Value |
|---|---|
| ROUGE-L | 0.410 |
| SacreBLEU | 28.67 |
| BERTScore F1 | 0.915 |
| Placeholder score | 0.643 |
| Placeholder F1 (overall) | 0.681 |
Comparison against earlier runs:
| Run | ROUGE-L | SacreBLEU | BERTScore F1 | Placeholder | n |
|---|---|---|---|---|---|
| 2026-03-30 baseline | 0.333 | 20.32 | 0.900 | 0.300 | 300 |
| 2026-05-12 postprocess fixes only | 0.336 | 20.40 | 0.903 | 0.624 | 298 |
| 2026-05-12 retrain + all fixes | 0.410 | 28.67 | 0.915 | 0.643 | 297 |
The large placeholder jump (0.300 to 0.624) came from rebuilding
INTENT_PLACEHOLDERS from the training data and adding data-driven footer
injection; the retrain on top of that lifted ROUGE-L from 0.336 to 0.410
and SacreBLEU from 20.40 to 28.67.
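This card does not define how placeholder F1 is computed. One plausible reading, shown here purely as an assumption, is a set-based F1 between the placeholders in the reference reply and those in the generated reply. The function names and example strings are hypothetical; the evaluation pipeline in the source repo is the authority on the actual metric.

```python
import re

PLACEHOLDER_RE = re.compile(r"\{\{\s*([^{}]+?)\s*\}\}")


def extract_placeholders(text: str) -> set[str]:
    """Return the set of distinct placeholder names found in text."""
    return set(PLACEHOLDER_RE.findall(text))


def placeholder_f1(reference: str, prediction: str) -> float:
    """Set-based F1 over placeholder names (assumed definition)."""
    ref = extract_placeholders(reference)
    pred = extract_placeholders(prediction)
    if not ref and not pred:
        return 1.0  # nothing expected, nothing produced (assumed convention)
    overlap = len(ref & pred)
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred)
    recall = overlap / len(ref)
    return 2 * precision * recall / (precision + recall)


# Hypothetical example: one placeholder shared, one missed, one spurious.
reference = "Order {{Order Number}} was refunded to {{Payment Method}}."
prediction = "Order {{Order Number}} was refunded. See {{Website URL}}."
score = placeholder_f1(reference, prediction)
print(f"{score:.3f}")
```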
Training config
- Base: meta-llama/Llama-3.2-3B-Instruct
- Method: LoRA via mlx_lm lora
- Rank 16, alpha 32, dropout 0.05, scale 4.0
- Target projections: q, k, v, o, gate, up, down (all 7)
- num_layers: 16
- Batch size 1, grad accumulation 8 (effective batch 8)
- Learning rate 2e-4, 5000 iterations
- Max sequence length 1024
- --mask-prompt, --grad-checkpoint
- Final train loss ~0.6, final val loss 0.630
- ~3.5 hours on M3 16 GB, peak memory 7.9 GB
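To give a sense of scale for this configuration, the arithmetic below estimates the number of trainable LoRA parameters and restates the effective batch size. The model dimensions (hidden size 3072, MLP size 8192, 8 key/value heads of size 128) are assumptions taken from the published Llama 3.2 3B architecture, not from this card, and the count assumes each adapted projection adds one `in x rank` and one `rank x out` matrix with no bias.

```python
# Assumed Llama-3.2-3B-Instruct dimensions (not stated in this card).
HIDDEN = 3072
INTERMEDIATE = 8192
KV_DIM = 8 * 128  # 8 key/value heads x head size 128

# Values from the training config above.
RANK = 16
NUM_LAYERS = 16
BATCH_SIZE = 1
GRAD_ACCUMULATION = 8

# (input_dim, output_dim) of each of the 7 targeted projections.
PROJECTIONS = {
    "q": (HIDDEN, HIDDEN),
    "k": (HIDDEN, KV_DIM),
    "v": (HIDDEN, KV_DIM),
    "o": (HIDDEN, HIDDEN),
    "gate": (HIDDEN, INTERMEDIATE),
    "up": (HIDDEN, INTERMEDIATE),
    "down": (INTERMEDIATE, HIDDEN),
}


def lora_params(in_dim: int, out_dim: int, rank: int) -> int:
    """Parameters added by one LoRA pair: A (in x rank) plus B (rank x out)."""
    return rank * (in_dim + out_dim)


per_layer = sum(lora_params(i, o, RANK) for i, o in PROJECTIONS.values())
total = per_layer * NUM_LAYERS
effective_batch = BATCH_SIZE * GRAD_ACCUMULATION

print(f"LoRA parameters per adapted layer: {per_layer:,}")
print(f"LoRA parameters over {NUM_LAYERS} layers: {total:,}")
print(f"Effective batch size: {effective_batch}")
```

Under these assumptions the adapter trains on the order of 14 million parameters, a small fraction of the roughly 3 billion in the base model.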
Usage (MLX)
from mlx_lm import load, generate
from mlx_lm.sample_utils import make_sampler

model, tokenizer = load(
    "meta-llama/Llama-3.2-3B-Instruct",
    adapter_path="batuhne/customer-support-llm",
)

messages = [
    {"role": "system", "content": "You are a customer support agent."},
    {"role": "user", "content": "I need to cancel order 12345."},
]
prompt = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True,
)

text = generate(
    model, tokenizer,
    prompt=prompt,
    max_tokens=256,
    sampler=make_sampler(temp=0.0),
)
print(text)
Greedy decoding (temp=0.0) is recommended for placeholder fidelity.
For best results, pair this with the postprocessing pipeline in the source repo, which strips out-of-intent placeholders and injects missing core placeholders.
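The two postprocessing steps named above might look like the sketch below. It is not the pipeline from the source repo: the contents of `INTENT_PLACEHOLDERS`, the `CORE_PLACEHOLDERS` table, the footer wording, and the `postprocess` function are all hypothetical stand-ins chosen for illustration.

```python
import re

PLACEHOLDER_RE = re.compile(r"\{\{\s*([^{}]+?)\s*\}\}")

# Hypothetical tables; the source repo builds its own from the training data.
INTENT_PLACEHOLDERS = {
    "cancel_order": {"Order Number", "Online Company Portal Info"},
}
CORE_PLACEHOLDERS = {
    "cancel_order": ["Order Number"],
}


def postprocess(text: str, intent: str) -> str:
    """Strip out-of-intent placeholders, then inject missing core ones."""
    allowed = INTENT_PLACEHOLDERS.get(intent, set())

    # Step 1: drop any placeholder that is not allowed for this intent.
    def keep_or_drop(match: re.Match) -> str:
        return match.group(0) if match.group(1) in allowed else ""

    cleaned = PLACEHOLDER_RE.sub(keep_or_drop, text)
    cleaned = re.sub(r"[ \t]{2,}", " ", cleaned).strip()

    # Step 2: append a footer for each core placeholder the reply omitted.
    present = set(PLACEHOLDER_RE.findall(cleaned))
    missing = [p for p in CORE_PLACEHOLDERS.get(intent, []) if p not in present]
    if missing:
        footer = " ".join(f"Reference: {{{{{name}}}}}." for name in missing)
        cleaned = f"{cleaned} {footer}"
    return cleaned


# Hypothetical raw model output containing one out-of-intent placeholder.
raw = "I can cancel that for you via {{Online Company Portal Info}}. {{Website URL}}"
result = postprocess(raw, "cancel_order")
print(result)
```

Running the strip step before the injection step means the footer is computed against the cleaned reply, so a core placeholder is never added twice.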
Limitations
- English only
- Trained on Bitext templates; phrasing skews formal/synthetic
- Placeholder coverage is uneven across intents: cancel_order, change_order, contact_human_agent, check_cancellation_fee, check_payment_methods, review, change_shipping_address reach F1 >= 0.95, but payment_issue, place_order, complaint, registration_problems, set_up_shipping_address, check_refund_policy, newsletter_subscription are still at F1 0.0
- Apple Silicon (MLX) only as published; convert with mlx_lm.convert or merge into the base model for other backends
License
Llama 3.2 community license (inherited from base model).