Instructions to use Arioron/Vex-Amber-Mini-1.2 with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use Arioron/Vex-Amber-Mini-1.2 with Transformers:
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="Arioron/Vex-Amber-Mini-1.2")
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("Arioron/Vex-Amber-Mini-1.2")
model = AutoModelForCausalLM.from_pretrained("Arioron/Vex-Amber-Mini-1.2")
messages = [
    {"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
# Decode only the newly generated tokens, skipping the prompt
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))
- Inference
- Notebooks
- Google Colab
- Kaggle
- Local Apps
- vLLM
How to use Arioron/Vex-Amber-Mini-1.2 with vLLM:
Install from pip and serve model
# Install vLLM from pip:
pip install vllm

# Start the vLLM server:
vllm serve "Arioron/Vex-Amber-Mini-1.2"

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
    -H "Content-Type: application/json" \
    --data '{
        "model": "Arioron/Vex-Amber-Mini-1.2",
        "messages": [
            {
                "role": "user",
                "content": "What is the capital of France?"
            }
        ]
    }'
Use Docker
docker model run hf.co/Arioron/Vex-Amber-Mini-1.2
- SGLang
How to use Arioron/Vex-Amber-Mini-1.2 with SGLang:
Install from pip and serve model
# Install SGLang from pip:
pip install sglang

# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "Arioron/Vex-Amber-Mini-1.2" \
    --host 0.0.0.0 \
    --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
    -H "Content-Type: application/json" \
    --data '{
        "model": "Arioron/Vex-Amber-Mini-1.2",
        "messages": [
            {
                "role": "user",
                "content": "What is the capital of France?"
            }
        ]
    }'
Use Docker images
docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "Arioron/Vex-Amber-Mini-1.2" \
        --host 0.0.0.0 \
        --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
    -H "Content-Type: application/json" \
    --data '{
        "model": "Arioron/Vex-Amber-Mini-1.2",
        "messages": [
            {
                "role": "user",
                "content": "What is the capital of France?"
            }
        ]
    }'
- Docker Model Runner
How to use Arioron/Vex-Amber-Mini-1.2 with Docker Model Runner:
docker model run hf.co/Arioron/Vex-Amber-Mini-1.2
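The curl calls above can also be issued from Python. The following is a minimal sketch using only the standard library; the helper names `build_chat_request` and `ask` are hypothetical, and the response parsing assumes the server returns the usual OpenAI-style `choices[0].message.content` field.

```python
import json
import urllib.request

MODEL_ID = "Arioron/Vex-Amber-Mini-1.2"


def build_chat_request(prompt, base_url="http://localhost:8000"):
    """Build a POST request for an OpenAI-compatible /v1/chat/completions endpoint."""
    payload = {
        "model": MODEL_ID,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(prompt, base_url="http://localhost:8000"):
    """Send the prompt and return the reply text (assumes an OpenAI-style response body)."""
    with urllib.request.urlopen(build_chat_request(prompt, base_url)) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]
```

With a vLLM server running as shown above, `ask("What is the capital of France?")` targets port 8000; for the SGLang commands, pass `base_url="http://localhost:30000"`.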
Vex Amber Mini 1.2
Model Description
Vex Amber Mini 1.2 is a 0.6B-parameter decoder-only transformer model tuned for mathematical reasoning and code generation. It builds on Vex Amber Mini 1.0, and the authors report strong results for its size class on programming and mathematical problem-solving benchmarks (see Performance).
- Developed by: Arioron
- Model type: Decoder-only Transformer
- Language(s): English
- License: Apache 2.0
- Finetuned from model: Arioron/Vex-Amber-Mini-1.0
Model Sources
- Base Model: Qwen/Qwen3-0.6B
- Repository: https://huggingface.co/Arioron/Vex-Amber-Mini-1.2
- Documentation: Arioron Model Docs
Performance
| Benchmark | Metric | Score |
|---|---|---|
| HumanEval | Pass@1 | 21.34% |
| MBPP | Pass@1 | 38.7% |
| GSM8K | Accuracy | 65.2% |
| MATH | Accuracy | 45.8% |
| MMLU | Accuracy | 58.3% |
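The card does not say how the Pass@1 figures were computed. For reference, a common approach is the unbiased pass@k estimator introduced with HumanEval: for each problem, draw n samples, count the c that pass the unit tests, and average 1 - C(n-c, k) / C(n, k) over problems. The sketch below assumes that method; the function names are illustrative.

```python
from math import comb


def pass_at_k(n, c, k):
    """Unbiased pass@k for one problem: n samples drawn, c of them correct."""
    if not 0 <= c <= n or not 1 <= k <= n:
        raise ValueError("require 0 <= c <= n and 1 <= k <= n")
    if n - c < k:
        # Fewer than k incorrect samples, so any k-subset contains a correct one.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)


def benchmark_pass_at_k(results, k=1):
    """Average pass@k over problems; results is a list of (n, c) pairs."""
    return sum(pass_at_k(n, c, k) for n, c in results) / len(results)
```

For k = 1 this reduces to the fraction of samples that pass, c / n, averaged over problems.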
Quick Start
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

model_name = "Arioron/Vex-Amber-Mini-1.2"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.float16,
    device_map="auto",
)

# Code generation example
prompt = "Write a Python function to reverse a linked list:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(
    **inputs,
    max_new_tokens=256,
    temperature=0.7,
    do_sample=True,
    top_p=0.9,
    pad_token_id=tokenizer.eos_token_id,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
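To make the `temperature=0.7` and `top_p=0.9` settings above more concrete, here is a simplified pure-Python illustration of temperature scaling followed by nucleus (top-p) filtering for a single decoding step. It is a sketch of the idea, not the Transformers implementation, and `top_p_filter` is a hypothetical name.

```python
import math


def top_p_filter(logits, temperature=0.7, top_p=0.9):
    """Return {token_index: probability} for the tokens kept by top-p sampling."""
    # Temperature below 1 sharpens the distribution; above 1 flattens it.
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]  # subtract max for stability
    total = sum(exps)
    probs = [e / total for e in exps]

    # Keep the most likely tokens until their cumulative mass reaches top_p.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cumulative = [], 0.0
    for i in order:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break

    # Renormalize so the kept probabilities sum to 1 before sampling.
    mass = sum(probs[i] for i in kept)
    return {i: probs[i] / mass for i in kept}
```

The next token is then sampled from the returned distribution; low-probability tokens outside the nucleus can never be chosen.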
Capabilities
🎯 Code Generation
# Example: The model can generate efficient algorithms
def quick_sort(arr):
    if len(arr) <= 1:
        return arr
    pivot = arr[len(arr) // 2]
    left = [x for x in arr if x < pivot]
    middle = [x for x in arr if x == pivot]
    right = [x for x in arr if x > pivot]
    return quick_sort(left) + middle + quick_sort(right)
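Generated code like the example above is worth checking before use. One lightweight approach, sketched here under the assumption that a trusted reference exists (Python's built-in `sorted`), is to compare outputs on random inputs. The `check_sort` helper is hypothetical.

```python
import random


def quick_sort(arr):
    """Same function as the example above, repeated so this block runs on its own."""
    if len(arr) <= 1:
        return arr
    pivot = arr[len(arr) // 2]
    left = [x for x in arr if x < pivot]
    middle = [x for x in arr if x == pivot]
    right = [x for x in arr if x > pivot]
    return quick_sort(left) + middle + quick_sort(right)


def check_sort(sort_fn, trials=200, seed=0):
    """Return True if sort_fn agrees with sorted() on random integer lists."""
    rng = random.Random(seed)
    for _ in range(trials):
        data = [rng.randint(-50, 50) for _ in range(rng.randint(0, 30))]
        if sort_fn(list(data)) != sorted(data):
            return False
    return True
```

Calling `check_sort(quick_sort)` exercises empty lists, duplicates, and negative values; a failing candidate returns `False` on the first mismatch.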
🔢 Mathematical Reasoning
# Example: Solve quadratic equations and explain steps
"""
Solve: x² - 5x + 6 = 0
Step 1: Factor the equation: (x - 2)(x - 3) = 0
Step 2: Set each factor to zero: x - 2 = 0 or x - 3 = 0
Step 3: Solve for x: x = 2 or x = 3
"""
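A model's worked solution can be cross-checked numerically. The sketch below applies the quadratic formula to the same equation; `solve_quadratic` is an illustrative helper, not part of the model or its tooling.

```python
import math


def solve_quadratic(a, b, c):
    """Return the distinct real roots of a*x^2 + b*x + c = 0 in ascending order."""
    if a == 0:
        raise ValueError("not a quadratic: a must be non-zero")
    disc = b * b - 4 * a * c
    if disc < 0:
        return ()  # no real roots
    root = math.sqrt(disc)
    # A set removes the duplicate when the discriminant is zero.
    return tuple(sorted({(-b - root) / (2 * a), (-b + root) / (2 * a)}))


roots = solve_quadratic(1, -5, 6)
print(roots)  # (2.0, 3.0)
```

The discriminant here is 25 - 24 = 1, so the roots are (5 - 1) / 2 = 2 and (5 + 1) / 2 = 3, matching the factored form.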
Training Details
Training Data
The model was trained on a carefully curated mixture of:
- 45% Code (Python, JavaScript, Java, C++)
- 30% Mathematical content (textbooks, problems, proofs)
- 15% General reasoning tasks
- 10% Conversational data
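The card does not state whether these percentages are measured in tokens, documents, or examples, nor how the mixture was applied during training. Purely as an illustration, the sketch below treats them as sampling weights for choosing which source each training example comes from; the category keys and `sample_sources` are hypothetical.

```python
import random

# Proportions as listed above, in percent of the training mixture.
MIXTURE = {
    "code": 45,
    "math": 30,
    "general_reasoning": 15,
    "conversational": 10,
}


def sample_sources(n, seed=0):
    """Draw n source names with probability proportional to the mixture weights."""
    rng = random.Random(seed)
    names = list(MIXTURE)
    weights = [MIXTURE[name] for name in names]
    return rng.choices(names, weights=weights, k=n)
```

Over many draws, the share of each source approaches its listed percentage.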
Technical Specifications
- Architecture: Transformer-based decoder
- Context Length: 8,192 tokens
- Precision: float16
- Training Framework: Native PyTorch
- Positional Encoding: Rotary Positional Embeddings (RoPE)
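As a rough illustration of the RoPE entry above, the sketch below rotates consecutive pairs of vector dimensions by an angle proportional to the token position. It uses the interleaved pairing from the original RoFormer formulation and an illustrative `base` of 10000; libraries may pair dimensions differently, and the value this model actually uses is defined in its configuration, not here.

```python
import math


def rope(vec, position, base=10000.0):
    """Rotate pairs (vec[2i], vec[2i+1]) by position * base**(-2i/d)."""
    d = len(vec)
    if d % 2 != 0:
        raise ValueError("vector length must be even")
    out = []
    for i in range(d // 2):
        theta = position * base ** (-2.0 * i / d)
        x, y = vec[2 * i], vec[2 * i + 1]
        out.append(x * math.cos(theta) - y * math.sin(theta))
        out.append(x * math.sin(theta) + y * math.cos(theta))
    return out


def dot(a, b):
    """Plain dot product of two equal-length sequences."""
    return sum(x * y for x, y in zip(a, b))
```

The useful property is that the dot product of a rotated query and a rotated key depends only on the distance between their positions: `dot(rope(q, 5), rope(k, 3))` equals `dot(rope(q, 12), rope(k, 10))` up to floating-point error.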
Intended Uses
Direct Use
- Code completion and generation
- Mathematical problem solving
- Educational assistance
- Technical documentation
- Research prototyping
Downstream Use
- Integration into IDEs and code editors
- Educational platforms
- Technical chatbots
- Research tools for mathematics and computer science
Limitations
- The 0.6B parameter count may limit performance on extremely complex, multi-step reasoning tasks
- While strong for its size, it may not match the performance of larger models (7B+) on some benchmarks
- Context window of 8K tokens may be insufficient for very long code files or documents
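When an input exceeds the context window, one common workaround is to split the tokenized input into overlapping windows and process each separately. The sketch below assumes the 8,192-token limit listed under Technical Specifications; `chunk_tokens` and the overlap of 256 tokens are illustrative choices.

```python
def chunk_tokens(token_ids, max_len=8192, overlap=256):
    """Split a token-id list into windows of at most max_len with shared overlap."""
    if not 0 <= overlap < max_len:
        raise ValueError("require 0 <= overlap < max_len")
    step = max_len - overlap
    chunks = []
    for start in range(0, len(token_ids), step):
        chunks.append(token_ids[start:start + max_len])
        if start + max_len >= len(token_ids):
            break  # this window already reaches the end of the input
    return chunks
```

In practice `max_len` should be set below 8,192 to leave room for the tokens the model generates, since prompt and output share the same window.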
Ethical Considerations
The model is trained on publicly available data and is designed to be helpful, harmless, and honest. However, as with any language model:
- Outputs should be verified for accuracy in critical applications
- The model should not be used for high-stakes decisions without human oversight
- Users should be aware of potential biases in training data
Citation
If you use this model in your research, please cite:
@misc{vexambermini1.2,
  title = {Vex Amber Mini 1.2: A Compact Language Model for Code and Mathematics},
  author = {Arioron},
  year = {2025},
  publisher = {Hugging Face},
  howpublished = {\url{https://huggingface.co/Arioron/Vex-Amber-Mini-1.2}}
}
Contact
- Email: [email protected]
- Website: https://arioron.com
- Documentation: https://docs.arioron.com
Acknowledgements
Thanks to the open-source community and the Qwen team for their foundational work. Special thanks to all contributors and researchers who have advanced the field of efficient language modeling.
For technical details, training recipes, and comprehensive evaluation results, please refer to our technical documentation.