---
language:
- en
license: apache-2.0
base_model:
- Qwen/Qwen3.5-2B
library_name: transformers
pipeline_tag: text-generation
tags:
- soren
- soren-1
- soren-1-small
- syntropy-ai
- project-syntropic
- chat
- assistant
- conversational
- instruct
- reasoning
- thinking
- long-context
- qwen
- qwen3
- qwen3.5
- qwen3.5-2b
- transformer
- decoder-only
- sft
- dpo
- lora
- merged-lora
- unsloth
- trl
- preference-optimization
- supervised-finetuning
- coding
- code-generation
- code-assistant
- software-engineering
- python
- reasoning-model
- math
- problem-solving
- instruction-following
- agentic
- analytical
- 1m-context
- long-context-model
- yarn
- yarn-4x
- 1048576-context
- human-like
- low-hallucination
- honest-ai
- helpful-assistant
- rtx-pro-6000-blackwell
- blackwell
- nvidia
- claude-style
- reasoning-assistant
- coding-model
- compact-model
- edge-ai
- local-llm
- open-weights
- open-source-ai
datasets:
- syntropy-ai/Its-Me-Soren
- Roman1111111/claude-sonnet-4.6-120000x
- angrygiraffe/claude-opus-4.6-4.7-reasoning-8.7k
- TeichAI/Claude-Opus-4.6-Reasoning-887x
- dalisoft/claude-opus-4.6-high-reasoning-700x
- TeichAI/claude-4.5-opus-high-reasoning-250x
- Roman1111111/claude-opus-4.6-10000x
- TeichAI/Claude-Sonnet-4.6-Reasoning-1100x
- Hastagaras/Claude-Sonnet-X-Opus-4.6-Reasoning-small-500
- TeichAI/lordx64-claude-opus-4.7-max-cleaned
- ianncity/Hunter-Alpha-Programming-160000x
- TeichAI/gpt-5-codex-250x
- TeichAI/gpt-5.2-high-reasoning-250x
- Nettoov/Gpt-5.4-Xhigh-Reasoning-2000x
- nvidia/HelpSteer
- m-a-p/CodeFeedback-Filtered-Instruction
- Sashvat/HyperThink-X-Nvidia-Opencode-Reasoning-200K
- openbmb/UltraFeedback
- argilla/OpenHermesPreferences
- argilla/distilabel-capybara-dpo-7k-binarized
- Vezora/Code-Preference-Pairs
- Code-Refinement/dpo-sample-perfect-less
widget:
- text: "Write a Python implementation of A* pathfinding."
- text: "Explain special relativity like I'm 15."
- text: "Debug this Rust code."
- text: "Design a PostgreSQL schema for a social network."
inference: false
---

# Soren-1-Small

<div align="center">

Model: Soren-1-Small · Base: Qwen3.5-2B · Context: 1M tokens · License: Apache 2.0

**Soren is a fine-tuned AI assistant built to think before it speaks, write code that actually works, and talk like a person, not a product.**

</div>

---

## Note:

A **math patch** is planned: the model scored 60% on GSM8K, so a follow-up patch targeting math is being prepared.

## What is Soren?

Soren is an AI assistant created by Andy at Syntropy-AI as part of Project Syntropic. It is built on Qwen3.5-2B and trained through a carefully sequenced pipeline of supervised fine-tuning and direct preference optimization to produce a model with four core strengths:

- **Reasoning first:** Soren thinks through problems before answering. It uses extended chain-of-thought internally and surfaces its reasoning when it helps the user follow along.
- **Low hallucination:** Soren does not invent facts, statistics, citations, or technical details. When it does not know something, it says so.
- **Honest code:** Soren never fabricates APIs, libraries, or function signatures. It writes complete implementations, not stubs. No placeholder comments, no shortcuts.
- **Human-like tone:** Soren does not open with "Certainly!" or "As an AI...". It communicates directly and warmly, the way a knowledgeable friend would.

Soren-1-Small is the first release in the Soren-1 family. Medium (9B) and Large (27B) are planned.

### Context Window

Soren-1-Small supports a **1,048,576 token (~1M) context window** via YaRN 4x extension applied post-training. The native context of the base Qwen3.5-2B is 262,144 tokens.
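
The arithmetic behind the extended window can be checked directly. The sketch below is illustrative only: the `rope_type`, `factor`, and `original_max_position_embeddings` key names follow the YaRN layout documented for earlier Qwen releases and are an assumption here, so the exact keys in this model's `config.json` may differ.

```python
# Illustrative only: key names are assumed, not read from this model's config.json.
NATIVE_CONTEXT = 262_144  # native context of Qwen3.5-2B, as stated in this card
YARN_FACTOR = 4.0         # "YaRN 4x"


def yarn_rope_scaling(native_context: int, factor: float) -> dict:
    """Build a YaRN-style rope-scaling entry (hypothetical key layout)."""
    return {
        "rope_type": "yarn",
        "factor": factor,
        "original_max_position_embeddings": native_context,
    }


def extended_context(native_context: int, factor: float) -> int:
    """Context length after scaling the native window by `factor`."""
    return int(native_context * factor)


rope_scaling = yarn_rope_scaling(NATIVE_CONTEXT, YARN_FACTOR)
print(rope_scaling)
print(extended_context(NATIVE_CONTEXT, YARN_FACTOR))  # 1048576
```

The result matches the 1,048,576-token figure quoted above (262,144 × 4).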

---

## Training Configuration

| Parameter | Value |
|---|---|
| Base model | Qwen/Qwen3.5-2B |
| Precision | BF16 full LoRA (no quantization) |
| LoRA rank | 64 |
| LoRA alpha | 128 |
| Target modules | q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj |
| Sequence length | 4096–8192 (session dependent) |
| Effective batch size | 16 |
| Optimizer | adamw_torch |
| Framework | Unsloth + TRL |
| Hardware | NVIDIA RTX PRO 6000 Blackwell (96GB VRAM) |
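
To make the table's numbers concrete, here is a small standard-library sketch of what they imply. It assumes the standard LoRA scaling of `alpha / rank` (not the rank-stabilized variant). The 2048-wide layer and the 2 × 8 split of the effective batch size are hypothetical values chosen for illustration, not figures from this card.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LoraSettings:
    """LoRA hyperparameters as listed in the training configuration table."""
    rank: int = 64
    alpha: int = 128
    target_modules: tuple = (
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    )

    @property
    def scaling(self) -> float:
        # Standard LoRA scales the low-rank update by alpha / rank (assumed here).
        return self.alpha / self.rank


def adapter_params(d_in: int, d_out: int, rank: int) -> int:
    """Trainable parameters added to one linear layer: A is rank x d_in, B is d_out x rank."""
    return rank * (d_in + d_out)


def effective_batch_size(per_device: int, grad_accum: int, num_devices: int = 1) -> int:
    """Examples consumed per optimizer step."""
    return per_device * grad_accum * num_devices


settings = LoraSettings()
print(settings.scaling)                           # 2.0
print(adapter_params(2048, 2048, settings.rank))  # 262144 (hypothetical 2048-wide layer)
print(effective_batch_size(2, 8))                 # 16 (hypothetical 2 x 8 split)
```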

---

## Training Pipeline

### SFT: Supervised Fine-Tuning

Training followed a sequential LoRA chaining approach. Each session trains a LoRA adapter and merges it into the base model; the next session then trains on the merged result.
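
The merge step in this chaining scheme can be sketched on toy matrices. This is a minimal pure-Python illustration and not the project's training code: the adapters are fixed toy values, whereas in real training each adapter would be fit against the merged weights from the previous session. The function names are hypothetical. In PEFT-based workflows the merge is usually done with `merge_and_unload()`; the card does not say which call was used here.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def merge_lora(weight, lora_b, lora_a, alpha, rank):
    """Return W + (alpha / rank) * (B @ A), the merged weight after one session."""
    scale = alpha / rank
    delta = matmul(lora_b, lora_a)
    return [
        [w + scale * d for w, d in zip(w_row, d_row)]
        for w_row, d_row in zip(weight, delta)
    ]


def chain_sessions(base_weight, adapters, alpha, rank):
    """Merge adapters one after another; each merge starts from the previous result."""
    weight = base_weight
    for lora_b, lora_a in adapters:
        weight = merge_lora(weight, lora_b, lora_a, alpha, rank)
    return weight


# Toy example: 2x2 identity base weight and two rank-1 adapters (B is 2x1, A is 1x2).
base = [[1.0, 0.0], [0.0, 1.0]]
adapters = [
    ([[1.0], [0.0]], [[0.0, 1.0]]),  # session 1 adapter
    ([[0.0], [1.0]], [[1.0, 0.0]]),  # session 2 adapter
]
merged = chain_sessions(base, adapters, alpha=2, rank=1)
print(merged)  # [[1.0, 2.0], [2.0, 1.0]]
```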

**Session 0: Soren Identity v1**

- Dataset: `syntropy-ai/Its-Me-Soren` (200 examples)
- LR: `5e-5` | Epochs: 3
- Purpose: Establish Soren's core identity and persona before any other training

**Session 1: Reasoning Warmup**

- Dataset: `Roman1111111/claude-sonnet-4.6-120000x` (80k slice, capped at 250 steps)
- LR: `2e-4`
- Purpose: Broad instruction following and conversational quality foundation

**Session 2A: Claude Core (~26k examples, capped at 800 steps)**

- LR: `5e-5`
- Datasets:
  - `angrygiraffe/claude-opus-4.6-4.7-reasoning-8.7k` (8.7k slice)
  - `TeichAI/Claude-Opus-4.6-Reasoning-887x`
  - `dalisoft/claude-opus-4.6-high-reasoning-700x`
  - `TeichAI/claude-4.5-opus-high-reasoning-250x`
  - `Roman1111111/claude-opus-4.6-10000x`
  - `TeichAI/Claude-Sonnet-4.6-Reasoning-1100x`
  - `Hastagaras/Claude-Sonnet-X-Opus-4.6-Reasoning-small-500`
  - `TeichAI/lordx64-claude-opus-4.7-max-cleaned`

**Session 2B: Code Core (~24.5k examples)**

- LR: `3e-5`
- Datasets:
  - `ianncity/Hunter-Alpha-Programming-160000x` (2k slice)
  - `TeichAI/gpt-5-codex-250x`
  - `TeichAI/gpt-5.2-high-reasoning-250x`
  - `Nettoov/Gpt-5.4-Xhigh-Reasoning-2000x`
  - `nvidia/HelpSteer` (5k slice)
  - `m-a-p/CodeFeedback-Filtered-Instruction` (5k slice)
  - `Sashvat/HyperThink-X-Nvidia-Opencode-Reasoning-200K` (10k slice)

**Session 4: Soren Identity v2 (Rescue)**

- Dataset: `syntropy-ai/Its-Me-Soren` (200 examples)
- LR: `5e-5` | Epochs: 3
- Purpose: Reinforce Soren's identity after ~60k examples of third-party data
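
The step caps quoted for Sessions 1 and 2A can be translated into a rough count of examples seen. This is a back-of-the-envelope sketch, assuming the effective batch size of 16 from the training configuration applies to every session and that one optimizer step consumes exactly one effective batch; neither assumption is confirmed per session in this card.

```python
EFFECTIVE_BATCH_SIZE = 16  # from the training configuration table


def examples_seen(max_steps: int, effective_batch_size: int = EFFECTIVE_BATCH_SIZE) -> int:
    """Upper bound on training examples consumed when a run is capped at `max_steps`."""
    return max_steps * effective_batch_size


session_1 = examples_seen(250)   # Session 1: capped at 250 steps
session_2a = examples_seen(800)  # Session 2A: capped at 800 steps

print(session_1, session_1 / 80_000)    # 4000 examples, 5% of the 80k slice
print(session_2a, session_2a / 26_000)  # 12800 examples, roughly half of ~26k
```

Under these assumptions, the step cap rather than the slice size determines how much data each capped session actually consumed.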

---

### DPO: Direct Preference Optimization

**DPO1: General Preference (~34k examples)**

- LR: `1e-6` | Beta: `0.1`
- Datasets:
  - `openbmb/UltraFeedback` (17k slice)
  - `argilla/OpenHermesPreferences` (10k slice)
  - `argilla/distilabel-capybara-dpo-7k-binarized` (full, ~7k)

**DPO2: Code Preference (~6k examples)**

- LR: `1e-6` | Beta: `0.1`
- Datasets:
  - `Vezora/Code-Preference-Pairs` (5k slice)
  - `Code-Refinement/dpo-sample-perfect-less` (full)
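
For readers unfamiliar with the role of `Beta` above, the following sketch computes the per-pair loss of the standard sigmoid DPO objective. The card does not state which DPO loss variant was used, so treating it as the standard one is an assumption; the log-probability values in the example are made up for illustration.

```python
import math


def dpo_loss(policy_chosen_logp: float, policy_rejected_logp: float,
             ref_chosen_logp: float, ref_rejected_logp: float,
             beta: float = 0.1) -> float:
    """Standard DPO loss for one preference pair: -log(sigmoid(beta * margin))."""
    chosen_logratio = policy_chosen_logp - ref_chosen_logp
    rejected_logratio = policy_rejected_logp - ref_rejected_logp
    margin = beta * (chosen_logratio - rejected_logratio)
    # -log(sigmoid(m)) equals softplus(-m); computed in a numerically stable form.
    x = -margin
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


# A policy identical to the reference gives log(2), the loss at initialization.
print(dpo_loss(0.0, 0.0, 0.0, 0.0))

# A policy that favors the chosen answer more than the reference gets a lower loss.
print(dpo_loss(-10.0, -14.0, -12.0, -12.0))
```

A small `beta` such as 0.1 makes the loss less sensitive to a given gap between policy and reference log-ratios.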

---

### Post-Training

- **YaRN 4x** applied to `config.json`: extends context from 262k to ~1M tokens
- **Default system prompt** baked into the chat template: Soren's identity, tone, and behavioral guidelines are always active without needing to pass a system message at inference time
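
The effect of a default system prompt baked into a chat template can be sketched in plain Python. This is a hypothetical illustration, not the model's actual template. The prompt text is a placeholder, and the sketch assumes that a user-supplied system message replaces the default, which the card does not specify.

```python
# Placeholder text; the real default prompt lives inside the model's chat template.
DEFAULT_SYSTEM_PROMPT = "<default system prompt>"


def with_default_system(messages, default=DEFAULT_SYSTEM_PROMPT):
    """Prepend the default system message unless the caller already supplied one."""
    if messages and messages[0].get("role") == "system":
        return list(messages)  # assumption: an explicit system message takes precedence
    return [{"role": "system", "content": default}, *messages]


user_only = [{"role": "user", "content": "Who are you?"}]
print(with_default_system(user_only))
```

With this behavior, callers who pass only user messages still get the identity and tone guidelines applied.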

---

## Usage

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

# Load the weights in BF16 and place them automatically on the available device(s).
model = AutoModelForCausalLM.from_pretrained(
    "syntropy-ai/Soren-1-Small",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("syntropy-ai/Soren-1-Small")

messages = [{"role": "user", "content": "Write a Python function to check if a number is prime."}]

# Render the chat template to a prompt string with thinking mode enabled.
prompt = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
    enable_thinking=True,
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

outputs = model.generate(
    **inputs,
    max_new_tokens=2048,
    temperature=0.7,
    do_sample=True,
)

# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```
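
With `enable_thinking=True`, the decoded text may contain the model's reasoning ahead of the final answer. The helper below is a sketch for separating the two. It assumes Qwen-style `<think>` and `</think>` delimiters, which this card does not confirm for Soren-1-Small, and `split_thinking` is a hypothetical name rather than a library function.

```python
def split_thinking(text: str, open_tag: str = "<think>", close_tag: str = "</think>"):
    """Split decoded output into (reasoning, answer).

    Assumes Qwen-style think tags. If the closing tag is absent, the whole
    text is treated as the answer. The opening tag is optional because some
    chat templates emit it as part of the prompt rather than the completion.
    """
    head, sep, tail = text.partition(close_tag)
    if not sep:
        return "", text.strip()
    reasoning = head.replace(open_tag, "", 1).strip()
    return reasoning, tail.strip()


reasoning, answer = split_thinking("<think>\n2 has no divisors besides 1 and 2.\n</think>\n\nYes, 2 is prime.")
print(reasoning)
print(answer)
```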

---

## Credits

**Created by** [Andy](https://github.com/Andy-ML-And-AI) at [Syntropy-AI](https://huggingface.co/syntropy-ai), Project Syntropic

**Compute** generously provided by [Lightning.ai](https://lightning.ai): NVIDIA RTX PRO 6000 Blackwell (96GB VRAM)

**Training framework:** [Unsloth](https://github.com/unslothai/unsloth) by Daniel Han-Chen and the Unsloth team

**Dataset authors:** All dataset creators listed in the pipeline above. Thank you for making your data public.

---

## License

Apache 2.0