---
language:
- en
license: apache-2.0
base_model:
- Qwen/Qwen3.5-2B
library_name: transformers
pipeline_tag: text-generation
tags:
- soren
- soren-1
- soren-1-small
- syntropy-ai
- project-syntropic
- chat
- assistant
- conversational
- instruct
- reasoning
- thinking
- long-context
- qwen
- qwen3
- qwen3.5
- qwen3.5-2b
- transformer
- decoder-only
- sft
- dpo
- lora
- merged-lora
- unsloth
- trl
- preference-optimization
- supervised-finetuning
- coding
- code-generation
- code-assistant
- software-engineering
- python
- reasoning-model
- math
- problem-solving
- instruction-following
- agentic
- analytical
- 1m-context
- long-context-model
- yarn
- yarn-4x
- 1048576-context
- human-like
- low-hallucination
- honest-ai
- helpful-assistant
- rtx-pro-6000-blackwell
- blackwell
- nvidia
- claude-style
- reasoning-assistant
- coding-model
- compact-model
- edge-ai
- local-llm
- open-weights
- open-source-ai
datasets:
- syntropy-ai/Its-Me-Soren
- Roman1111111/claude-sonnet-4.6-120000x
- angrygiraffe/claude-opus-4.6-4.7-reasoning-8.7k
- TeichAI/Claude-Opus-4.6-Reasoning-887x
- dalisoft/claude-opus-4.6-high-reasoning-700x
- TeichAI/claude-4.5-opus-high-reasoning-250x
- Roman1111111/claude-opus-4.6-10000x
- TeichAI/Claude-Sonnet-4.6-Reasoning-1100x
- Hastagaras/Claude-Sonnet-X-Opus-4.6-Reasoning-small-500
- TeichAI/lordx64-claude-opus-4.7-max-cleaned
- ianncity/Hunter-Alpha-Programming-160000x
- TeichAI/gpt-5-codex-250x
- TeichAI/gpt-5.2-high-reasoning-250x
- Nettoov/Gpt-5.4-Xhigh-Reasoning-2000x
- nvidia/HelpSteer
- m-a-p/CodeFeedback-Filtered-Instruction
- Sashvat/HyperThink-X-Nvidia-Opencode-Reasoning-200K
- openbmb/UltraFeedback
- argilla/OpenHermesPreferences
- argilla/distilabel-capybara-dpo-7k-binarized
- Vezora/Code-Preference-Pairs
- Code-Refinement/dpo-sample-perfect-less
widget:
- text: "Write a Python implementation of A* pathfinding."
- text: "Explain special relativity like I'm 15."
- text: "Debug this Rust code."
- text: "Design a PostgreSQL schema for a social network."
inference: false
---

# Soren-1-Small
![Syntropy-AI](https://img.shields.io/badge/Syntropy--AI-Project%20Syntropic-F5C842?style=for-the-badge)
![Base Model](https://img.shields.io/badge/Base-Qwen3.5--2B-blue?style=for-the-badge)
![Context](https://img.shields.io/badge/Context-1M%20tokens-green?style=for-the-badge)
![License](https://img.shields.io/badge/License-Apache%202.0-orange?style=for-the-badge)

**Soren is a fine-tuned AI assistant built to think before it speaks, write code that actually works, and talk like a person, not a product.**

---

## Note: a **"math" patch** is planned

The model scored 60% on GSM8K, so a math-focused patch is being prepared.

## What is Soren?

Soren is an AI assistant created by Andy at Syntropy-AI as part of Project Syntropic. It is built on Qwen3.5-2B and trained through a carefully sequenced pipeline of supervised fine-tuning and direct preference optimization to produce a model with four core strengths:

- **Reasoning first**: Soren thinks through problems before answering. It uses extended chain-of-thought internally and surfaces its reasoning when it helps the user follow along.
- **Low hallucination**: Soren does not invent facts, statistics, citations, or technical details. When it does not know something, it says so.
- **Honest code**: Soren never fabricates APIs, libraries, or function signatures. It writes complete implementations, not stubs. No placeholder comments, no shortcuts.
- **Human-like tone**: Soren does not open with "Certainly!" or "As an AI...". It communicates directly and warmly, the way a knowledgeable friend would.

Soren-1-Small is the first release in the Soren-1 family. Medium (9B) and Large (27B) variants are planned.

### Context Window

Soren-1-Small supports a **1,048,576 token (~1M) context window** via a YaRN 4x extension applied post-training. The native context of the base Qwen3.5-2B is 262,144 tokens.

---

## Training Configuration

| Parameter | Value |
|---|---|
| Base model | Qwen/Qwen3.5-2B |
| Precision | BF16 full LoRA (no quantization) |
| LoRA rank | 64 |
| LoRA alpha | 128 |
| Target modules | q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj |
| Sequence length | 4096–8192 tokens (session dependent) |
| Effective batch size | 16 |
| Optimizer | adamw_torch |
| Framework | Unsloth + TRL |
| Hardware | NVIDIA RTX PRO 6000 Blackwell (96 GB VRAM) |

---

## Training Pipeline

### SFT: Supervised Fine-Tuning

Training followed a sequential LoRA chaining approach.
Each session trains a LoRA adapter and merges it into the base model; the next session then trains on the merged result.

**Session 0: Soren Identity v1**
- Dataset: `syntropy-ai/Its-Me-Soren` (200 examples)
- LR: `5e-5` | Epochs: 3
- Purpose: Establish Soren's core identity and persona before any other training

**Session 1: Reasoning Warmup**
- Dataset: `Roman1111111/claude-sonnet-4.6-120000x` (80k slice, capped at 250 steps)
- LR: `2e-4`
- Purpose: Build a broad foundation in instruction following and conversational quality

**Session 2A: Claude Core (~26k examples, capped at 800 steps)**
- LR: `5e-5`
- Datasets:
  - `angrygiraffe/claude-opus-4.6-4.7-reasoning-8.7k` (8.7k slice)
  - `TeichAI/Claude-Opus-4.6-Reasoning-887x`
  - `dalisoft/claude-opus-4.6-high-reasoning-700x`
  - `TeichAI/claude-4.5-opus-high-reasoning-250x`
  - `Roman1111111/claude-opus-4.6-10000x`
  - `TeichAI/Claude-Sonnet-4.6-Reasoning-1100x`
  - `Hastagaras/Claude-Sonnet-X-Opus-4.6-Reasoning-small-500`
  - `TeichAI/lordx64-claude-opus-4.7-max-cleaned`

**Session 2B: Code Core (~24.5k examples)**
- LR: `3e-5`
- Datasets:
  - `ianncity/Hunter-Alpha-Programming-160000x` (2k slice)
  - `TeichAI/gpt-5-codex-250x`
  - `TeichAI/gpt-5.2-high-reasoning-250x`
  - `Nettoov/Gpt-5.4-Xhigh-Reasoning-2000x`
  - `nvidia/HelpSteer` (5k slice)
  - `m-a-p/CodeFeedback-Filtered-Instruction` (5k slice)
  - `Sashvat/HyperThink-X-Nvidia-Opencode-Reasoning-200K` (10k slice)

**Session 4: Soren Identity v2 (Rescue)**
- Dataset: `syntropy-ai/Its-Me-Soren` (200 examples)
- LR: `5e-5` | Epochs: 3
- Purpose: Reinforce Soren's identity after ~60k examples of third-party data

---

### DPO: Direct Preference Optimization

**DPO1: General Preference (~34k examples)**
- LR: `1e-6` | Beta: `0.1`
- Datasets:
  - `openbmb/UltraFeedback` (17k slice)
  - `argilla/OpenHermesPreferences` (10k slice)
  - `argilla/distilabel-capybara-dpo-7k-binarized` (full, ~7k)

**DPO2: Code Preference (~6k examples)**
- LR: `1e-6` | Beta: `0.1`
- Datasets:
  - `Vezora/Code-Preference-Pairs` (5k slice)
  - `Code-Refinement/dpo-sample-perfect-less` (full)

---

### Post-Training

- **YaRN 4x** applied to `config.json`: extends the context from 262k to ~1M tokens
- **Default system prompt** baked into the chat template: Soren's identity, tone, and behavioral guidelines are always active, with no need to pass a system message at inference time

---

## Usage

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the merged weights in BF16 and let Accelerate place them on the available devices.
model = AutoModelForCausalLM.from_pretrained(
    "syntropy-ai/Soren-1-Small",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("syntropy-ai/Soren-1-Small")

messages = [{"role": "user", "content": "Write a Python function to check if a number is prime."}]

# Render the chat template to a prompt string. No system message is passed
# because the default system prompt is part of the template.
prompt = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
    enable_thinking=True,
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

outputs = model.generate(
    **inputs,
    max_new_tokens=2048,
    temperature=0.7,
    do_sample=True,
)

# Decode only the newly generated tokens, skipping the prompt.
prompt_length = inputs["input_ids"].shape[1]
print(tokenizer.decode(outputs[0][prompt_length:], skip_special_tokens=True))
```

---

## Credits

**Created by** [Andy](https://github.com/Andy-ML-And-AI) at [Syntropy-AI](https://huggingface.co/syntropy-ai), Project Syntropic

**Compute** generously provided by [Lightning.ai](https://lightning.ai): NVIDIA RTX PRO 6000 Blackwell (96 GB VRAM)

**Training framework**: [Unsloth](https://github.com/unslothai/unsloth) by Daniel Han-Chen and the Unsloth team

**Dataset authors**: All dataset creators listed in the pipeline above. Thank you for making your data public.

---

## License

Apache 2.0
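
As a quick check on the context-window figures quoted in the Context Window and Post-Training sections, the sketch below reproduces the YaRN arithmetic: the native 262,144-token window multiplied by the 4x factor gives 1,048,576 tokens. The `yarn_rope_scaling` helper and the dictionary it returns are illustrative only. The key names follow the convention seen in the `config.json` of earlier Qwen releases and are an assumption here, so the actual entry in this model's config may be laid out differently.

```python
# Figures taken from this model card.
NATIVE_CONTEXT_TOKENS = 262_144  # native context of the base model
YARN_FACTOR = 4                  # "YaRN 4x"


def extended_context(native_tokens: int, factor: int) -> int:
    """Return the context length after scaling the native window by `factor`."""
    return native_tokens * factor


def yarn_rope_scaling(native_tokens: int, factor: int) -> dict:
    """Build an illustrative rope-scaling entry.

    ASSUMPTION: the key names mirror earlier Qwen configs and are not
    confirmed for this model's config.json.
    """
    return {
        "rope_type": "yarn",
        "factor": float(factor),
        "original_max_position_embeddings": native_tokens,
    }


if __name__ == "__main__":
    print(extended_context(NATIVE_CONTEXT_TOKENS, YARN_FACTOR))
    print(yarn_rope_scaling(NATIVE_CONTEXT_TOKENS, YARN_FACTOR))
```

The helpers only document the arithmetic; they do not read or modify any model files.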