SeaWolf-AI committed on Apr 17

Commit 7a96f3e (verified) · 1 Parent(s): 06f90e5

Add README: Darwin V8 lastbrain (Qwen3.5-2B father + Opus-Distill LoRA mother merged)

README.md +199 -0

	@@ -0,0 +1,199 @@

+---
+license: apache-2.0
+base_model: Qwen/Qwen3.5-2B
+tags:
+- qwen
+- qwen3.5
+- reasoning
+- distillation
+- claude-opus
+- darwin-v8
+- sft
+- lora
+- merged
+language:
+- en
+- ko
+- zh
+- ja
+pipeline_tag: text-generation
+library_name: transformers
+---
+# 🧠 lastbrain — Darwin V8
+**A Claude Opus distillation model built with Darwin V8 (2B parameters)**
+- 👨 **Father (Base)**: [`Qwen/Qwen3.5-2B`](https://huggingface.co/Qwen/Qwen3.5-2B)
+- 👩 **Mother (LoRA Adapter)**: [`FINAL-Bench/Qwen3.5-2B-Opus-Distill-v1`](https://huggingface.co/FINAL-Bench/Qwen3.5-2B-Opus-Distill-v1)
+- 👶 **Child (This model)**: `FINAL-Bench/lastbrain` — merged full-weight standalone
+---
+## 📦 Features
+- **Base**: Qwen3.5-2B (2.3B parameters, hybrid attention)
+- **Training**: SFT + LoRA (`all-linear`, rank=16, α=32)
+- **Teachers**: Claude Opus 4.5 / 4.6, Claude Sonnet 4.6 (pre-generated reasoning traces)
+- **Data**: 4,451 high-quality reasoning traces (from 4 public datasets)
+- **Merged**: the LoRA adapter is fully merged into the base weights, so the model **runs standalone**
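
To make the LoRA settings above concrete, here is a minimal sketch of what `rank=16, α=32` implies under the standard LoRA formulation: the low-rank update is scaled by α / rank, and each adapted linear layer gains `rank * (d_in + d_out)` trainable parameters. The 2048 × 2048 layer shape is a hypothetical example, not a value taken from the Qwen3.5-2B config.

```python
def lora_scaling(alpha: float, rank: int) -> float:
    """Scale applied to the low-rank update B @ A (standard LoRA: alpha / rank)."""
    return alpha / rank


def lora_param_count(d_in: int, d_out: int, rank: int) -> int:
    """Trainable parameters one adapter adds to a d_in -> d_out linear layer.

    A has shape (rank, d_in) and B has shape (d_out, rank).
    """
    return rank * (d_in + d_out)


# rank and alpha come from the README; the 2048 x 2048 layer shape is hypothetical.
RANK, ALPHA = 16, 32
scaling = lora_scaling(ALPHA, RANK)
adapter_params = lora_param_count(2048, 2048, RANK)
full_params = 2048 * 2048

print(f"scaling = {scaling}")
print(f"adapter params = {adapter_params:,} vs full layer = {full_params:,}")
```

For this hypothetical layer the adapter trains 65,536 parameters against 4,194,304 in the full weight matrix, which illustrates why a small rank keeps training cheap.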
+---
+## 🚀 Quick Start
+```python
+from transformers import AutoTokenizer, AutoModelForCausalLM
+import torch
+model_id = "FINAL-Bench/lastbrain"
+tok = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
+model = AutoModelForCausalLM.from_pretrained(
+    model_id, torch_dtype=torch.bfloat16, device_map="auto", trust_remote_code=True
+)
+messages = [
+    {"role": "user", "content": "If a train travels 60 km in 45 minutes, what is its speed in km/h?"}
+]
+prompt = tok.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
+inputs = tok(prompt, return_tensors="pt").to(model.device)
+with torch.no_grad():
+    outputs = model.generate(
+        **inputs,
+        max_new_tokens=800,
+        do_sample=False,
+        pad_token_id=tok.eos_token_id,
+    )
+print(tok.decode(outputs[0][inputs.input_ids.shape[1]:], skip_special_tokens=True))
+```
+**Example output**:
+```
+To find the speed of the train in km/h, we need to convert the given time from minutes to hours.
+**Given:**
+- Distance = 60 km
+- Time = 45 minutes
+**Step 1: Convert time to hours**
+Since there are 60 minutes in 1 hour:
+$$\text{Time in hours} = \frac{45}{60} = 0.75 \text{ hours}$$
+**Step 2: Calculate speed**
+$$\text{Speed} = \frac{60}{0.75} = 80 \text{ km/h}$$
+**Final Answer:** The speed of the train is **80 km/h**.
+```
+---
+## 🧬 Darwin V8 Training Pipeline
+```
+[Qwen/Qwen3.5-2B] ──── Base model (frozen)
+        +
+[4,451 Claude Opus/Sonnet reasoning traces]
+        ↓
+[SFT Training]
+  - LoRA (all-linear, r=16, α=32)
+  - Learning rate: 2e-4 (V8 rule: 10× the full fine-tuning LR)
+  - 2 epochs, bf16, 8×B200 DDP
+  - Loss: 1.33 → 1.10 (-17%)
+  - Token accuracy: 68% → 72% (+4 pp)
+        ↓
+[LoRA merge into base weights]
+        ↓
+[lastbrain] ← this model
+```
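
The loss and token-accuracy figures in the pipeline above mix two kinds of change: the loss drop is a relative percentage, while the accuracy gain is in percentage points. A small sketch of the arithmetic, using only the four numbers reported in the pipeline (the helper name `relative_change` is illustrative):

```python
def relative_change(before: float, after: float) -> float:
    """Relative change from `before` to `after`, in percent."""
    return (after - before) / before * 100.0


# Values reported in the training pipeline.
loss_change = relative_change(1.33, 1.10)        # relative change in loss
acc_gain_pp = 72 - 68                            # absolute gain, percentage points
acc_gain_rel = relative_change(68.0, 72.0)       # the same gain as a relative change

print(f"loss change: {loss_change:.1f}%")
print(f"accuracy gain: {acc_gain_pp} pp ({acc_gain_rel:.1f}% relative)")
```

The loss change rounds to the reported -17%, and the 4-point accuracy gain corresponds to a relative improvement of a little under 6%.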
+---
+## 📊 Training Data Composition
+| Dataset | Samples | Source teacher |
+|---------|--------|------|
+| [nohurry/Opus-4.6-Reasoning-3000x-filtered](https://huggingface.co/datasets/nohurry/Opus-4.6-Reasoning-3000x-filtered) | 2,326 | Claude Opus 4.6 |
+| [TeichAI/Claude-Opus-4.6-Reasoning-887x](https://huggingface.co/datasets/TeichAI/Claude-Opus-4.6-Reasoning-887x) | 887 | Claude Opus 4.6 |
+| [TeichAI/claude-4.5-opus-high-reasoning-250x](https://huggingface.co/datasets/TeichAI/claude-4.5-opus-high-reasoning-250x) | 250 | Claude Opus 4.5 |
+| [TeichAI/Claude-Sonnet-4.6-Reasoning-1100x](https://huggingface.co/datasets/TeichAI/Claude-Sonnet-4.6-Reasoning-1100x) | 1,100 | Claude Sonnet 4.6 |
+| **Total (after filtering)** | **4,451** | - |
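
A quick check of the table's arithmetic: the four listed per-dataset counts sum to more than the reported total. The sketch below only adds up the numbers shown in the table. Reading the difference as samples dropped by filtering is an assumption, since the README does not say whether the per-dataset counts are pre-filter or post-filter.

```python
# Per-dataset sample counts exactly as listed in the table above.
listed_counts = {
    "nohurry/Opus-4.6-Reasoning-3000x-filtered": 2326,
    "TeichAI/Claude-Opus-4.6-Reasoning-887x": 887,
    "TeichAI/claude-4.5-opus-high-reasoning-250x": 250,
    "TeichAI/Claude-Sonnet-4.6-Reasoning-1100x": 1100,
}
reported_total = 4451  # the "after filtering" total from the table

listed_sum = sum(listed_counts.values())
gap = listed_sum - reported_total  # assumed (not confirmed) to be filtered-out samples

print(f"listed sum = {listed_sum:,}, reported total = {reported_total:,}, gap = {gap}")
```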
+---
+## 🎯 Design Philosophy (Darwin V8)
+1. **LoRA Without Regret**: `all-linear` target, high LR; a small rank is sufficient
+2. **Response Distillation**: cost-efficient distillation from pre-generated Opus traces
+3. **Merge-and-Deploy**: merge the LoRA adapter, then deploy with no extra dependencies
+---
+## 🔁 How to Reproduce
+This model was created by merging two components, the base model and the LoRA adapter:
+```python
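+from transformers import AutoModelForCausalLM
+from peft import PeftModel
+import torch
+# Load the base ("father") model in bf16
+base = AutoModelForCausalLM.from_pretrained(
+    "Qwen/Qwen3.5-2B", torch_dtype=torch.bfloat16
+)
+# Attach the LoRA adapter ("mother") to the base model
+model = PeftModel.from_pretrained(
+    base, "FINAL-Bench/Qwen3.5-2B-Opus-Distill-v1"
+)
+# Fold the adapter weights into the base weights and drop the PEFT wrappers
+merged = model.merge_and_unload()
+# Save the merged full-weight model
+merged.save_pretrained("./lastbrain")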
+```
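
As a rough illustration of what the merge step does numerically, the sketch below applies the standard LoRA merge, W' = W + (α / r) · B · A, to toy matrices in plain Python. It is not PEFT's implementation, and it assumes the default α / r scaling. The matrices and the helpers `matmul`, `matvec`, and `merge_lora` are hypothetical; the toy uses rank 1 with α = 2, so the scaling factor is 2, the same as 32 / 16 for this model.

```python
def matmul(x, y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*y)] for row in x]


def matvec(m, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def merge_lora(w, a, b, alpha, rank):
    """Return W + (alpha / rank) * (B @ A), the merged weight matrix."""
    scale = alpha / rank
    delta = matmul(b, a)
    return [
        [w_ij + scale * d_ij for w_ij, d_ij in zip(w_row, d_row)]
        for w_row, d_row in zip(w, delta)
    ]


# Toy shapes: W is 2x3 (d_out x d_in), A is 1x3 (rank x d_in), B is 2x1 (d_out x rank).
W = [[1.0, 0.0, 2.0], [0.0, 1.0, 0.0]]
A = [[1.0, 2.0, 3.0]]
B = [[0.5], [-1.0]]
merged = merge_lora(W, A, B, alpha=2, rank=1)

# The merged matrix gives the same output as base + scaled adapter applied separately.
x = [1.0, 1.0, 1.0]
separate = [wx + 2.0 * bax for wx, bax in zip(matvec(W, x), matvec(B, matvec(A, x)))]

print(merged)
print(matvec(merged, x), separate)
```

Because the merged matrix reproduces the adapter's output exactly in this toy case, inference needs no adapter at load time, which is the point of the merge-and-deploy step.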
+---
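+## 📝 Sample Test Results (4 problems)
+| Type | Correct? | Response length |
+|-----|---------|---------|
+| Math (train speed) | ✅ 80 km/h | 771 chars |
+| Logic (height comparison) | ✅ Carol | 354 chars |
+| Code (primality test) | ✅ Python function | 1,712 chars |
+| Korean (minimum hourly wage) | ✅ 1,577,600 KRW | 142 chars |
+**Naturally produces structured answers with Markdown, LaTeX, and step-by-step reasoning**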
+---
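+## ⚠️ Limitations
+- **Scale**: 2.3B parameters (a small model)
+- **Numerical accuracy in Korean**: occasional arithmetic errors are possible (a small-model limitation)
+- **Long context**: trained with max_length=4,096
+- **`<think>` tags**: rarely emitted explicitly (reasoning is woven into the answer body)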
+---
+## 🪪 License
+- Base model: Apache 2.0 (Qwen)
+- Training data: see each dataset's individual license
+- This model: Apache 2.0
+---
+## 🙏 Credits
+- **Base**: Qwen team (Alibaba)
+- **Teacher**: Anthropic (Claude Opus 4.5/4.6, Sonnet 4.6)
+- **Data release**: nohurry, TeichAI
+- **Training & Release**: FINAL-Bench / VIDRAFT_LAB
+---
+## 🔗 Related Models
+- 🧠 [`FINAL-Bench/Qwen3.5-2B-Opus-Distill-v1`](https://huggingface.co/FINAL-Bench/Qwen3.5-2B-Opus-Distill-v1): the **standalone LoRA adapter** for this model
+- ⚡ [`FINAL-Bench/Qwen3.5-2B-Opus-SDPO-v1`](https://huggingface.co/FINAL-Bench/Qwen3.5-2B-Opus-SDPO-v1): Phase 4 SDPO self-distillation enhanced version
+---
+*Darwin V8 · Part of the evolutionary model merging series by VIDRAFT_LAB*