How to use from vLLM
Install from pip and serve model
# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "SmallDoge/Doge-40M-MoE-checkpoint"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "SmallDoge/Doge-40M-MoE-checkpoint",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'
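
The same request can be sent from Python. The following is a minimal sketch using only the standard library; it assumes the vLLM server started above is listening on localhost:8000 and that the response follows the OpenAI chat-completions shape. The helper names `build_chat_request` and `ask` are hypothetical and exist only for this illustration.

```python
import json
import urllib.request


def build_chat_request(prompt,
                       model="SmallDoge/Doge-40M-MoE-checkpoint",
                       base_url="http://localhost:8000"):
    """Build (but do not send) a POST request equivalent to the curl call."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(prompt):
    """Send the request and return the assistant's reply text.

    Assumes a server is running and returns an OpenAI-style response body.
    """
    with urllib.request.urlopen(build_chat_request(prompt)) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]
```

With the server running, `ask("What is the capital of France?")` would return the generated reply as a string.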
Use Docker
docker model run hf.co/SmallDoge/Doge-40M-MoE-checkpoint
Doge 40M MoE checkpoint

Doge uses wsd_scheduler as its training scheduler, which divides the learning-rate schedule into three stages: warmup, stable, and decay. Because the learning rate is held constant during the stable stage, training can be continued on any new dataset from any stable-stage checkpoint without loss spikes.

Here are the initial learning rates required to continue training at each checkpoint:

| Model | Learning Rate | Schedule | Warmup Steps | Stable Steps |
|---|---|---|---|---|
| Doge-40M | 8e-3 | wsd_scheduler | 2000 | 4000 |
| Doge-40M-MoE | 8e-3 | wsd_scheduler | 2000 | 4000 |
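
To make the three stages concrete, here is a rough sketch of a warmup-stable-decay schedule as a plain function of the step count. The peak learning rate, warmup steps, and stable steps come from the table above. The decay length (`decay_steps=2000`), the linear warmup, and the linear decay to zero are assumptions chosen for illustration; this is not the scheduler implementation Doge actually uses.

```python
def wsd_lr(step,
           peak_lr=8e-3,
           warmup_steps=2000,
           stable_steps=4000,
           decay_steps=2000,   # assumption: not stated in the model card
           min_lr=0.0):        # assumption: decay ends at zero
    """Return the learning rate at `step` for a warmup-stable-decay schedule."""
    if step < warmup_steps:
        # Warmup: ramp linearly from 0 up to the peak learning rate.
        return peak_lr * step / warmup_steps
    if step < warmup_steps + stable_steps:
        # Stable: hold the peak learning rate constant.
        return peak_lr
    # Decay: go linearly from the peak to min_lr (assumed shape), then stay there.
    progress = min((step - warmup_steps - stable_steps) / decay_steps, 1.0)
    return peak_lr - (peak_lr - min_lr) * progress
```

Under these assumptions, any checkpoint saved between step 2000 and step 5999 sits in the stable stage, so a continued run can start directly at the constant 8e-3 learning rate listed in the table.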
Model size: 45M params (Safetensors). Tensor type: BF16.