Error when trying to download a model in Ollama

Alright, so I am on Ubuntu, working through the terminal, trying to download a model. Sometimes it works, but it only gets to about 50% and then drops out.

Mostly I am getting an error informing me of ‘pull model manifest: 429’ and then something about how I need an account or an HF token.

Anyone got an idea on how to get around this?

1 Like

If a 429 error is involved, there is no complete workaround. Mitigation may be possible using hf auth login or the HF_TOKEN environment variable.


Your error is Hugging Face rate limiting. Your “50% then drops” symptom has the same root cause, compounded by how the pull is being done on Linux.

What is happening in plain terms

What “pull model manifest” means

When you run ollama pull … or ollama run …, Ollama first downloads a small metadata file called a manifest. The manifest tells Ollama which blobs or layers to fetch next.

If the manifest fetch fails, the whole pull stops early.
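As a rough sketch of where this fails (the manifest URL format below is an assumption based on how registry-style pulls usually work, not documented Ollama internals):

# Hypothetical manifest request for an hf.co-hosted model:
curl -sI https://hf.co/v2/<user>/<repo>/manifests/latest
# A 429 response here aborts the pull before any layer is downloaded.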

What “429” means

429 = Too Many Requests. The server you are pulling from is refusing more requests because you exceeded its rate limit in the last few minutes. Hugging Face documents this exactly: when you hit a rate limit you get HTTP 429, and the limit is tracked in a 5-minute window. (Hugging Face)

Why it mentions “account” or “HF token”

That message is the Hugging Face Hub telling you: “anonymous or unauthenticated IP traffic gets lower limits.” A token upgrades you from “anonymous per IP” limits to “logged-in account” limits, which are higher. Hugging Face even publishes the tier table (Anonymous vs Free vs Pro, etc.). (Hugging Face)

This exact Ollama symptom is reported by others too. Example: an Ollama issue where pull model manifest: 429 appears alongside Hugging Face token wording and an unexpected IP, which is consistent with shared egress IPs, VPNs, or carrier NAT. (GitHub)


First check: are you pulling from Hugging Face?

If your command looks like this:

ollama run hf.co/<user>/<repo>
# or
ollama pull hf.co/<user>/<repo>

…then you are pulling from Hugging Face Hub. Hugging Face’s Ollama page explicitly documents this hf.co/{username}/{repository} format. (Hugging Face)

If you are pulling from the Ollama model library (no hf.co/...), you can still see 429 sometimes, but your “HF token” wording strongly suggests you are on the Hugging Face path.


Fix 1: use a Hugging Face account + token (the intended solution)

Step A. Create a token

Create a Hugging Face account and generate a User Access Token. Tokens can be scoped (read vs write). For downloads, a read token is the safer default. (Hugging Face)
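If you want to sanity-check the token before wiring it into Ollama, the Hub’s whoami endpoint works with a plain curl (the token value is a placeholder):

# Should return your account details as JSON if the token is valid:
curl -s -H "Authorization: Bearer hf_your_token_here" https://huggingface.co/api/whoami-v2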

Step B. Make Ollama see the token on Ubuntu

This is the Ubuntu gotcha: Ollama often runs as a systemd service, so setting HF_TOKEN in your terminal may do nothing.

Ollama’s FAQ says: if Ollama runs as systemd, set env vars via systemctl edit ollama.service, then daemon-reload and restart. (Ollama Documentation)

Do this:

sudo systemctl edit ollama.service

Add:

[Service]
Environment="HF_TOKEN=hf_your_token_here"

Then:

sudo systemctl daemon-reload
sudo systemctl restart ollama

Why this works: Hugging Face documents HF_TOKEN as the environment variable that authenticates to the Hub and overrides any locally stored token. (Hugging Face)
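To confirm the override actually took effect (an optional check, not part of Ollama’s documented steps):

# The variable should show up in the unit's effective environment:
systemctl show ollama --property=Environment
# Output like Environment=HF_TOKEN=hf_... confirms the drop-in was applied.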

Step C. Retry once, not 30 times

Hugging Face rate limits are computed over the last 5 minutes. If you spam retries, you stay rate-limited longer. The rate-limit page also points you to a billing dashboard that shows your current bucket usage. (Hugging Face)
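If you would rather script the retry, a minimal sketch is a loop that waits out the full window between attempts (the model path is a placeholder):

# Wait longer than the 5-minute window between attempts:
until ollama pull hf.co/<user>/<repo>; do
  echo "Pull failed; sleeping 6 minutes before retrying..."
  sleep 360
done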


Fix 2: if you are behind a proxy or VPN, fix that (common reason for “50% then drop”)

On locked-down networks, you can see mid-download failures and also weird manifest errors.

Ollama’s FAQ says:

  • Use HTTPS_PROXY for outbound model pulls.
  • Avoid HTTP_PROXY because model pulls are HTTPS-only and HTTP_PROXY can interrupt client connections. (Ollama Documentation)

If you need a proxy, set it the same systemd way:

sudo systemctl edit ollama.service

Add:

[Service]
Environment="HTTPS_PROXY=https://proxy.example.com:port"
# Do NOT set HTTP_PROXY for Ollama pulls

Then restart:

sudo systemctl daemon-reload
sudo systemctl restart ollama

Extra context: there are real reports that proxy env vars were not applied during manifest requests in some setups, producing manifest failures. (GitHub)
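A quick way to test whether HTTPS egress through the proxy works at all (the proxy address is a placeholder; curl honors the same variable):

HTTPS_PROXY=https://proxy.example.com:port curl -sI https://huggingface.co | head -n 1
# Expect something like "HTTP/2 200"; anything else points at the proxy.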


Fix 3: most reliable workaround if you just want it done today

Download the GGUF with Hugging Face tooling, then import locally into Ollama.

Why this helps:

  • You stop Ollama from doing repeated Hub metadata and resolver calls.
  • You can use Hugging Face’s CLI login flow (hf auth login) and its download tooling. (Hugging Face)
  • Then Ollama reads from disk, not the network.

Step A. Log in with HF CLI

hf auth login

Hugging Face’s CLI guide documents this as the normal login step. (Hugging Face)

Step B. Download the model file(s)

For example, Hugging Face documents hf download <repo> as a standard way to download model content. (Hugging Face)
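For instance (the repo and file names here are hypothetical; substitute the GGUF you actually want):

hf download <user>/<repo> model.Q4_K_M.gguf --local-dir ~/ollama-import/mygguf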

Step C. Import the GGUF into Ollama

Ollama’s official import docs say: create a Modelfile with FROM /path/to/file.gguf, then ollama create …. (Ollama Documentation)

Example:

mkdir -p ~/ollama-import/mygguf
cd ~/ollama-import/mygguf
# The Modelfile needs only a FROM line pointing at the downloaded GGUF:
printf 'FROM /absolute/path/to/model.gguf\n' > Modelfile
ollama create my-model -f Modelfile
ollama run my-model

This is the cleanest approach if your network is flaky or you keep hitting 429 mid-pull.


“Why did it stop at ~50%?”

Common reasons that fit your exact story:

  1. Rate limit mid-transfer: you exhaust the 5-minute request budget while pulling layers. The manifest or subsequent layer requests then get 429. (Hugging Face)
  2. Shared public IP: campus, office, VPN exit node, or mobile carrier CGNAT. Another user on the same IP can burn your quota; a quick check for this follows the list. The Ollama issue showing “IP doesn’t correspond to mine” is consistent with this. (GitHub)
  3. Proxy interference: proxy config not applied consistently to all pull steps, or CA cert issues. Ollama calls out the HTTPS proxy requirement and the dangers of HTTP_PROXY. (Ollama Documentation)
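For reason 2, one quick way to see the egress IP the Hub sees (ipify is just one example echo service, not anything Hugging Face prescribes):

curl -s https://api.ipify.org
# If this is not an address you recognize, you are likely behind a VPN, proxy, or CGNAT.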

Quick diagnostics (so you know which branch to follow)

1) Confirm Ollama is systemd-managed

systemctl status ollama --no-pager

If it is active as a service, use the systemd env var method above. (Ollama Documentation)

2) Watch Ollama logs during a pull

journalctl -u ollama -f

If you see 429 responses repeatedly, do token + slow retries.
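To pull out recent 429s specifically (standard journalctl time syntax; widen the window as needed):

journalctl -u ollama --since "10 minutes ago" | grep -i "429"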


Summary

  • pull model manifest: 429 is Hugging Face rate limiting in a 5-minute window. Token raises limits. (Hugging Face)
  • On Ubuntu, set HF_TOKEN in the systemd service, not only in your shell. (Ollama Documentation)
  • If you are behind a proxy, use HTTPS_PROXY and avoid HTTP_PROXY. (Ollama Documentation)
  • Most reliable fallback: hf auth login + hf download, then import GGUF locally via Modelfile. (Hugging Face)

That was all no help. Also, why on earth go off about proxies and VPNs?
I have a network cable plugged from my modem directly into my server.

1 Like

I have been getting this since yesterday for no reason. I really don’t understand why HF does this. Really.

  1. I don’t spam.
  2. I put an HF token (read) in the env of my docker compose.
  3. I don’t use any proxy or anything.
  4. I write the right hf.co ….. gguf address.

So.. what is the problem?

It seems I have been banned just like that, for nothing. (And as I don’t like that kind of injustice..)

2 Likes

Are you trying to download the model from Ollama or Hugging Face? If you are pulling the model from Ollama, it wouldn’t ask you for an HF token; in that case you might just have to wait, or create an Ollama account and sign in through the terminal.

If you are downloading from Hugging Face then yeah, you get rate limited pretty heavily when trying to download without an account/auth. If you don’t have an HF token, I’d strongly suggest you get one. They are free and I’ve had zero issues downloading huge models.

1 Like

As I said, I tried to pull the model from HF. I have an account with an HF token key (like you said), but I am still blocked by the 429 error, even though I set the HF token in my docker-compose .env.

I have no problems pulling from Ollama (strangely, they do not block, so..)

I have an IP which is rate limited even though I don’t spam (once yesterday, again this morning),
so the 5-minute rule.. euh..

If I did not have an HF account, I could understand it. But that’s not the case.

1 Like

huh - yeah that’s weird then. I’ve only seen a 429 error when downloading from an unauthenticated session. The other thing I’d suggest if you haven’t already is just generating a new HF token and trying with that one. Otherwise, reaching out directly to HF is probably your best move.

1 Like

It seems to be fine this morning… I don’t know what happened.. Strange case, ahah.
If I get that thing again, I will reach out to HF directly.

2 Likes