Ghost AI
The ultra-lean, on-device AI model behind Ghost AI on Solana.
This model is part of Ghost AI, an on-device, privacy-first crypto application running on Solana. It powers the smallest, fastest local AI surface inside the Ghost AI app, which makes it a good fit for mobile and low-resource devices. Your prompts, wallet context, and financial data never leave your device.
Ghost AI is a fork of LiquidAI's LFM 2.5 (350M), rebranded and packaged for on-device crypto and finance use cases. All underlying model weights, architecture, and capabilities derive from LFM 2.5; credit and licensing terms remain with the upstream project.
Ghost AI (350M) is the base / lite tier of the Ghost AI model family. For higher-quality reasoning, see Ghost AI Pro (1.2B, based on LFM 2.5 1.2B Instruct).
Why Ghost AI
- Ultra-light. Q6_K quantization, ~280 MB on disk. Runs comfortably on phones, embedded devices, and low-end laptops.
- On-device, always private. Runs locally via llama.cpp, LM Studio, Ollama, or any GGUF-compatible runtime. Nothing is logged or sent to a server.
- Built for finance + crypto. Wallet-aware prompts, on-chain transaction explanations, DeFi context, and personal-finance reasoning.
- Native to Ghost AI on Solana. The default local model for the Ghost AI privacy-first crypto app on Solana.
- Real-time. Designed for instant, low-latency responses on CPU; no GPU required.
Ghost AI model family
| Tier | Repo | Base | Quantization | Size | Best for |
|---|---|---|---|---|---|
| Ghost AI (this) | immortaltatsu/ghost-ai-gguf | LFM 2.5 350M | Q6_K | ~280 MB | Mobile, embedded, instant responses |
| Ghost AI Pro | immortaltatsu/ghost-ai-pro-gguf | LFM 2.5 1.2B Instruct | Q4_K_S | ~700 MB | Desktop, deeper reasoning |
Files
| File | Quantization | Size | Notes |
|---|---|---|---|
| ghost-ai-Q6_K.gguf | Q6_K | ~280 MB | Recommended. High-quality 6-bit quantization with minimal quality loss for a 350M model. |
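As a rough sanity check on the size listed above, the on-disk footprint can be estimated from the parameter count and the average bits per weight. The sketch below assumes Q6_K averages 6.5625 bits per weight (the figure llama.cpp reports for this format) and treats 350M as the nominal parameter count. The function name is hypothetical, and a real GGUF file mixes tensor types and adds metadata, so the result is only a ballpark.

```python
def estimate_gguf_size_mb(n_params: float, bits_per_weight: float) -> float:
    """Rough on-disk size in megabytes (10^6 bytes), ignoring metadata."""
    return n_params * bits_per_weight / 8 / 1e6


# Assumption: Q6_K averages 6.5625 bits per weight.
Q6_K_BITS_PER_WEIGHT = 6.5625

# Assumption: 350M is the nominal parameter count of the base model.
size_mb = estimate_gguf_size_mb(350e6, Q6_K_BITS_PER_WEIGHT)
print(f"{size_mb:.0f} MB")  # prints "287 MB"
```

The estimate lands close to the ~280 MB in the table; the remaining gap is within what the nominal parameter count and per-tensor quantization choices would explain.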
Usage
llama.cpp
./llama-cli -m ghost-ai-Q6_K.gguf -p "Explain this Solana transaction: ..." -n 256
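The same prompt can be run from Python through the llama-cpp-python bindings. This is a minimal sketch, assuming the package is installed (`pip install llama-cpp-python`) and that `ghost-ai-Q6_K.gguf` sits in the working directory. The helper names `build_messages` and `generate` are illustrative, and the 4096-token context is an assumption rather than a documented requirement.

```python
def build_messages(user_prompt, system_prompt=None):
    """Build an OpenAI-style chat message list."""
    messages = []
    if system_prompt:
        messages.append({"role": "system", "content": system_prompt})
    messages.append({"role": "user", "content": user_prompt})
    return messages


def generate(model_path, user_prompt, max_tokens=256):
    """Load a local GGUF file and return one chat completion as text."""
    # Imported lazily so the helper above works without the package installed.
    from llama_cpp import Llama

    # n_ctx=4096 is an assumed context size, not a value from the model card.
    llm = Llama(model_path=model_path, n_ctx=4096, verbose=False)
    result = llm.create_chat_completion(
        messages=build_messages(user_prompt),
        max_tokens=max_tokens,
    )
    return result["choices"][0]["message"]["content"]
```

Calling `generate("ghost-ai-Q6_K.gguf", "Explain this Solana transaction: ...")` mirrors the CLI command above, with `max_tokens=256` playing the role of `-n 256`.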
Ollama (Modelfile)
FROM ./ghost-ai-Q6_K.gguf
PARAMETER temperature 0.7
PARAMETER num_ctx 4096
Save the lines above as Modelfile, then build and run the model:
ollama create ghost-ai -f Modelfile
ollama run ghost-ai
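Once the model has been created, it can also be queried programmatically through Ollama's local HTTP API. The sketch below assumes Ollama is running on its default port (11434) and that the model was registered as `ghost-ai`, as in the commands above. The helper names are hypothetical.

```python
import json
import urllib.request

# Assumption: Ollama is listening on its default local address.
OLLAMA_CHAT_URL = "http://localhost:11434/api/chat"


def build_chat_payload(prompt, model="ghost-ai"):
    """Build a non-streaming request body for Ollama's /api/chat endpoint."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,
    }


def chat(prompt, model="ghost-ai"):
    """Send one prompt to the local Ollama server and return the reply text."""
    data = json.dumps(build_chat_payload(prompt, model)).encode("utf-8")
    request = urllib.request.Request(
        OLLAMA_CHAT_URL,
        data=data,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    return body["message"]["content"]
```

Because the request goes to `localhost`, the prompt stays on the machine, which matches the on-device design described in this card.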
LM Studio
Download the .gguf file and load it directly from the LM Studio model browser.
Intended Use
Ghost AI is intended for:
- Powering the on-device AI layer of the Ghost AI crypto app on Solana
- Private wallet, transaction, and portfolio explanations
- Lightweight personal finance and budgeting assistance
- DeFi term clarification and on-chain activity narration
- Embedded assistants in mobile crypto and fintech apps where privacy is non-negotiable
Limitations
Ghost AI is a small (350M) model: fast and private, but limited in long-form reasoning compared to Ghost AI Pro. It is not a substitute for licensed financial advice, accounting, or legal counsel, and it is not a trading or transaction-signing agent. Verify any numerical reasoning, dates, ticker symbols, token addresses, on-chain claims, and regulatory statements against authoritative sources before acting on them. Always confirm wallet operations through the Ghost AI app's signing flow, never from raw model output.
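One cheap first step when checking an address that appears in model output is a format check. The sketch below relies on Solana addresses being base58-encoded 32-byte public keys; the function names are hypothetical and not part of the Ghost AI app. Passing this check only means the string is well formed. It does not confirm that the address belongs to the intended token or account, so it complements a lookup against an authoritative source rather than replacing it.

```python
B58_ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"


def b58decode(text):
    """Decode a base58 string to bytes; raise ValueError on invalid characters."""
    num = 0
    for ch in text:
        idx = B58_ALPHABET.find(ch)
        if idx == -1:
            raise ValueError(f"invalid base58 character: {ch!r}")
        num = num * 58 + idx
    body = num.to_bytes((num.bit_length() + 7) // 8, "big")
    # Each leading '1' in base58 encodes one leading zero byte.
    leading_zeros = len(text) - len(text.lstrip("1"))
    return b"\x00" * leading_zeros + body


def looks_like_solana_address(text):
    """Return True if text is valid base58 that decodes to exactly 32 bytes."""
    try:
        return len(b58decode(text)) == 32
    except ValueError:
        return False
```

For example, `looks_like_solana_address("1" * 32)` returns `True` (32 zero bytes, the System Program address), while a string containing `0`, `O`, `I`, or `l` is rejected because those characters are not in the base58 alphabet.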
About Ghost AI
Ghost AI is an on-device, privacy-first crypto app built on Solana. It brings local AI directly into the wallet experience, letting users explore transactions, balances, and DeFi context with a model that runs entirely on their own device. Nothing is logged, sent to a server, or shared with a third party.
This model is the lightweight default that powers Ghost AI's on-device intelligence layer on resource-constrained devices.
Upstream
Ghost AI is a fork of LiquidAI/LFM2.5-350M. The GGUF quantizations are sourced from LiquidAI/LFM2.5-350M-GGUF.
License
Released for research and on-device application use. This model is governed by the upstream LFM Open License issued by Liquid AI. Please review the original terms before redistribution or commercial deployment.
Ghost AI: intelligence that stays with you.