Pre-converted Gemma 3 270M IT model optimized for Apple Neural Engine inference using ANEMLL.
This is a monolithic (single-file) CoreML model with in-model argmax, ideal for quick testing and development on any Apple Silicon device.
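Because the argmax is baked into the model graph, the host-side decode loop never touches logits: each call returns the next token ID directly, and the loop simply feeds IDs back in. A minimal sketch of that loop, with a deterministic stub standing in for the real CoreML call (the function and constant names here are illustrative, not ANEMLL's actual API):

```python
VOCAB = 1000  # illustrative vocabulary size, not Gemma 3's real one
EOS_ID = 1    # illustrative end-of-sequence token ID

def predict_next_token(token_ids: list[int]) -> int:
    """Stub for the CoreML call. The real model runs on the ANE and,
    with in-model argmax, returns a token ID rather than a logits vector."""
    return (token_ids[-1] + 1) % VOCAB

def greedy_decode(prompt_ids: list[int], max_new_tokens: int = 8) -> list[int]:
    """Append up to max_new_tokens IDs, stopping early at EOS."""
    ids = list(prompt_ids)
    for _ in range(max_new_tokens):
        next_id = predict_next_token(ids)
        if next_id == EOS_ID:
            break
        ids.append(next_id)
    return ids

print(greedy_decode([2, 5, 7], max_new_tokens=4))  # → [2, 5, 7, 8, 9, 10, 11]
```

With a logits-returning model the loop would instead pull a vocabulary-sized float array across the CPU/ANE boundary each step and argmax it on the host; moving the argmax into the graph avoids that transfer, which is why it suits quick testing.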
| Property | Value |
|---|---|
| Base Model | google/gemma-3-270m-it |
| Architecture | Gemma 3 (gemma3_text) |
| Parameters | 270M |
| Context Length | 512 |
| Batch Size | 64 |
| Quantization | LUT6 (6-bit, per-channel group size 4) |
| Argmax | In-model (outputs token IDs) |
| Format | Monolithic (single CoreML file) |
| Dedup | ANEMLL-Dedup enabled |
| ANEMLL Version | 0.3.5 |
| Model Size | ~335 MB (compiled) |
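LUT quantization stores each weight as a 6-bit index into a small lookup table, so a group of channels shares one 64-entry table of float values. A rough NumPy sketch of the idea, interpreting "group size 4" as four output channels per table; the quantile-based table construction is a cheap illustrative stand-in, not necessarily the algorithm ANEMLL uses:

```python
import numpy as np

BITS = 6
LEVELS = 1 << BITS   # 64-entry lookup table
GROUP_CHANNELS = 4   # output channels that share one table (assumed grouping)

def lut_quantize(w: np.ndarray) -> tuple[np.ndarray, np.ndarray]:
    """Quantize a (out_channels, in_features) weight matrix.
    Returns (indices, tables): uint8 6-bit indices plus one
    64-entry table per group of 4 output channels."""
    out_ch, _ = w.shape
    n_groups = out_ch // GROUP_CHANNELS
    indices = np.empty(w.shape, dtype=np.uint8)
    tables = np.empty((n_groups, LEVELS), dtype=w.dtype)
    for g in range(n_groups):
        rows = slice(g * GROUP_CHANNELS, (g + 1) * GROUP_CHANNELS)
        vals = w[rows].ravel()
        # Place table entries at evenly spaced quantiles of the group's
        # weight distribution (a cheap stand-in for k-means centroids).
        tables[g] = np.quantile(vals, np.linspace(0.0, 1.0, LEVELS))
        # Map every weight in the group to its nearest table entry.
        nearest = np.abs(vals[:, None] - tables[g][None, :]).argmin(axis=1)
        indices[rows] = nearest.reshape(w[rows].shape)
    return indices, tables

def lut_dequantize(indices: np.ndarray, tables: np.ndarray) -> np.ndarray:
    """Reconstruct the float weights by table lookup."""
    out = np.empty(indices.shape, dtype=tables.dtype)
    for g in range(tables.shape[0]):
        rows = slice(g * GROUP_CHANNELS, (g + 1) * GROUP_CHANNELS)
        out[rows] = tables[g][indices[rows]]
    return out

rng = np.random.default_rng(0)
w = rng.standard_normal((8, 128)).astype(np.float32)
idx, tabs = lut_quantize(w)
w_hat = lut_dequantize(idx, tabs)
print(idx.dtype, tabs.shape, float(np.abs(w - w_hat).mean()))
```

The storage win is that each weight costs 6 bits plus a small shared table, instead of 16 or 32 bits, which is what brings the 270M-parameter model down to roughly 335 MB compiled.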
| File | Size | Description |
|---|---|---|
| gemma3_monolithic_full_lut6.mlmodelc/ | 335 MB | Compiled CoreML model (infer + prefill) |
| meta.yaml | 2 KB | Model configuration |
| tokenizer.json | 32 MB | Tokenizer data |
| tokenizer.model | 4.5 MB | SentencePiece model |
| tokenizer_config.json | 1.1 MB | Tokenizer configuration |
| chat_template.jinja | 1.5 KB | Chat template |
| config.json | 66 B | iOS tokenizer config |
```bash
# Clone with git-lfs
git lfs install
git clone https://huggingface.co/anemll/anemll-gemma-3-270m-it-ctx512-lut6

# Or use huggingface-cli
huggingface-cli download anemll/anemll-gemma-3-270m-it-ctx512-lut6 \
  --local-dir ~/Models/ANE/gemma3-270m
```
```bash
# Install ANEMLL
git clone https://github.com/Anemll/Anemll.git
cd Anemll
./create_uv_env.sh
source env-anemll/bin/activate
./install_dependencies.sh
```

```bash
# Chat with the model
python tests/chat.py \
  --meta ~/Models/ANE/gemma3-270m/meta.yaml \
  --prompt "Who are you?"

# Full conversation mode
python tests/chat_full.py \
  --meta ~/Models/ANE/gemma3-270m/meta.yaml
```
This model was converted using:
```bash
python tests/test_gemma3_model.py \
  --model google/gemma-3-270m-it \
  --lut 6,4 \
  --lut-embeddings 6,4 \
  --lut-lmhead 6,4 \
  --context 512 \
  --batch 64
```
Or equivalently:
```bash
./anemll/utils/convert_monolith.sh \
  --model google/gemma-3-270m-it \
  --output ./output \
  --lut 6,4 \
  --lut-embeddings 6,4 \
  --lut-lmhead 6,4 \
  --context 512 \
  --batch 64 \
  --argmax \
  --prefix gemma3
```
This model conversion is released under the MIT license. The base model (Gemma 3) is subject to Google's Gemma Terms of Use.