# Math Demo

## Clone the repo

```shell
git clone https://huggingface.co/ofirzaf/hebrew-math-tutor-v1-W4A16-G128 && cd hebrew-math-tutor-v1-W4A16-G128
```
## vLLM command for benchmarking

To run on Intel XPU, it is best to install the internal Intel vLLM build:

```shell
git clone https://github.com/intel-innersource/applications.ai.gpu.vllm-xpu vllm-xpu && cd vllm-xpu
git checkout release/2601/vllm-xpu-0.14.0
no_proxy=intel.com,127.0.0.1,localhost uv pip install -r requirements/xpu.txt --index-strategy unsafe-best-match
VLLM_TARGET_DEVICE=xpu python setup.py install
```
Then you can serve the model with:

```shell
vllm serve ofirzaf/hebrew-math-tutor-v1-W4A16-G128 --no_enable_prefix_caching --config ./qconfig.yaml
```

`--no_enable_prefix_caching` is only needed for benchmarking; if you omit this flag you may get some speedup from prefix caching.
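Once the server is up, you can query it over vLLM's OpenAI-compatible REST API. The sketch below assumes the server's default address and port (`http://localhost:8000`); adjust `BASE_URL` if you serve elsewhere. It uses only the standard library, so no extra client package is needed:

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000/v1"  # vLLM's default OpenAI-compatible endpoint (assumption)
MODEL = "ofirzaf/hebrew-math-tutor-v1-W4A16-G128"

def build_chat_request(prompt: str, max_tokens: int = 512) -> dict:
    """Build an OpenAI-style chat-completions payload for the served model."""
    return {
        "model": MODEL,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }

def ask(prompt: str) -> str:
    """POST the payload to the running vLLM server and return the reply text."""
    body = json.dumps(build_chat_request(prompt)).encode("utf-8")
    req = urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=body,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        data = json.loads(resp.read())
    return data["choices"][0]["message"]["content"]

# Inspect the payload locally; calling ask(...) requires the server above to be running.
payload = build_chat_request("2 + 2 = ?")
print(payload["model"])
```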
## Start Streamlit app

```shell
MY_MODEL=ofirzaf/hebrew-math-tutor-v1-W4A16-G128 streamlit run ./app.py --server.port=8501 --server.address=0.0.0.0
```
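The launch command passes the model name to the app through the `MY_MODEL` environment variable. A hypothetical sketch of how `app.py` might pick it up (the `resolve_model` helper and the fallback default are assumptions, not the actual app code):

```python
import os

# Fallback if MY_MODEL is not set (assumption for illustration).
DEFAULT_MODEL = "ofirzaf/hebrew-math-tutor-v1-W4A16-G128"

def resolve_model() -> str:
    """Return the model name to use, preferring the MY_MODEL env var."""
    return os.environ.get("MY_MODEL", DEFAULT_MODEL)

print(resolve_model())
```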
## Model tree for ofirzaf/hebrew-math-tutor-v1-W4A16-G128

- Base model: Qwen/Qwen3-4B-Thinking-2507
- Finetuned from: Intel/hebrew-math-tutor-v1