Instructions for using BSC-LT/salamandraTA-7B-instruct-GGUF with libraries, inference providers, notebooks, and local apps.

Libraries

How to use BSC-LT/salamandraTA-7B-instruct-GGUF with Transformers:

# Use a pipeline as a high-level helper
# Warning: Pipeline type "translation" is no longer supported in transformers v5.
# You must load the model directly (see below) or downgrade to v4.x with:
# 'pip install "transformers<5.0.0"'
from transformers import pipeline

pipe = pipeline("translation", model="BSC-LT/salamandraTA-7B-instruct-GGUF")
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("BSC-LT/salamandraTA-7B-instruct-GGUF")
model = AutoModelForCausalLM.from_pretrained("BSC-LT/salamandraTA-7B-instruct-GGUF")
messages = [
    {"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
	messages,
	add_generation_prompt=True,
	tokenize=True,
	return_dict=True,
	return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))

llama-cpp-python

How to use BSC-LT/salamandraTA-7B-instruct-GGUF with llama-cpp-python:

# !pip install llama-cpp-python

from llama_cpp import Llama

llm = Llama.from_pretrained(
	repo_id="BSC-LT/salamandraTA-7B-instruct-GGUF",
	filename="salamandraTA_7B_inst_q4.gguf",
)

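# create_chat_completion expects a list of role/content message dicts, not a bare string.
llm.create_chat_completion(
	messages=[
		{"role": "user", "content": "Translate the following text from Russian into English.\nRussian: Меня зовут Вольфганг и я живу в Берлине \nEnglish:"},
	]
)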
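
llama-cpp-python returns an OpenAI-style response dictionary rather than a plain string. Below is a minimal sketch of pulling the generated text out of it, assuming the usual `choices[0]["message"]["content"]` layout. The helper name `extract_text` and the sample response are illustrative only and are not part of llama-cpp-python.

```python
from typing import Any


def extract_text(response: dict[str, Any]) -> str:
    """Return the assistant text from an OpenAI-style chat completion dict."""
    choices = response.get("choices") or []
    if not choices:
        raise ValueError("response contains no choices")
    # Each choice carries a message dict with 'role' and 'content' keys.
    return choices[0]["message"]["content"].strip()


# Hypothetical response, shaped like the dict a chat completion call returns.
sample_response = {
    "choices": [
        {
            "index": 0,
            "message": {
                "role": "assistant",
                "content": " My name is Wolfgang and I live in Berlin.",
            },
            "finish_reason": "stop",
        }
    ]
}

print(extract_text(sample_response))
```

In real use, pass the return value of `llm.create_chat_completion(...)` to the helper instead of `sample_response`.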

llama.cpp

How to use BSC-LT/salamandraTA-7B-instruct-GGUF with llama.cpp:

Install from brew

brew install llama.cpp
# Start a local OpenAI-compatible server with a web UI:
llama-server -hf BSC-LT/salamandraTA-7B-instruct-GGUF
# Run inference directly in the terminal:
llama-cli -hf BSC-LT/salamandraTA-7B-instruct-GGUF

Install from WinGet (Windows)

winget install llama.cpp
# Start a local OpenAI-compatible server with a web UI:
llama-server -hf BSC-LT/salamandraTA-7B-instruct-GGUF
# Run inference directly in the terminal:
llama-cli -hf BSC-LT/salamandraTA-7B-instruct-GGUF

Use pre-built binary

# Download pre-built binary from:
# https://github.com/ggerganov/llama.cpp/releases
# Start a local OpenAI-compatible server with a web UI:
./llama-server -hf BSC-LT/salamandraTA-7B-instruct-GGUF
# Run inference directly in the terminal:
./llama-cli -hf BSC-LT/salamandraTA-7B-instruct-GGUF

Build from source code

git clone https://github.com/ggerganov/llama.cpp.git
cd llama.cpp
cmake -B build
cmake --build build -j --target llama-server llama-cli
# Start a local OpenAI-compatible server with a web UI:
./build/bin/llama-server -hf BSC-LT/salamandraTA-7B-instruct-GGUF
# Run inference directly in the terminal:
./build/bin/llama-cli -hf BSC-LT/salamandraTA-7B-instruct-GGUF

Use Docker

docker model run hf.co/BSC-LT/salamandraTA-7B-instruct-GGUF
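
Once `llama-server` is running, it can be queried over its OpenAI-compatible HTTP API. The sketch below uses only the Python standard library and assumes the server listens on `http://localhost:8080` (the llama-server default unless `--port` is changed). The helper names `build_payload` and `chat` are hypothetical, and the `temperature` and `max_tokens` defaults simply mirror the values in the model card's vLLM example.

```python
import json
import urllib.request

# Assumption: llama-server is listening on its default host and port.
BASE_URL = "http://localhost:8080"


def build_payload(prompt: str, temperature: float = 0.1, max_tokens: int = 200) -> dict:
    """Build the JSON body for a /v1/chat/completions request."""
    return {
        "messages": [{"role": "user", "content": prompt}],
        "temperature": temperature,
        "max_tokens": max_tokens,
    }


def chat(prompt: str, base_url: str = BASE_URL) -> str:
    """POST the prompt to a running llama-server and return the reply text."""
    body = json.dumps(build_payload(prompt)).encode("utf-8")
    request = urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=body,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        payload = json.load(response)
    return payload["choices"][0]["message"]["content"]
```

With the server up, a call such as `chat("Translate the following text from Spanish into English.\nSpanish: Hola \nEnglish:")` should return the model's translation. Nothing is sent over the network until `chat` is called.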

Ollama
How to use BSC-LT/salamandraTA-7B-instruct-GGUF with Ollama:
```
ollama run hf.co/BSC-LT/salamandraTA-7B-instruct-GGUF
```

Unsloth Studio

How to use BSC-LT/salamandraTA-7B-instruct-GGUF with Unsloth Studio:

Install Unsloth Studio (macOS, Linux, WSL)

curl -fsSL https://unsloth.ai/install.sh | sh
# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for BSC-LT/salamandraTA-7B-instruct-GGUF to start chatting

Install Unsloth Studio (Windows)

irm https://unsloth.ai/install.ps1 | iex
# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for BSC-LT/salamandraTA-7B-instruct-GGUF to start chatting

Using HuggingFace Spaces for Unsloth

# No setup required
# Open https://huggingface.co/spaces/unsloth/studio in your browser
# Search for BSC-LT/salamandraTA-7B-instruct-GGUF to start chatting

Docker Model Runner
How to use BSC-LT/salamandraTA-7B-instruct-GGUF with Docker Model Runner:
```
docker model run hf.co/BSC-LT/salamandraTA-7B-instruct-GGUF
```

Lemonade

How to use BSC-LT/salamandraTA-7B-instruct-GGUF with Lemonade:

Pull the model

# Download Lemonade from https://lemonade-server.ai/
lemonade pull BSC-LT/salamandraTA-7B-instruct-GGUF

Run and chat with the model

lemonade run user.salamandraTA-7B-instruct-GGUF-{{QUANT_TAG}}

List all available models

lemonade list

SalamandraTA-7B-instruct-GGUF Model Card

This model is the GGUF-quantized version of SalamandraTA-7b-instruct.

The model weights are quantized from FP16 to Q8_0 (8-bit quantization), Q4_K_M (4-bit weights with K-means clustering quantization) and Q3_K_M (3-bit weights with K-means clustering quantization) using the llama.cpp framework. Inference with this model can be done using vLLM.
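
To get a feel for what these bit widths mean in practice, the sketch below estimates a lower bound on the size of the weights for each variant. It assumes roughly 8 billion parameters and uses the nominal bit width of each format; both are simplifying assumptions. Real GGUF files are somewhat larger, because each block also stores scale factors and some tensors are kept at higher precision, so check the file sizes in the repository for exact figures.

```python
def approx_weight_size_gb(n_params: float, bits_per_weight: float) -> float:
    """Lower-bound size of the weights in gigabytes (1 GB = 1e9 bytes)."""
    return n_params * bits_per_weight / 8 / 1e9


# Assumptions: about 8e9 parameters (rounded) and nominal bits per weight.
N_PARAMS = 8e9
NOMINAL_BITS = {"FP16": 16, "Q8_0": 8, "Q4_K_M": 4, "Q3_K_M": 3}

for name, bits in NOMINAL_BITS.items():
    size = approx_weight_size_gb(N_PARAMS, bits)
    print(f"{name}: at least {size:.1f} GB")
```

The estimate is only meant to help choose between variants for a given amount of RAM or VRAM; it ignores the KV cache and runtime overhead.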

SalamandraTA-7b-instruct is a translation LLM that has been instruction-tuned from SalamandraTA-7b-base. The base model results from continually pre-training Salamandra-7b on parallel data; it has not been published and is reserved for internal use. SalamandraTA-7b-instruct is proficient in 35 European languages (plus 3 varieties) and supports the following translation-related tasks: sentence-level translation, paragraph-level translation, document-level translation, automatic post-editing, grammar checking, machine translation evaluation, alternative translations, named-entity recognition, and context-aware translation.

DISCLAIMER: This version of Salamandra is tailored exclusively for translation tasks. It lacks chat capabilities and has not been trained with any chat instructions.

The entire Salamandra family is released under a permissive Apache 2.0 license.

How to Use

The following example code works under Python 3.10.4, vllm==0.7.3, torch==2.5.1 and torchvision==0.20.1, though it should run on any current version of the libraries. This is an example of translation using the model:

from huggingface_hub import snapshot_download
from vllm import LLM, SamplingParams

model_dir = snapshot_download(repo_id="BSC-LT/salamandraTA-7B-instruct-GGUF", revision="main")
model_name = "salamandraTA_7b_inst_q4.gguf"

llm = LLM(model=model_dir + '/' + model_name, tokenizer=model_dir)

source = "Spanish"
target = "English"
sentence = "Ayer se fue, tomó sus cosas y se puso a navegar. Una camisa, un pantalón vaquero y una canción, dónde irá, dónde irá. Se despidió, y decidió batirse en duelo con el mar. Y recorrer el mundo en su velero. Y navegar, nai-na-na, navegar."

prompt = f"Translate the following text from {source} into {target}.\n{source}: {sentence} \n{target}:"
messages = [{'role': 'user', 'content': prompt}]

outputs = llm.chat(messages,
                   sampling_params=SamplingParams(
                       temperature=0.1,
                       stop_token_ids=[5],
                       max_tokens=200)
                   )[0].outputs

print(outputs[0].text)
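
When translating many sentences or language pairs, it can help to wrap the prompt layout from the example above in a small helper: an instruction line, then the source text labeled with its language, then the target language label on its own line. The function names `build_translation_prompt` and `build_messages` are hypothetical and are not part of vLLM or of the model's tooling; this is a sketch, not the authors' reference implementation.

```python
def build_translation_prompt(source: str, target: str, text: str) -> str:
    """Format a translation request in the layout used by the example above."""
    return (
        f"Translate the following text from {source} into {target}.\n"
        f"{source}: {text} \n"
        f"{target}:"
    )


def build_messages(source: str, target: str, text: str) -> list[dict[str, str]]:
    """Wrap the prompt in a single-turn chat message list."""
    return [{"role": "user", "content": build_translation_prompt(source, target, text)}]


messages = build_messages("Spanish", "English", "Y recorrer el mundo en su velero.")
print(messages[0]["content"])
```

The resulting `messages` list has the same shape as the one passed to `llm.chat` in the example above.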

Additional information

Author

The Language Technologies Unit from Barcelona Supercomputing Center.

Contact

For further information, please send an email to langtech@bsc.es.

Copyright

Funding

This work has been promoted and financed by the Government of Catalonia through the Aina Project.

This work is funded by the Ministerio para la Transformación Digital y de la Función Pública - Funded by EU – NextGenerationEU within the framework of ILENIA Project with reference 2022/TL22/00215337.

Acknowledgements

The success of this project has been made possible thanks to the invaluable contributions of our partners in the ILENIA Project, HiTZ and CiTIUS. Their efforts have been instrumental in advancing our work, and we sincerely appreciate their help and support.

Disclaimer

Be aware that the model may contain biases or other unintended distortions. When third parties deploy systems or provide services based on this model, or use the model themselves, they bear the responsibility for mitigating any associated risks and ensuring compliance with applicable regulations, including those governing the use of Artificial Intelligence.

The Barcelona Supercomputing Center, as the owner and creator of the model, shall not be held liable for any outcomes resulting from third-party use.

License

Apache License, Version 2.0

Downloads last month: 184

Format: GGUF
Model size: 8B params
Architecture: llama

Model tree for BSC-LT/salamandraTA-7B-instruct-GGUF

Base model: BSC-LT/salamandra-7b
Finetuned: BSC-LT/salamandraTA-7b-instruct
Quantized (4): this model
