
LLM2Vec is a simple recipe for converting decoder-only LLMs into text encoders. It consists of three steps: 1) enabling bidirectional attention, 2) masked next-token prediction (MNTP), and 3) unsupervised contrastive learning. The model can be further fine-tuned to achieve state-of-the-art performance.
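To make step 1 concrete, here is a minimal, purely illustrative sketch (an assumption for exposition, not the library's internals): a decoder-only LLM uses a causal attention mask, and the recipe swaps it for an all-ones bidirectional mask so every token can attend to every other token.

```python
def causal_mask(n):
    # Decoder-only default: token i may attend only to tokens 0..i
    # (lower-triangular mask).
    return [[1 if j <= i else 0 for j in range(n)] for i in range(n)]

def bidirectional_mask(n):
    # LLM2Vec step 1: token i may attend to every token (all-ones mask).
    return [[1] * n for _ in range(n)]

print(causal_mask(3))        # [[1, 0, 0], [1, 1, 0], [1, 1, 1]]
print(bidirectional_mask(3)) # [[1, 1, 1], [1, 1, 1], [1, 1, 1]]
```

MNTP (step 2) then adapts the model to this new attention pattern, and contrastive learning (step 3) shapes the representation space for similarity tasks.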

Usage

import torch

from llm2vec import LLM2Vec

model = LLM2Vec.from_pretrained(
    "standardmodelbio/smb-mntp-llama-3.1-8b-v1",
    torch_dtype=torch.bfloat16,
    trust_remote_code=True,
    max_length=4096,
    attn_implementation="flash_attention_2",
)

text = ["StandardModel"] * 8
embeddings = model.encode(text)
print(embeddings)
print(embeddings.shape)

"""
tensor([[ 1.1250,  0.7070, -0.1475,  ...,  0.8320,  0.2852, -0.3691],
        [ 1.1250,  0.7070, -0.1475,  ...,  0.8320,  0.2852, -0.3691],
        [ 1.1250,  0.7070, -0.1475,  ...,  0.8320,  0.2852, -0.3691],
        ...,
        [ 1.1250,  0.7070, -0.1475,  ...,  0.8320,  0.2852, -0.3691],
        [ 1.1250,  0.7070, -0.1475,  ...,  0.8320,  0.2852, -0.3691],
        [ 1.1250,  0.7070, -0.1475,  ...,  0.8320,  0.2852, -0.3691]])
torch.Size([8, 4096])
"""

License

This model is proprietary and for internal use only.

Training Data

We employ a comprehensive training dataset of proprietary real-world EHR records spanning fifteen distinct clinical indications, with a strong emphasis on oncology. The collection includes over 1.2M patients and approximately 200M clinical events, providing rich longitudinal data for training our models. Its diversity enables evaluation across 10 distinct predictive tasks, allowing thorough assessment of temporal reasoning capabilities across varied clinical scenarios. See https://arxiv.org/abs/2509.25591 for more details.
