---
license: apache-2.0
language:
- en
library_name: transformers
---
## Model Details
<img alt="OLMo Logo" src="https://huggingface.co/datasets/allenai/blog-images/resolve/main/olmo2/olmo.png" width="242px" style="margin-left:'auto' margin-right:'auto' display:'block'">
---
## Model Card for OLMo 2 1B Early Training Checkpoints
We introduce the OLMo 2 1B Early Training checkpoints, a collection of checkpoints saved at frequent intervals early in the training of our 1B model. We release them as a resource for anyone interested in studying early training dynamics. For our official OLMo 2 1B model, see [OLMo-2-0425-1B](https://huggingface.co/allenai/OLMo-2-0425-1B), or view the entire collection of OLMo 2 models [here](https://huggingface.co/collections/allenai/olmo-2-674117b93ab84e98afc72edc).
We generated these checkpoints after the original training of our OLMo 2 1B model, in a separate run that starts from step 0 of OLMo 2 1B and saves a checkpoint every 1,000 steps for 37,000 steps.
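You can enumerate the available checkpoint branches programmatically. Below is a minimal sketch using `huggingface_hub.list_repo_refs`, assuming each checkpoint branch follows the `stage1-step{N}-tokens{T}B` naming pattern used throughout this card:

```python
import re
from huggingface_hub import list_repo_refs

# Each saved checkpoint lives on its own branch of the repository.
refs = list_repo_refs("allenai/OLMo-2-0425-1B-early-training")
steps = sorted(
    int(m.group(1))
    for b in refs.branches
    if (m := re.search(r"step(\d+)", b.name))
)
print(f"{len(steps)} checkpoints, from step {steps[0]} to step {steps[-1]}")
```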
### A Note on these Checkpoints
These checkpoints use the same architecture and starting checkpoint as the official OLMo 2 1B, but they aren’t identical to the original run due to the non-deterministic nature of LLM training environments, so performance may differ slightly. If you’d like to compare this run against our original OLMo 2 1B, the following checkpoints are present in both repositories (see the example after this list):
- `stage1-step0-tokens0B` -- the official OLMo 2 1B checkpoint loaded as the starting point for this run
- `stage1-step10000-tokens21B`
- `stage1-step20000-tokens42B`
- `stage1-step30000-tokens63B`
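A minimal comparison sketch, assuming the shared revision names above exist in both repositories and that two 1B models fit in memory; the probe sentence is arbitrary:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

revision = "stage1-step10000-tokens21B"  # one of the revisions shared by both repositories
official = AutoModelForCausalLM.from_pretrained("allenai/OLMo-2-0425-1B", revision=revision)
rerun = AutoModelForCausalLM.from_pretrained("allenai/OLMo-2-0425-1B-early-training", revision=revision)
tokenizer = AutoTokenizer.from_pretrained("allenai/OLMo-2-0425-1B-early-training")

inputs = tokenizer("Language models learn", return_tensors="pt", return_token_type_ids=False)
with torch.no_grad():
    top_official = torch.topk(official(**inputs).logits[0, -1], 5).indices.tolist()
    top_rerun = torch.topk(rerun(**inputs).logits[0, -1], 5).indices.tolist()

# Checkpoints from the same step should rank similar next tokens, though small
# differences are expected because the two runs are not bitwise identical.
print(tokenizer.convert_ids_to_tokens(top_official))
print(tokenizer.convert_ids_to_tokens(top_rerun))
```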
## Inference
You can access these checkpoints using the standard Hugging Face Transformers library:
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the model and tokenizer from the default branch.
olmo_early_training = AutoModelForCausalLM.from_pretrained("allenai/OLMo-2-0425-1B-early-training")
tokenizer = AutoTokenizer.from_pretrained("allenai/OLMo-2-0425-1B-early-training")

# Generate a completion with top-k / nucleus sampling.
message = ["The capital of the United States is "]
inputs = tokenizer(message, return_tensors='pt', return_token_type_ids=False)
response = olmo_early_training.generate(**inputs, max_new_tokens=100, do_sample=True, top_k=50, top_p=0.95)
print(tokenizer.batch_decode(response, skip_special_tokens=True)[0])
```
To access a specific checkpoint, you can specify the revision:
```python
olmo_early_training = AutoModelForCausalLM.from_pretrained("allenai/OLMo-2-0425-1B-early-training", revision="stage1-step20000-tokens42B")
```
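For studies of early training dynamics, you can loop over revisions and track a metric of interest. The sketch below computes the loss of a fixed probe sentence at a few checkpoints; the revision names come from the list above and the probe text is arbitrary:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

repo = "allenai/OLMo-2-0425-1B-early-training"
tokenizer = AutoTokenizer.from_pretrained(repo)
probe = tokenizer("Paris is the capital of France.", return_tensors="pt", return_token_type_ids=False)

# Expect the loss on the probe text to fall as training progresses.
for revision in ["stage1-step0-tokens0B", "stage1-step10000-tokens21B", "stage1-step20000-tokens42B"]:
    model = AutoModelForCausalLM.from_pretrained(repo, revision=revision)
    with torch.no_grad():
        loss = model(**probe, labels=probe["input_ids"]).loss
    print(f"{revision}: loss = {loss.item():.3f}")
```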
## Model Description
- Developed by: Allen Institute for AI (Ai2)
- Model type: a Transformer-style autoregressive language model.
- Language(s) (NLP): English
- License: The models and checkpoints are licensed under Apache 2.0. They are intended for research and educational use in accordance with [Ai2's Responsible Use Guidelines](https://allenai.org/responsible-use).
- Contact: Technical inquiries: olmo@allenai.org. Press: press@allenai.org
## Bias, Risks, and Limitations
Like any base or fine-tuned language model, OLMo can be prompted by users to generate harmful or sensitive content. Such content may also be produced unintentionally, especially in cases involving bias, so we recommend that users consider the risks when applying this technology. Additionally, statements from OLMo, as from any LLM, can be inaccurate, so facts should be verified.