---
license: apache-2.0
datasets:
- Salesforce/wikitext
- VisionTheta/fineweb-1B
- Voxel51/fiftyone-qa-pairs-14k
- Open-Orca/OpenOrca
- OpenAssistant/oasst2
- Ereeeeef3/Qu-QA-v2
- tau/commonsense_qa
- OpenAssistant/oasst1
- hkust-nlp/deita-10k-v0
- HuggingFaceH4/ultrafeedback_binarized
- meta-math/MetaMathQA
- HuggingFaceH4/ultrachat_200k
language:
- en
pipeline_tag: text-generation
---
# Cascade0 170M Base Model
This is the base model for the whole Cascade0 series, including Cascade0-159M-DPO-Instruct and the standard Instruct variant.
#### Max context size is 1512 tokens.
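Because the context window is small, long prompts need to be trimmed before generation. The snippet below is a minimal sketch of one way to do that, assuming the limit of 1512 is counted in tokens and covers both the prompt and the newly generated tokens. The helper name `fit_prompt` and the constant `MAX_CONTEXT` are hypothetical and not part of any Cascade0 code.

```python
MAX_CONTEXT = 1512  # max context size stated in this card (assumed to be in tokens)


def fit_prompt(token_ids, max_new_tokens, max_context=MAX_CONTEXT):
    """Keep the most recent prompt tokens so prompt + generation fits the window.

    `token_ids` is a list of already-tokenized prompt ids (hypothetical input).
    """
    if max_new_tokens < 0 or max_new_tokens >= max_context:
        raise ValueError("max_new_tokens must be in [0, max_context)")
    # Tokens left for the prompt after reserving room for generation.
    budget = max_context - max_new_tokens
    # Drop the oldest tokens first; short prompts are returned unchanged.
    return token_ids[-budget:]
```

For example, reserving 128 tokens for generation leaves at most 1384 tokens for the prompt. Truncating from the left keeps the most recent text, which is usually what a base model needs to continue from; other strategies (such as keeping the start of the prompt) may suit some uses better.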
## Cascade0 Base vs. other small models
Made with the LM Evaluation Harness.