Personality and Roleplay Collection Datasets for finetuning the personality and accents of your LLMs • 2 items • Updated Feb 21 • 2
Creative Writing Datasets Collection High-quality creative writing and storytelling data. • 36 items • Updated 27 days ago • 7
🤏 Smol-Data Collection Tried and tested mixes for strong pretraining. Inspired by https://huggingface.co/blog/codelion/optimal-dataset-mixing • 14 items • Updated Mar 2 • 12
pplx-embed Collection Diffusion-Pretrained Dense and Contextual Embeddings • 7 items • Updated Feb 26 • 96
Cerebras REAP Collection Sparse MoE models compressed using the REAP (Router-weighted Expert Activation Pruning) method • 30 items • Updated Feb 25 • 137