Collections including paper arxiv:2510.14528

rStar-Math: Small LLMs Can Master Math Reasoning with Self-Evolved Deep Thinking

Paper • 2501.04519 • Published Jan 8, 2025 • 288
Transformer^2: Self-adaptive LLMs

Paper • 2501.06252 • Published Jan 9, 2025 • 55
Multimodal LLMs Can Reason about Aesthetics in Zero-Shot

Paper • 2501.09012 • Published Jan 15, 2025 • 10
FAST: Efficient Action Tokenization for Vision-Language-Action Models

Paper • 2501.09747 • Published Jan 16, 2025 • 29

meta-llama/CodeLlama-7b-Instruct-hf

Text Generation • 7B • Updated Mar 14, 2024 • 5.23k • 59
hamzab/roberta-fake-news-classification

Text Classification • Updated Jul 4, 2023 • 3.53k • 9
Paper2Agent: Reimagining Research Papers As Interactive and Reliable AI Agents

Paper • 2509.06917 • Published Sep 8, 2025 • 43
Hulu-Med: A Transparent Generalist Model towards Holistic Medical Vision-Language Understanding

Paper • 2510.08668 • Published Oct 9, 2025 • 9

EVA-CLIP-18B: Scaling CLIP to 18 Billion Parameters

Paper • 2402.04252 • Published Feb 6, 2024 • 29
Vision Superalignment: Weak-to-Strong Generalization for Vision Foundation Models

Paper • 2402.03749 • Published Feb 6, 2024 • 15
ScreenAI: A Vision-Language Model for UI and Infographics Understanding

Paper • 2402.04615 • Published Feb 7, 2024 • 44
EfficientViT-SAM: Accelerated Segment Anything Model Without Performance Loss

Paper • 2402.05008 • Published Feb 7, 2024 • 23

DocLLM: A layout-aware generative language model for multimodal document understanding

Paper • 2401.00908 • Published Dec 31, 2023 • 189
COSMO: COntrastive Streamlined MultimOdal Model with Interleaved Pre-Training

Paper • 2401.00849 • Published Jan 1, 2024 • 17
LLaVA-Plus: Learning to Use Tools for Creating Multimodal Agents

Paper • 2311.05437 • Published Nov 9, 2023 • 51
LLaVA-Interactive: An All-in-One Demo for Image Chat, Segmentation, Generation and Editing

Paper • 2311.00571 • Published Nov 1, 2023 • 43

minlik/docllm-yi-34b

Text Generation • 38B • Updated Mar 20, 2024 • 1 • 1
JinghuiLuAstronaut/DocLLM_baichuan2_7b

Text Generation • 9B • Updated Feb 29, 2024 • 3 • 5
docling-project/docling-models

Updated Dec 3, 2025 • 712k • 197
DocLayout YOLO

Demo for DocLayout-YOLO • Runtime error • Featured • 187

Large Language Model (LLM) and NLP related papers.

LoRA+: Efficient Low Rank Adaptation of Large Models

Paper • 2402.12354 • Published Feb 19, 2024 • 7
The FinBen: An Holistic Financial Benchmark for Large Language Models

Paper • 2402.12659 • Published Feb 20, 2024 • 23
TofuEval: Evaluating Hallucinations of LLMs on Topic-Focused Dialogue Summarization

Paper • 2402.13249 • Published Feb 20, 2024 • 15
TrustLLM: Trustworthiness in Large Language Models

Paper • 2401.05561 • Published Jan 10, 2024 • 69

A collection of Audio, Video and Visual LLMs.

myshell-ai/OpenVoice

Text-to-Speech • Updated Dec 24, 2024 • 488
OpenVoice

Clone a voice and generate speech from your text • Running • Featured • 1.13k
dataautogpt3/ProteusV0.3

Text-to-Image • Updated Feb 12, 2024 • 32.1k • 95
ByteDance/SDXL-Lightning

Text-to-Image • Updated Apr 3, 2024 • 38.6k • 2.13k

