magibu's Collections
Pretrain Datasets
papers
Ekip karışık verileri
Fine-tuned LLMs
Turkish Language Healthcare Datasets
Pretrain Datasets
Updated Jan 3
Datasets we use for pretraining large language models
omarkamali/wikipedia-monthly • Updated 1 day ago • 3.95k • 51
alibayram/hukuk_soru_cevap • Viewer • Updated Nov 6, 2024 • 2.08k • 20 • 14
umutertugrul/turkish-hospital-medical-articles • Viewer • Updated Oct 2, 2025 • 24.6k • 51 • 8
umutertugrul/turkish-medical-articles • Viewer • Updated Oct 2, 2025 • 42.8k • 10 • 3
alibayram/tr-books • Viewer • Updated Dec 17, 2025 • 3.7k • 2
selimfirat/bilkent-turkish-writings-dataset • Viewer • Updated May 24, 2025 • 25.1k • 57 • 8
umutertugrul/turkish-academic-theses-dataset • Viewer • Updated Aug 18, 2025 • 649k • 60 • 8
alibayram/onedio_haberler • Viewer • Updated Jun 18, 2024 • 66.7k • 4 • 5
habanoz/news-tr-1.8M • Viewer • Updated Oct 6, 2024 • 1.85M • 90 • 7
alibayram/hepsiburada_yorumlar • Viewer • Updated Jun 18, 2024 • 2.66M • 7 • 14
alibayram/kitapyurdu_yorumlar • Viewer • Updated Jun 18, 2024 • 405k • 25
alibayram/beyazperde_yorumlar • Viewer • Updated Jun 18, 2024 • 192k • 7 • 5
BILGEM-AI/BILGE-Synthetic-Stories • Viewer • Updated Nov 20, 2025 • 2.87M • 373 • 5