| When Your “Labels” Aren’t Really Labels: Dealing with Entity-Based NLP Datasets | 0 | 4 | April 26, 2026 |
| Made a Python failure dataset for DPO/RLHF — how do you source negative examples? | 0 | 5 | April 26, 2026 |
| Load_dataset() creates a duplicate in cache | 1 | 19 | April 25, 2026 |
| Spanish Historical Web Corpus — unique categories (religion, folklore, conspiracies, BOE) | 0 | 9 | April 21, 2026 |
| Dataset viewer broke after repo rename | 5 | 39 | April 20, 2026 |
| Huggingface Dataset Download Stuck in Kaggle | 8 | 120 | April 14, 2026 |
| Add new official benchmark on the Hub | 3 | 51 | April 13, 2026 |
| Total AI beginner with a 25-year photography archive—is this useful for training? | 0 | 15 | April 10, 2026 |
| QSBench: Synthetic quantum circuit datasets for QML benchmarking | 0 | 28 | April 6, 2026 |
| I would like to get an opinion from knowledgeable people (since I don't understand anything about it myself) | 26 | 191 | April 4, 2026 |
| Request to delete DOI-locked dataset: th1nhng0/vietnamese-legal-documents | 2 | 25 | April 1, 2026 |
| Indic-faker: Generate realistic Indian synthetic data for NLP/ML — 8 languages, native scripts, batch DataFrame export | 3 | 50 | March 30, 2026 |
| What are some AI/ML concepts or problems you found difficult while learning? | 1 | 14 | March 24, 2026 |
| The downloads count of dataset hasn't been updated | 2 | 26 | March 19, 2026 |
| Need help in fine-tuning of OCR model at production grade | 1 | 74 | March 12, 2026 |
| Would a curated dataset of ~4000 social media design layouts be useful for training or fine-tuning design models? | 1 | 25 | March 10, 2026 |
| Training LLM model for asking questions | 5 | 337 | March 10, 2026 |
| Huggingface datasets card not work correctly | 1 | 54 | March 9, 2026 |
| Fastdedup: Rust-based dataset deduplication — benchmarks on FineWeb sample-10BT | 2 | 67 | March 4, 2026 |
| New Datasets: Human Vocality Primitives Series | 0 | 39 | March 4, 2026 |
| Any way to streaming-preprocess a dataset to disk? | 7 | 177 | March 4, 2026 |
| Inquiry About Dataset for AI-Driven Cloud Load Balancing and Auto scaling of instances | 2 | 55 | March 4, 2026 |
| Looking for Data | 2 | 58 | March 4, 2026 |
| Upload a large folder from S3 to a dataset | 6 | 147 | March 4, 2026 |
| Dataset flagged as unsafe due to false positive - how to resolve? | 6 | 122 | March 4, 2026 |
| Downloads count data not updating | 0 | 28 | March 4, 2026 |
| Incorrect arXiv paper automatically linked to my dataset – request for correction | 0 | 47 | March 4, 2026 |
| Looking for Mental Health Support Datasets for building a Multi-turn Chatbot | 7 | 3383 | March 4, 2026 |
| Need help to find a dataset for fine tuning | 1 | 162 | March 4, 2026 |
| I need to create my own dataset based on mlabonne/orpo-dpo-mix-40k. but when i does it and create a dataset for ORPO training it gives error | 3 | 637 | March 4, 2026 |