
Ligeng Zhu

Ligeng-Zhu

AI & ML interests

None yet

Recent Activity

updated a model 1 day ago

ligeng-dev/tw-data-train_final_v2_nb2_mt8192_replaced_fix-8node-resume

published a model 1 day ago

ligeng-dev/tw-data-train_classified-8node-resume

published a model 1 day ago

ligeng-dev/tw-data-train_final_replaced_from_classified-fix-format-8node-resume


Organizations

upvoted a paper 9 days ago

Video-MME-v2: Towards the Next Stage in Benchmarks for Comprehensive Video Understanding

Paper • 2604.05015 • Published 11 days ago • 232

upvoted a paper 2 months ago

Quant VideoGen: Auto-Regressive Long Video Generation via 2-Bit KV-Cache Quantization

Paper • 2602.02958 • Published Feb 3 • 34

upvoted a paper 3 months ago

Jet-RL: Enabling On-Policy FP8 Reinforcement Learning with Unified Training and Rollout Precision Flow

Paper • 2601.14243 • Published Jan 20 • 23

upvoted a collection 6 months ago

InternVL3.5

Collection

This collection includes all released checkpoints of InternVL3.5, covering different training stages (e.g., Pretraining, SFT, MPO, Cascade RL). • 45 items • Updated Mar 2 • 107

upvoted an article 10 months ago

Article

The Common Pile v0.1

Jun 6, 2025

upvoted an article about 1 year ago

Article

Welcome Gemma 3: Google's all new multimodal, multilingual, long context open LLM

Mar 12, 2025 • 494

upvoted a collection over 1 year ago

Sana

Collection

⚡️Sana: Efficient High-Resolution Image Synthesis with Linear Diffusion Transformer • 22 items • Updated Mar 10 • 98

upvoted 2 papers over 1 year ago

Molmo and PixMo: Open Weights and Open Data for State-of-the-Art Multimodal Models

Paper • 2409.17146 • Published Sep 25, 2024 • 121

Wolf: Captioning Everything with a World Summarization Framework

Paper • 2407.18908 • Published Jul 26, 2024 • 32

upvoted a collection almost 2 years ago

VILA: On Pre-training for Visual Language Models

Collection

10 items • Updated Mar 10 • 57

upvoted a paper about 2 years ago

A Picture is Worth More Than 77 Text Tokens: Evaluating CLIP-Style Models on Dense Captions

Paper • 2312.08578 • Published Dec 14, 2023 • 20

upvoted 3 papers over 2 years ago

VILA: On Pre-training for Visual Language Models

Paper • 2312.07533 • Published Dec 12, 2023 • 21

Latent Consistency Models: Synthesizing High-Resolution Images with Few-Step Inference

Paper • 2310.04378 • Published Oct 6, 2023 • 22

PockEngine: Sparse and Efficient Fine-tuning in a Pocket

Paper • 2310.17752 • Published Oct 26, 2023 • 15
