bytedance-research/HuMo
Image-to-Video • Updated
• 86 • 213
UMO based on OmniGen2
Inpaint images using Qwen Image with an inpainting ControlNet
Chat with AI using the ERNIE‑4.5 model
Detect objects in images and videos
Transcribe audio files to text with language detection
Generate images from text prompts