Article Vision Language Models (Better, faster, stronger) +3 merve, sergiopaniego, ariG23498, pcuenq, andito • May 12, 2025 • 611
Article Tiny Agents: an MCP-powered agent in 50 lines of code julien-c • Apr 25, 2025 • 308
ViTPose++: Vision Transformer for Generic Body Pose Estimation Paper • 2212.04246 • Published Dec 7, 2022 • 3
Apollo: An Exploration of Video Understanding in Large Multimodal Models Paper • 2412.10360 • Published Dec 13, 2024 • 147
Let Me Speak Freely? A Study on the Impact of Format Restrictions on Performance of Large Language Models Paper • 2408.02442 • Published Aug 5, 2024 • 21
ViewFusion: Towards Multi-View Consistency via Interpolated Denoising Paper • 2402.18842 • Published Feb 29, 2024 • 15