Conflict-Aware Multimodal Fusion for Ambivalence and Hesitancy Recognition • Paper • 2603.15818 • Published 2 days ago
VP-Hype: A Hybrid Mamba-Transformer Framework with Visual-Textual Prompting for Hyperspectral Image Classification • Paper • 2603.01174 • Published 17 days ago
VLM-PAR: A Vision Language Model for Pedestrian Attribute Recognition • Paper • 2512.22217 • Published Dec 22, 2025
Integrating ConvNeXt and Vision Transformers for Enhancing Facial Age Estimation • Paper • 2511.00123 • Published Oct 31, 2025
CVPD at QIAS 2025 Shared Task: An Efficient Encoder-Based Approach for Islamic Inheritance Reasoning • Paper • 2509.00457 • Published Aug 30, 2025
C-DiffDet+: Fusing Global Scene Context with Generative Denoising for High-Fidelity Object Detection • Paper • 2509.00578 • Published Aug 30, 2025 • 2
Enhanced Arabic Text Retrieval with Attentive Relevance Scoring • Paper • 2507.23404 • Published Jul 31, 2025 • 3
Beyond Linear Bottlenecks: Spline-Based Knowledge Distillation for Culturally Diverse Art Style Classification • Paper • 2507.23436 • Published Jul 31, 2025 • 6
LoLA-SpecViT: Local Attention SwiGLU Vision Transformer with LoRA for Hyperspectral Imaging • Paper • 2506.17759 • Published Jun 21, 2025 • 1
SegDT: A Diffusion Transformer-Based Segmentation Model for Medical Imaging • Paper • 2507.15595 • Published Jul 21, 2025 • 6