mindchain's Collections

Hybrid Attention - Efficient Transformer Architectures

Hybrid attention models combining local and global attention for efficient long-context processing
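
As a rough illustration of the pattern these models share, here is a minimal sketch of a Longformer-style hybrid mask: a sliding local window plus a few designated global tokens. The function names, `window` parameter, and choice of token 0 as the global token are illustrative assumptions, not taken from any model in this collection; real implementations use block-sparse kernels for efficiency rather than a dense mask.

```python
# Minimal sketch: hybrid local + global attention mask (illustrative only).
# A dense mask like this shows the attention pattern but not the speedup;
# efficient models realize the same pattern with block-sparse kernels.
import torch
import torch.nn.functional as F

def hybrid_attention_mask(seq_len: int, window: int, global_idx: list) -> torch.Tensor:
    """Boolean mask: True where attention is allowed."""
    pos = torch.arange(seq_len)
    # Local band: each token attends to neighbors within +/- window.
    mask = (pos[:, None] - pos[None, :]).abs() <= window
    # Global tokens attend everywhere and are attended to by all tokens.
    g = torch.tensor(global_idx)
    mask[g, :] = True
    mask[:, g] = True
    return mask

def hybrid_attention(q, k, v, window: int = 4, global_idx=(0,)):
    seq_len = q.shape[-2]
    mask = hybrid_attention_mask(seq_len, window, list(global_idx))
    # Boolean attn_mask in PyTorch SDPA: True = keep, False = masked out.
    return F.scaled_dot_product_attention(q, k, v, attn_mask=mask)

# Usage: one head, 16 tokens, 8-dim; token 0 acts as a global [CLS]-style token.
q = k = v = torch.randn(1, 1, 16, 8)
out = hybrid_attention(q, k, v, window=2, global_idx=(0,))
print(out.shape)  # torch.Size([1, 1, 16, 8])
```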