State Space Model

SSM


Reference paper:
https://arxiv.org/abs/2406.07887
Reference lecture:
by Wonmin Byeon (NVIDIA)

Abstract

Is Attention All We Need?

Mamba: Linear-Time Sequence Modeling with Selective State Spaces

Mamba-2

Limitations of Mamba

Hybrid Architecture of Mamba and Transformer

The attention layers are the bottleneck in the hybrid model, so as the context length grows, the rate at which the speedup increases diminishes.
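A rough back-of-the-envelope sketch of why this happens (all constants, layer counts, and per-layer cost factors below are assumptions for illustration, not numbers from the paper): attention layers scale quadratically with context length while Mamba layers scale linearly, so at long contexts the few attention layers dominate the hybrid's runtime and the speedup over a pure Transformer levels off.

```python
# Hypothetical constants: attention layers cost O(L^2) per sequence, Mamba layers
# cost O(L), so the attention layers' share of hybrid runtime grows with context
# length L and the speedup over a pure-Transformer baseline saturates.

D_MODEL = 4096        # assumed hidden size
N_LAYERS = 56         # assumed total layer count
N_ATTN = 7            # assumed number of attention layers in the hybrid

def per_layer_cost(seq_len):
    attn = seq_len ** 2 * D_MODEL          # quadratic in context length
    mamba = seq_len * D_MODEL * 256        # linear; 256 ~ assumed state/expansion factor
    return attn, mamba

for seq_len in (4_096, 16_384, 32_768, 131_072):
    attn, mamba = per_layer_cost(seq_len)
    hybrid = N_ATTN * attn + (N_LAYERS - N_ATTN) * mamba
    transformer = N_LAYERS * attn
    speedup = transformer / hybrid
    attn_share = N_ATTN * attn / hybrid
    print(f"L={seq_len:>7}: speedup ~{speedup:4.1f}x, "
          f"attention share of hybrid cost ~{attn_share:4.0%}")
```

As the context length grows, the speedup keeps rising but approaches the ceiling set by the attention layers alone, which is the diminishing-returns behavior described above.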

Summary

From left: 4K-, 16K-, and 32K-context-based models

Unlike the Transformer, Mamba-2-Hybrid does not require quadratic computation and is faster at inference.
However, issues remain to be solved, such as the attention layers being the bottleneck, so there is still room for further improvement.
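For contrast with quadratic attention, here is a minimal sketch of the linear-time recurrence that SSM/Mamba-style layers build on (a toy diagonal linear SSM in NumPy, not the actual Mamba-2 selective or gated kernel): the recurrent state has a fixed size, so each new token is processed at constant cost and a whole sequence costs O(L), with no attention matrix over the full context.

```python
# Toy diagonal linear SSM: h_t = A * h_{t-1} + B x_t, y_t = C h_t.
# The state h has fixed size d_state, so per-token cost does not grow with context.

import numpy as np

def ssm_generate(x, A, B, C):
    """x: (L, d_in); A: (d_state,) diagonal decay; B: (d_state, d_in); C: (d_out, d_state)."""
    h = np.zeros(A.shape[0])              # fixed-size recurrent state
    outputs = []
    for x_t in x:                         # one constant-cost update per token
        h = A * h + B @ x_t               # state update
        outputs.append(C @ h)             # readout
    return np.stack(outputs)

rng = np.random.default_rng(0)
L, d_in, d_state, d_out = 1024, 8, 16, 8
x = rng.normal(size=(L, d_in))
A = np.exp(-rng.uniform(0.01, 0.5, size=d_state))   # stable decays in (0, 1)
B = rng.normal(size=(d_state, d_in)) * 0.1
C = rng.normal(size=(d_out, d_state)) * 0.1
y = ssm_generate(x, A, B, C)
print(y.shape)   # (1024, 8): linear time in L, constant memory for the state
```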