Models and datasets for the paper "Forget, Then Recall: Learnable Compression and
Selective Unfolding via Gist Sparse Attention"
Yuzhen Mao
gist-sparse-attention
models 19
gist-sparse-attention/GSA-FT-Qwen2-7B-Instruct-chunk8
333k • Updated • 1
gist-sparse-attention/GSA-FT-Qwen2-7B-Instruct-chunk16
333k • Updated • 2
gist-sparse-attention/GSA-FT-Qwen2-7B-Instruct-chunk32
333k • Updated • 3
gist-sparse-attention/GSA-FT-Qwen2-7B-Instruct-chunk4-chunk4
333k • Updated • 462
gist-sparse-attention/GSA-FT-Qwen2-7B-Instruct-chunk8-chunk4
333k • Updated • 4
gist-sparse-attention/GSA-FT-Llama-3.2-1B-chunk16
1B • Updated • 7
gist-sparse-attention/GSA-FT-Llama-3.2-1B-chunk4-chunk4
1B • Updated • 2
gist-sparse-attention/GSA-link-FT-Llama-3.2-1B-chunk8
1B • Updated • 4
gist-sparse-attention/GSA-link-FT-Llama-3.2-1B-chunk16
1B • Updated • 1
gist-sparse-attention/GSA-link-FT-Llama-3.2-1B-chunk4-chunk4
1B • Updated • 2
datasets 14
gist-sparse-attention/GSA-FT-Llama-3.2-1B-data
Preview • Updated • 196
gist-sparse-attention/GSA-FT-Qwen2-7B-Instruct-chunk8-chunk4-data
Preview • Updated • 107
gist-sparse-attention/GSA-FT-Qwen2-7B-Instruct-chunk4-chunk4-data
Preview • Updated • 134
gist-sparse-attention/GSA-FT-Qwen2-7B-Instruct-chunk32-data
Viewer • Updated • 25.9k • 182
gist-sparse-attention/GSA-FT-Qwen2-7B-Instruct-chunk16-data
Preview • Updated • 125
gist-sparse-attention/GSA-FT-Qwen2-7B-Instruct-chunk8-data
Preview • Updated • 36
gist-sparse-attention/GSA-PT-Qwen2-7B-Instruct-chunk8-chunk4-data
Viewer • Updated • 88.6k • 265
gist-sparse-attention/GSA-PT-Qwen2-7B-Instruct-chunk4-chunk4-data
Viewer • Updated • 88.5k • 272
gist-sparse-attention/GSA-PT-Qwen2-7B-Instruct-chunk32-data
Preview • Updated • 205
gist-sparse-attention/GSA-PT-Qwen2-7B-Instruct-chunk16-data
Viewer • Updated • 88.8k • 253