Dongwon Jo
dongwonjo
AI & ML interests
Efficient AI, Model Compression, Sparse Attention, Quantization, Pruning, Generative Model, Large Language Model, Diffusion
Recent Activity
authored a paper 2 days ago
Rotation-Aligned Key Channel Pruning for Efficient Vision-Language Model Inference
upvoted a paper about 2 months ago
CompactAttention: Accelerating Chunked Prefill with Block-Union KV Selection
authored a paper about 2 months ago
CompactAttention: Accelerating Chunked Prefill with Block-Union KV Selection