PMT: Plain Mask Transformer for Image and Video Segmentation with Frozen Vision Encoders • Paper • 2603.25398 • Published 12 days ago • 3
DeltaTok • Collection • DeltaTok tokenizer, DeltaWorld predictor, and evaluation heads. https://github.com/amazon-far/deltatok • 6 items • Updated about 14 hours ago • 2
VidEoMT: Your ViT is Secretly Also a Video Segmentation Model • Paper • 2602.17807 • Published Feb 19 • 7
Fine-Tuning Image-Conditional Diffusion Models is Easier than You Think • Paper • 2409.11355 • Published Sep 17, 2024 • 30