Generative AI For Audio

community

Recent Activity

akhaliq submitted a paper 9 days ago

DynaFLIP: Rethinking Robotics Perception via Tri-Modal-Dynamics Guided Representation

akhaliq submitted a paper 10 days ago

Contrastive Distribution Matching for Amortized Sequential Monte Carlo in Discrete Diffusion

akhaliq submitted a paper 18 days ago

optimize_anything: A Universal API for Optimizing any Text Parameter

submitted a paper to Daily Papers 9 days ago

DynaFLIP: Rethinking Robotics Perception via Tri-Modal-Dynamics Guided Representation

Paper • 2605.30350 • Published 10 days ago • 13

submitted a paper to Daily Papers 10 days ago

Contrastive Distribution Matching for Amortized Sequential Monte Carlo in Discrete Diffusion

Paper • 2605.23346 • Published 16 days ago

submitted a paper to Daily Papers 18 days ago

optimize_anything: A Universal API for Optimizing any Text Parameter

Paper • 2605.19633 • Published 19 days ago • 6

submitted a paper to Daily Papers about 1 month ago

Image Generators are Generalist Vision Learners

Paper • 2604.20329 • Published Apr 22 • 21

submitted 2 papers to Daily Papers 2 months ago

MultiGen: Level-Design for Editable Multiplayer Worlds in Diffusion Game Engines

Paper • 2603.06679 • Published Mar 30 • 6

AVO: Agentic Variation Operators for Autonomous Evolutionary Search

Paper • 2603.24517 • Published Mar 25 • 11

submitted 2 papers to Daily Papers 3 months ago

V-Co: A Closer Look at Visual Representation Alignment via Co-Denoising

Paper • 2603.16792 • Published Mar 17 • 3

Multimodal OCR: Parse Anything from Documents

Paper • 2603.13032 • Published Mar 13 • 45

authored a paper 3 months ago

Any to Full: Prompting Depth Anything for Depth Completion in One Stage

Paper • 2603.05711 • Published Mar 5 • 2

submitted a paper to Daily Papers 4 months ago

SE-Bench: Benchmarking Self-Evolution with Knowledge Internalization

Paper • 2602.04811 • Published Feb 4 • 2

submitted a paper to Daily Papers 4 months ago

UniAudio 2.0: A Unified Audio Language Model with Text-Aligned Factorized Audio Tokenization

Paper • 2602.04683 • Published Feb 4 • 3

submitted 2 papers to Daily Papers 4 months ago

Visual Personalization Turing Test

Paper • 2601.22680 • Published Jan 30 • 2

Causal World Modeling for Robot Control

Paper • 2601.21998 • Published Jan 29 • 31

submitted a paper to Daily Papers 5 months ago

Motion 3-to-4: 3D Motion Reconstruction for 4D Synthesis

Paper • 2601.14253 • Published Jan 20 • 10

authored a paper 5 months ago

HeartMuLa: A Family of Open Sourced Music Foundation Models

Paper • 2601.10547 • Published Jan 15 • 49

submitted a paper to Daily Papers 5 months ago

V-DPM: 4D Video Reconstruction with Dynamic Point Maps

Paper • 2601.09499 • Published Jan 14 • 11

submitted a paper to Daily Papers 5 months ago

HeartMuLa: A Family of Open Sourced Music Foundation Models

Paper • 2601.10547 • Published Jan 15 • 49

submitted 3 papers to Daily Papers 5 months ago

UM-Text: A Unified Multimodal Model for Image Understanding

Paper • 2601.08321 • Published Jan 13 • 21

ResTok: Learning Hierarchical Residuals in 1D Visual Tokenizers for Autoregressive Image Generation

Paper • 2601.03955 • Published Jan 7 • 3

FlowBlending: Stage-Aware Multi-Model Sampling for Fast and High-Fidelity Video Generation

Paper • 2512.24724 • Published Dec 31, 2025 • 9