Lubomir Konstantinov (lkonstantinov)

AI & ML interests: None yet
Organizations: None yet

Collections
agents

reasoning
- OpenThoughts: Data Recipes for Reasoning Models
  Paper • 2506.04178 • Published • 50
- NEMOTRON-CROSSTHINK: Scaling Self-Learning beyond Math Reasoning
  Paper • 2504.13941 • Published • 11
- Retrieval-augmented reasoning with lean language models
  Paper • 2508.11386 • Published • 5
- Language Models that Think, Chat Better
  Paper • 2509.20357 • Published • 1
training
- How to inject knowledge efficiently? Knowledge Infusion Scaling Law for Pre-training Large Language Models
  Paper • 2509.19371 • Published
- Gated Attention for Large Language Models: Non-linearity, Sparsity, and Attention-Sink-Free
  Paper • 2505.06708 • Published • 10
- Selective Attention: Enhancing Transformer through Principled Context Control
  Paper • 2411.12892 • Published
- A Survey of Reinforcement Learning for Large Reasoning Models
  Paper • 2509.08827 • Published • 190
diffusion

mixed precision