---
license: mit
language: en
tags:
- nethack
- reinforcement-learning
- hmm
- sticky-hdp-hmm
- latent-dynamics
- variational-inference
pipeline_tag: other
---

# Sticky-HDP-HMM for NetHack (Round 4/4)

A sticky Hierarchical Dirichlet Process Hidden Markov Model (sticky-HDP-HMM) trained on NetHack latent representations to learn discrete skills and their temporal dynamics.

## Model Description

This sticky-HDP-HMM learns discrete skills and their temporal transitions in the latent space of a NetHack VAE. The model uses:

- Sticky self-transitions to encourage temporal persistence of skills
- A Hierarchical Dirichlet Process for automatic skill discovery
- Normal-Inverse-Wishart priors for the skill emission distributions

## Model Details

- **Model Type**: Sticky Hierarchical Dirichlet Process Hidden Markov Model
- **Framework**: PyTorch with variational inference
- **EM Round**: 4 of 4
- **Latent Dimensions**: 96
- **Maximum Skills**: 40
- **Base VAE**: CatkinChen/nethack-vae

## HMM Parameters

- **Alpha (DP concentration)**: 5.0
- **Kappa (sticky parameter)**: 1.0
- **Gamma (top-level DP)**: 5.0

The sketch at the end of this card shows how these hyperparameters enter the transition prior.

## Usage

```python
from train import load_hmm_from_huggingface

# Load the trained HMM and its configuration from the Hugging Face Hub
hmm, config = load_hmm_from_huggingface("CatkinChen/nethack-hmm")

# The HMM can be combined with the base VAE for skill-based generation
```

## Training

This HMM was trained with Expectation-Maximization (EM) on VAE latent representations:

- **E-step**: variational inference for the posterior skill assignments
- **M-step**: VAE fine-tuning under the HMM skill prior

A simplified sketch of the E-step recursion is also given at the end of this card.

## Citation

If you use this model, please consider citing:

```bibtex
@misc{nethack-hmm,
  title={Sticky-HDP-HMM for NetHack Skill Learning},
  author={Xu Chen},
  year={2025},
  url={https://huggingface.co/CatkinChen/nethack-hmm}
}
```
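
## Sketch: Sticky Transition Prior

For intuition, here is how the hyperparameters above enter the standard sticky-HDP-HMM construction (Fox et al.) under a weak-limit approximation with at most `K` skills: top-level skill weights are drawn as `beta ~ Dir(gamma/K, ..., gamma/K)`, and each transition row as `pi_j ~ Dir(alpha * beta + kappa * e_j)`, so `kappa` adds extra mass to the self-transition. The snippet below is an illustrative sketch only; `sample_sticky_transitions` is a hypothetical helper, not part of this repository.

```python
import torch

def sample_sticky_transitions(alpha=5.0, kappa=1.0, gamma=5.0, max_skills=40):
    """Sample a transition matrix from a weak-limit sticky-HDP-HMM prior.

    Hypothetical sketch, not this repository's code:
      beta ~ Dir(gamma/K, ..., gamma/K)       # top-level skill weights
      pi_j ~ Dir(alpha * beta + kappa * e_j)  # row j, biased toward staying in j
    """
    K = max_skills
    beta = torch.distributions.Dirichlet(torch.full((K,), gamma / K)).sample()
    rows = []
    for j in range(K):
        concentration = alpha * beta
        concentration[j] = concentration[j] + kappa  # sticky self-transition bonus
        rows.append(torch.distributions.Dirichlet(concentration).sample())
    return torch.stack(rows)  # (K, K), each row sums to 1

pi = sample_sticky_transitions()  # larger kappa -> more persistent skills
```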
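
## Sketch: E-step Forward-Backward

The E-step computes posterior skill responsibilities for each latent frame. The full model does this with variational expectations under the Normal-Inverse-Wishart emission posteriors; the self-contained sketch below instead uses fixed emission log-likelihoods and a plain forward-backward recursion, so treat it as a simplified illustration rather than this repository's implementation.

```python
import torch

def forward_backward(log_lik, log_pi0, log_A):
    """Posterior skill marginals p(z_t = k | x_{1:T}) for a discrete-state HMM.

    log_lik: (T, K) per-frame emission log-likelihoods under each skill
    log_pi0: (K,)   log initial skill distribution
    log_A:   (K, K) log transition matrix (rows index the current skill)
    """
    T, K = log_lik.shape
    log_alpha = torch.empty(T, K)
    log_beta = torch.zeros(T, K)

    # Forward pass: log p(x_{1:t}, z_t = k)
    log_alpha[0] = log_pi0 + log_lik[0]
    for t in range(1, T):
        log_alpha[t] = log_lik[t] + torch.logsumexp(
            log_alpha[t - 1].unsqueeze(1) + log_A, dim=0)

    # Backward pass: log p(x_{t+1:T} | z_t = k)
    for t in range(T - 2, -1, -1):
        log_beta[t] = torch.logsumexp(
            log_A + (log_lik[t + 1] + log_beta[t + 1]).unsqueeze(0), dim=1)

    # Normalize alpha * beta into per-frame skill responsibilities
    return torch.softmax(log_alpha + log_beta, dim=1)

# Toy check: 5 frames, 3 skills, random log-likelihoods and transitions
resp = forward_backward(
    torch.randn(5, 3),
    torch.log(torch.ones(3) / 3),
    torch.log_softmax(torch.randn(3, 3), dim=1),
)
print(resp.sum(dim=1))  # each row of responsibilities sums to 1
```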