---
license: mit
language: en
tags:
- nethack
- reinforcement-learning
- hmm
- sticky-hdp-hmm
- latent-dynamics
- variational-inference
pipeline_tag: other
---

# Sticky-HDP-HMM for NetHack (Round 4/4)

A sticky Hierarchical Dirichlet Process Hidden Markov Model (sticky-HDP-HMM) trained on NetHack latent representations to learn discrete skills and their temporal dynamics.

## Model Description

This sticky-HDP-HMM learns discrete skills and their temporal transitions in the latent space of a NetHack VAE. The model uses:

- Sticky self-transitions to encourage temporal persistence of skills
- A Hierarchical Dirichlet Process for automatic skill discovery
- Normal-Inverse-Wishart priors for the skill emission distributions

## Model Details

- **Model Type**: Sticky Hierarchical Dirichlet Process Hidden Markov Model
- **Framework**: PyTorch with variational inference
- **EM Round**: 4 of 4
- **Latent Dimensions**: 96
- **Maximum Skills**: 40
- **Base VAE**: CatkinChen/nethack-vae

## HMM Parameters

- **Alpha (DP concentration)**: 5.0
- **Kappa (sticky parameter)**: 1.0
- **Gamma (top-level DP)**: 5.0

The sketch at the end of this card shows how these hyperparameters enter the transition prior.

## Usage

```python
from train import load_hmm_from_huggingface

# Load the trained HMM and its configuration from the Hugging Face Hub
hmm, config = load_hmm_from_huggingface("CatkinChen/nethack-hmm")

# The HMM can be combined with the base VAE for skill-based generation
```

## Training

This HMM was trained with Expectation-Maximization (EM) on VAE latent representations:

- **E-step**: variational inference for the posterior skill assignments
- **M-step**: VAE fine-tuning under the HMM skill prior

A simplified sketch of the E-step recursion is also given at the end of this card.

## Citation

If you use this model, please consider citing:

```bibtex
@misc{nethack-hmm,
  title={Sticky-HDP-HMM for NetHack Skill Learning},
  author={Xu Chen},
  year={2025},
  url={https://huggingface.co/CatkinChen/nethack-hmm}
}
```
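
## Sketch: Sticky Transition Prior

For intuition, here is how the hyperparameters above enter the standard sticky-HDP-HMM construction (Fox et al.) under a weak-limit approximation with at most `K` skills: top-level skill weights are drawn as `beta ~ Dir(gamma/K, ..., gamma/K)`, and each transition row as `pi_j ~ Dir(alpha * beta + kappa * e_j)`, so `kappa` adds extra mass to the self-transition. The snippet below is an illustrative sketch only; `sample_sticky_transitions` is a hypothetical helper, not part of this repository.

```python
import torch

def sample_sticky_transitions(alpha=5.0, kappa=1.0, gamma=5.0, max_skills=40):
    """Sample a transition matrix from a weak-limit sticky-HDP-HMM prior.

    Hypothetical sketch, not this repository's code:
      beta ~ Dir(gamma/K, ..., gamma/K)       # top-level skill weights
      pi_j ~ Dir(alpha * beta + kappa * e_j)  # row j, biased toward staying in j
    """
    K = max_skills
    beta = torch.distributions.Dirichlet(torch.full((K,), gamma / K)).sample()
    rows = []
    for j in range(K):
        concentration = alpha * beta
        concentration[j] = concentration[j] + kappa  # sticky self-transition bonus
        rows.append(torch.distributions.Dirichlet(concentration).sample())
    return torch.stack(rows)  # (K, K), each row sums to 1

pi = sample_sticky_transitions()  # larger kappa -> more persistent skills
```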
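
## Sketch: E-step Forward-Backward

The E-step computes posterior skill responsibilities for each latent frame. The full model does this with variational expectations under the Normal-Inverse-Wishart emission posteriors; the self-contained sketch below instead uses fixed emission log-likelihoods and a plain forward-backward recursion, so treat it as a simplified illustration rather than this repository's implementation.

```python
import torch

def forward_backward(log_lik, log_pi0, log_A):
    """Posterior skill marginals p(z_t = k | x_{1:T}) for a discrete-state HMM.

    log_lik: (T, K) per-frame emission log-likelihoods under each skill
    log_pi0: (K,)   log initial skill distribution
    log_A:   (K, K) log transition matrix (rows index the current skill)
    """
    T, K = log_lik.shape
    log_alpha = torch.empty(T, K)
    log_beta = torch.zeros(T, K)

    # Forward pass: log p(x_{1:t}, z_t = k)
    log_alpha[0] = log_pi0 + log_lik[0]
    for t in range(1, T):
        log_alpha[t] = log_lik[t] + torch.logsumexp(
            log_alpha[t - 1].unsqueeze(1) + log_A, dim=0)

    # Backward pass: log p(x_{t+1:T} | z_t = k)
    for t in range(T - 2, -1, -1):
        log_beta[t] = torch.logsumexp(
            log_A + (log_lik[t + 1] + log_beta[t + 1]).unsqueeze(0), dim=1)

    # Normalize alpha * beta into per-frame skill responsibilities
    return torch.softmax(log_alpha + log_beta, dim=1)

# Toy check: 5 frames, 3 skills, random log-likelihoods and transitions
resp = forward_backward(
    torch.randn(5, 3),
    torch.log(torch.ones(3) / 3),
    torch.log_softmax(torch.randn(3, 3), dim=1),
)
print(resp.sum(dim=1))  # each row of responsibilities sums to 1
```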