🇰🇪 Model Card for RareElf/kalenjin-asr

This model is a fine-tuned version of facebook/wav2vec2-xls-r-300m for automatic speech recognition (ASR) in Kalenjin. It represents the Stage 2 milestone of a research inquiry into low-resource Nilotic languages at iLabAfrica, Strathmore University.


📋 Model Details

Model Description

This model leverages the Wav2Vec2-XLS-R-300M architecture, specifically fine-tuned for the phonetic and morphological complexities of Kalenjin. It employs a cascaded decoding strategy using an external KenLM language model to ensure orthographic consistency and reduce phonetic hallucinations common in low-resource settings.
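The KenLM-assisted decoding described above could be wired up roughly as follows. This is an untested sketch, not the author's actual pipeline: it assumes the third-party `pyctcdecode` package, assumes the repository ships a tokenizer whose vocabulary matches the model's output layer (extra added tokens would need to be dropped), and uses the hypothetical helper names `sorted_labels` and `transcribe_with_lm`. The path to the KenLM file is supplied by the caller.

```python
from typing import Dict, List


def sorted_labels(vocab: Dict[str, int]) -> List[str]:
    """Order tokenizer tokens by integer id so they line up with the logit columns."""
    return [token for token, _ in sorted(vocab.items(), key=lambda kv: kv[1])]


def transcribe_with_lm(waveform, lm_path: str,
                       model_id: str = "RareElf/kalenjin-asr") -> str:
    """Beam-search decode one utterance with an external KenLM model.

    waveform: 1-D float array sampled at 16 kHz.
    lm_path:  local path to a KenLM file (hypothetical, supplied by the caller).
    """
    # Heavy dependencies are imported lazily so the helper above stays usable alone.
    import torch
    from pyctcdecode import build_ctcdecoder
    from transformers import Wav2Vec2ForCTC, Wav2Vec2Processor

    processor = Wav2Vec2Processor.from_pretrained(model_id)
    model = Wav2Vec2ForCTC.from_pretrained(model_id)
    model.eval()

    # Assumption: the tokenizer vocabulary has exactly one entry per logit column.
    decoder = build_ctcdecoder(
        labels=sorted_labels(processor.tokenizer.get_vocab()),
        kenlm_model_path=lm_path,
    )

    inputs = processor(waveform, sampling_rate=16_000, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits[0]  # shape: (frames, vocab)
    return decoder.decode(logits.cpu().numpy())
```

The acoustic model and the language model stay separate artifacts in this arrangement, so the n-gram model can be retrained or swapped without touching the fine-tuned weights.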

  • Developed by: Kevin Obote / RareElf
  • Shared by: RareElf
  • Model type: Automatic Speech Recognition (ASR)
  • Language(s): Kalenjin (kln)
  • License: Apache-2.0
  • Finetuned from model: facebook/wav2vec2-xls-r-300m

🚀 Uses

Direct Use

  • Transcribing Kalenjin audio into text for research, documentation, and accessibility.
  • Integration into multi-stage translation pipelines (e.g., Kalenjin → English).
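For the transcription use case, a minimal sketch with the Transformers `pipeline` API might look like the following. It assumes the repository contains the processor and tokenizer files the pipeline needs and that `ffmpeg` is available for decoding audio files; `transcribe` is a hypothetical helper name, and the audio path is a placeholder.

```python
MODEL_ID = "RareElf/kalenjin-asr"  # repository name taken from this card


def transcribe(audio_path: str, model_id: str = MODEL_ID) -> str:
    """Transcribe a single audio file (hypothetical helper, not part of the repo)."""
    # Imported lazily so this module loads even where transformers is not installed.
    from transformers import pipeline

    asr = pipeline("automatic-speech-recognition", model=model_id)
    # Given a file path, the pipeline decodes the audio and resamples it
    # to the sampling rate expected by the model's feature extractor.
    result = asr(audio_path)
    return result["text"]


# Example call (placeholder path):
# print(transcribe("path/to/your/recording.wav"))
```

Unless the repository also ships language-model decoder files, this path most likely corresponds to plain CTC decoding rather than the KenLM beam search reported in the results.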

Out-of-Scope Use

  • Not suitable for high-stakes medical or legal transcription without human verification.
  • May struggle with rapid code-switching or extreme background noise.

⚠️ Bias, Risks, and Limitations

  • Performance reflects the distribution of Common Voice Scripted Speech 24.0 - Kalenjin.
  • Tonal variations in Kalenjin remain challenging for CTC-based architectures.
  • Performance may vary across dialects not represented in training data.

🏋️ Training Details

Training Data

  • Dataset: Common Voice Scripted Speech 24.0 - Kalenjin
  • Augmentation: SpecAugment (mask_time_prob=0.05)

Training Procedure

  • Architecture: Wav2Vec2-XLS-R-300M
  • Optimizer: AdamW
  • Callbacks: EarlyStopping (patience=5)

Preprocessing

  • Resampled audio to 16 kHz
  • Orthographic normalization
  • Removal of corrupted or misaligned samples
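The card does not spell out the normalization rules, so the following is only an illustrative sketch of what an orthographic normalization step could look like. The specific rules (NFC normalization, lowercasing, stripping punctuation other than apostrophes, collapsing whitespace) are assumptions, and `normalize_text` is a hypothetical name.

```python
import re
import unicodedata

# Assumed rule: drop everything that is not a word character, whitespace, or apostrophe.
_PUNCTUATION = re.compile(r"[^\w\s']", flags=re.UNICODE)
_WHITESPACE = re.compile(r"\s+")


def normalize_text(text: str) -> str:
    """Apply a simple, assumed normalization to a transcript string."""
    # Compose characters consistently, then lowercase.
    text = unicodedata.normalize("NFC", text).lower()
    # Replace punctuation with spaces so adjacent words are not glued together.
    text = _PUNCTUATION.sub(" ", text)
    # Collapse runs of whitespace and trim the ends.
    return _WHITESPACE.sub(" ", text).strip()


print(normalize_text("  Kongoi,  MISING! "))
```

Applying the same function to both training transcripts and evaluation references keeps WER and CER from being inflated by casing or punctuation mismatches.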

Training Hyperparameters

  • Epochs: 30
  • Effective Batch Size: 32
    (16 per device × 2 gradient accumulation steps)
  • Learning Rate: 5e-5
  • Mixed Precision: fp16
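As a quick check on the batch-size arithmetic, the listed values can be written as a plain dictionary. The key names mirror Hugging Face `TrainingArguments` parameters, but the actual training script is not shown in this card, so this mapping and the single-GPU setting are assumptions.

```python
# Values from the hyperparameter list above; key names follow TrainingArguments.
hparams = {
    "num_train_epochs": 30,
    "per_device_train_batch_size": 16,
    "gradient_accumulation_steps": 2,
    "learning_rate": 5e-5,
    "fp16": True,
}


def effective_batch_size(per_device: int, accumulation_steps: int,
                         num_devices: int = 1) -> int:
    """Number of samples contributing to each optimizer step."""
    return per_device * accumulation_steps * num_devices


# Assumption: a single GPU, as the hardware section lists one A100.
print(effective_batch_size(
    hparams["per_device_train_batch_size"],
    hparams["gradient_accumulation_steps"],
))
```

The effective size is a product, not a sum: each of the 2 accumulation steps processes 16 samples before the optimizer updates, giving 32.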

📊 Evaluation

Testing Data

  • Held-out test set from Common Voice Kalenjin

Metrics

  • WER: Word Error Rate
  • CER: Character Error Rate
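Both metrics are edit distances normalized by the reference length, computed over words for WER and over characters for CER. The sketch below is a plain-Python illustration of that definition; the evaluation tooling actually used for this card is not stated, and the example strings are made up.

```python
from typing import Sequence


def edit_distance(ref: Sequence, hyp: Sequence) -> int:
    """Levenshtein distance (substitutions, insertions, deletions) between sequences."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level edit distance over reference word count."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference: str, hypothesis: str) -> float:
    """Character error rate: character-level edit distance (spaces included)."""
    if not reference:
        raise ValueError("reference must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)


print(wer("a b c d", "a x c"))  # one substitution + one deletion over 4 words
print(cer("abcd", "abxd"))      # one substitution over 4 characters
```

Because a single wrong character makes the whole word count as an error, WER is typically much higher than CER, which is consistent with the gap between the two columns in the results.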

Results

Decoding Strategy     WER (%)   CER (%)
KenLM Beam Search     61.75     20.11
Greedy Decoding       69.03     24.15

Approximately 10.36 percentage points of absolute WER improvement over the Stage 1 baseline.
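The gain from KenLM beam search over greedy decoding within this stage can be derived directly from the two rows of the results table. The sketch below does only that arithmetic; it does not reproduce the Stage 1 comparison, since the Stage 1 figures are not listed in this card.

```python
# Values copied from the results table in this card (percent).
RESULTS = {
    "kenlm_beam_search": {"wer": 61.75, "cer": 20.11},
    "greedy": {"wer": 69.03, "cer": 24.15},
}


def absolute_gain(baseline: float, improved: float) -> float:
    """Error-rate reduction in percentage points."""
    return baseline - improved


def relative_gain(baseline: float, improved: float) -> float:
    """Error-rate reduction as a percentage of the baseline error rate."""
    return 100.0 * (baseline - improved) / baseline


for metric in ("wer", "cer"):
    base = RESULTS["greedy"][metric]
    new = RESULTS["kenlm_beam_search"][metric]
    print(f"{metric.upper()}: {absolute_gain(base, new):.2f} points absolute, "
          f"{relative_gain(base, new):.2f}% relative")
```

For WER this gives 7.28 points absolute (about 10.55% relative), and for CER 4.04 points absolute (about 16.73% relative). Stating whether a figure is absolute or relative avoids ambiguity when comparing stages.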


🌍 Environmental Impact

  • Hardware: NVIDIA A100 GPU
  • Cloud Provider: Modal
  • Compute Region: us-east-1

🛠 Technical Specifications

Architecture & Objective

  • Architecture: Wav2Vec2 + CTC Head
  • Objective: Map 16 kHz audio to Kalenjin character-level transcriptions
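With a CTC head, the model emits one token id per audio frame, and greedy decoding turns that frame sequence into text by merging consecutive repeats and then dropping the blank token. The sketch below illustrates this rule on a toy vocabulary; the ids, the vocabulary, and the name `ctc_greedy_collapse` are made up for illustration and do not reflect the model's real tokenizer.

```python
from itertools import groupby
from typing import Dict, Sequence


def ctc_greedy_collapse(frame_ids: Sequence[int],
                        id_to_char: Dict[int, str],
                        blank_id: int = 0) -> str:
    """Collapse per-frame argmax ids into a string using the CTC rule."""
    # Step 1: merge runs of identical consecutive ids.
    merged = [token_id for token_id, _ in groupby(frame_ids)]
    # Step 2: remove blanks; a blank between two equal ids keeps both characters.
    return "".join(id_to_char[i] for i in merged if i != blank_id)


# Toy vocabulary (hypothetical): 0 is the CTC blank.
toy_vocab = {0: "", 1: "a", 2: "b"}
print(ctc_greedy_collapse([1, 1, 0, 1, 2, 2, 0, 0], toy_vocab))
```

In the toy example the blank between the two runs of `a` is what allows a doubled letter to survive the collapse. Beam search with a language model replaces the per-frame argmax with a search over several candidate sequences.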

Compute Infrastructure

Hardware

  • Training: NVIDIA A100 GPU (Modal)
  • Development: Lenovo ThinkPad T14 Gen 1 (32GB RAM, 1TB SSD)

Software

  • Python 3.10
  • PyTorch 2.1.0
  • Transformers 4.42.3

📚 Citation

@phdthesis{obote2026asr,
  author = {Obote, Kevin},
  title = {Automatic Speech Recognition for Low-Resource Nilotic Languages: A Stage-2 Acoustic Adaptation Approach},
  school = {iLabAfrica, Strathmore University},
  year = {2026}
}

📖 Glossary

  • ASR: Automatic Speech Recognition
  • WER: Word Error Rate
  • CER: Character Error Rate
  • KenLM: Efficient n-gram language modeling library

👀 Model Card Authors

Kevin Obote / RareElf / Guild Code Team


📬 Contact

[email protected]


🛠 Script: Uploading KenLM Binary

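The 61.75% WER result depends on the external KenLM language model. To make it available alongside the acoustic model, upload your KenLM file (a compiled .bin binary, or a plain-text .arpa file) to the repository: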

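from huggingface_hub import HfApi

# Requires prior authentication with a write token, e.g. via `huggingface-cli login`.
api = HfApi()
repo_id = "RareElf/kalenjin-asr"

# Upload the local language model file to the root of the model repository.
api.upload_file(
    path_or_fileobj="path/to/your/kalenjin_lm.bin",  # local path to the KenLM file
    path_in_repo="kalenjin_lm.bin",                  # file name inside the repository
    repo_id=repo_id,
    repo_type="model",
)

print(f"KenLM binary uploaded to {repo_id}")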