---
base_model:
- DreadPoor/Irix-12B-Model_Stock
- LatitudeGames/Muse-12B
- Trappu/Nemo-Picaro-12B
library_name: transformers
tags:
- mergekit
- merge
license: apache-2.0
---
# Irix-mpf-stock

This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).

An experimental merge to improve the long-form writing capabilities of Irix-12B-Model_Stock.

- I merged LatitudeGames/Muse-12B and Trappu/Nemo-Picaro-12B using **karcher** with a tighter tolerance and more max iterations — this took quite a lot of time.
- I created three derivative models using **arcee_fusion**. They were hand-picked from hundreds of similar merges as the ones that performed best on three tests:
  * Adventure prompt
  * Factual consistency + reasoning
  * Long-form creative writing
- Created a **model_stock** merge from these high-performing merges.
**Pros:**
- Shows more variety than the base model and sometimes subverts expectations
- More creativity and granularity in world-building
- Better at complex, layered and ambitious scenarios
**Cons:**
- Slightly faster pacing, often using dialogue to advance the scene (original Irix feels like a novelist; this model feels like a screenwriter)
- Less intricate prose; the writing feels modern, grounded, and grim (which might be a plus in some cases)
- Less introspective; pays more attention to external and sensory details than to how characters feel
TL;DR: **Original Irix:** produces highly readable, satisfying, and complete stories. **This model:** produces stories that feel like the thrilling beginning of a much larger work.

Oh, and I am planning to use this model as the output layer for the next KansenSakura update.

## Merge Details

### Merge Method

This model was merged using the [Model Stock](https://arxiv.org/abs/2403.19522) merge method using [DreadPoor/Irix-12B-Model_Stock](https://huggingface.co/DreadPoor/Irix-12B-Model_Stock) as a base.

### Models Merged

The following models were included in the merge:

* [DreadPoor/Irix-12B-Model_Stock](https://huggingface.co/DreadPoor/Irix-12B-Model_Stock)
* [LatitudeGames/Muse-12B](https://huggingface.co/LatitudeGames/Muse-12B)
* [Trappu/Nemo-Picaro-12B](https://huggingface.co/Trappu/Nemo-Picaro-12B)

### Configuration

The following YAML configurations were used to produce this model, in pipeline order:

```yaml
merge_method: karcher
models:
  - model: LatitudeGames/Muse-12B
  - model: Trappu/Nemo-Picaro-12B
parameters:
  max_iter: 10000
  tol: 1e-9
dtype: bfloat16
tokenizer_source: LatitudeGames/Muse-12B
```

```yaml
merge_method: arcee_fusion
base_model: DreadPoor/Irix-12B-Model_Stock
models:
  - model: DreadPoor/Irix-12B-Model_Stock
  - model: ./musepicaro
dtype: bfloat16
tokenizer_source: DreadPoor/Irix-12B-Model_Stock
```

```yaml
merge_method: model_stock
base_model: DreadPoor/Irix-12B-Model_Stock
models:
  - model: ./irix_fusion3
  - model: ./irix_fusion2
  - model: ./irix_fusion
parameters:
  normalize: false
  t: 0.75
dtype: bfloat16
```
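The stages chain together: each stage's output directory is referenced as an input path by the next config. A minimal sketch of running the pipeline with mergekit's `mergekit-yaml` CLI follows — the config filenames and the final output directory are assumptions (the intermediate paths `./musepicaro` and `./irix_fusion*` match the configs above), and only one of the three hand-picked arcee_fusion runs is shown:

```shell
# Stage 1: karcher merge of Muse and Picaro -> ./musepicaro
mergekit-yaml karcher.yml ./musepicaro

# Stage 2: arcee_fusion of Irix with the karcher merge.
# In practice this was run many times with variations; the three
# best-performing outputs were kept as ./irix_fusion, ./irix_fusion2, ./irix_fusion3.
mergekit-yaml fusion.yml ./irix_fusion

# Stage 3: model_stock over the selected fusion merges -> final model
mergekit-yaml model_stock.yml ./irix-mpf-stock
```

Each command reads one of the YAML configs above and writes a full model directory that the next config consumes by relative path.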