---
base_model:
- DreadPoor/Irix-12B-Model_Stock
- LatitudeGames/Muse-12B
- Trappu/Nemo-Picaro-12B
library_name: transformers
tags:
- mergekit
- merge
license: apache-2.0
---
# Irix-mpf-stock

This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).

An experimental merge to improve the long-form writing capabilities of Irix-12B-Model_Stock.

- I merged LatitudeGames/Muse-12B and Trappu/Nemo-Picaro-12B using **karcher** with a tighter tolerance and more max iterations — this took quite a lot of time.
- I created three derivative models using **arcee_fusion**. They were hand-picked from hundreds of similar merges as the ones that performed best on three tests:
  * Adventure prompt
  * Factual consistency + reasoning
  * Long-form creative writing
- Created a **model_stock** merge from these high-performing merges.
**Pros:**
- Shows more variety than the base model and sometimes subverts expectations
- More creativity and granularity in world-building
- Better at complex, layered and ambitious scenarios
**Cons:**
- Slightly faster pacing, often using dialogue to advance the scene (original Irix feels like a novelist; this model feels like a screenwriter)
- Less intricate prose; the writing feels modern, grounded, and grim (which might be a plus in some cases)
- Less introspective; pays more attention to external and sensory details than to how characters feel
TL;DR: **Original Irix:** produces highly readable, satisfying, and complete stories. **This model:** produces stories that feel like the thrilling beginning of a much larger work.

Oh, and I am planning to use this model as the output layer for the next KansenSakura update.

## Merge Details

### Merge Method

This model was merged using the [Model Stock](https://arxiv.org/abs/2403.19522) merge method using [DreadPoor/Irix-12B-Model_Stock](https://huggingface.co/DreadPoor/Irix-12B-Model_Stock) as a base.

### Models Merged

The following models were included in the merge:

* [DreadPoor/Irix-12B-Model_Stock](https://huggingface.co/DreadPoor/Irix-12B-Model_Stock)
* [LatitudeGames/Muse-12B](https://huggingface.co/LatitudeGames/Muse-12B)
* [Trappu/Nemo-Picaro-12B](https://huggingface.co/Trappu/Nemo-Picaro-12B)

### Configuration

The following YAML configurations were used to produce this model, in pipeline order:

```yaml
merge_method: karcher
models:
  - model: LatitudeGames/Muse-12B
  - model: Trappu/Nemo-Picaro-12B
parameters:
  max_iter: 10000
  tol: 1e-9
dtype: bfloat16
tokenizer_source: LatitudeGames/Muse-12B
```

```yaml
merge_method: arcee_fusion
base_model: DreadPoor/Irix-12B-Model_Stock
models:
  - model: DreadPoor/Irix-12B-Model_Stock
  - model: ./musepicaro
dtype: bfloat16
tokenizer_source: DreadPoor/Irix-12B-Model_Stock
```

```yaml
merge_method: model_stock
base_model: DreadPoor/Irix-12B-Model_Stock
models:
  - model: ./irix_fusion3
  - model: ./irix_fusion2
  - model: ./irix_fusion
parameters:
  normalize: false
  t: 0.75
dtype: bfloat16
```
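The stages chain together: each stage's output directory is referenced as an input path by the next config. A minimal sketch of running the pipeline with mergekit's `mergekit-yaml` CLI follows — the config filenames and the final output directory are assumptions (the intermediate paths `./musepicaro` and `./irix_fusion*` match the configs above), and only one of the three hand-picked arcee_fusion runs is shown:

```shell
# Stage 1: karcher merge of Muse and Picaro -> ./musepicaro
mergekit-yaml karcher.yml ./musepicaro

# Stage 2: arcee_fusion of Irix with the karcher merge.
# In practice this was run many times with variations; the three
# best-performing outputs were kept as ./irix_fusion, ./irix_fusion2, ./irix_fusion3.
mergekit-yaml fusion.yml ./irix_fusion

# Stage 3: model_stock over the selected fusion merges -> final model
mergekit-yaml model_stock.yml ./irix-mpf-stock
```

Each command reads one of the YAML configs above and writes a full model directory that the next config consumes by relative path.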