How to Use EmoNAVI, EmoFact, EmoLynx, and EmoClan

The EmoNavi series is designed to be scheduler-independent. This means you don't necessarily need a scheduler, and even if you use one, it's generally fine because the system automatically adjusts its settings to manage the learning process.

However, if your goal is to grasp fine details quickly, we recommend using your preferred scheduler, such as Cosine-Restart.

Understanding the Learning Rate: It's Not Just Intensity

Many people think of the learning rate setting as "learning intensity." However, it's actually more like a filter—it dictates how the VAE's latent space is perceived.

Imagine a translucent plastic plate. When the learning rate is high, the image you see through this plate is a blurrier "overview," appearing as a rough distribution of light or large masses. When the learning rate is low, the image is "detailed," with increased transparency and clearer representation. In essence, the learning rate can be thought of as "resolution"—it's like adjusting the degree of blurriness by controlling the plate's transparency.

This explains why a high learning rate is better and faster for grasping overviews. With less information to process (because the details are blurred out), the system learns basic patterns quickly. Conversely, a low learning rate involves more information, thus requiring more time to fully grasp everything. If the learning rate is too low, there's an overwhelming amount of information, leading to the training period ending without sufficient learning, and resulting in subpar outcomes.

It's important to note that this concept of "learning rate" isn't exclusive to the EmoNavi series; it applies to other optimizers as well. Please keep this in mind for your future training endeavors.

EmoNavi Series: Smart Learning with or Without Schedulers

As mentioned earlier, if you use a scheduler with the EmoNavi series, you might capture details earlier than with a constant learning rate.

Alternatively, you can skip the scheduler entirely and opt for "additional training" sessions at a lower learning rate. This is easily done without needing to manage transfer parameters, which is a simplified feature not commonly found in other optimizers.

The EmoNavi series is designed to prevent overfitting even when running at a constant learning rate. It automatically adjusts to avoid exceeding a certain threshold. Therefore, it won't learn more than necessary. If it detects that it's nearing the overfitting zone, it will adjust. After learning the general outline, it's not that it stops learning; rather, it learns only what's necessary, which might make it seem like progress has slowed compared to the initial overview-learning phase.

If you find that your training isn't progressing beyond a certain point, try an additional training session with a lower learning rate. This often allows the system to rapidly absorb all the finer details.

Conclusion

We hope this explanation helps you acquire valuable know-how for setting up your training, not just for the EmoNavi series. We believe it will be beneficial to all of you. Thank you for reading to the end.

postscript

I'd like to explain the learning rate in an easy-to-understand way, so you can truly grasp its concept.

You can think of the learning rate like reading speed.
Imagine this: a high learning rate is like skim reading (or speed reading), while a low learning rate is like perusing (or close reading).

The scheduler manages this, much like a learning schedule.
EmoNAVI has a "shadow" function that encourages the model to review and reflect on its own learning progress.
With EmoNAVI, you have a choice: you can allow external guidance to determine the learning path and let the model's autonomy supplement it, or you can rely solely on its autonomy.

Here are some other analogies:
High learning rate: Like shooting a photo from a distance, giving you an overview where details are fuzzy.
Low learning rate: Like shooting up close, capturing accurate details.
Think of autofocus as being handled by the scheduler and the "shadow" function.
From another perspective:

When aiming for detailed expressions, you can also consider increasing the amount of training data or increasing the number of iterations.
As the number of iterations increases, detailed features are gradually accumulated.

However, color representation largely depends on the performance of the VAE (Variational Autoencoder).
To accurately reflect colors, the only options are to improve the VAE's performance itself or to use teacher data that correctly reflects colors.

Furthermore, the "shadow" function also acts like an autofocus system.
It's a mechanism that allows the model to review and reflect on its own learning, essentially learning from its own experience.
This means it captures one feature, learns from it, then identifies another, and the process repeats.
Consequently, its "focus" (or understanding) continuously evolves and adapts.

That concludes the additional explanation. Thank you for reading to the end!
