Big jump in loss?
Is it normal to have such a large jump in loss?
Yes, I noticed there seems to be a big loss in quality recently too. Hopefully it gets fixed, cause this model totally sucks right now.
The loss jumped because a DINO term was added to the loss at earlier timesteps (300 -> 700, and more recently all timesteps); results will get even better.
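To illustrate why the logged number moves when an extra term is switched on for more timesteps, here is a minimal sketch. It is not the actual training code: `total_loss`, `mean_logged_loss`, `dino_max_t`, `dino_weight`, and the loss values are all hypothetical, and it assumes "300 -> 700" means a cutoff below which the DINO term is applied, widened from 300 to 700 and then to every timestep.

```python
def total_loss(base_loss, dino_loss, timestep, dino_max_t, dino_weight=1.0):
    """Return the base loss, plus the weighted auxiliary term when timestep < dino_max_t.

    Hypothetical gating rule: the real run may select timesteps differently.
    """
    if timestep < dino_max_t:
        return base_loss + dino_weight * dino_loss
    return base_loss


def mean_logged_loss(dino_max_t, num_timesteps=1000, base_loss=0.25, dino_loss=0.5):
    """Average the total loss over uniformly sampled timesteps 0..num_timesteps-1.

    base_loss and dino_loss are made-up constants, held fixed so that only
    the gating changes between runs.
    """
    losses = [
        total_loss(base_loss, dino_loss, t, dino_max_t)
        for t in range(num_timesteps)
    ]
    return sum(losses) / len(losses)


if __name__ == "__main__":
    # Widening the cutoff raises the average even though neither term changed.
    for cutoff in (0, 300, 700, 1000):
        print(cutoff, mean_logged_loss(cutoff))
```

Under these assumptions the average rises each time the cutoff widens, purely because more samples include the extra term. A higher total therefore does not by itself say the base loss got worse.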
@Clybius Why is DINO used?
This article should help to answer that: https://jehillparikh.medium.com/improved-image-encoding-using-transformers-self-distillation-with-no-labels-dino-dinov2-79ac5b6cac06
I know DINO. But how is DINO used in zimage finetuning?
Seems like DINO was dropped.
The loss is certainly strange. Not sure how the total loss is calculated here, especially since the DINO values apparently aren't there anymore? Would be good to have some clarification.