AI & ML interests

Image, text, video and audio generation (and compositions of all of the above)

alvdansen 
posted an update 5 days ago
view post
Post
1748


Just open-sourced LoRA Gym with Timothy - production-ready training pipeline for character, motion, aesthetic, and style LoRAs on Wan 2.1/2.2, built on musubi-tuner.

16 training templates across Modal (serverless) and RunPod (bare metal) covering T2V, I2V, Lightning-merged, and vanilla variants.

Our current experimentation focus is Wan 2.2, which is why we built on musubi-tuner (kohya-ss). Wan 2.2's DiT uses a Mixture-of-Experts architecture with two separate experts gated by a hard timestep switch - you're training two LoRAs per concept, one for high-noise (composition/motion) and one for low-noise (texture/identity), and loading both at inference. Musubi handles this dual-expert training natively, and our templates build on top of it to manage the correct timestep boundaries, precision settings, and flow shift values so you don't have to debug those yourself. We've also documented bug fixes for undocumented issues in musubi-tuner and validated hyperparameter defaults derived from cross-referencing multiple practitioners' results rather than untested community defaults.

Also releasing our auto-captioning toolkit for the first time. Per-LoRA-type captioning strategies for characters, styles, motion, and objects. Gemini (free) or Replicate backends.

Current hyperparameters reflect consolidated community findings. We've started our own refinement and plan to release specific recommendations and methodology as soon as next week.

Repo: github.com/alvdansen/lora-gym
  • 2 replies
·
rvorias 
in glif/loradex-flux-dev about 1 year ago

Create Alok

#3 opened about 1 year ago by
Alok45729ver
rvorias 
in glif/how2draw over 1 year ago

generation parameters

2
#4 opened over 1 year ago by
vatsal2473
rvorias 
updated a Space over 1 year ago
alvdansen 
posted an update over 1 year ago
view post
Post
5264
📸Photo LoRA Drop📸

I've been working on this one for a few days, but really I've had this dataset for a few years! I collected a bunch of open access photos online back in late 2022, but I was never happy enough with how they played with the base model!

I am so thrilled that they look so nice with Flux!

This for me is a version one of this model - I still see room for improvement and possibly expansion of it's 40 image dataset. For those who are curious:

40 Image
3200 Steps
Dim 32
3e-4

Enjoy! Create! Big thank you to Glif for sponsoring the model creation! :D

alvdansen/flux_film_foto