
Experiments with tiny-sd

For over a year now, I've been on the hunt for lightweight generative AI architectures and models that I can run on my mid-range consumer hardware (GTX 2060 with 6 GB VRAM / GTX 3040 with 8 GB VRAM).

Recently I trained a LoRA for Stable Diffusion XL (SDXL) on a logo dataset and published it on Hugging Face 🤗.

But this LoRA requires running SDXL as the base model, which is quite big and barely fits onto either of my GPUs. The only option I have is CPU offloading.

But that comes with a significant speed penalty.
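For reference, CPU offloading with diffusers is a single extra call; here's a minimal sketch (the checkpoint name, prompt, and step count are placeholders rather than my exact setup):

```python
# Minimal sketch: SDXL on limited VRAM via CPU offloading in diffusers.
# enable_model_cpu_offload() keeps only the currently active submodule on the GPU
# and moves everything else back to CPU, trading speed for memory.
import torch
from diffusers import StableDiffusionXLPipeline

pipe = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",  # placeholder: the SDXL base checkpoint
    torch_dtype=torch.float16,
    variant="fp16",
)
pipe.enable_model_cpu_offload()  # instead of pipe.to("cuda")

image = pipe("a minimalist logo of a fox", num_inference_steps=30).images[0]
image.save("logo.png")
```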

So I desperately wanted to find a smaller model!

Initial attempts

After discovering the distilled diffusion models by Segmind, I tried out my usual test prompts.
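Tiny-sd is small enough to run locally through diffusers as well; here's a rough sketch in case you want to reproduce the prompts below yourself (step count and guidance scale are just reasonable defaults, not tuned values):

```python
# Minimal sketch: running Segmind's tiny-sd locally with diffusers.
import torch
from diffusers import DiffusionPipeline

pipe = DiffusionPipeline.from_pretrained(
    "segmind/tiny-sd",
    torch_dtype=torch.float16,
).to("cuda")

image = pipe("fluffy ball", num_inference_steps=25, guidance_scale=7.5).images[0]
image.save("fluffy_ball.png")
```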

Here are the first results, each generated with the default parameters via the inference UI:

Prompt: fluffy ball

0148aa78-c6d3-49ed-be15-e7adb6301954

Prompt: a woman in a field

bc0a3ad0-88dd-44f4-9f61-e6c7da6a9f81

Prompt: Super Mario sitting in private jet lounge and smoking a big joint with marijuana plants growing from his head, ultra realistic, HD, best quality, ~*~aesthetic~*~

d7b6cb50-737f-4924-a303-5b20495933d0

Well, frankly, apart from the fluffy ball the results are rather mediocre. I'm wondering if we can achieve something better; maybe I should just try something easier.

For instance, cartoons and cartoon-like pictures or paintings.

I recently found the artist Jon Juarez through a post on Mastodon, and I really like their work!

So I wanted to see whether I could generate something similar to their work :)

Prompt: painting with line shading of a cave

25f3b1a6-a68c-424f-9af5-101252f5edeb

That's nice! I wonder how far I can go with this.

Because of that, I felt compelled to train a LoRA to maybe improve the style of the generated paintings.

LoRA training

After collecting 29 samples of the artist's work, I adapted the DreamBooth script that I had used for my logo LoRA to train on tiny-sd.

I set a learning rate of 1e-4 and trained for 1.5k iterations.

After training, the script automatically uploads the model to the Hub.

Note: you need to include the "magic phrase" ... by JON_JUAREZ ... in your prompt to trigger the LoRA.
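Loading the LoRA for inference looks roughly like this (a sketch; the LoRA repo id is a placeholder, use the one from the Hub upload):

```python
# Minimal sketch: tiny-sd plus the trained style LoRA.
import torch
from diffusers import DiffusionPipeline

pipe = DiffusionPipeline.from_pretrained(
    "segmind/tiny-sd",
    torch_dtype=torch.float16,
).to("cuda")
pipe.load_lora_weights("<your-username>/tiny-sd-jon-juarez-lora")  # placeholder repo id

# The magic phrase "by JON_JUAREZ" triggers the style.
prompt = "Painting with line shading by JON_JUAREZ of a dark cave"
image = pipe(prompt, num_inference_steps=25).images[0]
image.save("cave.png")
```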

Here are some results:

Prompt: Pastel color painting with line shading by JON_JUAREZ of a dark cave

image (3)

Prompt: Painting with line shading by JON_JUAREZ of a dark cave

image (1)

Prompt: Wizard

image(15)

Prompt: Wizard by JON_JUAREZ

image(14)

Prompt: Landscape

image(16)

Prompt: Landscape by JON_JUAREZ

image(17)

One of the greatest findings of all of this is...

8lul76

Even your toaster has more VRAM! And we're talking training here, not inference.

For inference it needs less than 1 GB.
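If you want to check that on your own hardware, peak usage is easy to measure (a sketch; the exact number depends on resolution, precision, and the attention implementation):

```python
# Rough peak-VRAM measurement for a single tiny-sd generation.
import torch
from diffusers import DiffusionPipeline

pipe = DiffusionPipeline.from_pretrained(
    "segmind/tiny-sd",
    torch_dtype=torch.float16,
).to("cuda")

torch.cuda.reset_peak_memory_stats()
pipe("Wizard by JON_JUAREZ", num_inference_steps=25)
print(f"peak VRAM: {torch.cuda.max_memory_allocated() / 2**30:.2f} GiB")
```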

I wrapped the LoRA into a 🤗 Hugging Face Space; you can find it below.
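The Space is essentially a thin Gradio wrapper around the pipeline; here's a hypothetical sketch of such an app.py (not the actual Space code, and the LoRA repo id is again a placeholder):

```python
# Hypothetical app.py for a Gradio-based Hugging Face Space.
import torch
import gradio as gr
from diffusers import DiffusionPipeline

pipe = DiffusionPipeline.from_pretrained("segmind/tiny-sd", torch_dtype=torch.float16).to("cuda")
pipe.load_lora_weights("<your-username>/tiny-sd-jon-juarez-lora")  # placeholder repo id

def generate(prompt: str):
    # Don't forget the magic phrase "by JON_JUAREZ" in the prompt.
    return pipe(prompt, num_inference_steps=25).images[0]

demo = gr.Interface(fn=generate, inputs="text", outputs="image")
demo.launch()
```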

Outlook

Apart from training a LoRA on a set of style examples, this Reddit post suggests using them as class examples, potentially to represent the style.

And then we can continue with LoRA merging; a quick sketch follows below. Infinite variations and possibilities lie ahead of us.
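Diffusers already covers the basic merging case through its adapter API; here's a sketch assuming a recent diffusers version with the PEFT backend installed (both repo ids are placeholders):

```python
# Sketch: blending two LoRAs on top of one base model.
import torch
from diffusers import DiffusionPipeline

pipe = DiffusionPipeline.from_pretrained("segmind/tiny-sd", torch_dtype=torch.float16).to("cuda")
pipe.load_lora_weights("<user>/style-lora", adapter_name="style")      # placeholder
pipe.load_lora_weights("<user>/subject-lora", adapter_name="subject")  # placeholder

# The adapter weights control how strongly each LoRA contributes.
pipe.set_adapters(["style", "subject"], adapter_weights=[0.8, 0.6])

image = pipe("Wizard by JON_JUAREZ", num_inference_steps=25).images[0]
image.save("merged.png")
```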

Demo

Keep in mind to include by JON_JUAREZ in your prompt.