Skip to content

How to Fine-Tune Stable Diffusion with Dreambooth

Stable Diffusion is an image generation model that can create a wide variety of images. However, it becomes even more powerful and useful when fine-tuned for specific tasks.

Fine-tuning lets you get a lot of value from small to medium-size datasets by leveraging foundation models. As opposed to training a model from scratch, fine-tuning alters the behavior of an existing model to better match a specific use case.

Dreambooth is an approach to fine-tuning Stable Diffusion that teaches the model how to generate a specific object. It works by providing the model with a few images of a subject along with a class name.

In this tutorial, we'll use Dreambooth to fine-tune Stable Diffusion to generate images of a specific dog (or whatever you want if you provide your own dataset).

Step 0: Prerequisites

After signing up for Blueprint by Baseten, you'll need to do three things to complete this tutorial:

  • Install the latest version of the Baseten Python client with pip install --upgrade baseten
  • Create an API key
  • In your terminal, run:
baseten login

And paste your API key when prompted.

Following this tutorial will consume credits/billable resources

The Dreambooth fine-tuning run in this tutorial guide will consume credits (if available on your account) or billable resources.

Step 1: Create your dataset

Dreambooth fine-tuning teaches Stable Diffusion to generate a certain object by showing it pictures of that object. Your fine-tuning input dataset will be a collection of pictures in a common file format (e.g. jgp, png).

Dataset tips:

  • Get as many photos of the object as possible. The more images, the better the training. You need at least five.
  • Make sure the object is not blurry and can clearly be seen.
  • Angles are important, capturing as many angles as possible helps Stable Diffusion better learn your object.

Effective images:

The subject is clearly visible in the foreground and is photographed from different angles.

Example effective images

Ineffective images:

In these photos, the adorable subject is obscured behind blankets and other obstacles.

Example ineffective images

Put all the photos in a folder called object and place that folder inside your dataset folder, for the following structure:

my-dataset/
    object/
        photo1.jpg
        photo2.jpg
        photo3.jpg
        ...

This folder will be zipped during the upload process, so make sure nothing else is in the folder.

Our example dataset for this tutorial contains 115 images of a dog named Ollie who belongs to a friend of one of our engineers.

Step 2: Upload dataset

There are three ways to provide a dataset to a fine-tuning run. Click through the tabs to see options.

A "public" URL means a link that you can access without logging in or providing an API key.

The dataset must be a zip file containing the folder structure explained in step 1.

If you want to follow the tutorial using a pre-built dataset, use the code below as-is. Otherwise, replace the link with a link to your hosted dataset zip file, or check the other tabs for different dataset upload options.

from baseten.training import PublicUrl
# A URL to a publicly accessible dataset as a zip file
dataset = PublicUrl("https://cdn.baseten.co/docs/production/DreamboothSampleDataset.zip")

If you have your dataset on the local machine that you're running your Python code on, you can use the LocalPath option to upload it as part of your fine-tuning script.

from baseten.training import LocalPath
# A path to a local folder with your dataset
dataset = LocalPath("/path/to/my-dataset", "dog_pics")

If your fine-tuning script is running on one machine and your dataset lives on another, or you want to upload a dataset once and use it in multiple fine-tuning runs, you'll want to upload the dataset separately.

baseten dataset upload is a bash command

Open a terminal window and run:

baseten dataset upload --name my-cool-dataset --training-type DREAMBOOTH ./my-dataset-directory

Notes:

  • If the name parameter is not provided, Blueprint will name your dataset based on the directory name.
  • If you're doing a Full Stable Diffusion run, instead use --training-type CLASSIC_STABLE_DIFFUSION.

You should see:

Upload Progress: 100% |█████████████████████████████████████████████████████████
INFO 🔮 Upload successful!🔮

Dataset ID:
DATASET_ID

Then, for your fine-tuning config (your Python code), you'll use:

from baseten.training import Dataset
# The ID of a dataset already uploaded to Blueprint
dataset = Dataset("DATASET_ID")

Step 3: Assemble fine-tuning config

For the rest of this tutorial, we'll be using Python to configure, create, and deploy a fine-tuned model. Open up a Jupyter notebook or Python file in your local development environment to follow along.

Assembling the config is an opportunity to truly customize the fine-tuning run to meet our exact needs. For a complete reference of every configurable parameter, see the DreamboothConfig docs.

Here's an example config:

from baseten.training import DreamboothConfig

config = DreamboothConfig(
    instance_prompt="photo of olliedog", # Dog's name is "Ollie"
    input_dataset=dataset,
    class_prompt="photo of dog",
    num_train_epochs=10
)

This config makes a few key decisions:

  • It sets the instance_prompt to let Dreambooth know the subject of the provided images
  • It uses class_prompt to regularize the model so it can still generate pictures of other dogs
  • It sets the number of training epochs to 10, which will be enough to fine-tune the model on a dataset of our size

What should my instance prompt be?

Dreambooth teaches Stable Diffusion to associate a word with a visual concept (aka the object in your fine-tuning dataset). To do so, it needs a word that doesn't already have any concepts associated with it. Make up a meaningful but unique string to describe the concept.

In the example, the dataset is 115 pictures of a dog named Ollie. So we use the word olliedog as a unique descriptor. But your instance prompt could be any string that isn't a real word and wouldn't be found in the existing model's training data.

Step 4: Run fine-tuning

Once your config is set, it's time to kick off the fine-tuning run. This process is straightforward, just use:

from baseten.training import FinetuningRun

my_run = FinetuningRun.create(
    trained_model_name="Dog Dreambooth",
    fine_tuning_config=config
)

The trained_model_name will be assigned the deployed model.

Fine-tuning a model takes some time. Exactly how long depends on:

  • the type of fine-tuning you're doing (Dreambooth is generally faster than classic Stable Diffusion)
  • the size of your dataset (more images takes longer)
  • the configured num_train_epochs or max_train_steps (higher number means longer run).

While you wait, you can monitor the run's progress with:

my_run.stream_logs()

Your model will be automatically deployed

Once the fine-tuning run is complete, your model will be automatically deployed. You'll receive an email when the deployment is finished and the model is ready to invoke.

You can turn off this behavior by setting auto_deploy=False in FinetuningRun.create() and instead deploy your model manually.

Step 5: Use fine-tuned model

It's time! You can finally invoke the model. Use:

image, url = model("portrait of olliedog as an andy warhol painting")
image.save("ollie-warhol.png")

You'll get a Pillow image of your model output as well as a URL to access the output in the future.

If you want to access the model later, you can do so by instantiating a StableDiffusionPipeline with the model ID:

from baseten.models import StableDiffusionPipeline

model = StableDiffusionPipeline(model_id="MODEL_ID")
image, url = model("portrait of olliedog as an andy warhol painting")
image.save("ollie-warhol.png")

Example output:

These images were generated for the following prompts:

  • portrait of olliedog as an andy warhol painting (left)
  • side profile of olliedog as a van gogh painting (right)

Example output