Creating a dataset for Dreambooth
Your dataset must be a directory. For Dreambooth, the directory structure must be:
- my_dog_images
  - object
    - charlie_image_one.jpeg
    - pic_of_dog.jpeg
    - another_picture.png
  - prior_preservation
    - a_dog_1.jpeg
    - another_dog.png
- A top-level folder named anything you want.
- A subfolder named `object` that contains photos of the object you're interested in training on (e.g. pictures of your dog).
- Optional: a subfolder named `prior_preservation` that contains photos of other objects of the same type (e.g. pictures of different dogs) to help Stable Diffusion learn what is unique about your object.
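The layout described above can be checked locally before uploading. The following is a minimal sketch using only the Python standard library; the function name `check_dreambooth_layout` is hypothetical and is not part of the Blueprint API. It only inspects folder names and whether the folders contain files.

```python
from pathlib import Path
from typing import List


def check_dreambooth_layout(root: str) -> List[str]:
    """Return a list of layout problems (an empty list means none were found).

    Hypothetical helper for illustration; not part of the Blueprint API.
    """
    root_path = Path(root)
    if not root_path.is_dir():
        return [f"{root_path} is not a directory"]

    problems = []

    # The "object" subfolder is required and should hold the training photos.
    object_dir = root_path / "object"
    if not object_dir.is_dir():
        problems.append("missing required subfolder: object")
    elif not any(p.is_file() for p in object_dir.iterdir()):
        problems.append("object subfolder contains no files")

    # "prior_preservation" is optional; only check it when it exists.
    prior_dir = root_path / "prior_preservation"
    if prior_dir.exists():
        if not prior_dir.is_dir():
            problems.append("prior_preservation exists but is not a directory")
        elif not any(p.is_file() for p in prior_dir.iterdir()):
            problems.append("prior_preservation subfolder is empty")

    return problems
```

Calling `check_dreambooth_layout("my_dog_images")` on a correctly structured folder returns an empty list; otherwise each string describes one problem to fix.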
To upload and use your dataset in a FinetuningRun, you'll need to create a DatasetIdentifier object.
- Get as many photos of the object as possible. The more images, the better the training. You need at least five.
- Make sure the object is not blurry and can clearly be seen.
- Angles are important: capturing the object from as many angles as possible helps Stable Diffusion learn it better.
Good photos: the subject is clearly visible in the foreground and is photographed from different angles.
Poor photos: the adorable subject is obscured behind blankets and other obstacles.
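The five-image minimum mentioned in the tips can be checked with a short script. This is a sketch under stated assumptions: the helper names and the `IMAGE_EXTENSIONS` set are hypothetical, and the extension list is deliberately narrow, so extend it if your photos use other formats. It counts files by extension only and does not judge blur or framing.

```python
from pathlib import Path

# Minimum number of object photos stated in the tips above.
MIN_OBJECT_IMAGES = 5

# Assumed extension list for illustration only; extend as needed.
IMAGE_EXTENSIONS = {".jpeg", ".jpg", ".png"}


def count_object_images(root: str) -> int:
    """Count image files directly inside <root>/object (by file extension)."""
    object_dir = Path(root) / "object"
    return sum(
        1
        for p in object_dir.iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_EXTENSIONS
    )


def has_enough_images(root: str) -> bool:
    """Return True when the object folder meets the five-image minimum."""
    return count_object_images(root) >= MIN_OBJECT_IMAGES
```

For example, `has_enough_images("my_dog_images")` returns `False` for the three-photo folder shown in the directory listing earlier, which signals that more photos are needed.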
Can I provide photos in any format?
Blueprint supports photos in JPEG, PNG, and more generally, any image format accepted by Pillow.
Does the name of the image matter?
The name of the image doesn't matter. As long as it's placed under the right subdirectory, the Blueprint fine-tuning API can use it.
Does the size or resolution of the image matter?
Blueprint automatically scales down images to the correct resolution. This is determined by the `resolution` parameter in the