Creating a dataset for Full Stable Diffusion
Your dataset must be a directory. For Full Stable Diffusion, the directory structure must be:
- A top-level folder, named anything you want
- For every image, a corresponding .txt file that contains its caption
Image and text file names must match
If a dataset has an image van_gogh_artwork_1.jpeg, Blueprint expects a corresponding text file named van_gogh_artwork_1.txt. If there are images without corresponding text files, the fine-tuning run will fail.
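A quick local check before starting a run can catch missing captions early. The sketch below is not part of Blueprint: it assumes images and captions sit side by side in a single folder, and the helper name and extension list are illustrative choices.

```python
from pathlib import Path

# Illustrative set of image extensions (an assumption); adjust to match your dataset.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png"}


def find_images_missing_captions(dataset_dir):
    """Return the sorted names of images that have no same-named .txt file."""
    missing = []
    for path in Path(dataset_dir).iterdir():
        if path.is_file() and path.suffix.lower() in IMAGE_EXTENSIONS:
            # van_gogh_artwork_1.jpeg -> van_gogh_artwork_1.txt
            if not path.with_suffix(".txt").is_file():
                missing.append(path.name)
    return sorted(missing)
```

Calling find_images_missing_captions("my_dataset") returns an empty list when every image has a caption file.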
What should my captions be?
When you invoke your fine-tuned model, the captions in your dataset will determine how you will prompt the model. If you'd like to be able to tell Stable Diffusion to generate a certain color, for example, mention the color in the caption!
Captions should be descriptive and detailed, and they should follow the style of prompting you want to use when invoking the model. However, Stable Diffusion limits caption length: captions shouldn't exceed 75 tokens (one token is roughly four characters of English text).
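The four-characters-per-token rule of thumb can be turned into a rough length check. This is only a heuristic sketch with hypothetical helper names. The model's actual tokenizer may count differently, so treat captions near the limit with caution.

```python
import math

MAX_CAPTION_TOKENS = 75  # limit stated above
CHARS_PER_TOKEN = 4      # rough rule of thumb, not an exact tokenizer


def estimate_tokens(caption):
    """Estimate the token count of a caption from its character count."""
    return math.ceil(len(caption.strip()) / CHARS_PER_TOKEN)


def is_probably_too_long(caption):
    """Return True if the estimated token count exceeds the limit."""
    return estimate_tokens(caption) > MAX_CAPTION_TOKENS
```

Under this estimate, a caption of up to about 300 characters stays within the limit.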
Example images with captions:
a drawing of a green pokemon with red eyes (left)
a green and yellow toy with a red nose (middle)
a red and white ball with an angry look on its face (right)
Can I provide photos in any format?
Blueprint supports photos in JPEG, PNG, and more generally, any image format accepted by Pillow.
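One way to find out whether Pillow can read a particular file is to try opening it locally. The sketch below assumes Pillow is installed (pip install Pillow). The helper name is hypothetical, and this is not necessarily how Blueprint itself validates images.

```python
from PIL import Image


def pillow_can_read(path):
    """Return True if Pillow can identify the file and it passes verification."""
    try:
        with Image.open(path) as img:
            # verify() looks for a broken file without decoding the image data
            img.verify()
        return True
    except Exception:
        # Unknown formats raise UnidentifiedImageError (an OSError subclass);
        # corrupt files can raise other exception types.
        return False
```

Running this over a dataset folder before a fine-tuning run flags files that Pillow cannot open.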
Does the name of the image matter?
The name of the image file, minus its extension, must match the name of the associated caption text file (for example, an image named name1 pairs with name1.txt). Beyond that, the name can be anything.
Does the size or resolution of the image matter?
Blueprint automatically scales down images to the correct resolution. This is determined by the
resolution parameter in the