GitHub - uwidev/sd-prepare-images: Prepare a dataset for stable diffusion training, powered by dghs imgutils

Overview

This is a Python script that uses dghs/imgutils to prepare images for training Stable Diffusion models (or similar). It's somewhat opinionated and geared toward anime-style training, but could be extended to other forms of training.

Project Structure

Images to be processed should be placed in ./raw/. You can also place captions here; if a caption exists, its image will not be tagged when the --tag flag is set (this may change in the future; I currently pre-tag all images in Hydrus).
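As a rough illustration of the "skip images that already have a caption" rule, here is a minimal sketch. It is not taken from main.py: the function name `images_needing_tags`, the extension list, and the convention that a caption is a `.txt` file sharing the image's file stem are all assumptions.

```python
from pathlib import Path

# Assumed conventions (not confirmed by the script): captions are .txt files
# that share the image's file stem, and these are the image extensions in use.
IMAGE_EXTS = {".png", ".jpg", ".jpeg", ".webp"}


def images_needing_tags(folder):
    """Return the images in `folder` that have no sibling caption file."""
    folder = Path(folder)
    return sorted(
        p for p in folder.iterdir()
        if p.suffix.lower() in IMAGE_EXTS and not p.with_suffix(".txt").exists()
    )
```

With `a.png`, `a.txt`, and `b.jpg` in a folder, only `b.jpg` would be returned, since `a.png` already has a caption.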

Finalized images are moved to ./done/ in a way that keeps the best-quality, upscaled version of each image and ignores inferior versions. Tagging is done on this directory.

Usage

usage: main.py [-h] [--clean] [--restore] [--crop] [--upscale] [--move]
               [--tag] [--tag-prepend TAG_PREPEND] [--stage-1] [--stage-2]

Preprocess images for AI training.

options:
  -h, --help            show this help message and exit
  --clean               clean workspace
  --restore             restore images
  --crop                crop images
  --upscale             upscale images
  --move                move finalized images and captions to ./done/
  --tag                 tag images in ./done/
  --tag-prepend TAG_PREPEND
                        prepend tag to all captions in ./done/
  --stage-1             restore and crop
  --stage-2             upscale and move

Note that the internal order of operations is as follows:

restore (denoise jpeg artifacts) -> crop -> upscale -> tag -> tag-prepend

A step is skipped unless its flag is passed.
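The relationship between flags and the fixed internal order can be sketched as follows. This is an illustrative reconstruction from the help text, not the code in main.py: `build_parser`, `resolve_steps`, and `STEP_ORDER` are hypothetical names, and placing `move` right after `upscale` is an assumption based on --stage-2 being described as "upscale and move".

```python
import argparse

# Pipeline order from the README; the position of "move" is an assumption
# (it is not listed in the documented order, but --stage-2 is "upscale and move").
# --clean is left out because it is not part of the documented order.
STEP_ORDER = ["restore", "crop", "upscale", "move", "tag", "tag_prepend"]


def build_parser():
    """Build a parser that mirrors the flags shown in the usage text."""
    parser = argparse.ArgumentParser(
        prog="main.py", description="Preprocess images for AI training."
    )
    for flag in ("clean", "restore", "crop", "upscale", "move", "tag"):
        parser.add_argument(f"--{flag}", action="store_true")
    parser.add_argument("--tag-prepend")
    parser.add_argument("--stage-1", action="store_true")
    parser.add_argument("--stage-2", action="store_true")
    return parser


def resolve_steps(argv):
    """Return the steps that would run, always in pipeline order."""
    args = build_parser().parse_args(argv)
    selected = {step for step in STEP_ORDER if getattr(args, step)}
    if args.stage_1:
        selected |= {"restore", "crop"}
    if args.stage_2:
        selected |= {"upscale", "move"}
    return [step for step in STEP_ORDER if step in selected]
```

For example, `resolve_steps(["--tag", "--stage-1"])` yields restore, crop, tag in that order, regardless of the order in which the flags were typed.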

Why CLI has multiple steps...

Sometimes the cropper may crop false positives, or a face may simply be too small to upscale with reasonable quality. To save time and compute power, run --stage-1, then go to ./workspace/crop/ and manually delete anything you don't want processed further. Afterwards, you can continue with --stage-2.

About `--tag-prepend`

Prepends the artist/style/concept tag to every caption file that exists. The tag is not added to a caption that already contains it.
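The prepend-unless-present behaviour might look like the sketch below. It assumes captions are comma-separated tag lists, which is common for anime-style datasets but is not stated here; `prepend_tag` is a hypothetical helper, not a function from main.py.

```python
def prepend_tag(caption, tag, sep=", "):
    """Put `tag` at the front of a comma-separated caption unless it is already there."""
    tags = [t.strip() for t in caption.split(",") if t.strip()]
    if tag in tags:
        # Already tagged: leave the caption exactly as it was.
        return caption
    return sep.join([tag] + tags)
```

Applied to the caption `1girl, solo` with the tag `artist_x`, this returns `artist_x, 1girl, solo`; running it a second time leaves the caption unchanged.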

About `--clean`

Between datasets, you should clean the workspace, otherwise it will also process the previous dataset.
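A minimal sketch of what cleaning could amount to, assuming the intermediate files live under ./workspace/ (the path mentioned for cropped images) and that cleaning means emptying that folder. The actual behaviour of --clean may differ; `clean_workspace` is a hypothetical name.

```python
import shutil
from pathlib import Path


def clean_workspace(workspace="./workspace"):
    """Delete everything inside `workspace`, keeping the folder itself.

    Returns the number of top-level entries removed.
    """
    root = Path(workspace)
    if not root.is_dir():
        return 0
    removed = 0
    for child in root.iterdir():
        if child.is_dir():
            shutil.rmtree(child)  # e.g. a leftover crop/ folder from the last dataset
        else:
            child.unlink()
        removed += 1
    return removed
```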
