AnimateDiff is an AI video generator that combines Stable Diffusion with motion modules. It supports text-to-video, camera movements, image-to-video, and sketch-to-video. The method is described in the AnimateDiff paper. This tutorial is a step-by-step guide to generating short videos for free using AnimateDiff from GitHub.
How to install AnimateDiff from GitHub?
1. First, download the code from the AnimateDiff GitHub repository. Open a Command Prompt in the directory where you want to install it and type the command:
>git clone https://github.com/guoyww/AnimateDiff.git
2. Next, download Stable Diffusion v1.5 from Hugging Face. In the Command Prompt, type:
>cd AnimateDiff
>cd models\StableDiffusion
>git clone https://huggingface.co/runwayml/stable-diffusion-v1-5/
3. Go to Hugging Face or Civitai to download Stable Diffusion checkpoints, such as realisticVisionV51 or ToonYou_Beta6. Put them in “models\DreamBooth_LoRA”.
4. Go to the AnimateDiff page on Hugging Face to download motion modules, such as “mm_sd_v15_v2.ckpt” and “v3_sd15_mm.ckpt”, and put them in “models\Motion_Module”.
5. Now it is time to configure a virtual environment to run the code. If you haven’t installed Anaconda3, install it first.
6. Then set up a conda environment named animatediff.
7. After you install AnimateDiff, you are ready to generate videos.
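Before moving on, it can help to confirm that the downloads landed in the directories named in the steps above. The following is a minimal sketch using only the Python standard library. The function names are made up for illustration, and it assumes that the git clone in step 2 created a folder called “stable-diffusion-v1-5” and that checkpoints end in “.ckpt” or “.safetensors”.

```python
from pathlib import Path

# Assumption: checkpoint and motion module files use one of these extensions.
MODEL_SUFFIXES = {".ckpt", ".safetensors"}


def count_checkpoints(folder):
    """Count model files directly inside `folder` (0 if the folder is missing)."""
    folder = Path(folder)
    if not folder.is_dir():
        return 0
    return sum(
        1
        for p in folder.iterdir()
        if p.is_file() and p.suffix.lower() in MODEL_SUFFIXES
    )


def install_report(root):
    """Summarize what is present under the AnimateDiff directory `root`."""
    models = Path(root) / "models"
    return {
        # Assumption: the folder name comes from the clone URL in step 2.
        "stable_diffusion_cloned": (
            models / "StableDiffusion" / "stable-diffusion-v1-5"
        ).is_dir(),
        "dreambooth_checkpoints": count_checkpoints(models / "DreamBooth_LoRA"),
        "motion_modules": count_checkpoints(models / "Motion_Module"),
    }
```

Calling install_report("AnimateDiff") from the parent directory returns a small dictionary. A False or a 0 in it points to the step that still needs attention.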
How to run Text to Video using AnimateDiff?
1. Under the “AnimateDiff” directory, go to “configs\prompts\v1”. There are 8 sample yaml files that use different checkpoints, such as ToonYou and RealisticVision. Select one yaml file and edit it.
2. In the dreambooth_path field, compare the checkpoint name with the name of the checkpoint file in “models\DreamBooth_LoRA”. If the version numbers differ, update the yaml file to match the file name in the directory.
3. In the motion_module field, make sure you have downloaded the motion module “mm_sd_v15.ckpt” and put it in the “models\Motion_Module” directory.
4. The default width and height of the output video are 512×512. If you want to render faster, you can set a smaller width and height. Add these lines to the yaml file:
W: 256
H: 256
5. Change the prompt. You can run one prompt instead of four at a time by removing three prompts. Save the yaml file.
6. Open an Anaconda Prompt. Run the command:
>conda activate animatediff
7. In the Anaconda Prompt, go to the “AnimateDiff” directory and run one of your yaml files, for example:
>python -m scripts.animate --config configs/prompts/v1/v1-1-ToonYou.yaml
8. Monitor the progress in the Anaconda console. When it finishes, the result can be found in the “samples” directory.
9. Modify your prompts and run again until you find good results.
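Steps 2 and 3 above come down to checking that every checkpoint path in the yaml file points to a file that actually exists. Here is a rough sketch of that check using only the Python standard library. It does not parse yaml properly. It simply assumes that the paths appear as quoted strings ending in “.ckpt” or “.safetensors”, and the function names are invented for this example.

```python
import re
from pathlib import Path

# Assumption: checkpoint paths are written as quoted strings in the yaml file.
CHECKPOINT_RE = re.compile(r"""["']([^"']+\.(?:ckpt|safetensors))["']""")


def referenced_checkpoints(config_text):
    """Return checkpoint paths quoted in the yaml text, skipping comment lines."""
    paths = []
    for line in config_text.splitlines():
        if line.lstrip().startswith("#"):
            continue  # ignore lines that are commented out
        paths.extend(CHECKPOINT_RE.findall(line))
    return paths


def missing_checkpoints(config_text, root="."):
    """Return the referenced paths that do not exist under `root`."""
    root = Path(root)
    return [p for p in referenced_checkpoints(config_text) if not (root / p).is_file()]
```

For example, read the yaml file with Path("configs/prompts/v1/v1-1-ToonYou.yaml").read_text() from inside the AnimateDiff directory and pass the text to missing_checkpoints. An empty list means every referenced file was found.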
How to run Image to Video using AnimateDiff?
1. Go to the AnimateDiff page on Hugging Face, download “v3_sd15_adapter.ckpt” and “v3_sd15_mm.ckpt”, and put them in “models\MotionLoRA”. Download “v3_sd15_sparsectrl_rgb.ckpt” and put it in “models\SparseCtrl”.
2. Prepare an image file to animate. The image size can be 256×384, 256×256, or 512×512. Note that the smaller the image, the faster it runs. Put the image file in “__assets__\demos\image”.
3. Under the “AnimateDiff” directory, go to “configs\prompts\v3”. Open “v3-1-T2V.yaml”. The file contains a few sections for different demos.
4. It is easier to run one section at a time. To run Image to Video, keep the first section, “# 1-animation”, and remove the others. Save the file under a different name, such as “v3-1-T2V-animation.yaml”, then open that file and edit it.
5. In the field of controlnet_images, change the path of the image to your file.
6. Change the prompt to describe the scene.
7. Open an Anaconda Prompt. Run the command:
>conda activate animatediff
8. In the Anaconda Prompt, navigate to the “AnimateDiff” directory and run the yaml file:
>python -m scripts.animate --config configs/prompts/v3/v3-1-T2V-animation.yaml
9. Monitor the progress in the Anaconda console. When it finishes, the result can be found in the “samples” directory.
10. Modify your prompts and run again until you find good results.
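Since the edit, run, and check loop above is repeated many times, the run command and the search for the newest output can be wrapped in two small helpers. This is only a sketch. The helper names are made up, and it assumes the script is started from the “AnimateDiff” directory with the animatediff environment already activated, so that the current Python interpreter is the right one.

```python
import sys
from pathlib import Path


def animate_command(config_path):
    """Build the argument list for the run step.

    Note that the flag is written with two plain hyphens: --config.
    """
    return [sys.executable, "-m", "scripts.animate", "--config", str(config_path)]


def newest_result(samples_dir="samples"):
    """Return the most recently modified entry in the samples directory, or None."""
    samples = Path(samples_dir)
    if not samples.is_dir():
        return None
    entries = list(samples.iterdir())
    if not entries:
        return None
    return max(entries, key=lambda p: p.stat().st_mtime)
```

To use it, pass the list to subprocess.run, for example subprocess.run(animate_command("configs/prompts/v3/v3-1-T2V-animation.yaml"), check=True), and then call newest_result() to find the latest output.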
The current version of AnimateDiff v3 can create 16 frames, about 2 seconds of animation. The image resolution is 256 or 512 pixels.
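The relation between frame count and clip length is simple arithmetic: 16 frames lasting about 2 seconds implies a playback rate of about 8 frames per second. The short sketch below makes that explicit. The 8 frames per second figure is inferred from the numbers above and is not a documented setting, and the function names are only illustrative.

```python
def clip_seconds(frames, fps):
    """Length of a clip in seconds for a given frame count and frame rate."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return frames / fps


def implied_fps(frames, seconds):
    """Frame rate implied by a frame count and a clip length in seconds."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return frames / seconds


print(implied_fps(16, 2))   # 8.0
print(clip_seconds(16, 8))  # 2.0
```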