Text-to-Video or Image-to-Video for Children's Songs

📢 Project Report: 240919

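Key Features

  • A project for generating videos in a children's-song style
  • Implementation of and experiments with deep learning code for Text-to-Video and Image-to-Video
  • Implementation of and experiments with the AnimateDiff, CogVideo, ControlVideo, and Free-Bloom models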

View Project & Code


1. Prompts List

The following prompts were used to generate videos with different styles using various models:

  1. Realistic Style:
    “A video of a duckling wearing a medieval soldier helmet and riding a skateboard.”

  2. Cartoon Style:
    “A video of a duckling wearing a medieval soldier helmet and riding a skateboard in cartoon style.”

  3. Watercolor Style:
    “A video of a duckling wearing a medieval soldier helmet and riding a skateboard in watercolor style.”
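The three prompts share one base sentence and differ only in a style suffix, so they can be built programmatically. The following is a minimal sketch under that assumption; the names `BASE_PROMPT`, `STYLE_SUFFIXES`, and `build_prompts` are hypothetical and do not come from the repository.

```python
# Base sentence shared by all three prompts (taken from the list above).
BASE_PROMPT = (
    "A video of a duckling wearing a medieval soldier helmet "
    "and riding a skateboard"
)

# Hypothetical mapping: style name -> suffix appended to the base prompt.
# The realistic prompt has no suffix.
STYLE_SUFFIXES = {
    "realistic": "",
    "cartoon": " in cartoon style",
    "watercolor": " in watercolor style",
}


def build_prompts(base=BASE_PROMPT, suffixes=STYLE_SUFFIXES):
    """Return a dict mapping each style name to its full prompt string."""
    return {style: f"{base}{suffix}." for style, suffix in suffixes.items()}


if __name__ == "__main__":
    for style, prompt in build_prompts().items():
        print(f"{style}: {prompt}")
```

Keeping the base sentence fixed means that differences between generated videos can be attributed to the style suffix and the model rather than to prompt wording.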


2. Comparison Models

The video generation was performed using the following models:

  • AnimateDiff
  • Free-Bloom
  • CogVideo (available on Hugging Face)
  • ControlVideo

3. Inference

Clone the repository, then run the inference code for each model from the notebook listed below:

# Start path
git clone https://github.com/ssoojeong/Baby_Video.git
cd Baby_Video

  • Notebook Path: ./inference_code/inference.ipynb

The notebook is set up to run inference for each of the comparison models.


4. Results

The inference results for each model are saved in the following directory:

  • Output Path: ./inference_outputs/{model_name}
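To browse what each model produced, the per-model folders can be scanned with a short script. This is only a sketch: the folder names in `MODEL_NAMES` and the file extensions in `MEDIA_SUFFIXES` are assumptions, and the actual names under `./inference_outputs` may differ.

```python
from pathlib import Path

# Hypothetical folder names substituted for {model_name}; adjust to the repository.
MODEL_NAMES = ["animatediff", "free-bloom", "cogvideo", "controlvideo"]

# Assumed output formats; the notebook may write other file types.
MEDIA_SUFFIXES = {".gif", ".mp4"}


def list_outputs(root="./inference_outputs", model_names=MODEL_NAMES):
    """Return a dict mapping each model name to its sorted output file names."""
    results = {}
    for name in model_names:
        model_dir = Path(root) / name
        if not model_dir.is_dir():
            # No folder for this model yet: report an empty list.
            results[name] = []
            continue
        results[name] = sorted(
            p.name
            for p in model_dir.iterdir()
            if p.is_file() and p.suffix.lower() in MEDIA_SUFFIXES
        )
    return results


if __name__ == "__main__":
    for model, files in list_outputs().items():
        print(f"{model}: {len(files)} file(s)")
        for file_name in files:
            print(f"  {file_name}")
```

Models without an output folder are reported with an empty list rather than raising an error, so the script can be run before all models have been evaluated.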

View Sample Results

animatediff_duck
"A video of a duckling wearing a medieval soldier helmet and riding a skateboard in cartoon style." (generated by AnimateDiff)
animatediff_duck_2
"A video of a duckling wearing a medieval soldier helmet and riding a skateboard in watercolor style." (generated by AnimateDiff)
cogvideox_duck
"A video of a duckling wearing a medieval soldier helmet and riding a skateboard." (generated by CogVideoX)
cogvideox_duck_2
"A video of a duckling wearing a medieval soldier helmet and riding a skateboard in cartoon style." (generated by CogVideoX)