Tony's Practice Notebook


Inpainting

bellmake 2024. 8. 24. 12:39

Without mask: the UNet takes a 4-channel input (the latent only).

With mask: the UNet takes a 9-channel input (4 latent channels + 1 mask channel + 4 masked_image_latents channels).
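The channel counts above can be checked with a small shape-only sketch. It uses NumPy arrays as stand-ins for the real tensors and assumes a 512x512 input, which the VAE reduces to a 64x64 latent; the variable names are illustrative and do not come from the pipeline's source.

```python
import numpy as np

# Assumption: 512x512 image, VAE downscale factor 8 -> 64x64 latent grid.
batch, latent_h, latent_w = 1, 64, 64

# Stand-ins for the real tensors (random values, only the shapes matter here).
latents = np.random.randn(batch, 4, latent_h, latent_w).astype(np.float32)
mask = np.ones((batch, 1, latent_h, latent_w), dtype=np.float32)
masked_image_latents = np.random.randn(batch, 4, latent_h, latent_w).astype(np.float32)

# Without a mask, the UNet sees only the latent: 4 channels.
unet_input_without_mask = latents

# With a mask, the three tensors are concatenated along the channel axis: 4 + 1 + 4 = 9.
unet_input_with_mask = np.concatenate([latents, mask, masked_image_latents], axis=1)

print(unet_input_without_mask.shape)  # (1, 4, 64, 64)
print(unet_input_with_mask.shape)     # (1, 9, 64, 64)
```

This is why an inpainting checkpoint needs a UNet whose first convolution accepts 9 input channels, while a plain text-to-image checkpoint expects 4.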

 

[ Reference ]

test_sd_inpainting.py

from diffusers import StableDiffusionInpaintPipeline
from PIL import Image, ImageOps

import torch
import numpy as np
import cv2

# Load the inpainting checkpoint in half precision.
pipe = StableDiffusionInpaintPipeline.from_pretrained(
    'runwayml/stable-diffusion-inpainting',
    revision='fp16',
    torch_dtype=torch.float16,
)

pipe = pipe.to('cuda')
prompt = 'a window with blue ocean scenery'

# Load the source image, apply its EXIF orientation, and resize it to 512x512.
image = Image.open('/home/joseph/study/multimodal/ai_editor/my_data/livingroom.png')
image = ImageOps.exif_transpose(image)
image = image.resize((512, 512))
mask_image = Image.open('/home/joseph/study/multimodal/ai_editor/my_data/mask.png')

# Dilate the mask so the region to repaint extends slightly past the object boundary.
kernel = np.ones((3, 3), np.uint8)
mask_image = cv2.dilate(np.array(mask_image), kernel, iterations=10)
mask_image = cv2.resize(mask_image, (512, 512))

# White mask pixels are repainted according to the prompt; black pixels are kept.
image = pipe(prompt=prompt, image=image, mask_image=Image.fromarray(mask_image)).images[0]

image.save('/home/joseph/study/multimodal/ai_editor/my_data_result/inpainted.png')
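To see what the dilation step in the script does to the mask, here is a minimal sketch on a toy mask. To keep it dependency-light it does not call cv2.dilate; `dilate3x3` is a hypothetical NumPy stand-in that mimics dilation with a 3x3 all-ones kernel, and the 64x64 mask with a 4x4 white square is made-up test data.

```python
import numpy as np

def dilate3x3(mask: np.ndarray, iterations: int = 1) -> np.ndarray:
    """NumPy stand-in (assumption) for dilation with a 3x3 all-ones kernel."""
    out = mask.copy()
    h, w = out.shape
    for _ in range(iterations):
        # Pad with black so the border never introduces white pixels.
        padded = np.pad(out, 1, mode="constant", constant_values=0)
        # Each pixel becomes the maximum of its 3x3 neighbourhood.
        out = np.max(
            [padded[dy:dy + h, dx:dx + w] for dy in range(3) for dx in range(3)],
            axis=0,
        )
    return out

# Toy mask: a 4x4 white square in the middle of a 64x64 black image.
mask = np.zeros((64, 64), np.uint8)
mask[30:34, 30:34] = 255

dilated = dilate3x3(mask, iterations=10)

# Measure the bounding box of the white region after dilation.
ys, xs = np.nonzero(dilated)
height = int(ys.max() - ys.min() + 1)
width = int(xs.max() - xs.min() + 1)
print(height, width)  # 24 24
```

Each pass of a 3x3 kernel grows the white region by one pixel on every side, so 10 iterations widen the 4-pixel square to 4 + 2 * 10 = 24 pixels. In the script, the same growth is applied to the mask at its original resolution, before it is resized to 512x512.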

 

 


[Figures: Original Image, Mask, Inpainted Image]

 
