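1. Stable Diffusion 1.5
When loading a diffusion model from Hugging Face, passing a long prompt always triggers an error saying it exceeds CLIP's maximum length limit (77 tokens).
Solution: use the compel library.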
import torch
from compel import Compel
from diffusers import AutoPipelineForText2Image

device = torch.device("cuda:3")

# Base model
model_path = "/data1/zhikun.zhao/huggingface_test/hubd/stable-diffusion-v1-5"
pipeline = AutoPipelineForText2Image.from_pretrained(
    model_path, torch_dtype=torch.float32
).to(device)

# Load the LoRA adapter
pipeline.load_lora_weights(
    "/data1/zhikun.zhao/huggingface_test/hubd/adapter/c_adapt1",
    weight_name="zhenshi.safetensors",
    adapter_name="zhenshi",
)

# Fix the seed so results are repeatable and reproducible
generator = torch.Generator("cuda:3").manual_seed(31)

prompt = "score_7_up, realhuman, photo_\\(medium\\), (dreamy, haze:1.2), (shot on GoPro hero:1.3), instagram, ultra-realistic, high quality, high resolution, RAW photo, 8k, 4k, soft shadows, artistic, shy, bashful, innocent, interior, dramatic, dynamic composition, 18yo woman, medium shot, closeup, petite 18-year-old woman, (hazel eyes,lip piercing,long silver straight hairs,Layered Curls cut, effect ,Sad expression, Downturned mouth, drooping eyelids, furrowed brows:0.8), wearing a figure-hugging dress with a plunging neckline and lace details, paired with black opaque tights pantyhose and knee-high leather boots, The look is bold and daring, perfect for a night out, detailed interior space, "
negative_prompt = "score_1, skinny, slim, ribs, abs, 2girls, piercings, bimbo breasts, professional, bokeh, blurry, text"

# truncate_long_prompts=False makes Compel split prompts longer than 77 tokens
# into chunks; with the default (True) the extra tokens are cut off.
compel = Compel(
    tokenizer=pipeline.tokenizer,
    text_encoder=pipeline.text_encoder,
    truncate_long_prompts=False,
)
# compel.build_conditioning_tensor(prompt) and compel(prompt) are interchangeable
conditioning = compel.build_conditioning_tensor(prompt)
negative_conditioning = compel.build_conditioning_tensor(negative_prompt)
# The positive and negative embeddings must have the same sequence length
[conditioning, negative_conditioning] = compel.pad_conditioning_tensors_to_same_length(
    [conditioning, negative_conditioning]
)

out = pipeline(
    prompt_embeds=conditioning,
    negative_prompt_embeds=negative_conditioning,  # pass the negative embeddings as well
    num_images_per_prompt=1,
    generator=generator,
    num_inference_steps=50,  # 50 steps is usually enough
    height=1024,
    width=1024,
    guidance_scale=7,  # prompt adherence: higher values follow the text more closely, but values that are too high hurt quality
)
image = out.images[0]
image.save("img/test.png")
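
Why the padding step matters can be shown with plain arithmetic. CLIP's text encoder works on windows of 77 tokens (75 prompt tokens plus a start and an end marker). Compel's documentation describes long prompts being chunked and padded to an integer multiple of that window when `truncate_long_prompts=False` is set. The sketch below only illustrates that idea in pure Python; it is not Compel's actual implementation, and the helper names `chunk_token_ids` and `embedding_length`, as well as the fake token ids, are hypothetical.

```python
from typing import List

MAX_LEN = 77               # CLIP text-encoder window, including start/end markers
BOS, EOS = 49406, 49407    # CLIP start-of-text / end-of-text token ids


def chunk_token_ids(ids: List[int], max_len: int = MAX_LEN) -> List[List[int]]:
    """Split prompt token ids into windows of max_len - 2 tokens.

    Each window is wrapped in BOS/EOS and padded with EOS up to max_len.
    Illustrative only: a real library may pad or split differently.
    """
    body = max_len - 2
    chunks = []
    # max(len(ids), 1) guarantees one (empty) window for an empty prompt
    for start in range(0, max(len(ids), 1), body):
        window = [BOS] + ids[start:start + body] + [EOS]
        window += [EOS] * (max_len - len(window))
        chunks.append(window)
    return chunks


def embedding_length(num_tokens: int, max_len: int = MAX_LEN) -> int:
    """Sequence length of the concatenated embedding for a prompt of num_tokens."""
    fake_ids = list(range(num_tokens))  # placeholder ids, not real tokens
    return len(chunk_token_ids(fake_ids, max_len)) * max_len


print(embedding_length(160))  # 231: three 77-token windows
print(embedding_length(12))   # 77: one window
```

A 160-token prompt and a 12-token negative prompt therefore produce embeddings of different sequence lengths (231 versus 77), which is why the two tensors have to be padded to the same length before they are handed to the pipeline.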
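2. Stable Diffusion XL 1.0
With an SDXL 1.0 model, the solution above fails with a message saying that pooled_prompt_embeds must be supplied whenever prompt_embeds is passed.
Change the pipeline call as follows: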
out = pipeline(
    prompt_embeds=conditioning[0],
    pooled_prompt_embeds=conditioning[1],
    negative_prompt_embeds=negative_conditioning[0],
    negative_pooled_prompt_embeds=negative_conditioning[1],
    num_images_per_prompt=1,
    generator=generator,
    num_inference_steps=50,  # 50 steps is usually enough
    height=1024,
    width=768,
    guidance_scale=3,  # prompt adherence: higher values follow the text more closely, but values that are too high hurt quality
)
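
The call above indexes `conditioning[0]` and `conditioning[1]`, which implies that Compel returned an (embeddings, pooled embeddings) pair. The post does not show how Compel was configured for that, so the following is a minimal sketch under assumptions: it follows the SDXL setup described in the Compel and Diffusers documentation (both tokenizers and text encoders, `requires_pooled=[False, True]`), and the helper names `to_sdxl_kwargs` and `encode_for_sdxl` are hypothetical. Combining it with `truncate_long_prompts=False` is likewise an assumption carried over from section 1.

```python
from typing import Any, Dict, Tuple


def to_sdxl_kwargs(positive: Tuple[Any, Any], negative: Tuple[Any, Any]) -> Dict[str, Any]:
    """Map two (embeds, pooled) pairs onto the keyword arguments SDXL expects."""
    prompt_embeds, pooled = positive
    negative_embeds, negative_pooled = negative
    return {
        "prompt_embeds": prompt_embeds,
        "pooled_prompt_embeds": pooled,
        "negative_prompt_embeds": negative_embeds,
        "negative_pooled_prompt_embeds": negative_pooled,
    }


def encode_for_sdxl(pipeline, prompt: str, negative_prompt: str) -> Dict[str, Any]:
    """Encode both prompts with Compel for an SDXL pipeline (sketch, untested here)."""
    # Imported lazily so this file can be loaded where compel is not installed.
    from compel import Compel, ReturnedEmbeddingsType

    compel = Compel(
        tokenizer=[pipeline.tokenizer, pipeline.tokenizer_2],
        text_encoder=[pipeline.text_encoder, pipeline.text_encoder_2],
        returned_embeddings_type=ReturnedEmbeddingsType.PENULTIMATE_HIDDEN_STATES_NON_NORMALIZED,
        requires_pooled=[False, True],   # only the second encoder supplies pooled output
        truncate_long_prompts=False,     # assumption: chunk long prompts instead of truncating
    )
    embeds, pooled = compel(prompt)
    negative_embeds, negative_pooled = compel(negative_prompt)
    # Only the sequence embeddings need padding; pooled embeddings have a fixed size.
    embeds, negative_embeds = compel.pad_conditioning_tensors_to_same_length(
        [embeds, negative_embeds]
    )
    return to_sdxl_kwargs((embeds, pooled), (negative_embeds, negative_pooled))
```

With such a helper, the call could be written as `pipeline(**encode_for_sdxl(pipeline, prompt, negative_prompt), generator=generator, num_inference_steps=50, height=1024, width=768, guidance_scale=3)`.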