Harry Potter Fashion Week

I was scrolling through TikTok and saw a hilarious video of AI-generated Harry Potter characters as Balenciaga models. Here is the link if you want to see it. The creator used Midjourney to make the images and D-ID to animate the characters to some audio clips. I wanted to see if I could get something similar with Stable Diffusion.

My Approach

Prompts

To replicate the images above, I wrote prompts describing each character and threw in words like Balenciaga, 1990s, robe, dress, fashion, movie, and Harry Potter. Some of my go-to prompt additions include: ((best quality)), ((masterpiece)), ((realistic)), (detailed), high quality, 35mm film, fujifilm, vibrant colors, symmetrical face, 4k, photorealistic, high definition, low angle shot.
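If you would rather script this than click through the web UI, here is a rough sketch of the same txt2img step using the Hugging Face diffusers library. It is a sketch, not exactly what I ran: the model ID and sampler settings are reasonable defaults, and note that the ((word)) emphasis syntax is an Automatic1111 feature, so plain diffusers treats the parentheses as literal text.

```python
import torch
from diffusers import StableDiffusionPipeline

# Load Stable Diffusion v1.5 (assumed model ID; any SD 1.5 checkpoint works).
pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

# Character description plus the style keywords from the prompt above.
# The (( )) weighting only has an effect in the Automatic1111 web UI.
prompt = (
    "Harry Potter as a Balenciaga model, 1990s, robe, fashion, movie, "
    "((best quality)), ((masterpiece)), ((realistic)), (detailed), "
    "35mm film, fujifilm, vibrant colors, symmetrical face, 4k, "
    "photorealistic, high definition, low angle shot"
)

image = pipe(prompt, num_inference_steps=30, guidance_scale=7.5).images[0]
image.save("harry_balenciaga.png")
```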

I have downloaded some negative embeddings that I use for most generations. The full negative prompt looked like this: (((EasyNegative, bad_prompt_version2, bad-image-v2-39000))), ((((ugly)))), (((duplicate))), ((morbid)), ((mutilated)), [out of frame], extra fingers, mutated hands, ((poorly drawn hands)), ((poorly drawn face)), (((mutation))), (((deformed))), ((ugly)), blurry, ((bad anatomy)), (((bad proportions))), ((extra limbs)), cloned face, (((disfigured))), out of frame, ugly, extra limbs, (bad anatomy), gross proportions, (malformed limbs), ((missing arms)), ((missing legs)), (((extra arms))), (((extra legs))), mutated hands, (fused fingers), (too many fingers), (((long neck))), (deformed iris, deformed pupils, bad eyes, semi-realistic:1.4).

For negative prompts, I usually just copy and paste whatever people online say has been working for them.
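For reference, here is a hedged sketch of how those negative embeddings could be wired up in diffusers. In Automatic1111 you just drop the files into the embeddings folder, so the file paths below are assumptions based on the embedding names, not real downloads.

```python
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

# Textual-inversion embeddings must be loaded explicitly in diffusers.
# Paths are hypothetical; the tokens match how they appear in the prompt.
pipe.load_textual_inversion("embeddings/EasyNegative.safetensors", token="EasyNegative")
pipe.load_textual_inversion("embeddings/bad_prompt_version2.pt", token="bad_prompt_version2")

# A trimmed version of the negative prompt above (no A1111 weighting syntax).
negative_prompt = (
    "EasyNegative, bad_prompt_version2, ugly, duplicate, mutilated, "
    "out of frame, extra fingers, mutated hands, poorly drawn hands, "
    "poorly drawn face, deformed, blurry, bad anatomy, bad proportions, "
    "extra limbs, cloned face, disfigured, long neck"
)

image = pipe(
    "Harry Potter as a Balenciaga model, 1990s fashion photo",
    negative_prompt=negative_prompt,
).images[0]
```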

Pose

For every image I made, I used the Openpose Editor extension available through the Automatic1111 web UI. I would Google for photos of models posing, detect their pose, make edits if needed, and then combine that pose with the ControlNet Openpose model for Stable Diffusion v1.5.
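Here is a rough diffusers equivalent of that pose workflow, with the controlnet_aux detector standing in for the Openpose Editor (it doesn't let you hand-edit the skeleton the way the extension does). The reference photo filename is a placeholder.

```python
import torch
from PIL import Image
from controlnet_aux import OpenposeDetector
from diffusers import ControlNetModel, StableDiffusionControlNetPipeline

# Extract a stick-figure pose map from a reference photo of a model posing.
detector = OpenposeDetector.from_pretrained("lllyasviel/ControlNet")
pose_image = detector(Image.open("model_reference.jpg"))  # placeholder file

# ControlNet Openpose model for Stable Diffusion v1.5.
controlnet = ControlNetModel.from_pretrained(
    "lllyasviel/sd-controlnet-openpose", torch_dtype=torch.float16
)
pipe = StableDiffusionControlNetPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", controlnet=controlnet, torch_dtype=torch.float16
).to("cuda")

# The pose map conditions the generation so the character matches the pose.
image = pipe(
    "Ron Weasley as a Balenciaga model, 1990s, robe, fashion, 35mm film",
    image=pose_image,
).images[0]
```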

Upscaling

Lastly, I upscaled the images to 4K. Once I was happy with a random seed for a character, I would send that image to the img2img section. I used the built-in upscaling first, and then sent the image to the extras section, where an ESRGAN model did the final upscale.
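I did the upscaling through the web UI, but a scripted stand-in could look something like this: diffusers' x4 upscaler in place of the img2img pass plus extras-tab ESRGAN step (the specific ESRGAN model I used isn't part of diffusers, so this is a substitute, not my exact pipeline).

```python
import torch
from PIL import Image
from diffusers import StableDiffusionUpscalePipeline

# Stable Diffusion x4 upscaler as a stand-in for the A1111 upscaling flow.
upscaler = StableDiffusionUpscalePipeline.from_pretrained(
    "stabilityai/stable-diffusion-x4-upscaler", torch_dtype=torch.float16
).to("cuda")

# Take the generation I was happy with and upscale it 4x (512px -> 2048px).
low_res = Image.open("harry_balenciaga.png").convert("RGB")
upscaled = upscaler(
    prompt="photo of a fashion model, high detail",  # a short guiding prompt
    image=low_res,
).images[0]
upscaled.save("harry_balenciaga_4k.png")
```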

Results

Below are the results! There is definitely way more leather and more chiseled faces in the Midjourney outputs. I had never messed around with Midjourney before, but my friend showed me some results he got doing Breaking Bad in Balenciaga. His results look almost identical in style to the original creator's Harry Potter video.