Let's Learn from AI! Recreating Dynamic Compositions

Four-frame comic strip of times to study composition in English.png (1133×1600)
  • Anime models offer diverse compositions
  • SDXL excels at brightness adjustments
  • Use controlnet to apply Flux.1

Introduction

Hello, I'm Easygoing.

Today, we're going to explore how to recreate dynamic compositions in illustrations!

Previously Written About Anime Compositions

I previously wrote an article about how to apply anime compositions to illustrations.

This time, I want to dive deeper into some of the points I noticed back then.

Theme: Strawberry Shortcake

The theme for this article is perfect for autumn's appetite: strawberry shortcake!

We'll compare three models across SDXL and Flux.1. Alongside each illustration, we'll also look at its brightness distribution.

SDXL - anima_pencil-XL_v500 (Anime)

Clip art of strawberry shortcake.png (1024×1024)
Histogram of brightness for illustration of strawberry shortcake on SDXL animated model.png (1056×556)

The right side shows a higher brightness level

  • SDXL anime model
  • Expresses character softness and emotions
  • Capable of producing semi-realistic images

The brightness histogram for the anime model is weighted toward the right side, which represents the brighter tones. Overall, it produces bright, poppy illustrations.
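For readers who want to check a brightness distribution themselves, here is a minimal sketch in plain Python. It is not the tool used for the histograms in this article. The Rec. 601 luma weights, the three tonal bands, and the sample pixels are all assumptions chosen for illustration; in practice the pixel list would come from an imaging library.

```python
from collections import Counter

def luma(rgb):
    """Approximate perceived brightness (0-255) using Rec. 601 weights."""
    r, g, b = rgb
    return round(0.299 * r + 0.587 * g + 0.114 * b)

def brightness_histogram(pixels):
    """Return a 256-entry list: how many pixels fall on each brightness level."""
    counts = Counter(luma(p) for p in pixels)
    return [counts.get(level, 0) for level in range(256)]

def tonal_shares(hist):
    """Share of pixels in shadows (0-84), midtones (85-169), highlights (170-255).

    The band boundaries are an arbitrary three-way split, not a standard.
    """
    total = sum(hist) or 1
    return {
        "shadows": sum(hist[:85]) / total,
        "midtones": sum(hist[85:170]) / total,
        "highlights": sum(hist[170:]) / total,
    }

# Hypothetical sample: mostly pale whites and pinks, like a bright anime render
sample_pixels = [(250, 240, 240)] * 6 + [(230, 120, 130)] * 3 + [(40, 30, 30)] * 1
shares = tonal_shares(brightness_histogram(sample_pixels))
print(shares)
```

A histogram that "leans right" shows up here as a large highlights share. In this made-up sample, six of the ten pixels land in the highlights band.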

If this image were on a café menu, you'd likely feel tempted to order the dessert!

SDXL - RealVisXL_V5.0 (Photorealistic)

Photorealistic illustration of delicious strawberry shortcake SDXL.png (1024×1024)
Brightness histogram of SDXL's photorealistic model illustration of strawberry shortcake.png (1056×548)

Emphasis on the dark areas on the left

  • SDXL’s top-tier photorealistic model
  • FP32 format, 13.6 GB
  • Recommended with 50 steps, a true "supermodel"

RealVisXL is the pinnacle of SDXL's photorealistic models. The latest version was released in September 2024, and its large model size and high recommended step count hint at its rendering power.

Even though SDXL is the previous generation of models, RealVisXL beautifully captures the texture of the cake sponge and strawberries.

The darker tones emphasize the contrast, making the overall image look crisp and sharp.

Flux.1 - FluxesCore-Dev_V1.0 (Photorealistic)

Flux1 photo-realistic illustration of a delicious strawberry shortcake.png (2576×2576)
Brightness histogram of strawberry shortcake illustration of Flux1 photorealistic model.png (1056×560)

Centered, lower contrast

  • Flux.1’s photorealistic model
  • Fixes the issue where "Japanese" prompts used to turn people into anime characters
  • Specialized in rendering Asian features after additional training

The FluxesCore series, created by TofuNoKakera, is a family of photorealistic models based on Flux.1. While FluxesCore-Schnell is available only to members, FluxesCore-Dev is freely available (for non-commercial use).

Though the shortcake image feels slightly different from typical Japanese depictions, the texture and three-dimensionality are top-notch.

The brightness distribution is concentrated in the center, closely mimicking real-life light and shadow.

Is Flux.1 Subtle?

When comparing the three images, Flux.1 may seem subtle at first glance.

This is because its low contrast reduces the immediate visual impact.
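One common way to put a number on "low contrast" is RMS contrast, the standard deviation of the brightness values. The sketch below assumes brightness levels on a 0 to 255 scale, and the two sample lists are invented purely for illustration; they are not measurements from the images in this article.

```python
from statistics import pstdev

def rms_contrast(levels):
    """RMS contrast: population standard deviation of brightness scaled to 0..1."""
    return pstdev([level / 255 for level in levels])

# Hypothetical brightness samples with the same mean but different spread
low_contrast = [100, 120, 140, 160]    # tones clustered around the middle
high_contrast = [10, 80, 180, 250]     # tones reaching into shadows and highlights

print(rms_contrast(low_contrast))
print(rms_contrast(high_contrast))
```

Both lists have the same average brightness, so the difference in the result comes only from how widely the tones are spread, which is what a histogram concentrated in the center reflects.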

Contrast Can Be Adjusted, But...

Although Flux.1 images have low contrast, you can adjust them with post-processing.

Flux1 photo-realistic illustration of a delicious strawbertone adjust.png (2576×2576)
Tone correction
Flux1 photo-realistic illustration of a delicious strawberry shortcake blend mode blue.png (2576×2576)
Blend mode: "Blue"

However, this correction is applied to PNG files rather than RAW images, so it cannot recover any information that was not already recorded.

Each correction you apply also discards a little more of the original information.
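The effect can be seen with a small experiment on 8-bit values. This is only a sketch: `stretch_levels` is a hypothetical helper that imitates a simple linear levels adjustment, not the tone correction used for the images above, and the input is an artificial ramp rather than a real image.

```python
def stretch_levels(levels, black, white):
    """Linear levels adjustment on 8-bit values.

    Maps black..white onto 0..255 and clips anything outside that range.
    """
    out = []
    for v in levels:
        scaled = (v - black) * 255 / (white - black)
        out.append(min(255, max(0, round(scaled))))
    return out

# Hypothetical low-contrast image: every level from 64 to 191 appears once
flat = list(range(64, 192))                    # 128 distinct levels

# Gentle stretch: the tones now span 0..255, but no new tones appear
stretched = stretch_levels(flat, 64, 191)
print(len(set(flat)), len(set(stretched)))     # distinct levels before and after
print(min(stretched), max(stretched))

# Stronger stretch: the ends clip, and many distinct levels merge
clipped = stretch_levels(flat, 96, 160)
print(len(set(clipped)))
```

The gentle stretch keeps the same number of distinct levels and simply spaces them out, which leaves gaps in the histogram. The stronger stretch pushes everything below 96 to black and everything above 160 to white, so detail in those areas is gone for good.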

While lower contrast can look more natural, it is more efficient to have the AI generate brighter tones directly.

Recreate Anime Composition and Brightness with Flux.1!

Now, for the main topic.

Let’s recreate SDXL’s composition and brightness using Flux.1 through a specific workflow.

We’ll use controlnet with depth information to lock down the composition.


flowchart LR
subgraph SDXL
A(anima_pencil-XL<br>1024 x 1024)
B(RealVisXL<br>1024 x 1024)
end
subgraph Flux.1
C(FluxesCore-Dev<br>1440 x 1440)
D(FluxesCore-Dev<br>2580 x 2580)
end
A-->|controlnet<br>depth|B
B-->C
C-->|upscale<br>x 1.8|D

SDXL-Flux1_Realistic.7z

Testing Models

  • anima_pencil-XL_v500
  • DepthAnything-V2_large_fp16
  • RealVisXL_V5.0
  • T5xxl-Q_5_K_M
  • FluxesCore-Dev_V1-Q_8.gguf (created from FP16 via flux_tool)

Note: FluxesCore-Dev is a non-commercial model.

Why Include RealVisXL?

Rather than passing depth information directly from the anime model to Flux.1, we insert RealVisXL in between for the following reasons:

  • Incorporate RealVisXL’s brightness
  • SDXL's controlnet models are more reliable, since Flux.1's controlnet models are still relatively new
  • Flux.1 consumes a lot of VRAM, so we run controlnet on the SDXL model to conserve resources
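To give a rough sense of the VRAM point, here is a back-of-the-envelope estimate of the weight size alone. The figures are assumptions rather than measurements from this workflow: roughly 12 billion parameters for the Flux.1 [dev] transformer (per its public model card), and 8.5 bits per weight for GGUF Q8_0 (34 bytes per block of 32 weights). Text encoders, the VAE, and activations are not counted.

```python
def weight_size_gb(params_billion, bits_per_weight):
    """Approximate weight size in GB (10^9 bytes) from parameter count and bit width."""
    return params_billion * bits_per_weight / 8

FLUX_PARAMS_B = 12.0   # assumption: about 12 billion parameters

fp16_gb = weight_size_gb(FLUX_PARAMS_B, 16)    # FP16: 16 bits per weight
q8_gb = weight_size_gb(FLUX_PARAMS_B, 8.5)     # Q8_0: 8.5 bits per weight (assumed)

print(fp16_gb, q8_gb)
```

Even the quantized weights take up a large share of a consumer GPU's memory under these assumptions, which is why running the controlnet step on the smaller SDXL model is attractive.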

Let’s Draw a Girl!

Now, let’s create an image. To reproduce an anime composition, we’ll include the following in the prompt: dutch angle and close up.

realistic, photorealistic, girl, teenage, dutch angle, close up

anima_pencil-XL

Illustration of a close-up of a young woman in cartoon composition.png (1024×1024)

The result is a beautiful woman with chestnut hair. Although anima_pencil-XL is an anime model, it handles photorealistic expressions well.

However, since it’s an anime model, the eyes are larger, and the overall brightness is higher than in a realistic photo.

Depth (Depth Information)

Depth map of illustration of a close-up of a young woman in cartoon composition.png (1024×1024)

This is the depth map generated from the previous image. Blue represents the foreground, and red represents the background.
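For the curious, coloring a depth map like this can be sketched as a simple ramp from blue to red. This is an illustrative assumption, not the colormap the depth preprocessor actually uses, and both `normalize` and `depth_to_rgb` are hypothetical helpers. The sketch also assumes smaller raw values mean "nearer", which may be inverted for some depth models.

```python
def normalize(depths):
    """Rescale raw depth values to 0.0 (nearest) .. 1.0 (farthest)."""
    lo, hi = min(depths), max(depths)
    if hi == lo:
        return [0.0 for _ in depths]
    return [(d - lo) / (hi - lo) for d in depths]

def depth_to_rgb(depth):
    """Map normalized depth to a blue (near) to red (far) ramp."""
    d = min(1.0, max(0.0, depth))
    return (round(255 * d), 0, round(255 * (1 - d)))

# Hypothetical raw depth samples: face, shoulder, wall behind the subject
raw = [0.8, 1.5, 4.0]
colors = [depth_to_rgb(d) for d in normalize(raw)]
print(colors)
```

The nearest sample comes out pure blue and the farthest pure red, matching the reading of the depth map described above.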

RealVisXL

Illustration of a live-action SDXL model based on a depth map of an illustration of a close-up of a young woman in an animated composition.png (1024×1024)

Next, we generate an image from the depth information. As often happens with AI, the result is a blonde woman, since the prompt does not specify a hair color.

The composition follows the depth map, with smaller eyes and more realistic brightness.

FluxesCore-Dev

Illustration of a live-action Flux1 model based on a depth map of an illustration of a close-up of a young woman in cartoon composition.png (2576×2576)

Lastly, we redraw and upscale the image with Flux.1. Although FluxesCore specializes in Asian depictions, for comparison’s sake, we kept the blonde hair.

As expected, the texture and depth in the image are outstanding.

However, when it comes to natural skin tones, SDXL models still hold the upper hand for now.

Showcasing the Results!

Here are the examples:

  • Top: SDXL → Flux.1 anime composition
  • Bottom: Flux.1 original

Streets of Paris

Real photo illustration of a Parisian cityscape in autumn with anime composition.png (2576×2576)
Wide angle
Real photo illustration of Paris streetscape in autumn.png (2576×2576)

Stained Glass of a Church

Photorealistic illustration of stained glass inside a majestic church in anime composition.png (2576×2576)
Diagonal composition
Photorealistic illustration of stained glass inside a majestic church.png (2576×2576)

Detailed Diorama

Photo-realistic illustration of an elaborate model train diorama in anime composition.png (2576×2576)
Framing
Photo-realistic illustration of an elaborate model train diorama.png (2576×2576)

Busy Airport

Illustration inside an airport building with people coming and going in a anime composition.png (2576×2576)
Linear perspective
Illustration of the inside of an airport building where people come and go.png (2576×2576)

Japanese Alleyway

Photorealistic illustration of a Japanese alleyway at dusk in an animated composition.png (2576×2576)
Shift away from the center
Photorealistic Illustration of Japanese alley at dusk.png (2576×2576)

Woman in Kimono

Photographic realistic illustration of a young woman in a blue kimono standing in a Japanese garden in an animated composition.png (2576×2576)
Diagonal composition and movement
Photographic realistic illustration of a young woman in a blue kimono standing in a Japanese garden.png (2576×2576)

Chef

Real photo illustration of an experienced cook preparing food in a small restaurant with a beautiful anime composition.png (2576×2576)
Arrangement of ingredients in motion
Real photo illustration of an experienced cook preparing food in a beautiful small restaurant.png (2576×2576)

Horse Racing

Real photo illustration of a horse galloping in a anime composition horse race.png (2576×2576)
Rightward running adds motion
Illustration of a real photo of a horse galloping in a horse race.png (2576×2576)

Couple in a Skyscraper

Real photo illustration of a couple against a beautiful background on a skyscraper rooftop in an anime composition3.png (2576×2576)
Space behind expresses feelings
Real photo illustration of a couple looking at each other against a beautiful background on a skyscraper rooftop.png (2576×2576)

Cyberpunk Boy

Real photo illustration of a cyberpunk slum boy in a anime composition.png (2576×2576)
Sucked into the elevator
Real photo illustration of a cyberpunk slum boy.png (2576×2576)

Anime Compositions Convey Movement!

Anime models bring dynamic compositions to life, adding motion to the illustrations.

In contrast, Flux.1 original images focus more directly on the subject, creating a heavier atmosphere.

It seems best to use each depending on whether you want to express "motion" or "stillness."

SDXL’s Brightness Makes It Convenient

Regarding brightness, SDXL images come out brighter and with higher contrast, making them easier to use as-is.

Since we only have limited time to adjust prompts and edit the final image, being able to generate images that are almost ready from the get-go is a big advantage.

While Flux.1 excels in texture, SDXL still offers better ease of use.

Conclusion: Use AI as Your Teacher!

  • Anime models offer diverse compositions
  • SDXL excels at brightness adjustments
  • Use controlnet to apply Flux.1

These AI models have learned from more than 5 billion images, and custom models additionally reflect their creators' unique styles.

Unlike human mentors, AI doesn’t get annoyed if you keep asking for advice.

Illustration of a young woman with pink hair wearing pink long sleeves, sitting on the terrace of a coffee shop, looking at us and smiling (2576×2576)

Through this process, I’ve discovered that AI can inspire new ideas.

Although I usually generate anime illustrations, this exercise reminded me that both SDXL and Flux.1 have impressive photorealistic capabilities.

AI image generation continues to fascinate me, and I’m excited to keep exploring!

Thank you for reading to the end!