Let's Learn from AI! Recreating Dynamic Compositions

Flux.1 #ComfyUI #composition #Flux.1 #photorealistic #SDXL

2024-10-11 / 2025-2-13

Four-frame comic strip of times to study composition in English.png (1133×1600)

Anime models offer diverse compositions
SDXL excels at brightness adjustments
Use controlnet to apply Flux.1

Introduction

Hello, I'm Easygoing.

Today, we're going to explore how to recreate dynamic compositions in illustrations!

My Earlier Article on Anime Compositions

I previously wrote an article about how to apply anime compositions to illustrations.

Enhance Flux.1's Expression! How to Incorporate SDXL's Composition Techniques | AI image journey

This time, I want to dive deeper into some of the points I noticed back then.

Theme: Strawberry Shortcake

The theme for this article is perfect for autumn's appetite: strawberry shortcake!

We'll be comparing three models: two based on SDXL and one based on Flux.1. Along with each illustration, we'll also look at its brightness distribution.

SDXL - anima_pencil-XL_v500 (Anime)

Clip art of strawberry shortcake.png (1024×1024)

Histogram of brightness for illustration of strawberry shortcake on SDXL animated model.png (1056×556)

The right side shows a higher brightness level

SDXL anime model
Expresses character softness and emotions
Capable of producing semi-realistic images

The brightness histogram for the anime model is weighted toward the right side, which represents the brighter tones. Overall, it produces bright and poppy illustrations.
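If you'd like to check the brightness distribution of your own images, here is a minimal sketch in plain Python. It assumes the pixels have already been loaded as a list of (R, G, B) tuples with an image library of your choice. The function names and the shadow/midtone/highlight cutoffs (85 and 170) are my own choices for illustration, not something taken from the histogram tool used for the charts in this article.

```python
def luminance(rgb):
    """Approximate brightness (0-255) of one pixel using Rec. 601 luma weights."""
    r, g, b = rgb
    return min(255, round(0.299 * r + 0.587 * g + 0.114 * b))


def brightness_histogram(pixels):
    """Count how many pixels fall on each of the 256 brightness levels."""
    hist = [0] * 256
    for px in pixels:
        hist[luminance(px)] += 1
    return hist


def tone_shares(hist):
    """Return the share of pixels in shadows, midtones, and highlights.

    The cutoffs at 85 and 170 are arbitrary thirds, assumed for illustration.
    """
    total = sum(hist)
    if total == 0:
        raise ValueError("histogram is empty")
    shadows = sum(hist[:85]) / total
    midtones = sum(hist[85:170]) / total
    highlights = sum(hist[170:]) / total
    return shadows, midtones, highlights
```

An image whose histogram leans to the right, like the anime model's, would show a large highlights share; one that leans left would show a large shadows share.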

If this image were on a café menu, you'd likely feel tempted to order the dessert!

SDXL - RealVisXL_V5.0 (Photorealistic)

Photorealistic illustration of delicious strawberry shortcake SDXL.png (1024×1024)

Brightness histogram of SDXL's photorealistic model illustration of strawberry shortcake.png (1056×548)

Emphasis on the dark areas on the left

SDXL’s top-tier photorealistic model
FP32 format, 13.6 GB
Recommended with 50 steps, a true "supermodel"

RealVisXL is the pinnacle of SDXL's photorealistic models. The latest version was released in September 2024, and its large file size and high recommended step count hint at its powerful rendering capabilities.

Even though SDXL is a previous-generation architecture, this model beautifully captures the texture of the cake sponge and strawberries.

The darker tones emphasize the contrast, making the overall image look crisp and sharp.

Flux.1 - FluxesCore-Dev_V1.0 (Photorealistic)

Flux1 photo-realistic illustration of a delicious strawberry shortcake.png (2576×2576)

Brightness histogram of strawberry shortcake illustration of Flux1 photorealistic model.png (1056×560)

Centered, lower contrast

Flux.1’s photorealistic model
Fixes the issue where "Japanese" prompts used to turn people into anime characters
Specialized in rendering Asian features after additional training

The FluxesCore series, created by TofuNoKakera, is a family of photorealistic models based on Flux.1. While FluxesCore-Schnell is available only to members, FluxesCore-Dev is freely available (for non-commercial use).

Though the shortcake image feels slightly different from typical Japanese depictions, the texture and three-dimensionality are top-notch.

The brightness distribution is concentrated in the center, closely mimicking real-life light and shadow.

Is Flux.1 Subtle?

When you compare the three images, the Flux.1 result may seem subtle.

This is because its low contrast gives it less visual impact at first glance.
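To put a number on "low contrast", one common approach is to look at the mean and the standard deviation of the brightness values (the latter is often called RMS contrast). Here is a minimal sketch, assuming you already have a list of per-pixel brightness values from 0 to 255. The function name and the sample values are made up for illustration and are not measurements of the images in this article.

```python
from statistics import fmean, pstdev


def brightness_stats(luma_values):
    """Return (mean brightness, RMS contrast) for a list of 0-255 values.

    RMS contrast here is simply the population standard deviation:
    the wider the spread of tones, the higher the contrast.
    """
    return fmean(luma_values), pstdev(luma_values)


# Hypothetical sample values: tones bunched in the middle vs. spread out.
centered_tones = [110, 120, 130, 140]
spread_tones = [20, 90, 170, 240]

_, low_contrast = brightness_stats(centered_tones)
_, high_contrast = brightness_stats(spread_tones)
assert low_contrast < high_contrast
```

A histogram concentrated in the center gives a small standard deviation, which matches the impression that the image has less punch.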

Contrast Can Be Adjusted, But...

Although Flux.1 images have low contrast, you can adjust them with post-processing.

Flux1 photo－realistic illustration of a delicious strawbertone adjust.png (2576×2576) — Tone correction

Flux1 photo-realistic illustration of a delicious strawberry shortcake blend mode blue.png (2576×2576) — Blend mode: "Blue"

However, since this correction is applied to a PNG file rather than a RAW image, it cannot recover any information that isn't already in the file.

In fact, every correction you apply inevitably discards some information.

While lower contrast can look natural, it would be more efficient to have the AI generate brighter, higher-contrast images in the first place.
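The information loss described above can be seen with a small experiment. The sketch below assumes 8-bit brightness values (0 to 255) and a very simple linear contrast boost around a midpoint; real image editors use more sophisticated curves, so treat this as an illustration of the principle rather than what any particular tool does. The function names are hypothetical.

```python
def adjust_contrast(values, gain, pivot=128):
    """Stretch values away from the pivot, then clamp to the 0-255 range."""
    adjusted = []
    for v in values:
        stretched = round((v - pivot) * gain + pivot)
        adjusted.append(max(0, min(255, stretched)))
    return adjusted


def distinct_levels(values):
    """Number of different brightness levels actually present."""
    return len(set(values))


# A low-contrast image that only uses the middle 128 levels (64..191).
narrow = list(range(64, 192))
boosted = adjust_contrast(narrow, gain=2.0)
# The tones now span 0..254, but there are still only 128 distinct levels:
# the histogram gets gaps, and no new detail appears.
assert distinct_levels(boosted) == distinct_levels(narrow) == 128

# An image that already uses all 256 levels loses detail when boosted,
# because everything pushed past 0 or 255 is clipped: 256 levels in, 129 out.
full = list(range(256))
assert distinct_levels(adjust_contrast(full, gain=2.0)) == 129
```

In other words, post-processing can spread the existing tones out, but it can only keep or reduce the number of distinct tones, never increase it.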

Recreate Anime Composition and Brightness with Flux.1!

Now, for the main topic.

Let’s recreate SDXL’s composition and brightness using Flux.1 through a specific workflow.

We’ll use controlnet with depth information to lock down the composition.


flowchart LR
subgraph SDXL
A(anima_pencil-XL<br>1024 x 1024)
B(RealVisXL<br>1024 x 1024)
end
subgraph Flux.1
C(FluxesCore-Dev<br>1440 x 1440)
D(FluxesCore-Dev<br>2580 x 2580)
end
A-->|controlnet<br>depth|B
B-->C
C-->|upscale<br>x 1.8|D

SDXL-Flux1_Realistic.7z

Testing Models

anima_pencil-XL_v500
DepthAnything-V2_large_fp16
RealVisXL_V5.0
T5xxl-Q_5_K_M
FluxesCore-Dev_V1-Q_8.gguf (created from FP16 via flux_tool)

Note: FluxesCore-Dev is a non-commercial model.

Why Include RealVisXL?

Rather than passing the depth information directly from the anime model to Flux.1, we place RealVisXL in between for the following reasons:

To incorporate RealVisXL’s brightness
SDXL controlnet models are more reliable, since Flux.1’s controlnet models are still relatively new
Flux.1 consumes a lot of VRAM, so running controlnet on the SDXL model conserves resources

Let’s Draw a Girl!

Now, let’s create an image. To reproduce an anime composition, we’ll include the following in the prompt: dutch angle and close up.

realistic, photorealistic, girl, teenage, dutch angle, close up

anima_pencil-XL

Illustration of a close-up of a young woman in cartoon composition.png (1024×1024)

The result is a beautiful woman with chestnut hair. Although anima_pencil-XL is an anime model, it handles photorealistic expressions well.

However, since it’s an anime model, the eyes are larger, and the overall brightness is higher than in a realistic photo.

Depth Map

Depth map of illustration of a close-up of a young woman in cartoon composition.png (1024×1024)

This is the depth map generated from the previous image. Blue represents the foreground, and red represents the background.
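To make the color coding easier to picture, here is a minimal sketch of how a depth value could be turned into a color. It assumes depth is normalized so that 0.0 is the nearest point and 1.0 is the farthest, and it uses a plain two-color gradient from blue to red. The actual palette used by the depth preprocessor in the workflow is likely more elaborate, so this is only an illustration, and the function name is hypothetical.

```python
def depth_to_color(depth):
    """Map a normalized depth (0.0 = near, 1.0 = far) to an (R, G, B) color.

    Near points come out blue and far points come out red, with a simple
    linear blend in between. This palette is an assumption for illustration.
    """
    d = max(0.0, min(1.0, depth))  # clamp out-of-range input
    red = round(255 * d)
    blue = round(255 * (1.0 - d))
    return (red, 0, blue)


foreground = depth_to_color(0.0)  # pure blue
background = depth_to_color(1.0)  # pure red
```

Whatever the palette, only the relative distances matter for controlnet: the depth map fixes where things sit in the frame, while colors, lighting, and textures are left to the next model.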

RealVisXL

Illustration of a live-action SDXL model based on a depth map of an illustration of a close-up of a young woman in an animated composition.png (1024×1024)

Next, we generate an image from the depth information. As often happens with AI models, the result defaults to a blonde woman unless the prompt specifies otherwise.

The composition follows the depth map, with smaller eyes and more realistic brightness.

FluxesCore-Dev

Illustration of a live-action Flux1 model based on a depth map of an illustration of a close-up of a young woman in cartoon composition.png (2576×2576)

Lastly, we redraw and upscale the image with Flux.1. Although FluxesCore specializes in Asian depictions, for comparison’s sake, we kept the blonde hair.

As expected, the texture and depth in the image are outstanding.

However, when it comes to natural skin tones, SDXL models still hold the upper hand for now.

Showcasing the Results!

Here are the examples:

Top: SDXL → Flux.1 anime composition
Bottom: Flux.1 original

Streets of Paris

Real photo illustration of a Parisian cityscape in autumn with anime composition.png (2576×2576) — Wide angle

Real photo illustration of Paris streetscape in autumn.png (2576×2576)

Stained Glass of a Church

Photorealistic illustration of stained glass inside a majestic church in anime composition.png (2576×2576) — Diagonal composition

Photorealistic illustration of stained glass inside a majestic church.png (2576×2576)

Detailed Diorama

Photo-realistic illustration of an elaborate model train diorama in anime composition.png (2576×2576) — Framing

Photo-realistic illustration of an elaborate model train diorama.png (2576×2576)

Busy Airport

Illustration inside an airport building with people coming and going in a anime composition.png (2576×2576) — Linear perspective

Illustration of the inside of an airport building where people come and go.png (2576×2576)

Japanese Alleyway

Photorealistic Illustration of Japanese alley at dusk.png (2576×2576)

Woman in Kimono

Photographic realistic illustration of a young woman in a blue kimono standing in a Japanese garden in an animated composition.png (2576×2576) — Diagonal composition and movement

Photographic realistic illustration of a young woman in a blue kimono standing in a Japanese garden.png (2576×2576)

Chef

Real photo illustration of an experienced cook preparing food in a small restaurant with a beautiful anime composition.png (2576×2576) — Arrangement of ingredients in motion

Real photo illustration of an experienced cook preparing food in a beautiful small restaurant.png (2576×2576)

Horse Racing

Real photo illustration of a horse galloping in a anime composition horse race.png (2576×2576) — Rightward running adds motion

Illustration of a real photo of a horse galloping in a horse race.png (2576×2576)

Couple in a Skyscraper

Real photo illustration of a couple against a beautiful background on a skyscraper rooftop in an anime composition3.png (2576×2576) — Space behind expresses feelings

Real photo illustration of a couple looking at each other against a beautiful background on a skyscraper rooftop.png (2576×2576)

Cyberpunk Boy

Real photo illustration of a cyberpunk slum boy in a anime composition.png (2576×2576) — Sucked into the elevator

Real photo illustration of a cyberpunk slum boy.png (2576×2576)

Anime Compositions Convey Movement!

Anime models bring dynamic compositions to life, adding motion to the illustrations.

In contrast, images generated by Flux.1 alone focus more directly on the subject, creating a heavier atmosphere.

It seems best to use each depending on whether you want to express "motion" or "stillness."

SDXL’s Brightness Makes It Convenient

Regarding brightness, SDXL images come out brighter and with higher contrast, making them easier to use as-is.

Since we only have limited time to adjust prompts and edit the final image, being able to generate images that are almost ready from the get-go is a big advantage.

While Flux.1 excels in texture, SDXL still offers better ease of use.