r/malcolmrey

▲ 424 r/malcolmrey+2 crossposts

ZIT I2I "Character LORA Transformation" Workflow

Hello, guys.

I've made this workflow where I can input any image and it will make a similar image using a character LORA.

It's made for ZIT since it's fast, but it can be adapted to any model with some modifications.

It takes less than a minute on the second run at this resolution on my RTX 4070 Super (12GB VRAM) with 64GB RAM.

Note: the VAE and CLIP loader nodes are under the Load Image node. Load your ZIT VAE and CLIP properly.

Link: https://pastebin.com/pGXEhDc8
(Updated: Removed the WAS Node Pack, no need for it. VAE and CLIP changed to the default ZIT ones)

It works in 3 Steps:

1- The image is downscaled to 768 px on the longer edge, and Qwen3VL creates a basic prompt for it. Play with the denoise value here to suit your preferences; around 0.45 - 0.55 works well for me.

2- Latent upscale of 2x. I get the best results this way, even with T2I.
The image will look better, and the character LORA is applied again.

3- Face fix pass. The face is detected with SAM3 and refined again with the LORA using the Inpaint Crop node. A small amount of sharpening is applied in this step.
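
For anyone adapting the steps above to another model, the resolution arithmetic for steps 1 and 2 can be sketched roughly as below. This is only an illustration, not something taken from the workflow file: the function names are made up, and snapping each side to a multiple of 8 is an assumption (many latent models need that; check your own model).

```python
def fit_longer_edge(width: int, height: int, target: int = 768, multiple: int = 8) -> tuple[int, int]:
    """Resize (width, height) so the longer edge equals `target`, keeping aspect ratio.

    Each side is snapped to the nearest multiple of `multiple`
    (an assumption, not a value read from the workflow).
    """
    scale = target / max(width, height)
    new_w = max(multiple, round(width * scale / multiple) * multiple)
    new_h = max(multiple, round(height * scale / multiple) * multiple)
    return new_w, new_h


def plan_passes(width: int, height: int, upscale: float = 2.0) -> dict[str, tuple[int, int]]:
    """Return the working resolution of step 1 and of the 2x latent upscale in step 2."""
    base = fit_longer_edge(width, height)
    upscaled = (int(base[0] * upscale), int(base[1] * upscale))
    return {"step1": base, "step2": upscaled}


if __name__ == "__main__":
    # A 1920x1080 input gives 768x432 for step 1 and 1536x864 after the 2x upscale.
    print(plan_passes(1920, 1080))
```

Step 3 works on a crop around the detected face, so it does not change the output size.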

There's a group bypasser node so you can enable/disable steps 2 and 3. The image is only saved at step 3.

For the prompt, I'm using a text concatenate node so I can have my LORA trigger word and any other prompt text applied before the Qwen3VL prompt.
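
Outside of ComfyUI, the same prompt assembly could look something like this. It is a minimal sketch: `build_prompt` and its parameter names are hypothetical, and the comma separator is an assumption rather than the workflow's actual setting.

```python
def build_prompt(trigger_word: str, extra_prompt: str, caption: str, sep: str = ", ") -> str:
    """Join the LORA trigger word, optional extra text, and the generated caption.

    Empty or whitespace-only parts are skipped so no stray separators appear.
    """
    parts = [trigger_word, extra_prompt, caption]
    return sep.join(p.strip() for p in parts if p and p.strip())


if __name__ == "__main__":
    # "mychar" stands in for a real trigger word.
    print(build_prompt("mychar", "", "a woman sitting by a window"))
```

Putting the trigger word first keeps it in the same position no matter how long the generated caption is.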

Hope it's useful for someone o/

u/aniki_kun — 3 days ago

fk9_sydneysweeney_v1_prodigy Flux.2-klein-9b

A young woman in her late twenties with natural, ash-blonde hair tucked loosely behind one ear. She has clear skin with subtle, faint freckles across the bridge of her nose and cheekbones. She is sitting near a window, head turned slightly to look past the camera with a small, genuine, half-smile. The setting is a sun-drenched, minimalist room with white walls and a sheer linen curtain partially pulled to the side. Early afternoon sunlight spills through the fabric, creating a soft, diffused glow that wraps gently around her facial features. The light catches the moisture in her eyes and defines the individual fine hairs around her hairline and the texture of her cream-colored wool sweater. Shadows are soft and filled with ambient bounce from the light walls.

STYLE ANNOTATION:

Style: Intimate editorial portrait photography. Shot on 85mm f/1.8 lens, shallow depth of field, sharp focus on the eyes with a creamy bokeh background. Film stock emulation: Kodak Portra 400 for natural skin tones and soft grain. Mood: Quiet, contemplative, serene.

u/sruckh — 4 days ago

My first LORA completed (kinda) [Elli Evrram]

So I made the following post: https://www.reddit.com/r/malcolmrey/comments/1t6fr4a/training_my_first_lora/

A day later, today, I was finally able to complete the LoRA after many restarts and changes. I think I got a good result, and I'm quite happy with the likeness considering my expectations were quite low. Still, I believe I could have done a lot of things better, especially after I changed the learning rate and timestep_type mid-training, which I think definitely reduced the likeness I would have reached had I kept going with things unchanged. I wanted to share the LoRA, but unfortunately, in addition to the mistakes I made during training, I also messed up a lot of things with the captioning.

As it is my first completed LoRA, I was unaware of the nuances of captions and the drawbacks of captioning literally every aspect of the subject. I unfortunately rendered the LoRA so highly dependent on captions that another user probably won't be able to generate a good image without knowing my dataset.

I will be redoing this LoRA and fixing that, and certainly after that, I will be sharing the LoRA as well. I hope some of you will look forward to that.

Also, I hope someone can guide me on the best strategy for learning rate and timestep type.

For this LoRA, I switched between different learning rates and timestep types, and I think I messed some things up. I still want to experiment with that for the finer details and the late-step polishing, and some tips would make that a whole lot easier.

BTW, there is no upscaling or post-processing on these sample photos. Also, the eulerflowdiscrete scheduler brings out the exceptionally realistic details I was aiming for; I will share a sample of that later.

u/KylseS — 5 days ago

Training my first Lora

Malcolm has greatly inspired me to make my own LoRAs. Trying it out, hopefully everything goes well.

Setting up AI Toolkit was hell. Dependency conflicts are a nightmare to deal with; I'm glad I'm past that now.

Makes me appreciate his work more.

Anyways, wish me luck!

P.S. Would appreciate all the tips I can get.

UPDATE: The first training was a failure.

Changing a few things:

Switching from AdamW8bit to Prodigy 8bit

Using a trigger word

Captioning the dataset with the trigger word instead of "woman" or pronouns

Changing the learning rate to 1

Changing the linear rank to 64

Increasing the dataset from 16 to 80 images

Increasing steps from 1500 to 5000

Using only 1024 resolution

Turning DOP on
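
To compare the two runs, it can help to work out how often each image is seen during training. The snippet below is just that arithmetic; it assumes a batch size of 1 with no gradient accumulation, which is not stated in the post, and `passes_per_image` is a made-up helper name.

```python
def passes_per_image(steps: int, dataset_size: int, batch_size: int = 1) -> float:
    """Average number of times each image is seen during training.

    batch_size=1 is an assumption; adjust it to match the real config.
    """
    return steps * batch_size / dataset_size


if __name__ == "__main__":
    first_run = passes_per_image(steps=1500, dataset_size=16)   # 93.75
    second_run = passes_per_image(steps=5000, dataset_size=80)  # 62.5
    print(f"first run:  {first_run:.2f} passes per image")
    print(f"second run: {second_run:.2f} passes per image")
```

Under that assumption, the new run has more total steps but each image is seen fewer times, because the dataset grew faster than the step count.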

Let's see how this goes...

Update 2: Currently the captioning is taking a lot more time than anticipated. Changed a lot of things. These are my instructions:

"

 Act as a Visual Prompt Engineer specializing in "Long Caption" synthesis. Your task is to extract a highly detailed, narrative description of an image featuring one primary person.
DO NOT USE "He","She", "Her", "They", "them" or any general pronoun when describing the subject, just state their name Elli or when describing her attributes or her actions use Elli's
STRUCTURE:

CONCEPT & MEDIUM: Start with the type of image. Mention the core theme or event.

SUBJECT DESCRIPTION: woman, facial expression, hair details, face details, makeup. Crucially, describe their gaze. When referring to the subject, always use the name "Elli". DO NOT use pronouns like "he", "she","her", or "they" . Repeat the name when needed.

ACTION & INTERACTION: Describe exactly what the person is doing. What are they holding? How are they standing? Describe the interaction with objects.

APPAREL: Detail the clothing (fabric, color, fit) and accessories.

ENVIRONMENT & FOREGROUND: Describe objects immediately around the person.

BACKGROUND & LIGHTING: Describe the setting (location, weather, landmarks), depth of field (bokeh), and the quality of light.

RULES:

- STYLE: Use fluid, descriptive sentences (narrative style), not just a list of tags.

- ACCURACY: Be specific about colors and textures.

- TEXT: Always put captured text in double quotes.

- LENGTH: 700-900 characters.

- OUTPUT: A single continuous paragraph. No headers.

"
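
Since the captioner does not always follow the rules above, a quick automated check on each caption before training may save some restarts. This is only a sketch: `check_caption` is a hypothetical helper, and the pronoun list goes a bit beyond the ones named in the instructions (that extension is an assumption).

```python
import re

# Pronouns named in the instructions, plus a few close variants (assumption).
PRONOUNS = re.compile(r"\b(he|she|her|hers|him|his|they|them|their)\b", re.IGNORECASE)


def check_caption(caption: str, name: str = "Elli", min_len: int = 700, max_len: int = 900) -> list[str]:
    """Return a list of rule violations; an empty list means the caption passes."""
    problems = []
    if not (min_len <= len(caption) <= max_len):
        problems.append(f"length {len(caption)} outside {min_len}-{max_len}")
    if "\n" in caption.strip():
        problems.append("not a single paragraph")
    if name not in caption:
        problems.append(f"name {name!r} missing")
    found = sorted({m.group(0).lower() for m in PRONOUNS.finditer(caption)})
    if found:
        problems.append("pronouns used: " + ", ".join(found))
    return problems


if __name__ == "__main__":
    for issue in check_caption("She looks at them."):
        print(issue)
```

It cannot catch hallucinated details, and it will also flag pronouns that appear inside quoted on-image text, so flagged captions still need a manual look.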

UPDATE 3: Captioning is a nightmare. Restarting because it keeps hallucinating some details. I also made it add the subject's visibility details and the camera angle.

u/KylseS — 7 days ago
▲ 3 r/malcolmrey+1 crossposts

Need Help Training LoRAs on Custom Models in Ostris AI Toolkit

Total noob question here, so please bear with me.
I’m trying to understand how to train a LoRA on custom models inside the Ostris AI Toolkit (from GitHub), but I’m clearly missing something important.
From what I understand:
we need to change the model path to the custom model folder,
and I vaguely remember the model needing to be a .gguf file (not even sure if that’s correct :).
But in practical use, my training fails every single time.
Could someone explain in simple beginner-friendly words:
what exact model format is needed,
where the custom model should be placed,
how the path should be configured,
and what additional settings are required for custom architectures/models?
If anyone has:
a proper tutorial,
documentation,
or especially a good YouTube video covering this exact topic,
please share it.

u/FitEgg603 — 19 hours ago

Maura Tierney Flux.2-dev-9b

Professional photography of a woman with long brown hair lying prone on a bed with white linens, her body angled slightly away from the camera while her head turns to face the lens directly, her hands clasped gently under her chin with fingers relaxed, legs bent at the knees and feet lifted playfully in the background, soft natural daylight entering from the side illuminating her skin and hair with a warm glow, creating subtle shadows that define facial features and fabric folds, shot on 50mm lens at f/2.0 aperture for shallow depth of field, keeping focus sharply on face and upper torso while blurring background details including bed structure and walls.

u/sruckh — 6 days ago

Ana de Armas flux.2-dev-9b

Professional photography of a woman with long brown hair lying prone on a bed with white linens, her body angled slightly away from the camera while her head turns to face the lens directly, her hands clasped gently under her chin with fingers relaxed, legs bent at the knees and feet lifted playfully in the background, soft natural daylight entering from the side illuminating her skin and hair with a warm glow, creating subtle shadows that define facial features and fabric folds, shot on 50mm lens at f/2.0 aperture for shallow depth of field, keeping focus sharply on face and upper torso while blurring background details including bed structure and walls.

u/sruckh — 6 days ago