u/KylseS

[Gallery: three sample images]

My first LoRA completed (kinda) [Elli Evrram]

So I made the following post: https://www.reddit.com/r/malcolmrey/comments/1t6fr4a/training_my_first_lora/

A day later (today), I was finally able to complete the LoRA after many restarts and changes. I think I got a good result, and I'm quite happy with the likeness considering my expectations were low. That said, I could still have done a lot of things better, especially since I changed the learning rate and timestep_type mid-training, which I think definitely cost me some of the likeness I would have reached had I left things unchanged. I wanted to share the LoRA, but unfortunately, in addition to the mistakes I made during training, I also messed up a lot of things with captioning.

As this is my first completed LoRA, I was unaware of the nuances of captioning and the drawbacks of captioning literally every aspect of the subject. I've unfortunately made the LoRA so dependent on captions that another user probably won't be able to generate a good image without knowing my dataset.

I will be redoing this LoRA to fix that, and after that I will certainly share it. I hope some of you will look forward to it.

Also, I hope someone can guide me on the best strategy for learning rate and timestep type.

For this LoRA, I switched between different learning rates and timestep types mid-run, and I think that's where I messed some things up. I still want to experiment with both for the finer details and the late-step polishing, and some tips would make that a whole lot easier.
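For anyone wondering what the timestep type actually changes: it controls which noise levels training samples from. The sketch below is purely illustrative (the names "linear" and "sigmoid" are borrowed from common trainer options, and the logit-normal draw is my own stand-in, not AI Toolkit's actual code), but it shows the basic difference: "sigmoid" concentrates steps on mid-range noise levels instead of spreading them uniformly.

```python
import math
import random

def sample_timesteps(n, mode="linear", seed=0):
    """Illustrative sketch of two timestep-sampling strategies.

    Not any trainer's actual code: 'linear' draws noise levels
    uniformly over [0, 1], while 'sigmoid' uses a logit-normal
    draw that concentrates samples in the mid-range, so training
    spends more of its steps on intermediate noise levels.
    """
    rng = random.Random(seed)
    out = []
    for _ in range(n):
        if mode == "linear":
            t = rng.random()  # uniform on [0, 1]
        elif mode == "sigmoid":
            # squash a Gaussian through a sigmoid -> mid-heavy distribution
            t = 1.0 / (1.0 + math.exp(-rng.gauss(0.0, 1.0)))
        else:
            raise ValueError(f"unknown mode: {mode}")
        out.append(t)
    return out

if __name__ == "__main__":
    for mode in ("linear", "sigmoid"):
        ts = sample_timesteps(10_000, mode)
        mean = sum(ts) / len(ts)
        var = sum((t - mean) ** 2 for t in ts) / len(ts)
        print(f"{mode:>7}: mean={mean:.3f} var={var:.4f}")
```

Both modes average around 0.5, but the sigmoid samples have visibly lower variance, i.e. fewer steps spent at the extreme high/low noise ends.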

BTW, no upscaling or post-processing on these sample photos. Also, the eulerflowdiscrete scheduler brings out the exceptionally realistic details I was aiming for; I'll share a sample of that later.

u/KylseS — 5 days ago

Training my first LoRA

Malcolm has greatly inspired me to make my own LoRAs. Trying it out, hopefully everything goes well.

Setting up AI Toolkit was hell. Dependency conflicts are a nightmare to deal with; I'm glad I'm past that now.

Makes me appreciate his work more.

Anyways, wish me luck!

P.S. Would appreciate all the tips I can get.

UPDATE: The first training was a failure.

Changing a few things:

- switching from AdamW8bit to Prodigy 8bit

- using a trigger word

- captioning the dataset with the trigger word instead of "woman" or pronouns

- changing the learning rate to 1

- changing the linear rank to 64

- growing the dataset from 16 to 80 images

- increasing steps from 1500 to 5000

- using only 1024 resolution

- turning DOP on
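For anyone following along, the changed settings map roughly onto a config like this. This is a hypothetical summary, not an actual AI Toolkit config file; the key names are my own shorthand for the list above.

```python
# Hypothetical summary of the changed training settings as a config dict.
# Key names are illustrative, not any trainer's actual schema.
config = {
    "optimizer": "prodigy_8bit",   # was "adamw_8bit"
    "learning_rate": 1.0,          # Prodigy adapts its own step size,
                                   # so lr is conventionally left at 1
    "linear_rank": 64,
    "trigger_word": "Elli",        # used instead of "woman" or pronouns
    "dataset_size": 80,            # up from 16 images
    "steps": 5000,                 # up from 1500
    "resolutions": [1024],         # 1024 only
    "dop": True,                   # DOP turned on
}

print(config["optimizer"], config["steps"])
```

The learning rate of 1 only makes sense together with the optimizer switch: Prodigy estimates the step size itself, which is why its documentation recommends leaving lr at 1 rather than hand-tuning it.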

Let's see how this goes...

Update 2: Captioning is taking a lot more time than anticipated. I changed a lot of things. These are my instructions:

"

Act as a Visual Prompt Engineer specializing in "Long Caption" synthesis. Your task is to extract a highly detailed, narrative description of an image featuring one primary person.
DO NOT use "he", "she", "her", "they", "them", or any other pronoun when describing the subject. Just state the name "Elli", and use "Elli's" when describing her attributes or her actions.
STRUCTURE:

CONCEPT & MEDIUM: Start with the type of image. Mention the core theme or event.

SUBJECT DESCRIPTION: woman, facial expression, hair details, face details, makeup. Crucially, describe the gaze. When referring to the subject, always use the name "Elli". DO NOT use pronouns like "he", "she", "her", or "they". Repeat the name when needed.

ACTION & INTERACTION: Describe exactly what the person is doing. What are they holding? How are they standing? Describe the interaction with objects.

APPAREL: Detail the clothing (fabric, color, fit) and accessories.

ENVIRONMENT & FOREGROUND: Describe objects immediately around the person.

BACKGROUND & LIGHTING: Describe the setting (location, weather, landmarks), depth of field (bokeh), and the quality of light.

RULES:

- STYLE: Use fluid, descriptive sentences (narrative style), not just a list of tags.

- ACCURACY: Be specific about colors and textures.

- TEXT: Always put captured text in double quotes.

- LENGTH: 700-900 characters.

- OUTPUT: A single continuous paragraph. No headers.

"
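Since the whole point of the prompt is pronoun-free captions, it's easy to lint the caption files before training instead of eyeballing them. A minimal stdlib sketch (the one-`.txt`-per-image layout is an assumption; adjust to however your dataset is organized):

```python
import re
from pathlib import Path

# Pronouns the captioning prompt forbids, matched as whole words,
# case-insensitively.
PRONOUNS = re.compile(
    r"\b(he|she|her|hers|him|his|they|them|their)\b", re.IGNORECASE
)

def find_pronouns(caption: str) -> list[str]:
    """Return every forbidden pronoun found in a caption."""
    return PRONOUNS.findall(caption)

def lint_captions(dataset_dir: str) -> dict[str, list[str]]:
    """Scan every .txt caption in a dataset folder (assumed layout:
    one caption file per image) and report offending files."""
    bad = {}
    for path in sorted(Path(dataset_dir).glob("*.txt")):
        hits = find_pronouns(path.read_text(encoding="utf-8"))
        if hits:
            bad[path.name] = hits
    return bad

if __name__ == "__main__":
    print(find_pronouns("Elli smiles as she holds her coffee cup."))
```

Running the lint after each captioning pass catches the hallucinated pronouns the model slips in, so you only re-caption the files that actually failed.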

UPDATE 3: Captioning is a nightmare. Restarting because it keeps hallucinating details. I also made it add the subject's visibility details and the camera angle.
