
Supervisely tight bounding polygon
I have a series of photographs of different core boxes, which are uniform rectangular containers used to hold and display drill core. A tedious part of my job right now is manually cropping each photograph down to the core tray, which is a task I'd rather automate.
Since the photographs are taken by hand, there is often a slight angle, so an axis-aligned bounding box won't be sufficient. I need a polygon that tightly encompasses the core tray, with four nodes, one for each corner of the tray. For this reason I believe I need instance segmentation rather than object detection; please correct me if I'm wrong.
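To illustrate the post-processing I have in mind (this is just a rough heuristic I sketched, not something I've wired into a pipeline yet), here's how I'd collapse a dense predicted polygon down to four corner points, assuming trays are only slightly rotated:

```python
import numpy as np

def four_corners(points):
    """Reduce a dense polygon (N x 2 array of x, y points) to four corners.

    Uses the sum/difference heuristic, which holds for near-axis-aligned
    quadrilaterals: the top-left corner minimises x + y, the bottom-right
    maximises x + y, the top-right maximises x - y, and the bottom-left
    minimises x - y.
    """
    pts = np.asarray(points, dtype=float)
    s = pts.sum(axis=1)          # x + y for each point
    d = pts[:, 0] - pts[:, 1]    # x - y for each point
    tl = pts[np.argmin(s)]
    br = pts[np.argmax(s)]
    tr = pts[np.argmax(d)]
    bl = pts[np.argmin(d)]
    return np.array([tl, tr, br, bl])

# Example: dense points along the edges of a 10 x 4 rectangle
poly = [(0, 0), (5, 0), (10, 0), (10, 2), (10, 4), (5, 4), (0, 4), (0, 2)]
print(four_corners(poly))  # corners in tl, tr, br, bl order
```

For trays at larger angles, OpenCV's `cv2.minAreaRect` on the mask points would be the more robust option, since it fits a rotated rectangle directly.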
I started off by training a YOLO11m-seg model on 150 photographs which I annotated myself, leaving all other parameters at their defaults. The results were subpar: the predictions were consistently and significantly smaller than my annotations, which would cut off the edges of my core trays.
I think my model may have failed to learn that the core displayed within the trays (which is highly variable) is irrelevant, and that the edges of the trays are all that matter.
I have tried upgrading to a YOLO11l-seg model, hoping it would be smarter, but I always get an out-of-memory crash on my 8 GB of RAM, even after setting the batch size to 2 and the number of workers to 0.
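For reference, my training call looks roughly like the following (the dataset YAML name is mine; everything else is standard Ultralytics arguments). This is the configuration that crashes with the large model:

```python
from ultralytics import YOLO  # assumes the ultralytics package is installed

# "yolo11l-seg.pt" is the large segmentation checkpoint that runs out of memory;
# "yolo11m-seg.pt" is the medium one that trains but gives undersized masks.
model = YOLO("yolo11l-seg.pt")

model.train(
    data="core_trays.yaml",  # my dataset config (150 annotated photos)
    epochs=100,
    imgsz=640,    # default image size; lowering this also reduces memory use
    batch=2,      # already reduced from the default of 16
    workers=0,    # no dataloader worker processes
)
```

I haven't yet tried reducing `imgsz` or enabling automatic mixed precision, if those would make a difference.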
Any advice on how to train a model that can accurately produce a tight bounding polygon from the four corners of a core tray would be appreciated.
I have included an example sketch of the issue I am facing. The grey box represents the core tray, which I have perfectly annotated using the polygon tool. The violet box overlain on it shows my model's prediction, which you can see is off.