u/Exact-Literature-395

▲ 28 r/agi

A robot just zipped up a jacket without task-specific training and I can't stop thinking about it

Most humanoid demos this year have been about running, jumping, or doing parkour. Boston Dynamics tumbling, Figure 03 jogging, the EngineAI thing sprinting next to a human. Cool, but those are basically locomotion problems, and locomotion is the part of robotics we have been making steady progress on for fifteen years.

The thing nobody talks about is zippers, cables, fabric, anything that bends.

Pulling a zipper up on a jacket that is hanging on a stand is one of those tasks that sounds trivial until you try to write code for it. You need a continuous estimate of where the zipper pull is in 3D as the fabric deforms around it, your gripper has to track it without losing contact, the force you apply has to be enough to engage the teeth but not enough to tear them, and the whole path has to be figured out from one or two camera angles. The classic robotics stack gets nowhere on this: the state space is effectively infinite, the contact dynamics are nonlinear, and you can't simulate it cleanly.
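To make that concrete, here is a minimal sketch of the loop a classical stack would need, with the genuinely hard part stubbed out. Everything in it is invented for illustration: the function names, the force window, the assumption that the zipper track runs straight up.

```python
import numpy as np

# Toy sketch of a classical zipping loop. The force window and the
# straight-up path are assumptions; estimate_pull_pose is the part
# nobody knows how to write for deforming fabric, stubbed here.

F_ENGAGE, F_TEAR = 2.0, 8.0  # assumed grip force window in newtons

def estimate_pull_pose(frames):
    # The hard part: track a tiny metal pull in 3D while the fabric
    # deforms around it, from one or two camera views.
    return np.array([0.0, 0.0, 0.30])  # stand-in position in meters

def control_step(frames, commanded_force):
    pull = estimate_pull_pose(frames)                  # where is it right now?
    force = float(np.clip(commanded_force, F_ENGAGE, F_TEAR))  # engage, don't tear
    waypoint = pull + np.array([0.0, 0.0, 0.01])       # next 1 cm up the (assumed straight) track
    return waypoint, force

print(control_step(frames=[], commanded_force=5.0))
```

Every line of that has to hold up on a real jacket, and the stub is the whole problem.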

The new wave of vision-language-action (VLA) models is starting to crack this, and not by being smart about the geometry but by being big and end to end. The same family of models that handles "pick up the cup" is handling "zip up the jacket", "hang the shirt", and "route the cable through the slot". WALL B from X Square Robot is the one I have seen the cleanest footage of, but the Physical Intelligence pi0.6 demos show similar stuff with their setup, and Helix 02 from Figure is in that bucket too.
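For contrast, here is roughly what the end-to-end interface looks like from the outside. The classes are stand-ins I made up, not any vendor's actual API; the point is that pixels and a sentence go in and low-level actions come out, with no zipper geometry anywhere.

```python
import numpy as np

class StubCamera:
    """Stand-in for one or two RGB views."""
    def read(self) -> np.ndarray:
        return np.zeros((480, 640, 3), dtype=np.uint8)  # fake frame

class StubVLAPolicy:
    """Stand-in for a big end-to-end vision-language-action model."""
    def __init__(self):
        self.steps = 0
    def done(self, obs, instruction) -> bool:
        return self.steps >= 3          # pretend the task finishes quickly
    def act(self, obs, instruction) -> np.ndarray:
        self.steps += 1
        return np.zeros((10, 8))        # chunk of 10 actions: 7-DoF arm + gripper

camera, policy = StubCamera(), StubVLAPolicy()
obs, instruction = camera.read(), "zip up the jacket"

while not policy.done(obs, instruction):
    for action in policy.act(obs, instruction):
        pass                            # robot.apply(action) on real hardware
    obs = camera.read()
```

Same loop whether the instruction is "zip up the jacket" or "pick up the cup", which is the whole point.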

Why this matters more than another backflip:

The unsolved core of household and service robotics is soft/deformable object manipulation. Folding laundry. Changing bedsheets. Unloading a dishwasher full of weirdly shaped Tupperware. Helping an elderly person put on a sweater. All zipper problems, basically. If we are starting to see zero-shot-ish generalization on that class of task, the consumer-ready home robot timeline is not 10 years anymore.

It also closes one of the last "humans still have it" gaps. We were comfortable saying robots can lift heavy stuff but not handle anything soft. That comfort is going to age really badly really fast.

The locomotion race is mostly cosmetic at this point. The manipulation race is the real one, and it is happening kind of quietly because the footage is less spectacular. Worth watching.

u/Exact-Literature-395 — 4 days ago

I came across this dress with a rat DJing on it, and I can’t decide if it’s weird in a bad way or actually kind of fun. I usually dress pretty safe, so this feels outside my comfort zone, but part of me really wants to try it.

I’m not sure what kind of shoes or accessories would make it look intentional instead of random. Would chunky boots work, or would sneakers make it feel more casual? I’m also wondering if I should keep the accessories simple or lean into the weirdness a bit.

u/Exact-Literature-395 — 12 days ago

Figured this out by accident. Was trying to recreate a specific prop and uploaded just the front photo. Result was flat on the back, like a cardboard cutout in 3D.

Then I uploaded front + side + back photos of the same object. The model came out with actual depth and detail on all sides. Makes sense when you think about it, but I didn't realize how much of a difference it makes.

Best setup I've found: 3 photos minimum. Front, 45-degree angle, and side view. All on a clean background if possible. Phone photos work fine, you don't need a studio setup.

For objects that are symmetrical, you can get away with 2 photos: front and side. Meshy fills in the other side pretty accurately.

Lighting matters more than camera quality. Even lighting with no harsh shadows gives the cleanest results. I just put the object on a white desk near a window.

One thing that doesn't work well: photos with busy backgrounds. Meshy tries to include background elements in the model. Crop tight or use a plain backdrop.
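If you want to automate the lighting and cropping checks before upload, here is a small Pillow sketch. The brightness-spread threshold and the crop margin are guesses I'd tune per setup, and the filenames are just examples:

```python
from PIL import Image, ImageStat

def lighting_is_even(path, max_stddev=60):
    """Rough shadow check: a wide brightness spread usually means harsh shadows."""
    gray = Image.open(path).convert("L")
    return ImageStat.Stat(gray).stddev[0] < max_stddev

def crop_tight(path, out_path, margin=0.05):
    """Trim a fixed margin off each edge to cut down background clutter."""
    img = Image.open(path)
    w, h = img.size
    dx, dy = int(w * margin), int(h * margin)
    img.crop((dx, dy, w - dx, h - dy)).save(out_path)

for shot in ["front.jpg", "angle45.jpg", "side.jpg"]:
    if not lighting_is_even(shot):
        print(f"{shot}: lighting looks uneven, consider reshooting")
    crop_tight(shot, f"clean_{shot}")
```

A plain backdrop still beats any amount of cropping, this just catches the obvious misses.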

This changed my image-to-3D success rate from maybe 40% to like 75%. Still not perfect but way more reliable.
