Rumors of prompt engineering's demise have been greatly exaggerated
Here's a fun, actual prompt "engineering" example.
FlaiChat is our chat app, like WhatsApp, that does automatic translations. People type in their own languages and everyone in the group reads the messages in their own language, automatically.
The LLM use case is obvious to anyone who has called an OpenAI API. There's some code involved to structure the request and get a structured response back (we want a response with translations in all the languages being spoken in the group, for one thing... and other promptish stuff).
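To make the "structured request, structured response" part concrete, here's a rough sketch of what shaping that request can look like. This is illustrative, not FlaiChat's actual code: the function name, the model string, and the use of OpenAI-style JSON-schema structured outputs are all assumptions, and the exact knobs vary by provider.

```python
# Hypothetical sketch: build one request that asks for a translation
# per language active in the group, constrained to strict JSON.
def build_translation_request(message, group_langs, model="gpt-4o-mini"):
    # One string property per group language, all required.
    schema = {
        "type": "object",
        "properties": {lang: {"type": "string"} for lang in group_langs},
        "required": list(group_langs),
        "additionalProperties": False,
    }
    return {
        "model": model,
        "messages": [
            {
                "role": "system",
                "content": "Translate the user's message into each of: "
                           + ", ".join(group_langs)
                           + ". Reply with JSON only.",
            },
            {"role": "user", "content": message},
        ],
        # OpenAI-style structured output; other providers spell this differently.
        "response_format": {
            "type": "json_schema",
            "json_schema": {
                "name": "translations",
                "strict": True,
                "schema": schema,
            },
        },
    }
```

The payload then goes to whatever chat-completions endpoint you're using, and the response parses straight into a dict keyed by language code.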
What's not obvious is what happens when the message is just one giant block of emojis, like ❤️😘❤️😘❤️😘... (repeat 20x...) and the model just freaks the fuck out. A normal translation might take 500ms on a small/fast model. A wall of emojis can get stuck for tens of seconds.
Seriously, try it out yourself. Build a simple API call that asks a model to translate a wall of emojis into a different language. And of course, don't forget to sternly tell the model "DO NOT TRY TO TRANSLATE EMOJIS" (or whatever the fuck you do to yell at models). It does not work!
So the fix for us turned into a little pipeline of its own. We detect long emoji runs before building the prompt, swap them out for a placeholder like __EMOJIS&%!%%__ or whatever, and then tell the model in the prompt to leave that token in the appropriate place in the translation. You know... prompt engineering.
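The detect-and-swap step can be sketched in a few lines. This is a minimal version, not our production code: the placeholder format and the emoji character class are assumptions (a real build would use a full Unicode emoji table rather than a handful of ranges), and the run-length threshold is arbitrary.

```python
import re

# Rough emoji character class: common emoji blocks plus variation
# selectors and ZWJ, so multi-codepoint emojis stay inside one run.
# {8,} means only long runs get shielded; short ones pass through.
EMOJI_RUN = re.compile(
    r"[\U0001F300-\U0001FAFF\u2600-\u27BF\uFE0F\u200D]{8,}"
)
PLACEHOLDER = "__EMOJI_RUN_{}__"  # hypothetical token format

def shield_emoji_runs(text):
    """Swap long emoji runs for placeholder tokens before prompting."""
    runs = []
    def _swap(match):
        runs.append(match.group(0))
        return PLACEHOLDER.format(len(runs) - 1)
    return EMOJI_RUN.sub(_swap, text), runs

def restore_emoji_runs(translated, runs):
    """Put the original emoji runs back into the model's translation."""
    for i, run in enumerate(runs):
        translated = translated.replace(PLACEHOLDER.format(i), run)
    return translated
```

The prompt then just instructs the model to carry any `__EMOJI_RUN_N__` token through untranslated, and we restore the real emojis after the response comes back.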
Yet another data point on how software is never finished. Also another data point on the jagged edges of the LLM experience, if any more were needed.