u/Distinct_Track_5495

Not prompt engineering, not context engineering: this is how AI agents should be built now

I just watched a video by Nate B. Jones on the Intent Gap in enterprise AI, and it's a massive wake-up call for anyone building with agents right now.

We've all heard the Klarna story: they rolled out an AI agent that did the work of 700 people and saved $60M, but then their CEO admitted it almost destroyed their customer relationships.

The problem was that the AI worked too well. It was told to resolve tickets fast, so it did, at the expense of empathy, judgment, and long-term customer value. It had the Prompt and the Context, but it didn't have the Intent.

Jones breaks down the three eras of AI discipline:

  1. Prompt Engineering: Learning how to talk to the AI (individual and session-based).
  2. Context Engineering: Giving the AI the right data (RAG, MCP, organizational knowledge). This is where most of the industry is stuck right now.
  3. Intent Engineering: Telling the AI what to want. This means encoding organizational goals, trade-offs (e.g. speed vs. quality), and values into structured, machine-actionable parameters.
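
To make that third layer concrete, here's a rough sketch of what "structured, machine-actionable" intent could look like. This is my own illustration, not anything from the video: a small Python dataclass that encodes goals, trade-off weights, and hard constraints, then renders them into a system prompt an agent can actually act on.

```python
from dataclasses import dataclass


@dataclass
class IntentProfile:
    """Hypothetical intent layer: org goals and trade-offs as data, not prose."""
    goals: list            # what the org actually wants
    tradeoffs: dict        # e.g. {"speed": 0.3, "empathy": 0.7}, weights sum to 1
    hard_values: list      # non-negotiable constraints

    def to_system_prompt(self) -> str:
        # Rank trade-offs so the agent knows what wins when they conflict.
        ranked = sorted(self.tradeoffs.items(), key=lambda kv: -kv[1])
        lines = ["Organizational intent:"]
        lines += [f"- Goal: {g}" for g in self.goals]
        lines.append("Trade-off priorities (highest first): "
                     + " > ".join(f"{k} ({v:.0%})" for k, v in ranked))
        lines += [f"- Never violate: {v}" for v in self.hard_values]
        return "\n".join(lines)


support_intent = IntentProfile(
    goals=["Resolve tickets", "Preserve long-term customer trust"],
    tradeoffs={"speed": 0.3, "empathy": 0.7},
    hard_values=["Escalate to a human when the customer is upset"],
)
print(support_intent.to_system_prompt())
```

The point isn't the dataclass itself; it's that trade-offs become explicit, versionable parameters instead of vibes buried in a prompt.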

Right now every team is rolling its own AI stack in silos. It's like the shadow-IT era, but with higher stakes, because agents don't just access data, they act on it. The company with a mediocre model but extraordinary intent infrastructure will outperform the company with a frontier model and fragmented, unaligned goals every single time.

I realized that manually architecting these intent layers for every agent isn't easy, so I've started running my rough goals through a refiner, or optimizer, call it whatever. It's the simplest way to ensure an agent doesn't just do the task but actually understands what I need it to want.
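
For what it's worth, the refiner step doesn't have to be fancy. Here's a minimal, hypothetical sketch: wrap the rough goal in a meta-prompt that forces trade-offs and constraints to be made explicit, then send the result to whatever model you already use. The template wording is mine, not any particular tool's.

```python
REFINER_TEMPLATE = """You are an intent refiner. Rewrite the rough goal below into:
1. An objective (one sentence).
2. Explicit trade-offs, with priorities.
3. Constraints the agent must never violate.
4. What "done" looks like.

Rough goal: {goal}"""


def build_refiner_prompt(rough_goal: str) -> str:
    """Wrap a rough goal in a refinement meta-prompt; send the result to any LLM."""
    return REFINER_TEMPLATE.format(goal=rough_goal)


prompt = build_refiner_prompt("resolve support tickets fast")
print(prompt)
```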

If you aren't making your company's values and decision-making hierarchies discoverable to your agents, you're essentially hiring 40,000 employees and never telling them what the company actually does.

reddit.com
u/Distinct_Track_5495 — 13 hours ago


LLMs are eating up their own context

I was just casually reading how LLMs are evolving and I found some pretty wild implications for how we might build with them going forward. Basically, model providers are taking over a lot of the heavy lifting for prompt engineering and context management that developers used to have to do themselves.

What started as a prompt engineering trick in 2022 (telling models to think step by step) is now being trained directly into models, which means better outputs without needing explicit instructions anymore. Anthropic trained Claude Haiku 4.5 to be explicitly aware of its context window usage. This helps the model wrap up answers when the limit is near and persist with tasks when there's more space, reducing a phenomenon called "agentic laziness," where models stop working prematurely.

Anthropic's memory tool lets Claude store and retrieve information across conversations using external files, acting like a persistent scratchpad. The model decides when to create, read, update, or delete these files, solving the problem of either stuffing too much into the prompt or building your own complex memory system.
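
To get a feel for what that looks like on the client side, here's a toy sketch of a file-backed scratchpad that a memory tool could drive. The method names and layout are illustrative only, not Anthropic's actual tool schema:

```python
import os
import tempfile


class MemoryStore:
    """Toy persistent scratchpad: the model issues commands, we touch files.
    Illustrative only; not Anthropic's actual memory-tool schema."""

    def __init__(self, root=None):
        self.root = root or tempfile.mkdtemp(prefix="agent_memory_")
        os.makedirs(self.root, exist_ok=True)

    def _path(self, name: str) -> str:
        # basename() prevents the model from escaping the memory directory
        return os.path.join(self.root, os.path.basename(name))

    def create(self, name: str, text: str) -> None:
        with open(self._path(name), "w") as f:
            f.write(text)

    def read(self, name: str) -> str:
        with open(self._path(name)) as f:
            return f.read()

    def delete(self, name: str) -> None:
        os.remove(self._path(name))


mem = MemoryStore()
mem.create("user_prefs.md", "prefers concise answers")
```

The key design point survives even in this toy: memory lives outside the context window, so it persists across conversations without bloating every prompt.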

Anthropic's context editing feature allows clearing old tool results from earlier in a conversation. Currently limited to tool results, it uses placeholders to signal to Claude that context was trimmed, meaning you still manage message content but the tool handles some of the heavy lifting.
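
Conceptually, the trimming works something like this. This is a local sketch, not Anthropic's actual implementation: replace stale tool results with a placeholder while leaving the most recent ones intact.

```python
PLACEHOLDER = "[older tool result cleared to save context]"


def clear_old_tool_results(messages: list, keep_last: int = 2) -> list:
    """Replace all but the N most recent tool results with a placeholder."""
    tool_idx = [i for i, m in enumerate(messages) if m.get("type") == "tool_result"]
    stale = set(tool_idx[:-keep_last]) if keep_last else set(tool_idx)
    return [
        {**m, "content": PLACEHOLDER} if i in stale else m
        for i, m in enumerate(messages)
    ]


history = [
    {"type": "tool_result", "content": "900 rows of logs..."},
    {"type": "text", "content": "Summarised: 3 errors."},
    {"type": "tool_result", "content": "search hits"},
    {"type": "tool_result", "content": "file contents"},
]
trimmed = clear_old_tool_results(history, keep_last=2)
```

The placeholder matters: the model can see that something was removed rather than silently losing track of what it already did.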

Providers handle prompt caching differently: OpenAI does it automatically, while Anthropic requires you to add a bit of code to your API requests to enable it. This saves on computational cost by reusing previous prompt computations.
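
On the Anthropic side, the "bit of code" is a `cache_control` marker on the content block you want reused. The request shape below is from memory of their Messages API and worth double-checking against current docs; the model name is just a placeholder:

```python
# Request payload for Anthropic's Messages API with a cached system prompt.
# Everything up to and including the cache_control block is reused on later calls.
big_reference_doc = "...thousands of tokens of policy docs..."

payload = {
    "model": "claude-sonnet-4-5",  # illustrative model name
    "max_tokens": 1024,
    "system": [
        {
            "type": "text",
            "text": big_reference_doc,
            "cache_control": {"type": "ephemeral"},  # cache breakpoint
        }
    ],
    "messages": [{"role": "user", "content": "Summarise the refund policy."}],
}
```

Put the large, stable content (docs, instructions) before the breakpoint and the per-request user message after it, so the expensive part is what gets cached.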

Context-window awareness gives developers and the model real-time visibility into how much context space remains in a session. It supports memory and context editing and can serve other use cases too.

OpenAI's Retrieval API acts as a built-in RAG system: instead of managing your own vector database and retrieval pipeline, you upload documents to OpenAI and they handle indexing, search, and context injection automatically.
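
Here's a trivial sketch of how an orchestrator might act on that awareness. The window size and threshold are made up, and real token counts would come from the provider's token-counting endpoint:

```python
def context_plan(tokens_used: int, window: int = 200_000,
                 wrap_up_at: float = 0.85) -> str:
    """Decide whether the agent should keep going, start wrapping up,
    or trim context. Thresholds here are illustrative, not provider defaults."""
    usage = tokens_used / window
    if usage >= 1.0:
        return "truncate"       # over budget: trim old tool results first
    if usage >= wrap_up_at:
        return "wrap_up"        # near the limit: finish the answer now
    return "continue"           # plenty of room: keep persisting on the task


print(context_plan(10_000))    # early in the session
print(context_plan(180_000))   # close to the limit
```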

So basically, model providers are training their models to actually use these new tools effectively, making the distinction between improvements baked into the model during training and those exposed via API tools increasingly unclear.

The bit about context management moving upstream to the model providers is super interesting, because I've been seeing the same thing with prompt optimization: tools like mine try to abstract away the complexity, and it feels like the big players are starting to do just that with context.

My take is that this shift is going to democratize building advanced LLM applications even further. It feels like we're moving from an era of painstaking infrastructure building to one focused purely on agent design and intelligent orchestration. Context editing and memory tools are abstracting away the need for developers to manually manage all that context, and in practice I've seen how much time that saves users building complex agents.

reddit.com
u/Distinct_Track_5495 — 1 day ago
