u/DevilxxOP
Two Transformer authors, the BDH inventor, and the Liquid Neural Networks inventor got in a boxing ring to debate the path to AGI
The last time I saw Lukasz Kaiser and Llion Jones together was on the NVIDIA GTC stage with Jensen Huang. Now they were in a literal boxing ring in SF debating what comes after Transformers. Reminded me of Silicon Valley Episode 1 (Well done, Pathway)
The core question was:
Are Transformers the path all the way to AGI, or are they the architecture that gets us close enough to realize what the next thing needs to be? It was basically a debate in the format of a boxing match. They encouraged you to argue fiercely, and the winner was decided by a clapometer (i.e., whichever side drew more noise won).
The post-Transformer side had Adrian Kosowski, who is behind BDH; Mathias Lechner, known for Liquid Neural Networks; and Llion Jones himself. That last part is what makes it interesting, because Llion was one of the original Transformer authors.
Lukasz’s strongest pro-Transformer argument seemed empirical, not ideological. Transformers are simple, scalable, hardware friendly, and keep absorbing tasks people once thought needed special architectures: language, code, tools, agents, multimodal reasoning, long context. Ugly in some ways, but they work. The post-Transformer argument is not ideology either. Continual learning, energy cost, quadratic attention, dense computation, memory, and sample efficiency are well-known issues with LLMs. And you probably can't engineer a permanent workaround, since these are properties of the architecture itself.
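To make the quadratic attention point concrete, here is a minimal sketch of plain scaled dot-product attention (generic textbook form, nothing specific to what was presented at the event):

```python
import numpy as np

def attention(Q, K, V):
    # Scaled dot-product attention for one head.
    # Q, K, V: (n, d) arrays for a sequence of n tokens.
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)                   # (n, n) score matrix
    scores -= scores.max(axis=-1, keepdims=True)    # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax over keys
    return weights @ V                              # (n, d)

n, d = 4096, 64
rng = np.random.default_rng(0)
Q, K, V = (rng.standard_normal((n, d)) for _ in range(3))
out = attention(Q, K, V)
# The (n, n) score matrix is the quadratic term: doubling the context
# to 8192 quadruples its size, and the matmul FLOPs grow with it.
```

That (n, n) term is why long context is expensive, and it falls out of the architecture itself, not the implementation.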
Humans clearly do not learn like current foundation models. A child does not need to read the whole internet several times to become intelligent.
One framing from Adrian stuck with me. He said the PageRank moment for intelligence has not happened yet. Search existed before PageRank, but PageRank changed what scaled. Maybe Transformers are that moment for intelligence. Or maybe they are just the bridge to it.
Llion has made a similar point publicly: the Transformer has been so successful that it may have created inertia and unnecessary pressure in research. Anything new has to beat a brutally optimized stack with better data, kernels, hardware support, tooling, and billions of dollars behind it. So even if a better idea exists, it may look worse at first.
The quote I heard from the event:
“The success of the Transformer is stopping us from finding the next thing.”
On the other hand, Mathias apparently said we could eventually see frontier-style models running on a Raspberry Pi. Big claim, but the point is clear: the post-Transformer side is arguing that intelligence may need a different efficiency profile entirely.
That feels like the real tension.
Transformers are probably not the final architecture. But I also do not think they are going away soon. The realistic future might be hybrid: Transformers as the main substrate, with newer architectures adding better memory, recurrence, efficiency, or learning behavior.
There is also real momentum outside the usual Transformer scaling story. Sakana is the poster child in Japan. Liquid announced a Mercedes collab for embedded, on-device frontier AI. Pathway's BDH is also being commercialized with AWS and NVIDIA.
The big open questions for me:
Can we get reasoning that is not just language-first?
Can we get memory that is not just a bigger context window? BDH claims models can build something closer to experience, not just retrieve longer context. Is that the right direction?
Can we get inference time learning that is actual learning, not just retrieval?
And maybe the biggest one:
Will the next architecture be invented by humans, or by a Transformer-based system itself?

Curious what people here think. Are Transformers the AGI path, or just the first architecture powerful enough to reveal what the real requirements are?

PS – who do you think won the audience noise vote?
Not going to oversell this: it's simple, repetitive copy/paste work done entirely online. A phone or a laptop will do it; if you have both, even better.
What you get:
$10–20/week depending on workload (I give weekly percentage bonuses to certain members)
Weekly payments via USDT (BEP20), PayPal, or UPI
All tasks are posted and managed in a Discord server — no chasing anyone for work
It won't make you rich, but this is perfect if you're looking for a small but steady weekly gig. If that sounds worth your time, join to get started: https://discord.gg/KfVAJXDg
quarterly budget review. my project, my numbers. i've been over this budget with my team, with my manager, in a prelim review last month. i know every line.

cfo joined unexpectedly. started asking pointed questions. 'why is the vendor cost this high, have you explored alternatives, what's the roi model for this line.'

all questions i have answers to. all questions i have literally answered before. but something about the unexpected presence and the directness of her questions made me start hedging everything. 'i believe the rationale was' instead of just saying the rationale. 'we looked at a few options' instead of naming the options i actually evaluated.

ended the meeting with her saying she'd 'like to revisit some of these numbers.' which is the worst thing you can hear after a budget review you should have owned.

how do you hold your ground with finance in those rooms
THE ONE PIECE remake by WIT STUDIO scheduled for Feb 2027
All 7 episodes together on Netflix
- Season 1 will have 7 episodes
- February 2027 release on Netflix
- 300 minutes runtime
- All episodes will release at once
trying to understand the mechanics a bit more before we commit. from what I can tell Ava 2.0 classifies replies into intent categories and routes them accordingly - interested, not now, not the right person, hard no, that kind of thing. what I can't figure out from the docs is how much of this is editable. can you define your own intent categories? can you customise what happens downstream for each one - like auto-pause vs auto-reassign to a human vs continue with a modified sequence?

also curious how it handles ambiguous replies. like someone writes back "what does this actually do" - is that treated as interest or neutral? and does it loop a human in or try to respond itself?

if anyone's dug into the workflow config on this I'd appreciate the detail. the marketing is everywhere but the actual technical breakdown is hard to find.
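to make the question concrete, this is the shape of routing I'm imagining. purely a hypothetical sketch, not Ava's actual API or config format (every name and field here is made up):

```python
# Hypothetical sketch only - NOT Ava 2.0's real API or config format.
# Just illustrating the kind of intent -> action mapping I'm asking about.
from dataclasses import dataclass
from typing import Optional

@dataclass
class Route:
    intent: str               # label the classifier assigns to a reply
    action: str               # "continue" | "pause" | "handoff_to_human"
    next_sequence: Optional[str] = None  # follow-up sequence to switch to

routes = [
    Route("interested",       "handoff_to_human"),
    Route("not_now",          "pause", next_sequence="revive_in_90_days"),
    Route("not_right_person", "continue", next_sequence="ask_for_referral"),
    Route("hard_no",          "pause"),
    # the open questions: can you add a custom category like this,
    # and what does the system do when the classifier is unsure?
    Route("ambiguous",        "handoff_to_human"),
]
```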
Every demo starts to sound the same after a while.
“personalized at scale”
“data-driven outreach”
“increase pipeline faster”
I get that the core problems are similar, and maybe it's just my imagination, but everything seems to be converging into the same language, same promises, and same positioning. Even the actual outputs from a lot of these tools end up looking and sounding alike, which kind of defeats the point. Is this just where the market is at, or are we stuck in a loop of copying what supposedly works without questioning it?
been on cursor daily for about 5 months. full stack typescript, mostly internal tools and frontend. I went through a phase where I was constantly fixing cursor's output and blaming the model.
then I looked at my prompts. they were basically shorthand that only made sense to me. "add dark mode to settings." "fix the table sort." "build a notification component." these are notes to myself, not instructions for someone who doesn't have my context.
the shift was treating cursor like a competent developer who just joined the team today. they're smart but they don't know our codebase, our patterns, or what I specifically want. so I started including stuff like which existing components to reference, what the state management approach should be, which edge cases to handle, what the behavior should be on error states.
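a made-up example of the difference (the component names here are invented, but this is the shape): "add dark mode to settings" becomes "add a dark mode toggle to the settings page. follow the pattern in our ThemeProvider, persist the preference the same way we persist locale, reuse the existing Switch component, and define what the toggle shows while the save is in flight or if it fails."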
my prompts went from one sentence to a full paragraph and the output went from "eh, I'll rewrite this" to "ok I can work with this, just need to tweak a few things." the model was always capable. I just wasn't giving it enough.
the bottleneck for me was that typing a paragraph-length prompt for every task felt slow. so I started talking through requirements out loud and pasting the transcription. I use an AI voice dictation tool called Willow Voice for it. speaking naturally I end up covering more ground than when I type because I don't edit myself. the transcription comes out clean enough for cursor to work with.
but really the core thing is just: more context = better output. you can type it, dictate it, whatever. the model needs to know what you know, and most of us give it about 20% of that by default.
what do your cursor prompts look like? short commands or full descriptions? curious what the sweet spot is for other people.