r/newAIParadigms

'Dragon Hatchling' AI architecture modeled after the human brain, rewires neural connections in real time, could be a key step toward AGI


TLDR: A group of researchers attempted to replicate the brain's plasticity by designing a neural network with real-time self-organization abilities, where neural connections change continuously as new data is processed. They bet on generalization emerging from continual adaptation.

---

➤Key quotes:

>Researchers have designed a new type of large language model (LLM) that they propose could bridge the gap between artificial intelligence (AI) and more human-like cognition.

and

>Called "Dragon Hatchling," the model is designed to more accurately simulate how neurons in the brain connect and strengthen through learned experience, according to researchers from AI startup Pathway.

and

>They described it as the first model capable of "generalizing over time," meaning it can automatically adjust its own neural wiring in response to new information. Dragon Hatchling is designed to dynamically adapt its understanding beyond its training data by updating its internal connections in real time as it processes each new input, similar to how neurons strengthen or weaken over time.

and

>Unlike typical transformer architectures, which process information sequentially through stacked layers of nodes, Dragon Hatchling's architecture behaves more like a flexible web that reorganizes itself as new information comes to light. Tiny "neuron particles" continuously exchange information and adjust their connections, strengthening some and weakening others.

and

>Over time, new pathways form that help the model retain what it's learned and apply it to future situations, effectively giving it a kind of short-term memory that influences new inputs.

➤IMPORTANT CAVEAT

>In tests, Dragon Hatchling performed similarly to GPT-2 on benchmark language modeling and translation tasks — an impressive feat for a brand-new, prototype architecture, the team noted in the study.

>Although the paper has yet to be peer-reviewed, the team hopes the model could serve as a foundational step toward AI systems that learn and adapt autonomously.

livescience.com
u/Tobio-Star — 18 hours ago

Global mixing for local compute

If you have a sparse neural network, it may take several layers for global mixing to occur.

Or you may try to construct a high-width layer from several small-width layers to take advantage of their smaller n² parameter count (a dense layer of width n costs n² weights, so several narrow layers are cheaper than one wide one).

Only to find that, for lack of global mixing, you have constructed multiple independent small-width neural networks.

You can take advantage of the one-to-all connectivity of fast transforms (e.g. the fast Walsh-Hadamard transform) to provide the global mixing that fixes this.
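As a minimal sketch of that one-to-all connectivity (my own illustration, not the linked code): the in-place fast Walsh-Hadamard transform touches every input when computing every output, in O(n log n) additions rather than the O(n²) of a dense matrix.

```java
// Sketch: in-place fast Walsh-Hadamard transform (unnormalized).
// After the transform, each output is a signed sum of ALL inputs,
// so a single transform gives one-to-all (global) connectivity.
public final class Fwht {

    // data.length must be a power of two.
    static void transform(float[] data) {
        int n = data.length;
        for (int h = 1; h < n; h *= 2) {            // butterfly spans 1, 2, 4, ...
            for (int i = 0; i < n; i += 2 * h) {
                for (int j = i; j < i + h; j++) {
                    float a = data[j];
                    float b = data[j + h];
                    data[j] = a + b;                 // sum branch
                    data[j + h] = a - b;             // difference branch
                }
            }
        }
    }

    public static void main(String[] args) {
        // A single impulse at index 0 spreads to every output.
        float[] x = {1f, 0f, 0f, 0f, 0f, 0f, 0f, 0f};
        transform(x);
        System.out.println(java.util.Arrays.toString(x));
    }
}
```

Changing any one input changes all n outputs, which is exactly the cross-block information flow a stack of narrow layers lacks.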

(AI assisted text): https://archive.org/details/fast-walsh-hadamard-transform-in-neural-networks-clean-view

Java code for mixing multiple width-16 (2-headed ReLU) layers into one layer, and then building a neural network out of those:

https://archive.org/details/sw-net-16-b
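To make the idea concrete (this is a hypothetical sketch of the general scheme, not the archived sw-net-16-b code): run several independent width-16 dense+ReLU blocks side by side, then apply one full-width Walsh-Hadamard transform so the blocks stop being isolated sub-networks.

```java
import java.util.Random;

// Hypothetical sketch: a wide layer built from independent width-16
// dense+ReLU blocks, followed by a full-width fast Walsh-Hadamard
// transform that mixes information across all blocks.
public final class MixedLayer {
    static final int BLOCK = 16;
    final int width;        // total width: a power of two, multiple of BLOCK
    final float[][][] w;    // one BLOCK x BLOCK weight matrix per block

    MixedLayer(int width, long seed) {
        this.width = width;
        int blocks = width / BLOCK;
        w = new float[blocks][BLOCK][BLOCK];
        Random rng = new Random(seed);
        for (float[][] m : w)
            for (float[] row : m)
                for (int j = 0; j < BLOCK; j++)
                    row[j] = (float) rng.nextGaussian() / BLOCK;
    }

    float[] forward(float[] x) {
        float[] y = new float[width];
        // Small dense + ReLU blocks: only BLOCK^2 weights each,
        // but on their own they never talk to each other.
        for (int b = 0; b < width / BLOCK; b++) {
            for (int i = 0; i < BLOCK; i++) {
                float s = 0f;
                for (int j = 0; j < BLOCK; j++)
                    s += w[b][i][j] * x[b * BLOCK + j];
                y[b * BLOCK + i] = Math.max(0f, s);   // ReLU
            }
        }
        wht(y);   // parameter-free global mixing across all blocks
        return y;
    }

    // In-place fast Walsh-Hadamard transform, O(n log n).
    static void wht(float[] d) {
        for (int h = 1; h < d.length; h *= 2)
            for (int i = 0; i < d.length; i += 2 * h)
                for (int j = i; j < i + h; j++) {
                    float a = d[j], b = d[j + h];
                    d[j] = a + b;
                    d[j + h] = a - b;
                }
    }
}
```

The transform adds no parameters: the per-block matrices carry all the learnable weights, while the WHT guarantees that information entering one block can reach every output of the layer.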

u/oatmealcraving — 20 hours ago