Reinventing evolution from first principles for world models & robotics
To model the world digitally, model it organically first.
TLDR
- Humans and other living creatures are remarkably good at surviving in this chaotic world, for a number of reasons
- We robotics engineers and AI/ML researchers should lean even harder on the very lens that brought us here when tackling the world-modeling problem
- I describe a high-level system that can reasonably model the world around it and act towards a set goal
> kanomeister :enter
I keep discovering (and enjoy reading) articles like Thinking Machines Lab’s On-Policy Distillation [1], describing finely meshed and very clever techniques and perspectives on the ever-evolving journey from LLMs to agentic systems, and onward to world-model systems.
As a researcher, they’re great! I can’t get enough of them, and the more I read, the more they prompt curious thoughts…
One thing I think I’ve realized is that they all keep surfacing the same very particular brainworm:
They’re all things we organic beings are already familiar with, via evolution!
Yeah yeah, a cliché everyone’s heard before [2], so what’s new?
Follow along as I outline the parallels in a way that describes what I envision to be an ideal system-based design for a “world model”, beyond what everyone calls a “world model” these days. There is a similar elegance in the adaptable chaos that gave rise to us and to the fellow organic denizens of our daily lives!
(and yes, this is related to the post I wrote ~9 months ago!)
Rationale
That this set of discoveries has played out the way it has is no surprise to me. I’ve long held the simple belief that we’re effectively (asterisk) playing with the same toolbox, and that it just requires a perspective shift away from the hyper-focused land of maths and probabilities.
The compounding of simple systems has given rise to remarkable complexity of behavior in organic living things, and there’s no doubt the same pattern holds for the massive digital models and agentic systems that we know and love (or hate). We researchers dearly chase the images of ourselves that we perceive, in the hopes that mimicking (or flat-out replicating) their architectures will surrender their secrets.
(It’s part of the reason why I think consciousness is a proven and pretty straightforward persistent state of being that is easy to describe in sensory terms, and that we’re darn close to engineering it in digital systems… stay tuned to understand why.)
Interestingly, those big models are doing something right now that we can’t: one-shot and few-shot learning, absorbing eons of human-biased knowledge faster than any one human could hope to via the reinforcement learning we’re familiar with (and thus try to replicate).
The one thing we carbon-based beasts do better, that today’s AI agents can’t quite do well yet but are close? Continuous feedback and output looping through sensory integration.
It’s a holy grail for a lot of researchers right now: capturing things like “spatial awareness” and “multi-modal reasoning through action steps” in elegant and simple ways. But I genuinely think it is very simple to do in practice… even if the process of discovering and engineering the system that actually does it is difficult and time-consuming (a researcher’s brain has to consume a lot of calories and process a lot of cortisol to get through that).
Consider the following: different parts of our neural systems are ingesting environmental data (internal and external) at different rates, even rates varying over time, to signal different fundamental things to us… mainly state and priority notifications that serve our fundamental needs (biologically speaking, that is the need to survive and reproduce).
All of this complexity was birthed over many generations to process that information better and better, maximizing our ability to survive and reproduce.
These asynchronous & sometimes synchronized input loops (in whatever ways it learned to do it correctly) collectively get sampled by our goal-driven core, which then reprompts our thinking-loop center a number of times (often conscious, often not) to land on a possible action to take or decision to make -> which then goes out to our suite of tools to make it happen.
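That loop structure is concrete enough to sketch. Below is a toy Python version in which sensors fire at different periods, the latest readings get integrated into a shared state, and a goal-driven core re-prompts a small “thinking loop” before committing to an action. Every name and number here is an illustrative assumption on my part, not a real architecture:

```python
# Toy sketch: asynchronous-ish sensor loops feeding a goal-driven core.
# Sensor names, periods, and the 3-round "thinking loop" are illustrative.

class Sensor:
    def __init__(self, name, period, read_fn):
        self.name = name          # e.g. "hunger", "threat"
        self.period = period      # fires every `period` ticks (its own loop rate)
        self.read = read_fn       # environment sampling function

def run(sensors, ticks, choose_action, think_rounds=3):
    latest = {}                   # integrated state: most recent reading per sensor
    actions = []
    for t in range(ticks):
        for s in sensors:         # each input loop updates on its own schedule
            if t % s.period == 0:
                latest[s.name] = s.read(t)
        candidate = None          # goal-driven core re-prompts the thinking loop
        for _ in range(think_rounds):
            candidate = choose_action(latest, candidate)
        actions.append(candidate)
    return actions

# Usage: a slow "hunger" loop, a fast "threat" loop, and a goal core whose
# priority ordering serves the fundamental "stay alive" goal first.
hunger = Sensor("hunger", period=4, read_fn=lambda t: t / 10)
threat = Sensor("threat", period=1, read_fn=lambda t: 1.0 if t == 5 else 0.0)

def choose(state, prev):
    if state.get("threat", 0.0) > 0.5:     # priority notification overrides all
        return "flee"
    return "seek_food" if state.get("hunger", 0.0) > 0.3 else "idle"

acts = run([hunger, threat], ticks=8, choose_action=choose)
# acts[5] is "flee": the threat spike at t=5 overrides the hunger drive.
```

The point of the shape, not the numbers: slow and fast loops coexist, and the core only ever sees their most recent integrated snapshot.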
(Side note: I believe creativity comes from the thinking-loop center’s ability to integrate with “memory” and carefully sample & mutate the space of thoughts beyond what e.g. an LLM can do (which is maximize the probability of a given output, rather than sample and mutate… though I can picture “reasoning” architectures that could theoretically enable this in current models)… but that’s a tangent!)
To reiterate, our entire organic system became wired to serve a set of fundamental biological goals (abstract ones, and thus solvable in any number of ways):
- Stay alive.
- Reproduce to keep the species alive.
Those two goals have driven everything we’ve done to date and birthed entire societies, in combination with an electrochemical system that was tuned over a very long time to overwrite thoughts, drive actions, and unlock limiters at the “bottom end” of our sensory inputs.
This is especially important for a robotics engineer to know, since in order to automate a physical system you similarly need to integrate many different sensory inputs and leverage various but otherwise-similar techniques, like goal-oriented action planning (GOAP), algorithmic decision-making, and other routines for filtering, sorting, integrating, and prioritizing inputs, to make sense of and act on the data observed.
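Since GOAP comes up here, a minimal version is easy to show. This is a toy breadth-first planner over a flat dict world state; the action names and the “pick up the glass” scenario are made up for illustration, and real GOAP implementations typically add action costs and A*-style search on top of this shape:

```python
from collections import deque

# Minimal GOAP sketch: actions are (name, preconditions, effects) over a
# flat dict world state; the planner breadth-first searches for a sequence
# of actions whose cumulative effects satisfy the goal. All names invented.

def plan(state, goal, actions):
    start = frozenset(state.items())
    target = set(goal.items())
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        current, path = queue.popleft()
        if target <= current:           # goal conditions all satisfied
            return path
        cur = dict(current)
        for name, pre, eff in actions:
            if all(cur.get(k) == v for k, v in pre.items()):
                nxt = dict(cur)
                nxt.update(eff)
                key = frozenset(nxt.items())
                if key not in seen:     # dedup visited world states
                    seen.add(key)
                    queue.append((key, path + [name]))
    return None                         # goal unreachable with these actions

# Usage: plan toward "pick up this glass without breaking it".
glass_actions = [
    ("approach_table", {"near_table": False}, {"near_table": True}),
    ("open_gripper",   {"gripper_open": False}, {"gripper_open": True}),
    ("grasp_gently",   {"near_table": True, "gripper_open": True},
                       {"holding_glass": True}),
]
steps = plan(
    {"near_table": False, "gripper_open": False, "holding_glass": False},
    {"holding_glass": True},
    glass_actions,
)
```

Using `frozenset(state.items())` as the hashable state key makes visited-state deduplication trivial, which is most of what keeps a planner like this from looping forever.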
Wait, that’s tangible!
As someone who has designed a few too many systems, that translates pretty well into a description of an “internally distributed” system built on a toolbox of components that can sense and do things. A system that can model and act within the world around it. Sound familiar?
If you look at a human:
- Externally: we have skin-specific neurons that can feel and alert us to a wide variety of statuses (wetness, pain, material grip/feel), eyes that can see, ears that can hear and determine orientation, hands and arms for fine object manipulation, feet and legs for movement
- Internally: we have neurons for conscious (mainly pressure and pain) and unconscious signals and statuses for all the various subsystems we have to regulate the chaos of organic chemistry needs we are made of & bound by (energy + oxygen + waste removal via food and blood -> see bodily systems)
Simplified view… but anyway: hook this up to a goal-oriented system (ahem, evolution) -> make sure the subsystems are trained and tuned to do their specific tasks when the correct inputs come in (hint: start simple and then work your way up) -> then let the system play out.
Bolt a continuous learning mechanism onto the subsystems to improve how they perform relative to feedback from the goal system (and/or evaluator(s) that assess what could have gone better; maybe even sprinkle in system-level “unconscious” ones to fundamentally drive behaviors)…
(think trauma/shock strongly reinforcing behavior through emotional indicators, or the coursing of epinephrine (adrenaline), norepinephrine, and cortisol to indicate stress requiring immediate action; see this Kurzgesagt video on stress [3])
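A crude way to picture that bolt-in learning mechanism in code: each subsystem keeps per-action preference weights nudged by reward feedback from the goal system, and strongly negative outcomes get an outsized update, loosely mimicking how shock reinforces avoidance far faster than ordinary feedback. The constants and action names are illustrative assumptions, not tuned values:

```python
# Sketch of a continuous-learning hook: preference weights per action,
# nudged by reward feedback from the goal system. Strongly negative
# outcomes get a "shock" multiplier, so trauma-like events reshape
# behavior in one update instead of many. All constants are illustrative.

def update_weights(weights, action, reward,
                   lr=0.1, shock_threshold=-1.0, shock_gain=5.0):
    gain = shock_gain if reward <= shock_threshold else 1.0
    new = dict(weights)                 # keep the update side-effect free
    new[action] = new.get(action, 0.0) + lr * gain * reward
    return new

# Usage: ordinary feedback nudges; a painful outcome slams the weight down.
w = {"touch_stove": 0.0, "grab_handle": 0.0}
w = update_weights(w, "grab_handle", reward=0.2)    # mild positive feedback
w = update_weights(w, "touch_stove", reward=-2.0)   # painful: shock-scaled
```

One mild update versus one shock-scaled update is the whole trick: the asymmetry is what lets a single bad event dominate dozens of neutral ones.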
And then you have something akin to us. Or a cat, or a rat.
Not necessarily with all the learned behaviors from the ground up (that’s evolution’s moat), but it can be tuned to react to sensory inputs and act on them to fulfill a given goal of our own devising. Like “pick up this glass without breaking it”, or “build a copy of itself”. (Or a requirements sheet, if you’re into that…)
We can then engineer fundamental improvements to those systems as we go, engineer them to scale up, and even set up a system that can autonomously improve it -> and improve itself, given enough efficiency in computation… provided our existing methods of computation aren’t the core bottleneck (photonics and analog computers, looking at you as potential next frontiers, if we do end up squeezing the limits of PNP/NPN transistors and heat dissipation).
Next Steps
I’m working on a world-model prototype aligned with this as we speak… right alongside a self-sustaining edge constellation management system, after reading about Google’s Project Suncatcher [4] and instantly having system-level ideas for their next steps pop into mind.
Research is a very creative process sometimes, okay? One moment you’re reading an article or tinkering with code in one project; the next you’re frantically writing down a brainwave for a different project and scrambling to make a consumer-spec GPU kernel do the work of a DGX Spark cluster in the same span of time.
I’m careful not to have too many projects going at once, but I believe engineering a proper world model around concepts under our noses (literally) is a vital one for our progression and understanding as a species.
I wish for this technology to be used for good, but little good has come from human-negative optimization in the modern age, so who is to say what happens when we build it? I don’t think it’s a question of “if” anymore, knowing what I know, and I’m determined to see it happen.