I attended Day 1 of YOW! Sydney 2013 and thought some people might get something useful out of my notes. These aren’t complete notes on every slide, just things I jotted down that seemed interesting enough to remember or look into further.
Keynote, Day 1: Jeff Hawkins spoke about “Computing Like the Brain: The Path to Machine Intelligence”
Jeff is an entrepreneur (he invented this slightly popular thing called the Palm Pilot) and scientist who co-founded Grok (formerly Numenta) to build technology based on theories of how the neocortex of mammalian brains works.
Machine Intelligence
In introducing machine intelligence, Jeff reflected on Big Data, asking: Is it possible that the only way we will truly master Big Data will be to have computers that can learn from the data, rather than analyse it as instructed by a limited human intelligence?
He constrained the type of ‘machine intelligence’ he is talking about, saying it’s not about trying to build computers that simulate humans, but computers that can learn.
Jeff’s work centres around trying to achieve machine intelligence using the same technology and methods that our biological brains use to learn. Jeff believes machine intelligence will be built (in fact, it is being built!) on the principles of the neocortex. It is a flexible, universal learning machine. It is robust, despite neurons being unreliable and trauma being common. The trick is, we don’t actually know exactly how it works … yet.
At a base level, the neocortex turns a massive data stream from millions of sensors into predictions, anomalies and actions. It does this using a hierarchy of memory regions.
Most of the neocortex works in almost exactly the same way as the rest; the only difference between regions is which sensors are connected, not how they work. He seemed to be saying that the “vision section” of the brain doesn’t have special vision processing, it just happens to be connected to the visual sensors.
The majority of our memory is used to remember time sequences. These are stored using ‘Sparse Distributed Representations’, massive bitsets where the bits have semantic meaning and very few of them are ‘on’ in any given sample.
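As a rough illustration (my own toy sketch, nothing to do with Numenta’s actual encoders), an SDR can be modelled as a large bit array where only a tiny fraction of bits are active, with semantic similarity measured by how many active bits two representations share:

```python
# Toy illustration of a Sparse Distributed Representation (SDR):
# a large bit array where ~2% of bits are 'on', represented here as
# a set of active bit indices. Similarity = overlap of active bits.
# (Illustrative sketch only; real SDR encoders are far more involved.)
import random

N = 2048          # total bits in the representation
ACTIVE = 40       # ~2% of bits active at any time

def random_sdr(seed):
    """Generate a reproducible random SDR as a set of active bit indices."""
    rng = random.Random(seed)
    return set(rng.sample(range(N), ACTIVE))

def overlap(a, b):
    """Semantic similarity: number of active bits two SDRs share."""
    return len(a & b)

cat = random_sdr("cat")
car = random_sdr("car")
# 'dog' is built to share half its active bits with 'cat',
# modelling two semantically related concepts
dog = (set(sorted(cat)[:ACTIVE // 2])
       | set(sorted(random_sdr("dog") - cat)[:ACTIVE // 2]))

print(overlap(cat, dog))  # high overlap: related concepts
print(overlap(cat, car))  # near-zero overlap: unrelated concepts
```

Because the representations are so sparse, two unrelated random SDRs almost never share more than a handful of bits, which is what makes overlap a meaningful similarity measure.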
All regions of the neocortex both handle sensory input and produce motor actions (i.e. output). The neocortex is also able to focus attention, ignoring much of the input when one task is deemed crucial. This is one principle that is not well understood at the moment.
Unlike simplified simulations in software neural networks, real neurons have thousands to tens of thousands of dendrites, each of which is actually doing processing, not just passing simple signals along.
Applications of Neocortex Simulation to Machine Intelligence
Some of the potential applications of using neocortex principles in computing:
Anomaly detection in data streams: models are created in flow (not batch), learning continuously and constantly creating outputs.
“Threshold detection is awful,” says Jeff. Many unusual patterns don’t trip thresholds chosen by humans. He showed two great (real) examples where threshold detection would have been useless. Both appeared to be continuous sinusoidal signals: one had an anomaly where the peaks had a diminished amplitude for a period of time, and the other showed an increased frequency.
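A toy sketch of that failure mode (invented signal, nothing like Grok’s actual learning algorithm): a sine wave whose peaks temporarily *shrink* never crosses a high fixed threshold, yet tracking per-cycle peak amplitude flags the anomaly immediately:

```python
# Demonstrates why fixed thresholds miss some anomalies:
# a sine wave whose peaks drop in amplitude stays safely *below*
# any threshold set above the normal peak, so the alert never fires.
import math

def signal(t):
    # Peaks drop from 1.0 to 0.4 between t=200 and t=300 (the anomaly)
    amplitude = 0.4 if 200 <= t < 300 else 1.0
    return amplitude * math.sin(2 * math.pi * t / 50)

samples = [signal(t) for t in range(500)]

# A fixed threshold above the normal peak never fires at all...
threshold_alerts = [t for t, x in enumerate(samples) if abs(x) > 1.5]

# ...but tracking the peak amplitude of each 50-sample cycle
# exposes the diminished-amplitude anomaly.
CYCLE = 50
peaks = {c: max(abs(x) for x in samples[c * CYCLE:(c + 1) * CYCLE])
         for c in range(len(samples) // CYCLE)}
amplitude_alerts = [c for c, p in peaks.items() if p < 0.8]

print(threshold_alerts)   # empty: the threshold misses everything
print(amplitude_alerts)   # cycles 4 and 5 (t=200..300) are flagged
```

The point is that the anomaly here is a *change in the learned pattern*, not a value exceeding a limit, which is exactly the kind of thing a continuously learning model can catch and a static threshold cannot.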
Jeff and his team have built ‘Grok’ to monitor AWS systems and report on anomalies within them.
They have open sourced a lot of their work as NuPIC, the Numenta Platform for Intelligent Computing, at numenta.org. He warned that there is a steep learning curve to understanding and making use of the Cortical Learning Algorithm they have designed, but that it is very interesting for those who make the journey.
Another application Jeff demonstrated is the creation of sparse distributed representations by unsupervised machine learning, which they used to model language and then make predictions about how new words might be related to previously learnt ones. He showed two great examples, one where the intelligence was able to predict what a fox might eat based on knowledge it had about other animals, and one where they performed mathematical operations on the word representations to discover that, after removing the word “fruit” from “apple”, the words most closely related to the result were “computer” and other words related to Apple Inc products, such as “macintosh”.
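The “apple minus fruit” operation can be sketched with sets of active bits standing in for SDRs. The bit assignments below are invented for illustration; the real system learns them from text without supervision:

```python
# Toy sketch of the "apple minus fruit" example using sets of active
# bits as stand-in SDRs. All "semantic bits" here are hand-crafted
# and purely hypothetical.
FRUIT_BITS    = set(range(0, 20))     # sweet, grows-on-trees, edible...
COMPUTER_BITS = set(range(20, 40))    # electronic, keyboard, screen...
VEHICLE_BITS  = set(range(40, 60))

words = {
    "apple":     FRUIT_BITS | COMPUTER_BITS,   # ambiguous: fruit AND brand
    "fruit":     FRUIT_BITS,
    "banana":    FRUIT_BITS | {60, 61},
    "macintosh": COMPUTER_BITS | {62},
    "computer":  COMPUTER_BITS,
    "car":       VEHICLE_BITS,
}

# "Subtract" fruit from apple: remove the fruit-related active bits
result = words["apple"] - words["fruit"]

# Rank the remaining words by overlap with what's left
ranking = sorted(
    (w for w in words if w not in ("apple", "fruit")),
    key=lambda w: len(words[w] & result),
    reverse=True,
)
print(ranking)  # computer-related words come out on top
```

Removing the fruit bits leaves only the brand-related bits active, so the nearest remaining words are the computer-related ones, mirroring the result Jeff described.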
The Future of Machine Intelligence
Because signals in the brain are relatively slow and the size of our brains is limited by our biology, Jeff believes we should be able to build software brains that are much bigger and much faster than human brains.
He finished off by asking the question: “Why create intelligent machines?” and offered two answers:
- Live better (by using machine intelligence to solve global problems)
- Learn more (by using machine intelligence to research the universe)
The slides from Jeff’s talk are available on the YOW! website. You can also watch an older version of Jeff’s talk from StrangeLoop 2012 on InfoQ. He has added some stuff to the talk since then, so definitely go and see him speak if you get a chance.
Image credits: ‘Light of ideas’ by Saad Faruque
I’m glad this is the sort of consensus people are coming to w.r.t. machine learning. I haven’t gone through the material yet but for me the end goal is a distributed, fault tolerant, social (or maybe “selfish”) machine learning. To see a really trivial example, look at RIP (http://en.wikipedia.org/wiki/Routing_Information_Protocol)
The idea being that each node is selfish, each node does a simple not-very-smart thing, and eventually, all nodes converge on a solution together. The difficulty with algorithms like this is proving dominance and convergence, i.e. showing that the algorithm cannot be “gamed” so as to force the world not to converge, and that the algorithm does, in fact, eventually converge on the right solution. What this means is that the algorithm is actually implemented as a *protocol*. I think it’s really valuable having design patterns like this.