
The Major "Schools" of Artificial Intelligence, in One Infographic

2017-04-05 唐杉 StarryHeavensAbove

When we talk about artificial intelligence today, we usually mean machine learning based on deep neural networks (i.e., deep learning). In fact, though, AI is a discipline with a long history and rich content. Because machine learning has delivered such strong practical results over the past couple of years, the other research directions seem to have been forgotten. Recently this has started to change, and other directions are making themselves heard again. For example, a piece of news from a few days ago: "DARPA (the US Defense Advanced Research Projects Agency) recently invested $7.2 million in Gamalon." Gamalon is a company working on "Bayesian programming."

As it happens, yesterday I came across two quite interesting articles, both about the various "tribes" within artificial intelligence; "schools" (as in martial-arts schools) also fits well. Although they all belong to the same AI "martial world," the relationships among them are subtle: there is competition as well as cooperation, and sometimes they even badmouth each other. The first article is "AI's Factions Get Feisty. But Really, They're All on the Same Team" [1]; the second is "The Many Tribes of Artificial Intelligence" [2]. The second one, in particular, uses an infographic to vividly depict the relationships among the tribes.

[Infographic: the tribes of AI. Image from Intuition Machine, medium.com]

The author of that article quite "seriously" gave each "tribe" a name (some of them are, of course, widely accepted) and even designed a "badge" for each. The first one that caught my eye was the PAC Theorists' badge.


Below I have simply copied over the description of each "tribe." The highlighted one is Deep Learning; some of its branch names are quite amusing, and the content has its highlights too!

Symbolists - Folks who use symbolic rule-based systems to make inferences. Much of AI's history has revolved around this approach. The approaches built on Lisp and Prolog are in this group, as are the Semantic Web, RDF, and OWL. One of the most ambitious attempts here is Doug Lenat's Cyc, started back in the 80's, which tries to encode in logic rules everything we understand about the world. The major flaw is the brittleness of this approach: one always seems to find edge cases where a rigid knowledge base doesn't apply. Reality just seems to have an inescapable fuzziness and uncertainty. It is like playing an endless game of Whack-a-Mole.
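
To make the Symbolist idea concrete, here is a minimal forward-chaining sketch of my own in Python; the facts and rules are made up purely for illustration:

```python
# A minimal forward-chaining rule engine: facts are strings,
# and each rule maps a set of premises to one conclusion.
rules = [
    ({"socrates is a man"}, "socrates is mortal"),
    ({"socrates is mortal"}, "socrates will die"),
]

facts = {"socrates is a man"}

# Repeatedly fire any rule whose premises are all known facts,
# until no new fact can be derived (a fixed point).
changed = True
while changed:
    changed = False
    for premises, conclusion in rules:
        if premises <= facts and conclusion not in facts:
            facts.add(conclusion)
            changed = True

print(facts)  # now includes "socrates is mortal" and "socrates will die"
```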

Evolutionists - Folks who apply evolutionary processes like crossover and mutation to arrive at emergent intelligent behavior. This approach is typically known as Genetic Algorithms (GA). We do see GA techniques used in place of gradient descent in Deep Learning, so it is not an approach that lives in isolation. Folks in this tribe also study cellular automata such as Conway's Game of Life, and Complex Adaptive Systems (CAS).
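
As a small illustration of the crossover/mutation loop described above, here is a toy genetic algorithm of my own (not from the article) that evolves bit strings toward all ones, the classic "OneMax" problem:

```python
import random

random.seed(0)
GENOME_LEN, POP_SIZE, GENERATIONS = 20, 30, 40

def fitness(genome):
    # OneMax: count the 1-bits; the optimum is all ones.
    return sum(genome)

def crossover(a, b):
    # Single-point crossover between two parent genomes.
    point = random.randrange(1, GENOME_LEN)
    return a[:point] + b[point:]

def mutate(genome, rate=0.05):
    # Flip each bit independently with a small probability.
    return [1 - g if random.random() < rate else g for g in genome]

def pick(pop):
    # Tournament selection of size 2: biased toward fitter parents.
    a, b = random.sample(pop, 2)
    return a if fitness(a) >= fitness(b) else b

pop = [[random.randint(0, 1) for _ in range(GENOME_LEN)] for _ in range(POP_SIZE)]
for _ in range(GENERATIONS):
    pop = [mutate(crossover(pick(pop), pick(pop))) for _ in range(POP_SIZE)]

best = max(pop, key=fitness)
print(fitness(best), best)  # typically at or near the optimum of 20
```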

Bayesians - Folks who use probabilistic rules and their dependencies to make inferences. Probabilistic Graphical Models (PGMs) are a generalization of this approach, and the primary computational mechanism is the Monte Carlo method for sampling from distributions. The approach has some similarity to the Symbolist approach in that there is a way to arrive at an explanation of the results. Another advantage is that a measure of uncertainty can be expressed in the results. Edward is one library that mixes this approach with Deep Learning.
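
That "measure of uncertainty" point fits in a few lines. Here is a sketch of my own, with made-up coin-flip data, that estimates a coin's bias by Monte Carlo sampling from the posterior (using the standard Beta-Bernoulli conjugacy):

```python
import random

random.seed(0)
heads, tails = 7, 3  # made-up coin-flip data

# Under a uniform Beta(1, 1) prior, the posterior over the coin's
# bias p is Beta(1 + heads, 1 + tails); draw Monte Carlo samples from it.
samples = sorted(random.betavariate(1 + heads, 1 + tails) for _ in range(100_000))

mean = sum(samples) / len(samples)
lo, hi = samples[int(0.025 * len(samples))], samples[int(0.975 * len(samples))]
print(f"posterior mean {mean:.3f}, 95% credible interval ({lo:.3f}, {hi:.3f})")
```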

Kernel Conservatives - One of the most successful methods prior to the dominance of Deep Learning was the SVM. Yann LeCun calls it glorified template matching. There is what is called the kernel trick, which turns an otherwise non-linear separation problem into a linear one. Practitioners in this field delight in the mathematical elegance of their approach. They believe the Deep Learners are nothing but alchemists conjuring up spells without the vaguest understanding of the consequences.
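
One way to see the kernel trick numerically (a sketch of mine, not the article's): a degree-2 polynomial kernel computed directly in 2-D agrees with an ordinary dot product in an explicitly lifted 6-D feature space, so a linear separator in the lifted space is a non-linear one in the original space:

```python
import math

def poly_kernel(x, y):
    # k(x, y) = (x.y + 1)^2, computed without ever leaving 2-D space.
    return (x[0] * y[0] + x[1] * y[1] + 1) ** 2

def lift(x):
    # Explicit feature map whose dot product reproduces the kernel:
    # (x1^2, x2^2, sqrt(2)*x1*x2, sqrt(2)*x1, sqrt(2)*x2, 1)
    s = math.sqrt(2)
    return [x[0] ** 2, x[1] ** 2, s * x[0] * x[1], s * x[0], s * x[1], 1.0]

x, y = (1.0, 2.0), (3.0, -1.0)
dot_lifted = sum(a * b for a, b in zip(lift(x), lift(y)))
print(poly_kernel(x, y), dot_lifted)  # both are 4.0, up to float rounding
```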

Tree Huggers - Folks who use tree-based models such as Random Forests and Gradient Boosted Decision Trees. These are essentially trees of logic rules that slice up the domain recursively to build a classifier. This approach has actually been quite effective in many Kaggle competitions. Microsoft has an approach that melds tree-based models with Deep Learning.
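
A minimal sketch of this tribe's workhorse, assuming scikit-learn is installed (the dataset and parameters here are my own choices for illustration):

```python
# A Random Forest on a toy non-linear dataset: each tree recursively
# slices the 2-D domain with axis-aligned rules, and the forest
# averages many such trees.
from sklearn.datasets import make_moons
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = make_moons(n_samples=500, noise=0.25, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_tr, y_tr)
print(f"test accuracy: {clf.score(X_te, y_te):.2f}")
```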

Connectionists - Folks who believe that intelligent behavior arises from simple mechanisms that are highly interconnected. The first manifestation of this was the Perceptron, back in the late 1950s. The approach has died and been resurrected a few times since then. Its latest incarnation is Deep Learning, which has several sub-tribes of its own (listed below).
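
Since the Perceptron is where this tribe started, here is a minimal sketch of that 1950s-era learning rule in plain Python, trained on a made-up linearly separable problem (the AND function):

```python
# Perceptron learning rule on AND: nudge the weights toward each
# misclassified example until every example is classified correctly.
data = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
w, b, lr = [0.0, 0.0], 0.0, 0.1

for epoch in range(20):
    for x, target in data:
        pred = 1 if w[0] * x[0] + w[1] * x[1] + b > 0 else 0
        err = target - pred  # -1, 0, or +1
        w[0] += lr * err * x[0]
        w[1] += lr * err * x[1]
        b += lr * err

print(w, b)  # weights and bias of a separating line for AND
print([(x, 1 if w[0] * x[0] + w[1] * x[1] + b > 0 else 0) for x, _ in data])
```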

  • The Canadian Conspirators - Hinton, LeCun, Bengio, et al.: end-to-end deep learning without manual feature engineering.

  • Swiss Posse - Basically LSTM, plus the claim that consciousness has been solved by two cooperating RNNs. This posse will have you lynched if you ever claim that you invented something before they did. GANs, "the coolest thing in the last 20 years" according to LeCun, are also claimed to have been invented by the posse.

  • British AlphaGoists - Conjecture that AI = Deep Learning + Reinforcement Learning, despite LeCun's claim that RL is just the cherry on the cake (see the small RL sketch after this list). DeepMind is one of the major proponents in this area.

  • Predictive Learners - I'm using the term Yann LeCun conjured up to describe unsupervised learning: the cake of AI, or the dark matter of AI. This is a major unsolved area of AI. I, however, tend to believe that the solution lies in "Meta-Learning".
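
For the AlphaGoist bullet above, here is a minimal tabular Q-learning sketch, purely my own toy (not DeepMind's method): an agent learns to walk right along a short corridor to reach a reward.

```python
import random

random.seed(0)
N_STATES, GOAL = 5, 4          # corridor states 0..4, reward at state 4
alpha, gamma, eps = 0.5, 0.9, 0.1

# Q[s][a] estimates the long-term reward of action a (0=left, 1=right)
# taken in state s.
Q = [[0.0, 0.0] for _ in range(N_STATES)]

def greedy(s):
    # Break ties randomly so the untrained agent still explores.
    best = max(Q[s])
    return random.choice([a for a in (0, 1) if Q[s][a] == best])

for episode in range(200):
    s = 0
    while s != GOAL:
        # epsilon-greedy: mostly exploit, occasionally explore.
        a = random.randrange(2) if random.random() < eps else greedy(s)
        s_next = min(max(s + (1 if a else -1), 0), N_STATES - 1)
        r = 1.0 if s_next == GOAL else 0.0
        # Q-learning update: bootstrap from the best action in the next state.
        Q[s][a] += alpha * (r + gamma * max(Q[s_next]) - Q[s][a])
        s = s_next

print([greedy(s) for s in range(N_STATES)])  # learned policy: mostly 1 ("go right")
```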

Compressionists - Cognition and learning are compression (an idea actually shared by other tribes). Information theory itself originated in arguments about compression. This is a universal concept, more powerful than the all-too-often-abused tools of aggregate statistics.
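
The "learning is compression" idea can be made concrete via Shannon's source-coding view: an outcome of probability p costs about -log2(p) bits, so a model that predicts the data well also compresses it well. A toy sketch of my own, with made-up data:

```python
import math

text = "aaaaabbbc"  # made-up data with a skewed symbol distribution

def bits_needed(text, prob):
    # Shannon code length: -log2 p(symbol), summed over the data.
    return sum(-math.log2(prob[ch]) for ch in text)

# A "model" that has learned the symbol frequencies...
learned = {ch: text.count(ch) / len(text) for ch in set(text)}
# ...versus one that has learned nothing (uniform over the 3 symbols).
uniform = {ch: 1 / 3 for ch in set(text)}

print(f"learned model: {bits_needed(text, learned):.1f} bits")
print(f"uniform model: {bits_needed(text, uniform):.1f} bits")
# The better predictor compresses better: learning as compression.
```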

Complexity Theorists - Employ methods coming from physics: energy-based models, complexity theory, chaos theory, and statistical mechanics. Swarm AI likely fits into this category as well. If any group has a chance of coming up with a good explanation of why Deep Learning works, it is likely this one.
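
"Energy-based models" can be illustrated with the classic Hopfield network (again a sketch of mine, not from the article): a stored pattern becomes a minimum of an energy function, and the update rule only ever moves downhill in energy:

```python
pattern = [1, -1, 1, -1, 1, -1, 1, -1]  # a made-up pattern to store
n = len(pattern)

# Hebbian storage: W[i][j] = x_i * x_j, with a zero diagonal.
W = [[pattern[i] * pattern[j] if i != j else 0 for j in range(n)] for i in range(n)]

def energy(state):
    # E = -1/2 * sum_ij W_ij * s_i * s_j; stored patterns sit at minima.
    return -0.5 * sum(W[i][j] * state[i] * state[j] for i in range(n) for j in range(n))

# Start from a corrupted copy and let the dynamics settle downhill.
state = pattern[:]
state[0], state[3] = -state[0], -state[3]  # flip two bits
print("energy before:", energy(state))

for _ in range(3):  # a few asynchronous sweeps; each flip lowers energy
    for i in range(n):
        h = sum(W[i][j] * state[j] for j in range(n))
        state[i] = 1 if h >= 0 else -1

print("recovered:", state == pattern, "energy after:", energy(state))
```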

Biological Inspirationalists - Folks who create models that are closer to neurons as they appear in biology. Examples are the Numenta folks and the spiking integrate-and-fire folks behind chips like IBM's TrueNorth.
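
A minimal sketch (mine; the parameters are arbitrary) of the leaky integrate-and-fire neuron model this tribe builds on: the membrane potential leaks, integrates input current, and emits a spike when it crosses a threshold:

```python
# Leaky integrate-and-fire neuron: dv/dt = (-v + R*I) / tau,
# with a spike and reset when v crosses the threshold.
tau, R, v_thresh, v_reset, dt = 10.0, 1.0, 1.0, 0.0, 0.1  # ms / arbitrary units
I = 1.2  # constant input current, strong enough to drive spiking

v, spikes = 0.0, []
for step in range(1000):                 # simulate 100 ms
    v += dt * (-v + R * I) / tau         # Euler step of the membrane equation
    if v >= v_thresh:                    # threshold crossed: spike and reset
        spikes.append(step * dt)
        v = v_reset

print(f"{len(spikes)} spikes in 100 ms; first few at {spikes[:3]} ms")
```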

Connectomists - Folks who believe that the brain's interconnection structure (i.e., the connectome) is where intelligence comes from. There is a project trying to replicate a virtual worm, and there is ambitious, heavily funded research (the Human Connectome Project) trying to map the brain in this way.

Information Integration Theorists - Argue that consciousness emerges from some internal imagination in machines that mirrors the causality of reality. The motivation of this group is that if we are ever to understand consciousness, we have to at least start thinking about it! I, however, can't see the relationship between learning and consciousness in their approach. It is possible that they aren't related at all! Maybe that's why we need sleep.

PAC Theorists - Folks who don't really want to discuss Artificial Intelligence; they prefer to study just intelligence, because at least they know it exists! Their whole idea is that adaptive systems perform computation expediently, such that they are all "probably approximately correct." In short, intelligence does not have the luxury of massive computation.
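
The name is literal: "probably approximately correct" has a precise textbook form. For a finite hypothesis class H in the realizable setting (a standard result, not something from the article), the sample-complexity bound reads:

```latex
% With probability at least 1 - \delta ("probably"), any hypothesis from a
% finite class H that is consistent with m i.i.d. training examples has true
% error at most \epsilon ("approximately correct"), provided
m \;\ge\; \frac{1}{\epsilon}\left(\ln|H| + \ln\frac{1}{\delta}\right)
```

Here ε is the "approximately" and δ the "probably," and the bound quantifies how much data, rather than how much computation, intelligence can afford.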


One more aside: several of the bigger open problems of deep neural networks, such as the "black box" problem, unsupervised learning, and energy consumption (compared with the human brain), may in the future have to be solved by borrowing the "kung fu" of these other "schools."

T.S.

References:

1. Cade Metz, "AI's Factions Get Feisty. But Really, They're All on the Same Team", wired.com

2. Carlos E. Perez, "The Many Tribes of Artificial Intelligence", medium.com
