Yoshua Bengio | From System 1 Deep Learning to System 2 Deep Learning | NeurIPS 2019

Past progress in deep learning has concentrated mostly on learning from a static dataset, largely for perception tasks and other System 1 tasks that humans perform intuitively and unconsciously. In recent years, however, a shift in research direction, new tools such as soft attention, and progress in deep reinforcement learning have opened the door to novel deep architectures and training frameworks for addressing System 2 tasks (those performed consciously), such as reasoning, planning, capturing causality, and obtaining systematic generalization in natural language processing and other applications. Expanding deep learning from System 1 tasks to System 2 tasks is important for achieving the long-standing deep learning goal of discovering high-level abstract representations, because we argue that System 2 requirements will put pressure on representation learning to discover the kind of high-level concepts that humans manipulate with language. Toward this objective, we argue that soft attention mechanisms are a key ingredient for focusing computation on a few concepts at a time (a "conscious thought"), in line with the consciousness prior and its associated assumption that many high-level dependencies can be approximately captured by a sparse factor graph. We also argue that the agent perspective in deep learning can help put more constraints on the learned representations to capture affordances, causal variables, and model transitions in the environment. Finally, we propose that meta-learning, the modularization aspect of the consciousness prior, and the agent perspective on representation learning should facilitate the reuse of learned components in novel ways (even statistically improbable ones, as in counterfactuals), enabling more powerful forms of compositional generalization, i.e., out-of-distribution generalization based on the hypothesis that changes in the environment due to agents' interventions are localized (in time, space, and concept space).
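
As a rough illustration of how soft attention can focus computation on a few concepts at a time, the sketch below scores a set of candidate "concept" slots against a query, turns the scores into weights with a softmax, and reads out a weighted average. It is not the talk's own formulation: the scaled dot-product scoring, the `temperature` parameter, the function names, and the toy data are all assumptions made for the example.

```python
import math


def softmax(scores, temperature=1.0):
    """Turn raw scores into positive weights that sum to 1."""
    scaled = [s / temperature for s in scores]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def soft_attention(query, keys, values, temperature=1.0):
    """Return (weights, read), where read is the weighted average of values.

    Scores are scaled dot products between the query and each key
    (an assumed scoring rule, chosen because it is widely used).
    """
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    weights = softmax(scores, temperature)
    dim = len(values[0])
    read = [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]
    return weights, read


# Hypothetical toy data: four "concept" slots with 2-d keys and values.
keys = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.0, -1.0]]
values = [[10.0, 0.0], [0.0, 10.0], [-10.0, 0.0], [0.0, -10.0]]
query = [4.0, 0.0]

# A low temperature concentrates almost all weight on the best-matching slot.
weights, read = soft_attention(query, keys, values, temperature=0.5)

# A high temperature spreads the weight across slots instead.
diffuse_weights, _ = soft_attention(query, keys, values, temperature=5.0)
```

Because every weight stays strictly positive, the selection remains differentiable, which is what distinguishes soft attention from a hard choice of a single slot. Lowering the temperature makes the read-out approach that hard choice.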
