Professor Kun Zhang, Carnegie Mellon University

Causal Representation Learning: Uncovering the Hidden World

Abstract

Causality is a fundamental notion in science, engineering, and even machine learning. Uncovering the causal process behind observed data naturally helps answer 'why' and 'how' questions, inform optimal decisions, and achieve adaptive prediction. In many scenarios, observed variables (such as image pixels and questionnaire results) are reflections of underlying causal variables rather than causal variables themselves. Causal representation learning aims to reveal these hidden causal variables and their relations. The modularity property of a causal system implies the minimal-change and independent-change properties of causal representations, and in this talk, we show how such properties make it possible to recover the underlying causal representations from observational data with identifiability guarantees: under appropriate assumptions, the learned representations are consistent with the underlying causal process. We consider various problem settings, with independent and identically distributed (i.i.d.) data, temporal data, or data with distribution shift as input. We demonstrate when identifiable causal representation learning can benefit from flexible deep learning and when suitable parametric assumptions have to be imposed on the causal process, complemented with various examples and applications.