Data science is science's second chance to get causal inference right: A classification of data science tasks
Causal inference from observational data is the goal of many health and social scientists. However, academic statistics has often frowned upon data analyses with a causal objective. The advent of data science provides a historical opportunity to redefine data analysis in such a way that it naturally accommodates causal inference from observational data. We argue that the scientific contributions of data science can be organized into three classes of tasks: description, prediction, and causal inference. An explicit classification of data science tasks is necessary to describe the role of subject-matter expert knowledge in data analysis. We discuss the implications of this classification for the use of data to guide decision making in the real world.
READ FULL TEXT