Concept-Based Learning in Heterogeneous Treatment Effect
Estimating heterogeneous treatment effects (HTE) is crucial for personalized decision-making in medicine, economics, and engineering. Although machine learning models for Conditional Average Treatment Effect (CATE) estimation have become increasingly accurate, they often remain black boxes that provide little insight into why treatments affect individuals differently. This paper introduces the CATE-Concept Bottleneck Model (CATE-CBM), a framework that integrates concept-based learning with CATE estimation to close this interpretability gap. Our approach imposes a concept bottleneck that requires the model to express treatment effects through human-understandable concepts, making it transparent which concepts drive heterogeneous effects. In experiments on a modified MNIST dataset, CATE-CBM maintains competitive accuracy while providing meaningful concept-based explanations of treatment effect heterogeneity. The model identifies how both the presence and the absence of specific concepts influence treatment outcomes, offering clinicians and engineers accurate effect estimates together with interpretable rationales for personalized interventions. To our knowledge, this is the first work to unify concept bottleneck models with causal effect estimation, advancing explainable artificial intelligence for causal inference.
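To make the bottleneck idea concrete, the following is a minimal sketch and not the CATE-CBM architecture itself. It assumes binary concepts, a randomized binary treatment, synthetic data, and linear stand-ins for both the concept predictor and the outcome heads (a T-learner style difference of two regressions). All names (`fit_linear`, `mixing`, `effect_coef`, and so on) are hypothetical and chosen for illustration only.

```python
import numpy as np

def fit_linear(features, targets):
    """Least-squares fit with an intercept; returns the coefficient array."""
    design = np.column_stack([np.ones(len(features)), features])
    coef, *_ = np.linalg.lstsq(design, targets, rcond=None)
    return coef

def predict_linear(coef, features):
    """Apply coefficients from fit_linear to new features."""
    design = np.column_stack([np.ones(len(features)), features])
    return design @ coef

# Synthetic data (an assumption of this sketch, not the paper's modified MNIST setup).
rng = np.random.default_rng(0)
n, k = 4000, 3
concepts = rng.integers(0, 2, size=(n, k)).astype(float)
mixing = np.array([[1.0, 0.0, 0.0, 1.0],
                   [0.0, 1.0, 0.0, 1.0],
                   [0.0, 0.0, 1.0, 1.0]])
x = concepts @ mixing + 0.1 * rng.normal(size=(n, 4))   # raw inputs
t = rng.integers(0, 2, size=n)                          # randomized treatment
true_cate = concepts @ np.array([2.0, -1.0, 0.0])       # effect depends only on concepts
y = concepts.sum(axis=1) + t * true_cate + 0.1 * rng.normal(size=n)

# Stage 1: concept predictor x -> c_hat (thresholded linear model as a stand-in).
concept_coef = fit_linear(x, concepts)
c_hat = (predict_linear(concept_coef, x) > 0.5).astype(float)

# Stage 2: the outcome heads see only c_hat, never x (this is the bottleneck).
coef_treated = fit_linear(c_hat[t == 1], y[t == 1])
coef_control = fit_linear(c_hat[t == 0], y[t == 0])

# The CATE estimate is the difference of the two heads, so it is linear in the concepts.
effect_coef = coef_treated - coef_control
cate_hat = predict_linear(effect_coef, c_hat)

print("per-concept effect contributions:", np.round(effect_coef[1:], 2))
```

Because the effect estimate depends on the input only through `c_hat`, each entry of `effect_coef[1:]` can be read as the change in estimated treatment effect when the corresponding concept is present rather than absent, which is the kind of concept-level rationale the abstract describes. A nonlinear model would need a different attribution step.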