Bayesian Consumer Profiling
Abstract
Firms use aggregate data from data brokers (e.g., Acxiom, Experian) and external data sources (e.g., the Census) to unobtrusively infer consumers' likely characteristics and thus better predict their profiles and needs. We demonstrate that the simple count method most commonly used for this purpose relies on an assumption of conditional independence that fails to hold in many settings of managerial interest. We develop a Bayesian profiling method that relies on a different independence assumption, and we use simulations to show that, in managerially relevant settings, the Bayesian method outperforms the simple count method, often by an order of magnitude. We then compare both methods in three case studies. In the first, we estimate customers' ages on the basis of their first names; prediction errors decrease substantially. In the second, the Bayesian approach correctly identifies 99.9% of people's political affiliations based on their ZIP codes (vs. 30.3% with the simple count method). In the third, we infer the income, occupation, and education of online visitors to a marketing analytics software company, based exclusively on visitors' IP addresses.