Empirical Asset Pricing via Machine Learning
Abstract
We synthesize the field of machine learning with the canonical problem of empirical asset pricing: measuring asset risk premia. We use the widely understood empirical setting of predicting the time series and cross section of stock (and portfolio) returns to perform a comparative analysis of methods in the machine learning repertoire, including generalized additive models, boosted regression trees, random forests, and neural networks. At the broadest level, we find that machine learning has great promise for describing asset price behavior. Our implementation establishes a new standard for accuracy in measuring risk premia, as summarized by unprecedentedly high out-of-sample return prediction R². We identify the best performing methods (trees and neural nets) and trace their predictive gains to their allowance for nonlinear predictor interactions that other methods miss. Lastly, we find that all methods agree on the same small set of dominant predictive signals, which are variations on momentum, liquidity, and market beta. Improved risk premia measurement through machine learning can simplify the investigation into economic mechanisms of asset pricing and justifies its growing role in innovative financial technologies.