Target Selection as Variable Selection: Using the Lasso to Select Auxiliary Vectors for the Construction of Survey Weights

Date
-
Event Sponsor
The Munro Lectureship Fund and The Lane Center
Location
Encina Hall West, Room 400 (GSL)
Speaker

Erin Hartman, Assistant Professor of Statistics and Political Science, UCLA

 

Abstract

Survey nonresponse is a ubiquitous problem in modern survey research. As individuals have become less likely to respond to surveys, there has been a simultaneous rise in highly granular data sources that can help ameliorate the nonresponse problem. While much research has been done on post hoc weighting methods, which provide a flexible and general solution for unit nonresponse, it remains an open question how to select the optimal auxiliary vector to include in the weighting method. We formulate this as a methodological question of variable and interaction selection, where the goal is, assuming an individual-level stochastic response probability, to construct an optimal set of weights for each individual respondent to account for an observed pattern of nonresponse. We draw on recent literature on hierarchical group-lasso regularization to determine the best auxiliary vector for weighting. We show the advantages of this method in simulations derived from real survey data sampled from an individual-level voter file in recent elections. We also apply the method to historical quota-sampled survey data from the 1930s and 1940s to show its advantages even where the sampling design is unknown.
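
For readers unfamiliar with weighting on an auxiliary vector, the sketch below shows simple post-stratification on a single auxiliary variable. It is only an illustration of what a weight is, not the speaker's method: it does not perform the lasso-based selection step described in the abstract, and the cell labels, population shares, and outcomes are hypothetical toy values.

```python
from collections import Counter


def poststratification_weights(sample_cells, population_shares):
    """Return one weight per respondent.

    Each weight is the population share of the respondent's cell divided by
    that cell's share of the sample, so under-represented cells count more.
    """
    n = len(sample_cells)
    counts = Counter(sample_cells)
    weights = []
    for cell in sample_cells:
        sample_share = counts[cell] / n
        weights.append(population_shares[cell] / sample_share)
    return weights


def weighted_mean(values, weights):
    """Weighted average of the respondents' outcomes."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


# Hypothetical toy data: the auxiliary variable is an age group.
# Young respondents are 25% of the sample but 50% of the population.
cells = ["young", "old", "old", "old"]
population_shares = {"young": 0.5, "old": 0.5}
outcome = [1, 0, 0, 1]

weights = poststratification_weights(cells, population_shares)
unweighted = sum(outcome) / len(outcome)
weighted = weighted_mean(outcome, weights)
print(unweighted, weighted)
```

Here the single young respondent is weighted up and the three older respondents are weighted down, which moves the estimate away from the unweighted mean. Choosing which variables and interactions define the cells is the selection problem the talk addresses.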

 

Biography

Erin Hartman is currently an Assistant Professor of Statistics and Political Science at UCLA. Her recent research focuses on creating new methods, including both theoretical approaches and new estimation strategies, for identifying and validating causal effects. She also studies survey design methodologies, including a new survey sampling method that reduces reliance on post hoc weighting methods and alleviates nonresponse bias, and an automated raking methodology that selects the optimal auxiliary vector on which to weight.

In 2012, she ran the polling operation for Obama for America’s Analytics department, which very accurately predicted election outcomes in the campaign’s battleground states. She also co-founded a successful analytics and technology start-up, BlueLabs, focused on providing analytics services to clients in politics, issue advocacy, healthcare, and education.

She holds a PhD in Political Science and an MA in Statistics from UC Berkeley. She also completed a postdoctoral research fellowship at Princeton University.

If you can’t find her around the office, chances are she’s climbing some rocks somewhere.