Unrepresentativeness stemmed from two main sources: the pollsters' target population and the sampling techniques they used to select survey respondents. Since many (though not all) polls were focused on election forecasting, their target population was the US electorate, which at that time diverged substantially from the population of US adults. Not only did women, white Southerners, and the less-educated vote at lower rates, but African Americans in Southern states were almost wholly disenfranchised. Most poll samples mirror these biases, to the point that many fail to include a single black Southerner.
To address these issues, the team created several sets of respondent-specific weights designed to adjust for sampling and nonresponse bias (for details, see Berinsky et al. 2011 and Caughey et al. 2020). The weights are “calibrated” such that the weighted sample distribution of key demographic characteristics (region, race, gender, urban/rural residence, and various measures of class status) matches their distribution in the target population. Because these variables predict both Americans’ probability of being surveyed and social and political attributes of interest (e.g., partisanship), eliminating the sample–population discrepancies yields more accurate estimates of the attributes’ distribution in the US population. The weights are especially effective after 1942, when pollsters started recording each respondent’s education level, a powerful predictor of nonresponse. However, to facilitate comparison across time, the team also created a second set of weights based only on demographic variables consistently available throughout the 1936–52 period.
In addition, weights were created for two different target populations: the voting-age population (VAP), which includes non-institutionalized adults over the age of 21, and the voting-eligible population (VEP), which excludes African Americans in Southern states, where they were effectively disenfranchised. The VEP weights are largely a practical concession to the total absence of black Southerners from many polls (making inferences about the VAP based on these polls requires the assumption that non-Southern blacks can effectively “stand in for” Southern ones).
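For readers unfamiliar with calibration weighting, the general idea can be illustrated with a minimal sketch of raking (iterative proportional fitting), one common way to calibrate weights to known population margins. This is not the team's actual procedure, which is documented in Berinsky et al. 2011 and Caughey et al. 2020. The function names `rake` and `weighted_share`, the toy respondents, and the target shares below are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def weighted_share(respondents, weights, var, category):
    """Weighted share of respondents whose `var` equals `category`."""
    total = sum(weights)
    hits = sum(w for r, w in zip(respondents, weights) if r[var] == category)
    return hits / total


def rake(respondents, targets, n_iter=200, tol=1e-10):
    """Generic raking sketch (an assumption, not the dataset's documented method).

    respondents: list of dicts mapping variable name -> category
    targets: dict mapping variable name -> {category: population share}
    Returns one weight per respondent.
    """
    weights = [1.0] * len(respondents)
    for _ in range(n_iter):
        max_gap = 0.0
        for var, shares in targets.items():
            # Current weighted distribution of this variable in the sample.
            total = sum(weights)
            current = defaultdict(float)
            for r, w in zip(respondents, weights):
                current[r[var]] += w / total
            # Record how far the sample is from the target before adjusting.
            gap = max(abs(shares[c] - current[c]) for c in shares)
            max_gap = max(max_gap, gap)
            # Scale each respondent's weight by target share / current share.
            for i, r in enumerate(respondents):
                weights[i] *= shares[r[var]] / current[r[var]]
        if max_gap < tol:
            break
    return weights


# Hypothetical toy sample: men are overrepresented relative to the targets.
respondents = (
    [{"region": "non-South", "gender": "male"}] * 5
    + [{"region": "non-South", "gender": "female"}] * 2
    + [{"region": "South", "gender": "male"}] * 2
    + [{"region": "South", "gender": "female"}] * 1
)

# Hypothetical population margins, not real census figures.
targets = {
    "region": {"non-South": 0.75, "South": 0.25},
    "gender": {"male": 0.5, "female": 0.5},
}

weights = rake(respondents, targets)
print(round(weighted_share(respondents, weights, "gender", "female"), 4))
print(round(weighted_share(respondents, weights, "region", "South"), 4))
```

Note that a procedure of this kind can only reweight categories that actually appear in the sample: if a category has a target share but no respondents, no weight can recover it, which is the practical difficulty behind the VEP weights described above.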
LINK to Berinsky-Schickler collection
Works incorporating these datasets include: