Making Polls Work (Again) – Sabato’s Crystal Ball

Dear Readers: We’re pleased to feature an excerpt from G. Elliott Morris’s new book, Strength in Numbers: How Polls Work and Why We Need Them. Morris is a data journalist at the Economist whose work has previously appeared in the Crystal Ball. In this excerpt, Morris addresses some of the big questions of polling — namely, how polling can be improved, and how the public’s understanding of polling can be improved along with it.

— The Editors

The polls have had some big misfires, but they are still the best tools we have to gauge support for the actions of the government. If the accuracy of polling overall is measured by the predictive abilities of election polls, then they are typically off by one percentage point here and two there, and the person in the lead ends up winning. Studies of issue polls directly suggest they may be more accurate than their pre-election counterparts.

More importantly, a one- or two-point miss is not nearly large enough to alter conversations about public policy. What is the practical difference between a position that is supported by 60% versus 62% of adults? Certainly the two-point difference would not change any politician’s mind when so clear a majority has already decided in favor. And how much does the difference between even 48% and 50% matter? The latter is closer to a majority, but with both numbers within the margin of error of it, few leaders would be persuaded to do something risky just on the back of the single poll. On the whole, the picture of the country as uncovered by polls appears quite accurate.

This does not mean that all polls are good. We have seen how pollsters in Iraq and other overseas (particularly Middle Eastern) countries in the early twenty-first century struggled with the methods and business of survey-taking — or may have been influenced by authoritarian governments — and produced unreliable data that was likely even falsified. Those findings were passed up the chains of command to leaders in both the United Kingdom and United States — and distributed to the media. Along with so-called push pollsters, ideologically motivated firms, and attention-seekers, these examples remind us that we cannot fully let down our guard when gathering data on the will of the people, as we have seen how, across the board, not all polls are created equal.

Over the ninety-year history of polling, we have learned public opinion surveys are less like pulse oximeters and more like a cracked mirror — a tool that reveals a portrait of the gazer that is roughly correct, but with notable imperfections. These cracks became apparent after polls were faulted for very real methodological shortcomings during elections in both the recent and distant past — but also by routine and unfair beatings by critics who do not understand either the science behind them or their value to democracy. Though the reflective surface can sometimes offer up a distorted view of the American public, we have seen that its imperfections do not render it absolutely useless. Luckily, unlike a glass mirror, the polls can be fixed to a large degree, cracks filled and blemishes polished out. Pollsters are constantly engaging in the process of repair, but citizens too can help polling regain its footing and realize its full potential. Ultimately, the fixes will lead us to ask ourselves: Can we use the mirror to improve our democracy?

I propose five reforms that pollsters, political practitioners, the media, and the public can adopt to elevate the polls. First, pollsters should abandon polls fielded entirely by phone, and incorporate samples drawn by other methods. Due to the rise of caller ID and other call-blocking technologies, as well as a general distrust of the pollsters, phone polling has become increasingly unreliable and incredibly expensive. Phone pollsters face a deadly combination of high costs due to the labor demands of dialing additional cell phone numbers by hand, and a lack of high-quality population benchmarks to which they can adjust their samples to ensure their representativeness, especially by demographic group. There was a time when over 90% of people you called would answer a phone poll; now, pollsters are lucky to get five or six percent of people to tell them how they feel and what they think. And that group is unrepresentative.

While pure phone polls have been trending toward irrelevance, online pollsters have been proving their worth. Through experimentation with new data-collection methods and innovations in statistics, firms such as YouGov and Civiqs have outperformed pure “probability” methods that performed well in the past. Their ability to gain repeated observations from the same individuals over time enables them to produce samples that are often more politically representative than a phone poll fielded among a random subset of the population. The firms using Erin Hartman’s method of adjusting for predicted nonresponse, like David Shor’s and the New York Times, have also developed powerful ways to adjust their samples to be more representative of the population. At the very least, they do not miss elections by 17 points.

Pollsters also ought to invest in more off-line methods, such as the address-based methods that the Pew Research Center developed during the 2020 election. These methods should help pollsters derive higher-quality population benchmarks for things like partisanship, religious affiliation, and trust in our neighbors — data that can be used to adjust other polls and improve the landscape of public opinion research. Benchmarking surveys could also be completed in conjunction with the government, which still manages to get very high shares of people to fill out its census surveys, or through a commercial partnership that distributes the benchmarks to its partner organizations. While these methods might not fix the underlying problem with polls — certain groups of people refusing to answer their phones or fill out online surveys at rates standard modeling has a hard time capturing — they will go a long way toward repairing them.

Second, pollsters should be open to the fact that their opinion polls are subject to roughly twice the potential error that is captured by the traditional margin of sampling error — and political journalists should treat individual surveys with more skepticism. A pre-election poll that shows one candidate leading by two or three points should not be treated as a solid poll for that candidate, or even a sign that they are leading. If there is a two-point spread and a six- or seven-percentage-point margin of error, you are only slightly better off betting in favor of the leading candidate; the bet would not be safe — and so journalists should report the contest as a toss-up. At the very least, the press should always report the margin of error of a poll near the top of the story. Smarter journalism would remind readers and listeners of the many different factors that could cause the survey to go wrong.
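To make the arithmetic behind this reform concrete, here is a minimal sketch in Python. It assumes a hypothetical poll of 1,000 respondents, the textbook formula for the 95% margin of sampling error, and normally distributed poll error; the function names are illustrative and not drawn from any pollster's toolkit.

```python
import math

def sampling_margin_of_error(sample_size, share=0.5, z=1.96):
    """Textbook 95% margin of sampling error for one share, as a proportion."""
    return z * math.sqrt(share * (1 - share) / sample_size)

def prob_leader_ahead(lead_points, lead_moe_points, z=1.96):
    """Chance the true lead is above zero, assuming normally distributed error.

    lead_moe_points is the 95% margin of error on the lead itself (an assumption).
    """
    std_error = lead_moe_points / z
    return 0.5 * (1 + math.erf(lead_points / (std_error * math.sqrt(2))))

# Hypothetical poll of 1,000 people.
reported = 100 * sampling_margin_of_error(1000)  # the margin usually published
realistic = 2 * reported                         # doubled, per the argument above

print(f"reported margin: {reported:.1f} points")
print(f"doubled margin:  {realistic:.1f} points")
print(f"chance a 2-point leader is really ahead: {prob_leader_ahead(2, 6.5):.2f}")
```

Under these assumptions the reported margin is about three points and the doubled margin about six, and a two-point leader comes out ahead in roughly seven cases out of ten. If the margin is taken to apply to each candidate's share rather than to the lead, the uncertainty on the lead is larger still and the odds move closer to a coin flip.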

Accordingly, and third, election forecasters should revisit their old ideas about the ability of aggregation to remove biases in a mass of data, and their ability to convey the likelihood of those biases to readers. The savants have had two contests in a row where they badly underestimated one candidate across states. The first time, Donald Trump won enough extra votes to win the Electoral College and overcome his poor 15–30% chance of victory in the leading models; the second time, his vote share in two states was higher than in 80–90% of simulations forecasters generated. In the future, it could be wise for forecasters to reframe their commentary as exploring what could happen if the polls go wrong, rather than providing pinpoint predictions of the election. The expectations of hyper-accuracy should be consigned to the history books. They were largely caused by the media’s misunderstanding of Nate Silver’s successful forecasts in 2008 and 2012, and by his championing of correct forecasts in binary terms, though I have contributed to them as well. Forecasting should become an enterprise for exploring uncertainty, not predicting outcomes.
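A small simulation can illustrate why averaging many polls does not remove an error they all share. The sketch below is written in Python under simple assumptions: each poll's error is the sum of an industry-wide bias common to every poll and independent sampling noise, both normally distributed. The sizes used (3 points of noise, 2 points of shared bias) are hypothetical, chosen only for illustration.

```python
import math
import random
import statistics

def average_error_sd(n_polls, noise_sd, shared_bias_sd):
    """Expected spread of a polling average's error under the model above."""
    return math.sqrt(shared_bias_sd ** 2 + noise_sd ** 2 / n_polls)

def simulate_average_error(n_polls, noise_sd, shared_bias_sd, rng):
    """Error of one polling average: one shared bias plus per-poll noise."""
    shared = rng.gauss(0, shared_bias_sd)
    errors = [shared + rng.gauss(0, noise_sd) for _ in range(n_polls)]
    return sum(errors) / n_polls

def simulated_sd(n_polls, noise_sd, shared_bias_sd, trials=5000, seed=0):
    """Spread of the average's error across many simulated election cycles."""
    rng = random.Random(seed)
    draws = [simulate_average_error(n_polls, noise_sd, shared_bias_sd, rng)
             for _ in range(trials)]
    return statistics.pstdev(draws)

# Hypothetical values: 3 points of per-poll noise, 2 points of shared bias.
for n in (1, 10, 100):
    print(n, round(average_error_sd(n, 3, 2), 2), round(simulated_sd(n, 3, 2), 2))
```

In this model, adding polls shrinks only the noise term. However many polls are averaged, the error of the average never falls below the 2 points of shared bias, which is the floor an aggregator cannot average away.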

Fourth, to combat the influence of low-quality outfits that are motivated by profits or ideology, the American Association for Public Opinion Research (AAPOR) ought to more aggressively and publicly sanction public pollsters who do not release thorough, transparent reports on their methodologies. Additionally, when a survey firm is suspected of faking its data or engaging in other nefarious activity, AAPOR should investigate it and engage in additional high-profile scrutiny — both to incentivize good behavior and to shore up public trust in the industry. Instead of being a professional society for the pollsters, AAPOR could transform itself into a public watchdog for survey data. If it publicly condemned the practices of ideologically biased or nefarious firms, thereby affecting news coverage and client recruitment to produce a loss of revenue for bad actors, AAPOR could cut down on the number of unsavory outlets at home, clean up the public opinion information environment, and restore trust in the industry.

Finally, to better achieve the promise of polls in a republican government, more political interest groups should devote themselves to measuring and advocating for the public’s opinions. Data for Progress, a progressive think tank that was started in 2018, has data-driven advocacy at the core of its mission. Its secret is a combination of speed, accuracy, and networking. The nerdy progressives who run the group’s polls use a cheap online survey platform called Lucid to field quick surveys with large numbers of respondents, often running multiple questionnaires simultaneously. Then, the methodologists weight their data to be both politically and demographically representative, as per the breakdowns of the voter file, and an army of authors write quick reports and publish them online. While a traditional media poll will take weeks to design, field, weight, and report, Data for Progress can ask the questions it needs and publish the findings in a matter of days.
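The weighting step described above can be sketched in a few lines of Python. This is a simplified illustration of weighting a sample to known population shares on a single variable, not a description of Data for Progress's actual procedure; the groups, the respondents, and the population shares are all hypothetical.

```python
from collections import Counter

def weighted_support(respondents, population_shares):
    """Share answering yes after scaling each group to its population share.

    respondents: list of (group, answer) pairs, with answer 1 for yes, 0 for no.
    population_shares: group -> share of the target population (hypothetical here).
    """
    n = len(respondents)
    sample_counts = Counter(group for group, _ in respondents)
    # Weight = how under- or over-represented the group is in the sample.
    weights = {g: population_shares[g] / (count / n)
               for g, count in sample_counts.items()}
    weighted_yes = sum(weights[g] * answer for g, answer in respondents)
    total_weight = sum(weights[g] for g, _ in respondents)
    return weighted_yes / total_weight

# Hypothetical sample that over-represents college graduates (60% vs. 40%).
respondents = (
    [("college", 1)] * 4 + [("college", 0)] * 2
    + [("non_college", 1)] * 1 + [("non_college", 0)] * 3
)
population_shares = {"college": 0.4, "non_college": 0.6}

raw = sum(answer for _, answer in respondents) / len(respondents)
adjusted = weighted_support(respondents, population_shares)
print(f"raw support: {raw:.3f}, weighted support: {adjusted:.3f}")
```

In this toy sample the raw figure overstates support because the more supportive group is over-represented; weighting pulls the estimate down toward what the population benchmark implies. Real weighting schemes adjust on many variables at once, which this sketch does not attempt.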

The business model works. For example, for months during 2018, politicians and many in the media claimed that a package of climate policies called the “Green New Deal” would drag down Democrats in swing districts. But Data for Progress released a report using polling and MRP modeling showing strong support for the policy in swing districts. The report was tweeted out by the bill’s cosponsors, New York representative Alexandria Ocasio-Cortez and Massachusetts senator Ed Markey, reaching millions of people, and was covered extensively in the media, including an exclusive in Vox. In early 2020, the founder of Data for Progress, Sean McElwee, landed a meeting with Joe Biden’s political team and may have pushed his advisors to put climate policy at the forefront of the campaign. The group even convinced New York senator Chuck Schumer, the Senate majority leader, to blog on the firm’s website in support of unemployment insurance, which it found was very popular. “We’ve developed a currency that [politicians] are interested in,” McElwee told the New York Times in 2021. “We get access to a lot of offices because everyone wants to learn about the numbers.”

Poll-based public interest groups do not have to be advocacy-focused. They can partner with newspapers to share their findings and still meaningfully improve the political discourse. In the summer of 2021, for example, the Republican Party engaged in a full-throated campaign against critical race theory (CRT), a body of legal scholarship about racism and racial inequalities developed in the late twentieth century. Several Republican-led states, including Texas and Florida, banned coursework that talked about CRT or related subjects (such as the New York Times’s 1619 Project, a series of articles that examines the country’s history from the date when enslaved people first arrived on American soil). But a poll conducted by YouGov and published in partnership with the Economist found that only 26% of Americans had even heard “a lot” about CRT, and fewer had a clear idea of what it was. What misconceptions about their aggregate attitudes and priorities would the American people have held if those polls were not published?

Fielding timely and relevant polls can point legislators toward the things the people actually care about. If they don’t address key issues, or enact policies that a majority doesn’t like, the people can use the data to hold their leaders to account. In our fourth stage of democracy, the press, advocacy groups, and constituents would all work together to facilitate the link between the government and the governed — by using the polls.

Together, these steps would help fix the methods, correct the misconceptions, and elevate the impacts of public opinion polling in America. But do not mistake these prescriptions for polls as promises of democracy. A higher pedestal for the polls will not fix the many other forces working against representative government. I do not promise that polls are a panacea. Still, if we are interested in living under a truly representative government, more and better polling at least pushes us in the right direction.

We, the people, hold the final key to unlocking polling’s future. When the pre-election pollsters do make their next misstep, when some inevitably fall on the wrong side of 50-50 during the next election, we should not throw the baby out with the bathwater. We should remember that political polling is more like a weather prediction than a medical instrument; that the margin of error, at least twice as big as the one pollsters and journalists report, does not assign binary outcomes to elections but rather detects the probable distribution of opinions among the population. We should remember that aggregation and modeling do not remove the chance for all polls to be biased in the same direction. We must internalize the vision of polls as indicating a range of potential outcomes for an election, ballot initiative, or constitutional referendum, rather than a hyper-accurate point-prediction. Polls were not invented to produce such a thing — and due to the statistical laws of survey sampling and the complexities of psychology and human behavior, they never will.

Excerpted from Strength in Numbers: How Polls Work and Why We Need Them, which is now available for purchase. Copyright (c) 2022 by G. Elliott Morris. Used with permission of the publisher, W. W. Norton & Company, Inc. All rights reserved.

G. Elliott Morris is a data journalist and US correspondent for the Economist, where he writes on a range of topics including American politics, elections, and public opinion.

Author: G. Elliott Morris