I tend to think that the biggest problem with polling today isn't its ubiquity, but rather how pundits (as well as more casual observers) overinterpret the polls, something I myself am guilty of. As you alluded to, given the reality of margins of error, the polling of US presidential races can basically only tell us if the national/state picture looks competitive or not. Coming to firmer conclusions than that is dangerous, and all too common.
I think a lot of responsibility here falls on media outlets and journalists to better contextualize polls, not single out any particular poll, and to explain to readers the basic statistical concepts underlying polls. If we saw more of that, the power and allure of polling would perhaps be diminished, but in a good way, allowing for better conversations about elections.
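One of those basic concepts can be put in numbers. The sketch below is only an illustration, not anything taken from a particular pollster: it assumes a simple random sample and uses the textbook approximation for the margin of error on the lead between two candidates, which is roughly double the margin usually quoted for a single candidate's share. The function name and the 48%/46% poll of 1,000 respondents are hypothetical.

```python
import math


def lead_margin_of_error(share_a, share_b, n, z=1.96):
    """Approximate 95% margin of error on the lead (share_a - share_b).

    Assumes both shares come from the same simple random sample of size n,
    so the two shares are negatively correlated (multinomial sampling).
    """
    lead = share_a - share_b
    variance = (share_a + share_b - lead ** 2) / n
    return z * math.sqrt(variance)


# Hypothetical poll: 48% vs 46% among 1,000 respondents.
lead = 0.48 - 0.46
moe = lead_margin_of_error(0.48, 0.46, 1000)

# If the lead is inside its own margin of error, the poll says "competitive"
# and not much more.
competitive = abs(lead) <= moe
print(f"lead = {lead:+.1%}, margin of error on the lead = +/-{moe:.1%}")
print("competitive" if competitive else "clear lead")
```

With these made-up numbers, a 2-point lead carries a margin of roughly plus or minus 6 points, which is the sense in which a single poll can only say whether a race looks competitive.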
I pretty much agree, but it is a bit of a chicken and egg situation. The poll is published and hyped by the publisher/network/media sponsor, which creates more clicks/reads, which incentivizes more polls, etc. And then you have bad actors who produce polls purely for political purposes. And now you have markets being generated for betting on elections, which are treated as quasi-polls. It's a growing, messy industry. But yes, too much analysis and consideration is given to polls, which are no more than, say, 10-20% as useful as the attention given to them.
(a) If polling could have removed Biden from the Dem ticket it would have happened months earlier. A single nationally-televised debate had a larger impact on that question than all the polling results of the previous six months combined -- for the simple reason that 50 million voters watched that debate live and could draw their own direct conclusions about his readiness for a second four-year term.
(b) Among the 90 percent of American voters who aren't political obsessives, campaign polling has been losing credibility and persuasiveness for years now. The median voter's interest in polling is at its lowest during my (long) adult lifetime, and still declining.
Not quite. Polls didn’t remove Biden - his cognitive and physical declines were the cause. The polls were simply a tool to force Biden - kicking and screaming like a denied toddler in a supermarket - to do the right thing and withdraw from the race. And thank God that he did.
Polling rests on a strong assumption: that the sample distribution is a reasonable proxy for the population distribution at the time of the sample. That assumption holds only if each member of the population has an equal chance of being included and each observation comes from the same distribution (i.i.d. for short). A low response rate creates non-response bias, which jeopardizes i.i.d. The New York Times/Siena poll has a response rate of only 2% (1 in 50 attempts). https://www.nytimes.com/article/times-siena-poll-methodology.html?smid=nytcore-ios-share&referringSource=articleShare&sgrp=c-cb
1. Independent and Identically Distributed (i.i.d.) Requirement:
- Independence: Each sample should be drawn independently of the others.
- Identically Distributed: Each sample should come from the same probability distribution.
2. How Non-Response Bias Violates i.i.d.:
a) Non-independent sampling:
- The decision to respond (or not) may be correlated among certain groups, violating independence.
- For example, people with strong political views might be more likely to respond, creating a dependency in the sampling process.
b) Non-identical distribution:
- Respondents and non-respondents likely come from different distributions.
- The 2% who responded may have characteristics that systematically differ from the 98% who didn't, violating the identically distributed assumption.
3. Implications for Random Sampling:
- Self-selection bias: The 2% who chose to respond have essentially self-selected into the sample, compromising randomness.
- Unequal probability of selection: Different subgroups in the population now have unequal probabilities of being included in the final sample.
4. Effect on Statistical Inferences:
- The violation of i.i.d. undermines the statistical foundations for inference.
- Standard errors and confidence intervals calculated assuming i.i.d. are likely to understate the true uncertainty.
- The sample may no longer be representative of the broader population.
5. Challenges in Analysis:
- Traditional random sampling theory may not apply directly.
- More complex methods, such as propensity score adjustments or multi-level modeling, may be necessary to account for non-response patterns.
6. Practical Considerations:
- The extreme non-response rate (98%) suggests that the final sample is likely to be systematically different from a true random sample.
- Any inferences drawn from this sample should be heavily caveated.
7. Potential Remedies:
- Post-stratification weighting: Can help adjust for known demographic discrepancies but doesn't fully solve the i.i.d. violation.
- Non-response follow-up studies: Investigating characteristics of non-respondents can help assess and potentially adjust for bias.
- Multiple imputation: For handling missing data, though this has limitations with such high non-response rates.
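To make the weighting remedy concrete, here is a minimal sketch of post-stratification on a single variable. Everything in it is assumed for illustration: the two groups, their population shares, the respondent counts, and the support levels are invented, and real polls weight on many variables at once. It also reports Kish's approximate effective sample size, one common way to see how weighting widens the uncertainty.

```python
def poststratify(sample_counts, sample_support, population_shares):
    """Return (unweighted estimate, weighted estimate, Kish effective n).

    sample_counts:     respondents per group
    sample_support:    share supporting candidate A within each group
    population_shares: known population share of each group (sums to 1)
    """
    n = sum(sample_counts.values())

    # Naive estimate: every respondent counts equally.
    unweighted = sum(sample_counts[g] * sample_support[g] for g in sample_counts) / n

    # Weight = population share / sample share, so over-represented groups
    # are scaled down and under-represented groups are scaled up.
    weights = {g: population_shares[g] / (sample_counts[g] / n) for g in sample_counts}

    total_weight = sum(weights[g] * sample_counts[g] for g in sample_counts)
    weighted = sum(
        weights[g] * sample_counts[g] * sample_support[g] for g in sample_counts
    ) / total_weight

    # Kish approximation: n_eff = (sum of weights)^2 / (sum of squared weights).
    sum_sq = sum(sample_counts[g] * weights[g] ** 2 for g in sample_counts)
    effective_n = total_weight ** 2 / sum_sq
    return unweighted, weighted, effective_n


# Hypothetical poll: group "x" is 40% of the population but 60% of respondents.
counts = {"x": 600, "y": 400}
support = {"x": 0.60, "y": 0.40}
population = {"x": 0.40, "y": 0.60}

unweighted, weighted, effective_n = poststratify(counts, support, population)
print(f"unweighted = {unweighted:.3f}, weighted = {weighted:.3f}, "
      f"effective n = {effective_n:.0f}")
```

In this toy case the weighted estimate moves from 52% to 48%, and the 1,000 interviews behave like roughly 857. The weights only correct for the variable they are built on: if respondents within a group differ from non-respondents in that same group, the bias remains.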
In conclusion, significant non-response bias, as in this case, fundamentally challenges the i.i.d. assumption of random sampling. It transforms what was intended to be a probability sample into something more akin to a non-probability sample. This shift necessitates a different approach to analysis and interpretation, one that acknowledges the limitations of traditional random sampling theory in this context.
The key takeaway is that with such high non-response, we're no longer dealing with a simple random sample, and any analysis or reporting should reflect this limitation explicitly.
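A small calculation shows how little differential non-response it takes. This is a sketch under invented assumptions, not a description of any real poll: the race is exactly tied, supporters of candidate A respond at 2.2% and supporters of candidate B at 1.8% (so the overall response rate is 2%), and the function names are hypothetical.

```python
import math


def expected_respondent_share(true_share, response_rate_a, response_rate_b):
    """Expected share of A supporters among respondents."""
    responders_a = true_share * response_rate_a
    responders_b = (1 - true_share) * response_rate_b
    return responders_a / (responders_a + responders_b)


def sampling_margin_of_error(share, n, z=1.96):
    """Textbook 95% margin of error for one share under simple random sampling."""
    return z * math.sqrt(share * (1 - share) / n)


true_share = 0.50             # assumed: the race is actually tied
rate_a, rate_b = 0.022, 0.018  # assumed response rates by candidate preference

overall_rate = true_share * rate_a + (1 - true_share) * rate_b
observed = expected_respondent_share(true_share, rate_a, rate_b)
bias = observed - true_share
moe = sampling_margin_of_error(observed, n=1000)

print(f"overall response rate = {overall_rate:.1%}")
print(f"expected share among respondents = {observed:.1%} (true share {true_share:.1%})")
print(f"bias = {bias:+.1%}, reported margin of error = +/-{moe:.1%}")
```

Under these assumptions the expected result is 55% for A in a tied race, a 5-point error, while the margin of error a simple random sample of 1,000 would report is only about 3 points. The reported margin covers sampling noise only and says nothing about this kind of bias.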
I take exception to your characterization of President Biden as a toddler kicking and screaming in a grocery store.
Have some respect for the man and the office.
To whom was this comment directed?
Not to you Paul!