• How many LGBT people are there?

    Knowing how many lesbian, gay, bisexual, or transgender (LGBT) people there are matters for health policy. Gay men have an elevated HIV risk. LGBT kids have a higher risk of suicide. But no one knows this number. Recent survey estimates range from 2% to 6% of the population.

    However, there are compelling new data suggesting that whatever the number is, current survey methods underestimate it. Because there is still stigma associated with same sex desire and behaviour, some people are reluctant to answer survey questions truthfully. The same problem affects many other health behaviours whose prevalence is difficult to estimate, including smoking, alcohol consumption, and illicit drug use. So it’s possible that even on anonymous surveys, people under-report sensitive behaviours.

    But if people under-report even when they are promised anonymity, how can we estimate the true prevalence of LGBT orientation? Katherine Coffman, Lucas Coffman, and Keith Ericson have a clever way of getting at this question:

    a randomly chosen control group of participants is asked to report how many of N items are true for themselves, where the items are neutral and non-sensitive in nature. The rest of the respondents report how many of N+1 items are true, with N items being identical to the control group’s items, and the N+1st item being a sensitive item, e.g. “I am not heterosexual”. With a large enough sample, the researcher can estimate the population mean for the N+1st item, the sensitive item, by differencing out the mean of the sum of the N other items as estimated from the control group.
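    The differencing step in that description can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' code: the function names, the simulated respondents, and the choice that each neutral item is true with probability 0.5 are all hypothetical.

```python
import random
from statistics import mean


def simulate_item_count(n_respondents, n_neutral_items, true_prevalence, seed=0):
    """Simulate answers under the veiled (item count) design.

    Assumption for illustration only: each neutral item is true for a
    respondent with probability 0.5, independently of everything else.
    """
    rng = random.Random(seed)
    control, treatment = [], []
    for _ in range(n_respondents):
        # Number of neutral items that are true for this respondent.
        neutral = sum(rng.random() < 0.5 for _ in range(n_neutral_items))
        if rng.random() < 0.5:
            # Control group: reports a count over the N neutral items only.
            control.append(neutral)
        else:
            # Treatment group: the count also includes the sensitive item,
            # so no individual answer reveals it directly.
            sensitive = 1 if rng.random() < true_prevalence else 0
            treatment.append(neutral + sensitive)
    return control, treatment


def estimate_prevalence(control_counts, treatment_counts):
    """Difference in mean counts estimates the share for whom the
    sensitive item is true."""
    return mean(treatment_counts) - mean(control_counts)


# Hypothetical usage: 4 neutral items, a made-up true prevalence of 10%.
control, treatment = simulate_item_count(200_000, 4, 0.10, seed=1)
print(round(estimate_prevalence(control, treatment), 3))
```

    Because the estimate is a difference of two noisy means, it needs a large sample to be precise, which is why the description above says "with a large enough sample".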

    Coffman, Coffman, and Ericson estimated how badly current survey methods underestimate LGBT orientation by comparing two ways of asking sensitive questions. The first way was the standard survey method, that is, directly asking anonymous respondents about their sexual orientation. The second way was the “veiled” method described above:

    The veiled method increased self-reports of non-heterosexual identity by 65% (p<0.05) and same-sex sexual experiences by 59% (p<0.01).

    The authors also used the same design to find out whether survey respondents accurately report whether they hold socially undesirable attitudes like prejudice against gays or lesbians:

    The veiled method also increased the rates of anti-gay sentiment. Respondents were 67% more likely to express disapproval of an openly gay manager at work (p<0.01) and 71% more likely to say it is okay to discriminate against lesbian, gay, or bisexual individuals (p<0.01).


    The results show non-heterosexuality and anti-gay sentiment are substantially underestimated in existing surveys, and the privacy afforded by current best practices is not always sufficient to eliminate bias.

    So there may be quite a few more LGBT people than we thought there were. Should that change anyone’s beliefs about whether same sex desire and behaviour are licit? In my view, there is no ‘moral question’ about the legitimacy of same sex desire or behaviour, just as there is no moral question about eating hot sauce on scrambled eggs. But suppose that there were a question about this. It is unclear how the answer would depend on the prevalence of same sex desire or behaviour.


    • You have raised a rather difficult question to answer. Thanks to the spread of the internet and media, we are now aware of LGBT people. They are human beings with a different belief system. I believe that if society held NO preconceived ideas about whom a person should feel affection towards, or about how feminine or masculine a person should feel because of the way they were born, we would get very accurate data.

    • This is very smart research. Sociology and psychology researchers have long known that people tend to under-report differences in sexual orientation due to stigma.

      I should add: I don’t think this technique is adequate to estimate the prevalence of transgender identity. Sexual orientation and gender identity should be thought of as distinct (orthogonal) categories. Most people are straight and cis-gendered. But trans-gendered people can be straight or non-heterosexual. In any case, the authors asked questions along the lines of “are you non-heterosexual?”. We do not know what the distribution of answers would be among transgendered individuals. Also, there aren’t very many transgendered individuals, making it hard to do survey research (or at least, hard to not have to use snowball sampling).

    • Even the veiled method is likely to understate the “true” incidence of non-heterosexuality. If you look in national surveys like BRFSS (for the subsets where the question was asked) and NHANES, there’s generally a large margin of people who admit to homosexual behaviors on a regular basis–often exclusively–but nevertheless consider themselves “straight.” Historically, the percentage admitting to homosexual behaviors has remained relatively stable, but the fraction considering themselves “straight” has plunged from around 60% in 2000 to 30% or so in 2011. This trend is almost certainly due to the increasing acceptance of homosexuality over that decade (gay sex was only decriminalized nationwide in the United States in 2003, for example), but the mechanism is hard to see: why would respondents be less honest about being gay than about having gay experiences? To me, this suggests the problem is not so much dishonesty as denial, i.e., 60% of gay people in 2000 really did consider themselves straight.

      This raises a few problems. For one thing, though the fraction admitting to homosexual behaviors has been more stable over time, it is not a more accurate metric for the incidence of homosexuality. We know that because around 25% of those responding as non-heterosexual claimed to be virgins (or at least, to have never had a same-sex sexual experience). Moreover, it is hard to say what “true incidence” even means. If a person has same-sex attractions but will never, ever identify as LGBT in his lifetime, does it make any scientific or ethical sense to call him gay?

    • This type of methodology seems to be an extension of “Randomized Response”. Randomized Response has been around since at least the 1960s, and was likely in development before that.

      Surprising that it’s not used more often, but likely most people have either (a) never heard of it and/or (b) don’t understand the math behind it that allows one to draw these sorts of conclusions.