On January 20, 2025, President Trump signed executive orders declaring that the United States officially recognizes only “two sexes, male and female,” and eliminating “DEI programs and preferencing” from all federal policies and practices. On January 29, 2025, a memorandum was sent to all department and agency heads instructing them to comply with the orders by removing webpages and documents that “promote gender ideology” by January 31, 2025. Based on these directives, public-use datasets, webpages, and published research containing information or wording related to sexual orientation, gender identity, and HIV/AIDS were removed from governmental websites or edited to remove references to sexual orientation, gender identity, and DEI. These actions were temporarily blocked by a court order. Because of these executive orders and the chaotic manner in which they were carried out, it is uncertain whether data that have been removed and restored temporarily will be permanently deleted and whether future surveys will capture information on sexual orientation and gender identity.
Just 10 days after the executive orders were signed, datasets, questionnaires, codebooks, and methodology documents related to the Behavioral Risk Factor Surveillance System (BRFSS) and the Youth Risk Behavior Surveillance System (YRBS) were removed from the CDC website. The Census Bureau’s Household Pulse Survey (HPS) data and documentation were deleted, and Census data could no longer be downloaded from the FTP site. Other datasets related to monitoring HIV/AIDS in the U.S. and around the world were also removed. The removals also included years of research results, including scientific papers, reports, and presentations.
Among the threatened documents are important sources of information accumulated over years that researchers, policymakers, and the public rely on to understand LGBT populations and to design policies and interventions that affect them. The vast range of actions by the administration so far threatens to adversely impact the health and well-being of LGBT populations.
History of Sexual Orientation and Gender Identity Data Collection
In 1999, when the Public Health Service was designing its next iteration of a 10-year blueprint of health priorities for the U.S., a group of researchers advising the process recommended the inclusion of LGBT populations in that plan. They concluded, however, that there was not sufficient quality population data to allow rigorous assessment of the health needs of LGBT people. The researchers noted that “clinical and public health research for these populations has been scarce … [and] there is currently no public health infrastructure for funding and supporting research on the health of LGBT communities.” Using data mostly from non-probability studies, sufficient evidence was culled to include lesbian, gay, and bisexual (LGB) people in the plan for the nation’s health, Healthy People 2010. (The transgender population was included in Healthy People 2020.)
In part because of the inclusion of public health goals for LGBT populations in Healthy People 2010 and 2020 (published in 2000 and 2010, respectively) and an influential Institute of Medicine (IOM) report published in 2011, data collection of sexual orientation (and later gender identity) increased beginning in the early 2000s through 2024 with some interruptions during the first Trump administration.
Federal surveys that have collected sexual orientation and gender identity data for years include the CDC’s National Health Interview Survey (NHIS), National Health and Nutrition Examination Survey (NHANES), Behavioral Risk Factor Surveillance System (BRFSS), and Youth Risk Behavior Survey (YRBS); the Bureau of Justice Statistics’ National Inmate Survey (NIS) and National Crime Victimization Survey (NCVS); and the Census Bureau’s Household Pulse Survey (HPS), among others.
Other surveys were being considered for the inclusion of sexual orientation and gender identity data before President Trump took office in 2025. Particularly important is the American Community Survey (ACS), a vital source of information about employment and earnings, housing conditions and expenses, education, citizenship, family composition, veteran status, disability, and insurance coverage. Researchers at the Census Bureau have been testing the inclusion of sexual orientation and gender identity measures for over a year and were expected to report their findings soon. Other surveys recently began to include questions about sexual orientation and gender diversity, such as the Department of Housing and Urban Development’s American Housing Survey (AHS) and the Administration for Community Living’s National Survey of Older Americans Act Participants (NSOAAP). The NSOAAP includes important topics affecting LGBT aging populations, such as economic and food insecurity, mental and physical health outcomes, and barriers to receiving health care and social support. The Williams Institute had submitted public comments to support the addition of sexual orientation and gender identity questions to these and other public surveys.
The Importance of LGBT Federal Data Collection
The collection of data about LGBT people in federal datasets has consisted of the inclusion of sexual orientation and gender identity questions in ongoing, large governmental surveys of the U.S. population (or regions therein). This inclusion allows researchers to identify the subpopulation of LGBT people among the larger study group, providing estimates specific to LGBT populations in various areas of interest and enabling comparisons between LGBT and non-LGBT populations.
The inclusion of sexual orientation and gender identity measures required a long process and significant public investment that included determining which questions to use, the validity and reliability of these questions, the impact of adding questions to the integrity of the survey as a whole, as well as political considerations relevant to federal and state governments.
Federal datasets are uniquely important because they typically provide large numbers of participants. For example, the BRFSS surveys over 400,000 adults annually across all 50 U.S. states, the District of Columbia, Puerto Rico, Guam, and the U.S. Virgin Islands. Because the proportion of LGBT people in the general population is small (ranging from 0.5% for transgender individuals to 5.5% for sexual minority individuals), a very large number of survey participants is required to obtain an LGBT sample large enough to produce accurate statistical estimates for this population. Such efforts are impossible for non-governmental organizations to replicate.
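The sample-size argument above can be illustrated with a rough back-of-the-envelope calculation. The Python sketch below is only an illustration under simplifying assumptions that are not part of the original analysis: it treats each survey as a simple random sample, uses the 0.5% and 5.5% population shares and the 400,000-respondent figure cited above, and compares them with a hypothetical 2,000-person private survey. The function names and the 2,000 figure are invented for this example. Actual federal surveys use complex sampling designs, and the BRFSS sexual orientation and gender identity module is optional for states, so real subsample sizes and margins of error will differ.

```python
import math


def expected_subsample(total_n: int, proportion: float) -> float:
    """Expected number of respondents from a group with the given population share."""
    return total_n * proportion


def margin_of_error(p_hat: float, n: float, z: float = 1.96) -> float:
    """Approximate 95% margin of error for an estimated proportion,
    assuming a simple random sample (an assumption made for illustration)."""
    return z * math.sqrt(p_hat * (1 - p_hat) / n)


# Survey sizes: the first is the BRFSS figure cited in the text;
# the second is a hypothetical smaller survey used only for comparison.
surveys = {"large federal survey": 400_000, "hypothetical private survey": 2_000}

# Population shares cited in the text.
shares = {"transgender": 0.005, "sexual minority": 0.055}

for survey_name, total_n in surveys.items():
    for group, share in shares.items():
        n_group = expected_subsample(total_n, share)
        # Use p = 0.5, the worst case (widest interval) for a proportion.
        moe = margin_of_error(0.5, n_group)
        print(
            f"{survey_name}: about {n_group:,.0f} {group} respondents, "
            f"margin of error roughly +/- {moe * 100:.1f} percentage points"
        )
```

Under these assumptions, a 400,000-person survey would be expected to include about 2,000 transgender respondents, while a 2,000-person survey would include about 10, far too few for a reliable estimate of any health outcome within that group.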
The future of LGBT data is now uncertain. It seems likely that government agencies perceive that compliance with the president’s executive orders requires that at least gender identity questions, and potentially sexual orientation questions, be eliminated. Indeed, reports of removals of both sets of questions have recently begun to surface. Funding of research by independent researchers through National Institutes of Health (NIH) grants and contracts, for example, is also at risk under the president’s executive orders. As just one example, the NIH has recently removed many grants that were scheduled for review because they focused on or included LGBT populations.
Federal Datasets that Include LGBT Populations
Federal data play a crucial role in monitoring and addressing the health and well-being of LGBT adults and youth in America. This is demonstrated by their widespread use among researchers nationwide and the wealth of knowledge gathered over time on various issues affecting the LGBT community.
The federal data on LGBT populations are the result of ongoing efforts to improve and expand data collection over several decades. At the onset of the AIDS epidemic in 1981, no national or state data existed in the United States to allow an assessment of the impact of the epidemic on LGBT communities, despite the heavy burden of AIDS-related illnesses and deaths among gay and bisexual men, men who have sex with men, and transgender women.
The CDC began collecting national data on behavioral risk factors for HIV only in 2003, more than 20 years after the beginning of the AIDS pandemic, through the establishment of the National HIV Behavioral Surveillance (NHBS) system. With the NHBS, the CDC was able to begin to systematically collect information on risk factors for HIV and assess prevention strategies to inform HIV prevention programs. Even with the important contributions of the NHBS to our nation’s public health goals, the surveys used suboptimal methods for recruiting study participants. These included methods that aimed to approximate probability samples (e.g., venue-based sampling, respondent-driven sampling) but were not true probability sampling methods (such as random digit dialing or address-based sampling). These methods have been criticized for their limitations in simulating probability samples and for their suitability for sampling LGBT people. To comply with President Trump’s executive order on gender identity, the CDC has announced that it will now stop collecting data on transgender identity.
In 2002, the National Survey of Family Growth (NSFG) became one of the first federal surveys to ask respondents about their sexual orientation. NSFG provides important data about LGB people, same-sex couples, and their families. Researchers using these data have published hundreds of articles on LGBQ families, parenting intentions among LGB people, economic and food insecurity among LGB people, sex education and HIV testing rates among men who have sex with men, adverse pregnancy experiences, contraceptive use among lesbian and bisexual women, and various other topics. The Williams Institute used NSFG data from 2002 to produce some of the first estimates of gay men and lesbians who had or wanted to have children.
Census Bureau surveys played a key role in early demographic research on LGB people, first by providing a source for calculating population counts and estimates of same-sex couples and parents based on the sex of spouses and unmarried partners, and later by adding the option for same-sex couples to be identified, beginning with the 2013 ACS and the 2020 Decennial Census. In 2021, the Household Pulse Survey (HPS) became the first Census Bureau survey to directly include both sexual orientation and gender identity questions. With that, researchers were able to track detailed economic- and health-related experiences of LGBT people.
The National Health Interview Survey (NHIS), which began asking questions about sexual orientation in 2013, is another important source of health data, providing population-based evidence about illness and health care access and allowing policymakers, researchers, and the public to track progress over time toward health objectives set by the Public Health Service. As of February 2025, approximately 145 scientific papers have been published using NHIS data on LGB individuals. Many of these studies investigated the prevalence of physical and mental health outcomes such as cancer and psychological distress, health-related behaviors such as smoking, disease screening and vaccination, and health care access and insurance coverage among LGB people compared to heterosexual people. Through this work, researchers have been able to identify significant differences in health outcomes and characteristics related to sexual orientation.
The BRFSS and the YRBS are two national surveys widely used by federal, state, and local governments, researchers, health policymakers, and public health interventionists. These datasets have provided essential health information about the U.S. population, including invaluable data on LGBT adults and youth (respectively).
The BRFSS, which began including an optional sexual orientation and gender identity module in 2014, is the largest continuously conducted health survey in the world, with over 400,000 participants annually. It collects data on health-related risk behaviors, chronic conditions, and preventive service use among adults, providing a detailed picture of public health across the U.S. Since the module was introduced, the number of states electing to include it has grown from about 20 states in 2014 to 35 states in 2022. Due to the large sample recruited by BRFSS, researchers can assess differences among LGBT subgroups, such as groups defined by race/ethnicity, age, and socioeconomic status.
The BRFSS has been foundational to LGBT research and public policy in many ways. For example, BRFSS data were used by the Williams Institute to estimate the LGBT population and separately the number of transgender adults. Research using the LGBT module of the BRFSS has resulted in over 215 scientific articles. This research has provided invaluable insights, such as documenting health disparities between LGBT and heterosexual cisgender populations and estimating the number of parents and rate of poverty among LGBT subgroups in the United States.
The YRBS is indispensable for understanding youth health behaviors and shaping effective interventions. The survey is conducted every two years in schools across the U.S. among youth aged 13-17. Sexual orientation questions were added in 2014, and gender identity questions were added in 2017. Since then, over 200 scientific articles and hundreds more publications have cited YRBS sexual orientation and gender identity data. These data have identified serious disparities in suicidality for LGBT youth, something that was suggested previously but could not be studied as rigorously before the YRBS data became available. The YRBS has documented the increased stress and violence experienced by LGBT students compared to non-LGBT students, including skipping school because they felt unsafe, having property stolen or damaged at school, high levels of bullying and cyberbullying, and involvement in more fights. It has also shown disparities in many other important health risk behavior areas, like smoking.
In addition to these health surveys, federal data are collected on an array of topics relevant to the well-being of LGBT populations. The National Crime Victimization Survey (NCVS) added sexual orientation and gender identity questions in 2016. The NCVS is important because it collects victim reports of victimization regardless of whether the incidents were reported to the police or prosecuted. Analyses of NCVS data have shown alarming rates of victimization of LGBT people in the United States. Most recently, 2022-2023 NCVS data analyzed by the Williams Institute showed that LGBT people experienced violent victimizations at a rate five times higher than non-LGBT persons. LGBT people experienced a higher rate of serious violence, defined as rape or sexual assault, robbery, or aggravated assault, than non-LGBT people, including higher rates of violence involving a weapon and serious violence resulting in injuries, and they were more likely than non-LGBT people to experience violent hate crimes.
In the area of criminal justice, the National Inmate Survey (NIS) has been a unique source of data on criminal justice system-involved LGBT people. Conducted by the Department of Justice as part of the mandate of the Prison Rape Elimination Act of 2003, the NIS provided data on a nationally representative sample of persons incarcerated in jails and prisons. Analysis of 2011-2012 NIS data showed that the rate of incarceration of LGB people is approximately three times higher than the already-high general U.S. incarceration rate and that incarcerated LGB people were more likely to experience mistreatment, harsh punishment, and sexual victimization than straight inmates.
Including questions about sexual orientation and gender identity in these surveys has been a long process of deliberation and planning that required methodological assessments, testing, and consideration of political contexts. These surveys play an important role because of the wide range of aspects of life they cover. They uniquely offer large numbers of respondents that allow for analysis of subpopulations based on sexual orientation and gender identity. Furthermore, they enable researchers to assess the intersection of LGBT identity with other important demographic characteristics, such as race/ethnicity, socioeconomic status, and urbanicity. These datasets are also unique for their scientific rigor and their longevity, which allows tracking of changes over time. The removal of such data from the public record and the loss of future data would set the United States back decades, to a time when little was known about the demography, health, and well-being of the 14 million LGBT people in the United States.