After 1,400+ hours on the NSF Green Bank Telescope, scientists unveil the largest, most sensitive dataset of molecules from deep space’s TMC-1 cloud

A groundbreaking new dataset from the U.S. National Science Foundation Green Bank Telescope (NSF GBT) is now publicly available, opening the door for scientists worldwide to make discoveries in one of the richest molecular clouds in our galaxy, TMC-1. After 1,438 hours of observations and years of data processing pipeline development [1], astronomers in the “GBT Observations of TMC-1: Hunting Aromatic Molecules” research survey, known as GOTHAM [2], have released the spectral line survey with the largest amount of telescope time ever conducted, charting more than 100 molecular species found only in deep space, including many with complex and aromatic structures.
TMC-1 is a region within the Taurus Molecular Cloud known for its incredible diversity of interstellar molecules, the perfect “cosmic laboratory” for astrochemistry. Using the GOTHAM survey, researchers identified ten individual aromatic molecules and nearly a hundred other chemical species, helping decode how molecules form and evolve before stars are born. Unlike regions closer to newborn stars, TMC-1’s chemistry is dominated by large hydrocarbons and nitrogen-rich compounds, providing tantalizing clues about the building blocks of planets and organic matter in the universe.
Until now, most telescope data remained inaccessible or too cumbersome for outside researchers to analyze, limiting discoveries to the original teams that collected the data. By releasing a fully reduced and calibrated dataset, the GOTHAM project invites the global scientific community to pursue new questions, develop advanced chemical models, and potentially uncover phenomena no one expected. For the first time, astronomers everywhere can explore the deepest secrets of TMC-1 without needing advanced computing or data-cleaning skills.
“Sharing GOTHAM’s research in this way allows us to democratize access to big data in astronomy,” shares Brett McGuire, Associate Professor, Department of Chemistry, Massachusetts Institute of Technology (MIT), and an Adjunct Assistant Astronomer with the NSF National Radio Astronomy Observatory (NSF NRAO). Data sharing has been a mission of collaborative teams producing large datasets with NSF NRAO instruments for nearly two decades.
“It’s a lot of hard work to prepare and package this data for access. We’re really excited to see what the scientific community does next with this. We want to spread word far and wide that it’s available,” adds Ci (Ceci) Xue, co-PI of GOTHAM and lead author of the paper describing the automated pipeline her team developed for data reduction and calibration. Xue, formerly a postdoc with MIT’s Department of Chemistry, is now a postdoctoral fellow at the NSF-Simons AI Institute for Cosmic Origins [3], of which the NSF NRAO is a partner.
The GOTHAM dataset is the largest and most comprehensive survey of its kind, setting a new benchmark for astronomical legacy data. Astronomers at MIT, the NSF NRAO, the University of British Columbia, and partner institutions are excited about new opportunities for collaboration and cross-disciplinary breakthroughs. The dataset includes calibrated spectra, detailed molecular abundances, and the cutting-edge software used for analysis, all publicly accessible for scientific exploration and innovation.
The release of this GOTHAM dataset is the product of a diverse collaboration spanning multiple institutions and specialties, led by McGuire, and featuring support from the NSF NRAO, NASA Goddard, and the U.S. National Science Foundation. As new molecule discoveries continue to be made in TMC-1, astronomers anticipate more groundbreaking advances in our understanding of how cosmic chemistry shapes our universe.
About
The National Radio Astronomy Observatory and Green Bank Observatory are major facilities of the U.S. National Science Foundation, operated under cooperative agreement by Associated Universities, Inc.