People with cancer have different amounts of a type of repetitive DNA, called Alu elements, than people without cancer. Now, machine learning can measure that difference from a blood draw. Researchers at the Johns Hopkins Kimmel Cancer Center have used this finding to improve a test that detects cancer early, validating and reproducing the results with a sample size tenfold larger than is typical for this type of study.
The research was published Jan. 24 in the journal Science Translational Medicine.
Alu elements are small: around 300 base pairs long, out of roughly 3 billion base pairs (the rungs of the DNA ladder) in the human genome. But changes in the proportion of Alu elements in people's blood plasma occur regardless of where cancer originates, explains lead study author Christopher Douville, Ph.D., an assistant professor of oncology at Johns Hopkins.
Blood testing holds great promise for the earlier detection of cancers before people exhibit any symptoms. However, analyzing results with machine learning "has not necessarily translated into long-term success for patients when minor fluctuations produce widely different predictions in these complex models," says Douville. "To have a long-term impact on patient care, physicians and patients must have confidence that models consistently and reproducibly classify cancer status. In our manuscript, we evaluated 1,686 individuals multiple times to assess whether our machine learning model consistently delivers the same answer."
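One way to picture the reproducibility check Douville describes is to ask how often repeated samples from the same person receive the same classification. The Python sketch below is illustrative only and is not the study's code: the function name `replicate_concordance`, the input layout and the toy data are all assumptions made for this example.

```python
from collections import defaultdict


def replicate_concordance(calls):
    """Fraction of repeatedly tested people whose calls all agree.

    `calls` is an iterable of (person_id, predicted_positive) pairs,
    one pair per blood sample. People with a single sample are skipped
    because they carry no information about reproducibility.
    """
    by_person = defaultdict(list)
    for person_id, predicted_positive in calls:
        by_person[person_id].append(bool(predicted_positive))

    repeated = [results for results in by_person.values() if len(results) >= 2]
    if not repeated:
        raise ValueError("no individual has more than one sample")

    # A person is "consistent" when every replicate gives the same answer.
    consistent = sum(1 for results in repeated if len(set(results)) == 1)
    return consistent / len(repeated)


# Hypothetical toy data, not taken from the study.
toy_calls = [
    ("p1", True), ("p1", True),    # consistent
    ("p2", False), ("p2", True),   # inconsistent
    ("p3", False), ("p3", False),  # consistent
    ("p4", True),                  # single sample, ignored
]
concordance = replicate_concordance(toy_calls)
print(f"{concordance:.3f}")
```

A model whose predictions swing with minor fluctuations would score low on a measure like this, even if its accuracy on any single sample looked acceptable.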
Douville and colleagues developed a test to detect aneuploidy, chromosome copy number alterations found in cancers. The test measured aneuploidy through a blood test called liquid biopsy, which detects fragments of cancer cell DNA circulating in the bloodstream.
However, Douville observed a signal that distinguished cancer from noncancer but could not be explained by chromosomes being gained or lost.
The team decided to combine their previous test, which can check 350,000 repetitive locations in DNA, with an unbiased machine learning approach.
Douville and colleagues collected samples from 3,105 people with solid cancers and 2,073 without. The study covered 11 cancer types and 7,615 blood samples. Repeat samples served as replicates to test how consistently the model performed. The model reached 98.9% specificity, meaning it kept false-positive test results to a minimum. "This is crucial when screening asymptomatic patients, so people aren't told incorrectly that they have cancer," says Douville.
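Specificity is the share of people without cancer whom a test correctly reports as negative. The short Python sketch below shows the arithmetic behind the reported figure; the function names and the screening population of 1,000 people are assumptions chosen for illustration, not numbers from the study.

```python
def specificity(true_negatives, false_positives):
    """Share of cancer-free people correctly classified as negative."""
    total = true_negatives + false_positives
    if total == 0:
        raise ValueError("need at least one person without cancer")
    return true_negatives / total


def expected_false_positives(n_without_cancer, spec):
    """Expected number of cancer-free people incorrectly flagged."""
    return n_without_cancer * (1.0 - spec)


# Hypothetical screening population of 1,000 cancer-free people,
# evaluated at the 98.9% specificity reported in the article.
flagged = expected_false_positives(1000, 0.989)
print(round(flagged, 1))  # 11.0
```

At that specificity, about 11 of every 1,000 people without cancer would receive a false-positive result, which is why small gains in specificity matter when a test is aimed at asymptomatic people.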
In an independent validation cohort, adding Alu elements to the machine learning model caught 41% of cancer cases missed by eight existing biomarkers and the group's previous test, making "a greater contribution," the authors wrote in the paper, "than aneuploidy or proteins." The type of repetitive DNA contributing most to cancer detection was the largest subfamily of Alu elements, called AluS; the blood plasma of people with cancer had less of it than usual.
The model is called A-PLUS (Alu Profile Learning Using Sequencing). The code is available online.
Despite making up 11% of DNA in humans and other primates, Alu elements have long been considered too difficult to use as a biomarker, Douville says. "They are small and repetitive: technically difficult. But this research shows that counting repetitive lengths of DNA in blood plasma (a motley crew of DNA fragments hailing from organs throughout the body) is cost-effective and enhances early cancer detection," he says. The researchers envision their Alu-based cancer detection as a complement to the toolkit of other cancer tests available to clinicians. The next step is prioritizing which biomarkers seem the most promising and aggregating them.
Study co-authors included Kamel Lahouel, Albert Kuo, Haley Grant, Bracha Erlanger Avigdor, Samuel D. Curtis, Mahmoud Summers, Joshua D. Cohen, Yuxuan Wang, Austin Mattox, Jonathan Dudley, Lisa Dobbyn, Maria Popoli, Janine Ptak, Nadine Nehme, Natalie Silliman, Cherie Blair, Katharine Romans, Christopher Thoburn, Jennifer Gizzi, Michael Goggins, Ie-Ming Shih, Anne Marie Lennon, Ralph H. Hruban, Chetan Bettegowda, Kenneth W. Kinzler, Nickolas Papadopoulos, Bert Vogelstein and Cristian Tomasetti of the Johns Hopkins University School of Medicine and City of Hope.
Additional authors were from the Department of Medicine and Department of Epidemiology at the University of Pittsburgh; the Department of Surgery at NYU Langone; and abroad in Vietnam (Pham Ngoc Thach University of Medicine and Saigon Precision Medicine Research Center) and Australia (the Walter and Eliza Hall Institute of Medical Research, the University of Melbourne, the University of Technology Sydney and the University of New South Wales).
This study was supported by the NIH (grants U01CA271884, R21NS113016, RA37CA230400, U01CA230691, 5P50CA062924-22, Ovarian Cancer SPORE DRP 80057309), Oncology Core CA 06973, the Virginia and D.K. Ludwig Fund for Cancer Research, the John Templeton Foundation (62818), the Commonwealth Fund, the Thomas M. Hohman Memorial Cancer Research Fund, Alex's Lemonade Stand Foundation, The Sol Goldman Sequencing Facility at Johns Hopkins, the Conrad R. Hilton Foundation, the Benjamin Baker Endowment (80049589), Swim Across America, a Burroughs Wellcome Career Award for Medical Scientists, and the NHMRC (Investigator Grant APP1194970).
Under a license agreement between Exact Sciences Corp. and The Johns Hopkins University, Tomasetti and the university are entitled to royalty distributions. Tomasetti has patent applications for I.P. related to cancer early detection, is a member of the scientific advisory board of PrognomiQ Inc., an adviser for Haystack Oncology, and a paid consultant for the Rising Tide Foundation and Bayer AG. Vogelstein, Kinzler and Papadopoulos are founders of Thrive Earlier Detection, an Exact Sciences Company, and hold equity in and are consultants to CAGE Pharma. Kinzler, Papadopoulos and Douville are consultants to Thrive Earlier Detection. Vogelstein, Kinzler, Papadopoulos and Douville hold equity in Exact Sciences. Vogelstein, Kinzler, Cohen and Papadopoulos are founders of and own equity in Haystack Oncology and ManaT Bio. Kinzler and Papadopoulos are consultants to Neophore. Vogelstein is a consultant to and holds equity in Catalio Capital Management. Bettegowda is a consultant to Depuy-Synthes, Bionaut Labs, Haystack Oncology and Galectin Therapeutics, and is a co-founder of OrisDx. Bettegowda and Douville are co-founders of Belay Diagnostics.
The companies named above, as well as other companies, have licensed previously described technologies related to the work described in this paper from The Johns Hopkins University. Vogelstein, Kinzler, Papadopoulos, Bettegowda and Douville are inventors of some of these technologies. Licenses to these technologies are or will be associated with equity or royalty payments to the inventors as well as to The Johns Hopkins University. Patent applications on the work described in this paper may be filed by The Johns Hopkins University. The terms of these arrangements are managed by The Johns Hopkins University in accordance with its conflict-of-interest policies.
Journal reference:
Douville, C., et al. (2024). Machine learning to detect the SINEs of cancer. Science Translational Medicine. doi.org/10.1126/scitranslmed.adi3883.