
Using real-world health data from the UC Health Data Warehouse, researchers across UC’s academic health system are identifying risk earlier for conditions including neurodegenerative disorders, Alzheimer’s disease, hospital delirium and cancer. Sustained federal science funding is critical to enabling this life-changing research.

Early detection is increasingly recognized as one of the most effective strategies for improving health outcomes, lowering long-term health care costs and enabling timely clinical intervention. 

Researchers across the University of California’s academic health system published multiple studies showing how large-scale, real-world health data can be used to identify disease risk earlier, strengthen clinical decision-making and support more personalized care.  

Together, these studies underscore the value of data-driven research for California’s patients and health systems—and the critical role of federal science agencies, including the National Institutes of Health, in supporting this work.

Empowering data-driven medical research

All findings were powered by the UC Data Discovery Platform (UCDDP), UC Health’s secure data science environment, which lets researchers query and analyze a dataset that complies with the federal health data privacy regulations under HIPAA. The dataset is derived from the UC Health Data Warehouse (UCHDW), which is managed by the Center for Data-driven Insights and Innovations (CDI2). The UCHDW was created in 2018 to drive improvements in clinical operations, promote quality patient care, and enable the next generation of clinical research, including artificial intelligence and machine learning.

Thanks to CDI2’s outreach and engagement, the number of UC researchers working in the Data Discovery Platform rose 113 percent from the prior year. UC researchers have used the UCDDP to produce 62 publications since 2020, including 16 in 2025. These publications cover a wide range of clinical practices, risk factors, exposures, and disease endpoints, including cirrhosis, endometriosis, cancer, diabetes and Alzheimer’s disease.

A key recent theme among UC investigators using the systemwide dataset has been studying real-world disease patterns and advancing early detection across a range of complex conditions.

Identifying disease trajectories in progressive supranuclear palsy

Researchers from UCLA, including the David Geffen School of Medicine, examined progressive supranuclear palsy, a rare and rapidly progressing neurodegenerative disorder with limited treatment options. Supported by the National Institutes of Health, the National Science Foundation, and state and foundation funding, the study moved beyond single risk factors to analyze how diagnoses unfold over time.

Using diagnosis data from the UCHDW and advanced analytics, researchers identified more than 250 unique illness trajectories experienced by patients before they received a diagnosis of progressive supranuclear palsy (PSP). These trajectories formed three distinct clusters of health issues: eye movement disorders, Parkinson’s disease, and other neurodegenerative conditions. By mapping out these medical journeys and their timing, the authors found an opportunity to improve PSP risk assessment. Ultimately, this work may support earlier diagnosis, enable more personalized care, strengthen patient outcomes and inform future research.

Precision approaches to Alzheimer’s disease detection and treatment

A multidisciplinary team from UCSF, including investigators at the Bakar Computational Health Sciences Institute, explored Alzheimer’s disease using a novel, cell–specific approach. Supported by the National Institute on Aging, the study integrated single-cell data, drug databases, and real-world electronic medical records.

Researchers identified two existing drugs, letrozole and irinotecan, that were associated with lower Alzheimer’s risk in patient medical records. When the two drugs were given in combination to a mouse model of Alzheimer’s disease, the mice showed much better memory and fewer signs of the disease in their brains. This work demonstrates how early molecular signals, paired with real-world clinical data, may support earlier intervention and more precise treatment strategies for Alzheimer’s disease in humans.

Predicting and preventing hospital delirium

A study led by researchers from UCSF and UC San Diego focused on delirium, a serious and often underrecognized condition among hospitalized patients. Supported by multiple National Institutes of Health grants, the team analyzed tens of thousands of patient records across the UC system.

The study identified metabolic abnormalities, psychiatric conditions and sex-specific risk factors that frequently precede delirium diagnoses. Longitudinal analyses showed that even a single episode of inpatient delirium is associated with higher long-term mortality risk. These findings highlight the importance of identifying at-risk patients earlier and implementing more proactive hospital-based prevention strategies.

Using AI to anticipate cancer relapse

A study led by UCSF researchers, in collaboration with CDI2 investigators, examined how well a CAR T-cell therapy works for patients with an aggressive form of lymphoma using real-world data from UC’s academic health system. Supported in part by the National Institutes of Health, the study found that while outcomes were comparable to those seen in clinical trials, more than half of patients experienced relapse over time. To address this challenge, the team developed a machine-learning model that uses a patient’s age and six routine laboratory tests collected within 24 hours of treatment to identify those at higher risk of relapse within six months. By identifying high-risk patients early, the model could help clinicians intervene sooner with additional therapies, with the goal of improving survival and long-term outcomes. 

The study was dedicated to Atul J. Butte, UC Health’s chief data scientist and inaugural director of the UCSF Bakar Computational Health Sciences Institute, who passed away in June 2025 and whose leadership and mentorship were instrumental in shaping UC’s data-driven research enterprise.

Systemwide impact for California

By bridging the gap between immense datasets and clinical applications, this collaborative approach is paving the way for a new standard of predictive health care.

“These 62 publications show how CDI2 and the data discovery platform are helping researchers detect disease earlier and understand how conditions develop over time in ways that were not possible before,” said Cora Han, UC Health chief health data officer.

Read the full list of publications powered by the UC Health Data Warehouse.

About University of California Health

University of California Health comprises six academic health centers, 21 health professional schools, a Global Health Institute and systemwide services that improve the health of patients and the University’s students, faculty and employees. All of UC’s hospitals are ranked among the best in California, and its medical schools and health professional schools are nationally ranked in their respective areas.