Accurate detection and estimation of behavioral health conditions, such as self-harm and opioid use disorder (OUD), are crucial for identifying at-risk individuals, determining treatment needs, tracking prevention and intervention efforts, and finding treatment-naive individuals for clinical trials. Because these conditions are underdiagnosed and undercoded in electronic health records (EHRs), our work aims to accurately estimate both the probability that a given patient has one of these conditions and the overall population prevalence.

We have developed a novel machine learning algorithm, “Positive Unlabeled Learning Selected Not At Random (PULSNAR)”, to estimate the prevalence of undiagnosed or unrecorded behavioral health conditions. Positive unlabeled learning differentiates between labeled positive instances and unlabeled instances, which are a mix of positives and negatives. Our algorithm addresses a limitation of traditional methods, which do not accurately reflect the true prevalence of behavioral health conditions because known, coded cases are not representative of undetected cases. Coded cases are generally not selected at random; for example, more serious cases are more likely to generate a healthcare encounter.
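The selection effect described above can be illustrated with a toy simulation. This is a minimal sketch and not the PULSNAR algorithm: the function name `simulate_coding`, the uniform "severity" score, and the rule that the chance of being coded equals severity are all hypothetical assumptions chosen only to show why coded cases can differ from uncoded ones.

```python
import random


def simulate_coding(n=100_000, prevalence=0.05, seed=0):
    """Toy model (hypothetical): true positives get coded with a
    probability equal to their severity, so coding is not at random."""
    rng = random.Random(seed)
    coded, uncoded = [], []
    for _ in range(n):
        if rng.random() >= prevalence:
            continue  # true negative: never coded, always unlabeled
        severity = rng.random()  # assumed uniform on [0, 1)
        if rng.random() < severity:
            coded.append(severity)    # labeled positive
        else:
            uncoded.append(severity)  # positive hidden among the unlabeled
    return coded, uncoded


coded, uncoded = simulate_coding()
mean_coded = sum(coded) / len(coded)
mean_uncoded = sum(uncoded) / len(uncoded)
print(f"mean severity, coded positives:   {mean_coded:.2f}")
print(f"mean severity, uncoded positives: {mean_uncoded:.2f}")
```

Under these assumptions the coded positives are noticeably more severe on average than the uncoded ones, so a method that treats coded cases as a random sample of all cases would mischaracterize the undetected group.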

In a study of 6,037,479 commercially insured patients with major mental illness (MMI) and 1,329,120 Veterans, PULSNAR estimated a visit-level self-harm prevalence of 3.97% among patients with MMI and a lifetime self-harm prevalence of 10.46% among Veterans, compared with the 0.453% and 1.85% coded in their EHR data, respectively. Chart review of 97 unlabeled individuals in the Veteran population confirmed that PULSNAR provides well-calibrated classification.

In a study of 1,000,000 patients with at least one opioid prescription fill, PULSNAR estimated that 5.3% (53,144) of patients have OUD, compared with the 2.0% (20,079) who have a recorded diagnosis of OUD.
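The gap between the estimated and recorded OUD counts can be worked through with simple arithmetic. The sketch below uses only the figures quoted above; it assumes that the estimated 53,144 patients include the 20,079 with a recorded diagnosis, which the abstract implies but does not state outright. The variable names are illustrative.

```python
# Figures quoted in the abstract.
total_patients = 1_000_000
estimated_oud = 53_144   # PULSNAR estimate
coded_oud = 20_079       # recorded OUD diagnosis

# Assumption: the estimated count includes the coded patients.
uncoded_oud = estimated_oud - coded_oud
unlabeled_patients = total_patients - coded_oud

estimated_prevalence = estimated_oud / total_patients
coded_prevalence = coded_oud / total_patients
# Share of patients without a recorded diagnosis who are estimated to have OUD.
positive_fraction_unlabeled = uncoded_oud / unlabeled_patients
# Share of estimated OUD cases that carry a recorded diagnosis.
coded_share_of_estimated = coded_oud / estimated_oud

print(f"estimated prevalence:           {estimated_prevalence:.1%}")
print(f"coded prevalence:               {coded_prevalence:.1%}")
print(f"estimated uncoded OUD patients: {uncoded_oud:,}")
print(f"positive fraction of unlabeled: {positive_fraction_unlabeled:.2%}")
print(f"coded share of estimated cases: {coded_share_of_estimated:.1%}")
```

Under that assumption, roughly 33,000 patients would have OUD without a recorded diagnosis, and fewer than half of the estimated cases would appear in the coded data.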

PULSNAR accurately estimates the prevalence of underdiagnosed/unrecorded behavioral health conditions, including self-harm and OUD. This has the potential to inform public health, guide screening efforts, identify health disparities, and reduce the negative impacts of these conditions.


Poster presented at the Brain & Behavioral Health Research Day 2023


